1. References
2. System environment
OS: Ubuntu 16.04
GPU: GeForce GTX 1650, 4 GB
CUDA already installed: 10.2
CUDA to install: 11.0
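3. Important notes
- Preparation: see the blog post "显卡/cudn/cuDNN相关查询" (looking up GPU, CUDA, and cuDNN information).
- Keep the GPU driver as up to date as possible.
- The CUDA version must be paired with a gcc version it supports; see "Ubuntu18.04安装cuda10.0" and the version compatibility table on the NVIDIA website.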
- Maintaining multiple CUDA versions: each version is installed under /usr/local/, and you can switch between them from the command line.
    lrwxrwxrwx  1 root root    9 9月 4 19:58 cuda -> cuda-11.0/
    drwxr-xr-x 18 root root 4096 9月 4 18:50 cuda-10.2/
    drwxr-xr-x 16 root root 4096 9月 4 19:54 cuda-11.0/
    drwxr-xr-x 17 root root 4096 8月 12 10:40 cuda-8.0/
    drwxr-xr-x 18 root root 4096 9月 4 14:33 cuda-9.1/
    drwxr-xr-x 18 root root 4096 9月 4 16:17 cuda-9.2/
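- Don't use a TensorFlow version that is too new: install a 1.x release with pip install tensorflow-gpu==1.x.0.
- When you hit a problem, don't go straight to Google: analyze the cause and try a fix yourself first, then search.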
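The gcc compatibility note above can be checked before running the installer. The following is a minimal sketch, not an official check: the helper name `gcc_ok_for_cuda` is made up for illustration, and the limit of gcc 9 for CUDA 11.0 is an assumption that should be confirmed against NVIDIA's compatibility table.

```shell
#!/bin/sh
# Hypothetical helper: succeed if a gcc version is at or below a maximum major version.
# $1 = gcc version string (e.g. "5.4.0"), $2 = highest supported gcc major version.
gcc_ok_for_cuda() {
    gcc_major=${1%%.*}          # keep everything before the first dot
    [ "$gcc_major" -le "$2" ]
}

# Assumption: CUDA 11.0 supports gcc up to major version 9; verify on NVIDIA's site.
MAX_GCC_MAJOR=9

if command -v gcc >/dev/null 2>&1; then
    installed=$(gcc -dumpversion)
    if gcc_ok_for_cuda "$installed" "$MAX_GCC_MAJOR"; then
        echo "gcc $installed is within the assumed limit"
    else
        echo "gcc $installed is newer than the assumed limit of $MAX_GCC_MAJOR"
    fi
else
    echo "gcc not found; install build-essential first"
fi
```

Only the major version is compared, which is as fine-grained as this sketch needs to be.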
4. Key steps
- Install the dependencies
    sudo apt-get install freeglut3-dev build-essential libx11-dev libxmu-dev libxi-dev libgl1-mesa-glx libglu1-mesa libglu1-mesa-dev
- Install CUDA; the choices to make inside the installer are described in the installer options section below
    sudo sh cuda_11.0.2_450.51.05_linux.run
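- CUDA installed successfully. The incomplete-installation warning in the summary is expected, because the driver was deliberately left unselected.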
    ===========
    = Summary =
    ===========

    Driver:   Not Selected
    Toolkit:  Installed in /usr/local/cuda-11.0/
    Samples:  Installed in /home/yichao/

    Please make sure that
     -   PATH includes /usr/local/cuda-11.0/bin
     -   LD_LIBRARY_PATH includes /usr/local/cuda-11.0/lib64, or, add /usr/local/cuda-11.0/lib64 to /etc/ld.so.conf and run ldconfig as root

    To uninstall the CUDA Toolkit, run cuda-uninstaller in /usr/local/cuda-11.0/bin

    Please see CUDA_Installation_Guide_Linux.pdf in /usr/local/cuda-11.0/doc/pdf for detailed information on setting up CUDA.
    ***WARNING: Incomplete installation! This installation did not install the CUDA Driver. A driver of version at least .00 is required for CUDA 11.0 functionality to work.
    To install the driver using this installer, run the following command, replacing <CudaInstaller> with the name of this run file:
        sudo <CudaInstaller>.run --silent --driver

    Logfile is /var/log/cuda-installer.log
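Since the summary does not print a usable minimum driver version, it may help to check the toolkit and the driver by hand. This is a rough sketch under stated assumptions: the helper `version_ge` is hypothetical, and the threshold 450.51.05 is simply the driver version bundled in the runfile name, not a value taken from NVIDIA's release notes.

```shell
#!/bin/sh
# Hypothetical helper: succeed if version $1 >= version $2 (dotted numeric strings).
version_ge() {
    # sort -V orders version strings; if $2 comes out first, $1 is at least $2.
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Assumption: use the driver version bundled in the runfile name as the threshold.
REQUIRED_DRIVER=450.51.05
NVCC=/usr/local/cuda-11.0/bin/nvcc

# Confirm the toolkit itself is in place.
if [ -x "$NVCC" ]; then
    "$NVCC" --version | tail -n 2
else
    echo "nvcc not found at $NVCC"
fi

# Compare the installed driver against the threshold.
if command -v nvidia-smi >/dev/null 2>&1; then
    driver=$(nvidia-smi --query-gpu=driver_version --format=csv,noheader | head -n 1)
    if version_ge "$driver" "$REQUIRED_DRIVER"; then
        echo "driver $driver should be new enough"
    else
        echo "driver $driver is older than $REQUIRED_DRIVER; consider updating it"
    fi
else
    echo "nvidia-smi not found; is the NVIDIA driver installed?"
fi
```

Calling nvcc by its full path avoids picking up another CUDA version that happens to come first in PATH.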
- Install cuDNN; see the reference "CUDA、CUDNN在Ubuntu下的安装及配置" (installing and configuring CUDA and cuDNN on Ubuntu).
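After cuDNN is installed, one way to confirm which version is active is to read the version macros from its header. This is only a sketch: the helper `cudnn_version` is hypothetical, and it assumes the headers were copied to /usr/local/cuda/include, which depends on how cuDNN was installed.

```shell
#!/bin/sh
# Hypothetical helper: print MAJOR.MINOR.PATCHLEVEL parsed from a cuDNN header file.
cudnn_version() {
    awk '
        $1 == "#define" && $2 == "CUDNN_MAJOR"      { major = $3 }
        $1 == "#define" && $2 == "CUDNN_MINOR"      { minor = $3 }
        $1 == "#define" && $2 == "CUDNN_PATCHLEVEL" { patch = $3 }
        END { if (major != "") print major "." minor "." patch }
    ' "$1"
}

# cuDNN 8 keeps these macros in cudnn_version.h; older releases keep them in cudnn.h.
# Assumption: the headers live under /usr/local/cuda/include.
for header in /usr/local/cuda/include/cudnn_version.h /usr/local/cuda/include/cudnn.h; do
    if [ -f "$header" ]; then
        v=$(cudnn_version "$header")
        if [ -n "$v" ]; then
            echo "cuDNN $v ($header)"
        fi
    fi
done
```

Because the path goes through the /usr/local/cuda symlink, the result reflects whichever CUDA version is currently selected.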
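- Configure the CUDA environment variables
The official TensorFlow installation guide calls for the PATH, LD_LIBRARY_PATH, and CUDA_HOME environment variables to be set.
    # Edit the shell configuration file (it belongs to your user, so sudo is not needed)
    gedit ~/.bashrc
    # Append the following at the end of the file
    export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/local/cuda/lib64
    export PATH=$PATH:/usr/local/cuda/bin
    # CUDA_HOME is a single directory, not a colon-separated list
    export CUDA_HOME=/usr/local/cuda
    # Reload the configuration
    source ~/.bashrc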
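Running source ~/.bashrc repeatedly appends the same directories again each time, and appending to an empty LD_LIBRARY_PATH leaves a leading colon, which the dynamic loader treats as the current directory. A possible variant of the exports that avoids both is sketched below; the helper name `append_once` is made up for illustration.

```shell
# Hypothetical helper: print colon-separated list $1 with directory $2 appended,
# unless $2 is already one of its entries.
append_once() {
    case ":$1:" in
        *":$2:"*) printf '%s\n' "$1" ;;      # already present: leave unchanged
        "::")     printf '%s\n' "$2" ;;      # list was empty: no leading colon
        *)        printf '%s\n' "$1:$2" ;;
    esac
}

export CUDA_HOME=/usr/local/cuda
PATH=$(append_once "$PATH" "$CUDA_HOME/bin")
LD_LIBRARY_PATH=$(append_once "${LD_LIBRARY_PATH:-}" "$CUDA_HOME/lib64")
export PATH LD_LIBRARY_PATH
```

All three variables go through the /usr/local/cuda symlink, so they do not need to change when the active CUDA version is switched.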
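- Switch between CUDA versions
    # Inspect the current cuda symlink; it points at CUDA 10.2
    ls -lh /usr/local
    # Remove the existing cuda symlink (plain rm is enough for a symlink)
    sudo rm /usr/local/cuda
    # Create the new cuda symlink
    sudo ln -s /usr/local/cuda-11.0 /usr/local/cuda
    # Inspect the symlink again; it now points at CUDA 11.0
    ls -lh /usr/local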
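The switching commands can be wrapped in a small function so that a mistyped version does not leave the system without a cuda symlink. This is a sketch rather than a finished tool: `switch_cuda` and the script name used below are hypothetical, and the prefix argument exists mainly so the function can be tried out in a scratch directory.

```shell
#!/bin/sh
# Hypothetical helper: point <prefix>/cuda at <prefix>/cuda-<version>.
# $1 = version (e.g. 11.0), $2 = install prefix (defaults to /usr/local).
switch_cuda() {
    prefix=${2:-/usr/local}
    target="$prefix/cuda-$1"
    if [ ! -d "$target" ]; then
        echo "no such toolkit: $target" >&2
        return 1
    fi
    # -s symbolic link, -f replace an existing link, -n do not follow the old link.
    ln -sfn "$target" "$prefix/cuda"
}

# When the script is called with arguments, pass them straight through.
if [ "$#" -gt 0 ]; then
    switch_cuda "$@"
fi
```

Saved as switch_cuda.sh, it would be run as sudo sh switch_cuda.sh 11.0, since writing to /usr/local needs root.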
Installer options
- An existing driver was found: remove it first, or continue?
    Existing package manager installation of the driver found. It is strongly recommended that you remove this before continuing.
    Abort
    Continue
Select [Continue] and press Enter.
- Whether to accept the license agreement
    Do you accept the above EULA? (accept/decline/quit): accept
Type accept and press Enter.
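- Choose the components to install
    CUDA Installer
    - [ ] Driver
         [ ] 450.51.05
    + [X] CUDA Toolkit 11.0
      [X] CUDA Samples 11.0
      [X] CUDA Demo Suite 11.0
      [X] CUDA Documentation 11.0
      Options
      Install
Leave Driver unselected, highlight [Install], and press Enter.
If the installer instead asks about the driver in a text prompt, answer no:
    Install NVIDIA Accelerated Graphics Driver for Linux-x86_64 387.26? (y)es/(n)o/(q)uit: no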
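- Whether to update the symlink
    A symlink already exists at /usr/local/cuda. Update to this installation?
    Yes
    No
For a first installation choose Yes; when adding an extra version alongside existing ones choose No.
Here, select [No] and press Enter.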
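The same choices can probably be made without the menus, using the runfile's command-line flags (the summary printed by the installer already mentions --silent and --driver). The sketch below only prints the command instead of running it. The helper name `silent_install_cmd` is hypothetical, and how a silent install treats an existing /usr/local/cuda symlink is not covered here, so check it with ls -l afterwards.

```shell
#!/bin/sh
# Hypothetical helper: print (not run) a non-interactive install command for a runfile.
# $1 = path to the runfile.
silent_install_cmd() {
    # --silent skips the menus and implies accepting the EULA;
    # --toolkit and --samples pick components.
    # Leaving out --driver keeps the existing driver, as in the interactive choices.
    printf 'sudo sh %s --silent --toolkit --samples\n' "$1"
}

silent_install_cmd cuda_11.0.2_450.51.05_linux.run
```

Printing the command first makes it easy to review the flags before anything touches /usr/local.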