NVIDIA Driver Installation Guide

1. System Update

Update your system:

```bash
sudo apt update && sudo apt upgrade -y
```

2. NVIDIA Driver Installation

List the recommended drivers for your GPU:

```bash
ubuntu-drivers devices
```

(This lists recommended drivers for your GPU.)

Install the recommended driver automatically:

```bash
sudo ubuntu-drivers autoinstall
```

(Alternatively, install a specific version, e.g. `sudo apt install nvidia-driver-550`.)

Reboot your system:

```bash
sudo reboot
```

Verify the installation:

```bash
nvidia-smi
```

(This should show GPU details if the driver is working.)
3. CUDA Toolkit Installation

Note: Ubuntu 24.04 requires libtinfo5 to be installed first:

```bash
wget http://archive.ubuntu.com/ubuntu/pool/main/n/ncurses/libtinfo5_6.1-1ubuntu1_amd64.deb
sudo dpkg -i libtinfo5_6.1-1ubuntu1_amd64.deb
```

Download CUDA 12.4 from the NVIDIA developer download page, choosing Installer Type "deb (local)". Note that this step installs both the toolkit and the driver:

```bash
wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2204/x86_64/cuda-ubuntu2204.pin
sudo mv cuda-ubuntu2204.pin /etc/apt/preferences.d/cuda-repository-pin-600
wget https://developer.download.nvidia.com/compute/cuda/12.4.0/local_installers/cuda-repo-ubuntu2204-12-4-local_12.4.0-550.54.14-1_amd64.deb
sudo dpkg -i cuda-repo-ubuntu2204-12-4-local_12.4.0-550.54.14-1_amd64.deb
sudo cp /var/cuda-repo-ubuntu2204-12-4-local/cuda-*-keyring.gpg /usr/share/keyrings/
sudo apt-get update
sudo apt-get -y install cuda-toolkit-12-4
```
4. cuDNN Installation

```bash
wget https://developer.download.nvidia.com/compute/cudnn/9.6.0/local_installers/cudnn-local-repo-ubuntu2404-9.6.0_1.0-1_amd64.deb
sudo dpkg -i cudnn-local-repo-ubuntu2404-9.6.0_1.0-1_amd64.deb
sudo cp /var/cudnn-local-repo-ubuntu2404-9.6.0/cudnn-*-keyring.gpg /usr/share/keyrings/
sudo apt-get update
sudo apt-get -y install cudnn
```
5. Environment Variable Configuration

```bash
nano ~/.bashrc
# Add at the end of the file:
export PATH=/usr/local/cuda/bin:$PATH
# Save and exit (Ctrl+O, Enter, then Ctrl+X), then run:
source ~/.bashrc
# Verify the installation:
nvcc -V
```
6. Miniconda Installation

```bash
wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh
bash Miniconda3-latest-Linux-x86_64.sh
# Reopen the terminal after installation
conda --version  # verify the installation
```
7. Python Environment Configuration

```bash
# Create the environment
conda create --name unsloth_env python=3.11
conda activate unsloth_env
# Install PyTorch (CUDA 12.4)
pip install torch==2.5.0 torchvision==0.20.0 torchaudio==2.5.0 --index-url https://download.pytorch.org/whl/cu124
# Verify CUDA availability
python -c "import torch; print(torch.cuda.is_available())"
```
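The `--index-url` above has to match the installed CUDA version: the tag is "cu" plus the version with the dot removed (12.4 → cu124). As a minimal sketch of that rule (the helper name is my own, not part of the guide):

```python
# Map a CUDA version string to the corresponding PyTorch wheel-index URL.
# Hypothetical helper for illustration only.
def torch_index_url(cuda_version: str) -> str:
    # "12.4" -> "cu124": drop the dot and prefix with "cu"
    tag = "cu" + cuda_version.replace(".", "")
    return f"https://download.pytorch.org/whl/{tag}"

print(torch_index_url("12.4"))  # https://download.pytorch.org/whl/cu124
```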
8. Flash Attention 2 Installation

```bash
# Download the wheel from GitHub (make sure to pick the ABI=FALSE build)
pip install flash_attn-2.7.1.post1+cu12torch2.5cxx11abiFALSE-cp311-cp311-linux_x86_64.whl
```
9. Git Installation (if needed)

```bash
sudo apt update
sudo apt install git
```
10. Unsloth Installation

```bash
pip install "unsloth[cu124-ampere-torch250] @ git+https://github.com/unslothai/unsloth.git"
```
11. LLaMA-Factory Installation

```bash
git clone --depth 1 https://github.com/hiyouga/LLaMA-Factory.git
cd LLaMA-Factory
pip install -e ".[torch,metrics]" -i https://pypi.tuna.tsinghua.edu.cn/simple
```
12. GPU Monitoring

```bash
watch -n 1 nvidia-smi  # live GPU status, refreshed every second
```
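For monitoring from code rather than a terminal, `nvidia-smi` can emit machine-readable CSV via `--query-gpu=utilization.gpu,memory.used,memory.total --format=csv,noheader,nounits`. A minimal parsing sketch, assuming that query format (the function name is my own):

```python
# Parse one line of output from:
#   nvidia-smi --query-gpu=utilization.gpu,memory.used,memory.total \
#              --format=csv,noheader,nounits
# into a dict of integers (percent and MiB).
def parse_gpu_stats(csv_line: str) -> dict:
    util, mem_used, mem_total = (int(v.strip()) for v in csv_line.split(","))
    return {"util_pct": util, "mem_used_mib": mem_used, "mem_total_mib": mem_total}

# Example line in the shape nvidia-smi prints (values are illustrative):
print(parse_gpu_stats("37, 8192, 24576"))
```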
13. vLLM Environment Configuration

```bash
conda create --name vllm_env python=3.11
conda activate vllm_env
pip install vllm -i https://pypi.tuna.tsinghua.edu.cn/simple
```
14. API服务启动脚本(run_vllm.sh)
#!/bin/bash
# 激活conda环境
conda activate vllm_env
# 启动API服务(后台运行)
python -m vllm.entrypoints.openai.api_server \
--model /home/hzw/Documents/Meta-Llama-3.1-8B1212 \
--tensor-parallel-size 1 \
--gpu-memory-utilization 0.9 \
--max-model-len 16384 \
--uvicorn-log-level debug &
PYTHON_PID=$! # 获取进程ID
# 运行客户端
/home/hzw/Documents/AIRequestClient
# 检查客户端执行状态
if [ $? -eq 0 ]; then
echo "客户端执行成功"
else
echo "客户端执行失败"
exit 1
fi
# 终止API服务
kill $PYTHON_PID
if [ $? -eq 0 ]; then
echo "API服务已终止"
else
echo "终止API服务失败"
fi
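One caveat with the script above: the client is launched immediately, but loading the model can take a while; vLLM's OpenAI-compatible server exposes a `/health` endpoint that returns 200 once it is ready. A minimal readiness poller, sketched in Python with an injectable probe so the logic stands alone without a live server (the function name is my own):

```python
import time

def wait_for_server(probe, timeout_s: float = 120.0, interval_s: float = 1.0) -> bool:
    """Poll `probe()` until it returns True or `timeout_s` elapses.

    In practice the probe would be an HTTP GET against
    http://localhost:8000/health, e.g.:
      probe = lambda: urllib.request.urlopen(
          "http://localhost:8000/health").status == 200
    """
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        if probe():
            return True
        time.sleep(interval_s)
    return False
```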
15. 脚本权限设置
chmod +x run_vllm.sh # 添加执行权限
./run_vllm.sh # 运行脚本
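Once the script is running, the service speaks the OpenAI-compatible completions API (port 8000 by default). A minimal sketch of a request payload; the prompt and parameters are illustrative, and the actual send is commented out so the snippet stands alone:

```python
import json

# Build an OpenAI-compatible completion request for the vLLM server.
payload = {
    "model": "/home/hzw/Documents/Meta-Llama-3.1-8B1212",  # must match --model
    "prompt": "Hello, world!",
    "max_tokens": 64,
    "temperature": 0.7,
}
body = json.dumps(payload)
print(body)

# To actually send it (assuming the default port 8000):
# import urllib.request
# req = urllib.request.Request(
#     "http://localhost:8000/v1/completions", data=body.encode(),
#     headers={"Content-Type": "application/json"})
# print(urllib.request.urlopen(req).read().decode())
```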
Key Notes:
- Every `-i https://pypi.tuna.tsinghua.edu.cn/simple` flag uses the Tsinghua mirror to speed up downloads.
- Pick packages that match your actual CUDA version (e.g. 12.4 → cu124) and PyTorch version (e.g. 2.5.0).
- Replace the path /home/hzw/Documents/ with the actual location of your model.
- Use `watch -n 1 nvidia-smi` to monitor GPU memory usage in real time.