Common environment setup errors when running PaddlePaddle (飞桨) and PaddleNLP


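I have never been very fond of Baidu's products, but the open-source paddlenlp models are genuinely useful. Because I was unfamiliar with the stack, I ran into several problems while setting up the environment. I am recording them here in the hope that it helps others who hit the same issues.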

Problem 1:

Error: ../paddle/phi/kernels/gpu/embedding_kernel.cu:45 Assertion id < N failed. Id should smaller than 40000 but received an id value: 4342587581546123388.
Error: ../paddle/phi/kernels/gpu/embedding_kernel.cu:45 Assertion id < N failed. Id should smaller than 40000 but received an id value: 4342587581546123388.
Error: ../paddle/phi/kernels/gpu/embedding_kernel.cu:45 Assertion id < N failed. Id should smaller than 40000 but received an id value: 4342587581546123388.
[Hint: 'CUBLAS_STATUS_NOT_INITIALIZED'.  The cuBLAS library was not initialized. This is usually caused by the lack of a prior cublasCreate() call, an error in the CUDA Runtime API called by the cuBLAS routine, or an error in the hardware setup.  To correct: call cublasCreate() prior to the function call; and check that the hardware, an appropriate version of the driver, and the cuBLAS library are correctly installed.  ] (at ..\paddle\phi\backends\gpu\gpu_resources.cc:235)
[operator < linear > error]

This happens when the installed paddlepaddle-gpu and cudatoolkit versions are too old for the machine.

The official command for installing paddlepaddle-gpu with conda is:

conda install paddlepaddle-gpu==2.6.1 cudatoolkit=11.7 -c https://mirrors.tuna.tsinghua.edu.cn/anaconda/cloud/Paddle/ -c conda-forge

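The cudatoolkit=11.7 argument shows that this build targets CUDA 11.7. Run nvidia-smi to check the CUDA version on your machine (the header shows the highest CUDA version the installed driver supports); if you have updated the driver, it is very likely higher than 11.7. With an older card, such as the 20 series, you can try lowering the local CUDA version. On my laptop with a 2050, the highest CUDA version I could install was 12.0; upgrading beyond that produced errors.

With a newer card, such as the 40 series, I recommend installing the latest paddlepaddle-gpu with the following command: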
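The nvidia-smi check can also be scripted. The following is a minimal sketch, assuming the nvidia-smi header contains a "CUDA Version: X.Y" field; the function names parse_cuda_version and driver_cuda_version are my own and not part of any Paddle or NVIDIA tool.

```python
import re
import subprocess


def parse_cuda_version(smi_output):
    """Extract the 'CUDA Version: X.Y' field from nvidia-smi text.

    Returns (major, minor) as ints, or None if the field is absent.
    """
    match = re.search(r"CUDA Version:\s*(\d+)\.(\d+)", smi_output)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def driver_cuda_version():
    """Run nvidia-smi and parse its header; None if it cannot be run."""
    try:
        result = subprocess.run(
            ["nvidia-smi"], capture_output=True, text=True, check=True
        )
    except (OSError, subprocess.CalledProcessError):
        # nvidia-smi is missing or failed (no driver, no GPU, ...).
        return None
    return parse_cuda_version(result.stdout)


if __name__ == "__main__":
    print("CUDA version reported by nvidia-smi:", driver_cuda_version())
```

If the reported version is lower than the cudatoolkit version in the install command, the driver is probably the first thing to update.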

conda install paddlepaddle-gpu==3.0.0b1 paddlepaddle-cuda=12.3 -c paddle -c nvidia 

On my 4070, lowering the machine's CUDA version did not help; the only fix was to upgrade paddlepaddle-gpu.

Problem 2:
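After reinstalling, it helps to confirm which CUDA and cuDNN versions the installed Paddle build was compiled against. This is a rough sketch using paddle.version.cuda(), paddle.version.cudnn() and paddle.utils.run_check(); the wrapper function paddle_build_info is a hypothetical helper name, and it returns None when Paddle cannot be imported.

```python
def paddle_build_info():
    """Report the versions the installed Paddle build was compiled with.

    Returns None when paddle cannot be imported at all.
    """
    try:
        import paddle
    except (ImportError, OSError):
        return None
    return {
        "paddle": paddle.__version__,
        "compiled_with_cuda": paddle.device.is_compiled_with_cuda(),
        "cuda": paddle.version.cuda(),    # CUDA version of the build
        "cudnn": paddle.version.cudnn(),  # cuDNN version of the build
    }


if __name__ == "__main__":
    info = paddle_build_info()
    print(info)
    if info is not None:
        import paddle

        # Paddle's built-in self test: runs a small computation on the GPU.
        paddle.utils.run_check()
```

Comparing the "cuda" value with what nvidia-smi reports shows quickly whether the Paddle build expects a newer CUDA than the driver supports.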

W0812 10:38:31.533855  8368 gpu_resources.cc:119] Please NOTE: device: 0, GPU Compute Capability: 8.9, Driver API Version: 12.6, Runtime API Version: 12.3
W0812 10:38:31.534919  8368 dynamic_loader.cc:313] Note: [Recommend] copy cudnn into CUDA installation directory.
For instance, download cudnn-10.0-windows10-x64-v7.6.5.32.zip from NVIDIA's official website,
then, unzip it and copy it into C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v10.0
You should do this according to your CUDA installation directory and CUDNN version.

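This warning means that CUDA is not installed on the machine, but the fix suggested in the message is not much help. Download and install the matching CUDA version from: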

developer.nvidia.com/cuda-toolki…

If you do not know which version to install, run nvidia-smi to check the driver version currently installed, then look it up at:

docs.nvidia.com/cuda/cuda-t…

to see which CUDA versions that driver supports.

(Figure: driver version to CUDA version compatibility table)

Problem 3:

RuntimeError: (PreconditionNotMet) The third-party dynamic library (cudnn64_9.dll) that Paddle depends on is not configured correctly. (error code is 126)
Suggestions:

Check if the third-party dynamic library (e.g. CUDA, CUDNN) is installed correctly and its version is matched with paddlepaddle you installed.
Configure third-party dynamic library environment variables as follows:


Linux: set LD_LIBRARY_PATH by export LD_LIBRARY_PATH=...
Windows: set PATH by set PATH=XXX;%PATH%
Mac: set  DYLD_LIBRARY_PATH by export DYLD_LIBRARY_PATH=... [Note: After Mac OS 10.11, using the DYLD_LIBRARY_PATH is impossible unless System Integrity Protection (SIP) is disabled.] (at C:\home\workspace\Paddle\paddle\phi\backends\dynload\dynamic_loader.cc:340)
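The log suggests checking PATH on Windows. As a quick sanity check, this small sketch lists the PATH entries that actually contain the DLL named in the error. find_on_path is a hypothetical helper name of my own, and the script only inspects PATH, not the other places Paddle may search.

```python
import os


def find_on_path(filename, path_value=None):
    """Return the PATH entries that contain a file named `filename`.

    `path_value` defaults to the current PATH environment variable.
    """
    if path_value is None:
        path_value = os.environ.get("PATH", "")
    hits = []
    for entry in path_value.split(os.pathsep):
        if entry and os.path.isfile(os.path.join(entry, filename)):
            hits.append(entry)
    return hits


if __name__ == "__main__":
    # The DLL name is taken from the error message above.
    print(find_on_path("cudnn64_9.dll"))
```

An empty list means no PATH directory contains the file, which is consistent with the error.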

This problem may appear together with problem 2. Installing CUDA is not enough; cuDNN must be installed as well. Download links: developer.nvidia.com/rdp/cudnn-a…

and

developer.nvidia.com/cudnn-downl…

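Then pick the cuDNN version that matches your CUDA version. The error here names cudnn64_9.dll, so version 9 is required; it is not listed at the first link and has to be downloaded from the second. The download is a zip archive. Copy the files under lib/x64 in the archive into Lib\site-packages\paddle\libs in the Python environment you are using.

Summary:

For paddlenlp to run, the GPU driver, the CUDA version, the cuDNN version, and the paddlenlp version all have to match one another. Newcomers should pay close attention to this.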
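The copy step can be scripted so that it targets the right Python environment. This is a minimal sketch: copy_libs and paddle_libs_dir are hypothetical helper names, the source directory is whatever folder you extracted from the cuDNN zip, and the paddle/libs location is found with importlib without importing Paddle itself.

```python
import importlib.util
import shutil
from pathlib import Path


def paddle_libs_dir():
    """Locate <site-packages>/paddle/libs without importing paddle."""
    spec = importlib.util.find_spec("paddle")
    if spec is None or spec.origin is None:
        return None
    return Path(spec.origin).parent / "libs"


def copy_libs(src_dir, dst_dir, pattern="*"):
    """Copy files matching `pattern` from src_dir into dst_dir.

    Returns the sorted list of copied file names.
    """
    src, dst = Path(src_dir), Path(dst_dir)
    if not dst.is_dir():
        raise FileNotFoundError(f"destination does not exist: {dst}")
    copied = []
    for path in sorted(src.glob(pattern)):
        if path.is_file():
            shutil.copy2(path, dst / path.name)
            copied.append(path.name)
    return copied
```

A call would look like copy_libs(r"C:\Downloads\cudnn\lib\x64", paddle_libs_dir()), where the first path is only a placeholder for wherever the archive was unpacked. Run it with the same Python interpreter that has paddlepaddle-gpu installed, so that paddle_libs_dir() points at the right environment.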