vLLM model inference error: RuntimeError: Cannot re-initialize CUDA in forked subprocess. To use


Error:

RuntimeError: Cannot re-initialize CUDA in forked subprocess. To use CUDA with multiprocessing, you must use the 'spawn' start method
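The message refers to Python's multiprocessing start methods. With 'fork', the child inherits the parent's already-initialized CUDA state, which CUDA cannot reuse; with 'spawn', the child starts as a fresh interpreter. A minimal sketch using only the standard library is shown here. The names `worker` and `run_in_spawned_child` are made up for illustration, and no CUDA call is actually made.

```python
import multiprocessing as mp


def worker(x):
    # Runs in the child process. In real code, this is where CUDA
    # would be initialized for the first time in this process.
    print(f"child computed {x * x}")


def run_in_spawned_child(x):
    # Ask for a 'spawn' context explicitly instead of relying on the
    # platform default (which has been 'fork' on Linux).
    ctx = mp.get_context("spawn")
    p = ctx.Process(target=worker, args=(x,))
    p.start()
    p.join()
    return p.exitcode


if __name__ == "__main__":
    # The guard matters with 'spawn': the child re-imports this module,
    # and unguarded code would run again in every child.
    print("exit code:", run_in_spawned_child(3))
```

Using `mp.get_context("spawn")` keeps the change local; `mp.set_start_method("spawn")` would change it for the whole program and may only be called once.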

Solution

See the vLLM GitHub issue: [Bug]: When tensor_parallel_size>1, RuntimeError: Cannot re-initialize CUDA in forked subprocess. · Issue #6152 · vllm-project/vllm (github.com)
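A workaround commonly suggested for this error is to make vLLM start its worker processes with 'spawn' by setting the `VLLM_WORKER_MULTIPROC_METHOD` environment variable before vLLM is imported. The sketch below assumes your vLLM version reads that variable; whether it matches the exact fix discussed in the issue for your setup is also an assumption. The helper `build_engine` is a hypothetical name, not part of vLLM.

```python
import os

# Assumption: vLLM reads this variable to choose how worker processes
# are started. Set it before importing vllm or creating the engine.
os.environ["VLLM_WORKER_MULTIPROC_METHOD"] = "spawn"


def build_engine(model_name, tensor_parallel_size=2):
    # Imported lazily so the environment variable above is already in
    # place when vllm is loaded. Requires vllm and enough GPUs.
    from vllm import LLM

    return LLM(model=model_name, tensor_parallel_size=tensor_parallel_size)
```

Call it from under an `if __name__ == "__main__":` guard, for example `llm = build_engine("<your-model>")`, where the model name is a placeholder. The same setting can be applied from the shell with `export VLLM_WORKER_MULTIPROC_METHOD=spawn` before launching the script. It also helps to avoid touching CUDA in the parent process (for example calling `torch.cuda.is_available()`) before the engine is created.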