Installing and Using IndexTTS2

1. References

github.com/index-tts/index-tts

2. Download the Source Code

git clone https://github.com/index-tts/index-tts.git
cd index-tts
git lfs pull  # download large repository files

If the clone fails, download the zip archive and extract it instead. I ran the commands above on a Windows machine and then used Xftp to transfer the entire index-tts directory to the Linux machine.

The project officially recommends installing with the uv tool, so uv is used here to install dependencies and run the program.

conda create -n index-tts2 python=3.10
conda activate index-tts2
pip install -U uv -i https://mirrors.aliyun.com/pypi/simple
If pip fails, download the installer script from https://astral.sh/uv/install.sh, then make it executable and run it: chmod +x install.sh && ./install.sh

# Install all required dependencies plus every optional dependency group declared in the project config (pyproject.toml etc.), and generate a lock file for a reproducible environment
uv sync --all-extras
If uv sync --all-extras fails, try one of the following two commands instead:
uv sync --all-extras --default-index "https://mirrors.aliyun.com/pypi/simple"
uv sync --all-extras --default-index "https://mirrors.tuna.tsinghua.edu.cn/pypi/web/simple"

3. Download the Models

# Install the modelscope tool
uv tool install "modelscope"
If the install is slow, use this command instead:
uv tool install "modelscope" --default-index "https://mirrors.aliyun.com/pypi/simple"

The install may print the following warning:
warning: `/root/.local/bin` is not on your PATH. To use installed tools, run `export PATH="/root/.local/bin:$PATH"` or `uv tool update-shell`.

Follow the hint and run:
export PATH="/root/.local/bin:$PATH"
 
# Download the IndexTTS-2 model into the checkpoints directory
modelscope download --model IndexTeam/IndexTTS-2 --local_dir checkpoints
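
The same download can be scripted from Python; a minimal sketch, assuming a modelscope release whose snapshot_download accepts local_dir (the API counterpart of the --local_dir flag above):

# Minimal sketch: download the model via the modelscope Python API
# instead of the CLI; local_dir mirrors --local_dir above.
from modelscope import snapshot_download

snapshot_download('IndexTeam/IndexTTS-2', local_dir='checkpoints')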

Besides the model above, the project also downloads a few small models automatically on first run. If your network reaches HuggingFace slowly, run the following before starting the code:
export HF_ENDPOINT="https://hf-mirror.com"
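
The mirror can also be set from inside a script; a minimal sketch, noting that HF_ENDPOINT must be set before any HuggingFace library is imported, because it is read once at import time:

import os

# Set the mirror before importing anything that pulls in huggingface_hub;
# HF_ENDPOINT is read once at import time.
os.environ["HF_ENDPOINT"] = "https://hf-mirror.com"

from indextts.infer_v2 import IndexTTS2  # import only after the mirror is set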

The model download takes quite a while; wait for it to finish.

4. Environment Requirements

CUDA 12.8 or later is required.

About 10 GB of GPU memory.
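
A quick sanity check that the environment meets these requirements, using standard PyTorch introspection (a sketch, not part of IndexTTS2):

import torch

# Report the CUDA/cuDNN versions torch runs against and the available VRAM.
print("CUDA available:", torch.cuda.is_available())
print("CUDA version:", torch.version.cuda)
print("cuDNN version:", torch.backends.cudnn.version())
if torch.cuda.is_available():
    props = torch.cuda.get_device_properties(0)
    print(f"GPU: {props.name}, VRAM: {props.total_memory / 1024**3:.1f} GiB")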

5. Usage

To run a script, you must use the uv run <file.py> command so the code executes inside your current uv environment. Sometimes you may also need to add the current directory to your PYTHONPATH so the program can find the IndexTTS modules:

cd /data4/index-tts/

PYTHONPATH="$PYTHONPATH:." uv run indextts/infer_v2.py

PYTHONPATH="$PYTHONPATH:." uv run indextts/infer_cctv.py

The first run is fairly slow; be patient, as it downloads additional models, e.g. model.safetensors (2.32 GB).

RTF: 0.6080
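
RTF is the real-time factor: synthesis time divided by the duration of the generated audio, so values below 1.0 mean synthesis runs faster than playback. A minimal sketch of measuring it yourself, reusing the API from the examples below and assuming gen.wav is a standard PCM WAV file:

import time
import wave

from indextts.infer_v2 import IndexTTS2

tts = IndexTTS2(cfg_path="checkpoints/config.yaml", model_dir="checkpoints", use_fp16=False, use_cuda_kernel=False, use_deepspeed=False)

start = time.perf_counter()
tts.infer(spk_audio_prompt='examples/voice_01.wav', text="Translate for me, what is a surprise!", output_path="gen.wav")
elapsed = time.perf_counter() - start

# Audio duration = number of frames / sample rate.
with wave.open("gen.wav", "rb") as w:
    duration = w.getnframes() / w.getframerate()

print(f"RTF: {elapsed / duration:.4f}")  # < 1.0 means faster than real time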

5.1. Synthesize new speech from a single reference audio file (voice cloning):

from indextts.infer_v2 import IndexTTS2
tts = IndexTTS2(cfg_path="checkpoints/config.yaml", model_dir="checkpoints", use_fp16=False, use_cuda_kernel=False, use_deepspeed=False)
text = "Translate for me, what is a surprise!"
tts.infer(spk_audio_prompt='examples/voice_01.wav', text=text, output_path="gen.wav", verbose=True)

5.2. Use a separate, emotion-laden reference audio file to condition the synthesis:

from indextts.infer_v2 import IndexTTS2
tts = IndexTTS2(cfg_path="checkpoints/config.yaml", model_dir="checkpoints", use_fp16=False, use_cuda_kernel=False, use_deepspeed=False)
text = "酒楼丧尽天良,开始借机竞拍房间,哎,一群蠢货。"
tts.infer(spk_audio_prompt='examples/voice_07.wav', text=text, output_path="gen.wav", emo_audio_prompt="examples/emo_sad.wav", verbose=True)

5.3. When an emotion reference audio file is specified, you can optionally set the emo_alpha parameter to control how strongly it influences the result. Valid values range from 0.0 to 1.0; the default is 1.0 (i.e. 100%):

from indextts.infer_v2 import IndexTTS2
tts = IndexTTS2(cfg_path="checkpoints/config.yaml", model_dir="checkpoints", use_fp16=False, use_cuda_kernel=False, use_deepspeed=False)
text = "酒楼丧尽天良,开始借机竞拍房间,哎,一群蠢货。"
tts.infer(spk_audio_prompt='examples/voice_07.wav', text=text, output_path="gen.wav", emo_audio_prompt="examples/emo_sad.wav", emo_alpha=0.9, verbose=True)

5.4. You can also omit the emotion reference audio and instead provide a list of 8 floats specifying the strength of each emotion, in this order: [happy, angry, sad, afraid, disgusted, melancholic, surprised, calm]. In addition, the use_random parameter introduces randomness during inference; it defaults to False, and setting it to True enables it (see the helper sketch after the note below):

from indextts.infer_v2 import IndexTTS2
tts = IndexTTS2(cfg_path="checkpoints/config.yaml", model_dir="checkpoints", use_fp16=False, use_cuda_kernel=False, use_deepspeed=False)
text = "哇塞!这个爆率也太高了!欧皇附体了!"
tts.infer(spk_audio_prompt='examples/voice_10.wav', text=text, output_path="gen.wav", emo_vector=[0, 0, 0, 0, 0, 0, 0.45, 0], use_random=False, verbose=True)

Note: enabling random sampling reduces the voice-cloning fidelity of the synthesized speech.

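If the positional vector is hard to read, you can build it from emotion names instead; make_emo_vector below is a hypothetical convenience helper (not part of the IndexTTS2 API), written against the order documented above:

# Hypothetical helper (not part of the IndexTTS2 API): builds the 8-element
# emotion vector from keyword arguments, in the documented order.
EMOTIONS = ["happy", "angry", "sad", "afraid", "disgusted", "melancholic", "surprised", "calm"]

def make_emo_vector(**weights: float) -> list[float]:
    """E.g. make_emo_vector(surprised=0.45) -> [0, 0, 0, 0, 0, 0, 0.45, 0]."""
    return [float(weights.get(name, 0.0)) for name in EMOTIONS]

emo_vector = make_emo_vector(surprised=0.45)  # same vector as the example above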

5.5. Alternatively, you can enable the use_emo_text parameter to drive the emotion from the text script you provide; the script is then converted into an emotion vector automatically. In text-emotion mode it is recommended to set emo_alpha to roughly 0.6 (or lower) so the speech sounds more natural. Randomness can again be introduced via use_random (default False; True enables it):

from indextts.infer_v2 import IndexTTS2
tts = IndexTTS2(cfg_path="checkpoints/config.yaml", model_dir="checkpoints", use_fp16=False, use_cuda_kernel=False, use_deepspeed=False)
text = "快躲起来!是他要来了!他要来抓我们了!"
tts.infer(spk_audio_prompt='examples/voice_12.wav', text=text, output_path="gen.wav", emo_alpha=0.6, use_emo_text=True, use_random=False, verbose=True)

5.6. You can also pass a specific emotion description directly via the emo_text parameter; it is converted into an emotion vector automatically. This lets you control the text script and the emotion description independently:

from indextts.infer_v2 import IndexTTS2
tts = IndexTTS2(cfg_path="checkpoints/config.yaml", model_dir="checkpoints", use_fp16=False, use_cuda_kernel=False, use_deepspeed=False)
text = "快躲起来!是他要来了!他要来抓我们了!"
emo_text = "你吓死我了!你是鬼吗?"
tts.infer(spk_audio_prompt='examples/voice_12.wav', text=text, output_path="gen.wav", emo_alpha=0.6, use_emo_text=True, emo_text=emo_text, use_random=False, verbose=True)

Note: IndexTTS2 still supports mixed modeling of Chinese characters and pinyin. When precise pronunciation control is needed, provide text with explicit pinyin annotations to enable pinyin control. Note that pinyin control does not cover every possible initial-final combination; only valid Mandarin pinyin syllables are supported. See checkpoints/pinyin.vocab for the full list of valid entries. For example:

之前你做DE5很好,所以这一次也DEI3做DE2很好才XING2,如果这次目标完成得不错的话,我们就直接打DI1去银行取钱。
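
A minimal sketch passing that annotated script to infer, reusing the call pattern from the examples above (the reference voice file is just a placeholder):

from indextts.infer_v2 import IndexTTS2

tts = IndexTTS2(cfg_path="checkpoints/config.yaml", model_dir="checkpoints", use_fp16=False, use_cuda_kernel=False, use_deepspeed=False)
# Pinyin tags such as DE5 / DEI3 / XING2 pin the pronunciation of the
# corresponding characters; valid entries are listed in checkpoints/pinyin.vocab.
text = "之前你做DE5很好,所以这一次也DEI3做DE2很好才XING2,如果这次目标完成得不错的话,我们就直接打DI1去银行取钱。"
tts.infer(spk_audio_prompt='examples/voice_01.wav', text=text, output_path="gen.wav", verbose=True)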

6. Reinstalling CUDA and cuDNN

The following error occurred while running inference:

Traceback (most recent call last):
  File "/data4/index-tts/indextts/infer_v2.py", line 842, in <module>
    tts.infer(spk_audio_prompt=prompt_wav, text=text, output_path="gen.wav", verbose=True)
  File "/data4/index-tts/indextts/infer_v2.py", line 372, in infer
    return list(self.infer_generator(
  File "/data4/index-tts/indextts/infer_v2.py", line 444, in infer_generator
    spk_cond_emb = self.get_emb(input_features, attention_mask)
  File "/data4/index-tts/.venv/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 120, in decorate_context
    return func(*args, **kwargs)
  File "/data4/index-tts/indextts/infer_v2.py", line 219, in get_emb
    vq_emb = self.semantic_model(
  File "/data4/index-tts/.venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1773, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/data4/index-tts/.venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1784, in _call_impl
    return forward_call(*args, **kwargs)
  File "/data4/index-tts/.venv/lib/python3.10/site-packages/transformers/models/wav2vec2_bert/modeling_wav2vec2_bert.py", line 1027, in forward
    encoder_outputs = self.encoder(
  File "/data4/index-tts/.venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1773, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/data4/index-tts/.venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1784, in _call_impl
    return forward_call(*args, **kwargs)
  File "/data4/index-tts/.venv/lib/python3.10/site-packages/transformers/models/wav2vec2_bert/modeling_wav2vec2_bert.py", line 533, in forward
    layer_outputs = layer(
  File "/data4/index-tts/.venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1773, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/data4/index-tts/.venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1784, in _call_impl
    return forward_call(*args, **kwargs)
  File "/data4/index-tts/.venv/lib/python3.10/site-packages/transformers/models/wav2vec2_bert/modeling_wav2vec2_bert.py", line 452, in forward
    hidden_states = self.conv_module(hidden_states, attention_mask=conv_attention_mask)
  File "/data4/index-tts/.venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1773, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/data4/index-tts/.venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1784, in _call_impl
    return forward_call(*args, **kwargs)
  File "/data4/index-tts/.venv/lib/python3.10/site-packages/transformers/models/wav2vec2_bert/modeling_wav2vec2_bert.py", line 208, in forward
    hidden_states = self.pointwise_conv1(hidden_states)
  File "/data4/index-tts/.venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1773, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/data4/index-tts/.venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1784, in _call_impl
    return forward_call(*args, **kwargs)
  File "/data4/index-tts/.venv/lib/python3.10/site-packages/torch/nn/modules/conv.py", line 371, in forward
    return self._conv_forward(input, self.weight, self.bias)
  File "/data4/index-tts/.venv/lib/python3.10/site-packages/torch/nn/modules/conv.py", line 366, in _conv_forward
    return F.conv1d(
RuntimeError: cuDNN error: CUDNN_STATUS_NOT_INITIALIZED

6.1. Reinstall CUDA 12.9.1

Check the installed GPU driver version with nvidia-smi:

Driver Version: 570.181

Then install a CUDA version that matches the driver:

The driver/CUDA compatibility table at docs.nvidia.com/cuda/cuda-t… shows which CUDA releases each driver version supports:

(Screenshot: driver_cuda.png, driver/CUDA compatibility table)

So a 12.x release of CUDA is needed.

Then go to developer.nvidia.com/cuda-toolki… and run the installation commands for your platform:

wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2204/x86_64/cuda-ubuntu2204.pin
sudo mv cuda-ubuntu2204.pin /etc/apt/preferences.d/cuda-repository-pin-600
wget https://developer.download.nvidia.com/compute/cuda/12.9.1/local_installers/cuda-repo-ubuntu2204-12-9-local_12.9.1-575.57.08-1_amd64.deb
sudo dpkg -i cuda-repo-ubuntu2204-12-9-local_12.9.1-575.57.08-1_amd64.deb
sudo cp /var/cuda-repo-ubuntu2204-12-9-local/cuda-*-keyring.gpg /usr/share/keyrings/
sudo apt-get update
sudo apt-get -y install cuda-toolkit-12-9

6.2. Reinstall cuDNN

On developer.nvidia.com/cudnn-downl… the final install commands differ depending on the options you select:

(Screenshot: cuDNN_download.png, cuDNN download selector)

wget https://developer.download.nvidia.com/compute/cudnn/9.17.0/local_installers/cudnn-local-repo-ubuntu2204-9.17.0_1.0-1_amd64.deb
sudo dpkg -i cudnn-local-repo-ubuntu2204-9.17.0_1.0-1_amd64.deb
sudo cp /var/cudnn-local-repo-ubuntu2204-9.17.0/cudnn-*-keyring.gpg /usr/share/keyrings/
sudo apt-get update
sudo apt-get -y install cudnn

To install the build for CUDA 12, perform the configuration above but install the CUDA 12-specific package:

sudo apt-get -y install cudnn9-cuda-12

After reinstalling CUDA and cuDNN, run the uv sync once more:

uv sync --all-extras --default-index "https://mirrors.aliyun.com/pypi/simple"
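
Afterwards, you can verify that cuDNN initializes correctly by running a small convolution on the GPU; this sanity-check sketch exercises the same F.conv1d path that failed in the traceback above:

import torch
import torch.nn.functional as F

# A tiny conv1d on the GPU: this fails with CUDNN_STATUS_NOT_INITIALIZED when
# cuDNN is broken, and prints a tensor shape when the stack is healthy.
x = torch.randn(1, 4, 16, device="cuda")
w = torch.randn(8, 4, 3, device="cuda")
print(F.conv1d(x, w).shape)  # expected: torch.Size([1, 8, 14])
print("cuDNN version:", torch.backends.cudnn.version())

Save it as, say, check_cudnn.py and run it with uv run check_cudnn.py so it executes inside the project's uv environment.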