Run the local LLM
ollama run qwen3-coder-next
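Before putting a proxy in front of it, you can confirm that Ollama is actually serving the model (this assumes Ollama's default port 11434):

```shell
# Lists the models known to the local Ollama server;
# qwen3-coder-next should appear in the output.
curl http://localhost:11434/api/tags
```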
Proxy the LLM
litellm --config config.yaml --port 4000
On this machine, switch to the master conda environment before running litellm:
conda activate master
config.yaml
model_list:
  - model_name: claude-sonnet-4-5-20250929
    litellm_params:
      model: ollama/qwen3-coder-next
      api_key: sk-lmstudio
      api_base: http://localhost:11434

general_settings:
  master_key: sk-lmstudio-proxy-11434

router_settings:
  routing_strategy: simple-shuffle
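Once the proxy is running, a quick smoke test (using the master_key from the config, and assuming the proxy is on port 4000) confirms end-to-end routing to the local model:

```shell
# Sends one chat message through the LiteLLM proxy; the alias
# claude-sonnet-4-5-20250929 is routed to ollama/qwen3-coder-next.
curl http://localhost:4000/v1/chat/completions \
  -H "Authorization: Bearer sk-lmstudio-proxy-11434" \
  -H "Content-Type: application/json" \
  -d '{"model": "claude-sonnet-4-5-20250929", "messages": [{"role": "user", "content": "hello"}]}'
```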
If litellm is not installed, install it with:
pip install 'litellm[proxy]'
The model_name must be a model name that Claude Code accepts; it acts as an alias that the proxy maps to the local Ollama model.
Configuration file
~/.claude/settings.json
{
  "env": {
    "ANTHROPIC_AUTH_TOKEN": "sk-lmstudio-proxy-11434",
    "ANTHROPIC_BASE_URL": "http://localhost:4000",
    "ANTHROPIC_MODEL": "claude-sonnet-4-5-20250929",
    "ANTHROPIC_SMALL_FAST_MODEL": "claude-sonnet-4-5-20250929",
    "CLAUDE_CODE_DISABLE_NONESSENTIAL_TRAFFIC": "1",
    "CLAUDE_CODE_MAX_OUTPUT_TOKENS": "6000"
  },
  "permissions": {
    "allow": [],
    "deny": []
  }
}
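Since these values are exported as environment variables, every value under "env" should be a JSON string (e.g. "6000", not the bare number 6000). A minimal sketch for spotting non-string values, using an inline example rather than reading ~/.claude/settings.json:

```python
import json

# Example settings mirroring ~/.claude/settings.json; the env block
# should contain only string values, since they become env variables.
settings = json.loads("""
{
  "env": {
    "ANTHROPIC_BASE_URL": "http://localhost:4000",
    "CLAUDE_CODE_DISABLE_NONESSENTIAL_TRAFFIC": 1
  }
}
""")

# Collect keys whose values are not JSON strings and need quoting.
bad = [k for k, v in settings["env"].items() if not isinstance(v, str)]
print(bad)
```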
Note: the model name set in ANTHROPIC_MODEL must be one of the model names returned by http://localhost:4000/v1/models:
curl http://localhost:4000/v1/models \
-H "Authorization: Bearer sk-lmstudio-proxy-11434"
{"data":[{"id":"claude-sonnet-4-5-20250929","object":"model","created":1677610602,"owned_by":"openai"}],"object":"list"}