Running a local LLM with ollama


Run the local model

ollama run qwen3-coder-next

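If the model has not been downloaded yet, `ollama run` pulls it on first use, but it can also be fetched and checked ahead of time. A minimal sketch, assuming the Ollama daemon is running on its default port 11434:

```shell
# Fetch the model weights ahead of time (ollama run also does this on first use)
ollama pull qwen3-coder-next

# Confirm the model is available locally
ollama list

# Sanity-check that the Ollama HTTP API answers on the default port
curl -s http://localhost:11434/api/tags
```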

Proxy the model

litellm --config config.yaml --port 4000

On this machine, litellm must be run from the master conda environment:
conda activate master

config.yaml

model_list:
  - model_name: claude-sonnet-4-5-20250929
    litellm_params:
      model: ollama/qwen3-coder-next
      api_key: sk-lmstudio
      api_base: http://localhost:11434

general_settings:
  master_key: sk-lmstudio-proxy-11434

router_settings:
  routing_strategy: simple-shuffle
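LiteLLM maps the exposed `model_name` (the Claude-style alias that Claude Code requests) to the actual local model in `litellm_params.model`. If a separate lightweight model is wanted for `ANTHROPIC_SMALL_FAST_MODEL`, further entries can be added to `model_list`. A hedged sketch — the second alias below and its backing model are assumptions, not part of the original setup:

```yaml
model_list:
  - model_name: claude-sonnet-4-5-20250929
    litellm_params:
      model: ollama/qwen3-coder-next
      api_key: sk-lmstudio
      api_base: http://localhost:11434
  # Hypothetical second alias -- point it at any model you have pulled locally
  - model_name: claude-3-5-haiku-20241022
    litellm_params:
      model: ollama/qwen3-coder-next
      api_base: http://localhost:11434
```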

If litellm is not installed yet, install it with:

pip install 'litellm[proxy]'

model_name must be a model name that Claude supports; it is the alias Claude Code asks for, and LiteLLM routes it to the local Ollama model.

Configuration file

~/.claude/settings.json

{
  "env": {
    "ANTHROPIC_AUTH_TOKEN": "sk-lmstudio-proxy-11434",
    "ANTHROPIC_BASE_URL": "http://localhost:4000",
    "ANTHROPIC_MODEL": "claude-sonnet-4-5-20250929",
    "ANTHROPIC_SMALL_FAST_MODEL": "claude-sonnet-4-5-20250929",
    "CLAUDE_CODE_DISABLE_NONESSENTIAL_TRAFFIC": "1",
    "CLAUDE_CODE_MAX_OUTPUT_TOKENS": "6000"
  },
  "permissions": {
    "allow": [],
    "deny": []
  }
}
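Before launching Claude Code, the proxy can be smoke-tested directly with an Anthropic-style request — the same kind of call Claude Code will make. A sketch, assuming the LiteLLM proxy is running on port 4000 and exposes the Anthropic-compatible /v1/messages endpoint:

```shell
# Send a minimal Anthropic-format request through the proxy
curl http://localhost:4000/v1/messages \
  -H "x-api-key: sk-lmstudio-proxy-11434" \
  -H "anthropic-version: 2023-06-01" \
  -H "content-type: application/json" \
  -d '{
    "model": "claude-sonnet-4-5-20250929",
    "max_tokens": 64,
    "messages": [{"role": "user", "content": "hello"}]
  }'
```

A JSON reply containing generated text indicates the whole chain (Claude alias → LiteLLM → Ollama) is working.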

Note: the model name set in ANTHROPIC_MODEL must be one of the model names returned by http://localhost:4000/v1/models:

curl http://localhost:4000/v1/models \
  -H "Authorization: Bearer sk-lmstudio-proxy-11434"

{"data":[{"id":"claude-sonnet-4-5-20250929","object":"model","created":1677610602,"owned_by":"openai"}],"object":"list"}
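The model ids can be pulled out of that response from the shell; a small sketch, assuming `python3` is on the PATH (the sample response above is piped in here — in practice, pipe the `curl` output instead):

```shell
# Extract the model ids from a /v1/models response
echo '{"data":[{"id":"claude-sonnet-4-5-20250929","object":"model","created":1677610602,"owned_by":"openai"}],"object":"list"}' \
  | python3 -c 'import sys, json; print("\n".join(m["id"] for m in json.load(sys.stdin)["data"]))'
# -> claude-sonnet-4-5-20250929
```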