Adapting LangChain to Zhipu GLM-4, a Pitfall Guide: Fixing 429 Errors and Infinite Loops Step by Step


Introduction

I've recently been working through the official LangChain tutorials, which use Claude (Anthropic) by default. For better access stability in China and lower cost, I wanted to swap the model for Zhipu AI's GLM-4, a standout among domestic models.

I assumed it would just be a matter of changing the API key and model name, but it took five pitfalls before the pipeline finally ran. To save you the detours, here is the whole journey, from environment setup to prompt tuning.


Pitfall 1: PyCharm Doesn't Pick Up Environment Variables

Symptom: Even after running set ANTHROPIC_API_KEY=xxx in the terminal (or setting a system environment variable), the script fails immediately with:

TypeError: Could not resolve authentication method... Expected api_key...

Fix: Inject the key explicitly at the very top of your entry script, before the imports that need it:

import os
# Crude, but it works
os.environ["ZHIPUAI_API_KEY"] = "YOUR_KEY_ID.YOUR_SECRET"
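Hardcoding a key risks leaking it into version control. If you'd rather keep it out of the source, a small variant (assuming the python-dotenv package is installed) is to load it from a local .env file:

# pip install python-dotenv
from dotenv import load_dotenv

# Reads ZHIPUAI_API_KEY=... from a .env file in the working directory
load_dotenv()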

Pitfall 2: init_chat_model Doesn't Support Zhipu

Symptom: LangChain 0.3 introduced a handy factory function, init_chat_model, so I tried:

model = init_chat_model("glm-4", model_provider="zhipuai", ...)

which raised:

ValueError: Unsupported provider='zhipuai'. Supported model providers are: anthropic, openai...

Cause: As of this writing, the generic factory function's provider whitelist does not include a Zhipu adapter.

Fix: Skip the generic factory function and use the dedicated class from the community package directly:

# 1. Install the community packages
# pip install -qU langchain-community zhipuai

# 2. Import the dedicated class explicitly
from langchain_community.chat_models import ChatZhipuAI

model = ChatZhipuAI(model="glm-4", temperature=0.5)
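Before wiring the model into an agent, a quick smoke test confirms the key and package actually work (assuming ZHIPUAI_API_KEY is already set):

# A plain string is treated as a single user message
reply = model.invoke("Hello! Introduce yourself in one sentence.")
print(reply.content)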

Pitfall 3: Structured Output (response_format) Compatibility

Symptom: I wanted the agent to return structured JSON, so I used the response_format parameter:

agent = create_agent(
    ...,
    response_format=ToolStrategy(ResponseFormat), # force structured output
)

This raised:

ValueError: ChatZhipuAI currently only supports 'auto' tool choice

Cause: LangChain's ToolStrategy typically enforces structured output by forcing the model to call a specific tool (tool_choice="tool_name"). Zhipu's API, however, is strict about tool_choice: it currently supports only auto and does not allow forcing a specific tool.

Fix: Give up on enforcing structure at the create_agent level for now, fall back to plain natural-language replies, and extract the content from messages:

  1. Remove the response_format parameter.

  2. Read the result like this:

    response = agent.invoke(...)
    # content of the last reply
    print(response["messages"][-1].content)
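If you still need machine-readable output, one workaround is to ask for JSON in the prompt and validate the reply yourself. A minimal sketch, assuming Pydantic v2 and a hypothetical WeatherReport schema:

import json
from pydantic import BaseModel, ValidationError

class WeatherReport(BaseModel):  # hypothetical schema, for illustration only
    location: str
    forecast: str

raw = response["messages"][-1].content
try:
    # Only works if the prompt told the model to answer with bare JSON
    report = WeatherReport.model_validate(json.loads(raw))
except (json.JSONDecodeError, ValidationError):
    report = None  # the model ignored the format; fall back to plain text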
    

Pitfall 4: 429 Too Many Requests (API Rate Limiting)

Symptom: The code logic was finally correct, but it failed as soon as it ran:

httpx.HTTPStatusError: Client error '429 Too Many Requests'

Cause: I was using the glm-4 model. On free or standard developer accounts, the flagship model's QPS (queries per second) limit is very tight, and a single agent run can fire several inference calls back to back, blowing past the limit instantly.

Fix: Switch to the Flash model, which is fast, robust, and cheap (even free):

model = ChatZhipuAI(
    model="glm-4-flash", # <--- 神器
    temperature=0.1
)
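If you have to stay on the main model, wrapping the call in a simple exponential backoff also gets past transient 429s. A rough sketch (invoke_with_retry is my own helper, not a LangChain API):

import time
import httpx

def invoke_with_retry(agent, payload, max_retries=3, **kwargs):
    """Retry agent.invoke with exponential backoff when the API returns 429."""
    for attempt in range(max_retries):
        try:
            return agent.invoke(payload, **kwargs)
        except httpx.HTTPStatusError as exc:
            if exc.response.status_code != 429 or attempt == max_retries - 1:
                raise
            time.sleep(2 ** attempt)  # back off 1s, 2s, 4s...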

Pitfall 5: The Model "Slacking Off" and Breaking the Chain (Prompt Engineering)

Symptom: This was the most frustrating one. The code no longer errored, but the agent's reasoning fell apart:

  1. Infinite follow-up loop: I asked it to look up "my location". The tool definition mentioned a user_id (which the code actually supplies via context), yet the model kept bouncing the question back: "Please provide your User ID."
  2. Reasoning cut short: the model found the location (e.g., "Florida") and then simply stopped, saying: "I know you're in Florida, but I can't provide real-time weather. Goodbye."

Cause: Compared with Claude 3.5 or GPT-4, GLM-4-Flash is weaker at instruction following and multi-step reasoning (chain of thought), so it needs more explicit guidance.

Fix

  1. Rewrite the tool docstring (in effect, "trick" the model):

    • Old docstring: Retrieve user info based on user ID.
    • New docstring: IMPORTANT: This tool requires NO arguments. The system handles ID automatically.
  2. Add a "workflow" to the system prompt: walk the model through the job step by step:

SYSTEM_PROMPT = """You are a helpful assistant acting as a weather forecaster, who speaks in puns.

YOUR WORKFLOW (MUST FOLLOW STRICTLY):
1. User asks for weather -> Call tool "get_user_location" (No arguments needed).
2. "get_user_location" returns a place (e.g., "Florida") -> IMMEDIATELY call tool "get_weather_for_location" with that place.
3. "get_weather_for_location" returns the forecast -> Output the final answer.

CRITICAL:
- Do NOT say "I cannot provide real-time weather". You HAVE a tool for that. USE IT.
- Do NOT stop after getting the location. Keep going until you get the weather.
"""

Final Working Code

After filling in the five pits above, I finally got a stable, working agent:

import os
from dataclasses import dataclass
from langchain_community.chat_models import ChatZhipuAI
from langchain.agents import create_agent
from langchain.tools import tool, ToolRuntime
from langgraph.checkpoint.memory import InMemorySaver

# Set your API key here
os.environ["ZHIPUAI_API_KEY"] = ""

SYSTEM_PROMPT = """You are a helpful assistant acting as a weather forecaster, who speaks in puns.

YOUR WORKFLOW (MUST FOLLOW STRICTLY):
1. User asks for weather -> Call tool "get_user_location" (No arguments needed).
2. "get_user_location" returns a place (e.g., "Florida") -> IMMEDIATELY call tool "get_weather_for_location" with that place.
3. "get_weather_for_location" returns the forecast -> Output the final answer.

CRITICAL:
- Do NOT say "I cannot provide real-time weather". You HAVE a tool for that. USE IT.
- Do NOT stop after getting the location. Keep going until you get the weather.
"""

@dataclass
class Context:
    user_id: str

@tool
def get_weather_for_location(location: str) -> str:
    """
    Get the weather for a specific location.
    Args:
        location: The city, state, or region name (e.g., "Florida", "New York").
    """
    print(f"DEBUG: 正在查询 {location} 的天气...")
    return f"It's always sunny in {location}!"

@tool
def get_user_location(runtime: ToolRuntime[Context]) -> str:
    """
    Get the current user's location.
    IMPORTANT: This tool requires NO arguments.
    """
    user_id = runtime.context.user_id
    print(f"DEBUG: 正在查找用户 {user_id} 的位置...")
    return "Florida" if user_id == "1" else "SF"

model = ChatZhipuAI(
    model="glm-4.7",
    temperature=0.01,
)

checkpointer = InMemorySaver()

agent = create_agent(
    model=model,
    system_prompt=SYSTEM_PROMPT,
    tools=[get_user_location, get_weather_for_location],
    context_schema=Context,
    checkpointer=checkpointer,
)

config = {
    "configurable": {
        "thread_id": "1"
    }
}

print("正在调用 Agent...")

response = agent.invoke(
    {"messages": [{"role": "user", "content": "Execute the workflow to find my location and then check the weather there."}]},
    config=config,
    context=Context(user_id="1"),
)

print("\n=== 运行结果 ===")
print(response["messages"][-1].content)

Run result:

(screenshot of the successful run omitted)
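As a side note, because the agent was built with an InMemorySaver checkpointer and a fixed thread_id, a follow-up turn on the same thread should remember the earlier exchange. For example:

# Same thread_id, so the checkpointer restores the previous messages
followup = agent.invoke(
    {"messages": [{"role": "user", "content": "What was my location again?"}]},
    config=config,
    context=Context(user_id="1"),
)
print(followup["messages"][-1].content)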

Summary

When migrating from Claude to a domestic model, adapting the code is only the first step. The bigger challenge is usually the gap in model capability, which has to be bridged with more careful prompt engineering and better tool descriptions.