The Gemini API shipped a killer feature today: Google Search plus custom functions in a single request 🔥

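Browsing the Google developer blog last night, I noticed the Gemini API had quietly dropped something big: Compound Tools. Put plainly, a single API request can now use built-in tools like Google Search and Google Maps alongside functions you write yourself.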

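Anyone who has built an Agent knows the drill: to have the AI search first and then call your backend, you had to write a pile of orchestration logic yourself, LangChain wrapping LangGraph wrapping a Router, and debugging it made you want to smash your keyboard. Now Gemini supports this directly at the API level, in one request.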

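The bottom line first

Feature	Before	Now
Tool combination	Built-in tools or custom functions, pick one	Mix them freely in the same request
Context passing	Hand-assemble JSON and shuttle it around	Context Circulation passes it along automatically
Debugging and tracing	One blob in the logs, no telling which tool returned what	Every tool call has a unique ID
Latency	Multiple requests, latency adds up	Single request, end-to-end latency down 40%+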

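What are Compound Tools?

In short: you can pass both google_search and your own function_declarations in a single generate_content call. Gemini decides on its own which to call first and which next, and passes intermediate results along automatically.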

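A concrete scenario: the user asks, "Is today a good day for an outdoor run in Beijing? Find me the nearest sports park."

Before, you had to:

1. Call Gemini + Google Search once to check the weather
2. Take the result, make the decision yourself, then call Gemini + Maps to find a park
3. Take both results and call Gemini once more to generate the answer

Three API calls, three network round trips, and glue code you write yourself.

Now: one request. Gemini orchestrates search, maps, and your functions internally, all in one go.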

Hands-on

Install the latest SDK

pip install -U google-genai

Minimal example: search + custom function

from google import genai
from google.genai import types

client = genai.Client(api_key="YOUR_API_KEY")

# Declare your business function
get_weather = {
    "name": "getWeather",
    "description": "Get the current weather for a given city",
    "parameters": {
        "type": "object",
        "properties": {
            "city": {
                "type": "string",
                "description": "City name, e.g. Beijing, Shanghai",
            },
        },
        "required": ["city"],
    },
}

# Key point: google_search and function_declarations go in the same Tool
response = client.models.generate_content(
    model="gemini-3-flash-preview",
    contents="Find out where Utqiaġvik is, and check what the weather is like there today",
    config=types.GenerateContentConfig(
        tools=[
            types.Tool(
                google_search=types.ToolGoogleSearch(),
                function_declarations=[get_weather]
            ),
        ],
        # This flag is the key: without it the tools do not share context
        include_server_side_tool_invocations=True
    ),
)

Note the include_server_side_tool_invocations=True: it is the switch for the whole feature. Without this flag, each tool runs on its own and context does not flow between them.

Parsing the mixed response

The returned response contains both the results of built-in tool calls and the call requests for your functions:

for part in response.candidates[0].content.parts:
    # Built-in tool (Google Search) calls and results
    if part.tool_call:
        print(f"[built-in tool call] {part.tool_call.tool_type}, ID: {part.tool_call.id}")
    if part.tool_response:
        print(f"[built-in tool result] {part.tool_response.tool_type}, ID: {part.tool_response.id}")

    # Calls to your custom functions (you have to execute these)
    if part.function_call:
        print(f"[function call] {part.function_call.name}, args: {part.function_call.args}")
        print(f"  call ID: {part.function_call.id}")  # needed when sending the result back

    # Final text
    if part.text:
        print(f"[answer] {part.text}")

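There is an important distinction here:

tool_call / tool_response: built-in tools. Gemini executes them server-side automatically; you don't have to do anything.
function_call: your custom function. Gemini only tells you "I want to call this function"; you have to execute it yourself and send the result back.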
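
To make that split concrete, here is a minimal sketch of a local dispatcher: it ignores built-in tool parts and runs only the function_call parts. Everything in it is an assumption for illustration: get_weather_impl, LOCAL_FUNCTIONS, and run_function_calls are hypothetical names, the tool_type value is a placeholder, and the demo parts are plain stand-in objects shaped like the loop above expects, not real SDK objects.

```python
from types import SimpleNamespace


def get_weather_impl(city: str) -> dict:
    """Hypothetical local implementation; a real one would call your backend."""
    return {"city": city, "condition": "unknown"}


# Maps each declared function name to the local callable that implements it.
LOCAL_FUNCTIONS = {"getWeather": get_weather_impl}


def run_function_calls(parts, registry=LOCAL_FUNCTIONS):
    """Execute every function_call part locally and skip everything else."""
    results = []
    for part in parts:
        call = getattr(part, "function_call", None)
        if call is None:
            continue  # tool_call / tool_response / text: nothing for us to run
        handler = registry.get(call.name)
        if handler is None:
            payload = {"error": f"unknown function: {call.name}"}
        else:
            payload = handler(**dict(call.args))
        # Keep the call ID next to the payload so it can be echoed back later.
        results.append({"id": call.id, "name": call.name, "response": payload})
    return results


# Stand-in parts: one built-in tool call, one custom function call.
demo_parts = [
    SimpleNamespace(
        tool_call=SimpleNamespace(tool_type="PLACEHOLDER_TOOL", id="t1"),
        function_call=None,
    ),
    SimpleNamespace(
        function_call=SimpleNamespace(
            name="getWeather", args={"city": "Utqiaġvik"}, id="f1"
        )
    ),
]
demo_results = run_function_calls(demo_parts)
```

Each entry in demo_results carries the id, name, and response that a function_response would need, so the mapping back to the original call is never lost.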

Multi-turn: sending the function result back

# The function call ID from the first turn (taken from response)
func_call_id = None
for part in response.candidates[0].content.parts:
    if part.function_call:
        func_call_id = part.function_call.id

# Build the full history (keep every part, including the built-in tool call records)
history = [
    types.Content(
        role="user",
        parts=[types.Part(text="Find out where Utqiaġvik is, and check what the weather is like there today")]
    ),
    # Send the first turn's full response back as-is (tool_call, tool_response, function_call)
    response.candidates[0].content,
    # The result of executing your function
    types.Content(
        role="user",
        parts=[types.Part(
            function_response=types.FunctionResponse(
                name="getWeather",
                response={"temperature": "-15°C", "condition": "blizzard", "wind": "north wind, force 6"},
                id=func_call_id  # must match the ID of the function_call
            )
        )]
    )
]

# Second-turn request
response_2 = client.models.generate_content(
    model="gemini-3-flash-preview",
    contents=history,
    config=types.GenerateContentConfig(
        tools=[
            types.Tool(
                google_search=types.ToolGoogleSearch(),
                function_declarations=[get_weather]
            ),
        ],
        include_server_side_tool_invocations=True
    ),
)

print(response_2.text)
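# Output looks something like: Utqiaġvik (formerly Barrow) is the northernmost city in the United States, located in Alaska...
# It is -15°C there today with a blizzard and a force 6 north wind; staying indoors is advised...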

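What Context Circulation actually does

This is the best part of the update. Previously, once a built-in tool call finished, its result was thrown away, and in the next turn the model had no idea what had been searched in the previous one.

Now, with include_server_side_tool_invocations turned on, every built-in tool call and its result are kept in the conversation history as parts. That means:

1. Gemini calls Google Search to look up "where is Utqiaġvik"
2. The search result is stored in the context as a tool_response
3. When it then calls your getWeather function, the model already knows the city is in Alaska
4. After you return the weather data, the model combines the search result and the weather data into the final answer

The model's chain of reasoning stays continuous across the whole process instead of being fragmented.

Gotcha: when sending the history back you must keep the id and thought_signature fields; leave out either one and you get an error. Don't ask how I know: I spent two hours debugging before I found that a 400 error came from not sending thought_signature back.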

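A realistic scenario: a "smart restaurant picker" Agent

Here's an example closer to real life. The user says, "I'm at Wangjing SOHO, find me a Japanese place under 100 yuan per person, and check its Dianping rating":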

# Declare two business functions
search_restaurant = {
    "name": "searchRestaurant",
    "description": "Search for restaurants near a given location",
    "parameters": {
        "type": "object",
        "properties": {
            "location": {"type": "string", "description": "Location"},
            "cuisine": {"type": "string", "description": "Cuisine"},
            "budget": {"type": "number", "description": "Budget per person (yuan)"}
        },
        "required": ["location", "cuisine"]
    }
}

get_rating = {
    "name": "getRating",
    "description": "Get a restaurant's rating and reviews on Dianping",
    "parameters": {
        "type": "object",
        "properties": {
            "restaurant_name": {"type": "string"},
            "city": {"type": "string"}
        },
        "required": ["restaurant_name"]
    }
}

response = client.models.generate_content(
    model="gemini-3-flash-preview",
    contents="I'm at Wangjing SOHO, find me a Japanese place under 100 yuan per person, and check its rating",
    config=types.GenerateContentConfig(
        tools=[
            types.Tool(
                google_search=types.ToolGoogleSearch(),  # real-time search
                google_maps=types.ToolGoogleMaps(),       # location data
                function_declarations=[search_restaurant, get_rating]
            ),
        ],
        include_server_side_tool_invocations=True
    ),
)

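Gemini plans the sequence automatically: locate Wangjing SOHO with Maps → search for nearby Japanese restaurants → call your searchRestaurant function for menu prices → call getRating for ratings → combine everything into a recommendation. Your two functions still come back as function_call parts that you execute and return, as in the multi-turn example above.

One request, four tools working together. Orchestrating this yourself with LangChain would take half a day for the Router logic alone.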

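How to call it from mainland China?

Direct connections to the Gemini API from mainland China are not very stable, so I route through an aggregation platform that speaks the OpenAI protocol. I currently use ofox.ai: change the base_url and you can call the whole Gemini 3 family, with noticeably lower latency than a direct connection (it goes through Alibaba Cloud nodes):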

from openai import OpenAI

# Call Gemini through the OpenAI-compatible endpoint
client = OpenAI(
    api_key="your-ofox-key",
    base_url="https://api.ofox.ai/v1"
)

response = client.chat.completions.create(
    model="gemini-3-flash",
    messages=[{"role": "user", "content": "Hello"}],
    # Note: compound tools require the native google-genai SDK
    # The OpenAI-compatible endpoint currently supports basic chat and function calling
)

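Note, though, that compound tools are currently supported only by Google's native SDK; the OpenAI-compatible endpoint can only do basic function calling for now. If your use case doesn't need to mix in the built-in Google Search/Maps tools, plain function calling over the compatible endpoint is enough.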
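
If you go the plain function calling route, the declaration dicts from earlier need a different envelope. The sketch below wraps a Gemini-style declaration in the `tools` shape used by the OpenAI Chat Completions API. to_openai_tool is a hypothetical helper name, and whether a given gateway honors the `tools` parameter for Gemini models is an assumption you would need to verify.

```python
def to_openai_tool(declaration: dict) -> dict:
    """Wrap a Gemini-style function declaration in the OpenAI `tools` envelope."""
    return {
        "type": "function",
        "function": {
            "name": declaration["name"],
            "description": declaration.get("description", ""),
            # Fall back to an empty object schema when no parameters are declared.
            "parameters": declaration.get(
                "parameters", {"type": "object", "properties": {}}
            ),
        },
    }


# Same shape as the getWeather declaration used earlier in the article.
get_weather_decl = {
    "name": "getWeather",
    "description": "Get the current weather for a given city",
    "parameters": {
        "type": "object",
        "properties": {"city": {"type": "string"}},
        "required": ["city"],
    },
}

openai_tools = [to_openai_tool(get_weather_decl)]
# Intended use (not executed here):
# client.chat.completions.create(model=..., messages=..., tools=openai_tools)
```

The JSON Schema under parameters carries over unchanged, so the same declarations can serve both code paths.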

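Gotchas

Gotcha 1: the flag name is too long to remember

The include_server_side_tool_invocations flag name is absurdly long, and I end up copying it from the docs every time. I suggest wrapping it: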

COMPOUND_TOOLS_CONFIG = types.GenerateContentConfig(
    tools=[...],
    include_server_side_tool_invocations=True  # just this one
)

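Gotcha 2: the order of response parts is not fixed

Don't assume parts[0] is always a tool_call and parts[1] is always text. In my testing the order follows the model's reasoning path and can differ from call to call. Just iterate and check each part's type.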
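
One way to stay order-independent is to bucket parts by whichever field is set before doing anything else. This is a rough sketch: group_parts and PART_KINDS are hypothetical names, and the demo parts are plain stand-in objects, not real SDK objects.

```python
from collections import defaultdict
from types import SimpleNamespace

# The part fields this article deals with; any of them may appear at any index.
PART_KINDS = ("tool_call", "tool_response", "function_call", "text")


def group_parts(parts):
    """Bucket parts by which field is populated, ignoring their position."""
    groups = defaultdict(list)
    for part in parts:
        for kind in PART_KINDS:
            value = getattr(part, kind, None)
            if value:
                groups[kind].append(value)
    return dict(groups)


# Text arrives before and after the function call in this made-up ordering.
demo = [
    SimpleNamespace(text="Looking that up."),
    SimpleNamespace(function_call=SimpleNamespace(name="getWeather")),
    SimpleNamespace(text="Done."),
]
grouped = group_parts(demo)
```

Downstream code can then read grouped.get("function_call", []) without caring where in the list those parts showed up.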

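Gotcha 3: thought_signature must be sent back unchanged

In a multi-turn conversation, any part in response.candidates[0].content may carry a thought_signature field. It is a signature over the model's internal reasoning chain and must be sent back exactly as received; drop it and you get a 400.

At first I only sent back the fields I could make sense of (text, function_call, tool_call) and filtered out thought_signature, assuming it was debug info. It took me two hours to track the problem down.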
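
If you really do have to rebuild parts by hand instead of passing the content object through untouched, a small pre-flight check can catch this before the API does. The sketch below assumes the parts have been serialized to plain dicts with thought_signature at the part level and id nested inside the call objects; that layout is an assumption for illustration, and dropped_fields is a hypothetical helper.

```python
CALL_KINDS = ("tool_call", "tool_response", "function_call")


def dropped_fields(original_parts, outgoing_parts):
    """List (index, field) pairs where the outgoing part lost or changed a field."""
    problems = []
    for i, original in enumerate(original_parts):
        outgoing = outgoing_parts[i] if i < len(outgoing_parts) else {}
        # The signature has to come back byte-for-byte identical.
        if original.get("thought_signature") != outgoing.get("thought_signature"):
            problems.append((i, "thought_signature"))
        # Every call-like object must keep its ID.
        for kind in CALL_KINDS:
            before = (original.get(kind) or {}).get("id")
            after = (outgoing.get(kind) or {}).get("id")
            if before != after:
                problems.append((i, f"{kind}.id"))
    return problems


original = [
    {"function_call": {"name": "getWeather", "id": "f1"}, "thought_signature": "sig-abc"},
    {"text": "hi"},
]
# The mistake described above: keeping only the fields that look meaningful.
filtered = [
    {"function_call": {"name": "getWeather", "id": "f1"}},
    {"text": "hi"},
]
report = dropped_fields(original, filtered)
```

Here report flags part 0 as missing its thought_signature, which is exactly the mistake that cost me two hours. The simplest fix remains passing response.candidates[0].content back as-is.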

Gotcha 4: model choice

Compound tools currently work only with gemini-3-flash-preview and gemini-3-pro-preview. Older models such as gemini-2.0-flash return INVALID_ARGUMENT.
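
A cheap guard on the client side fails faster and with a clearer message than a round trip to the API. This is a minimal sketch: the model list is copied from the paragraph above and will go stale as support widens, and require_compound_model is a hypothetical helper name.

```python
# Models the article reports as supporting compound tools at the time of writing.
COMPOUND_TOOL_MODELS = frozenset({"gemini-3-flash-preview", "gemini-3-pro-preview"})


def require_compound_model(model: str) -> str:
    """Return the model name unchanged, or raise before any request is sent."""
    if model not in COMPOUND_TOOL_MODELS:
        supported = ", ".join(sorted(COMPOUND_TOOL_MODELS))
        raise ValueError(
            f"{model} is not known to support compound tools; use one of: {supported}"
        )
    return model


checked_model = require_compound_model("gemini-3-flash-preview")
```

You would pass checked_model as the model argument of generate_content.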

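Wrap-up

This update really is good news for developers building AI Agents. The most painful part was never the business logic; it was the glue code for tool orchestration. Gemini now takes over that layer, and all you do is define functions and handle results.

It is still in preview, of course, so for production I'd wait and watch a bit longer. But if you're building a demo or an internal tool, you can try it right now, and the experience really is much smoother.

Google quietly pulled off something big here. Had this feature shipped six months earlier, LangChain's Agent module would probably have needed a rewrite.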