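The model blind-test API lets developers generate content with several different models in a single call. It is intended for real business scenarios where you want to compare user preference for agents built on different base models.
Note:
After evaluating models through this API, you can use the feedback API to send the evaluation data back to 百宝箱 (Tbox). Tbox scores and ranks the collected feedback to show which underlying model best fits each business scenario.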
Prerequisites
Publish your application before calling this API.
Request URL
Tbox accepts requests at either of the following two URLs.
- URL 1:
POST `https://api.tbox.cn/api/model/responses`
- URL 2:
POST `https://api.tbox.cn/api/model/chat/completions`
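Request headers
| Parameter | Required | Type | Description | Example |
|---|---|---|---|---|
| Authorization | Yes | String | Access token used to verify the client's identity. You can obtain it in Tbox; see: Authorization Management. The request examples send it as `Bearer {your_token}`. | pat_2j4e******THUIVRH1 |
| AppCode | Yes | String | Application access point code. To obtain it, go to Model Blind Test > New application access point > appCode. | 202506e******00450562 |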
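The header table can be turned into a small helper. The following is a minimal sketch, not an official client: the function name `build_headers` and its `stream` flag are illustrative assumptions, while the `Bearer` prefix and the `Accept: text/event-stream` header are taken from the request examples in this document.

```python
def build_headers(token: str, app_code: str, stream: bool = False) -> dict:
    """Assemble the request headers listed in the header table.

    `build_headers` is a hypothetical helper name, not part of any SDK.
    """
    if not token or not app_code:
        raise ValueError("both token and app_code are required")
    headers = {
        # The curl examples send the token with a "Bearer " prefix.
        "Authorization": f"Bearer {token}",
        "AppCode": app_code,
        "Content-Type": "application/json",
    }
    if stream:
        # The streaming example additionally asks for server-sent events.
        headers["Accept"] = "text/event-stream"
    return headers
```

Calling `build_headers(token, app_code, stream=True)` yields the header set used by the streaming request example; leaving `stream` at its default omits the `Accept` header.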
Request parameters
The request parameters are described separately below for each request URL.
model/responses
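| Parameter | Required | Type | Description | Example |
|---|---|---|---|---|
| model | Yes | String | Model(s) to test. Supported values: `auto` for a randomly selected model, or a model name to specify a model. Separate multiple model names with ASCII commas (,). | auto |
| input | Yes | String | The question you want to ask. For multi-turn conversations, the value must follow the OpenAI messages format. | 今天的国际金价是多少 |
| user | Yes | String | User ID. | - |
| tools | No | Array | The web search tool built into the model blind test. | [{"type": "web_search_preview"}] |
| extra_body | No | Object | Currently accepts a single parameter, `n`: the number of models to use for this generation. Takes effect only when `model=auto`. | "n": 3 |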
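As a rough illustration of how these parameters combine, the sketch below builds a request body for `model/responses`. The helper name `build_responses_payload` and its argument names are assumptions made for this example. The `stream` field does not appear in the parameter table; it is included only because the request example in this document sends it.

```python
from typing import Optional, Sequence


def build_responses_payload(
    question: str,
    user_id: str,
    models: Optional[Sequence[str]] = None,
    n: Optional[int] = None,
    web_search: bool = False,
    stream: bool = True,
) -> dict:
    """Build a JSON-serializable body for POST /api/model/responses (hypothetical helper)."""
    payload = {
        # Named models are joined with ASCII commas; otherwise "auto" picks models at random.
        "model": ",".join(models) if models else "auto",
        "input": question,
        "user": user_id,
        # Taken from the request example, not from the parameter table.
        "stream": stream,
    }
    if web_search:
        payload["tools"] = [{"type": "web_search_preview"}]
    # n only takes effect when model=auto, so it is omitted when models are named.
    if n is not None and not models:
        payload["extra_body"] = {"n": n}
    return payload
```

Dropping `extra_body` when models are named is a design choice of this sketch; the table only states that `n` has no effect in that case.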
Request example
curl --location 'https://api.tbox.cn/api/model/responses' \
--header 'Authorization: Bearer {your_token}' \
--header 'AppCode: {app_code}' \
--header 'Content-Type: application/json' \
--header 'Accept: text/event-stream' \
--header 'Cookie: receive-cookie-deprecation=1' \
--data '{
"model": "auto",
"input": "今天的国际金价是多少?",
"user": "{user_id}",
"tools": [
{
"type": "web_search_preview"
}
],
"stream": true,
"extra_body": {
"n": 3
}
}'
model/chat/completions
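| Parameter | Required | Type | Description | Example |
|---|---|---|---|---|
| model | Yes | String | Model(s) to test. Supported values: `auto` for a randomly selected model, or a model name to specify a model. Separate multiple model names with ASCII commas (,). | auto |
| messages | Yes | Object | The question you want to ask. For multi-turn conversations, the value must follow the OpenAI messages format. | [{"role": "system", "content": "你是一个专业翻译,保持原文风格"},{"role": "user", "content": "The pursuit of knowledge is a lifelong journey"}] |
| n | Yes | String | Number of models involved in this evaluation. Takes effect only when `model=auto`. Limit: 1 ≤ n ≤ 5. | 3 |
| stream | Yes | Boolean | Whether to stream the response content. | false |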
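The constraints in this table (all four fields required, `1 ≤ n ≤ 5` when `model=auto`) can be checked on the client before sending. This is a minimal sketch under assumptions: `build_chat_payload` is a hypothetical helper, and `n` is sent as a JSON number because the request example in this document does so, even though the table lists its type as String.

```python
from typing import Optional, Sequence


def build_chat_payload(
    messages: Sequence[dict],
    models: Optional[Sequence[str]] = None,
    n: int = 1,
    stream: bool = False,
) -> dict:
    """Build a body for POST /api/model/chat/completions (hypothetical helper)."""
    if not messages:
        raise ValueError("messages must contain at least one message")
    for message in messages:
        # OpenAI-style messages carry a role and a content field.
        if "role" not in message or "content" not in message:
            raise ValueError("each message needs 'role' and 'content'")
    if models:
        model = ",".join(models)  # ASCII comma between model names
    else:
        model = "auto"
        # The documented limit applies when model=auto.
        if not 1 <= n <= 5:
            raise ValueError("n must satisfy 1 <= n <= 5 when model is auto")
    return {"model": model, "messages": list(messages), "n": n, "stream": stream}
```

For instance, `build_chat_payload([{"role": "user", "content": "Hello"}], n=2)` produces a body shaped like the one in the request example, while `n=6` raises `ValueError` before any network call is made.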
Request example
curl --location 'https://api.tbox.cn/api/model/chat/completions' \
--header 'Authorization: Bearer {your_token}' \
--header 'AppCode: {app_code}' \
--header 'Content-Type: application/json' \
--data '{
"model": "auto",
"messages": [
{"role": "system", "content": "你是一个专业翻译,保持原文风格"},
{"role": "user", "content": "The pursuit of knowledge is a lifelong journey"}
],
"n": 2,
"stream": false
}'
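For readers who prefer Python to curl, the same non-streaming request might be assembled with the standard library as sketched below. The function names `make_request` and `send` are illustrative assumptions; the URL and headers come from the curl example above, and error handling is deliberately left out.

```python
import json
import urllib.request

CHAT_COMPLETIONS_URL = "https://api.tbox.cn/api/model/chat/completions"


def make_request(token: str, app_code: str, payload: dict) -> urllib.request.Request:
    """Prepare (but do not send) the POST request shown in the curl example."""
    # ensure_ascii=False keeps non-ASCII prompt text readable in the body.
    body = json.dumps(payload, ensure_ascii=False).encode("utf-8")
    return urllib.request.Request(
        CHAT_COMPLETIONS_URL,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "AppCode": app_code,
            "Content-Type": "application/json",
        },
    )


def send(request: urllib.request.Request, timeout: float = 60.0) -> dict:
    """Send a prepared request and decode the JSON response body."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Separating request construction from sending keeps the first step easy to inspect or log without touching the network.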
Response parameters
model/responses
| Parameter | Type | Description | Example |
|---|---|---|---|
| event | String | Event type | response.output_text.delta |
| data.type | String | Detailed event type | response.output_text.delta |
| data.response.id | String | Unique response ID | resp_7b151ed6e2fc4cb899ff433886b81664 |
| data.response.status | String | Current status | in_progress / completed |
| data.response.model | String | Name of the model used | qwen-max-latest |
| data.item_id | String | Output item ID | 7c83dc6375b443cbb3b1e27c0dd3ff75 |
| data.output_index | Integer | Index of the output item | 0 |
| data.content_index | Integer | Index of the content segment | 0 |
| data.delta | String | Incremental text content (Unicode-escaped) | \u4eca\u5929 |
| data.tool_id | String | Unique tool call ID | call_IyGKdnpKpJ6m1maVLA9kdksE |
Response example
event: response.created
data: {"type": "response.created", "response": {"id": "resp_7b151ed6e2fc4cb899ff433886b81664", "object": "response", "created_at": 1749721513, "status": "in_progress", "error": null, "incomplete_details": null, "instructions": null, "max_output_tokens": 512, "model": "qwen-max-latest", "output": [], "parallel_tool_calls": false, "previous_response_id": "", "reasoning": {"effort": 0, "summary": null}, "store": true, "temperature": 0.7, "text": {"format": {"type": "text"}}, "tool_choice": "auto", "tools": [{"type": "web_search_preview"}], "top_p": 0.9, "truncation": "disabled", "usage": null, "user": "ezK16lM", "metadata": {}}}
event: response.in_progress
data: {"type": "response.in_progress", "response": {"id": "resp_7b151ed6e2fc4cb899ff433886b81664", "object": "response", "created_at": 1749721513, "status": "in_progress", "error": null, "incomplete_details": null, "instructions": null, "max_output_tokens": 512, "model": "qwen-max-latest", "output": [], "parallel_tool_calls": false, "previous_response_id": "", "reasoning": {"effort": 0, "summary": null}, "store": true, "temperature": 0.7, "text": {"format": {"type": "text"}}, "tool_choice": "auto", "tools": [{"type": "web_search_preview"}], "top_p": 0.9, "truncation": "disabled", "usage": null, "user": "ezK16lM", "metadata": {}}}
event: response.output_item.added
data: {"type": "response.output_item.added", "item": {"id": "7c83dc6375b443cbb3b1e27c0dd3ff75", "status": "in_progress", "type": "message", "role": "assistant", "content": []}, "output_index": 0}
event: response.web_search_call.in_progress
data: {"type": "response.web_search_call.in_progress", "output_index": 0, "model": "qwen-max-latest", "item_id": "7c83dc6375b443cbb3b1e27c0dd3ff75", "tool_id": "call_IyGKdnpKpJ6m1maVLA9kdksE", "tool_input": ""}
...
event: response.output_text.delta
data: {"type": "response.output_text.delta", "model": "qwen-max-latest", "item_id": "7c83dc6375b443cbb3b1e27c0dd3ff75", "output_index": 18, "content_index": 0, "delta": "\u4eca\u5929"}
event: response.output_text.delta
data: {"type": "response.output_text.delta", "model": "qwen-max-latest", "item_id": "7c83dc6375b443cbb3b1e27c0dd3ff75", "output_index": 19, "content_index": 0, "delta": "\u7684"}
...
event: response.output_item.done
data: {"type": "response.output_item.done", "output_index": 98, "item": {"id": "7c83dc6375b443cbb3b1e27c0dd3ff75", "type": "message", "status": "completed", "role": "assistant", "content": [{"type": "output_text", "text": "\u6211\u53ef\u4ee5\u5e2e\u60a8\u67e5\u8be2\u4eca\u5929\u7684\u56fd\u9645\u91d1\u4ef7\u3002\u8ba9\u6211\u901a\u8fc7\u641c\u7d22\u83b7\u53d6\u6700\u65b0\u4fe1\u606f\u3002\u6839\u636e\u6700\u65b0\u641c\u7d22\u7ed3\u679c\uff0c\u4eca\u5929\u7684\u56fd\u9645\u91d1\u4ef7\u4fe1\u606f\u5982\u4e0b\uff1a\n\n**\u6700\u65b0\u56fd\u9645\u91d1\u4ef7\uff082025\u5e743\u670826\u65e5\uff09\uff1a**\n- \u56fd\u9645\u9ec4\u91d1\u4ef7\u683c\uff1a3017.77 \u7f8e\u5143/\u76ce\u53f8\n\n\u6b64\u5916\uff0c\u6839\u636e\u5176\u4ed6\u641c\u7d22\u7ed3\u679c\u663e\u793a\u7684\u6700\u8fd1\u6570\u636e\uff1a\n- \u4f26\u6566\u73b0\u8d27\u9ec4\u91d1\u4ef7\u683c\uff1a2936.57 \u7f8e\u5143/\u76ce\u53f8\uff08\u6da8\u5e45+0.049%\uff09\n- \u7ebd\u7ea6\u671f\u8d27\u56fd\u9645\u91d1\u4ef7\uff1a2953.24 \u7f8e\u5143/\u76ce\u53f8\uff08\u6da8\u5e45+0.144%\uff09\n- \u4e2d\u56fd\u9ec4\u91d1\u4ef7\u683c\uff08\u4e0a\u6d77\u9ec4\u91d1\u4ea4\u6613\u6240\uff09\uff1a683.73 \u4eba\u6c11\u5e01/\u514b\uff08\u8dcc\u5e45-0.401%\uff09\n\n\u8bf7\u6ce8\u610f\uff0c\u91d1\u4ef7\u4f1a\u968f\u5e02\u573a\u6ce2\u52a8\u800c\u53d8\u5316\uff0c\u4ee5\u4e0a\u6570\u636e\u4ec5\u4f9b\u53c2\u8003\uff0c\u5b9e\u9645\u4ea4\u6613\u8bf7\u4ee5\u5b98\u65b9\u62a5\u4ef7\u4e3a\u51c6\u3002", "annotations": []}], "model": "qwen-max-latest"}}
event: response.completed
data: {"id": "resp_7b151ed6e2fc4cb899ff433886b81664", "object": "response", "created_at": 1749721513, "status": "completed", "error": null, "incomplete_details": null, "instructions": null, "max_output_tokens": 512, "model": "qwen-max-latest", "output": [{"id": "7c83dc6375b443cbb3b1e27c0dd3ff75", "type": "message", "status": "completed", "role": "assistant", "content": [{"type": "output_text", "text": "\u4eca\u5929\u7684\u56fd\u9645\u91d1\u4ef7\u4e3a2861.81\u7f8e\u5143/\u76ce\u53f8\uff0c\u6298\u5408\u4eba\u6c11\u5e01\u4e3a670.70\u5143/\u514b\u3002\u8bf7\u6ce8\u610f\uff0c\u5177\u4f53\u4ef7\u683c\u53ef\u80fd\u4f1a\u56e0\u5e02\u573a\u6ce2\u52a8\u800c\u6709\u6240\u53d8\u5316\u3002", "annotations": []}], "model": "qwen-max-latest"}, {"id": "7c83dc6375b443cbb3b1e27c0dd3ff75", "type": "message", "status": "completed", "role": "assistant", "content": [{"type": "output_text", "text": "\u6211\u53ef\u4ee5\u5e2e\u60a8\u67e5\u8be2\u4eca\u5929\u7684\u56fd\u9645\u91d1\u4ef7\u3002\u8ba9\u6211\u901a\u8fc7\u641c\u7d22\u83b7\u53d6\u6700\u65b0\u4fe1\u606f\u3002\u6839\u636e\u6700\u65b0\u641c\u7d22\u7ed3\u679c\uff0c\u4eca\u5929\u7684\u56fd\u9645\u91d1\u4ef7\u4fe1\u606f\u5982\u4e0b\uff1a\n\n**\u6700\u65b0\u56fd\u9645\u91d1\u4ef7\uff082025\u5e743\u670826\u65e5\uff09\uff1a**\n- \u56fd\u9645\u9ec4\u91d1\u4ef7\u683c\uff1a3017.77 \u7f8e\u5143/\u76ce\u53f8\n\n\u6b64\u5916\uff0c\u6839\u636e\u5176\u4ed6\u641c\u7d22\u7ed3\u679c\u663e\u793a\u7684\u6700\u8fd1\u6570\u636e\uff1a\n- \u4f26\u6566\u73b0\u8d27\u9ec4\u91d1\u4ef7\u683c\uff1a2936.57 \u7f8e\u5143/\u76ce\u53f8\uff08\u6da8\u5e45+0.049%\uff09\n- \u7ebd\u7ea6\u671f\u8d27\u56fd\u9645\u91d1\u4ef7\uff1a2953.24 \u7f8e\u5143/\u76ce\u53f8\uff08\u6da8\u5e45+0.144%\uff09\n- \u4e2d\u56fd\u9ec4\u91d1\u4ef7\u683c\uff08\u4e0a\u6d77\u9ec4\u91d1\u4ea4\u6613\u6240\uff09\uff1a683.73 \u4eba\u6c11\u5e01/\u514b\uff08\u8dcc\u5e45-0.401%\uff09\n\n\u8bf7\u6ce8\u610f\uff0c\u91d1\u4ef7\u4f1a\u968f\u5e02\u573a\u6ce2\u52a8\u800c\u53d8\u5316\uff0c\u4ee5\u4e0a\u6570\u636e\u4ec5\u4f9b\u53c2\u8003\uff0c\u5b9e\u9645\u4ea4\u6613\u8bf7\u4ee5\u5b98\u65b9\u62a5\u4ef7\u4e3a\u51c6\u3002", "annotations": []}], "model": "anthropic/claude-3.7-sonnet"}], "parallel_tool_calls": false, "previous_response_id": "", "reasoning": {"effort": 0, "summary": null}, "store": true, "temperature": 0.7, "text": {"format": {"type": "text"}}, "tool_choice": "auto", "tools": [{"type": "web_search_preview"}], "top_p": 0.9, "truncation": "disabled", "usage": null, "user": "ezK16lM", "metadata": {}, "type": "response.completed"}
event: response.completed
data: [DONE]
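The event stream above can be consumed line by line. The sketch below is one possible way to do it, assuming the stream arrives as `event:` / `data:` line pairs exactly as shown and that every `response.output_text.delta` event carries a `model` field, as in the sample. The function names are hypothetical, and grouping by model name is an assumption of this sketch rather than documented behavior.

```python
import json
from collections import defaultdict
from typing import Dict, Iterable, Iterator, Optional, Tuple


def parse_sse(lines: Iterable[str]) -> Iterator[Tuple[Optional[str], dict]]:
    """Yield (event, data) pairs from 'event:' / 'data:' lines."""
    event = None
    for raw in lines:
        line = raw.strip()
        if line.startswith("event:"):
            event = line[len("event:"):].strip()
        elif line.startswith("data:"):
            data = line[len("data:"):].strip()
            if data == "[DONE]":  # end-of-stream marker seen in the sample
                return
            # json.loads also decodes \uXXXX escapes in the delta text.
            yield event, json.loads(data)


def collect_text_by_model(lines: Iterable[str]) -> Dict[str, str]:
    """Concatenate delta text per model name (grouping key is an assumption)."""
    text = defaultdict(str)
    for event, data in parse_sse(lines):
        if event == "response.output_text.delta":
            text[data.get("model", "unknown")] += data["delta"]
    return dict(text)
```

Lines that are neither `event:` nor `data:` (blank separators, for example) are skipped, so the parser tolerates the usual server-sent event framing.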
model/chat/completions
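| Parameter | Type | Description |
|---|---|---|
| id | String | Unique identifier of this request |
| object | String | Fixed value "chat.completion" |
| created | Integer | Timestamp at which the response was generated (Unix timestamp, in seconds) |
| models | Array[Str] | List of models used |
| choices | Array[Choice] | Array of text results generated by the models; its length usually equals n |
| usage | Usage | Token consumption statistics |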
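Choice
| Parameter | Type | Description |
|---|---|---|
| finish_reason | String | Reason generation ended (for example, "stop" means normal termination) |
| index | Integer | Index of the result in the choices array (starting from 0) |
| logprobs | null | Reserved field |
| message | ChoiceMessage | Message content generated by the model |
| content_filter_results | Object | Content safety filtering results |
| message_id | String | Unique identifier of the message |
| model | String | Name of the model that generated this result |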
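ChoiceMessage
| Field | Type | Description |
|---|---|---|
| content | String | Core output: the text generated by the model |
| refusal | null | Explanation given when the model refuses to answer (null means no refusal was triggered) |
| role | String | Message role (always "assistant", indicating a model reply) |
| audio | null | Audio output (null means no audio) |
| function_call | null | Function call request (null means no function was called) |
| tool_calls | null | Tool call request (null means no tool was called) |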
Usage
| Field | Type | Description |
|---|---|---|
| prompt_tokens | Integer | Number of tokens consumed by the input prompt |
| completion_tokens | Integer | Number of tokens consumed by the output |
| total_tokens | Integer | Total number of tokens consumed |
Response example
{
"id": "9f10f75262614014b495357c15c8a694",
"object": "chat.completion",
"created": 1750411923,
"models": [
"meta-llama/llama-4-scout-17b-16e-instruct",
"deepseek/deepseek-r1-distill-qwen-14b"
],
"choices": [
{
"finish_reason": "stop",
"index": 0,
"logprobs": null,
"message": {
"content": "求知是终身的旅程",
"refusal": null,
"role": "assistant",
"audio": null,
"function_call": null,
"tool_calls": null
},
"content_filter_results": {
"hate": {
"filtered": false
},
"self_harm": {
"filtered": false
},
"sexual": {
"filtered": false
},
"violence": {
"filtered": false
},
"jailbreak": {
"filtered": false,
"detected": false
},
"profanity": {
"filtered": false,
"detected": false
}
},
"message_id": "bae9679e151c40e980605257e9eb162e",
"model": "meta-llama/llama-4-scout-17b-16e-instruct"
},
{
"finish_reason": "stop",
"index": 1,
"logprobs": null,
"message": {
"content": "<think>\nAlright, so I need to translate the sentence \"The pursuit of knowledge is a lifelong journey\" into Chinese. Let me think about how to approach this.\n\nFirst, the main idea is \"the pursuit of knowledge.\" In Chinese, that would probably be \"知识的追求.\" But wait, sometimes people use \"知识追求\" without the 的, but maybe adding 的 makes it smoother. I think \"知识的追求\" sounds better here.\n\nNext, the verb \"is\" translates to \"是.\" Then \"a lifelong journey.\" \"Lifelong\" can be translated as \"一生的\" or \"毕生的.\" I think \"毕生的\" might sound a bit more elegant, like \"毕生的旅程.\" Alternatively, \"一生的旅程\" is also correct, but \"毕生的旅程\" feels more poetic, which suits the original sentence's tone.\n\nPutting it all together: \"知识的追求是一生的旅程\" or \"知识的追求是毕生的旅程.\" I think between these two, \"毕生的旅程\" adds a touch of formality and depth, matching the idea of a lifelong pursuit.\n\nAnother option is \"知识的追求是一生的旅程,\" which is straightforward and clear. It depends on the context whether to go for elegance or simplicity. Since the original sentence has a philosophical tone, maybe leaning towards the more elegant version is better.\n\nWait, I can also consider the flow. \"知识的追求\" as a noun phrase feels a bit heavy, so maybe \"追求知识\" could work too, but then it changes the structure. But the original sentence starts with \"The pursuit of knowledge,\" which is a noun phrase, so to keep that structure, \"知识的追求\" is better.\n\nSo, finalizing, \"知识的追求是一生的旅程\" or \"知识的追求是毕生的旅程.\" Either is fine, but I think the second one has a nicer ring to it.\n</think>\n\n知识的追求是毕生的旅程。",
"refusal": null,
"role": "assistant",
"audio": null,
"function_call": null,
"tool_calls": null
},
"content_filter_results": {
"hate": {
"filtered": false
},
"self_harm": {
"filtered": false
},
"sexual": {
"filtered": false
},
"violence": {
"filtered": false
},
"jailbreak": {
"filtered": false,
"detected": false
},
"profanity": {
"filtered": false,
"detected": false
}
},
"message_id": "348903ddedf943b7a12185843243c1d6",
"model": "deepseek/deepseek-r1-distill-qwen-14b"
}
],
"usage": {
"prompt_tokens": 49,
"completion_tokens": 417,
"total_tokens": 466
}
}
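When comparing answers side by side, it can help to flatten `choices` into one row per model. The sketch below is illustrative only: `summarize_choices` is a hypothetical helper, and stripping a leading `<think>...</think>` block is an assumption based on the second choice in the sample above, not a documented part of the API.

```python
import re
from typing import Dict, List

# Matches a reasoning block such as the one in the second sample choice.
THINK_BLOCK = re.compile(r"<think>.*?</think>\s*", re.DOTALL)


def summarize_choices(response: dict, strip_reasoning: bool = True) -> List[Dict]:
    """Return one row per choice: index, model, message_id and answer text."""
    rows = []
    for choice in sorted(response.get("choices", []), key=lambda c: c["index"]):
        content = choice["message"].get("content") or ""
        if strip_reasoning:
            content = THINK_BLOCK.sub("", content)
        rows.append(
            {
                "index": choice["index"],
                "model": choice["model"],
                "message_id": choice["message_id"],
                "answer": content.strip(),
            }
        )
    return rows
```

Rows are kept as a list rather than a dictionary keyed by model name, so two choices produced by the same model would not overwrite each other.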
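Optional follow-up
After evaluating models through the API described in this document, you can use the feedback API to send the evaluation data back to Tbox. Tbox scores and ranks the collected feedback to show which underlying model best fits each business scenario. For the API description, see: Blind Test Result Feedback.
FAQ
For problems developers may encounter when using the model blind test (with results) capability, and their solutions, see: FAQ.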