Zero errors, and input prices slashed! OpenAI announces Structured Outputs support in the API, with 100% JSON accuracy

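[Xinzhiyuan Digest] Good news for programmers! The models in OpenAI's newly released API all support Structured Outputs, with a JSON Schema match rate of 100% and costs cut in half.

Still racking your brain over piles of prompts, only to get a headache from the wildly varying output that comes back after all that effort? OpenAI has finally heard the crowd and delivered the number-one feature developers have long been waiting for. OpenAI announced today that the new feature is live: the API now supports structured JSON output.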

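JSON (JavaScript Object Notation) is the industry-standard format for files and data exchange because it is easy for humans to read and easy for machines to parse. LLMs, however, often fight against JSON: they hallucinate, and either generate responses that only partially follow the instructions or produce a pile of gibberish that cannot be parsed at all.

Developers have had to work around this with open-source tooling, different prompts, or repeated requests until the desired output appears, which costs time and effort. The Structured Outputs feature released today resolves these problems by ensuring that model-generated output matches the schema specified in JSON.

Structured Outputs has long been the number-one feature request from developers, and Altman said in a tweet that this release came by popular demand. The new feature clearly struck a chord with many developers, who agreed that "this is a big deal" and left comments praising it as "excellent!".

Not everyone is celebrating: this update has again raised fears that OpenAI could kill off some companies. Most ordinary users, though, care more about when GPT-5 will finally be released. As for JSON Schema, "what is that?"

After all, without any GPT-5 news, OpenAI's DevDay this fall may be much quieter than last year's.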

Reliable output made easy

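With Structured Outputs, you only need to define a JSON Schema, and the AI stops being "willful" and obediently outputs data as instructed. The new feature not only makes the AI more obedient but also maximizes the reliability of its output. In OpenAI's evals of complex JSON schema following, the new model gpt-4o-2024-08-06 with Structured Outputs scored a perfect 100%. By comparison, gpt-4-0613 scored less than 40%.

Note that JSON mode was introduced by OpenAI at last year's DevDay. OpenAI has now extended this capability in the API, ensuring that model-generated output exactly matches the JSON Schema supplied by the developer.

Generating structured data from unstructured input is one of the core use cases for AI in applications. Developers use the OpenAI API to build powerful assistants that can fetch data and answer questions through function calling, extract structured data for data entry, and build multi-step agentic workflows that allow LLMs to take actions.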

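How it works

OpenAI took a two-pronged approach to improving how well model output matches a JSON Schema. The latest model, gpt-4o-2024-08-06, was trained to better understand complex schemas and generate matching output. Although model performance improved markedly, reaching 93% accuracy on the benchmark, some nondeterminism remains. To guarantee stability for the applications developers build, OpenAI also provides a higher-accuracy method that constrains the model's output and achieves 100% reliability.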

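Constrained decoding

OpenAI uses a technique known as constrained sampling or constrained decoding. By default, a model is completely unconstrained when it generates output and can pick any token from its vocabulary as the next output. This flexibility can lead to errors: for example, when generating valid JSON, the model can easily insert an invalid character. To avoid such errors, OpenAI uses dynamic constrained decoding, which ensures that generated output tokens always conform to the supplied schema.

To achieve this, OpenAI converts the supplied JSON Schema into a context-free grammar (CFG). For each JSON schema, OpenAI computes a grammar that represents the schema and preprocesses its components so they can be accessed efficiently during sampling. This approach not only makes the generated output more accurate but also reduces unnecessary latency. The first request with a new schema may incur extra processing time, but subsequent requests respond quickly thanks to caching.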
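The masking idea can be illustrated with a toy sketch. This is not OpenAI's implementation: the five-token vocabulary, the tiny grammar (nested lists of `1`), and the `toy_model` scoring function are all hypothetical stand-ins chosen to keep the example small. At each step the grammar decides which tokens are valid, and the sampler may only pick among those.

```python
import math

# Hypothetical toy vocabulary; "oops" stands for any token that breaks the format.
VOCAB = ["[", "]", ",", "1", "oops"]


def allowed_tokens(prefix):
    """Tokens that keep `prefix` a valid prefix of the toy grammar:
    list := "[" "]" | "[" item ("," item)* "]"   item := "1" | list
    """
    depth = 0
    last = None
    for tok in prefix:
        if tok == "[":
            depth += 1
        elif tok == "]":
            depth -= 1
        last = tok
    if last is None:
        return {"["}                 # output must open a list
    if depth == 0:
        return set()                 # top-level list is closed: generation is done
    if last == "[":
        return {"[", "1", "]"}       # first item, or an empty list
    if last == ",":
        return {"[", "1"}            # an item must follow a comma
    return {",", "]"}                # after an item: continue or close


def toy_model(prefix):
    """Stand-in for a language model: fixed scores that favor the invalid token."""
    return {"oops": 5.0, "1": 4.0, "]": 3.0, ",": 2.0, "[": 1.0}


def constrained_greedy(model, max_steps=20):
    """Greedy decoding where disallowed tokens are masked out at every step."""
    out = []
    for _ in range(max_steps):
        allowed = allowed_tokens(out)
        if not allowed:
            break
        scores = model(out)
        out.append(max(allowed, key=lambda tok: scores.get(tok, -math.inf)))
    return out


tokens = constrained_greedy(toy_model)
print("".join(tokens))
```

Without the mask, greedy decoding on `toy_model` would emit `oops` at every step. The depth counter in `allowed_tokens` is what lets the toy grammar track arbitrarily deep nesting.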

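Alternative approaches

Besides the CFG approach, other approaches to constrained decoding typically use finite state machines (FSMs) or regular expressions. These approaches, however, are limited in how they can dynamically update the set of valid tokens. FSMs generally struggle with deeply nested or recursive data structures in particular. OpenAI's CFG approach excels at expressing complex schemas. For example, a JSON schema that supports recursive schemas is already implemented on the OpenAI API, but it could not be expressed with an FSM approach.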
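A small sketch of why recursion is the sticking point. It uses bare nested brackets as a stand-in for nested JSON; the pattern and both helper functions are illustrative and not taken from any constrained-decoding library. A regular expression has to spell out each nesting level explicitly, whereas a recognizer with a counter (the simplest form of a stack) handles any depth.

```python
import re

# Finite-state view: every nesting level must be written out by hand,
# so this pattern only knows depths 1 and 2.
FIXED_DEPTH = re.compile(r"\[\]|\[\[\]\]")


def matches_fsm(text):
    """True if `text` is '[]' or '[[]]', the only depths the pattern encodes."""
    return FIXED_DEPTH.fullmatch(text) is not None


def matches_with_counter(text):
    """Accept '[' * n + ']' * n for any n >= 1 by counting open brackets."""
    depth = 0
    seen_close = False
    for ch in text:
        if ch == "[":
            if seen_close:
                return False      # no reopening after the first close
            depth += 1
        elif ch == "]":
            seen_close = True
            depth -= 1
            if depth < 0:
                return False      # closed more than was opened
        else:
            return False
    return depth == 0 and seen_close


for sample in ["[[]]", "[[[]]]"]:
    print(sample, matches_fsm(sample), matches_with_counter(sample))
```

Extending the pattern to depth 3 is possible, but some greater depth will always fall outside it, which is the limitation the paragraph above describes.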

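Availability and pricing

All models that support function calling can use Structured Outputs, including the latest GPT-4o and GPT-4o-mini models as well as fine-tuned models. The feature is available on the Chat Completions API, Assistants API, and Batch API, and it is compatible with vision inputs. Compared with gpt-4o-2024-05-13, gpt-4o-2024-08-06 also has a cost advantage: developers save 50% on input ($2.50 per 1M tokens) and 33% on output ($10.00 per 1M tokens).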
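The savings can be checked with a few lines of arithmetic. In this sketch the new prices are the ones quoted above, while the old prices ($5.00 and $15.00 per 1M tokens) are an assumption back-derived from the stated 50% and 33% savings. The workload size is made up for illustration.

```python
# Prices in USD per 1M tokens.
OLD = {"input": 5.00, "output": 15.00}   # gpt-4o-2024-05-13 (assumed, derived from the savings)
NEW = {"input": 2.50, "output": 10.00}   # gpt-4o-2024-08-06 (from the article)


def cost(prices, input_tokens, output_tokens):
    """Total cost in USD for a given number of input and output tokens."""
    return (input_tokens / 1_000_000) * prices["input"] + (
        output_tokens / 1_000_000
    ) * prices["output"]


def savings_percent(kind):
    """Rounded percentage saved on 'input' or 'output' tokens."""
    return round(100 * (OLD[kind] - NEW[kind]) / OLD[kind])


# Hypothetical monthly workload: 2M input tokens, 0.5M output tokens.
old_total = cost(OLD, 2_000_000, 500_000)
new_total = cost(NEW, 2_000_000, 500_000)
print(savings_percent("input"), savings_percent("output"), old_total, new_total)
```

For this input-heavy workload the bill drops from $17.50 to $10.00. The overall saving depends on the mix of input and output tokens.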

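Using Structured Outputs

Structured Outputs can be used in two forms in the API:

Function calling

Structured Outputs via tools is enabled by setting strict: true in the function definition. This feature works with all models that support tools, including gpt-4-0613, gpt-3.5-turbo-0613, and all later models. When Structured Outputs is enabled, model output will match the supplied tool definition. Example request: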

POST /v1/chat/completions
{
  "model": "gpt-4o-2024-08-06",
  "messages": [
    {
      "role": "system",
      "content": "You are a helpful assistant. The current date is August 6, 2024. You help users query for the data they are looking for by calling the query function."
    },
    {
      "role": "user",
      "content": "look up all my orders in may of last year that were fulfilled but not delivered on time"
    }
  ],
  "tools": [
    {
      "type": "function",
      "function": {
        "name": "query",
        "description": "Execute a query.",
        "strict": true,
        "parameters": {
          "type": "object",
          "properties": {
            "table_name": {
              "type": "string",
              "enum": ["orders"]
            },
            "columns": {
              "type": "array",
              "items": {
                "type": "string",
                "enum": [
                  "id",
                  "status",
                  "expected_delivery_date",
                  "delivered_at",
                  "shipped_at",
                  "ordered_at",
                  "canceled_at"
                ]
              }
            },
            "conditions": {
              "type": "array",
              "items": {
                "type": "object",
                "properties": {
                  "column": {
                    "type": "string"
                  },
                  "operator": {
                    "type": "string",
                    "enum": ["=", ">", "<", ">=", "<=", "!="]
                  },
                  "value": {
                    "anyOf": [
                      {
                        "type": "string"
                      },
                      {
                        "type": "number"
                      },
                      {
                        "type": "object",
                        "properties": {
                          "column_name": {
                            "type": "string"
                          }
                        },
                        "required": ["column_name"],
                        "additionalProperties": false
                      }
                    ]
                  }
                },
                "required": ["column", "operator", "value"],
                "additionalProperties": false
              }
            },
            "order_by": {
              "type": "string",
              "enum": ["asc", "desc"]
            }
          },
          "required": ["table_name", "columns", "conditions", "order_by"],
          "additionalProperties": false
        }
      }
    }
  ]
}

Example output:

{
  "table_name": "orders",
  "columns": ["id", "status", "expected_delivery_date", "delivered_at"],
  "conditions": [
    {
      "column": "status",
      "operator": "=",
      "value": "fulfilled"
    },
    {
      "column": "ordered_at",
      "operator": ">=",
      "value": "2023-05-01"
    },
    {
      "column": "ordered_at",
      "operator": "<",
      "value": "2023-06-01"
    },
    {
      "column": "delivered_at",
      "operator": ">",
      "value": {
        "column_name": "expected_delivery_date"
      }
    }
  ],
  "order_by": "asc"
}
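One way an application might consume such arguments is to turn them into a parameterized SQL statement. The sketch below is an illustration, not part of the OpenAI API: `build_sql` is a hypothetical helper, and ordering by the first selected column is an assumption, because the schema's `order_by` only carries a direction. The column check is still worthwhile because `conditions[].column` is a free-form string in the schema.

```python
ALLOWED_COLUMNS = {
    "id", "status", "expected_delivery_date", "delivered_at",
    "shipped_at", "ordered_at", "canceled_at",
}
ALLOWED_OPERATORS = {"=", ">", "<", ">=", "<=", "!="}


def build_sql(query):
    """Return (sql, params) for a query dict shaped like the example output."""
    if query["table_name"] != "orders" or not query["columns"]:
        raise ValueError("unsupported table or empty column list")
    for col in query["columns"]:
        if col not in ALLOWED_COLUMNS:
            raise ValueError(f"unknown column: {col}")

    clauses, params = [], []
    for cond in query["conditions"]:
        column, op, value = cond["column"], cond["operator"], cond["value"]
        if column not in ALLOWED_COLUMNS or op not in ALLOWED_OPERATORS:
            raise ValueError(f"invalid condition: {cond}")
        if isinstance(value, dict):
            # Dynamic value: compare against another column.
            other = value["column_name"]
            if other not in ALLOWED_COLUMNS:
                raise ValueError(f"unknown column: {other}")
            clauses.append(f"{column} {op} {other}")
        else:
            # Literal value: bind it as a parameter instead of inlining it.
            clauses.append(f"{column} {op} ?")
            params.append(value)

    sql = f"SELECT {', '.join(query['columns'])} FROM orders"
    if clauses:
        sql += " WHERE " + " AND ".join(clauses)
    # Assumption: sort by the first selected column in the requested direction.
    sql += f" ORDER BY {query['columns'][0]} {query['order_by'].upper()}"
    return sql, params


example = {
    "table_name": "orders",
    "columns": ["id", "status", "expected_delivery_date", "delivered_at"],
    "conditions": [
        {"column": "status", "operator": "=", "value": "fulfilled"},
        {"column": "ordered_at", "operator": ">=", "value": "2023-05-01"},
        {"column": "ordered_at", "operator": "<", "value": "2023-06-01"},
        {"column": "delivered_at", "operator": ">",
         "value": {"column_name": "expected_delivery_date"}},
    ],
    "order_by": "asc",
}
sql, params = build_sql(example)
print(sql)
print(params)
```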

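A new option for the response_format parameter

Developers can now choose whether they want schema-conforming output through json_schema, a new option for response_format. This is useful when the model is not calling a tool but is responding to the user in a structured way. The feature works with the newest GPT-4o models: gpt-4o-2024-08-06, released today, and gpt-4o-mini-2024-07-18. When a response_format is supplied with strict: true, model output will match the supplied schema. Example request: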

POST /v1/chat/completions
{
  "model": "gpt-4o-2024-08-06",
  "messages": [
    {
      "role": "system",
      "content": "You are a helpful math tutor."
    },
    {
      "role": "user",
      "content": "solve 8x + 31 = 2"
    }
  ],
  "response_format": {
    "type": "json_schema",
    "json_schema": {
      "name": "math_response",
      "strict": true,
      "schema": {
        "type": "object",
        "properties": {
          "steps": {
            "type": "array",
            "items": {
              "type": "object",
              "properties": {
                "explanation": {
                  "type": "string"
                },
                "output": {
                  "type": "string"
                }
              },
              "required": ["explanation", "output"],
              "additionalProperties": false
            }
          },
          "final_answer": {
            "type": "string"
          }
        },
        "required": ["steps", "final_answer"],
        "additionalProperties": false
      }
    }
  }
}

Example output:

{
  "steps": [
    {
      "explanation": "Subtract 31 from both sides to isolate the term with x.",
      "output": "8x + 31 - 31 = 2 - 31"
    },
    {
      "explanation": "This simplifies to 8x = -29.",
      "output": "8x = -29"
    },
    {
      "explanation": "Divide both sides by 8 to solve for x.",
      "output": "x = -29 / 8"
    }
  ],
  "final_answer": "x = -29 / 8"
}

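Developers can use Structured Outputs to generate an answer step by step, guiding the model toward the intended output. According to OpenAI, developers no longer need to validate or retry malformed responses, and the feature allows for simpler prompts.

Native SDK support

OpenAI says its Python and Node SDKs have been updated with native support for Structured Outputs. Supplying a schema for tools or as a response format is as easy as supplying a Pydantic or Zod object. OpenAI's SDKs handle converting the data type to a supported JSON schema, automatically deserializing the JSON response into the typed data structure, and parsing refusals.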

from enum import Enum
from typing import Union

from pydantic import BaseModel

import openai
from openai import OpenAI


class Table(str, Enum):
    orders = "orders"
    customers = "customers"
    products = "products"


class Column(str, Enum):
    id = "id"
    status = "status"
    expected_delivery_date = "expected_delivery_date"
    delivered_at = "delivered_at"
    shipped_at = "shipped_at"
    ordered_at = "ordered_at"
    canceled_at = "canceled_at"


class Operator(str, Enum):
    eq = "="
    gt = ">"
    lt = "<"
    le = "<="
    ge = ">="
    ne = "!="


class OrderBy(str, Enum):
    asc = "asc"
    desc = "desc"


class DynamicValue(BaseModel):
    column_name: str


class Condition(BaseModel):
    column: str
    operator: Operator
    value: Union[str, int, DynamicValue]


class Query(BaseModel):
    table_name: Table
    columns: list[Column]
    conditions: list[Condition]
    order_by: OrderBy


client = OpenAI()

completion = client.beta.chat.completions.parse(
    model="gpt-4o-2024-08-06",
    messages=[
        {
            "role": "system",
            "content": "You are a helpful assistant. The current date is August 6, 2024. You help users query for the data they are looking for by calling the query function.",
        },
        {
            "role": "user",
            "content": "look up all my orders in may of last year that were fulfilled but not delivered on time",
        },
    ],
    tools=[
        openai.pydantic_function_tool(Query),
    ],
)

print(completion.choices[0].message.tool_calls[0].function.parsed_arguments)

Native Structured Outputs support is also available for response_format.

from pydantic import BaseModel

from openai import OpenAI


class Step(BaseModel):
    explanation: str
    output: str


class MathResponse(BaseModel):
    steps: list[Step]
    final_answer: str


client = OpenAI()

completion = client.beta.chat.completions.parse(
    model="gpt-4o-2024-08-06",
    messages=[
        {"role": "system", "content": "You are a helpful math tutor."},
        {"role": "user", "content": "solve 8x + 31 = 2"},
    ],
    response_format=MathResponse,
)

message = completion.choices[0].message
if message.parsed:
    print(message.parsed.steps)
    print(message.parsed.final_answer)
else:
    print(message.refusal)
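To make "deserializing the JSON response into the typed data structure" concrete, here is a standard-library sketch of that step. It is a hand-written stand-in that uses dataclasses, not the SDK's Pydantic-based implementation, and `parse_math_response` is a hypothetical helper name.

```python
import json
from dataclasses import dataclass


@dataclass
class Step:
    explanation: str
    output: str


@dataclass
class MathResponse:
    steps: list[Step]
    final_answer: str


def parse_math_response(raw: str) -> MathResponse:
    """Turn a schema-conforming JSON string into typed objects."""
    data = json.loads(raw)
    # Because the schema sets additionalProperties to false and requires both
    # keys, each item maps directly onto the Step constructor.
    steps = [Step(**item) for item in data["steps"]]
    return MathResponse(steps=steps, final_answer=data["final_answer"])


raw = json.dumps({
    "steps": [
        {"explanation": "Subtract 31 from both sides.", "output": "8x = -29"},
        {"explanation": "Divide both sides by 8.", "output": "x = -29 / 8"},
    ],
    "final_answer": "x = -29 / 8",
})
parsed = parse_math_response(raw)
print(parsed.steps[0].output, "|", parsed.final_answer)
```

With typed objects, downstream code can use attribute access such as `parsed.final_answer` instead of indexing into nested dictionaries.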

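Other use cases

Developers frequently use OpenAI's models to generate structured data for a variety of use cases. Some additional examples include:

- Dynamically generating user interfaces based on the user's intent

Developers can use Structured Outputs to build applications that generate code or UI. With the same response_format, different UIs can be generated from user input. For example, a "gardener's landing page" was generated from the following JSON: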

{
  "type": "div",
  "label": "",
  "children": [
    {
      "type": "header",
      "label": "",
      "children": [
        {
          "type": "div",
          "label": "Green Thumb Gardening",
          "children": [],
          "attributes": [{ "name": "className", "value": "site-title" }]
        },
        {
          "type": "div",
          "label": "Bringing Life to Your Garden",
          "children": [],
          "attributes": [{ "name": "className", "value": "site-tagline" }]
        }
      ],
      "attributes": [{ "name": "className", "value": "header" }]
    },
    {
      "type": "section",
      "label": "",
      "children": [
        {
          "type": "div",
          "label": "",
          "children": [
            {
              "type": "div",
              "label": "About Us",
              "children": [
                {
                  "type": "div",
                  "label": "At Green Thumb Gardening, we specialize in transforming your outdoor spaces into beautiful, thriving gardens. Our team has decades of experience in horticulture and landscape design.",
                  "children": [],
                  "attributes": [
                    { "name": "className", "value": "about-description" }
                  ]
                }
              ],
              "attributes": [{ "name": "className", "value": "about-section" }]
            }
          ],
          "attributes": [{ "name": "className", "value": "content" }]
        }
      ],
      "attributes": [{ "name": "className", "value": "about-container" }]
    },
    {
      "type": "section",
      "label": "",
      "children": [
        {
          "type": "div",
          "label": "",
          "children": [
            {
              "type": "div",
              "label": "Our Services",
              "children": [
                {
                  "type": "div",
                  "label": "Garden Design",
                  "children": [],
                  "attributes": [
                    { "name": "className", "value": "service-item" }
                  ]
                },
                {
                  "type": "div",
                  "label": "Plant Care & Maintenance",
                  "children": [],
                  "attributes": [
                    { "name": "className", "value": "service-item" }
                  ]
                },
                {
                  "type": "div",
                  "label": "Seasonal Cleanup",
                  "children": [],
                  "attributes": [
                    { "name": "className", "value": "service-item" }
                  ]
                },
                {
                  "type": "div",
                  "label": "Custom Landscaping",
                  "children": [],
                  "attributes": [
                    { "name": "className", "value": "service-item" }
                  ]
                }
              ],
              "attributes": [{ "name": "className", "value": "services-list" }]
            }
          ],
          "attributes": [{ "name": "className", "value": "content" }]
        }
      ],
      "attributes": [{ "name": "className", "value": "services-container" }]
    }
  ],
  "attributes": [{ "name": "className", "value": "landing-page" }]
}
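A tree like this is naturally consumed by a recursive function. The sketch below renders a node of the same shape (`type`, `label`, `children`, `attributes`) to an HTML string. `render` is a hypothetical helper, and mapping the React-style `className` attribute to HTML's `class` is an assumption about how the front end would treat it.

```python
from html import escape


def render(node):
    """Recursively render a UI node dict to an HTML string."""
    parts = []
    for attr in node["attributes"]:
        # Assumption: "className" (React-style) corresponds to HTML "class".
        name = "class" if attr["name"] == "className" else attr["name"]
        parts.append(' {}="{}"'.format(name, escape(attr["value"], quote=True)))
    inner = escape(node["label"]) + "".join(render(child) for child in node["children"])
    return "<{tag}{attrs}>{inner}</{tag}>".format(
        tag=node["type"], attrs="".join(parts), inner=inner
    )


tree = {
    "type": "div",
    "label": "",
    "children": [
        {
            "type": "div",
            "label": "Plant Care & Maintenance",
            "children": [],
            "attributes": [{"name": "className", "value": "service-item"}],
        }
    ],
    "attributes": [{"name": "className", "value": "content"}],
}
html_out = render(tree)
print(html_out)
```

Labels and attribute values are escaped before being placed in markup, so text such as `&` in a label does not break the generated HTML.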

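- Separating the final answer from supporting reasoning or additional commentary

Giving the model a separate field for its chain of thought can improve the final quality of the response. Request: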

{
  "model": "gpt-4o-2024-08-06",
  "messages": [
    {
      "role": "system",
      "content": "You are a helpful assistant"
    },
    {
      "role": "user",
      "content": "9.11 and 9.9 -- which is bigger?"
    }
  ],
  "response_format": {
    "type": "json_schema",
    "json_schema": {
      "name": "reasoning_schema",
      "strict": true,
      "schema": {
        "type": "object",
        "properties": {
          "reasoning_steps": {
            "type": "array",
            "items": {
              "type": "string"
            },
            "description": "The reasoning steps leading to the final conclusion."
          },
          "answer": {
            "type": "string",
            "description": "The final answer, taking into account the reasoning steps."
          }
        },
        "required": ["reasoning_steps", "answer"],
        "additionalProperties": false
      }
    }
  }
}

Structured output:

{
  "reasoning_steps": [
    "First step is to compare the numbers 9.11 and 9.9.",
    "Both numbers have the same whole number part, which is 9.",
    "To compare the decimal parts, convert them to the same number of decimal places.",
    "9.11 has two decimal places: it is 9.11.",
    "9.9 has one decimal place: it can be rewritten as 9.90.",
    "Now, compare 9.11 and 9.90 by looking at the decimal parts.",
    "Compare 11 with 90.",
    "90 is greater than 11, so 9.90 is greater than 9.11."
  ],
  "answer": "9.9 is bigger than 9.11."
}

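- Extracting structured data from unstructured data

For example, the model can be instructed to extract items such as to-dos, due dates, and assignments from meeting notes. Request: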

POST /v1/chat/completions
{
  "model": "gpt-4o-2024-08-06",
  "messages": [
    {
      "role": "system",
      "content": "Extract action items, due dates, and owners from meeting notes."
    },
    {
      "role": "user",
      "content": "...meeting notes go here..."
    }
  ],
  "response_format": {
    "type": "json_schema",
    "json_schema": {
      "name": "action_items",
      "strict": true,
      "schema": {
        "type": "object",
        "properties": {
          "action_items": {
            "type": "array",
            "items": {
              "type": "object",
              "properties": {
                "description": {
                  "type": "string",
                  "description": "Description of the action item."
                },
                "due_date": {
                  "type": ["string", "null"],
                  "description": "Due date for the action item, can be null if not specified."
                },
                "owner": {
                  "type": ["string", "null"],
                  "description": "Owner responsible for the action item, can be null if not specified."
                }
              },
              "required": ["description", "due_date", "owner"],
              "additionalProperties": false
            },
            "description": "List of action items from the meeting."
          }
        },
        "required": ["action_items"],
        "additionalProperties": false
      }
    }
  }
}

Structured output:

{
  "action_items": [
    {
      "description": "Collaborate on optimizing the path planning algorithm",
      "due_date": "2024-06-30",
      "owner": "Jason Li"
    },
    {
      "description": "Reach out to industry partners for additional datasets",
      "due_date": "2024-06-25",
      "owner": "Aisha Patel"
    },
    {
      "description": "Explore alternative LIDAR sensor configurations and report findings",
      "due_date": "2024-06-27",
      "owner": "Kevin Nguyen"
    },
    {
      "description": "Schedule extended stress tests for the integrated navigation system",
      "due_date": "2024-06-28",
      "owner": "Emily Chen"
    },
    {
      "description": "Retest the system after bug fixes and update the team",
      "due_date": "2024-07-01",
      "owner": "David Park"
    }
  ]
}

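Safe Structured Outputs

Safety is a top priority for OpenAI. The new Structured Outputs functionality abides by OpenAI's existing safety policies and still allows the model to refuse unsafe requests. To make development simpler, API responses carry a new refusal string value that lets developers programmatically detect whether the model generated a refusal instead of output matching the schema. When the response does not include a refusal and the model's response was not cut off prematurely (as indicated by finish_reason), the response will reliably be valid JSON that matches the supplied schema.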

{
  "id": "chatcmpl-9nYAG9LPNonX8DAyrkwYfemr3C8HC",
  "object": "chat.completion",
  "created": 1721596428,
  "model": "gpt-4o-2024-08-06",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "refusal": "I'm sorry, I cannot assist with that request."
      },
      "logprobs": null,
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 81,
    "completion_tokens": 11,
    "total_tokens": 92
  },
  "system_fingerprint": "fp_3407719c7f"
}
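A minimal sketch of that check, operating on the raw response as a Python dict. `extract_structured` and its status labels are hypothetical names; only the `refusal`, `finish_reason`, and `content` fields come from the response format shown above. Treating any `finish_reason` other than `"stop"` as incomplete is a conservative assumption.

```python
import json


def extract_structured(response):
    """Classify a chat completion dict as refused, incomplete, or parsed data."""
    choice = response["choices"][0]
    message = choice["message"]
    if message.get("refusal"):
        # The model declined: there is no schema-conforming content to parse.
        return {"status": "refused", "detail": message["refusal"]}
    if choice["finish_reason"] != "stop":
        # For example "length": the JSON may have been cut off mid-object.
        return {"status": "incomplete", "detail": choice["finish_reason"]}
    return {"status": "ok", "data": json.loads(message["content"])}


refused = {
    "choices": [
        {
            "index": 0,
            "message": {
                "role": "assistant",
                "refusal": "I'm sorry, I cannot assist with that request.",
            },
            "finish_reason": "stop",
        }
    ]
}
completed = {
    "choices": [
        {
            "index": 0,
            "message": {"role": "assistant", "content": "{\"final_answer\": \"x = -29 / 8\"}"},
            "finish_reason": "stop",
        }
    ]
}
print(extract_structured(refused)["status"], extract_structured(completed)["status"])
```

Only the `"ok"` branch calls `json.loads`, so a refusal or truncated response never reaches the parser.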

Reference: openai.com/index/intro…