How Does LangChain4j Support Structured Outputs?

This article explains how the LangChain4j framework turns LLM responses into structured output, for example a List, a Map, or a custom POJO.

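Many LLMs and LLM providers now support generating output in a structured format, usually JSON. Such output maps easily to Java objects and can be used in other parts of an application.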

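The term Structured Outputs carries more than one meaning and use:

  • Information extraction: the LLM pulls information out of unstructured text, and the result is mapped to a Java object.
  • The LLM's structured output capability, which applies to both response format and function calling.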

NOTE: Structured Outputs support changed slightly in LangChain4j 0.36.2. See the official documentation on structured outputs.

Structured output approaches

There are currently four ways to get structured output from an LLM, listed from most to least reliable:

  • JSON Schema (reliable, but with limitations)
  • Tools (Function Calling)
  • Prompting + JSON Mode
  • Prompting

JSON Schema

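JSON Schema is a declarative language, itself written in JSON, for describing and validating the structure of JSON data. It defines a set of rules that ensure JSON data conforms to a specific format and structure.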

Reference: What is JSON Schema
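
To make the definition concrete, here is a minimal sketch in plain Java with no LangChain4j dependency. It shows the kind of schema that describes the Person object used later in this article, and checks a parsed object against a hand-rolled, heavily simplified subset of JSON Schema (only `required` and primitive `type`). The class name `PersonSchemaSketch` and its members are hypothetical, and the schema text is included for illustration only; a real validator covers far more of the specification.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class PersonSchemaSketch {

    // A JSON Schema document describing a Person object (shown for illustration).
    static final String PERSON_SCHEMA = """
            {
              "type": "object",
              "properties": {
                "name":    { "type": "string"  },
                "age":     { "type": "integer" },
                "height":  { "type": "number"  },
                "married": { "type": "boolean" }
              },
              "required": ["name", "age", "height", "married"]
            }
            """;

    // Simplified in-memory view of the same rules: property name -> expected Java type.
    static final Map<String, Class<?>> PROPERTY_TYPES = new LinkedHashMap<>();
    static final List<String> REQUIRED = List.of("name", "age", "height", "married");

    static {
        PROPERTY_TYPES.put("name", String.class);
        PROPERTY_TYPES.put("age", Integer.class);
        PROPERTY_TYPES.put("height", Number.class);
        PROPERTY_TYPES.put("married", Boolean.class);
    }

    // Returns true if every required property is present and each known property has the expected type.
    static boolean conforms(Map<String, Object> json) {
        for (String key : REQUIRED) {
            if (!json.containsKey(key)) {
                return false;
            }
        }
        for (Map.Entry<String, Object> entry : json.entrySet()) {
            Class<?> expected = PROPERTY_TYPES.get(entry.getKey());
            if (expected != null && !expected.isInstance(entry.getValue())) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        Map<String, Object> john = Map.of("name", "John", "age", 42, "height", 1.75, "married", false);
        System.out.println(conforms(john));
        System.out.println(conforms(Map.of("name", "John")));
    }
}
```

The first call passes because all four required properties are present with matching types; the second fails because three required properties are missing.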

Which models support JSON Schema

  • OpenAI
  • Google AI Gemini
  • Azure OpenAI
  • GitHub Models

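At the time of writing, only OpenAI and Google AI Gemini support JSON Schema; support for Azure OpenAI and GitHub Models is expected soon. Other providers do not support it yet, though that will likely change as models and the framework evolve.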

Using ChatLanguageModel with JSON Schema

Goal: extract a Person from the following text.

John is 42 years old and lives an independent life. He stands 1.75 meters tall 
and carries himself with confidence. Currently unmarried, he enjoys the freedom to focus on his 
personal goals and interests.

The Person record

record Person(String name, int age, double height, boolean married) {  
}

Implementation

import com.fasterxml.jackson.core.JsonProcessingException;
import com.fasterxml.jackson.databind.ObjectMapper;
import dev.langchain4j.data.message.UserMessage;
import dev.langchain4j.model.chat.ChatLanguageModel;
import dev.langchain4j.model.chat.request.ChatRequest;
import dev.langchain4j.model.chat.request.ResponseFormat;
import dev.langchain4j.model.chat.request.ResponseFormatType;
import dev.langchain4j.model.chat.request.json.JsonObjectSchema;
import dev.langchain4j.model.chat.response.ChatResponse;
import dev.langchain4j.model.openai.OpenAiChatModel;

public class ChatLanguageModelWithJsonSchema {
    public static void main(String[] args) throws JsonProcessingException {
        ResponseFormat responseFormat = ResponseFormat.builder()
                .type(ResponseFormatType.JSON) // type can be either TEXT (default) or JSON
                .jsonSchema(dev.langchain4j.model.chat.request.json.JsonSchema.builder()
                        .name("Person") // OpenAI requires specifying the name for the schema
                        .rootElement(JsonObjectSchema.builder() // the root element is usually an object schema
                                .addStringProperty("name")
                                .addIntegerProperty("age")
                                .addNumberProperty("height")
                                .addBooleanProperty("married")
                                .required("name", "age", "height", "married") // required properties must be listed explicitly; unlisted ones are optional
                                .build())
                        .build())
                .build();

        UserMessage userMessage = UserMessage.from("""
                John is 42 years old and lives an independent life.
                He stands 1.75 meters tall and carries himself with confidence.
                Currently unmarried, he enjoys the freedom to focus on his personal goals and interests.
                """);

        ChatRequest chatRequest = ChatRequest.builder()
                .responseFormat(responseFormat)
                .messages(userMessage)
                .build();

        ChatLanguageModel chatModel = OpenAiChatModel.builder()
                .apiKey("demo")
                .modelName("gpt-4o-mini")
                .logRequests(true)
                .logResponses(true)
                .build();

        ChatResponse chat = chatModel.chat(chatRequest);

        String output = chat.aiMessage().text();
        System.out.println(output);  // {"name":"John","age":42,"height":1.75,"married":false}

        Person person = new ObjectMapper().readValue(output, Person.class);
        System.out.println(person);
    }
}

JSON Schema structure

The top-level interface is JsonSchemaElement, which has the following subtypes:

  • JsonObjectSchema: object types
  • JsonStringSchema: String and char/Character types
  • JsonIntegerSchema: int/Integer, long/Long, and BigInteger types
  • JsonNumberSchema: float/Float, double/Double, and BigDecimal types
  • JsonBooleanSchema: boolean types
  • JsonEnumSchema: enum types
  • JsonArraySchema: arrays and collections
  • JsonReferenceSchema: supports recursion (for example, Person has a Set<Person> children field); currently supported only by OpenAI
  • JsonAnyOfSchema: supports polymorphism; currently supported only by OpenAI

For how to build these objects and use the API, refer to the source code; they are not covered further here.

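Using JSON Schema comes with some limitations:

  • Only OpenAI and Gemini support it.
  • JSON Schema is not supported in streaming mode, which is a significant limitation.
  • JsonReferenceSchema and JsonAnyOfSchema are supported only by OpenAI.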

Using AI Services with JSON Schema

package org.ivy.structured.controller;

import dev.langchain4j.model.chat.ChatLanguageModel;
import dev.langchain4j.model.openai.OpenAiChatModel;
import dev.langchain4j.service.AiServices;

public class AiServicesWithJsonSchema {
    public static void main(String[] args) {
        ChatLanguageModel chatModel = OpenAiChatModel.builder() // in Spring Boot or Quarkus this bean is created automatically
                .apiKey("demo")
                .modelName("gpt-4o-mini")
                .responseFormat("json_schema") // required to enable JSON Schema for OpenAI
                .strictJsonSchema(true) // required to enable JSON Schema for OpenAI
                .logRequests(true)
                .logResponses(true)
                .build();

        PersonExtractor personExtractor = AiServices.create(PersonExtractor.class, chatModel); // likewise created automatically in Spring Boot or Quarkus

        String text = """
        John is 42 years old and lives an independent life.
        He stands 1.75 meters tall and carries himself with confidence.
        Currently unmarried, he enjoys the freedom to focus on his personal goals and interests.
        """;
        Person person = personExtractor.extractPersonFrom(text);
        System.out.println(person); // Person[name=John, age=42, height=1.75, married=false]
    }

    interface PersonExtractor {
        Person extractPersonFrom(String text);
    }

    record Person(String name, int age, double height, boolean married) {
    }
}

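This is simpler than the first approach, but it has limitations of its own:

  • Only OpenAI and Gemini support it.
  • JSON Schema is not supported in streaming mode, which is a significant limitation.
  • JSON Schema support must be enabled explicitly on the ChatLanguageModel.
  • JsonReferenceSchema and JsonAnyOfSchema are supported only by OpenAI.
  • The return type must be a POJO or Result<T>; any other type has to be wrapped in a POJO.
  • Every field in the generated JSON schema is required, and there is currently no way to make a field optional.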

NOTE: If the LLM does not support JSON Schema, if the feature is not enabled, or if the return type is not a POJO, the AI Service falls back to prompting.

Tools (Function Calling)

With function calling, each tool (function) is described to the model. When the LLM returns an AiMessage containing a ToolExecutionRequest, the JSON in ToolExecutionRequest.arguments() is parsed into a POJO.
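
The last step, turning the arguments JSON into a POJO, can be sketched in plain Java. The example below is an illustration under assumptions, not LangChain4j code: the string literal stands in for the value of ToolExecutionRequest.arguments(), `ToolArgumentsSketch` and its methods are hypothetical names, and the regex-based parser only handles flat objects with string, number, and boolean values. In a real application a JSON library such as Jackson would do this work.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ToolArgumentsSketch {

    record Person(String name, int age, double height, boolean married) {
    }

    // Matches "key": value pairs where the value is a string, a number, or a boolean.
    // Toy parser for flat objects only; it does not handle nesting or escaped quotes.
    private static final Pattern PAIR =
            Pattern.compile("\"(\\w+)\"\\s*:\\s*(\"([^\"]*)\"|-?\\d+(?:\\.\\d+)?|true|false)");

    static Map<String, String> flatJsonToMap(String json) {
        Map<String, String> values = new HashMap<>();
        Matcher matcher = PAIR.matcher(json);
        while (matcher.find()) {
            // Group 3 holds the content of a quoted string; otherwise use the raw literal.
            String value = matcher.group(3) != null ? matcher.group(3) : matcher.group(2);
            values.put(matcher.group(1), value);
        }
        return values;
    }

    // Maps the tool arguments onto the target record, converting each value to its field type.
    static Person toPerson(String argumentsJson) {
        Map<String, String> args = flatJsonToMap(argumentsJson);
        return new Person(
                args.get("name"),
                Integer.parseInt(args.get("age")),
                Double.parseDouble(args.get("height")),
                Boolean.parseBoolean(args.get("married")));
    }

    public static void main(String[] args) {
        // Stand-in for the JSON returned by ToolExecutionRequest.arguments().
        String arguments = "{\"name\":\"John\",\"age\":42,\"height\":1.75,\"married\":false}";
        System.out.println(toPerson(arguments));
    }
}
```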

Prompting + JSON Mode

coming soon

Prompting

coming soon

LangChain4j structured output examples

Supported return types

AI Services support the following return types:

  • String
  • Primitive types: boolean/byte/short/int/long/float/double
  • Wrapper and numeric object types: Boolean/Byte/Short/Integer/Long/Float/Double/BigDecimal
  • Date and time types: Date/LocalDate/LocalTime/LocalDateTime
  • Collection types: List<String>/Set<String>
  • Enum types
  • Custom POJOs
  • Result<T>
  • AiMessage, the model's reply message

Boolean and enum

import dev.langchain4j.service.UserMessage;
import dev.langchain4j.service.spring.AiService;

@AiService
public interface SentimentAnalyzer {
    @UserMessage("Does {{it}} have positive sentiment?")
    boolean isPositive(String text);

    @UserMessage("Analyze sentiment of {{it}}")
    Sentiment analyzeSentimentOf(String text);

    enum Sentiment {
        POSITIVE, NEGATIVE, NEUTRAL
    }
}

HTTP tests [http/sentiment-rest.http]

### Test returning a boolean
GET http://localhost:8806/boolean?prompt=This is great!
### Test returning a boolean
GET http://localhost:8806/boolean?prompt=It's awful!
### Test returning an enum
GET http://localhost:8806/enum?prompt=This is great!
### Test returning an enum
GET http://localhost:8806/enum?prompt=I don't know.

Numeric types

package org.ivy.chatmemory.service;

import dev.langchain4j.service.UserMessage;
import dev.langchain4j.service.spring.AiService;

import java.math.BigDecimal;
import java.math.BigInteger;

@AiService
public interface NumberExtractor {
    @UserMessage("Extract a number from {{it}}")
    int extractInt(String text);

    @UserMessage("Extract a long number from {{it}}")
    long extractLong(String text);

    @UserMessage("Extract a big integer from {{it}}")
    BigInteger extractBigInteger(String text);

    @UserMessage("Extract a float number from {{it}}")
    float extractFloat(String text);

    @UserMessage("Extract a double number from {{it}}")
    double extractDouble(String text);

    @UserMessage("Extract a big decimal from {{it}}")
    BigDecimal extractBigDecimal(String text);
}

HTTP tests [http/number-extractor-rest.http]

prompt: After countless millennia of computation, the supercomputer Deep Thought finally announced that the answer to the ultimate question of life, the universe, and everything was forty two.

### Test returning an int
GET http://localhost:8806/int?prompt=
### Test returning a BigInteger
GET http://localhost:8806/integer?prompt=
### Test returning a float
GET http://localhost:8806/float?prompt=
### Test returning a double
GET http://localhost:8806/double?prompt=
### Test returning a long
GET http://localhost:8806/long?prompt=
### Test returning a BigDecimal
GET http://localhost:8806/bigDecimal?prompt=xxx

Bean POJO

  • Simple approach
package org.ivy.chatmemory.service;

import dev.langchain4j.service.UserMessage;
import dev.langchain4j.service.spring.AiService;
import lombok.Getter;
import lombok.Setter;
import lombok.ToString;

import java.time.LocalDate;

@AiService
public interface PojoExtractor {

    @UserMessage("Extract information about a person from {{it}}")
    Person extractPerson(String text);


    @Getter
    @Setter
    @ToString
    class Person {
        private String firstName;
        private String lastName;
        private LocalDate birthDate;
    }
}

HTTP tests [http/pojo-extractor-rest.http]

GET localhost:8806/pojo?prompt=In 1968, amidst the fading echoes of Independence Day, a child named John arrived under the calm evening sky. This newborn, bearing the surname Doe, marked the start of a new journey.
accept: application/json

Output:

{
  "firstName": "John",
  "lastName": "Doe",
  "birthDate": "1968-07-04"
}
  • POJO with @Description annotations
package org.ivy.chatmemory.service;

import dev.langchain4j.model.input.structured.StructuredPrompt;
import dev.langchain4j.model.output.structured.Description;
import dev.langchain4j.service.spring.AiService;
import lombok.AllArgsConstructor;
import lombok.Getter;
import lombok.Setter;
import lombok.ToString;

import java.time.LocalDate;
import java.util.List;

@AiService
public interface PojoDescription {

    // ========= Given ingredients, produce the steps and time for a dish =================
    /**
     * Creates a dish from a set of ingredients.
     *
     * @param ingredients the ingredients of the dish
     * @return Recipe
     */
    Recipe createRecipe(String... ingredients);

    Recipe createRecipe(CreateRecipePrompt prompt);

    /**
     * Cooking steps, preparation time, and so on.
     */
    @Getter
    @Setter
    @ToString
    class Recipe {
        // Descriptions attached to fields
        @Description("short title, 3 words maximum")
        private String title;
        @Description("short description, 2 sentences maximum")
        private String description;
        @Description("each step should be described in 4 words, steps should be rhyme")
        private List<String> steps;
        private Integer preparationTimeMinutes;
    }

    @Getter
    @AllArgsConstructor
    @StructuredPrompt("Create a recipe of a {{dish}} that can be prepared using only {{ingredients}}")
    class CreateRecipePrompt {
        private String dish;
        private List<String> ingredients;
    }
}

HTTP tests

### Cucumber, tomato, feta, onion, olives
GET http://localhost:8806/recipe?prompt="cucumber", "tomato", "feta", "onion", "olives"

###
GET http://localhost:8806/recipe2?prompt=["cucumber", "tomato", "feta", "onion", "olives"]
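
The `{{it}}`, `{{dish}}`, and `{{ingredients}}` placeholders used in the @UserMessage and @StructuredPrompt annotations above are template variables that get filled in before the prompt is sent. The following is a minimal sketch of that substitution in plain Java. `PromptTemplateSketch` and `render` are hypothetical names, and this is not the library's template engine; in particular, how LangChain4j formats a list value may differ from the plain `toString()` used here.

```java
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class PromptTemplateSketch {

    // Matches {{name}} with optional whitespace inside the braces.
    private static final Pattern VARIABLE = Pattern.compile("\\{\\{\\s*(\\w+)\\s*\\}\\}");

    // Replaces every {{name}} placeholder with the string form of the matching variable.
    static String render(String template, Map<String, Object> variables) {
        Matcher matcher = VARIABLE.matcher(template);
        StringBuilder result = new StringBuilder();
        while (matcher.find()) {
            String name = matcher.group(1);
            if (!variables.containsKey(name)) {
                throw new IllegalArgumentException("Missing value for variable: " + name);
            }
            String value = String.valueOf(variables.get(name));
            // quoteReplacement keeps '$' and '\' in the value from being treated as regex references.
            matcher.appendReplacement(result, Matcher.quoteReplacement(value));
        }
        matcher.appendTail(result);
        return result.toString();
    }

    public static void main(String[] args) {
        String template = "Create a recipe of a {{dish}} that can be prepared using only {{ingredients}}";
        String prompt = render(template, Map.of(
                "dish", "salad",
                "ingredients", List.of("cucumber", "tomato", "feta", "onion", "olives")));
        System.out.println(prompt);
    }
}
```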

Date and time

package org.ivy.chatmemory.service;

import dev.langchain4j.service.UserMessage;
import dev.langchain4j.service.spring.AiService;

import java.time.LocalDate;
import java.time.LocalDateTime;
import java.time.LocalTime;

@AiService
public interface DateTimeExtractor {
    @UserMessage("Extract date from {{it}}")
    LocalDate extractDate(String text);

    @UserMessage("Extract time from {{it}}")
    LocalTime extractTime(String text);

    @UserMessage("Extract date and time from {{it}}")
    LocalDateTime extractDateTime(String text);
}
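
Each method above ultimately has to turn the model's text reply into a java.time value. Below is a minimal sketch of that conversion using only the JDK. It assumes the model was instructed to answer in ISO-8601 form (for example `1968-07-04`); the exact format LangChain4j requests is not shown in this article, so treat that as an assumption. `DateTimeParsingSketch` is a hypothetical name.

```java
import java.time.LocalDate;
import java.time.LocalDateTime;
import java.time.LocalTime;

class DateTimeParsingSketch {

    // Expects ISO-8601 dates such as 1968-07-04; throws DateTimeParseException otherwise.
    static LocalDate parseDate(String text) {
        return LocalDate.parse(text.trim());
    }

    // Expects ISO-8601 times such as 23:45:00.
    static LocalTime parseTime(String text) {
        return LocalTime.parse(text.trim());
    }

    // Expects ISO-8601 date-times such as 1968-07-04T23:45:00.
    static LocalDateTime parseDateTime(String text) {
        return LocalDateTime.parse(text.trim());
    }

    public static void main(String[] args) {
        System.out.println(parseDate("1968-07-04"));
        System.out.println(parseTime("23:45:00"));
        System.out.println(parseDateTime("1968-07-04T23:45:00"));
    }
}
```

Trimming matters in practice because models often surround the value with whitespace or a trailing newline; any other extra text makes the parse fail with an exception.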

HTTP tests [date-time-extractor-rest.http]

### Test returning a LocalDate
GET http://localhost:8806/extract-date?prompt=The tranquility pervaded the evening of 1968, just fifteen minutes shy of midnight, following the celebrations of Independence Day.

### Test returning a LocalTime
GET http://localhost:8806/extract-time?prompt=The tranquility pervaded the evening of 1968, just fifteen minutes shy of midnight, following the celebrations of Independence Day.

### Test returning a LocalDateTime
GET http://localhost:8806/extract-datetime?prompt=The tranquility pervaded the evening of 1968, just fifteen minutes shy of midnight, following the celebrations of Independence Day.

Result<T>

package org.ivy.chatmemory.service;

import dev.langchain4j.model.input.structured.StructuredPrompt;
import dev.langchain4j.service.Result;
import dev.langchain4j.service.spring.AiService;
import lombok.AllArgsConstructor;
import lombok.Data;
import lombok.Getter;

import java.time.LocalDate;

@AiService
public interface ResultExtractor {

    Result<Film> film(FilmPrompt prompt);

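    // =============== Ask for a director's most popular film and return its details ==========
    @Getter
    @AllArgsConstructor
    @StructuredPrompt("What is the most popular film directed by {{director}}? In which year was it released, and what is it about?")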
    class FilmPrompt {
        private String director;
    }

    @Data
    class Film {
        private String title;
        private String director;
        private LocalDate publicationDate;
        private String description;
    }
}

HTTP tests [result-extractor-rest.http]

### Most popular film directed by Zhang Yimou
GET http://localhost:8806/film?prompt=张艺谋

Source code analysis of structured output

OutputParser

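OutputParser is the common interface for all parsers. It defines two methods: one parses the text, and the other describes the format the text is expected to follow.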

    package dev.langchain4j.model.output;
    
    public interface OutputParser<T> {
        /**
         * Converts text to type T.
         * @param text the text to convert
         * @return the converted value.
         */
        T parse(String text);

        /**
         * Describes the format the text must follow.
         * @return the format description as text.
         */
        String formatInstructions();
    }
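
To show what an implementation of this contract can look like, here is a minimal sketch with two parsers. It declares a local copy of the interface so that it compiles without LangChain4j. `IntParser` and `EnumParser` are hypothetical classes, not the library's own implementations, and the wording returned by `formatInstructions()` is an assumption chosen for illustration.

```java
import java.util.Arrays;

class OutputParserSketch {

    // Local mirror of the interface shown above, so the sketch has no external dependency.
    interface OutputParser<T> {
        T parse(String text);

        String formatInstructions();
    }

    // Hypothetical parser for integer return types.
    static class IntParser implements OutputParser<Integer> {
        @Override
        public Integer parse(String text) {
            return Integer.parseInt(text.trim());
        }

        @Override
        public String formatInstructions() {
            return "integer number";
        }
    }

    // Hypothetical parser for enum return types; matching is case-insensitive.
    static class EnumParser<E extends Enum<E>> implements OutputParser<E> {
        private final Class<E> enumClass;

        EnumParser(Class<E> enumClass) {
            this.enumClass = enumClass;
        }

        @Override
        public E parse(String text) {
            String wanted = text.trim();
            for (E constant : enumClass.getEnumConstants()) {
                if (constant.name().equalsIgnoreCase(wanted)) {
                    return constant;
                }
            }
            throw new IllegalArgumentException("Unknown enum value: " + text);
        }

        @Override
        public String formatInstructions() {
            // Lists the allowed constants so the model knows which answers are valid.
            return "one of " + Arrays.toString(enumClass.getEnumConstants());
        }
    }

    enum Sentiment { POSITIVE, NEGATIVE, NEUTRAL }

    public static void main(String[] args) {
        OutputParser<Integer> intParser = new IntParser();
        System.out.println(intParser.formatInstructions());
        System.out.println(intParser.parse(" 42 "));

        OutputParser<Sentiment> enumParser = new EnumParser<>(Sentiment.class);
        System.out.println(enumParser.formatInstructions());
        System.out.println(enumParser.parse("positive"));
    }
}
```

The two methods work as a pair: `formatInstructions()` tells the model what shape to answer in, and `parse()` relies on the model having followed that instruction.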

Implementations

[Figure: OutputParser implementation classes]

ServiceOutputParser

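This is where the real logic lives. Two methods matter most: the one that parses the response and the one that describes the return value.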

public static String outputFormatInstructions(Class<?> returnType) {
    // No format instructions are needed when the return type is String or one of the model's own types
    if (returnType == String.class
            || returnType == AiMessage.class
            || returnType == TokenStream.class
            || returnType == Response.class) {
        return "";
    }
    // A method's return type must not be void; this is a hard rule
    if (returnType == void.class) {
        throw illegalConfiguration("Return type of method '%s' cannot be void");
    }
    // Enum return types get special handling
    if (returnType.isEnum()) {
        String formatInstructions = new EnumOutputParser(returnType.asSubclass(Enum.class)).formatInstructions();
        return "\nYou must answer strictly in the following format: " + formatInstructions;
    }
    // For basic types, look up the parser in OUTPUT_PARSERS, which maps each type to its parser
    OutputParser<?> outputParser = OUTPUT_PARSERS.get(returnType);
    if (outputParser != null) {
        String formatInstructions = outputParser.formatInstructions();
        return "\nYou must answer strictly in the following format: " + formatInstructions;
    }
    // List/Set return types
    if (returnType == List.class || returnType == Set.class) {
        return "\nYou must put every item on a separate line.";
    }
    // Anything else is treated as an object and described as JSON
    return "\nYou must answer strictly in the following JSON format: " + jsonStructure(returnType, new HashSet<>());
}
// Builds the JSON structure description
private static String jsonStructure(Class<?> structured, Set<Class<?>> visited) {
    StringBuilder jsonSchema = new StringBuilder();

    jsonSchema.append("{\n");
    for (Field field : structured.getDeclaredFields()) {
        String name = field.getName();
        if (name.equals("__$hits$__") || java.lang.reflect.Modifier.isStatic(field.getModifiers())) {
            // Skip coverage instrumentation field.
            continue;
        }
        jsonSchema.append(format("\"%s\": (%s),\n", name, descriptionFor(field, visited)));
    }

    int trailingCommaIndex = jsonSchema.lastIndexOf(",");
    if (trailingCommaIndex > 0) {
        jsonSchema.delete(trailingCommaIndex, trailingCommaIndex +1);
    }
    jsonSchema.append("}");
    return jsonSchema.toString();
}

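This method builds the description of the return format, which is sent to the LLM together with the prompt. The LLM then answers according to both the format instructions and the prompt. Once the response comes back, the parse method converts it to the return type.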
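
The idea can be tried out in isolation. The sketch below is a simplified, self-contained imitation of the reflection step, not the library code: `FormatInstructionsSketch` and its methods are hypothetical, and because `descriptionFor` is not shown in this article, the `(type: ...)` wording is an assumption that simply prints each field's simple type name.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;

class FormatInstructionsSketch {

    static class Person {
        String firstName;
        String lastName;
        int age;
        static int instances; // static fields are not part of the JSON structure
    }

    // Describes each non-static field as "name": (type: T), one per line.
    static String jsonStructure(Class<?> type) {
        StringBuilder json = new StringBuilder("{\n");
        for (Field field : type.getDeclaredFields()) {
            if (Modifier.isStatic(field.getModifiers()) || field.isSynthetic()) {
                continue;
            }
            json.append(String.format("\"%s\": (type: %s),\n",
                    field.getName(), field.getType().getSimpleName()));
        }
        // Drop the comma after the last entry, as the original method does.
        int trailingComma = json.lastIndexOf(",");
        if (trailingComma > 0) {
            json.deleteCharAt(trailingComma);
        }
        return json.append("}").toString();
    }

    // Appends the format instructions to the user's message before it is sent to the model.
    static String userMessageWithInstructions(String userMessage, Class<?> returnType) {
        return userMessage
                + "\nYou must answer strictly in the following JSON format: "
                + jsonStructure(returnType);
    }

    public static void main(String[] args) {
        System.out.println(userMessageWithInstructions(
                "Extract information about a person from: John Doe, aged 42", Person.class));
    }
}
```

One design consequence worth noting: because the structure is derived from declared fields, renaming a field changes the prompt the model sees, so field names act as part of the instructions.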

public static Object parse(Response<AiMessage> response, Class<?> returnType) {

    if (returnType == Response.class) {
        return response;
    }
    // AiMessage return type
    AiMessage aiMessage = response.content();
    if (returnType == AiMessage.class) {
        return aiMessage;
    }
    // String return type
    String text = aiMessage.text();
    if (returnType == String.class) {
        return text;
    }
    // Basic types
    OutputParser<?> outputParser = OUTPUT_PARSERS.get(returnType);
    if (outputParser != null) {
        return outputParser.parse(text);
    }
    // List return type
    if (returnType == List.class) {
        return asList(text.split("\n"));
    }
    // Set return type
    if (returnType == Set.class) {
        return new HashSet<>(asList(text.split("\n")));
    }
    // Deserialize the JSON into an object
    return Json.fromJson(text, returnType);
}
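
The dispatch-on-return-type pattern used by parse can be exercised with a small stand-alone version. The sketch below handles only a handful of types and works on raw text instead of a Response<AiMessage>; `ReturnTypeDispatchSketch` is a hypothetical name, and the use of LinkedHashSet (to keep insertion order) is a choice made here, whereas the original uses HashSet.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

class ReturnTypeDispatchSketch {

    // Converts raw model text to the requested return type; only a few types are handled.
    static Object parse(String text, Class<?> returnType) {
        if (returnType == String.class) {
            return text;
        }
        if (returnType == int.class || returnType == Integer.class) {
            return Integer.parseInt(text.trim());
        }
        if (returnType == boolean.class || returnType == Boolean.class) {
            return Boolean.parseBoolean(text.trim());
        }
        // Lists and sets rely on the "one item per line" instruction given to the model.
        if (returnType == List.class) {
            return Arrays.asList(text.split("\n"));
        }
        if (returnType == Set.class) {
            return new LinkedHashSet<>(Arrays.asList(text.split("\n")));
        }
        throw new IllegalArgumentException("Unsupported return type: " + returnType);
    }

    public static void main(String[] args) {
        System.out.println(parse("cucumber\ntomato\nfeta", List.class));
        System.out.println(parse("cucumber\ntomato\ncucumber", Set.class));
        System.out.println(parse(" 42 ", int.class));
    }
}
```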

Sample code and summary

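This article took a close look at how LangChain4j produces structured output and analyzed how the framework implements it and the flow it follows. The overall idea is not hard; the source of the following classes is the place to focus: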

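  • DefaultAiServices#build, which holds the core flow
  • ServiceOutputParser, which builds the format instructions sent to the LLM and parses the returned text into the matching returnType
  • The OutputParser interface and its 14 implementations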

Once these parts are understood, structured output becomes clear.

Sample code on GitHub