LangChain4j's AI Services (AiServices)


The AI Service Concept

  • An "AI Service" hides the complexity of interacting with the LLM and other components behind a simple API.
  • You declaratively define an interface for the API you need, and LangChain4j supplies an object (a proxy) that implements it. An AI Service can be thought of as middleware in your application's service layer.

Why AI Services Are Considered a High-Level Feature

  • Low code complexity: declarative interfaces; the low-level plumbing is handled automatically.
  • Fast to use: AiServices wraps the low-level code in a single layer, so calling it enables rapid development.
  • Built-in standardization: for example, numbers, dates, and similar values are automatically converted to and from JSON.
  • Highly extensible: tools and memory strategies can be added through annotations and configuration.

A Minimal AI Service

  • Define an interface with a chat method.
interface Assistant {
    String chat(String userMessage);
}
  • Create the chat model.
ChatLanguageModel model = OpenAiChatModel.builder()
    .apiKey(System.getenv("OPENAI_API_KEY"))
    .modelName(GPT_4_O_MINI)
    .build();
  • Build the AI Service.
Assistant assistant = AiServices.create(Assistant.class, model);
  • Call the model through the generated proxy.
// the proxy forwards the call to the underlying model
String answer = assistant.chat("Hello");
System.out.println(answer); // Hello, how can I help you?

AI Service Annotations

@SystemMessage

  • Specifies the system message/template to use.

Usage example

interface Friend {
    @SystemMessage("You are a good friend of mine. Answer using slang.")
    String chat(String userMessage);
}
Friend friend = AiServices.create(Friend.class, model);
String answer = friend.chat("Hello"); // Hey! What's up?

When chat() is called, the text in @SystemMessage is sent to the model as the system prompt; in other words, the AI is told "You are a good friend of mine. Answer using slang." and then answers the userMessage accordingly.

@SystemMessage can also load its template from a resource, e.g. @SystemMessage(fromResource = "my-prompt-template.txt"); the entire contents of my-prompt-template.txt are then used as the system prompt.

Defining the System Message Dynamically via a Provider

Friend friend = AiServices.builder(Friend.class)
    .chatLanguageModel(model)
    .systemMessageProvider(chatMemoryId -> "You are a good friend of mine. Answer using slang.")
    .build();
  • This lets you supply a different system message per chat memory ID (per user or per conversation).
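To actually vary the message per conversation, the provider can look the ID up in a map. The sketch below shows only the provider function, which is plain Java and testable on its own; the persona names and the map are illustrative assumptions, and the commented lines show where it would plug into the builder.

```java
import java.util.Map;
import java.util.function.Function;

// Sketch: choosing a system message per chat memory ID. The map keys and
// persona texts are illustrative assumptions, not part of LangChain4j.
class PersonaProvider {

    static final Map<Object, String> PERSONAS = Map.<Object, String>of(
            "casual", "You are a good friend of mine. Answer using slang.",
            "formal", "You are a professional assistant. Answer formally.");

    // This is the function you would hand to .systemMessageProvider(...):
    static final Function<Object, String> SYSTEM_MESSAGE_PROVIDER =
            memoryId -> PERSONAS.getOrDefault(memoryId, "You are a helpful assistant.");

    public static void main(String[] args) {
        // Wiring into the service (requires a configured model):
        // Friend friend = AiServices.builder(Friend.class)
        //     .chatLanguageModel(model)
        //     .systemMessageProvider(PersonaProvider.SYSTEM_MESSAGE_PROVIDER)
        //     .build();
        System.out.println(SYSTEM_MESSAGE_PROVIDER.apply("casual"));
    }
}
```

Unknown memory IDs fall back to the default message, so a new conversation still gets a valid system prompt.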

@UserMessage

  • Useful when the model does not support system messages, or when you simply prefer to put everything in the user message.

interface Friend {
    @UserMessage("You are a good friend of mine. Answer using slang. {{it}}")
    String chat(String userMessage);
}
Friend friend = AiServices.create(Friend.class, model);

String answer = friend.chat("Hello"); // Hey! What's shakin'?

@V

  • Assigns a custom name to a template variable.
interface Friend {
    @UserMessage("You are a good friend of mine. Answer using slang. {{message}}")
    String chat(@V("message") String userMessage);
}
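Conceptually, the service replaces each {{name}} placeholder with the argument bound by @V before sending the prompt. The stand-alone demo below mimics that substitution; it is illustrative only, not LangChain4j's actual template engine.

```java
// Illustrative only: mimics how a {{variable}} placeholder is filled with the
// @V-bound argument. This is not LangChain4j's actual implementation.
class TemplateDemo {

    static String fill(String template, String name, String value) {
        return template.replace("{{" + name + "}}", value);
    }

    public static void main(String[] args) {
        String template = "You are a good friend of mine. Answer using slang. {{message}}";
        // chat("Hello") binds "Hello" to the "message" variable:
        System.out.println(fill(template, "message", "Hello"));
        // -> You are a good friend of mine. Answer using slang. Hello
    }
}
```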

The following combinations come from the official documentation:

@SystemMessage("Given a name of a country, answer with a name of its capital")
String chat(String userMessage);

@SystemMessage("Given a name of a country, answer with a name of its capital")
String chat(@UserMessage String userMessage);

@SystemMessage("Given a name of a country, {{answerInstructions}}")
String chat(@V("answerInstructions") String answerInstructions, @UserMessage String userMessage);

@SystemMessage("Given a name of a country, answer with a name of its capital")
String chat(@UserMessage String userMessage, @V("country") String country); // userMessage contains the "{{country}}" template variable

@SystemMessage("Given a name of a country, {{answerInstructions}}")
String chat(@V("answerInstructions") String answerInstructions, @UserMessage String userMessage, @V("country") String country); // userMessage contains the "{{country}}" template variable

@SystemMessage("Given a name of a country, answer with a name of its capital")
@UserMessage("Germany")
String chat();

@SystemMessage("Given a name of a country, {{answerInstructions}}")
@UserMessage("Germany")
String chat(@V("answerInstructions") String answerInstructions);

@SystemMessage("Given a name of a country, answer with a name of its capital")
@UserMessage("{{it}}")
String chat(String country);

@SystemMessage("Given a name of a country, answer with a name of its capital")
@UserMessage("{{country}}")
String chat(@V("country") String country);

@SystemMessage("Given a name of a country, {{answerInstructions}}")
@UserMessage("{{country}}")
String chat(@V("answerInstructions") String answerInstructions, @V("country") String country);

Multimodality

  • Not yet supported by AI Services (at the time of writing).

Return Types

  • If the return type is String, the LLM output is returned as-is, without any processing.
  • For any other return type, the AI Service parses the LLM output into that type before returning it.
  • The AI Service can also wrap the result in Result<T> to expose metadata alongside the content.

Result

  • content: the parsed result; in the example below it is mapped to List<String>.

  • tokenUsage: the number of tokens consumed by this call, covering both input and output tokens.

  • sources: the data sources consulted while generating the result (populated when RAG with a vector store is configured); returns List<Content>.

  • finishReason: why the model stopped generating, useful for checking whether the result completed normally; returns FinishReason.

  • toolExecutions: the record of tools the model invoked during execution (populated when tools annotated with @Tool are configured); returns List<ToolExecution>.

ToolExecutionRequest Fields

  • id: the unique identifier of this tool call, useful for quickly locating it.
  • name: the name of the invoked tool method, matching the method registered via @Tool.
  • arguments: the call arguments, serialized as JSON.
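A hedged sketch of how these fields are typically consumed: in real code you would iterate result.toolExecutions() and read each request (via its request() accessor, per LangChain4j's API). The record below merely mimics the shape of ToolExecutionRequest (id/name/arguments) so the loop can run standalone without the library.

```java
import java.util.List;

// Consuming loop over tool-call requests. FakeToolRequest below is a
// standalone mimic of ToolExecutionRequest's accessors, not the real class.
class ToolCallInspector {

    static String describe(List<FakeToolRequest> requests) {
        StringBuilder sb = new StringBuilder();
        for (FakeToolRequest req : requests) {
            sb.append(req.id()).append(": ")
              .append(req.name()).append('(').append(req.arguments()).append(")\n");
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        List<FakeToolRequest> calls = List.of(
                new FakeToolRequest("call_1", "add", "{\"a\": 1, \"b\": 2}"));
        System.out.print(describe(calls));
    }
}

// Mimics ToolExecutionRequest's shape so the example runs without the library.
record FakeToolRequest(String id, String name, String arguments) {}
```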

That covers Result; here is a usage example:

interface Assistant {  
    @UserMessage("Generate an outline for the article on the following topic: {{it}}")
    Result<List<String>> generateOutlineFor(String topic);
}
Result<List<String>> result = assistant.generateOutlineFor("Java");

List<String> outline = result.content();
TokenUsage tokenUsage = result.tokenUsage();
List<Content> sources = result.sources();
List<ToolExecution> toolExecutions = result.toolExecutions();
FinishReason finishReason = result.finishReason();

Structured Output

Returning boolean

interface SentimentAnalyzer {
    @UserMessage("Does {{it}} have a positive sentiment?")
    // the method returns boolean
    boolean isPositive(String text);
}

SentimentAnalyzer sentimentAnalyzer = AiServices.create(SentimentAnalyzer.class, model);
// the returned value is also a boolean
boolean positive = sentimentAnalyzer.isPositive("It's wonderful!");
// example result: true

Returning an Enum

// the enum returned to the caller
enum Priority {
    CRITICAL, HIGH, LOW
}

interface PriorityAnalyzer {    
    @UserMessage("Analyze the priority of the following issue: {{it}}")
    Priority analyzePriority(String issueDescription);
}

PriorityAnalyzer priorityAnalyzer = AiServices.create(PriorityAnalyzer.class, model);

Priority priority = priorityAnalyzer.analyzePriority("The main payment gateway is down, and customers cannot process transactions.");
// example result: CRITICAL

Returning a POJO

// the POJO class
class Person {
    @Description("first name of a person") // an optional description helps the LLM understand the field
    String firstName;
    String lastName;
    LocalDate birthDate;
    Address address;
}

@Description("an address")
class Address {
    String street;
    Integer streetNumber;
    String city;
}

interface PersonExtractor {

    @UserMessage("Extract information about a person from {{it}}")
    Person extractPersonFrom(String text);
}

PersonExtractor personExtractor = AiServices.create(PersonExtractor.class, model);

String text = """
            In 1968, amidst the fading echoes of Independence Day,
            a child named John arrived under the calm evening sky.
            This newborn, bearing the surname Doe, marked the start of a new journey.
            He was welcomed into the world at 345 Whispering Pines Avenue
            a quaint street nestled in the heart of Springfield
            an abode that echoed with the gentle hum of suburban dreams and aspirations.
            """;

Person person = personExtractor.extractPersonFrom(text);

System.out.println(person); 
  • Example result: Person { firstName = "John", lastName = "Doe", birthDate = 1968-07-04, address = Address { ... } }

JSON Mode

  • When extracting a custom POJO (actually JSON, which is then parsed into the POJO), it is recommended to enable "JSON mode" in the model configuration. This forces the LLM to respond with valid JSON.

Usage example:

// OpenAI, newer models such as gpt-4o-mini or gpt-4o-2024-08-06
OpenAiChatModel.builder()
    ...
    .supportedCapabilities(RESPONSE_FORMAT_JSON_SCHEMA)
    .strictJsonSchema(true)
    .build();
    
// OpenAI, older models such as gpt-3.5-turbo or gpt-4
OpenAiChatModel.builder()
    ...
    .responseFormat("json_object")
    .build();

OpenAI is used as the example here; see the official documentation for other providers.

Streaming

TokenStream


interface Assistant {

    TokenStream chat(String message);
}

StreamingChatLanguageModel model = OpenAiStreamingChatModel.builder()
    .apiKey(System.getenv("OPENAI_API_KEY"))
    .modelName(GPT_4_O_MINI)
    .build();

Assistant assistant = AiServices.create(Assistant.class, model);

TokenStream tokenStream = assistant.chat("Tell me a joke");

// consume the stream
tokenStream.onPartialResponse((String partialResponse) -> System.out.println(partialResponse))
    .onRetrieved((List<Content> contents) -> System.out.println(contents))
    .onToolExecuted((ToolExecution toolExecution) -> System.out.println(toolExecution))
    .onCompleteResponse((ChatResponse response) -> System.out.println(response))
    .onError((Throwable error) -> error.printStackTrace())
    .start();
  • Looking at TokenStream's API, it exposes the following callbacks:

  • onPartialResponse: the given consumer is invoked each time the model generates a new partial response.
  • onRetrieved: the given consumer is invoked with the content retrieved (during RAG).
  • onToolExecuted: the given consumer is invoked whenever a tool execution completes.
  • onCompleteResponse: the given handler is invoked when the model finishes streaming its response.
  • onError: the given consumer is invoked when an error occurs during streaming.

Flux

  • Maven dependency:
<dependency>
    <groupId>dev.langchain4j</groupId>
    <artifactId>langchain4j-reactor</artifactId>
    <version>1.0.0-beta3</version>
</dependency>
  • Usage example:
interface Assistant {
  Flux<String> chat(String message);
}
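Consuming the Flux then works with ordinary Reactor operators. A hedged fragment, assuming the langchain4j-reactor dependency above and a configured StreamingChatLanguageModel named `model`:

```java
// Hedged sketch: consuming the streamed tokens with Reactor.
Assistant assistant = AiServices.create(Assistant.class, model);

assistant.chat("Tell me a joke")
    .subscribe(
        token -> System.out.print(token),      // each partial token
        error -> error.printStackTrace(),      // stream failed
        () -> System.out.println("\n[done]")); // stream completed
```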

Chat Memory

  • For a single user:
Assistant assistant = AiServices.builder(Assistant.class)
    .chatLanguageModel(model)
    // keep at most the last 10 messages in memory
    .chatMemory(MessageWindowChatMemory.withMaxMessages(10))
    .build();
  • For multiple users, use a ChatMemoryProvider so each conversation gets its own memory instance.
  • Assigning a memory ID avoids sharing a single session; passing a distinct ID on each call keeps conversations isolated.
interface Assistant  {
    String chat(@MemoryId int memoryId, @UserMessage String message);
}

Assistant assistant = AiServices.builder(Assistant.class)
    .chatLanguageModel(model)
    .chatMemoryProvider(memoryId -> MessageWindowChatMemory.withMaxMessages(10))
    .build();

String answerToKlaus = assistant.chat(1, "Hello, my name is Klaus");
String answerToFrancine = assistant.chat(2, "Hello, my name is Francine");
  • Retrieve a conversation's memory (in recent LangChain4j versions the service interface must also extend ChatMemoryAccess for these accessors):
List<ChatMessage> messagesWithKlaus = assistant.getChatMemory(1).messages();
  • Evict a conversation's memory:
boolean chatMemoryWithFrancineEvicted = assistant.evictChatMemory(2);

Tool Calling

  • Example from the official documentation:
class Tools {    
    @Tool
    int add(int a, int b) {
        return a + b;
    }

    @Tool
    int multiply(int a, int b) {
        return a * b;
    }
}

Assistant assistant = AiServices.builder(Assistant.class)
    .chatLanguageModel(model)
    // register the tools
    .tools(new Tools())
    .build();

String answer = assistant.chat("What is 1+2 and 3*4?");

RAG

EmbeddingStore embeddingStore  = ...
EmbeddingModel embeddingModel = ...

ContentRetriever contentRetriever = new EmbeddingStoreContentRetriever(embeddingStore, embeddingModel);

Assistant assistant = AiServices.builder(Assistant.class)
    .chatLanguageModel(model)
    // wire in RAG retrieval
    .contentRetriever(contentRetriever)
    .build();

Automatic Moderation

public class ServiceWithAutoModerationExample {
    
    interface Chat {
        @Moderate
        String chat(String text);
    }

    public static void main(String[] args) {

        OpenAiModerationModel moderationModel = OpenAiModerationModel.builder()
                .apiKey(ApiKeys.OPENAI_API_KEY)
                .modelName(TEXT_MODERATION_LATEST)
                .build();

        ChatModel chatModel = OpenAiChatModel.builder()
                .apiKey(ApiKeys.OPENAI_API_KEY)
                .modelName(GPT_4_O_MINI)
                .build();

        Chat chat = AiServices.builder(Chat.class)
                .chatModel(chatModel)
                .moderationModel(moderationModel)
                .build();

        try {
            chat.chat("I WILL KILL YOU!!!");
        } catch (ModerationException e) {
            System.out.println(e.getMessage());
            // Text "I WILL KILL YOU!!!" violates content policy
        }
    }
}

Execution Flow

  1. When the user calls chat(), the AI Service proxy first hands the input to the moderation model for review.
  2. If moderation passes, the call proceeds to the ChatModel; if it fails, the proxy receives the violation flag and throws a ModerationException.
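The two steps above can be sketched in plain Java. This is illustrative only, not LangChain4j's proxy implementation: the predicate stands in for the moderation model and the function for the chat model.

```java
import java.util.function.Function;
import java.util.function.Predicate;

// Illustrative sketch of the @Moderate flow, not LangChain4j internals:
// the proxy checks the input first and only forwards it if moderation passes.
class ModerationFlowDemo {

    // Stand-ins: `flagged` plays the moderation model, `chatModel` the LLM.
    static String chat(String text, Predicate<String> flagged, Function<String, String> chatModel) {
        if (flagged.test(text)) {
            // Real code throws ModerationException; RuntimeException stands in here.
            throw new RuntimeException("Text \"" + text + "\" violates content policy");
        }
        return chatModel.apply(text);
    }

    public static void main(String[] args) {
        Predicate<String> flagged = t -> t.toUpperCase().contains("KILL");
        Function<String, String> model = t -> "reply to: " + t;
        System.out.println(chat("Hello", flagged, model)); // passes moderation
    }
}
```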

Explanation

  • The @Moderate annotation marks the method for automatic moderation; the moderation flow runs before the call proceeds (analogous to AOP).
interface Chat {
    @Moderate
    String chat(String text);
}
  • This builds the moderation model:
OpenAiModerationModel moderationModel = OpenAiModerationModel.builder()
    .apiKey(ApiKeys.OPENAI_API_KEY)
    .modelName(TEXT_MODERATION_LATEST)
    .build();
  • This builds the chat model:
ChatModel chatModel = OpenAiChatModel.builder()
    .apiKey(ApiKeys.OPENAI_API_KEY)
    .modelName(GPT_4_O_MINI)
    .build();
  • This assembles the AI Service:
Chat chat = AiServices.builder(Chat.class)
    .chatModel(chatModel)
    .moderationModel(moderationModel)
    .build();

Chaining AI Services

interface GreetingExpert {

    @UserMessage("Is the following text a greeting? Text: {{it}}")
    boolean isGreeting(String text);
}

interface ChatBot {

    @SystemMessage("You are a polite chatbot of a company called Miles of Smiles.")
    String reply(String userMessage);
}

class MilesOfSmiles {

    private final GreetingExpert greetingExpert;
    private final ChatBot chatBot;
    
    ...
    
    public String handle(String userMessage) {
        if (greetingExpert.isGreeting(userMessage)) {
            return "Greetings from Miles of Smiles! How can I make your day better?";
        } else {
            return chatBot.reply(userMessage);
        }
    }
}

GreetingExpert greetingExpert = AiServices.create(GreetingExpert.class, llama2);

ChatBot chatBot = AiServices.builder(ChatBot.class)
    .chatLanguageModel(gpt4)
    .contentRetriever(milesOfSmilesContentRetriever)
    .build();

MilesOfSmiles milesOfSmiles = new MilesOfSmiles(greetingExpert, chatBot);

String greeting = milesOfSmiles.handle("Hello");
System.out.println(greeting); // Greetings from Miles of Smiles! How can I make your day better?

String answer = milesOfSmiles.handle("Which services do you provide?");
System.out.println(answer); // At Miles of Smiles, we provide a wide range of services ...

References

Official documentation

docs.langchain4j.dev/tutorials/a…

YouTube introduction video

www.youtube.com/watch?v=Bx2…

Source code

github.com/langchain4j…