LangChain: prompt, model, parser


This is chapter 2 of my LangChain learning notes.

Prompt, model, and parser correspond to the three steps of an LLM application: input, model processing, and output.
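
As a preview, here is a minimal sketch of the three steps wired together (the .pipe composition is covered in the LCEL section below; the locally served llama3.1 model is an assumption):

import { Ollama } from "@langchain/community/llms/ollama";
import { PromptTemplate } from "@langchain/core/prompts";
import { StringOutputParser } from "@langchain/core/output_parsers";

// 1. prompt: build the model input
const prompt = PromptTemplate.fromTemplate("Tell me a joke about {topic}");
// 2. model: a locally served Ollama model (an assumption; any LLM works here)
const model = new Ollama({ baseUrl: "http://localhost:11434", model: "llama3.1" });
// 3. parser: turn the raw model output into a plain string
const parser = new StringOutputParser();

// Compose the three steps into a chain and run it
const chain = prompt.pipe(model).pipe(parser);
await chain.invoke({ topic: "cats" });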

Prompts

A prompt is the input: it is how we pass information to the model. As the amount of information in a prompt grows, prompt construction needs to become repeatable and engineered rather than ad hoc.

LangChain provides prompt templates to help generate prompts. A template accepts an object as input and fills the object's key-value pairs into a string template. Below are a few commonly used template types.

String PromptTemplates

import { PromptTemplate } from "@langchain/core/prompts";
// Create a template; {topic} is the variable to fill in
const promptTemplate = PromptTemplate.fromTemplate(
  "Tell me a joke about {topic}"
);
// At call time, the input is filled into the prompt dynamically
await promptTemplate.invoke({ topic: "cats" });
// output
StringPromptValue {
  lc_serializable: true,
  lc_kwargs: { value: "Tell me a joke about cats" },
  lc_namespace: [ "langchain_core", "prompt_values" ],
  value: "Tell me a joke about cats"
}
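
If you only need the final string rather than a PromptValue object, format returns it directly:

// format resolves to the filled-in string itself
await promptTemplate.format({ topic: "cats" });
// "Tell me a joke about cats"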

ChatPromptTemplates

Supports messages in array form and can distinguish the source (role) of each message.

import { ChatPromptTemplate } from "@langchain/core/prompts";
// The first entry is the system message
// The second entry is the user's input message
const promptTemplate = ChatPromptTemplate.fromMessages([
  ["system", "You are a helpful assistant"],
  ["user", "Tell me a joke about {topic}"],
]);

await promptTemplate.invoke({ topic: "cats" });

Output:

ChatPromptValue {
  lc_serializable: true,
  lc_kwargs: {
    messages: [
      SystemMessage {
        lc_serializable: true,
        lc_kwargs: {
          content: "You are a helpful assistant",
          additional_kwargs: {},
          response_metadata: {}
        },
        lc_namespace: [ "langchain_core", "messages" ],
        content: "You are a helpful assistant",
        name: undefined,
        additional_kwargs: {},
        response_metadata: {}
      },
      HumanMessage {
        lc_serializable: true,
        lc_kwargs: {
          content: "Tell me a joke about cat",
          additional_kwargs: {},
          response_metadata: {}
        },
        lc_namespace: [ "langchain_core", "messages" ],
        content: "Tell me a joke about cat",
        name: undefined,
        additional_kwargs: {},
        response_metadata: {}
      }
    ]
  },
  lc_namespace: [ "langchain_core", "prompt_values" ],
  messages: [
    SystemMessage {
      lc_serializable: true,
      lc_kwargs: {
        content: "You are a helpful assistant",
        additional_kwargs: {},
        response_metadata: {}
      },
      lc_namespace: [ "langchain_core", "messages" ],
      content: "You are a helpful assistant",
      name: undefined,
      additional_kwargs: {},
      response_metadata: {}
    },
    HumanMessage {
      lc_serializable: true,
      lc_kwargs: {
        content: "Tell me a joke about cat",
        additional_kwargs: {},
        response_metadata: {}
      },
      lc_namespace: [ "langchain_core", "messages" ],
      content: "Tell me a joke about cat",
      name: undefined,
      additional_kwargs: {},
      response_metadata: {}
    }
  ]
}

Partially format prompt templates

Just as a JS function can mark parameters as optional, a prompt template can be given only a subset of the variables it defines. A typical scenario: the template's variables cannot all be obtained at the same time, so you fill in the values you already have and supply the rest once they become available.

There are two ways to assign partial values.

Partial with strings
const prompt = new PromptTemplate({
  template: "Tell me a {something} about {what}",
  inputVariables: ["something", "what"],
});
// Pass the something variable first
const partialPrompt = await prompt.partial({
  something: "joke",
});
// Pass the what variable later
const formattedPrompt = await partialPrompt.format({
  what: "cat",
});
// formattedPrompt
"Tell me a joke about cat"

// Defaults can also be set when declaring the PromptTemplate; they are used whenever the variable isn't passed later
const prompt = new PromptTemplate({
  template: "Tell me a {something} about {what}",
  inputVariables: ["something", "what"],
  partialVariables: {  
    something: "knowledge",  
  },
});
Partial with functions

Use a function's return value for the assignment.

const getCurrentDate = () => {  
    return new Date().toISOString();  
};

const prompt = new PromptTemplate({
  template: "Tell me a {adjective} about the day {date}",
  inputVariables: ["adjective", "date"],
});

const partialPrompt = await prompt.partial({
  date: getCurrentDate,
});

const formattedPrompt = await partialPrompt.format({
  adjective: "history",
});

console.log(formattedPrompt);
// Tell me a history about the day 2024-10-12T08:28:02.977Z

Compose prompts together

Combine multiple prompts into one, even prompts of different types.

PipelinePromptTemplate

import { PromptTemplate, PipelinePromptTemplate } from "@langchain/core/prompts";

// The final prompt, composed from three sub-prompts
const fullPrompt = PromptTemplate.fromTemplate(`{introduction}

{example}

{start}`);

// Declare the three sub-prompts
const introductionPrompt = PromptTemplate.fromTemplate(
  `You are a great salesman about {field}.`
);

const examplePrompt = PromptTemplate.fromTemplate(
  `Here's an example of an interaction:
Q: {question}
A: {answer}`
);

const startPrompt = PromptTemplate.fromTemplate(
  `Now, do this for real!
Q: {input}
A:`
);

// Assemble them with PipelinePromptTemplate
const composedPrompt = new PipelinePromptTemplate({
  pipelinePrompts: [
    {
      name: "introduction",
      prompt: introductionPrompt,
    },
    {
      name: "example",
      prompt: examplePrompt,
    },
    {
      name: "start",
      prompt: startPrompt,
    },
  ],
  finalPrompt: fullPrompt,
});

// Pass all the variables the sub-prompts need
const formattedPrompt = await composedPrompt.format({
  field: "video game",
  question: `Can you introduce me a video game about cat`,
  answer: "stray",
  input: `What's your favorite video game about dog?`,
});

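Formatting is deterministic here, so the composed prompt comes out as:

// formattedPrompt
You are a great salesman about video game.

Here's an example of an interaction:
Q: Can you introduce me a video game about cat
A: stray

Now, do this for real!
Q: What's your favorite video game about dog?
A: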

Output Parser

An output parser converts the model's raw text output into a more structured representation. More and more models now support function calling to handle this step, and LangChain recommends the function/tool-calling feature over output parsers where it is available.

StringOutputParser

Extracts just the content portion of the model's response as a plain string.

import { StringOutputParser } from "@langchain/core/output_parsers";
const parser = new StringOutputParser();

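A minimal usage sketch (assuming the ollama model instance defined in the LCEL section below):

// Pipe the model output through the parser to get a plain string
const chain = ollama.pipe(parser);
await chain.invoke("Give me a random piece of knowledge about earth");
// => the answer text, with no message wrapper around it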

StructuredOutputParser

import { z } from "zod";
import { StructuredOutputParser } from "langchain/output_parsers";

// Define a schema with the zod library
const zodSchema = z.object({
  // answer must be a string; the description is shown to the model
  answer: z.string().describe("answer to the user's question"),
  source: z
    .string()
    .describe(
      "source used to answer the user's question, should be a website."
    ),
});
// z.infer<typeof zodSchema> resolves to:
// {
//   answer: string,
//   source: string
// }
const parser = StructuredOutputParser.fromZodSchema(zodSchema);
// Print what the parser will inject into the prompt
console.log(parser.getFormatInstructions());
// What gets generated is a prompt written by LangChain:

// States the goal
You must format your output as a JSON value that adheres to a given "JSON Schema" instance.

// Explains what a JSON Schema is
"JSON Schema" is a declarative language that allows you to annotate and validate JSON documents.

For example, the example "JSON Schema" instance {{"properties": {{"foo": {{"description": "a list of test words", "type": "array", "items": {{"type": "string"}}}}}}, "required": ["foo"]}}}}
would match an object with one required property, "foo". The "type" property specifies "foo" must be an "array", and the "description" property semantically describes it as "a list of test words". The items within "foo" must be strings.
Thus, the object {{"foo": ["bar", "baz"]}} is a well-formatted instance of this example "JSON Schema". The object {{"properties": {{"foo": ["bar", "baz"]}}}} is not well-formatted.

// Reinforces the goal
Your output will be parsed and type-checked according to the provided schema instance, so make sure all fields in your output match the schema exactly and there are no trailing commas!

// Specifies the JSON structure we defined
Here is the JSON Schema instance your output must adhere to. Include the enclosing markdown codeblock:
```json
{"type":"object","properties":{"answer":{"type":"string","description":"answer to the user's question"},"source":{"type":"string","description":"source used to answer the user's question, should be a website."}},"required":["answer","source"],"additionalProperties":false,"$schema":"http://json-schema.org/draft-07/schema#"}
```
import { RunnableSequence } from "@langchain/core/runnables";

// Compose prompt -> model -> parser into a single chain
// (ollama is the model instance defined in the LCEL section below)
const chain = RunnableSequence.from([
  ChatPromptTemplate.fromTemplate(
    "Answer the users question as best as possible.\n{format_instructions}\n{question}"
  ),
  ollama,
  parser,
]);
const response = await chain.invoke({
  question: "What is the most popular video game?",
  format_instructions: parser.getFormatInstructions(),
});
// response
{
  answer: "According to various online sources such as IGN and GameSpot, the most popular video game can be subjective and vary depending on personal preferences and platform. However, some of the best-selling video games of all time include Minecraft, Grand Theft Auto V, and Tetris.",
  source: "https://www.ign.com/articles/2019/01/14/best-selling-video-games-of-all-time"
}

Fix errors in output parsing

Model output is not always perfect and sometimes fails to satisfy the required format; OutputFixingParser can be used to try to repair such errors.

import { z } from "zod";
import { OutputFixingParser, StructuredOutputParser } from "langchain/output_parsers";

const zodSchema = z.object({
  name: z.string().describe("name of an actor"),
  film_names: z
    .array(z.string())
    .describe("list of names of films they starred in"),
});

const parser = StructuredOutputParser.fromZodSchema(zodSchema);
// Example: a malformed model response (single quotes are not valid JSON)
const misformatted = "{'name': 'Tom Hanks', 'film_names': ['Forrest Gump']}";
// It doesn't satisfy the defined JSON format, so parsing throws
await parser.parse(misformatted); // OutputParserException
// OutputFixingParser sends the malformed output back to the model and asks it to correct the error
const parserWithFix = OutputFixingParser.fromLLM(ollama, parser);

await parserWithFix.parse(misformatted);
// response
{ name: "Tom Hanks", film_names: [ "Forrest Gump" ] }

LangChain Expression Language

LCEL (LangChain Expression Language) is the syntax LangChain designed, and recommends, for composing chains. To make custom chains easier to build, LangChain defines an interface called Runnable, which many components implement, including chat models, LLMs, output parsers, retrievers, and prompt templates. That means all of these components expose the methods defined on that interface.

Runnable invocation methods

invoke

Invokes the chain on a single input.

import { Ollama } from "@langchain/community/llms/ollama";
import { HumanMessage } from "@langchain/core/messages";
import { StringOutputParser } from "@langchain/core/output_parsers";

// A locally served Ollama model
const ollama = new Ollama({
  baseUrl: "http://localhost:11434",
  model: "llama3.1",
});
const outputParser = new StringOutputParser();
// pipe composes two runnables into a chain
const simpleChain = ollama.pipe(outputParser);
await simpleChain.invoke([
  new HumanMessage("Give me a random piece of knowledge about earth"),
]);

The model's reply comes back as a single string.

batch

Invokes the chain on multiple inputs at once.

await simpleChain.batch([
  [new HumanMessage("Give me a random piece of knowledge about earth")],
  [new HumanMessage("Then translate the knowledge to Chinese")],
]);

Unlike invoke, the results come back as an array, one entry per input. Note that these inputs carry no shared conversational context, so the model has no way to actually translate the previous answer.
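
batch also takes an optional third argument with batch options; for example, maxConcurrency caps how many inputs are processed in parallel (a sketch):

await simpleChain.batch(
  [
    [new HumanMessage("Give me a random piece of knowledge about earth")],
    [new HumanMessage("Give me a random piece of knowledge about the moon")],
  ],
  undefined, // per-call options (none here)
  { maxConcurrency: 1 } // process the inputs one at a time
);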

stream

Streams the response data back incrementally.

const stream = await simpleChain.stream([
  new HumanMessage("Give me a random piece of knowledge about earth"),
]);
for await (const chunk of stream) {
  console.log(chunk);
}

The output is returned chunk by chunk as a stream.

fallback

withFallbacks lets a runnable fall back to alternative runnables and retry when the primary one errors.
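
A minimal sketch, assuming a second local model named llama2 is available to fall back to:

// If the primary chain throws, the fallbacks are tried in order
const fallbackModel = new Ollama({
  baseUrl: "http://localhost:11434",
  model: "llama2", // hypothetical fallback model
});
const chainWithFallback = simpleChain.withFallbacks({
  fallbacks: [fallbackModel.pipe(outputParser)],
});
await chainWithFallback.invoke([
  new HumanMessage("Give me a random piece of knowledge about earth"),
]);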