Reading the LangChain Docs | Annotated Walkthrough | Q&A RAG Quickstart


LangChain has a number of components designed to help build question-answering applications, and RAG applications more generally. To familiarize ourselves with these, we’ll build a simple Q&A application over a text data source. Along the way we’ll go over a typical Q&A architecture, discuss the relevant LangChain components, and highlight additional resources for more advanced Q&A techniques. We’ll also see how LangSmith can help us trace and understand our application. LangSmith will become increasingly helpful as our application grows in complexity.


Vocabulary:

  1. components (n.): the parts that make up a whole.
  2. relevant (adj.): closely connected with the matter at hand.
  3. architecture (n.): the structure and design of a system (or a building).
  4. resources (n.): materials, information, or assistance available for use.
  5. trace (v.): to follow or track the origin or course of something.
  6. complexity (n.): the state of having many interrelated parts, making something hard to understand or handle.

Sentence analysis:

  1. LangChain has a number of components designed to help build question-answering applications, and RAG applications more generally.

    • Subject: LangChain; verb: has; object: a number of components. The past-participle phrase “designed to help build question-answering applications” modifies components, telling us what the components are designed for, and “more generally” extends the claim to RAG applications as a whole.
  2. LangSmith will become increasingly helpful as our application grows in complexity.

    • A complex sentence: the main clause is “LangSmith will become increasingly helpful”, and “as our application grows in complexity” is an adverbial clause meaning “as our application becomes more complex”.
  3. To familiarize ourselves with these, we’ll build a simple Q&A application over a text data source.

    • “To familiarize ourselves with these” is an infinitive phrase expressing purpose; the main clause, “we’ll build a simple Q&A application over a text data source”, states what we will do.

Architecture

We’ll create a typical RAG application as outlined in the Q&A introduction, which has two main components:

Indexing: a pipeline for ingesting data from a source and indexing it. This usually happens offline.

Retrieval and generation: the actual RAG chain, which takes the user query at run time and retrieves the relevant data from the index, then passes that to the model.

The full sequence from raw data to answer will look like:
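(The original page shows a diagram here; roughly: Load → Split → Embed → Store for indexing, then Question → Retrieve → Prompt → LLM → Answer for retrieval and generation.)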


Vocabulary:

  1. pipeline: here, a sequence of processing steps used to accomplish a particular task (such as data indexing).
  2. ingesting: taking in or receiving data from a data source.
  3. indexing: building an index over data so it can be retrieved quickly.
  4. retrieval: the process of finding and fetching information from a data set.
  5. generation: here, the process of producing text or answers with a model.
  6. runtime: the time when the program is actually executing.
  7. relevant: related to a particular topic or query.

Sentence analysis:

  1. “Indexing: a pipeline for ingesting data from a source and indexing it.”

    • The implied structure is “Indexing is a pipeline”; the colon stands in for the verb. “for ingesting data from a source and indexing it” is a prepositional phrase modifying pipeline, stating what the pipeline does.
  2. “Retrieval and generation: the actual RAG chain, which takes the user query at run time and retrieves the relevant data from the index, then passes that to the model.”

    • Again the colon stands in for the elided verb: “Retrieval and generation are the actual RAG chain.” The non-restrictive relative clause “which takes the user query at run time...” describes in detail how the RAG chain works.

Summary: this passage introduces the two main components of a RAG application and what each one does.

Indexing

  1. Load: First we need to load our data. We’ll use DocumentLoaders for this.
  2. Split: Text splitters break large Documents into smaller chunks. This is useful both for indexing data and for passing it in to a model, since large chunks are harder to search over and won’t fit in a model’s finite context window.
  3. Store: We need somewhere to store and index our splits, so that they can later be searched over. This is often done using a VectorStore and Embeddings model.

Vocabulary:

  1. DocumentLoaders: classes or tools for loading document data; in this context, used to load the data we want to process.
  2. Text splitters: tools that break long text into smaller parts or chunks for easier downstream processing and analysis.
  3. Chunks: the smaller pieces of text produced by splitting; large text is split into chunks to make it manageable.
  4. VectorStore: a database or storage system for vectors; here, it stores the vector representations of the text splits.
  5. Embeddings model: a model that converts text or words into vector representations, which can be used for NLP tasks such as search, clustering, and classification.

Sentence analysis:

  1. “Text splitters break large Documents into smaller chunks.”

    • Structure: subject “Text splitters” + verb “break” + object “large Documents” + complement “into smaller chunks”; simple present tense describing what text splitters do.
  2. “This is useful both for indexing data and for passing it in to a model...”

    • The correlative “both ... and ...” joins two prepositional phrases, “for indexing data” and “for passing it in to a model”, which together complement “useful”: splitting helps with both indexing and model input.
  3. “We need somewhere to store and index our splits...”

    • Structure: “We” (subject) + “need” (verb) + “somewhere” (object); the infinitive phrase “to store and index our splits” modifies “somewhere”, stating what the place is needed for.

Retrieval and generation

  1. Retrieve: Given a user input, relevant splits are retrieved from storage using a Retriever.
  2. Generate: A ChatModel / LLM produces an answer using a prompt that includes the question and the retrieved data.


Vocabulary:

  1. Retriever: here, a tool or system for fetching relevant information from storage.
  2. ChatModel / LLM: a chat model / large language model; a machine-learning model that generates natural-language text, typically used in dialogue systems or text-generation tasks.

Sentence analysis:

• “Given a user input, relevant splits are retrieved from storage using a Retriever.”

  • Core: “relevant splits are retrieved”, in the passive voice.
  • “from storage” is a prepositional phrase saying where the splits come from.
  • “using a Retriever” is a participial phrase saying how the retrieval is done.
  • “Given a user input” is a past-participle phrase setting the condition under which retrieval happens.
• “A ChatModel / LLM produces an answer using a prompt that includes the question and the retrieved data”

  • Core: “A ChatModel / LLM produces an answer”, simple present tense stating a fact.
  • “using a prompt” is a participial phrase saying how the answer is produced.
  • “that includes the question and the retrieved data” is a relative clause modifying prompt, saying what the prompt contains.

Summary: this passage describes the two steps of retrieval and generation. First, given a user input, a Retriever fetches the relevant splits from storage; then the ChatModel / LLM generates an answer from a prompt containing the question and the retrieved data.

Setup

Dependencies

We’ll use an OpenAI chat model and embeddings and a Chroma vector store in this walkthrough, but everything shown here works with any ChatModel or LLM, Embeddings, and VectorStore or Retriever.

We’ll use the following packages:

    %pip install --upgrade --quiet  langchain langchain-community langchainhub langchain-openai langchain-chroma bs4
    
    import getpass
    import os
    
    os.environ["OPENAI_API_KEY"] = getpass.getpass()
    
    # import dotenv
    
    # dotenv.load_dotenv()
    

LangSmith

Many of the applications you build with LangChain will contain multiple steps with multiple invocations of LLM calls. As these applications get more and more complex, it becomes crucial to be able to inspect what exactly is going on inside your chain or agent. The best way to do this is with LangSmith.

Note that LangSmith is not needed, but it is helpful. If you do want to use LangSmith, after you sign up at the link above, make sure to set your environment variables to start logging traces:

    os.environ["LANGCHAIN_TRACING_V2"] = "true"
    os.environ["LANGCHAIN_API_KEY"] = getpass.getpass()
    

Preview

In this guide we’ll build a QA app over the LLM Powered Autonomous Agents blog post by Lilian Weng, which allows us to ask questions about the contents of the post.

We can create a simple indexing pipeline and RAG chain to do this in ~20 lines of code:

    import bs4
    from langchain import hub
    from langchain_community.document_loaders import WebBaseLoader
    from langchain_chroma import Chroma
    from langchain_core.output_parsers import StrOutputParser
    from langchain_core.runnables import RunnablePassthrough
    from langchain_openai import OpenAIEmbeddings
    from langchain_text_splitters import RecursiveCharacterTextSplitter
    

Everything below uses OpenAI.

Install dependencies

    pip install -qU langchain-openai
    

Set environment variables

    import getpass
    import os
    
    os.environ["OPENAI_API_KEY"] = getpass.getpass()
    
    from langchain_openai import ChatOpenAI
    
    llm = ChatOpenAI(model="gpt-3.5-turbo-0125")
    
    # Load, chunk and index the contents of the blog.
    loader = WebBaseLoader(
        web_paths=("<https://lilianweng.github.io/posts/2023-06-23-agent/>",),
        bs_kwargs=dict(
            parse_only=bs4.SoupStrainer(
                class_=("post-content", "post-title", "post-header")
            )
        ),
    )
    docs = loader.load()
    
    text_splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=200)
    splits = text_splitter.split_documents(docs)
    vectorstore = Chroma.from_documents(documents=splits, embedding=OpenAIEmbeddings())
    
    # Retrieve and generate using the relevant snippets of the blog.
    retriever = vectorstore.as_retriever()
    prompt = hub.pull("rlm/rag-prompt")
    
    def format_docs(docs):
        return "\n\n".join(doc.page_content for doc in docs)
    
    rag_chain = (
        {"context": retriever | format_docs, "question": RunnablePassthrough()}
        | prompt
        | llm
        | StrOutputParser()
    )
    
    rag_chain.invoke("What is Task Decomposition?")
    
    'Task decomposition is a technique used to break down complex tasks into smaller and simpler steps. It can be done through prompting techniques like Chain of Thought or Tree of Thoughts, or by using task-specific instructions or human inputs. Task decomposition helps agents plan ahead and manage complicated tasks more effectively.'
    
    # cleanup
    vectorstore.delete_collection()
    

Check out the LangSmith trace

Detailed walkthrough

Let’s go through the above code step-by-step to really understand what’s going on.

1. Indexing: Load

We need to first load the blog post contents. We can use DocumentLoaders for this, which are objects that load in data from a source and return a list of Documents. A Document is an object with some page_content (str) and metadata (dict).
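For a concrete picture, here is a minimal sketch of constructing a Document by hand (the class is langchain_core’s; the sample values are made up):

    from langchain_core.documents import Document

    # A Document pairs raw text with arbitrary metadata.
    doc = Document(
        page_content="Building agents with LLMs is a cool concept.",
        metadata={"source": "https://example.com/post"},
    )
    print(doc.page_content)  # the text itself
    print(doc.metadata)      # {'source': 'https://example.com/post'}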

In this case we’ll use the WebBaseLoader, which uses urllib to load HTML from web URLs and BeautifulSoup to parse it to text. We can customize the HTML -> text parsing by passing in parameters to the BeautifulSoup parser via bs_kwargs (see BeautifulSoup docs). In this case only HTML tags with class “post-content”, “post-title”, or “post-header” are relevant, so we’ll remove all others.

    import bs4
    from langchain_community.document_loaders import WebBaseLoader
    
    # Only keep post title, headers, and content from the full HTML.
    bs4_strainer = bs4.SoupStrainer(class_=("post-title", "post-header", "post-content"))
    loader = WebBaseLoader(
        web_paths=("<https://lilianweng.github.io/posts/2023-06-23-agent/>",),
        bs_kwargs={"parse_only": bs4_strainer},
    )
    docs = loader.load()
    


    len(docs[0].page_content)
    
    42824
    
    print(docs[0].page_content[:500])
    
    
          LLM Powered Autonomous Agents
    
    Date: June 23, 2023  |  Estimated Reading Time: 31 min  |  Author: Lilian Weng
    
    Building agents with LLM (large language model) as its core controller is a cool concept. Several proof-of-concepts demos, such as AutoGPT, GPT-Engineer and BabyAGI, serve as inspiring examples. The potentiality of LLM extends beyond generating well-written copies, stories, essays and programs; it can be framed as a powerful general problem solver.
    Agent System Overview#
    In
    

Go deeper

DocumentLoader: Object that loads data from a source as list of Documents.

• Docs: Detailed documentation on how to use DocumentLoaders.
• Integrations: 160+ integrations to choose from.
• Interface: API reference for the base interface.

2. Indexing: Split

Our loaded document is over 42k characters long. This is too long to fit in the context window of many models. Even for those models that could fit the full post in their context window, models can struggle to find information in very long inputs.

To handle this we’ll split the Document into chunks for embedding and vector storage. This should help us retrieve only the most relevant bits of the blog post at run time.

In this case we’ll split our documents into chunks of 1000 characters with 200 characters of overlap between chunks. The overlap helps mitigate the possibility of separating a statement from important context related to it. We use the RecursiveCharacterTextSplitter, which will recursively split the document using common separators like new lines until each chunk is the appropriate size. This is the recommended text splitter for generic text use cases.

Vocabulary:

  1. mitigate: to lessen or soften; here, the overlapping characters reduce the chance that splitting severs a statement from its surrounding context.
  2. RecursiveCharacterTextSplitter: the name of a specific text splitter; “Recursive” means it operates recursively, and “CharacterTextSplitter” means it splits text by characters.
  3. recursively: repeatedly, with each operation’s result feeding into the next, until some condition is met.
  4. separators: characters marking the boundaries between parts of a text, such as commas, periods, or newlines.
  5. appropriate: suitable; here, each resulting chunk should be neither too large nor too small.
  6. use cases: the situations in which a product or tool is applied; here, the scenarios this text splitter is suited for.

Sentence analysis:

  1. “We use the RecursiveCharacterTextSplitter, which will recursively split the document using common separators like new lines until each chunk is the appropriate size.”

    • Main clause: “We use the RecursiveCharacterTextSplitter”. The non-restrictive relative clause “which will recursively split...” explains how the splitter works: it recursively applies common separators (such as newlines) until every chunk is an appropriate size.
  2. “The overlap helps mitigate the possibility of separating a statement from important context related to it.”

    • Subject “The overlap”, verb “helps mitigate”, object “the possibility of separating a statement from important context related to it”. The prepositional phrase “of separating...” modifies “possibility”, saying what the possibility concerns.
  3. “In this case we’ll split our documents into chunks of 1000 characters with 200 characters of overlap between chunks.”

    • A simple declarative sentence: “In this case” is an adverbial, “we” the subject, “’ll split” the future-tense verb, “our documents” the object, and “into chunks of 1000 characters with 200 characters of overlap between chunks” the complement stating how the split is done.
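To see the splitter’s behavior in isolation, here is a small sketch using split_text on a plain string (chunk sizes are shrunk so the overlap is visible; the exact chunk boundaries are illustrative):

    from langchain_text_splitters import RecursiveCharacterTextSplitter

    # Tiny chunks just for demonstration; the real settings above are 1000/200.
    demo_splitter = RecursiveCharacterTextSplitter(chunk_size=40, chunk_overlap=10)
    chunks = demo_splitter.split_text(
        "RAG has two phases. Indexing happens offline. Retrieval happens at run time."
    )
    for c in chunks:
        print(repr(c))  # consecutive chunks share up to 10 characters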

We set add_start_index=True so that the character index at which each split Document starts within the initial Document is preserved as metadata attribute “start_index”.

    from langchain_text_splitters import RecursiveCharacterTextSplitter
    
    text_splitter = RecursiveCharacterTextSplitter(
        chunk_size=1000, chunk_overlap=200, add_start_index=True
    )
    all_splits = text_splitter.split_documents(docs)
    


    len(all_splits)
    
    66
    
    len(all_splits[0].page_content)
    
    969
    
    all_splits[10].metadata
    
    {'source': 'https://lilianweng.github.io/posts/2023-06-23-agent/',
     'start_index': 7056}
    

Go deeper

TextSplitter: Object that splits a list of Documents into smaller chunks. Subclass of DocumentTransformers.

DocumentTransformer: Object that performs a transformation on a list of Documents.

• Docs: Detailed documentation on how to use DocumentTransformers
• Integrations
• Interface: API reference for the base interface.

3. Indexing: Store

Now we need to index our 66 text chunks so that we can search over them at runtime. The most common way to do this is to embed the contents of each document split and insert these embeddings into a vector database (or vector store). When we want to search over our splits, we take a text search query, embed it, and perform some sort of “similarity” search to identify the stored splits with the most similar embeddings to our query embedding. The simplest similarity measure is cosine similarity: we measure the cosine of the angle between each pair of embeddings (which are high dimensional vectors).

Vocabulary:

  1. embedding: mapping data into a vector space; in NLP, converting words, sentences, or documents into vector representations.
  2. vector database: a database for storing and retrieving vector data, commonly used in similarity search and recommendation systems.
  3. cosine similarity: a similarity measure computed as the cosine of the angle between two vectors; values closer to 1 mean more similar.
  4. high dimensional vectors: vectors in a space with many dimensions, used in machine learning to represent complex data.
  5. query embedding: the vector representation of a text query, used for searching and comparing in vector space.
  6. runtime: the time when the program is actually executing.

Sentence analysis:

  1. “Now we need to index our 66 text chunks so that we can search over them at runtime.”

    • The purpose clause introduced by “so that” explains why we index the 66 chunks: so they can be searched at runtime.
  2. “The simplest similarity measure is cosine similarity: we measure the cosine of the angle between each pair of embeddings (which are high dimensional vectors).”

    • The clause after the colon elaborates on how cosine similarity is measured, and the parenthetical relative clause “(which are high dimensional vectors)” notes that the embeddings are high-dimensional vectors.
  3. “When we want to search over our splits, we take a text search query, embed it, and perform some sort of ‘similarity’ search to identify the stored splits with the most similar embeddings to our query embedding.”

    • The temporal clause “When we want to search over our splits” sets the scene; “and” then coordinates three verb phrases, “take a text search query”, “embed it”, and “perform ... search”, laying out the steps of the search in order.
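Before wiring this into LangChain, here is a standalone sketch of the cosine-similarity idea itself (plain NumPy, with toy 3-dimensional “embeddings” instead of real ones):

    import numpy as np

    def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
        # cos(theta) = (a . b) / (|a| * |b|); 1.0 means identical direction.
        return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

    query = np.array([0.9, 0.1, 0.0])
    doc_a = np.array([0.8, 0.2, 0.1])  # points in a similar direction
    doc_b = np.array([0.0, 0.1, 0.9])  # points elsewhere

    print(cosine_similarity(query, doc_a))  # close to 1.0
    print(cosine_similarity(query, doc_b))  # much smaller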

We can embed and store all of our document splits in a single command using the Chroma vector store and OpenAIEmbeddings model.

    from langchain_chroma import Chroma
    from langchain_openai import OpenAIEmbeddings
    
    vectorstore = Chroma.from_documents(documents=all_splits, embedding=OpenAIEmbeddings())
    


Go deeper

Embeddings: Wrapper around a text embedding model, used for converting text to embeddings (a short sketch follows the list below).

• Docs: Detailed documentation on how to use embeddings.
• Integrations: 30+ integrations to choose from.
• Interface: API reference for the base interface.
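To make the wrapper concrete: the base Embeddings interface exposes embed_query (one text) and embed_documents (many texts). A minimal sketch, assuming OPENAI_API_KEY is set; the printed dimensionality depends on the embedding model:

    from langchain_openai import OpenAIEmbeddings

    embeddings = OpenAIEmbeddings()
    vec = embeddings.embed_query("What is task decomposition?")
    print(len(vec))  # embedding dimensionality (model-dependent)
    print(vec[:3])   # first few floats of the vector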

VectorStore: Wrapper around a vector database, used for storing and querying embeddings.

• Docs: Detailed documentation on how to use vector stores.
• Integrations: 40+ integrations to choose from.
• Interface: API reference for the base interface.

This completes the Indexing portion of the pipeline. At this point we have a query-able vector store containing the chunked contents of our blog post. Given a user question, we should ideally be able to return the snippets of the blog post that answer the question.
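As a quick sanity check, we can query the store directly even before building the chain; a minimal sketch using the standard similarity_search method on the vectorstore built above:

    # Fetch the two chunks most similar to the question.
    results = vectorstore.similarity_search("What is task decomposition?", k=2)
    for doc in results:
        print(doc.metadata["start_index"], doc.page_content[:80])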

Vocabulary:

  1. Indexing: the data structures or methods built in an information-retrieval system to locate and retrieve data quickly.
  2. pipeline: here, the collection of processing steps used to accomplish a particular task.
  3. vector store: a database or storage system for vector data such as word or document vectors.
  4. chunked: split into smaller pieces that are easier to process and store.
  5. snippets: short passages extracted from a larger text.
  6. ideally: in the ideal case; the state or result one would hope to reach.

Sentence analysis:

  1. “This completes the Indexing portion of the pipeline.”

    • Grammar: a simple sentence; subject “This”, verb “completes”, object “the Indexing portion of the pipeline”.
    • Meaning: the indexing part of the pipeline is now finished.
  2. “Given a user question, we should ideally be able to return the snippets of the blog post that answer the question.”

    • Grammar: a complex sentence; the main clause is “we should ideally be able to return the snippets of the blog post”, and “that answer the question” is a relative clause modifying “snippets”.
    • Meaning: given a user question, we should ideally be able to return exactly those blog-post snippets that answer it.

4. Retrieval and Generation: Retrieve

Now let’s write the actual application logic. We want to create a simple application that takes a user question, searches for documents relevant to that question, passes the retrieved documents and initial question to a model, and returns an answer.

First we need to define our logic for searching over documents. LangChain defines a Retriever interface which wraps an index that can return relevant Documents given a string query.

The most common type of Retriever is the VectorStoreRetriever, which uses the similarity search capabilities of a vector store to facilitate retrieval. Any VectorStore can easily be turned into a Retriever with VectorStore.as_retriever():

      retriever = vectorstore.as_retriever(search_type="similarity", search_kwargs={"k": 6})
      
      retrieved_docs = retriever.invoke("What are the approaches to Task Decomposition?")
      
      len(retrieved_docs)
      
      6
      
      print(retrieved_docs[0].page_content)
      
      Tree of Thoughts (Yao et al. 2023) extends CoT by exploring multiple reasoning possibilities at each step. It first decomposes the problem into multiple thought steps and generates multiple thoughts per step, creating a tree structure. The search process can be BFS (breadth-first search) or DFS (depth-first search) with each state evaluated by a classifier (via a prompt) or majority vote.
      Task decomposition can be done (1) by LLM with simple prompting like "Steps for XYZ.\n1.", "What are the subgoals for achieving XYZ?", (2) by using task-specific instructions; e.g. "Write a story outline." for writing a novel, or (3) with human inputs.
      

Go deeper

Vector stores are commonly used for retrieval, but there are other ways to do retrieval, too (see the sketch after the list below).

Retriever: An object that returns Documents given a text query

• Docs: Further documentation on the interface and built-in retrieval techniques.
• Integrations: Integrations with retrieval services.
• Interface: API reference for the base interface.
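One such alternative, as an illustration: the MultiQueryRetriever shipped in the langchain package uses an LLM to rephrase the question several ways and merges the unique results. A minimal sketch, assuming the llm and vectorstore defined earlier and a recent langchain version in which retrievers are Runnables:

    from langchain.retrievers.multi_query import MultiQueryRetriever

    # Generate several phrasings of the query with the LLM, retrieve for each,
    # and return the union of the unique documents found.
    multi_retriever = MultiQueryRetriever.from_llm(
        retriever=vectorstore.as_retriever(), llm=llm
    )
    docs = multi_retriever.invoke("What are the approaches to Task Decomposition?")
    print(len(docs))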

5. Retrieval and Generation: Generate

Let’s put it all together into a chain that takes a question, retrieves relevant documents, constructs a prompt, passes that to a model, and parses the output.

Vocabulary:

  1. Retrieves: looks up and fetches the relevant data or information.
  2. Constructs: builds; here, assembling an input prompt from the retrieved information.
  3. Passes: sends the constructed prompt on to another system or model.
  4. Parses: analyzes and processes the output data.
  5. Prompt: here, the input given to an AI model in order to get a response.
  6. Output: the result the model produces from the input prompt.

Sentence analysis:

  1. “Let’s put it all together into a chain”

    • An imperative expressing a suggestion; “put it all together into a chain” means linking the whole workflow into a single pipeline.
  2. “that takes a question, retrieves relevant documents, constructs a prompt, passes that to a model, and parses the output.”

    • A relative clause describing what the chain does, listing its operations in order: take a question, retrieve the relevant documents, construct a prompt from them, pass it to a model, and parse the model’s output.

We’ll use the gpt-3.5-turbo OpenAI chat model, but any LangChain LLM or ChatModel could be substituted in.

• OpenAI

Install dependencies

      pip install -qU langchain-openai
      

Set environment variables

      import getpass
      import os
      
      os.environ["OPENAI_API_KEY"] = getpass.getpass()
      
      from langchain_openai import ChatOpenAI
      
      llm = ChatOpenAI(model="gpt-3.5-turbo-0125")
      

We’ll use a prompt for RAG that is checked into the LangChain prompt hub.

      from langchain import hub
      
      prompt = hub.pull("rlm/rag-prompt")
      
      example_messages = prompt.invoke(
          {"context": "filler context", "question": "filler question"}
      ).to_messages()
      example_messages
      
      [HumanMessage(content="You are an assistant for question-answering tasks. Use the following pieces of retrieved context to answer the question. If you don't know the answer, just say that you don't know. Use three sentences maximum and keep the answer concise.\nQuestion: filler question \nContext: filler context \nAnswer:")]
      
      print(example_messages[0].content)
      
      You are an assistant for question-answering tasks. Use the following pieces of retrieved context to answer the question. If you don't know the answer, just say that you don't know. Use three sentences maximum and keep the answer concise.
      Question: filler question
      Context: filler context
      Answer:
      

We’ll use the LCEL Runnable protocol to define the chain, allowing us to:

• pipe together components and functions in a transparent way
• automatically trace our chain in LangSmith
• get streaming, async, and batched calling out of the box


Vocabulary:

  1. Protocol: an agreed set of rules and formats that lets different systems or components interact correctly.
  2. Pipe: here, to connect components and functions into a processing flow.
  3. Transparent: here, meaning that the way components and functions are wired together is plainly visible.
  4. Trace: to monitor a program’s state or behavior as it runs, for debugging or analysis.
  5. Streaming: processing data as soon as it arrives rather than waiting for all of it.
  6. Async: asynchronous; operations proceed without blocking the main thread, improving responsiveness and efficiency.
  7. Batched calling: grouping multiple requests or operations into one call for efficiency.

Sentence analysis:

  1. “We’ll use the LCEL Runnable protocol to define the chain”

    • Future tense: we will use the LCEL Runnable protocol, which defines runnable tasks, to build the chain.
  2. “allowing us to pipe together components and functions in a transparent way”

    • A present-participle phrase of result: one benefit of the protocol is that components and functions can be wired together transparently; “pipe together” means to connect or compose.
  3. “automatically trace our chain in LangSmith”

    • A second benefit: the chain we define is traced automatically in LangSmith.
  4. “get streaming, async, and batched calling out of the box”

    • “out of the box” is an idiom meaning usable immediately with no extra configuration: streaming, async, and batched calling all work by default.

      from langchain_core.output_parsers import StrOutputParser
      from langchain_core.runnables import RunnablePassthrough
      
      def format_docs(docs):
          return "\n\n".join(doc.page_content for doc in docs)
      
      rag_chain = (
          # The dict is coerced to a RunnableParallel: "context" pipes the
          # retriever's documents through format_docs, while "question"
          # passes the user's input through unchanged.
          {"context": retriever | format_docs, "question": RunnablePassthrough()}
          | prompt              # fill the RAG prompt template
          | llm                 # call the chat model
          | StrOutputParser()   # extract the message's string content
      )
      


      for chunk in rag_chain.stream("What is Task Decomposition?"):
          print(chunk, end="", flush=True)
      
      Task decomposition is a technique used to break down complex tasks into smaller and simpler steps. It involves transforming big tasks into multiple manageable tasks, allowing for easier interpretation and execution by autonomous agents or models. Task decomposition can be done through various methods, such as using prompting techniques, task-specific instructions, or human inputs.
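Because the chain is an LCEL Runnable, batched and async calls need no extra code; a brief sketch using the standard Runnable methods (outputs omitted):

    # Batched: run several questions, potentially in parallel.
    answers = rag_chain.batch(
        ["What is Task Decomposition?", "What is Chain of Thought?"]
    )

    # Async variants of invoke/stream (usable inside an async function):
    # answer = await rag_chain.ainvoke("What is Task Decomposition?")
    # async for chunk in rag_chain.astream("What is Task Decomposition?"):
    #     print(chunk, end="", flush=True)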
      

Check out the LangSmith trace

Go deeper

Choosing a model

ChatModel: An LLM-backed chat model. Takes in a sequence of messages and returns a message.

LLM: A text-in-text-out LLM. Takes in a string and returns a string.
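The difference in a small sketch (both classes live in langchain-openai; the completion model name is illustrative):

    from langchain_core.messages import HumanMessage
    from langchain_openai import ChatOpenAI, OpenAI

    chat_model = ChatOpenAI(model="gpt-3.5-turbo-0125")
    reply = chat_model.invoke([HumanMessage(content="Say hi")])
    print(reply.content)  # ChatModel: messages in, message out

    completion_llm = OpenAI(model="gpt-3.5-turbo-instruct")
    text = completion_llm.invoke("Say hi")
    print(text)  # LLM: string in, string out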

See a guide on RAG with locally-running models here.

Customizing the prompt

As shown above, we can load prompts (e.g., this RAG prompt) from the prompt hub. The prompt can also be easily customized:

      from langchain_core.prompts import PromptTemplate
      
      template = """Use the following pieces of context to answer the question at the end.
      If you don't know the answer, just say that you don't know, don't try to make up an answer.
      Use three sentences maximum and keep the answer as concise as possible.
      Always say "thanks for asking!" at the end of the answer.
      
      {context}
      
      Question: {question}
      
      Helpful Answer:"""
      custom_rag_prompt = PromptTemplate.from_template(template)
      
      rag_chain = (
          {"context": retriever | format_docs, "question": RunnablePassthrough()}
          | custom_rag_prompt
          | llm
          | StrOutputParser()
      )
      
      rag_chain.invoke("What is Task Decomposition?")
      


      'Task decomposition is a technique used to break down complex tasks into smaller and simpler steps. It involves transforming big tasks into multiple manageable tasks, allowing for a more systematic and organized approach to problem-solving. Thanks for asking!'
      

Check out the LangSmith trace