12. LangChain from Getting Started to Giving Up (6): Retrieval
A retriever is an interface that returns documents given an unstructured query. It is more general than a vector store: a retriever does not need to be able to store documents, only to return (or retrieve) them. A vector store can serve as the backbone of a retriever, but there are other kinds of retrievers as well.
A retriever accepts a string query as input and returns a list of documents as output.
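As a minimal sketch of that interface, here is a vector store turned into a retriever, assuming an OpenAI API key is set and the `langchain-openai` and `faiss-cpu` packages are installed (the indexed sentences are made up):

```python
from langchain_community.vectorstores import FAISS
from langchain_openai import OpenAIEmbeddings

# Build a tiny in-memory vector store over two example sentences.
vectorstore = FAISS.from_texts(
    ["harrison worked at kensho", "bears like to eat honey"],
    embedding=OpenAIEmbeddings(),
)

# Any vector store can be exposed through the retriever interface.
retriever = vectorstore.as_retriever(search_kwargs={"k": 1})

# String query in, list of Documents out.
docs = retriever.invoke("where did harrison work?")
print(docs[0].page_content)
```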
An LLM can answer questions posed directly in a prompt, but as soon as a question involves private data, real-time data, or anything else outside its training set, the model is prone to hallucinating an answer.
A retrieval-augmented LLM (Retrieval Augmented LLM), simply put, is an LLM backed by an external data store. For each user question (query), information-retrieval (IR) techniques first fetch the information relevant to the question from the external store, and the LLM then generates its answer conditioned on that retrieved context.
(Figure: a simple schematic of a retrieval-augmented LLM.)
The LangChain RAG flow follows this pattern: load documents, split them into chunks, embed and store the chunks in a vector store, then at query time retrieve the relevant chunks and pass them to the LLM together with the question.
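A minimal sketch of that flow using LCEL, again assuming an OpenAI API key and FAISS; the indexed sentence is a placeholder:

```python
from langchain_community.vectorstores import FAISS
from langchain_core.output_parsers import StrOutputParser
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.runnables import RunnablePassthrough
from langchain_openai import ChatOpenAI, OpenAIEmbeddings

# Index step: embed the documents into a vector store.
vectorstore = FAISS.from_texts(
    ["LangChain is a framework for building LLM applications."],
    embedding=OpenAIEmbeddings(),
)
retriever = vectorstore.as_retriever()

# Prompt that grounds the model in the retrieved context.
prompt = ChatPromptTemplate.from_template(
    "Answer the question based only on the following context:\n"
    "{context}\n\nQuestion: {question}"
)

# Retrieve -> stuff context into the prompt -> generate -> parse.
chain = (
    {"context": retriever, "question": RunnablePassthrough()}
    | prompt
    | ChatOpenAI(temperature=0)
    | StrOutputParser()
)

print(chain.invoke("What is LangChain?"))
```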
Advanced retrieval types
Table columns:
- Name: Name of the retrieval algorithm.
- Index Type: Which index type (if any) this relies on.
- Uses an LLM: Whether this retrieval method uses an LLM.
- When to Use: Our commentary on when you should consider using this retrieval method.
- Description: Description of what this retrieval algorithm is doing.
| Name | Index Type | Uses an LLM | When to Use | Description |
|---|---|---|---|---|
| Vectorstore | Vectorstore | No | If you are just getting started and looking for something quick and easy. | This is the simplest method and the one that is easiest to get started with. It creates embeddings for each piece of text. |
| ParentDocument | Vectorstore + Document Store | No | If your pages have lots of smaller pieces of distinct information that are best indexed by themselves, but best retrieved all together. | This indexes multiple chunks for each document. Then you find the chunks that are most similar in embedding space, but you retrieve the whole parent document and return that (rather than individual chunks). See the sketch after this table. |
| Multi Vector | Vectorstore + Document Store | Sometimes during indexing | If you are able to extract information from documents that you think is more relevant to index than the text itself. | This creates multiple vectors for each document. Each vector could be created in a myriad of ways - examples include summaries of the text and hypothetical questions. |
| Self Query | Vectorstore | Yes | If users are asking questions that are better answered by fetching documents based on metadata rather than similarity with the text. | This uses an LLM to transform user input into two things: (1) a string to look up semantically, (2) a metadata filter to go along with it. This is useful because oftentimes questions are about the METADATA of documents (not the content itself). |
| Contextual Compression | Any | Sometimes | If you are finding that your retrieved documents contain too much irrelevant information and are distracting the LLM. | This puts a post-processing step on top of another retriever and extracts only the most relevant information from retrieved documents. This can be done with embeddings or an LLM. See the sketch after this table. |
| Time-Weighted Vectorstore | Vectorstore | No | If you have timestamps associated with your documents, and you want to retrieve the most recent ones. | This fetches documents based on a combination of semantic similarity (as in normal vector retrieval) and recency (looking at timestamps of indexed documents). |
| Multi-Query Retriever | Any | Yes | If users are asking questions that are complex and require multiple pieces of distinct information to respond. | This uses an LLM to generate multiple queries from the original one. This is useful when the original query needs pieces of information about multiple topics to be properly answered. By generating multiple queries, we can then fetch documents for each of them. See the sketch after this table. |
| Ensemble | Any | No | If you have multiple retrieval methods and want to try combining them. | This fetches documents from multiple retrievers and then combines them. See the sketch after this table. |
| Long-Context Reorder | Any | No | If you are working with a long-context model and noticing that it's not paying attention to information in the middle of retrieved documents. | This fetches documents from an underlying retriever, and then reorders them so that the most similar are near the beginning and end. This is useful because it's been shown that for longer context models they sometimes don't pay attention to information in the middle of the context window. |
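For the ParentDocument row, a sketch: small child chunks are embedded for precise search, while whole parent documents are stored separately and returned. The `docs` variable is assumed to be a list of `Document` objects you have already loaded.

```python
from langchain.retrievers import ParentDocumentRetriever
from langchain.storage import InMemoryStore
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain_community.vectorstores import Chroma
from langchain_openai import OpenAIEmbeddings

# Child chunks go into the vector store; parents go into the docstore.
vectorstore = Chroma(
    collection_name="full_documents",
    embedding_function=OpenAIEmbeddings(),
)
docstore = InMemoryStore()

retriever = ParentDocumentRetriever(
    vectorstore=vectorstore,
    docstore=docstore,
    child_splitter=RecursiveCharacterTextSplitter(chunk_size=400),
)

# `docs` is assumed: a list of already-loaded Document objects.
retriever.add_documents(docs)

# Search happens over the small chunks, but whole parent documents
# are what gets returned.
retrieved = retriever.invoke("your question here")
```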
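For the Contextual Compression row, a sketch that wraps a base retriever with an LLM-based extractor (reusing the `vectorstore` from the sketches above):

```python
from langchain.retrievers import ContextualCompressionRetriever
from langchain.retrievers.document_compressors import LLMChainExtractor
from langchain_openai import ChatOpenAI

# The compressor asks an LLM to keep only the parts of each retrieved
# document that are relevant to the query.
compressor = LLMChainExtractor.from_llm(ChatOpenAI(temperature=0))

compression_retriever = ContextualCompressionRetriever(
    base_compressor=compressor,
    base_retriever=vectorstore.as_retriever(),
)

docs = compression_retriever.invoke("your question here")
```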
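For the Multi-Query Retriever row, a sketch: the LLM rewrites the user question into several variants, documents are fetched for each variant, and the union is returned.

```python
from langchain.retrievers.multi_query import MultiQueryRetriever
from langchain_openai import ChatOpenAI

# The LLM generates several rephrasings of the question; results for
# all of them are fetched and deduplicated.
retriever = MultiQueryRetriever.from_llm(
    retriever=vectorstore.as_retriever(),
    llm=ChatOpenAI(temperature=0),
)

docs = retriever.invoke("your question here")
```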
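For the Ensemble row, a sketch combining a keyword (BM25) retriever with a vector retriever; `BM25Retriever` additionally requires the `rank_bm25` package.

```python
from langchain.retrievers import EnsembleRetriever
from langchain_community.retrievers import BM25Retriever

# A sparse keyword retriever over the same toy texts used earlier.
texts = ["harrison worked at kensho", "bears like to eat honey"]
bm25_retriever = BM25Retriever.from_texts(texts)

# Results from both retrievers are merged via Reciprocal Rank Fusion.
ensemble_retriever = EnsembleRetriever(
    retrievers=[bm25_retriever, vectorstore.as_retriever()],
    weights=[0.5, 0.5],
)

docs = ensemble_retriever.invoke("your question here")
```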