15. LangChain from Getting Started to Giving Up (7): Chains - load_summarize_chain


  load_summarize_chain creates a summarization chain. Running the chain on the input documents summarizes them and returns the resulting summary.


  Contents of the original error.csv file:

故障&问题现象,当前状态,发生时间,客户是否感知,故障&问题影响范围(从影响时长,严重性,故障级别等方面描述),故障&问题所属模块,模块版本,故障&问题类型,排查原因,解决措施
高速缓存偶发出现down,已解决,2022/1/15,否,客户无感知/严重性一般,缓存,"dns-4.x",产品问题,接口域名大小写频繁操作引起,给出最新版本解决
mysql挂了,已解决,2022/1/15,否,客户无感知/严重性一般,ump,"mysql 5.7.x",产品问题,mysql异常
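
The second data row above has only nine fields against ten header columns, so its last column (解决措施) has no value. The sketch below shows how Python's standard `csv.DictReader` treats such a short row. It assumes CSVLoader reads rows in a comparable way, and `render` is a hypothetical helper that only imitates a "key: value" layout; the sample headers are made up for brevity.

```python
import csv
import io

# Small sample shaped like error.csv: three headers,
# but the second data row supplies only two values.
sample = (
    "symptom,cause,fix\n"
    "cache down,dns case bug,upgrade\n"
    "mysql down,mysql exception\n"
)

rows = list(csv.DictReader(io.StringIO(sample)))


def render(row):
    # Hypothetical helper: one "key: value" line per column.
    return "\n".join(f"{key}: {value}" for key, value in row.items())


# The missing trailing field is filled with None by DictReader.
print(render(rows[1]))
```

If the loader behaves this way, a row with a missing trailing value would show that column as `None` when it is rendered into a prompt.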
from langchain.chains.summarize import load_summarize_chain
from langchain_community.document_loaders import CSVLoader
from langchain_community.llms.ollama import Ollama
from langchain_openai import ChatOpenAI
from langchain_text_splitters import RecursiveCharacterTextSplitter

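'''
1. Load the local CSV file with CSVLoader.
2. Call loader.load() to get the Document objects.
3. Split them into chunks by text length.
4. Run the model's summarization chain on the chunks.
5. Print the summary.
'''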
loader = CSVLoader(file_path="../source/error.csv")
documents = loader.load()
print(f'documents:{len(documents)}')
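'''
chunk_overlap is how much of the end of the previous document each split document repeats.
Its main purpose is to preserve context across documents. For example, with chunk_overlap=0 the first document is aaaaaa and the second is bbbbbb;
with chunk_overlap=2 the first document is aaaaaa and the second is aabbbbbb.
'''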

textSplitters = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=0)

split_document = textSplitters.split_documents(documents)

print(f'split_document:{len(split_document)}')

# model = ChatOpenAI(temperature=0, api_key="<your-api-key>",
#                    base_url="https://api.chatgpt-3.vip/v1")

model = Ollama(base_url="http://localhost:11434", model="llama3")
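'''
load_summarize_chain builds a summarization chain.
chain_type: the type of document-combining chain to use. It must be one of "stuff", "map_reduce", or "refine".
stuff: the simplest, most brute-force option. It passes all documents to the LLM in a single call, so it is very likely to exceed the token limit on long inputs.
map_reduce: summarizes each document separately, then summarizes those summaries once more.
refine: summarizes the first document, then sends that summary together with the second document to the LLM for another summary, and so on, which makes the final summary more coherent.
verbose=True prints the intermediate details.
'''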
chain = load_summarize_chain(model, chain_type="refine", verbose=True)
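# Summarize only the first 2 split documents.
# (chain.run is deprecated in recent LangChain releases in favor of chain.invoke.)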
print(chain.run(split_document[:2]))
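
The chunk_overlap comment in the script describes the second document as repeating the tail of the first. A minimal sketch of that picture follows. `add_overlap` is a hypothetical helper written for this illustration; it is not how RecursiveCharacterTextSplitter is implemented and it ignores chunk_size entirely.

```python
def add_overlap(pieces, chunk_overlap):
    """Prefix each piece with the last `chunk_overlap` characters of the previous one."""
    result = []
    for index, piece in enumerate(pieces):
        if index > 0 and chunk_overlap > 0:
            # Carry over the tail of the previous piece as shared context.
            prefix = pieces[index - 1][-chunk_overlap:]
        else:
            prefix = ""
        result.append(prefix + piece)
    return result


pieces = ["aaaaaa", "bbbbbb"]
print(add_overlap(pieces, 0))  # no shared context between pieces
print(add_overlap(pieces, 2))  # second piece starts with the tail of the first
```

With an overlap of 2, the second piece becomes `aabbbbbb`, which matches the example in the comment.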


  The output is:

> Entering new RefineDocumentsChain chain...

> Entering new LLMChain chain...
Prompt after formatting:
Write a concise summary of the following:


"故障&问题现象: 高速缓存偶发出现down
当前状态: 已解决
发生时间: 2022/1/15
客户是否感知: 否
故障&问题影响范围(从影响时长,严重性,故障级别等方面描述): 客户无感知/严重性一般
故障&问题所属模块: 缓存
模块版本: dns-4.x
故障&问题类型: 产品问题
排查原因: 接口域名大小写频繁操作引起
解决措施: 给出最新版本解决"


CONCISE SUMMARY:

> Finished chain.


> Entering new LLMChain chain...
Prompt after formatting:
Your job is to produce a final summary.
We have provided an existing summary up to a certain point: Here is a concise summary:

"High-speed cache occasionally went down on January 15, 2022. The issue was resolved and the customer did not experience any impact. The problem occurred in the DNS-4.x module's cache and was caused by frequent operations on domain names with varying capitalization. A new version was released to resolve the issue."
We have the opportunity to refine the existing summary (only if needed) with some more context below.
------------
"故障&问题现象: mysql挂了
当前状态: 已解决
发生时间: 2022/1/15
客户是否感知: 否
故障&问题影响范围(从影响时长,严重性,故障级别等方面描述): 客户无感知/严重性一般
故障&问题所属模块: ump
模块版本: mysql 5.7.x
故障&问题类型: 产品问题
排查原因: mysql异常
解决措施: None"
------------
Given the new context, refine the original summary.
If the context isn't useful, return the original summary.

> Finished chain.

> Finished chain.
Based on the additional context provided, I would refine the original summary as follows:

"High-speed cache occasionally went down on January 15, 2022. The issue was resolved and the customer did not experience any impact. Additionally, a MySQL outage occurred on the same day, which was unrelated to the high-speed cache issue. The MySQL outage was caused by an exception and had no noticeable effect on customers due to its general severity level."


I refined the original summary by adding information about the MySQL outage that occurred on the same day as the high-speed cache issue. I also clarified that the two issues were unrelated, and that the customer did not experience any impact from the MySQL outage.


Process finished with exit code 0
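
The trace above shows two model calls for two documents, which is what the refine strategy predicts. To compare the three chain types without a real model, here is a rough sketch in plain Python. The function names, prompt wording, and `make_fake_llm` are all assumptions made for illustration; they are not LangChain's implementation or its prompts.

```python
def make_fake_llm():
    """Return a stand-in model plus the list of prompts it has received."""
    calls = []

    def llm(prompt):
        calls.append(prompt)
        return f"summary#{len(calls)}"

    return llm, calls


def stuff(docs, llm):
    # One call containing every document at once.
    return llm("Summarize:\n" + "\n\n".join(docs))


def map_reduce(docs, llm):
    # One call per document, then one more call over the partial summaries.
    partials = [llm("Summarize:\n" + doc) for doc in docs]
    return llm("Summarize:\n" + "\n\n".join(partials))


def refine(docs, llm):
    # Summarize the first document, then fold each later document into it.
    summary = llm("Summarize:\n" + docs[0])
    for doc in docs[1:]:
        summary = llm(
            f"Existing summary:\n{summary}\nNew context:\n{doc}\nRefine the summary."
        )
    return summary


docs = ["cache outage record", "mysql outage record"]
for strategy in (stuff, map_reduce, refine):
    llm, calls = make_fake_llm()
    result = strategy(docs, llm)
    print(strategy.__name__, "calls:", len(calls), "result:", result)
```

For two documents this makes one, three, and two model calls respectively. Only stuff sends all the text in a single prompt, which is why it is the variant most exposed to token limits.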
