This post documents the process of building a fairly full-featured chatbot QA application with a web visualization.
Basic functional requirements:
- Call an LLM for conversation
- A friendly web UI that displays the conversation history
- RAG question answering over a local knowledge base
Technologies used:
- WebUI: Streamlit and Gradio are both convenient frameworks; Streamlit offers a richer set of built-in components, so this post uses Streamlit for the web UI
- Chatbot application and RAG: built on a recent version of the LangChain framework, using LCEL
- Trace: LangSmith, which is easy to set up, good for visualization and debugging, and records detailed run data
Prerequisites
- pip install streamlit and the related langchain packages (a sample command follows this list)
- LLM: the Doubao model; the platform supports calling it through the OpenAI-compatible interface
- Embedding model: Doubao's embedding model or an open-source HuggingFace one. This post uses BAAI's open-source bge-small-zh-v1.5.
Below are notes on calling the Doubao API and a trick for quickly downloading open-source HuggingFace embedding models.
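For reference, a typical install might look like the following (the exact package set is an assumption; trim it to what you actually use):
pip install -U streamlit langchain langchain-core langchain-community langchain-openai langchain-huggingface huggingface_hub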
Calling the Doubao model via the API
- Register and log in to Volcano Engine: console.volcengine.com/ark/
- Create an API Key
- Create an inference endpoint and select a model (new users get 500k free tokens per model)
- Get the base_url and model endpoint: open the endpoint's detail page, go to the API invocation section, and note down these two strings
- Call it in code via the OpenAI interface: using the values obtained above, set the environment variables OPENAI_BASE_URL, OPENAI_API_KEY and LLM_MODELEND. The following example sets the environment variables explicitly in code:
from langchain_openai import ChatOpenAI
import os

os.environ["OPENAI_BASE_URL"] = ""
os.environ["OPENAI_API_KEY"] = ""
os.environ["LLM_MODELEND"] = ""

llm = ChatOpenAI(
    model=os.environ["LLM_MODELEND"],
    temperature=0,
)
Quickly downloading HuggingFace models via a mirror and the command line
You can download directly from the mirror site: hf-mirror.com/.
The following downloads to local disk quickly via the command line:
- Install the dependency
pip install -U huggingface_hub
- Set the environment variable
  - Linux
export HF_ENDPOINT=https://hf-mirror.com
  - Windows (PowerShell)
$env:HF_ENDPOINT = "https://hf-mirror.com"
- Model download example
Two key arguments:
  - the HuggingFace repo name: BAAI/bge-small-zh-v1.5;
  - --local-dir: the local directory to save to
huggingface-cli download --resume-download BAAI/bge-small-zh-v1.5 --local-dir ./hub
- Dataset download example
huggingface-cli download --repo-type dataset --resume-download wikitext --local-dir wikitext
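Once downloaded, the local copy can be used as a LangChain embedding model. A minimal sketch, assuming the model files landed in ./hub as above and the langchain-huggingface package is installed (the normalization flag follows the bge model cards' recommendation):
from langchain_huggingface import HuggingFaceEmbeddings

# Point model_name at the local directory instead of the hub repo id
embeddings = HuggingFaceEmbeddings(
    model_name="./hub",
    encode_kwargs={"normalize_embeddings": True},  # recommended for bge models
)
vector = embeddings.embed_query("你好")
print(len(vector))  # 512 dimensions for bge-small-zh-v1.5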
1. A conversational app with history memory
The resulting interface:
Project structure
chatbot_project
|--outputs (stores saved histories, etc.)
|--src
|  |--utils
|  |  |--__init__.py
|  |  |--conversation.py (the Conversation class)
|  |  |--retrival.py (used for local retrieval; not covered in this section)
|  |--__init__.py
|--main.py (main file: UI and model invocation)
Initializing the LLM
st.session_state is a per-session variable store.
Because Streamlit reruns the script dynamically, using st.session_state judiciously avoids reloading the same data multiple times.
import streamlit as st
from langchain_openai import ChatOpenAI
import os

### Environment variables set inside the script; they can also be configured in the OS environment
os.environ["OPENAI_BASE_URL"] = ""
os.environ["OPENAI_API_KEY"] = ""
os.environ["LLM_MODELEND"] = "Doubao-pro-32k"

def init_llm():
    if "llm" not in st.session_state:
        # Initialize the LLM
        st.session_state['llm'] = ChatOpenAI(
            model=os.environ["LLM_MODELEND"],
            temperature=0,
        )

init_llm()
UI initialization
- Page layout mode
## UI initialization
# Two layout modes, "wide" and "centered"; wide stretches the page to full width
st.set_page_config(layout="centered")
- Sidebar
with memory-related options
# Sidebar
with st.sidebar:
    st.title("QA系统")
    st.write("这是一个使用 Streamlit 构建的简单聊天应用程序。")
    st.write("你可以提问并得到智能客服的回复。")
    # ....
    st.checkbox("With memory", key="with_history",
                help="This lets the agent remember the conversation history.")
    ### History length slider
    his_len = st.slider("History length", min_value=1, max_value=10, step=1, key="history_length",
                        disabled=not st.session_state.get("with_history", False))
    ### Memory mode, ["All", "Trim", "Summarize", "Trim+Summarize"]
    memory_mode = st.selectbox("Memory mode", ["All", "Trim", "Summarize", "Trim+Summarize"])
    ### Memory clear
    col1, col2 = st.columns([1, 1])
    col1.button("Clear history", on_click=lambda: st.session_state["chat_history"].clear(),
                use_container_width=True, disabled=not st.session_state.get("with_history", False),
                help="Clear the conversation history for the agent.\n\nThe history shown on the page is kept.")
    ### Memory save
    col3, col4 = st.columns([1, 1])
    col3.button("Save history", on_click=_history_to_disk, type="secondary", use_container_width=True)
- Chat box
  - st.chat_input renders the input box and captures the user's input as prompt_text
  - Streaming model invocation:
    response = st.session_state["llm"].stream(prompt_text)
  - Rendering chat messages: use st.chat_message("[role]").write([content]) to render each role's messages on the web page, e.g.:
    # st.chat_message defines a container into which anything Streamlit can render may be placed
    with st.chat_message("assistant"):
        st.write("我是您的问答机器人,请问有什么可以帮到您的吗?")
## Chat logic
if prompt_text := st.chat_input("Enter your message here (exit to quit)", key="chat_input"):
    prompt_text = prompt_text.strip()
    # If the user types "exit", save the history to disk and stop
    if prompt_text.lower() == "exit":
        _history_to_disk()
        historys.clear()
        msgs.clear()
        st.stop()
    conversation = Conversation(role=Role.USER, content=prompt_text)
    historys.append(conversation)  # append the user turn to the chat history
    st.chat_message("user").write(prompt_text)
    # st.spinner is also a container, but a preset one that shows a waiting animation
    with st.spinner("Thinking..."):
        # response = st.session_state["bot"].rag_chain.stream(prompt_text)  # RAG chain (later section)
        if st.session_state.get("with_history", False):
            # As usual, new messages are added to StreamlitChatMessageHistory when the chain is called.
            config = {"configurable": {"session_id": "any"}}
            if memory_mode == 'All':
                response = chain_with_history.stream({"question": prompt_text}, config)
            elif memory_mode == 'Trim+Summarize':
                response = chain_with_trimming_and_summarization.stream({"question": prompt_text}, config)
            elif memory_mode == 'Trim':
                response = chain_with_trimming.stream({"question": prompt_text}, config)
            elif memory_mode == 'Summarize':
                response = chain_with_regular_summarization.stream({"question": prompt_text}, config)
        else:
            response = st.session_state["llm"].stream(prompt_text)
        content = st.chat_message("assistant").write_stream(response)
        conversation = Conversation(role=Role.ASSISTANT, content=content)  # the assistant turn
        historys.append(conversation)  # append the assistant turn to the chat history
Memory
To combine real-time message rendering in Streamlit with conversation memory while staying compatible with recent LangChain versions, the legacy ConversationBufferMemory and friends are not used here.
Instead, the implementation is adapted from "Streamlit应用开发记录 - 02 几种对话历史应用实现" (Zhihu), debugged so that the injected memory actually takes effect, and built by combining StreamlitChatMessageHistory() with st.chat_message().
Initializing memory
## Memory
## Initialization
msgs = StreamlitChatMessageHistory(key="memory")
if len(msgs.messages) == 0:
    msgs.add_ai_message("How can I help you?")
Invocation without memory
response = st.session_state["llm"].stream(prompt_text)
Full-history memory
- Chain construction: pass in the entire message history
# prompt
prompt = ChatPromptTemplate.from_messages([
    ("system", "You are a useful AI chatbot having a conversation with a human."),
    MessagesPlaceholder(variable_name="history"),
    ("human", "{question}"),
])
# chain
chain = prompt | st.session_state["llm"]
chain_with_history = RunnableWithMessageHistory(
    chain,
    lambda session_id: msgs,  # always return the instance created earlier
    input_messages_key="question",
    history_messages_key="history",
)
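For intuition, the same pattern works outside Streamlit with an in-memory history. A minimal sketch (reusing the chain defined above; the stores dict and get_history helper are hypothetical names), which shows how the session_id in the config routes to a history instance, whereas the lambda above always returns the single msgs instance:
from langchain_core.chat_history import InMemoryChatMessageHistory
from langchain_core.runnables.history import RunnableWithMessageHistory

stores = {}  # session_id -> history instance

def get_history(session_id: str) -> InMemoryChatMessageHistory:
    # Route by session_id instead of always returning one instance
    if session_id not in stores:
        stores[session_id] = InMemoryChatMessageHistory()
    return stores[session_id]

demo_chain = RunnableWithMessageHistory(
    chain,
    get_history,
    input_messages_key="question",
    history_messages_key="history",
)
# Each call appends the new human/AI messages to that session's history:
# demo_chain.invoke({"question": "hi"}, {"configurable": {"session_id": "s1"}})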
- Invoking the chain with memory
if st.session_state.get("with_history", False):
    # As usual, new messages are added to StreamlitChatMessageHistory when the chain is called.
    config = {"configurable": {"session_id": "any"}}
    if memory_mode == 'All':
        response = chain_with_history.stream({"question": prompt_text}, config)
## Stream the model output to the web page
content = st.chat_message("assistant").write_stream(response)
Short-term memory
### Short-term memory
def trim_messages(chain_input):
    """Trim the messages to the desired length."""
    stored_messages = msgs.messages.copy()
    if len(stored_messages) <= his_len * 2:
        return False
    msgs.messages = []  # clear
    # Keep only the last his_len rounds (two messages per round)
    for message in stored_messages[-(his_len * 2):]:
        msgs.add_message(message)
    return True

chain_with_trimming = (
    RunnablePassthrough.assign(messages_trimmed=trim_messages)
    | chain_with_history
)
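Note that RunnablePassthrough.assign is used here purely for its side effect: trim_messages mutates msgs before chain_with_history reads it, and its boolean return value just lands in an unused key (messages_trimmed). A tiny standalone check of the trimming arithmetic (the messages here are made-up test data):
from langchain_core.messages import AIMessage, HumanMessage

his_len = 2
stored = [HumanMessage("q0"), AIMessage("a0"), HumanMessage("q1"), AIMessage("a1"),
          HumanMessage("q2"), AIMessage("a2")]
kept = stored[-(his_len * 2):]  # the last two rounds survive the trim
assert [m.content for m in kept] == ["q1", "a1", "q2", "a2"]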
SummaryMemory
### Summarizing the chat history
def summarize_messages(chain_input):
    """Summarize the messages."""
    global msgs
    stored_messages = msgs.messages.copy()
    if len(stored_messages) == 0:
        return False
    summarization_prompt = ChatPromptTemplate.from_messages([
        MessagesPlaceholder(variable_name="chat_history"),
        ("user", "Distill the above chat messages into a single summary message. Include as many specific details as you can.")
    ])
    if "llm" not in st.session_state:
        init_llm()
    summarization_chain = summarization_prompt | st.session_state["llm"]
    summary_message = summarization_chain.invoke({"chat_history": stored_messages})
    msgs.messages = []  # clear
    msgs.add_ai_message(summary_message)
    return True

chain_with_summarization = (
    RunnablePassthrough.assign(messages_summarized=summarize_messages)
    | chain_with_history
)
Analogous to ConversationSummaryBufferMemory
### Long- and short-term history combined
def trim_and_summarize_messages(chain_input):
    """Trim and summarize the messages."""
    stored_messages = msgs.messages.copy()
    if len(stored_messages) <= his_len * 2:
        return False
    msgs.messages = []  # clear
    summarization_prompt = ChatPromptTemplate.from_messages([
        MessagesPlaceholder(variable_name="chat_history"),
        ("user", "Distill the above chat messages into a single summary message. Include as many specific details as you can.")
    ])
    if "llm" not in st.session_state:
        init_llm()
    summarization_chain = summarization_prompt | st.session_state["llm"]
    # Summarize the oldest messages (roughly the first two rounds; the parity
    # term keeps the split aligned with human/AI pairs given the leading AI greeting)
    summary_message = summarization_chain.invoke({"chat_history": stored_messages[:4 + len(stored_messages) % 2]})
    msgs.add_ai_message(summary_message)
    # Keep the remaining recent messages verbatim
    for message in stored_messages[4 + len(stored_messages) % 2:]:
        msgs.add_message(message)
    return True

chain_with_trimming_and_summarization = (
    RunnablePassthrough.assign(messages_trimmed_and_summarized=trim_and_summarize_messages)
    | chain_with_history
)
History log utilities
Adapted from "Streamlit应用开发记录 - 02 几种对话历史应用实现" (Zhihu)
- The Conversation class
## conversation.py
from __future__ import annotations

from dataclasses import dataclass
from enum import auto, Enum

from PIL.Image import Image
import streamlit as st
from streamlit.delta_generator import DeltaGenerator


class Role(Enum):
    SYSTEM = auto()
    USER = auto()
    ASSISTANT = auto()
    TOOL = auto()
    INTERPRETER = auto()
    OBSERVATION = auto()

    def __str__(self):
        if self == Role.SYSTEM:
            return "<|system|>"
        elif self == Role.USER:
            return "<|user|>"
        elif self in [Role.ASSISTANT, Role.TOOL, Role.INTERPRETER]:
            return "<|assistant|>"
        elif self == Role.OBSERVATION:
            return "<|observation|>"
        else:
            raise ValueError(f'Unexpected role: {self}')

    # Get the message block for the given role
    def get_message(self):
        # Compare by value here, because the enum object in the session state
        # is not the same as the enum cases here, due to streamlit's rerunning
        # behavior.
        if self.value == Role.SYSTEM.value:
            return
        elif self.value == Role.USER.value:
            return st.chat_message(name="user", avatar="user")
        elif self.value == Role.ASSISTANT.value:
            return st.chat_message(name="assistant", avatar="assistant")
        elif self.value == Role.TOOL.value:
            return st.chat_message(name="tool", avatar="assistant")
        elif self.value == Role.INTERPRETER.value:
            return st.chat_message(name="interpreter", avatar="assistant")
        elif self.value == Role.OBSERVATION.value:
            return st.chat_message(name="observation", avatar="user")
        else:
            st.error(f'Unexpected role: {self}')


@dataclass
class Conversation:
    role: Role
    content: str
    tool: str | None = None
    image: Image | None = None

    def __str__(self) -> str:
        print(self.role, f"{self.content}", self.tool)
        if self.role in [Role.SYSTEM, Role.USER, Role.ASSISTANT, Role.OBSERVATION]:
            return f'{self.role}\n{self.content}'
        elif self.role == Role.TOOL:
            return f'{self.role}{self.tool}\n{self.content}'
        elif self.role == Role.INTERPRETER:
            return f'{self.role}interpreter\n{self.content}'

    # Human readable format
    def get_text(self) -> str:
        # text = postprocess_text(self.content)
        text = self.content  # only plain-text content is handled here
        if self.role.value == Role.TOOL.value:
            text = f'Calling tool `{self.tool}`:\n\n{text}'
        elif self.role.value == Role.INTERPRETER.value:
            text = f'{text}'
        elif self.role.value == Role.OBSERVATION.value:
            text = f'Observation:\n```\n{text}\n```'
        return text

    # Display as a markdown block
    def show(self, placeholder: DeltaGenerator | None = None) -> None:
        if placeholder:
            message = placeholder
        else:
            message = self.role.get_message()
        # if self.image:
        #     message.image(self.image)
        # else:
        #     text = self.get_text()
        #     message.markdown(text)
        if isinstance(self.content, (list, tuple)):
            for content in self.content:
                message.markdown(content)
        else:
            message.write(self.content)

    def to_dict(self):
        convers = []
        if isinstance(self.content, (list, tuple)):
            for c in self.content:
                convers.append({"role": f"{self.role}", "content": f"{c}"})
        else:
            convers.append({"role": f"{self.role}", "content": f"{self.content}"})
        return convers
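For illustration, a hypothetical use of the class inside a running Streamlit app:
# Hypothetical usage inside a running Streamlit app
from src.utils.conversation import Conversation, Role

turn = Conversation(role=Role.USER, content="你好")
turn.show()            # renders via st.chat_message(name="user", avatar="user")
print(turn.to_dict())  # [{'role': '<|user|>', 'content': '你好'}]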
Initializing the chat history
# Initialize the chat history
placeholder = st.empty()
with placeholder.container():
    if 'chat_history' not in st.session_state:
        st.session_state['chat_history'] = []
    historys: List[Conversation] = st.session_state['chat_history']
Displaying the chat history
## Show the history
for conversation in historys:
    conversation.show()
Saving the history to disk
def _history_to_disk():
    """Save the history to disk."""
    if 'chat_history' in st.session_state:
        history: List[Conversation] = st.session_state['chat_history']
        history_list = []
        now = datetime.datetime.now().strftime("%Y%m%dT%H%M%S")
        if not os.path.isdir("./outputs/logs"):
            os.makedirs("./outputs/logs")
        with open(f"./outputs/logs/history_{now}.json", "w", encoding='utf-8') as f:
            for conversation in history:
                history_list.extend(conversation.to_dict())
            json.dump(history_list, f, ensure_ascii=False, indent=4)
        print("save history to disk")
- Sidebar button hook
col3, col4 = st.columns([1, 1])
col3.button("Save history", on_click=_history_to_disk, type="secondary", use_container_width=True)
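For reference, the saved file is a JSON array of the role/content records produced by Conversation.to_dict(), with the role markers from Role.__str__; the message contents below are made up:
[
    {
        "role": "<|user|>",
        "content": "你好"
    },
    {
        "role": "<|assistant|>",
        "content": "你好!请问有什么可以帮到您?"
    }
]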
Appending turns to the chat history
conversation = Conversation(role=Role.USER, content=prompt_text)
historys.append(conversation)  # append the user turn
conversation = Conversation(role=Role.ASSISTANT, content=content)  # the assistant turn
historys.append(conversation)  # append the assistant turn
Clearing the chat history
historys.clear()
- Sidebar button hook
col1, col2 = st.columns([1, 1])
col1.button("Clear history", on_click=lambda: st.session_state["chat_history"].clear(),
            use_container_width=True, disabled=not st.session_state.get("with_history", False),
            help="Clear the conversation history for the agent.\n\nThe history shown on the page is kept.")
Running
streamlit run main.py
- Memory check
Complete main.py
#!/usr/bin/env python
# -*- coding: UTF-8 -*-
# '''
# @File: main.py
# @IDE: PyCharm
# @Author: Xandra
# @Time: 2024/11/22 22:35
# @Desc:
#
# '''
from typing import List

from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder
from langchain_core.runnables import RunnableWithMessageHistory, RunnablePassthrough
from src.utils import init_qa, init_llm
from src.utils.conversation import Conversation, Role
from langchain_community.chat_message_histories import (
    StreamlitChatMessageHistory,
)
import json
import datetime
import os
import streamlit as st


def _history_to_disk():
    """Save the history to disk."""
    if 'chat_history' in st.session_state:
        history: List[Conversation] = st.session_state['chat_history']
        history_list = []
        now = datetime.datetime.now().strftime("%Y%m%dT%H%M%S")
        if not os.path.isdir("./outputs/logs"):
            os.makedirs("./outputs/logs")
        with open(f"./outputs/logs/history_{now}.json", "w", encoding='utf-8') as f:
            for conversation in history:
                history_list.extend(conversation.to_dict())
            json.dump(history_list, f, ensure_ascii=False, indent=4)
        print("save history to disk")


## Memory
## Initialization
msgs = StreamlitChatMessageHistory(key="memory")
if len(msgs.messages) == 0:
    msgs.add_ai_message("How can I help you?")
init_llm()

# prompt
prompt = ChatPromptTemplate.from_messages([
    ("system", "You are a useful AI chatbot having a conversation with a human."),
    MessagesPlaceholder(variable_name="history"),
    ("human", "{question}"),
])
# chain
chain = prompt | st.session_state["llm"]
chain_with_history = RunnableWithMessageHistory(
    chain,
    lambda session_id: msgs,  # always return the instance created earlier
    input_messages_key="question",
    history_messages_key="history",
)

## UI initialization
# Two layout modes, "wide" and "centered"; wide stretches the page to full width
st.set_page_config(layout="centered")
# st.title("问答机器人")
# # Initialize the QA chain
# init_qa()

# Sidebar
with st.sidebar:
    st.title("QA系统")
    st.write("这是一个使用 Streamlit 构建的简单聊天应用程序。")
    st.write("你可以提问并得到智能客服的回复。")
    # ....
    st.checkbox("With memory", key="with_history",
                help="This lets the agent remember the conversation history.")
    ### History length slider
    his_len = st.slider("History length", min_value=1, max_value=10, step=1, key="history_length",
                        disabled=not st.session_state.get("with_history", False))
    ### Memory mode, ["All", "Trim", "Summarize", "Trim+Summarize"]
    memory_mode = st.selectbox("Memory mode", ["All", "Trim", "Summarize", "Trim+Summarize"])
    ### Memory clear
    col1, col2 = st.columns([1, 1])
    col1.button("Clear history", on_click=lambda: st.session_state["chat_history"].clear(),
                use_container_width=True, disabled=not st.session_state.get("with_history", False),
                help="Clear the conversation history for the agent.\n\nThe history shown on the page is kept.")
    ### Memory save
    col3, col4 = st.columns([1, 1])
    col3.button("Save history", on_click=_history_to_disk, type="secondary", use_container_width=True)

# Initialize the chat history
placeholder = st.empty()
with placeholder.container():
    if 'chat_history' not in st.session_state:
        st.session_state['chat_history'] = []
    historys: List[Conversation] = st.session_state['chat_history']

## Show the history
for conversation in historys:
    conversation.show()

### The different chat-history implementations
### Long- and short-term history combined
def trim_and_summarize_messages(chain_input):
    """Trim and summarize the messages."""
    stored_messages = msgs.messages.copy()
    if len(stored_messages) <= his_len * 2:
        return False
    msgs.messages = []  # clear
    summarization_prompt = ChatPromptTemplate.from_messages([
        MessagesPlaceholder(variable_name="chat_history"),
        ("user", "Distill the above chat messages into a single summary message. Include as many specific details as you can.")
    ])
    if "llm" not in st.session_state:
        init_llm()
    summarization_chain = summarization_prompt | st.session_state["llm"]
    # Summarize the oldest messages (roughly the first two rounds; the parity
    # term keeps the split aligned with human/AI pairs)
    summary_message = summarization_chain.invoke({"chat_history": stored_messages[:4 + len(stored_messages) % 2]})
    msgs.add_ai_message(summary_message)
    for message in stored_messages[4 + len(stored_messages) % 2:]:
        msgs.add_message(message)
    return True

chain_with_trimming_and_summarization = (
    RunnablePassthrough.assign(messages_trimmed_and_summarized=trim_and_summarize_messages)
    | chain_with_history
)

### Periodic summarization of the chat history
def regular_summarize_messages(chain_input):
    """Keep the last his_len messages and summarize them."""
    global msgs
    stored_messages = msgs.messages.copy()
    if len(stored_messages) <= his_len:
        return False
    msgs.messages = []  # clear
    for message in stored_messages[-his_len:]:
        msgs.add_message(message)
    summarization_prompt = ChatPromptTemplate.from_messages([
        MessagesPlaceholder(variable_name="chat_history"),
        ("user",
         "Distill the above chat messages into a single summary message. Include as many specific details as you can.")
    ])
    if "llm" not in st.session_state:
        init_llm()
    summarization_chain = summarization_prompt | st.session_state["llm"]
    summary_message = summarization_chain.invoke({"chat_history": msgs.messages})
    msgs.messages = []  # clear
    msgs.add_ai_message(summary_message)
    return True

chain_with_regular_summarization = (
    RunnablePassthrough.assign(messages_regular_summarized=regular_summarize_messages)
    | chain_with_history
)

### Short-term history
def trim_messages(chain_input):
    """Trim the messages to the desired length."""
    stored_messages = msgs.messages.copy()
    if len(stored_messages) <= his_len * 2:
        return False
    msgs.messages = []  # clear
    for message in stored_messages[-(his_len * 2):]:
        msgs.add_message(message)
    return True

chain_with_trimming = (
    RunnablePassthrough.assign(messages_trimmed=trim_messages)
    | chain_with_history
)

### Summarizing the chat history
def summarize_messages(chain_input):
    """Summarize the messages."""
    global msgs
    stored_messages = msgs.messages.copy()
    if len(stored_messages) == 0:
        return False
    summarization_prompt = ChatPromptTemplate.from_messages([
        MessagesPlaceholder(variable_name="chat_history"),
        ("user", "Distill the above chat messages into a single summary message. Include as many specific details as you can.")
    ])
    if "llm" not in st.session_state:
        init_llm()
    summarization_chain = summarization_prompt | st.session_state["llm"]
    summary_message = summarization_chain.invoke({"chat_history": stored_messages})
    msgs.messages = []  # clear
    msgs.add_ai_message(summary_message)
    return True

chain_with_summarization = (
    RunnablePassthrough.assign(messages_summarized=summarize_messages)
    | chain_with_history
)

# ## Print the memory contents
# for msg in msgs.messages:
#     # st.chat_message(msg.type).write(msg.content)
#     print(msg.content)

## Chat logic
if prompt_text := st.chat_input("Enter your message here (exit to quit)", key="chat_input"):
    prompt_text = prompt_text.strip()
    # If the user types "exit", save the history to disk and stop
    if prompt_text.lower() == "exit":
        _history_to_disk()
        historys.clear()
        msgs.clear()
        st.stop()
    conversation = Conversation(role=Role.USER, content=prompt_text)
    historys.append(conversation)  # append the user turn to the chat history
    st.chat_message("user").write(prompt_text)
    # st.spinner is also a container, but a preset one that shows a waiting animation
    with st.spinner("Thinking..."):
        # response = st.session_state["bot"].rag_chain.stream(prompt_text)  # RAG chain (later section)
        if st.session_state.get("with_history", False):
            # As usual, new messages are added to StreamlitChatMessageHistory when the chain is called.
            config = {"configurable": {"session_id": "any"}}
            if memory_mode == 'All':
                response = chain_with_history.stream({"question": prompt_text}, config)
            elif memory_mode == 'Trim+Summarize':
                response = chain_with_trimming_and_summarization.stream({"question": prompt_text}, config)
            elif memory_mode == 'Trim':
                response = chain_with_trimming.stream({"question": prompt_text}, config)
            elif memory_mode == 'Summarize':
                response = chain_with_regular_summarization.stream({"question": prompt_text}, config)
        else:
            response = st.session_state["llm"].stream(prompt_text)
        content = st.chat_message("assistant").write_stream(response)
        conversation = Conversation(role=Role.ASSISTANT, content=content)  # the assistant turn
        historys.append(conversation)
        # ## Print the memory contents
        # print("---------------------------")
        # for msg in msgs.messages:
        #     # st.chat_message(msg.type).write(msg.content)
        #     print(msg.content)
2. Local knowledge-base QA application
To be updated.
Reference
API Reference - Streamlit Docs