LangChain in Practice 4. Model I/O: Input Prompts, Calling the Model, Parsing the Output

I am taking part in the 「豆包MarsCode AI练中学体验活动」 (the Doubao MarsCode "AI learn by practice" event).

1.Model I/O

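Using a model can be broken into three steps: the input prompt (Format in the figure), the model call (Predict in the figure), and output parsing (Parse in the figure). The three steps form a single unit, so LangChain refers to the whole process as Model I/O (Input/Output).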
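The three steps can be sketched in plain Python before any LangChain code appears. This is only an illustration: `format_prompt`, `predict`, and `parse` are hypothetical names, and `predict` is a fake model that returns a canned reply rather than calling a real API.

```python
# Hypothetical stand-ins for the three Model I/O stages; no real model is called.

def format_prompt(template: str, **variables: str) -> str:
    """Format: fill the template's placeholders to build the input prompt."""
    return template.format(**variables)


def predict(prompt: str) -> str:
    """Predict: a fake model that returns a canned 'key: value' reply."""
    return f"description: A lovely bouquet.\nprompt_length: {len(prompt)}"


def parse(raw_output: str) -> dict[str, str]:
    """Parse: turn the raw text reply into a dictionary."""
    result = {}
    for line in raw_output.splitlines():
        key, _, value = line.partition(":")
        result[key.strip()] = value.strip()
    return result


template = "Write a short description for {flower_name}, priced at {price} yuan."
prompt = format_prompt(template, flower_name="rose", price="50")
parsed = parse(predict(prompt))
print(parsed["description"])
```

Each stage only depends on the output of the previous one, which is why the three can be treated as one pipeline.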

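2.Prompt templates

"Prompt engineering" is a popular term right now. It is the study of how to construct prompts for large language models.

Suppose you want to generate a short piece of copy for every kind of flower you sell. With a prompt template, whenever an employee or a customer wants to learn about a particular flower, calling the template produces suitable text.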


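api_key = ""  # fill in your API key
model_name = 'ep-20241104131149-csxf9'
base_url = "https://ark.cn-beijing.volces.com/api/v3"

# Import the LangChain prompt template
from langchain.prompts import PromptTemplate

# Create the raw template. In English: "You are a professional flower shop
# copywriter. Could you give an attractive short description of {flower_name},
# priced at {price} yuan?"
template = """您是一位专业的鲜花店文案撰写员。\n
对于售价为 {price} 元的 {flower_name} ，您能提供一个吸引人的简短描述吗？
"""
# Build a LangChain prompt template from the raw template
prompt = PromptTemplate.from_template(template)
# Print the LangChain prompt template
print(prompt)

# Import the OpenAI-compatible chat model interface from LangChain
from langchain_openai import ChatOpenAI

# Create the model instance
model = ChatOpenAI(api_key=api_key, base_url=base_url, model=model_name)
# Fill in the prompt (the flower name is passed as a string, not a list)
input_prompt = prompt.format(flower_name="玫瑰", price="50")
# Get the model's output
output = model.invoke(input_prompt)
# Print the output
print(output)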

3.Reusing prompt templates

LangChain supports three broad categories of models.

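Large language models (LLMs), also called text models, take a text string as input and return a text string as output. OpenAI's text-davinci-003, Facebook's LLaMA, and Anthropic's Claude are typical LLMs.
Chat models, represented mainly by OpenAI's ChatGPT family, are usually backed by a language model, but their API is more structured: they take a list of chat messages as input and return a chat message.
Text embedding models take text as input and return a list of floating-point numbers, that is, an embedding. We have already seen one of them, OpenAI's text-embedding-ada-002. Embedding models are used when storing documents in a vector database and have little to do with the prompt engineering discussed here.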
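The difference between the three categories is mostly a difference in input and output shapes. The sketch below only mirrors those shapes with toy functions; the type aliases and `toy_*` functions are hypothetical and are not LangChain classes.

```python
from typing import Callable

# Hypothetical type aliases that mirror the three shapes described above.
Message = dict[str, str]                         # e.g. {"role": "user", "content": "..."}
TextModel = Callable[[str], str]                 # string in, string out
ChatModel = Callable[[list[Message]], Message]   # message list in, message out
EmbeddingModel = Callable[[str], list[float]]    # text in, list of floats out


def toy_llm(prompt: str) -> str:
    """Text model shape: returns a plain string."""
    return prompt.upper()


def toy_chat(messages: list[Message]) -> Message:
    """Chat model shape: reads a list of messages and returns one message."""
    last_user = [m for m in messages if m["role"] == "user"][-1]
    return {"role": "assistant", "content": f"You said: {last_user['content']}"}


def toy_embedding(text: str) -> list[float]:
    """Embedding shape: not a real embedding, just scaled character codes."""
    return [ord(ch) % 100 / 100 for ch in text[:4]]


print(toy_llm("rose"))
print(toy_chat([{"role": "user", "content": "hi"}]))
print(toy_embedding("rose"))
```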

Reusing the prompt template


api_key = ""  # fill in your API key
model_name = 'ep-20241104131149-csxf9'
base_url = "https://ark.cn-beijing.volces.com/api/v3"

# Import the LangChain prompt template
from langchain.prompts import PromptTemplate

# Create the raw template (same Chinese copywriter prompt as before)
template = """您是一位专业的鲜花店文案撰写员。\n
对于售价为 {price} 元的 {flower_name} ，您能提供一个吸引人的简短描述吗？
"""
# Build a LangChain prompt template from the raw template
prompt = PromptTemplate.from_template(template)
# Print the LangChain prompt template
print(prompt)


# Import the OpenAI model interfaces from LangChain
from langchain_openai import OpenAI, ChatOpenAI

# Create the model instance
# model = OpenAI(model_name='gpt-3.5-turbo-instruct')
model = ChatOpenAI(api_key=api_key, model_name=model_name, base_url=base_url)

# Flowers (rose, lily, carnation) and their prices
flowers = ["玫瑰", "百合", "康乃馨"]
prices = ["50", "30", "20"]

# Generate copy for each flower
for flower, price in zip(flowers, prices):
    # Fill in the prompt template to build the input
    input_prompt = prompt.format(flower_name=flower, price=price)

    # Get the model's output
    output = model.invoke(input_prompt)

    # Print the output
    print(output)

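In theory, LangChain wraps prompts so that prompt engineering is easier to apply; you will notice that at its core this is much the same as Python string formatting. Prompt engineering involves more than filling in a template, though, which is why few-shot prompting, chain-of-thought, and tree-of-thought methods come up later.

Using the native OpenAI client instead of LangChain: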
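To make the comparison with string formatting concrete, here is a minimal sketch of what a template wrapper adds on top of `str.format`: it knows which variables the template expects and can complain when one is missing. `MiniPromptTemplate` is a hypothetical, simplified class written for this note; it is not LangChain's `PromptTemplate`.

```python
from string import Formatter


class MiniPromptTemplate:
    """Hypothetical, simplified stand-in for a prompt template."""

    def __init__(self, template: str) -> None:
        self.template = template
        # Collect the placeholder names found in the template, e.g. {price}.
        self.input_variables = sorted(
            {name for _, name, _, _ in Formatter().parse(template) if name}
        )

    def format(self, **kwargs: str) -> str:
        # Check for missing variables before delegating to str.format.
        missing = [v for v in self.input_variables if v not in kwargs]
        if missing:
            raise KeyError(f"missing variables: {missing}")
        return self.template.format(**kwargs)


text = "Describe {flower_name}, priced at {price} yuan."
mini = MiniPromptTemplate(text)
print(mini.input_variables)
# The formatted result is identical to calling str.format directly.
print(mini.format(flower_name="rose", price="50") == text.format(flower_name="rose", price="50"))
```

The formatting itself is ordinary string formatting; what the wrapper contributes is the bookkeeping around it.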

api_key = ""  # fill in your API key
model = 'ep-20241104131149-csxf9'
base_url = "https://ark.cn-beijing.volces.com/api/v3"


from openai import OpenAI  # import the OpenAI client

# Prompt with positional placeholders: price first, then flower name
prompt_text = "您是一位专业的鲜花店文案撰写员。对于售价为{}元的{}，您能提供一个吸引人的简短描述吗？"

flowers = ["玫瑰", "百合", "康乃馨"]
prices = ["50", "30", "20"]

# Create the client once and reuse it for every request
client = OpenAI(api_key=api_key, base_url=base_url)

# Call the chat completions endpoint in a loop to generate the copy
for flower, price in zip(flowers, prices):
    prompt = prompt_text.format(price, flower)
    response = client.chat.completions.create(
        model=model,
        messages=[
            {"role": "system", "content": "You are a helpful assistant."},
            {"role": "user", "content": prompt},
        ],
        max_tokens=100,
    )

    print(response.choices[0].message.content)

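As you can see, the LangChain version is somewhat more concise.

4.Switching models

Besides more concise prompts, another advantage of LangChain is that it makes switching models easy. Below we switch directly to a Hugging Face model (note that you need to register a Hugging Face account and obtain a token with write permission). The AI练中学 environment cannot reach the external internet, so you need another route to Hugging Face, such as GitHub Codespaces or a proxy. There is one more catch: the Hugging Face model used in this example does not support Chinese, and a prompt containing Chinese gets no useful reply.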

# Set the HuggingFace API token
import os
os.environ['HUGGINGFACEHUB_API_TOKEN'] = ''

# Import the LangChain prompt template
from langchain.prompts import PromptTemplate

# Create the raw template (English, because this model does not handle Chinese)
template = """You are a flower shop assistant.\n
For {price} of {flower_name}, can you write something for me?
"""
# Build a LangChain prompt template from the raw template
prompt = PromptTemplate.from_template(template)
# Print the LangChain prompt template
print(prompt)

from langchain_community.llms import HuggingFaceHub
# pip install langchain-huggingface==0.0.3

# Create the model instance
model = HuggingFaceHub(repo_id="google/flan-t5-large")
# Fill in the prompt (the flower name is passed as a string, not a list)
input_prompt = prompt.format(flower_name="rose", price="50")
# Get the model's output
output = model.invoke(input_prompt)
# Print the output
print(output)

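Admittedly, this is a very old model. The point the instructor wants to make is template reuse: you only change the model, and the template needs no modification. For now, that looks like a real convenience.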
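The idea of keeping the template fixed and swapping only the model can be shown without any network access. In this sketch each "model" is just a function from prompt text to reply text; `model_a`, `model_b`, and `describe` are hypothetical names used for illustration.

```python
from typing import Callable


# Hypothetical stand-ins: each "model" maps prompt text to reply text.
def model_a(prompt: str) -> str:
    return f"[model A] {prompt}"


def model_b(prompt: str) -> str:
    return f"[model B] {prompt[::-1]}"


# One shared template, never modified when the model changes.
TEMPLATE = "For {price} of {flower_name}, can you write something for me?"


def describe(model: Callable[[str], str], flower_name: str, price: str) -> str:
    """Build the prompt from the shared template, then call whichever model is passed in."""
    return model(TEMPLATE.format(flower_name=flower_name, price=price))


for model in (model_a, model_b):
    print(describe(model, "rose", "50"))
```

Because the calling code depends only on the "prompt in, reply out" shape, replacing the model is a one-argument change.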

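5.Output parsing

LangChain's output parsing is also very handy, although how well it works still depends on the model.

Below, we refactor the program with a LangChain output parser so that the model produces a structured response, which we then parse and save directly to a CSV file.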


api_key = ""  # fill in your API key
model_name = 'ep-20241104131149-csxf9'
base_url = "https://ark.cn-beijing.volces.com/api/v3"

# Import the LangChain prompt template
from langchain.prompts import PromptTemplate

# Create the prompt template (same copywriter prompt, plus format instructions)
prompt_template = """您是一位专业的鲜花店文案撰写员。
对于售价为 {price} 元的 {flower_name} ，您能提供一个吸引人的简短描述吗？
{format_instructions}"""

# Call the model through LangChain
from langchain_openai import OpenAI, ChatOpenAI

# Create the model instance
# model = OpenAI(model_name='gpt-3.5-turbo-instruct')
model = ChatOpenAI(
    api_key=api_key,
    base_url=base_url,
    model=model_name,
)

# Import the structured output parser and ResponseSchema
from langchain.output_parsers import StructuredOutputParser, ResponseSchema

# Define the response schema we want to receive:
# "description" is the flower copy, "reason" is why the copy was written this way
response_schemas = [
    ResponseSchema(name="description", description="鲜花的描述文案"),
    ResponseSchema(name="reason", description="为什么要这样写这个文案"),
]
# Create the output parser
output_parser = StructuredOutputParser.from_response_schemas(response_schemas)

# Get the format instructions
format_instructions = output_parser.get_format_instructions()
# Build the prompt from the template and inject the parser's format instructions
prompt = PromptTemplate.from_template(
    prompt_template, partial_variables={"format_instructions": format_instructions}
)

# Prepare the data (rose, lily, carnation)
flowers = ["玫瑰", "百合", "康乃馨"]
prices = ["50", "30", "20"]

# Create an empty DataFrame to hold the results
import pandas as pd

df = pd.DataFrame(columns=["flower", "price", "description", "reason"])  # declare the column names first

for flower, price in zip(flowers, prices):
    # Build the model input from the prompt
    input_prompt = prompt.format(flower_name=flower, price=price)

    # Get the model's output
    output = model.invoke(input_prompt)
    # Parse the model's output (the result is a dictionary)
    parsed_output = output_parser.parse(output.content)

    # Add "flower" and "price" to the parsed output
    parsed_output["flower"] = flower
    parsed_output["price"] = price

    # Append the parsed output to the DataFrame
    df.loc[len(df)] = parsed_output

# Print the results as a list of dictionaries
print(df.to_dict(orient="records"))

# Save the DataFrame to a CSV file
df.to_csv("flowers_with_descriptions.csv", index=False)
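To get a feel for what the parsing step involves, here is a rough pure-Python sketch. It assumes the format instructions ask the model to reply with a JSON object inside a fenced json snippet; `parse_structured` and `fake_reply` are hypothetical and only approximate the idea, they are not LangChain's implementation.

```python
import json
import re

FENCE = "`" * 3  # built programmatically so the fence marker is not written literally


def parse_structured(reply: str, required_keys: list[str]) -> dict:
    """Extract a JSON object from a reply (fenced or bare) and check its keys."""
    match = re.search(FENCE + r"(?:json)?\s*(.*?)" + FENCE, reply, re.DOTALL)
    payload = match.group(1) if match else reply
    data = json.loads(payload)
    missing = [key for key in required_keys if key not in data]
    if missing:
        raise ValueError(f"reply is missing keys: {missing}")
    return data


# A made-up reply in the shape the format instructions are assumed to request.
fake_reply = (
    "Here is the result:\n"
    + FENCE + "json\n"
    + '{"description": "Fifty yuan of romance.", "reason": "Short and vivid."}\n'
    + FENCE
)
row = parse_structured(fake_reply, ["description", "reason"])
# Attach the input fields, as the loop above does before storing the row.
row.update({"flower": "rose", "price": "50"})
print(row)
```

This also shows why the result depends on the model: if the reply is not valid JSON or lacks a required key, parsing fails and the row cannot be stored.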

With that, today's task is complete.