Newbie Raises a Hand 01 | Youth Training Camp AI Notes: Models | Doubao MarsCode AI Problem Practice

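Disclaimer: I am a complete beginner who is still not fluent in Python (and not a CS major). I mostly learn by reading and looking things up as I go, so these notes are one clumsy newbie's struggle. Experts, feel free to skip.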

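I. Concepts

1. Models: the model sits at the lowest layer of the LangChain framework and is the core element of any application built on language models. LangChain application development is, after all, the process of using LangChain as the framework and calling a large model through an API to solve a concrete problem.

Types of models

LangChain supports three main types of models:

Large language models (LLM): these take a text string as input and return a text string as output. They are typically used for text completion and generation.
Chat models (Chat Model): these models, such as OpenAI's ChatGPT series, are optimized for conversation. The input is a list of chat messages and the output is a generated chat message, which makes it easier to keep context.
Text embedding models (Embedding Model): these convert text into a list of floating-point numbers, used for machine learning tasks such as similarity computation and information retrieval.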
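To make "a list of floats used for similarity computation" less abstract for myself, here is a minimal sketch in plain Python. The three vectors are made-up toy values, not the output of any real embedding model, and `cosine_similarity` is my own helper rather than a LangChain function.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (1.0 = same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Made-up 3-dimensional "embeddings"; a real model returns far longer lists.
rose = [0.9, 0.1, 0.0]
lily = [0.8, 0.2, 0.1]
laptop = [0.0, 0.1, 0.9]

# Texts with related meaning should end up with more similar vectors.
print(cosine_similarity(rose, lily) > cosine_similarity(rose, laptop))
```

In a real application the vectors would come from an embedding model, and the same comparison would drive retrieval: keep the stored texts whose vectors score highest against the query vector.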

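How a model is used

In LangChain, using a model breaks down into three main steps:

Format the input prompt (Format): generate the input dynamically from a template so it can adapt to different tasks.
Call the model (Predict): call the chosen language model through its API.
Parse the output (Parse): parse the model's output into the format you need, so later code can process it directly.

This flow is called "Model I/O", and LangChain provides a range of tools and templates to simplify it (as shown in the figure).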
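The three steps can be sketched without any API key. This is only an illustration under my own assumptions: `format_prompt`, `fake_model` and `parse_output` are hypothetical names, and `fake_model` returns canned text instead of calling a real model.

```python
import json

TEMPLATE = "Write a short description for {flower_name} priced at {price} yuan. Reply as JSON."

def format_prompt(flower_name, price):
    """Format: fill the template with concrete values."""
    return TEMPLATE.format(flower_name=flower_name, price=price)

def fake_model(prompt):
    """Predict: hypothetical stand-in for an API call; returns canned JSON text."""
    return json.dumps({"description": "A placeholder description.", "prompt_length": len(prompt)})

def parse_output(text):
    """Parse: turn the raw string into a dict the program can work with."""
    return json.loads(text)

prompt = format_prompt("rose", 50)
raw = fake_model(prompt)
result = parse_output(raw)
print(result["description"])
```

The point is the shape of the pipeline: a string goes in, a string comes out, and only after parsing do you have structured data.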

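LangChain features

Besides a standard interface to many LLMs, LangChain also supports the following:

Asynchronous calls: several LLM calls can run concurrently, which improves efficiency.
Caching: for repeated requests, LangChain can cache responses to save time and cost.
Streaming: responses can be returned piece by piece, which improves the user experience.
Custom models: you can subclass the base class to define your own LLM for specific needs.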
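To see why asynchronous calls and caching help, here is a rough sketch using only the standard library. It does not use LangChain's own async or cache APIs: `fake_llm` is a hypothetical stand-in that just sleeps briefly, and the cache is an ordinary dict.

```python
import asyncio

_cache = {}      # prompt -> response, a plain in-memory cache
call_count = 0   # how many times the "model" was really called

async def fake_llm(prompt):
    """Hypothetical stand-in for a slow network call to a model."""
    global call_count
    call_count += 1
    await asyncio.sleep(0.01)
    return f"response to: {prompt}"

async def cached_call(prompt):
    """Only call the model when this exact prompt has not been seen before."""
    if prompt not in _cache:
        _cache[prompt] = await fake_llm(prompt)
    return _cache[prompt]

async def main():
    prompts = ["rose", "lily", "carnation"]
    # The three calls run concurrently instead of one after another.
    first = await asyncio.gather(*(cached_call(p) for p in prompts))
    # The second round is answered entirely from the cache.
    second = await asyncio.gather(*(cached_call(p) for p in prompts))
    return first, second

first, second = asyncio.run(main())
print(first == second, call_count)
```

The second round returns the same answers without touching the model again, which is where the time and cost savings come from.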

II. Code Implementation

(1) Input: the prompt template

from langchain.prompts import PromptTemplate
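# Create the raw template. In English: "You are a professional copywriter for a flower shop.
# Could you give a short, attractive description of {flower_name}, priced at {price} yuan?"
template = """您是一位专业的鲜花店文案撰写员。\n
对于售价为 {price} 元的 {flower_name} ，您能提供一个吸引人的简短描述吗？
"""
# Build a LangChain prompt template from the raw template
prompt = PromptTemplate.from_template(template)
# Print the contents of the LangChain prompt template
print(prompt)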

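Explanation: the "template" is a piece of text that fixes the format for describing a flower. It is written in f-string style (an ordinary string with f-string-like placeholders, not a Python f-string literal) and contains two variables, {flower_name} and {price}, for the flower's name and price. They are placeholders that get replaced with concrete values when the template is used to generate a prompt. from_template is a class method that creates a PromptTemplate object directly from a template string. Printing the PromptTemplate object shows what it holds: the input variables (here flower_name and price), the output parser (none specified here), the template format ('f-string' here), and whether the template is validated (set to True in this example).

So the from_template method of PromptTemplate turns a raw template string into a richer, easier-to-handle PromptTemplate object, and that object is what LangChain calls a prompt template.

[Note] An f-string (formatted string literal) is a string formatting method introduced in Python 3.6 to simplify string interpolation and formatting. You define one by putting the letter "f" or "F" in front of the string, which lets you embed expressions and variables directly in it. For example:

pi = 3.14159
print(f"Pi rounded to two decimal places is {pi:.2f}.")
# Output: Pi rounded to two decimal places is 3.14.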
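The difference that tripped me up: a real f-string is filled in at the moment it is written, while a template string keeps its placeholders until you call `.format()` on it. A small sketch with my own variable names:

```python
flower_name, price = "rose", 50

# A real f-string: the values are substituted immediately, on this line.
eager = f"{flower_name} costs {price} yuan"

# A plain string with the same placeholder syntax: nothing is substituted yet.
template = "{flower_name} costs {price} yuan"

# The same template can be filled in later, as many times as needed.
late_1 = template.format(flower_name="lily", price=30)
late_2 = template.format(flower_name="carnation", price=20)

print(eager)
print(late_1)
print(late_2)
```

That deferred filling is what makes a template reusable, and it is why the prompt template above is built from a plain string rather than from a string with an `f` prefix.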

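Method 1: using LangChain

# Set the OpenAI API key
import os
os.environ["OPENAI_API_KEY"] = 'your-OpenAI-API-key'

# Import the OpenAI model interface from LangChain
from langchain_openai import OpenAI
# Create the model instance
model = OpenAI(model_name='gpt-3.5-turbo-instruct')
# Fill in the prompt (pass the flower name as a string, not a list,
# otherwise the prompt would contain "['玫瑰']"; 玫瑰 = rose)
prompt_input = prompt.format(flower_name="玫瑰", price='50')
# Get the model's output
output = model.invoke(prompt_input)
# Print the output
print(output)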

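Method 2: calling the API directly

import openai # Import the OpenAI library (version 1.0 or later)
openai.api_key = 'Your-OpenAI-API-Key' # API key

# Set the prompt (same wording as the template in Method 1, with positional placeholders)
prompt_text = "您是一位专业的鲜花店文案撰写员。对于售价为{}元的{}，您能提供一个吸引人的简短描述吗？"

flowers = ["玫瑰", "百合", "康乃馨"] # rose, lily, carnation
prices = ["50", "30", "20"]

# Call the text model's completions endpoint in a loop to generate the copy
for flower, price in zip(flowers, prices):
    prompt = prompt_text.format(price, flower)
    response = openai.completions.create(
        model="gpt-3.5-turbo-instruct", # the 1.x client takes `model`; `engine` belonged to the old pre-1.0 API
        prompt=prompt,
        max_tokens=100
    )
    print(response.choices[0].text.strip()) # Print the copy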

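Comparing the two methods (what the template buys you):

Readability: with a template the prompt text is easier to read and understand, especially for complex prompts or prompts with several variables.
Reusability: a template can be reused in many places, which keeps the code short; you do not have to rebuild the prompt string everywhere a prompt is needed.
Maintenance: if the prompt needs to change later, you only edit the template instead of hunting down every place in the code that uses that prompt.
Variable handling: if the prompt involves several variables, the template inserts them for you, with no manual string concatenation.
Parameterization: a template can produce different prompts from different parameters, which is very useful for personalized text generation.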
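To convince myself of the "variable handling" and "reusability" points, I wrote a toy template class. `SimpleTemplate` is a hypothetical, simplified stand-in of my own, not LangChain's PromptTemplate; it only uses the standard library's `string.Formatter` to discover placeholder names.

```python
from string import Formatter

class SimpleTemplate:
    """Hypothetical, simplified stand-in for a prompt template (not LangChain's class)."""

    def __init__(self, template):
        self.template = template
        # Formatter().parse yields (literal_text, field_name, format_spec, conversion);
        # field_name is None for trailing literal text, so those entries are skipped.
        self.input_variables = sorted(
            {name for _, name, _, _ in Formatter().parse(template) if name}
        )

    def format(self, **values):
        """Fill the template, failing loudly if a variable was forgotten."""
        missing = [v for v in self.input_variables if v not in values]
        if missing:
            raise KeyError(f"missing variables: {missing}")
        return self.template.format(**values)

tpl = SimpleTemplate("Describe {flower_name} priced at {price} yuan.")
print(tpl.input_variables)

# One template, reused for every flower.
for flower, price in [("rose", 50), ("lily", 30)]:
    print(tpl.format(flower_name=flower, price=price))
```

Changing the wording of the prompt now means editing one string, and a forgotten variable is reported at format time instead of silently producing a broken prompt.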

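(2) Output: parsing code

The model has generated a piece of copy. It is a plain string, which is what we asked for. When building a real application, though, text alone is usually not enough; more often we need structured data that the program can process directly. (In other words, the output should be sorted into fields by meaning, so the next step can pick it up easily.)

For example, suppose you want the model to return two fields for this copy:

description: the descriptive text for the flower
reason: an explanation of why the copy was written that way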

# Set the OpenAI API key
import os
os.environ["OPENAI_API_KEY"] = 'your-OpenAI-API-key'

# Import the prompt template from LangChain
from langchain.prompts import PromptTemplate
# Create the raw prompt template (same wording as before, plus a slot for the format instructions)
prompt_template = """您是一位专业的鲜花店文案撰写员。
对于售价为 {price} 元的 {flower_name} ，您能提供一个吸引人的简短描述吗？
{format_instructions}"""

# Call the model through LangChain
from langchain_openai import OpenAI
# Create the model instance
model = OpenAI(model_name='gpt-3.5-turbo-instruct')

# Import the structured output parser and ResponseSchema
from langchain.output_parsers import StructuredOutputParser, ResponseSchema
# Define the response schema we want to receive
# ("the flower's descriptive copy" and "why the copy was written this way")
response_schemas = [
    ResponseSchema(name="description", description="鲜花的描述文案"),
    ResponseSchema(name="reason", description="为什么要这样写这个文案")
]
# Create the output parser
output_parser = StructuredOutputParser.from_response_schemas(response_schemas)

# Get the format instructions
format_instructions = output_parser.get_format_instructions()
# Build the prompt from the raw template and add the parser's instructions to it
prompt = PromptTemplate.from_template(prompt_template,
                partial_variables={"format_instructions": format_instructions})

# Prepare the data
flowers = ["玫瑰", "百合", "康乃馨"] # rose, lily, carnation
prices = ["50", "30", "20"]

# Create an empty DataFrame to store the results
import pandas as pd
df = pd.DataFrame(columns=["flower", "price", "description", "reason"]) # Declare the column names first

for flower, price in zip(flowers, prices):
    # Build the model input from the prompt
    prompt_input = prompt.format(flower_name=flower, price=price)

    # Get the model's output
    output = model.invoke(prompt_input)

    # Parse the model's output (the result is a dict)
    parsed_output = output_parser.parse(output)

    # Add "flower" and "price" to the parsed output
    parsed_output['flower'] = flower
    parsed_output['price'] = price

    # Append the parsed output to the DataFrame as a new row
    df.loc[len(df)] = parsed_output

# Print the results as a list of dicts
print(df.to_dict(orient='records'))

# Save the DataFrame to a CSV file
df.to_csv("flowers_with_descriptions.csv", index=False)

[{'flower': '玫瑰', 'price': '50', 'description': 'Luxuriate in the beauty of this 50 yuan rose, with its deep red petals and delicate aroma.', 'reason': 'This description emphasizes the elegance and beauty of the rose, which will be sure to draw attention.'}, 
{'flower': '百合', 'price': '30', 'description': '30元的百合，象征着坚定的爱情，带给你的是温暖而持久的情感！', 'reason': '百合是象征爱情的花，写出这样的描述能让顾客更容易感受到百合所带来的爱意。'}, 
{'flower': '康乃馨', 'price': '20', 'description': 'This beautiful carnation is the perfect way to show your love and appreciation. Its vibrant pink color is sure to brighten up any room!', 'reason': 'The description is short, clear and appealing, emphasizing the beauty and color of the carnation while also invoking a sense of love and appreciation.'}]
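To understand what the parsing step is doing, I tried to imitate it in plain Python. As far as I can tell, the format instructions ask the model to answer with a JSON snippet wrapped in a markdown code fence, and the parser then pulls that JSON out. The sketch below is my own approximation, not LangChain's implementation: `parse_json_block` is a hypothetical helper and `raw_output` is a hand-written sample, not real model output.

```python
import json
import re

FENCE = "`" * 3  # three backticks, built this way so the example contains no literal fence

# Hand-written sample of what a model reply might look like (not real output).
raw_output = (
    "Here is the copy you asked for:\n"
    + FENCE + "json\n"
    + '{"description": "A deep red rose.", "reason": "Highlights color."}\n'
    + FENCE
)

def parse_json_block(text, expected_keys):
    """Extract the first fenced JSON block (or use the whole text) and check its keys."""
    match = re.search(FENCE + r"(?:json)?\s*(.*?)" + FENCE, text, re.DOTALL)
    payload = match.group(1) if match else text
    data = json.loads(payload.strip())
    missing = [k for k in expected_keys if k not in data]
    if missing:
        raise ValueError(f"missing keys: {missing}")
    return data

parsed = parse_json_block(raw_output, ["description", "reason"])
# Same as in the loop above: attach the inputs to the parsed result.
parsed["flower"] = "rose"
parsed["price"] = "50"
print(parsed)
```

Once the reply is a dict with known keys, adding it as a DataFrame row or writing it to CSV is ordinary data handling, which is the whole reason for asking for structured output.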