Programmers Switching to AI LLMs: How to Call an LLM API for the First Time | With Complete Runnable Code

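This is the 11th core-topic note in my series on moving from programming into AI large language models (LLMs). It includes code snippets you can run directly.

Current stage: still learning topic by topic, working from individual points toward the whole picture and building an LLM knowledge base from 0 to 1.

This series is updated regularly. Follow me for more notes on the career switch.

This post walks through the workflow for calling an LLM for the first time.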

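Workflow

1. Get two key values from your LLM provider: base_url and api_key.
2. Create a .env file in the project directory:

# Configuration file for API keys and other sensitive environment variables
# Do not commit this file to your code repository!

# SiliconFlow API key
OPENAI_API_KEY="sk-xxx"
OPENAI_API_BASE="https://api.siliconflow.cn/v1"

3. In the project code, load the environment variables first (load_dotenv comes from the python-dotenv package):

import os
from dotenv import load_dotenv

load_dotenv()

base_url = os.getenv("OPENAI_API_BASE")
api_key = os.getenv("OPENAI_API_KEY")

4. Get the name of the model you want to call from your provider. Note: copy the full model name exactly as the provider lists it. Even for the same model, such as DeepSeek, each provider uses a different name.

With these four steps done, you can call the LLM through the API.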
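A missing or empty variable in .env tends to surface later as a confusing authentication or connection error. The sketch below shows one way to fail early instead. `LLMSettings` and `load_settings` are hypothetical names introduced for illustration, and the example reads from an in-memory mapping rather than the real environment so it runs without any setup.

```python
import os
from typing import Mapping, NamedTuple


class LLMSettings(NamedTuple):
    """Connection settings for an OpenAI-compatible endpoint (hypothetical helper)."""
    base_url: str
    api_key: str


def load_settings(env: Mapping[str, str] = os.environ) -> LLMSettings:
    """Read OPENAI_API_BASE / OPENAI_API_KEY and fail early if either is missing."""
    required = ("OPENAI_API_BASE", "OPENAI_API_KEY")
    missing = [name for name in required if not env.get(name, "").strip()]
    if missing:
        raise ValueError(f"Missing environment variables: {', '.join(missing)}")
    # Strip stray whitespace that can sneak in when editing .env by hand
    return LLMSettings(
        base_url=env["OPENAI_API_BASE"].strip(),
        api_key=env["OPENAI_API_KEY"].strip(),
    )


# Example with an in-memory mapping instead of the real environment
settings = load_settings({
    "OPENAI_API_BASE": " https://api.siliconflow.cn/v1 ",
    "OPENAI_API_KEY": "sk-xxx",
})
print(settings.base_url)
```

In a real project you would call `load_dotenv()` first and then `load_settings()` with no argument, so the values come from `os.environ`.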

Complete code

import os
from dotenv import load_dotenv
from openai import OpenAI

class IntelligentQA:
    def __init__(self):
        load_dotenv()
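        self.openai_api_key = os.getenv("OPENAI_API_KEY")
        self.openai_api_base = os.getenv("OPENAI_API_BASE")
        # Check both values so a missing key is reported as a ValueError too
        if not self.openai_api_key:
            raise ValueError("OPENAI_API_KEY environment variable is not set")
        if not self.openai_api_base:
            raise ValueError("OPENAI_API_BASE environment variable is not set")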

        self.client = OpenAI(
            api_key=self.openai_api_key,
            base_url=self.openai_api_base,
        )

        self.model_name = "deepseek-ai/DeepSeek-V3.2"

    def ask(self, question:str) -> str:
        response = self.client.chat.completions.create(
            model=self.model_name,
            messages=[
                {"role": "user", "content": question}
            ],
        )

        answer = response.choices[0].message.content
        return answer

def demo():
    print("Intelligent Q&A system demo")
    try:
        qa = IntelligentQA()

        question = "What is Python?"
        print(f"Question: {question}")
        answer = qa.ask(question)
        print(f"Answer: {answer}")
    except ValueError as e:
        print(f"Failed to initialize the Q&A system: {e}")
        return

if __name__ == "__main__":
    demo()

The main parameters of the OpenAI client constructor are:

OpenAI(
    api_key='your-api-key',
    organization='org-xxxxxxxx',
    base_url='https://api.openai.com/v1/',
    timeout=60.0,
    max_retries=2,
    http_client=None,
    default_headers=None,
    default_query=None,
    project=None,
)

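There are several ways to call an LLM with the OpenAI SDK:

The client approach (most common)

=== The client approach (most common) ===

1. Create the client:

   client = OpenAI(api_key='your-api-key')

   Characteristics:
   - Creates an OpenAI instance
   - Holds the API key and configuration
   - Can be reused across calls
   - The most common approach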

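2. Use the client:

   # Chat completions
   response = client.chat.completions.create(
       model='gpt-3.5-turbo',
       messages=[{'role': 'user', 'content': 'Hello!'}]
   )

   # Embeddings
   response = client.embeddings.create(
       model='text-embedding-ada-002',
       input='Hello world'
   )

   # Model list
   models = client.models.list()

   Characteristics:
   - Every operation goes through the client
   - client.chat.completions.create()
   - client.embeddings.create()
   - client.models.list()

3. Advantages:

   - Simple and intuitive
   - Easy to understand
   - Easy to debug
   - The client can be reused
   - Officially recommended

4. Disadvantages:

   - You have to create a client instance
   - You have to manage the client's lifecycle
   - Relatively heavy for very simple scripts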

Parameters of client.chat.completions.create:

=== Required parameters ===

1. The simplest call:

   response = client.chat.completions.create(
       model='gpt-3.5-turbo',
       messages=[{'role': 'user', 'content': 'Hello!'}]
   )

   Required parameters:
   - model: str (model name)
     * Example: 'gpt-3.5-turbo', 'gpt-4'
     * Required: yes
     * Purpose: selects the model to use

   - messages: list (message list)
     * Example: [{'role': 'user', 'content': 'Hello!'}]
     * Required: yes
     * Purpose: the conversation so far, as a list of messages

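2. The model parameter:

   Example models (OpenAI names; other providers use their own names):
   - gpt-4: more capable, more expensive
   - gpt-4-turbo: the Turbo variant of GPT-4
   - gpt-3.5-turbo: fast and cheap
   - gpt-3.5-turbo-16k: supports a longer context

   Example:
   response = client.chat.completions.create(
       model='gpt-3.5-turbo',
       messages=[{'role': 'user', 'content': 'Hello!'}]
   )

   Notes:
   - Models differ in capability and price
   - gpt-4 is stronger but costs more
   - gpt-3.5-turbo is faster and cheaper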

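3. The messages parameter:

   Message structure:
   - role: str (role)
     * 'system': system message
     * 'user': user message
     * 'assistant': model reply
   - content: str (content)
     * The text of the message

   Example:
   messages = [
       {'role': 'system', 'content': 'You are a professional assistant.'},
       {'role': 'user', 'content': 'Hello!'},
       {'role': 'assistant', 'content': 'Hello! How can I help you?'},
       {'role': 'user', 'content': 'Tell me about Python.'}
   ]

   Notes:
   - system: sets the model's role and behavior
   - user: the user's question or request
   - assistant: the model's earlier replies (used for multi-turn conversations)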
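The API itself is stateless, so a multi-turn conversation only works if the caller resends the earlier messages each time. The sketch below keeps that history in a small class. `Conversation` is a hypothetical helper, not part of the OpenAI SDK, and the assistant reply is hard-coded where a real program would use response.choices[0].message.content.

```python
from typing import Dict, List


class Conversation:
    """Keeps the message history for a multi-turn chat (hypothetical helper)."""

    def __init__(self, system_prompt: str = "") -> None:
        self.messages: List[Dict[str, str]] = []
        if system_prompt:
            self.messages.append({"role": "system", "content": system_prompt})

    def add_user(self, text: str) -> None:
        self.messages.append({"role": "user", "content": text})

    def add_assistant(self, text: str) -> None:
        self.messages.append({"role": "assistant", "content": text})


chat = Conversation("You are a professional assistant.")
chat.add_user("Hello!")
# In real use this text would come from the previous API response
chat.add_assistant("Hello! How can I help you?")
chat.add_user("Tell me about Python.")

# chat.messages now has the shape expected by messages=... in chat.completions.create
for message in chat.messages:
    print(message["role"], ":", message["content"])
```

Because the whole history is sent on every call, prompt token usage grows with the length of the conversation.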

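=== Common optional parameters ===

1. temperature:

   response = client.chat.completions.create(
       model='gpt-3.5-turbo',
       messages=[{'role': 'user', 'content': 'Hello!'}],
       temperature=0.7
   )

   Parameter details:
   - Type: float
   - Range: 0.0 ~ 2.0
   - Default: 1.0
   - Required: no

   Meaning:
   - Controls how random the output is
   - Lower: more deterministic and consistent
   - Higher: more random and creative

   Examples:
   - 0.0: very deterministic, almost always the same output
   - 0.7: balanced, creative without being too random
   - 1.0: the default, moderate randomness
   - 1.5: very creative, may drift off topic

   Typical settings:
   - Code generation: 0.0 ~ 0.3
   - Question answering: 0.3 ~ 0.7
   - Creative writing: 0.7 ~ 1.5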

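2. max_tokens (maximum number of tokens):

   response = client.chat.completions.create(
       model='gpt-3.5-turbo',
       messages=[{'role': 'user', 'content': 'Hello!'}],
       max_tokens=100
   )

   Parameter details:
   - Type: int
   - Default: not set (the provider's or model's own limit applies)
   - Required: no

   Meaning:
   - Caps the number of tokens in the model's reply
   - Smaller: shorter replies, possibly cut off mid-sentence
   - Larger: room for longer replies

   Typical settings:
   - Short answers: 50 ~ 100
   - Medium answers: 100 ~ 500
   - Long answers: 500 ~ 2000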

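3. n (number of completions):

   response = client.chat.completions.create(
       model='gpt-3.5-turbo',
       messages=[{'role': 'user', 'content': 'Hello!'}],
       n=3
   )

   Parameter details:
   - Type: int
   - Default: 1
   - Required: no

   Meaning:
   - Generates several alternative replies for the same prompt
   - Higher: more replies, but higher cost, because every reply's output tokens are billed

   Typical settings:
   - Single reply: n=1
   - A few options: n=3 ~ 5
   - Brainstorming: n=5 ~ 10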

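4. stop (stop sequences):

   response = client.chat.completions.create(
       model='gpt-3.5-turbo',
       messages=[{'role': 'user', 'content': 'Hello!'}],
       stop=['\n', 'END']
   )

   Parameter details:
   - Type: str or list
   - Default: None
   - Required: no

   Meaning:
   - Generation stops as soon as one of these sequences would be produced
   - Can be a single sequence or several

   Typical settings:
   - Single-line output: stop=['\n']
   - Explicit end marker: stop=['END']
   - Several stop sequences: stop=['\n', 'END', 'STOP']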

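5. top_p (nucleus sampling):

   response = client.chat.completions.create(
       model='gpt-3.5-turbo',
       messages=[{'role': 'user', 'content': 'Hello!'}],
       top_p=0.9
   )

   Parameter details:
   - Type: float
   - Range: 0.0 ~ 1.0
   - Default: 1.0
   - Required: no

   Meaning:
   - Controls the diversity of the output by sampling only from the most likely tokens whose probabilities add up to top_p
   - Lower: more conservative
   - Higher: more diverse

   Notes:
   - Generally adjust either temperature or top_p, not both
   - temperature is the recommended one to start with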
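To make temperature and top_p less abstract, here is a small sketch of the textbook formulation: temperature rescales the scores before the softmax, and top_p keeps only the most likely tokens until their cumulative probability reaches the threshold. The token scores are made up, and real providers may implement sampling differently, so treat this as an illustration of the idea rather than a description of any specific service.

```python
import math
from typing import Dict


def softmax_with_temperature(logits: Dict[str, float], temperature: float) -> Dict[str, float]:
    """Turn raw scores into probabilities; temperature must be > 0 in this sketch."""
    scaled = {tok: score / temperature for tok, score in logits.items()}
    peak = max(scaled.values())  # subtract the max for numerical stability
    exps = {tok: math.exp(s - peak) for tok, s in scaled.items()}
    total = sum(exps.values())
    return {tok: e / total for tok, e in exps.items()}


def top_p_filter(probs: Dict[str, float], top_p: float) -> Dict[str, float]:
    """Keep the most likely tokens until their cumulative probability reaches top_p."""
    kept: Dict[str, float] = {}
    cumulative = 0.0
    for tok, p in sorted(probs.items(), key=lambda item: item[1], reverse=True):
        kept[tok] = p
        cumulative += p
        if cumulative >= top_p:
            break
    total = sum(kept.values())
    return {tok: p / total for tok, p in kept.items()}  # renormalize to sum to 1


toy_logits = {"Python": 2.0, "Java": 1.0, "Rust": 0.5, "COBOL": -1.0}  # made-up scores

# Lower temperature concentrates probability on the top token
for t in (0.2, 1.0, 1.5):
    probs = softmax_with_temperature(toy_logits, t)
    print(t, {tok: round(p, 3) for tok, p in probs.items()})

# With top_p=0.9 the least likely token is dropped before sampling
print(top_p_filter(softmax_with_temperature(toy_logits, 1.0), 0.9))
```

Both knobs narrow or widen the same distribution, which is one reason the usual advice is to tune only one of them at a time.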

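=== Other optional parameters ===

1. presence_penalty:

   response = client.chat.completions.create(
       model='gpt-3.5-turbo',
       messages=[{'role': 'user', 'content': 'Hello!'}],
       presence_penalty=0.5
   )

   Parameter details:
   - Type: float
   - Range: -2.0 ~ 2.0
   - Default: 0.0
   - Required: no

   Meaning:
   - Penalizes tokens that have already appeared at least once, regardless of how often
   - Higher: more likely to use new words and move to new topics
   - Lower: more likely to reuse words

   Typical settings:
   - Avoid repetition: 0.5 ~ 1.0
   - Encourage diversity: 1.0 ~ 2.0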

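2. frequency_penalty:

   response = client.chat.completions.create(
       model='gpt-3.5-turbo',
       messages=[{'role': 'user', 'content': 'Hello!'}],
       frequency_penalty=0.5
   )

   Parameter details:
   - Type: float
   - Range: -2.0 ~ 2.0
   - Default: 0.0
   - Required: no

   Meaning:
   - Penalizes tokens in proportion to how often they have already appeared
   - Higher: less likely to repeat the same words over and over
   - Lower: more likely to repeat frequently used words

   Typical settings:
   - Avoid repetition: 0.5 ~ 1.0
   - Encourage diversity: 1.0 ~ 2.0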

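3. user (end-user identifier):

   response = client.chat.completions.create(
       model='gpt-3.5-turbo',
       messages=[{'role': 'user', 'content': 'Hello!'}],
       user='user-123'
   )

   Parameter details:
   - Type: str
   - Default: None
   - Required: no

   Meaning:
   - A unique ID that identifies the end user
   - Used for monitoring and abuse prevention

   Typical settings:
   - Applications: pass your own user ID
   - Monitoring: track usage per user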

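4. stream (streaming output):

   response = client.chat.completions.create(
       model='gpt-3.5-turbo',
       messages=[{'role': 'user', 'content': 'Hello!'}],
       stream=True
   )

   Parameter details:
   - Type: bool
   - Default: False
   - Required: no

   Meaning:
   - Whether to stream the output
   - True: the reply arrives incrementally, chunk by chunk
   - False: the reply arrives all at once

   Typical settings:
   - Real-time display: stream=True
   - Wait for the complete reply: stream=False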

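5. logprobs (log probabilities):

   response = client.chat.completions.create(
       model='gpt-3.5-turbo',
       messages=[{'role': 'user', 'content': 'Hello!'}],
       logprobs=True,
       top_logprobs=5
   )

   Parameter details:
   - logprobs: bool
     * Default: False
     * Whether to return log probabilities for the output tokens
   - top_logprobs: int
     * Default: None
     * Returns the N most likely tokens at each position (requires logprobs=True)

   Typical settings:
   - Debugging: logprobs=True
   - Analysis: top_logprobs=5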
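The difference between presence_penalty and frequency_penalty is easier to see in numbers. OpenAI's documentation describes the adjustment as subtracting `count * frequency_penalty` plus a flat `presence_penalty` (if the token has appeared at all) from each token's score. The sketch below applies that formula to made-up scores. `apply_penalties` is a hypothetical function, and other providers may compute penalties differently.

```python
from collections import Counter
from typing import Dict, List


def apply_penalties(
    logits: Dict[str, float],
    generated_tokens: List[str],
    presence_penalty: float = 0.0,
    frequency_penalty: float = 0.0,
) -> Dict[str, float]:
    """Lower the score of tokens that already appear in the generated text."""
    counts = Counter(generated_tokens)
    adjusted = {}
    for token, score in logits.items():
        count = counts.get(token, 0)
        seen = 1.0 if count > 0 else 0.0
        # frequency part grows with every repeat; presence part is applied once
        adjusted[token] = score - count * frequency_penalty - seen * presence_penalty
    return adjusted


toy_logits = {"cat": 1.0, "dog": 1.0, "bird": 1.0}  # made-up scores
history = ["cat", "cat", "dog"]                     # tokens generated so far

penalized = apply_penalties(toy_logits, history, presence_penalty=0.5, frequency_penalty=0.5)
print(penalized)
```

Here "cat" (seen twice) is pushed down further than "dog" (seen once), while "bird" (never seen) keeps its original score.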

Parameter usage examples:

=== Complete examples ===

1. Basic call:

   response = client.chat.completions.create(
       model='gpt-3.5-turbo',
       messages=[{'role': 'user', 'content': 'Hello!'}]
   )

   Notes:
   - The simplest call
   - Uses only the required parameters

2. Common parameters:

   response = client.chat.completions.create(
       model='gpt-3.5-turbo',
       messages=[{'role': 'user', 'content': 'Hello!'}],
       temperature=0.7,
       max_tokens=100
   )

   Notes:
   - Uses the common optional parameters
   - Controls randomness and length

3. Multi-turn conversation:

   response = client.chat.completions.create(
       model='gpt-3.5-turbo',
       messages=[
           {'role': 'system', 'content': 'You are a professional assistant.'},
           {'role': 'user', 'content': 'Hello!'},
           {'role': 'assistant', 'content': 'Hello! How can I help you?'},
           {'role': 'user', 'content': 'Tell me about Python.'}
       ],
       temperature=0.7
   )

   Notes:
   - Multi-turn conversation
   - Includes the earlier messages as history

4. Generating several options:

   response = client.chat.completions.create(
       model='gpt-3.5-turbo',
       messages=[{'role': 'user', 'content': 'Write a title.'}],
       n=5,
       temperature=1.0
   )

   Notes:
   - Generates 5 different titles
   - A higher temperature increases diversity

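5. Streaming output:

   response = client.chat.completions.create(
       model='gpt-3.5-turbo',
       messages=[{'role': 'user', 'content': 'Hello!'}],
       stream=True
   )

   for chunk in response:
       # Guard against chunks with an empty choices list (some providers send them)
       if chunk.choices and chunk.choices[0].delta.content is not None:
           print(chunk.choices[0].delta.content, end='', flush=True)

   Notes:
   - Streaming output
   - Displays the reply in real time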
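With stream=True you usually want both the live display and the complete text afterwards. The sketch below collects the deltas while printing them. `collect_stream` and `fake_chunk` are hypothetical helpers, and the fake chunks are stand-in objects that only mimic the `choices[0].delta.content` shape so the example runs without an API call.

```python
from types import SimpleNamespace
from typing import Iterable


def collect_stream(chunks: Iterable) -> str:
    """Print each text delta as it arrives and return the full reply."""
    parts = []
    for chunk in chunks:
        if not chunk.choices:      # skip chunks that carry no choices
            continue
        delta = chunk.choices[0].delta.content
        if delta:                  # skip None and empty strings
            print(delta, end="", flush=True)
            parts.append(delta)
    print()
    return "".join(parts)


def fake_chunk(text):
    """Build a stand-in object shaped like a streamed chunk (illustration only)."""
    delta = SimpleNamespace(content=text)
    return SimpleNamespace(choices=[SimpleNamespace(delta=delta)])


fake_stream = [
    fake_chunk("Hel"),
    fake_chunk("lo"),
    fake_chunk(None),                 # e.g. a chunk that only carries the role
    SimpleNamespace(choices=[]),      # e.g. a chunk with no choices at all
    fake_chunk("!"),
]
full_reply = collect_stream(fake_stream)
```

In real code you would pass the object returned by `client.chat.completions.create(..., stream=True)` in place of `fake_stream`.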

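Other approaches

=== Other approaches ===

1. Calling module-level API functions directly (not recommended):

   # Old SDK versions (before openai 1.0)
   import openai

   # Call the API function directly
   response = openai.ChatCompletion.create(
       model='gpt-3.5-turbo',
       messages=[{'role': 'user', 'content': 'Hello!'}]
   )

   Characteristics:
   - No client needs to be created
   - Calls the API function directly
   - The style used by old SDK versions

   Problems:
   - Outdated: this call no longer works in openai 1.0 and later
   - Not recommended
   - Newer SDK versions use the client approach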

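2. The async client approach:

   import asyncio
   from openai import AsyncOpenAI

   # Create the async client
   async_client = AsyncOpenAI(api_key='your-api-key')

   async def main():
       # await is only valid inside an async function
       response = await async_client.chat.completions.create(
           model='gpt-3.5-turbo',
           messages=[{'role': 'user', 'content': 'Hello!'}]
       )
       print(response.choices[0].message.content)

   asyncio.run(main())

   Characteristics:
   - Asynchronous calls
   - Suited to high-concurrency scenarios
   - Requires async/await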
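3. Custom http_client:

   import httpx

   # Create a custom HTTP client
   # (recent httpx versions use proxy=; older versions used proxies=)
   http_client = httpx.Client(
       proxy='http://your-proxy:port'
   )

   # Use the custom HTTP client
   client = OpenAI(
       api_key='your-api-key',
       http_client=http_client
   )

   Characteristics:
   - Lets you set a proxy
   - Lets you customize the HTTP configuration
   - Suited to environments that require a proxy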
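The async client from item 2 pays off when many requests run at the same time. A common pattern is to launch them with asyncio.gather while a semaphore caps how many are in flight, which helps stay under provider rate limits. In the sketch below, `ask_many` is a hypothetical helper and `fake_ask` is a stand-in for an awaited `async_client.chat.completions.create` call, so the example runs without network access.

```python
import asyncio
from typing import Awaitable, Callable, List


async def ask_many(
    ask: Callable[[str], Awaitable[str]],
    questions: List[str],
    max_concurrency: int = 3,
) -> List[str]:
    """Run ask(question) for every question, at most max_concurrency at a time."""
    semaphore = asyncio.Semaphore(max_concurrency)

    async def limited(question: str) -> str:
        async with semaphore:
            return await ask(question)

    # gather returns results in the same order as the inputs
    return await asyncio.gather(*(limited(q) for q in questions))


async def fake_ask(question: str) -> str:
    """Stand-in for a real async API call (illustration only)."""
    await asyncio.sleep(0.01)
    return f"answer to: {question}"


answers = asyncio.run(ask_many(fake_ask, ["q1", "q2", "q3", "q4"], max_concurrency=2))
print(answers)
```

To use it with a real client, replace `fake_ask` with an async function that awaits the chat completion and returns `response.choices[0].message.content`.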

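Summary

Most common: the client approach

Create an OpenAI instance
Run every operation through the client
Officially recommended

Other approaches:
Async client: suited to high concurrency
Custom http_client: suited to environments that need a proxy
Direct API functions: outdated, not recommended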

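API response fields

=== Fields returned by the chat completions API ===

1. Structure of the response object:

   response = client.chat.completions.create(...)

   Main attributes:
   - response.id: request ID
   - response.object: object type
   - response.created: creation time
   - response.model: model used
   - response.choices: list of choices
   - response.usage: token usage

2. Field details:

   response.id:
   - Type: str
   - Example: 'chatcmpl-123abc456def'
   - Meaning: unique identifier of the request

   response.object:
   - Type: str
   - Example: 'chat.completion'
   - Meaning: object type

   response.created:
   - Type: int
   - Example: 1699012345
   - Meaning: creation time (Unix timestamp)

   response.model:
   - Type: str
   - Example: 'gpt-3.5-turbo'
   - Meaning: model used

   response.choices:
   - Type: list
   - Meaning: list of choices (usually just one, unless n > 1)

   response.usage:
   - Type: object
   - Meaning: usage information (token counts)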

3. The choices field:

   response.choices[0]:
   - index: int (index of the choice, usually 0)
   - message: object (message object)
   - finish_reason: str (why generation stopped)

   response.choices[0].message:
   - role: str (role, usually 'assistant')
   - content: str (reply text)

   response.choices[0].finish_reason:
   - Type: str
   - Example: 'stop' / 'length' / 'content_filter'
   - Meaning: why generation stopped
     * stop: finished normally
     * length: reached the maximum length
     * content_filter: stopped by content filtering

4. The usage field:

   response.usage:
   - prompt_tokens: int (number of input tokens)
   - completion_tokens: int (number of output tokens)
   - total_tokens: int (total number of tokens)

   Example:
   response.usage.prompt_tokens: 10
   response.usage.completion_tokens: 20
   response.usage.total_tokens: 30

5. Getting the reply text:

   # Option 1: read it directly
   content = response.choices[0].message.content

   # Option 2: iterate over choices
   for choice in response.choices:
       content = choice.message.content

   # Option 3: collect all the details
   response_id = response.id  # avoid shadowing the built-in id()
   model = response.model
   content = response.choices[0].message.content
   prompt_tokens = response.usage.prompt_tokens
   completion_tokens = response.usage.completion_tokens

=== Complete examples ===

1. Full response:

   # Response object
   response = client.chat.completions.create(...)

   # Response attributes
   print(f"ID: {response.id}")
   print(f"Object: {response.object}")
   print(f"Created: {response.created}")
   print(f"Model: {response.model}")

   # choices
   for choice in response.choices:
       print(f"Index: {choice.index}")
       print(f"Role: {choice.message.role}")
       print(f"Content: {choice.message.content}")
       print(f"Finish Reason: {choice.finish_reason}")

   # usage
   print(f"Prompt Tokens: {response.usage.prompt_tokens}")
   print(f"Completion Tokens: {response.usage.completion_tokens}")
   print(f"Total Tokens: {response.usage.total_tokens}")

2. Common operations:

   # Get the reply text
   content = response.choices[0].message.content

   # Get the token counts
   prompt_tokens = response.usage.prompt_tokens
   completion_tokens = response.usage.completion_tokens
   total_tokens = response.usage.total_tokens

   # Get the finish reason
   finish_reason = response.choices[0].finish_reason

   # Get the model
   model = response.model

3. Practical use:

   # Call the API
   response = client.chat.completions.create(
       model='gpt-3.5-turbo',
       messages=[{'role': 'user', 'content': 'Hello!'}]
   )

   # Get the reply
   answer = response.choices[0].message.content
   print(f"AI reply: {answer}")

   # Get the token count
   tokens = response.usage.total_tokens
   print(f"Tokens: {tokens}")

   # Estimate the cost
   cost = tokens * 0.002 / 1000  # assumes $0.002 per 1,000 tokens; check your provider's pricing
   print(f"Cost: ${cost:.4f}")

4. Error handling:

   try:
       response = client.chat.completions.create(...)
       content = response.choices[0].message.content
   except Exception as e:
       print(f"Error: {e}")
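The response fields above can be combined into one small helper that returns the reply, flags truncated output, and estimates cost with separate input and output prices (many providers bill the two differently). `summarize_response` is a hypothetical function, the prices are placeholders rather than real rates, and `fake_response` is a stand-in object with the same attribute names so the sketch runs without an API call.

```python
from types import SimpleNamespace


def summarize_response(response, input_price_per_1k: float, output_price_per_1k: float) -> dict:
    """Extract the reply, flag truncation, and estimate cost from token usage."""
    if not response.choices:
        raise ValueError("response contains no choices")
    choice = response.choices[0]
    usage = response.usage
    cost = (usage.prompt_tokens * input_price_per_1k
            + usage.completion_tokens * output_price_per_1k) / 1000
    return {
        "content": choice.message.content,
        "truncated": choice.finish_reason == "length",  # reply hit the token limit
        "total_tokens": usage.total_tokens,
        "estimated_cost": cost,
    }


# Stand-in object with the same attribute names as a chat completion response
fake_response = SimpleNamespace(
    choices=[SimpleNamespace(
        message=SimpleNamespace(role="assistant", content="Python is a programming language."),
        finish_reason="stop",
    )],
    usage=SimpleNamespace(prompt_tokens=10, completion_tokens=20, total_tokens=30),
)

# Placeholder prices per 1,000 tokens, not real provider rates
summary = summarize_response(fake_response, input_price_per_1k=0.001, output_price_per_1k=0.002)
print(summary)
```

Checking `truncated` is worth the extra line: when finish_reason is 'length', the reply was cut off by the token limit and may need a larger max_tokens or a follow-up request.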

=== Fields returned by other APIs ===

1. Model list API:

   # Call the API
   models = client.models.list()

   # Returned fields:
   # models.data: list (list of models)
   # models.data[0].id: str (model ID)
   # models.data[0].object: str (object type)
   # models.data[0].created: int (creation time)
   # models.data[0].owned_by: str (owner)

   # Example:
   for model in models.data:
       print(f"Model: {model.id}")
       print(f"Owner: {model.owned_by}")

2. Embeddings API:

   # Call the API
   response = client.embeddings.create(
       model='text-embedding-ada-002',
       input='Hello world'
   )

   # Returned fields:
   # response.object: str (object type)
   # response.model: str (model used)
   # response.data: list (list of embeddings)
   # response.data[0].index: int (index)
   # response.data[0].embedding: list (embedding vector)
   # response.usage: object (token usage)

   # Example:
   embedding = response.data[0].embedding
   print(f"Embedding dimensions: {len(embedding)}")
   print(f"First 5 values: {embedding[:5]}")

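3. Comparison:

   | API | Main fields | Purpose |
   |-----|-------------|---------|
   | Chat completions | choices, usage | Get the model's reply |
   | Model list | data | List available models |
   | Embeddings | data, usage | Get text embeddings |

   Summary:
   - Chat completions: the most common, returns the model's reply
   - Model list: returns the available models
   - Embeddings: returns text embedding vectors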
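Embedding vectors are mostly useful when compared with each other, typically with cosine similarity. The sketch below shows that comparison on tiny made-up vectors. Real embeddings from `response.data[0].embedding` have hundreds or thousands of dimensions, and `cosine_similarity` here is a plain-Python helper written for illustration.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Return the cosine of the angle between two vectors (1.0 means same direction)."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cannot compare a zero vector")
    return dot / (norm_a * norm_b)


# Tiny made-up vectors standing in for real embeddings
vec_cat = [0.9, 0.1, 0.0]
vec_kitten = [0.8, 0.2, 0.1]
vec_car = [0.0, 0.1, 0.9]

print(cosine_similarity(vec_cat, vec_kitten))  # close to 1: similar direction
print(cosine_similarity(vec_cat, vec_car))     # close to 0: nearly unrelated
```

Texts with similar meaning should produce vectors with a higher cosine similarity, which is the basis for semantic search.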

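In summary:

response = client.chat.completions.create(...) returns an object with these fields: id, object, created, model, choices, and usage.

Each entry in choices has index, message (with role and content), and finish_reason.

usage has prompt_tokens, completion_tokens, and total_tokens.

Common operations:

# Get the reply text
content = response.choices[0].message.content
# Get the token count
tokens = response.usage.total_tokens
# Get the finish reason
finish_reason = response.choices[0].finish_reason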
