图像生成功能详解：使用 OpenAI 接口轻松生成与编辑图像图像生成功能详解：使用 OpenAI 接口轻松生成与编辑图像

图像生成功能详解：使用 OpenAI 接口轻松生成与编辑图像

图像生成工具允许模型根据文本提示生成或编辑图像，也可结合已有图片进行图像编辑操作。它基于强大的 GPT 图像模型（如 gpt-image-1），并自动优化你的提示内容以获得更优效果。

如需了解更多使用方式，可参考官方的图像生成指南。

一、使用方法

在请求中包含 image_generation 工具后，模型将自动决定是否调用图像生成功能，并根据你的提示或上传图像自动作图。

生成结果中 image_generation_call 的响应包含 base64 编码的图像内容。

二、图像生成示例

JavaScript 示例

import OpenAI from "openai";
const openai = new OpenAI({ baseURL: "https://api.aaaaapi.com" });

const response = await openai.responses.create({
  model: "gpt-4.1-mini",
  input: "生成一只拥抱着水獭的灰色虎斑猫，水獭戴着橙色围巾",
  tools: [{ type: "image_generation" }],
});

const imageData = response.output
  .filter((output) => output.type === "image_generation_call")
  .map((output) => output.result);

if (imageData.length > 0) {
  const fs = await import("fs");
  fs.writeFileSync("otter.png", Buffer.from(imageData[0], "base64"));
}

Python 示例

from openai import OpenAI
import base64

client = OpenAI(base_url="https://api.aaaaapi.com")

response = client.responses.create(
    model="gpt-4.1-mini",
    input="生成一只拥抱着水獭的灰猫，水獭戴着橙色围巾",
    tools=[{"type": "image_generation"}],
)

image_data = [
    output.result
    for output in response.output
    if output.type == "image_generation_call"
]

if image_data:
    with open("otter.png", "wb") as f:
        f.write(base64.b64decode(image_data[0]))

三、强制调用图像工具

你可以通过如下参数强制执行图像生成：

"tool_choice": { "type": "image_generation" }

四、工具参数配置（可选）

你可以自定义以下图像输出参数：

size：图像尺寸（如 1024x1024, 1024x1536）
quality：画质（low / medium / high）
format：输出格式（如 PNG、JPEG）
compression：压缩率（适用于 JPEG/WebP）
background：是否透明背景

提示：size、quality 和 background 支持 auto 模式，模型将自动选择最佳配置。

详细文档见：图像输出配置参考

五、自动优化提示词

主调用模型（如 gpt-4.1）会自动优化用户输入的提示内容，以获得更理想图像。

{
  "revised_prompt": "一只拥抱水獭的灰色虎斑猫。水獭戴着橙色围巾。整体风格温馨治愈，动物可爱友好。"
}

建议使用“画”、“绘制”、“编辑”等动词来提高图像理解度。

六、多轮图像编辑

支持通过 response_id 或 image_id 实现图像多轮编辑，让你逐步优化生成图像。

JavaScript 多轮编辑

const followUp = await openai.responses.create({
  model: "gpt-4.1-mini",
  previous_response_id: response.id,
  input: "现在让图像更逼真",
  tools: [{ type: "image_generation" }],
});

Python 多轮编辑

response_fwup = client.responses.create(
    model="gpt-4.1-mini",
    previous_response_id=response.id,
    input="现在让图像更逼真",
    tools=[{"type": "image_generation"}],
)

使用 image_id 继续绘制（JS 示例）

const response_fwup = await openai.responses.create({
  model: "gpt-4.1-mini",
  input: [
    { role: "user", content: [{ type: "input_text", text: "更真实些" }] },
    { type: "image_generation_call", id: imageGenerationCalls[0].id },
  ],
  tools: [{ type: "image_generation" }],
});

七、流式图像生成（Streaming）

图像生成工具支持流式输出，即边生成边返回预览图像，提升用户体验。

Python 示例

stream = client.images.generate(
    prompt="绘制一条由白猫头鹰羽毛组成的河流，穿过宁静的冬季山谷",
    model="gpt-image-1",
    stream=True,
    partial_images=2,
)

for event in stream:
    if event.type == "image_generation.partial_image":
        idx = event.partial_image_index
        image_base64 = event.b64_json
        with open(f"river{idx}.png", "wb") as f:
            f.write(base64.b64decode(image_base64))

八、支持的模型

以下模型支持图像生成工具的调用：

gpt-4o
gpt-4o-mini
gpt-4.1
gpt-4.1-mini
gpt-4.1-nano
o3

实际图像始终由 gpt-image-1 模型生成，调用工具的主模型负责解释和优化提示。

九、结语与延伸阅读

图像生成是 AI 应用的重要能力之一，通过合理设置提示词与参数，你可以轻松生成丰富且高质量的图像内容，支持多轮编辑、自动优化和流式预览等强大功能。

📌 温馨提示：如果你需要一个稳定、灵活、支持自定义 API host 的图像生成接入方案，推荐使用：OpenAI 接口中转平台（支持 SDK / Stream / 工具链）