Image Editing: Qwen-Image-Edit and Flux.1 Kontext

Qwen-Image-Edit

Introduction

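Qwen-Image-Edit is the image-editing version of Qwen-Image. It is trained on top of the 20B Qwen-Image model and edits images from text instructions. The input image is passed both to Qwen2.5-VL, which extracts its semantics, and to the VAE Encoder, which encodes its appearance, so the model can edit along both a semantic channel and an appearance channel.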

Features include:

  • Precise Text Editing: Qwen-Image-Edit supports bilingual (Chinese and English) text editing, allowing direct addition, deletion, and modification of text in images while preserving the original text size, font, and style.
  • Dual Semantic/Appearance Editing: Qwen-Image-Edit supports not only low-level visual appearance editing (such as style transfer, addition, deletion, modification, etc.) but also high-level visual semantic editing (such as IP creation, object rotation, etc.).
  • Strong Cross-Benchmark Performance: Evaluations on multiple public benchmarks show that Qwen-Image-Edit achieves SOTA in editing tasks, making it a powerful foundational model for image generation.

Models

Model storage location

📂 ComfyUI/
├── 📂 models/
│   ├── 📂 diffusion_models/
│   │   └── qwen_image_edit_fp8_e4m3fn.safetensors
│   ├── 📂 loras/
│   │   └── Qwen-Image-Lightning-4steps-V1.0.safetensors
│   ├── 📂 vae/
│   │   └── qwen_image_vae.safetensors
│   └── 📂 text_encoders/
│       └── qwen_2.5_vl_7b_fp8_scaled.safetensors
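
Before opening the workflow, it can help to confirm that the files sit where ComfyUI looks for them. The following is a minimal sketch, assuming the layout shown above; `find_missing_models` and `EXPECTED_QWEN_FILES` are hypothetical helpers written for this article, not part of ComfyUI, and the `"ComfyUI"` install path is an assumption.

```python
from pathlib import Path

# Expected files, relative to ComfyUI/models/, copied from the layout above.
# The LoRA is only needed if you use the 4-step Lightning speed-up.
EXPECTED_QWEN_FILES = {
    "diffusion_models": ["qwen_image_edit_fp8_e4m3fn.safetensors"],
    "loras": ["Qwen-Image-Lightning-4steps-V1.0.safetensors"],
    "vae": ["qwen_image_vae.safetensors"],
    "text_encoders": ["qwen_2.5_vl_7b_fp8_scaled.safetensors"],
}


def find_missing_models(comfyui_root, expected=EXPECTED_QWEN_FILES):
    """Return the expected model files (relative to models/) that do not exist."""
    models_dir = Path(comfyui_root) / "models"
    missing = []
    for folder, filenames in expected.items():
        for name in filenames:
            if not (models_dir / folder / name).is_file():
                missing.append(f"{folder}/{name}")
    return missing


if __name__ == "__main__":
    # "ComfyUI" is an assumed install path; adjust it to your setup.
    for path in find_missing_models("ComfyUI"):
        print("missing:", path)
```

The same function can check any other layout by passing a different `expected` mapping.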

Model storage location (GGUF version)

| Type | Name | Location | Download |
| --- | --- | --- | --- |
| Main Model | Qwen-Image | ComfyUI/models/unet | GGUF (this repo) |
| Main Text Encoder | Qwen2.5-VL-7B | ComfyUI/models/text_encoders | Safetensors / GGUF |
| Text Encoder (mmproj) | Qwen2.5-VL-7B-Instruct-mmproj-BF16 | ComfyUI/models/text_encoders (same folder as your main text encoder) | GGUF (this repo) |
| VAE | Qwen-Image VAE | ComfyUI/models/vae | Safetensors (this repo) |

📂 ComfyUI/
├── 📂 models/
│   ├── 📂 unet/
│   │   └── Qwen_Image_Edit-Q8_0.gguf
│   ├── 📂 loras/
│   │   └── Qwen-Image-Lightning-4steps-V1.0.safetensors
│   ├── 📂 vae/
│   │   └── qwen_image_vae.safetensors
│   └── 📂 text_encoders/
│       └── qwen_2.5_vl_7b_fp8_scaled.safetensors

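Workflow

  • Model loading

    • Use the Load Diffusion Model node to load qwen_image_edit_fp8_e4m3fn.safetensors, or use the Unet Loader (GGUF) node to load Qwen_Image_Edit-Q8_0.gguf.
    • Use the Load CLIP node to load qwen_2.5_vl_7b_fp8_scaled.safetensors.
    • Use the Load VAE node to load qwen_image_vae.safetensors.
  • Image loading

    • Use the Load Image node to load the image to edit.
  • Prompt

    • Use two TextEncodeQwenImageEdit nodes, one for the positive prompt and one for the negative prompt.
  • Image scaling: use an image scaling node to keep the total pixel count under one million.

  • LoRA

    • To speed up generation with the 4-step Lightning LoRA, load Qwen-Image-Lightning-4steps-V1.0.safetensors with the LoraLoaderModelOnly node and connect it into the model path.
  • Set steps and cfg in the KSampler node according to the following table: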

| Model | Steps | CFG |
| --- | --- | --- |
| Official | 50 | 4.0 |
| fp8_e4m3fn | 20 | 2.5 |
| fp8_e4m3fn + 4steps LoRA | 4 | 1.0 |
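
The pixel budget and the sampler settings can be written down as a small helper. This is only a sketch: `fit_to_pixel_budget` and `SAMPLER_PRESETS` are hypothetical names, the preset values are copied from the steps/CFG table, and rounding dimensions down to a multiple of 8 is an assumption rather than something the workflow states. Inside ComfyUI the scaling is done by a node, not by this code.

```python
import math

MAX_PIXELS = 1_000_000  # pixel budget from the workflow notes

# steps / cfg presets copied from the steps/CFG table
SAMPLER_PRESETS = {
    "official": {"steps": 50, "cfg": 4.0},
    "fp8_e4m3fn": {"steps": 20, "cfg": 2.5},
    "fp8_e4m3fn+4steps_lora": {"steps": 4, "cfg": 1.0},
}


def fit_to_pixel_budget(width, height, max_pixels=MAX_PIXELS, multiple=8):
    """Scale (width, height) down, keeping the aspect ratio, so that
    width * height is at most max_pixels.

    Dimensions are rounded down to a multiple of `multiple` (assumption:
    latent models usually want sizes divisible by 8). Images already within
    the budget are not upscaled, only rounded.
    """
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")
    scale = min(1.0, math.sqrt(max_pixels / (width * height)))
    new_w = max(multiple, int(width * scale) // multiple * multiple)
    new_h = max(multiple, int(height * scale) // multiple * multiple)
    return new_w, new_h


if __name__ == "__main__":
    print(fit_to_pixel_budget(4000, 3000))
    print(SAMPLER_PRESETS["fp8_e4m3fn+4steps_lora"])
```

Rounding down rather than to the nearest multiple keeps the result within the budget at the cost of a few pixels.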

Flux.1 Kontext

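Introduction

Flux.1 Kontext is a multimodal image-editing model. It takes a text prompt and an image together, understands the image content, and performs precise edits. Its main features are: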

  • Character Consistency: Preserves unique elements in images across multiple scenes and environments, such as reference characters or objects in the image.
  • Editing: Makes targeted modifications to specific elements in the image without affecting other parts.
  • Style Reference: Generates novel scenes while preserving the unique style of the reference image according to text prompts.
  • Interactive Speed: Minimal latency in image generation and editing.

Version Information

  • FLUX.1 Kontext [pro]
    • Commercial version, focused on rapid iterative editing
  • FLUX.1 Kontext [max]
    • Experimental version with stronger prompt adherence
  • FLUX.1 Kontext [dev]  
    • Open source version (used in this tutorial), 12B parameters, mainly for research

Models

Flux.1 Kontext Dev original model weights and community versions

Text Encoder

VAE

Model storage location

📂 ComfyUI/
├── 📂 models/
│   ├── 📂 diffusion_models/
│   │   └── flux1-dev-kontext_fp8_scaled.safetensors or flux1-kontext-dev.safetensors
│   ├── 📂 unet/
│   │   └── e.g. flux1-kontext-dev-Q4_K_M.gguf  # download only if you need the GGUF version
│   ├── 📂 vae/
│   │   └── ae.safetensors
│   └── 📂 text_encoders/
│       ├── clip_l.safetensors
│       └── t5xxl_fp16.safetensors or t5xxl_fp8_e4m3fn_scaled.safetensors

Workflow

  • Model loading
    • In the Load Diffusion Model node, load flux1-dev-kontext_fp8_scaled.safetensors, or in the Unet Loader (GGUF) node, load flux1-kontext-dev-Q4_K_M.gguf (or another version).
    • In the DualCLIP Load node, make sure clip_l.safetensors and either t5xxl_fp16.safetensors or t5xxl_fp8_e4m3fn_scaled.safetensors are loaded.
    • In the Load VAE node, make sure ae.safetensors is loaded.
  • Image loading
    • In the Load Image (from output) node, load the provided input image.
  • Prompt
    • Edit the prompt in the CLIP Text Encode node; only English is supported.
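
For readers who script ComfyUI instead of using the canvas, the loader and prompt steps above can be sketched as a fragment of an API-format graph. This is deliberately partial: the sampling, latent, and decode nodes are left out, so it is not a runnable workflow on its own. The class names (`UNETLoader`, `DualCLIPLoader`, `VAELoader`, `LoadImage`, `CLIPTextEncode`) and their input names are assumptions based on ComfyUI's built-in nodes and may differ between versions. Plain `LoadImage` stands in for the Load Image (from output) node. The function name, the node IDs, and the example image file name are hypothetical.

```python
import json


def build_kontext_loader_nodes(image_name, prompt, use_fp8_t5=True):
    """Return a partial ComfyUI API-format graph for the Flux.1 Kontext loaders.

    Class and input names are assumptions; check them against a workflow
    exported from your own ComfyUI version before relying on them.
    """
    t5_name = (
        "t5xxl_fp8_e4m3fn_scaled.safetensors" if use_fp8_t5 else "t5xxl_fp16.safetensors"
    )
    return {
        "1": {"class_type": "UNETLoader",
              "inputs": {"unet_name": "flux1-dev-kontext_fp8_scaled.safetensors",
                         "weight_dtype": "default"}},
        "2": {"class_type": "DualCLIPLoader",
              "inputs": {"clip_name1": "clip_l.safetensors",
                         "clip_name2": t5_name,
                         "type": "flux"}},
        "3": {"class_type": "VAELoader", "inputs": {"vae_name": "ae.safetensors"}},
        "4": {"class_type": "LoadImage", "inputs": {"image": image_name}},
        # ["2", 0] links this input to output 0 of node "2" (the text encoders).
        "5": {"class_type": "CLIPTextEncode",
              "inputs": {"text": prompt, "clip": ["2", 0]}},
    }


if __name__ == "__main__":
    nodes = build_kontext_loader_nodes("input.png", "Change the car color to red")
    print(json.dumps(nodes, indent=2))
```

Exporting a working workflow from ComfyUI in API format and comparing it with this fragment is the safest way to confirm the names.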

Hands-on

Basic prompting principles

  • Be Specific and Clear - Use precise descriptions and avoid vague terms.
  • Step-by-step Editing - Break complex modifications into multiple simple steps.
  • Explicit Preservation - State what should remain unchanged.
  • Verb Selection - Use "change" or "replace" rather than "transform".

Best-practice templates

Object Modification: "Change [object] to [new state], keep [content to preserve] unchanged"

Style Transfer: "Transform to [specific style], while maintaining [composition/character/other] unchanged"

Background Replacement: "Change the background to [new background], keep the subject in the exact same position and pose"

Text Editing: "Replace '[original text]' with '[new text]', maintain the same font style"

Remember: The more specific, the better. The model is good at understanding detailed instructions and maintaining consistency.
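
The four templates above lend themselves to a small prompt builder. This is a sketch for illustration: `PROMPT_TEMPLATES` and `build_prompt` are hypothetical names, and the placeholder field names are chosen here, not taken from any library.

```python
# Templates copied from the four patterns above, with named placeholders.
PROMPT_TEMPLATES = {
    "object_modification": "Change {object} to {new_state}, keep {preserve} unchanged",
    "style_transfer": "Transform to {style}, while maintaining {preserve} unchanged",
    "background_replacement": (
        "Change the background to {background}, "
        "keep the subject in the exact same position and pose"
    ),
    "text_editing": "Replace '{old_text}' with '{new_text}', maintain the same font style",
}


def build_prompt(kind, **fields):
    """Fill the template named `kind`; raises KeyError if a placeholder is missing."""
    if kind not in PROMPT_TEMPLATES:
        raise ValueError(f"unknown template: {kind!r}")
    return PROMPT_TEMPLATES[kind].format(**fields)


if __name__ == "__main__":
    print(build_prompt("object_modification",
                       object="the car color", new_state="red",
                       preserve="the background"))
    print(build_prompt("text_editing", old_text="joy", new_text="BFL"))
```

Keeping the preservation clause inside the template makes it hard to forget to state what should stay unchanged.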

Change

Basic Modifications

  • Simple and direct: "Change the car color to red"
  • Maintain style: "Change to daytime while maintaining the same style of the painting"
  • Change the subject’s clothing into traditional Japanese attire
  • Change the direction of the man’s head so the camera captures it from the exact side profile
  • Change the camera angle so it faces directly towards the dog
  • Change the wall to a brick wall

Character Replacement Example

  • Change the flower pot to a canvas
  • Change the character to a male one. He should also be fashionable and wear sunglasses. Add the text 'Comfy Creating in ComfyUI' at the top of the image. Pay attention to the capitalization of the text. The character should not block the text.

Perspective Change Example

  • Change the perspective to view the scene of the statue drinking beer from behind. Note that one of the statue's arms is missing.

Add

  • Add purple lipstick to the lips
  • Add afro hairstyle to the woman
  • Add Spinosaurus to the background
  • Add an armchair and a coffee table next to the lamp

Remove

Object Removal Example

  • Remove the blush from the face
  • Remove hair from the woman
  • Remove the hat and sunglasses from the subject
  • Remove the fog
  • Remove the headphones and cables
  • Remove all UI text elements from the image. Keep the feeling that the characters and scene are in water. Also, remove the green UI elements at the bottom.

Style Transfer

  • Style change: "Turn this illustration into a realistic portrait photography style. Use young characters, and keep their green eye color and black lipstick. The characters have snow-white skin, with their eyelids slightly drooping and their eyes looking a little forward, showing an elegant and quiet expression."
  • Clearly name style: "Transform to Bauhaus art style"
  • Describe characteristics: "Transform to oil painting with visible brushstrokes, thick paint texture"
  • Preserve composition: "Change to Bauhaus style while maintaining the original composition"
  • Convert the image into a 3D animated style
  • Turn the hummingbird into a polished glass bird, transform the flower into a delicate crystal sculpture

Text Editing

  • Change the text ‘STOP’ to ‘DUR’ on the sign, keeping the same style and deformation
  • Write ‘Happy New Year’ on the blackboard
  • Use quotes: "Replace 'joy' with 'BFL'"
  • Maintain format: "Replace text while maintaining the same font style"
  • Multiple rounds of editing:
    • Round 1: Change "ComfyUI News" to "Qwen Image Edit"
    • Round 2: Change "Qwen Image Edit is now available in ComfyUI" to "Edit the image and keep the style consistent"
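
Multi-round editing amounts to feeding each result back in as the next input. The loop below is a minimal sketch of that idea. `apply_edit_rounds` is a hypothetical helper, and `edit_fn` stands for whatever actually runs the edit (for example a call into your ComfyUI workflow), which is not shown here.

```python
def apply_edit_rounds(image, prompts, edit_fn):
    """Apply `prompts` one at a time, feeding each result into the next round.

    `edit_fn(image, prompt)` must return the edited image. The returned list
    holds the original image followed by the result of every round, so an
    unsatisfactory round can be inspected or redone.
    """
    history = [image]
    for prompt in prompts:
        history.append(edit_fn(history[-1], prompt))
    return history


if __name__ == "__main__":
    rounds = [
        'Change "ComfyUI News" to "Qwen Image Edit"',
        'Change "Qwen Image Edit is now available in ComfyUI" '
        'to "Edit the image and keep the style consistent"',
    ]

    # Stand-in editor for demonstration: records which prompts were applied.
    def fake_edit(image, prompt):
        return image + [prompt]

    for step, result in enumerate(apply_edit_rounds([], rounds, fake_edit)):
        print(step, len(result), "edit(s) applied")
```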

Character Consistency

  • Specific description: "The woman with short black hair" instead of “she”
  • Preserve features: "while maintaining the same facial features, hairstyle, and expression"
  • Step-by-step modifications: Change background first, then actions
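
These guidelines can be partly checked before a prompt is sent. The sketch below flags pronouns that leave the subject ambiguous and builds a prompt that names the subject and states what to preserve. Both function names and the pronoun list are hypothetical and purely heuristic: a flagged word is a hint to reread the prompt, not an error.

```python
import re

# Pronouns that leave the model guessing which subject is meant (heuristic list).
VAGUE_PRONOUNS = {"he", "she", "it", "they", "him", "her", "them"}


def find_vague_pronouns(prompt):
    """Return the vague pronouns found in `prompt`, in order of appearance."""
    words = re.findall(r"[a-z']+", prompt.lower())
    return [word for word in words if word in VAGUE_PRONOUNS]


def consistent_edit_prompt(subject, change, preserve):
    """Name the subject explicitly and append a preservation clause."""
    return f"{subject} {change}, while maintaining the same {preserve}"


if __name__ == "__main__":
    print(find_vague_pronouns("Make her smile"))
    print(consistent_edit_prompt(
        "The woman with short black hair",
        "is now standing on a beach",
        "facial features, hairstyle, and expression",
    ))
```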

Image Editing Example:

  • Make the man look happy
  • Make the man look sad
  • Make the man strong and muscular
  • Make the dog slim
  • Replace all the bullets with shimmering, multi-colored butterflies
  • Show this man eating breakfast at a table
  • Replace the Taj Mahal with a detailed medieval castle, ensuring all reflections in water, glass, or surrounding surfaces match the new structure accurately. Keep lighting, shadows, and perspective consistent with the original scene for full realism
  • Transform the scene into a peaceful winter landscape. Cover the ground and trees with a fresh layer of snow. Add gentle snowflakes falling from the sky, with a soft white overcast atmosphere
  • Have the character seated at the bar of a busy pub. Behind her, the scene is lively, but the background stays mostly dark. She’s holding a wine glass, facing the bar, with the light lighting up the foreground, giving the whole image a cinematic look and feel.

References

blog.comfy.org/p/qwen-imag…

docs.comfy.org/tutorials/i…

wiro.ai/blog/nano-b…

docs.comfy.org/tutorials/a…

comfyui-wiki.com/zh/tutorial…