Image Editing: Qwen-Image-Edit and Flux.1 Kontext

Qwen-Image-Edit

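Introduction

Qwen-Image-Edit is the image editing version of Qwen-Image. It is trained on top of the 20B Qwen-Image model and supports editing images with text instructions. The input image is passed to Qwen2.5-VL to extract semantics and to the VAE Encoder to encode appearance, which gives the model dual-path editing: a semantic path (the CLIP input) and an appearance path (the VAE input).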

Features include:

Precise Text Editing: Qwen-Image-Edit supports bilingual (Chinese and English) text editing, allowing direct addition, deletion, and modification of text in images while preserving the original text size, font, and style.
Dual Semantic/Appearance Editing: Qwen-Image-Edit supports not only low-level visual appearance editing (such as style transfer, addition, deletion, modification, etc.) but also high-level visual semantic editing (such as IP creation, object rotation, etc.).
Strong Cross-Benchmark Performance: Evaluations on multiple public benchmarks show that Qwen-Image-Edit achieves SOTA in editing tasks, making it a powerful foundational model for image generation.

Models

Original model

Model storage location

📂 ComfyUI/
├── 📂 models/
│   ├── 📂 diffusion_models/
│   │   └── qwen_image_edit_fp8_e4m3fn.safetensors
│   ├── 📂 loras/
│   │   └── Qwen-Image-Lightning-4steps-V1.0.safetensors
│   ├── 📂 vae/
│   │   └── qwen_image_vae.safetensors
│   └── 📂 text_encoders/
│       └── qwen_2.5_vl_7b_fp8_scaled.safetensors
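
A quick way to confirm the layout above before opening the workflow is to check that each file exists. The following is a minimal sketch, not part of ComfyUI: the function name `missing_models` and the assumption that ComfyUI lives in a folder called `ComfyUI` are illustrative only, and the file list is copied from the tree above.

```python
from pathlib import Path

# Expected files, copied from the directory tree above (relative to ComfyUI/models).
QWEN_IMAGE_EDIT_FILES = [
    "diffusion_models/qwen_image_edit_fp8_e4m3fn.safetensors",
    "loras/Qwen-Image-Lightning-4steps-V1.0.safetensors",
    "vae/qwen_image_vae.safetensors",
    "text_encoders/qwen_2.5_vl_7b_fp8_scaled.safetensors",
]


def missing_models(comfyui_root, expected=QWEN_IMAGE_EDIT_FILES):
    """Return the expected model files that are not present under <root>/models."""
    models_dir = Path(comfyui_root) / "models"
    return [rel for rel in expected if not (models_dir / rel).is_file()]


if __name__ == "__main__":
    # "ComfyUI" is a hypothetical install path; adjust it to your setup.
    for rel in missing_models("ComfyUI"):
        print(f"missing: models/{rel}")
```

The LoRA entry is only needed for the 4-step setup, so it can be dropped from the list if you do not use it.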

GGUF models
- Qwen-Image-Edit-GGUF

Model storage location

Type	Name	Location	Download
Main Model	Qwen-Image	`ComfyUI/models/unet`	GGUF (this repo)
Main Text Encoder	Qwen2.5-VL-7B	`ComfyUI/models/text_encoders`	Safetensors / GGUF
Text_Encoder (mmproj)	Qwen2.5-VL-7B-Instruct-mmproj-BF16	`ComfyUI/models/text_encoders` (same folder as your main text encoder)	GGUF (this repo)
VAE	Qwen-Image VAE	`ComfyUI/models/vae`	Safetensors (this repo)

📂 ComfyUI/
├── 📂 models/
│   ├── 📂 unet/
│   │   └── Qwen_Image_Edit-Q8_0.gguf
│   ├── 📂 loras/
│   │   └── Qwen-Image-Lightning-4steps-V1.0.safetensors
│   ├── 📂 vae/
│   │   └── qwen_image_vae.safetensors
│   └── 📂 text_encoders/
│       └── qwen_2.5_vl_7b_fp8_scaled.safetensors

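Workflow

Model loading
- Use the Load Diffusion Model node to load qwen_image_edit_fp8_e4m3fn.safetensors, or use the Unet Loader (GGUF) node to load Qwen_Image_Edit-Q8_0.gguf.
- Use the Load CLIP node to load qwen_2.5_vl_7b_fp8_scaled.safetensors.
- Use the Load VAE node to load qwen_image_vae.safetensors.
Image loading
- Use the Load Image node to load the image you want to edit.
Prompts
- Use two TextEncodeQwenImageEdit nodes to enter the prompts, one for the positive prompt and one for the negative prompt.
Image scaling node: keeps the total pixel count under one million.
LoRA
- To speed up generation with the 4-step Lightning LoRA, load Qwen-Image-Lightning-4steps-V1.0.safetensors with the LoraLoaderModelOnly node and connect it into the model path.
For the steps and cfg values in the KSampler node, refer to the table below: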

Model	Steps	CFG
Official	50	4.0
fp8_e4m3fn	20	2.5
fp8_e4m3fn + 4steps LoRA	4	1.0
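
The two numeric constraints in this workflow, the pixel budget for the input image and the steps/CFG pairs in the table above, can be sketched in Python. This is an illustration under assumptions, not how the ComfyUI nodes are implemented: the function name, the preset keys, and rounding each side down to a multiple of 8 are hypothetical choices.

```python
import math

MAX_PIXELS = 1_000_000  # "under one million total pixels", from the workflow notes

# Steps/CFG pairs copied from the table above; the dictionary keys are made-up labels.
SAMPLER_PRESETS = {
    "official": {"steps": 50, "cfg": 4.0},
    "fp8_e4m3fn": {"steps": 20, "cfg": 2.5},
    "fp8_e4m3fn+4steps_lora": {"steps": 4, "cfg": 1.0},
}


def fit_under_pixel_budget(width, height, max_pixels=MAX_PIXELS, multiple=8):
    """Shrink (width, height), roughly keeping aspect ratio, so width*height < max_pixels.

    Images already under the budget are returned unchanged. Rounding sides down
    to a multiple of `multiple` is an assumption, not a documented requirement.
    """
    if width * height < max_pixels:
        return width, height
    scale = math.sqrt(max_pixels / (width * height))
    new_w = max(multiple, int(width * scale) // multiple * multiple)
    new_h = max(multiple, int(height * scale) // multiple * multiple)
    # Rounding can land exactly on the budget; trim the longer side until strictly under.
    while new_w * new_h >= max_pixels:
        if new_w >= new_h:
            new_w -= multiple
        else:
            new_h -= multiple
    return new_w, new_h


if __name__ == "__main__":
    print(fit_under_pixel_budget(3000, 1000))
    print(SAMPLER_PRESETS["fp8_e4m3fn+4steps_lora"])
```

In an actual workflow the scaling is done by the image scaling node and the values are typed into KSampler; the sketch only makes the arithmetic explicit.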

Flux.1 Kontext

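Introduction

Flux.1 Kontext is a multimodal image editing model. It accepts a text prompt and an image together, understands the image content, and performs precise edits. Its main features are: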

Character Consistency: Preserves unique elements in images across multiple scenes and environments, such as reference characters or objects in the image.
Editing: Makes targeted modifications to specific elements in the image without affecting other parts.
Style Reference: Generates novel scenes while preserving the unique style of the reference image according to text prompts.
Interactive Speed: Minimal latency in image generation and editing.

Version Information

FLUX.1 Kontext [pro]
- Commercial version, focused on rapid iterative editing
FLUX.1 Kontext [max]
- Experimental version with stronger prompt adherence
FLUX.1 Kontext [dev]
- Open source version (used in this tutorial), 12B parameters, mainly for research

Models

Flux.1 Kontext Dev original weights and community versions

Original version from Black Forest Labs: flux1-kontext-dev.safetensors
FP8 version provided by ComfyOrg: flux1-dev-kontext_fp8_scaled.safetensors
Community GGUF version: FLUX.1-Kontext-dev-GGUF
Nunchaku accelerated-inference version: nunchaku-flux.1-kontext-dev

Text Encoder

clip_l.safetensors
t5xxl_fp16.safetensors or t5xxl_fp8_e4m3fn_scaled.safetensors

VAE

ae.safetensors

Model storage location

📂 ComfyUI/
├── 📂 models/
│   ├── 📂 diffusion_models/
│   │   └── flux1-dev-kontext_fp8_scaled.safetensors or flux1-kontext-dev.safetensors
│   ├── 📂 unet/
│   │   └── e.g. flux1-kontext-dev-Q4_K_M.gguf  # download only if you need the GGUF version
│   ├── 📂 vae/
│   │   └── ae.safetensors
│   └── 📂 text_encoders/
│       ├── clip_l.safetensors
│       └── t5xxl_fp16.safetensors or t5xxl_fp8_e4m3fn_scaled.safetensors

Workflow

Model loading
- In the Load Diffusion Model node, load flux1-dev-kontext_fp8_scaled.safetensors, or in the Unet Loader (GGUF) node, load flux1-kontext-dev-Q4_K_M.gguf (or another GGUF version).
- In the DualCLIP Load node, make sure clip_l.safetensors and either t5xxl_fp16.safetensors or t5xxl_fp8_e4m3fn_scaled.safetensors are loaded.
- In the Load VAE node, make sure ae.safetensors is loaded.
Image loading
- In the Load Image(from output) node, load the provided input image.
Prompts
- Edit the prompt in the CLIP Text Encode node. Only English is supported.
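
Both model sections follow the same rule for the main model: .safetensors files go in diffusion_models and are loaded with Load Diffusion Model, while .gguf files go in unet and are loaded with Unet Loader (GGUF). A small sketch of that rule follows; the helper name `loader_for` is hypothetical and nothing here calls into ComfyUI.

```python
from pathlib import PurePosixPath


def loader_for(filename):
    """Map a main-model file to (models subfolder, loader node name) by extension."""
    suffix = PurePosixPath(filename).suffix.lower()
    if suffix == ".gguf":
        return "unet", "Unet Loader (GGUF)"
    if suffix == ".safetensors":
        return "diffusion_models", "Load Diffusion Model"
    raise ValueError(f"unsupported model file: {filename}")


if __name__ == "__main__":
    for name in ("flux1-dev-kontext_fp8_scaled.safetensors",
                 "flux1-kontext-dev-Q4_K_M.gguf"):
        folder, node = loader_for(name)
        print(f"{name}: ComfyUI/models/{folder}, load with {node}")
```

The rule covers only the main diffusion model. Text encoders and the VAE keep their own folders and loader nodes as listed above.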

Hands-on

Basic prompt principles

Be Specific and Clear - Use precise descriptions and avoid vague terms.
Step-by-step Editing - Break complex modifications into multiple simple steps.
Explicit Preservation - State what should remain unchanged.
Verb Selection - Use "change" or "replace" rather than "transform".

Best-practice templates

Object Modification: "Change [object] to [new state], keep [content to preserve] unchanged"

Style Transfer: "Transform to [specific style], while maintaining [composition/character/other] unchanged"

Background Replacement: "Change the background to [new background], keep the subject in the exact same position and pose"

Text Editing: "Replace '[original text]' with '[new text]', maintain the same font style"

Remember: The more specific, the better. The model is good at understanding detailed instructions and maintaining consistency.
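
The four templates above are simple fill-in-the-blank strings, so they can be kept in one place and filled programmatically. A minimal sketch follows; the dictionary keys, field names, and `build_prompt` helper are hypothetical, while the template wording follows the list above.

```python
# Template wording follows the best-practice list; placeholder names are made up.
TEMPLATES = {
    "object_modification": "Change {obj} to {new_state}, keep {preserve} unchanged",
    "style_transfer": "Transform to {style}, while maintaining {preserve} unchanged",
    "background_replacement": (
        "Change the background to {background}, "
        "keep the subject in the exact same position and pose"
    ),
    "text_editing": "Replace '{old}' with '{new}', maintain the same font style",
}


def build_prompt(kind, **fields):
    """Fill one template; raises KeyError if the kind or a field is missing."""
    return TEMPLATES[kind].format(**fields)


if __name__ == "__main__":
    print(build_prompt("object_modification",
                       obj="the car color", new_state="red",
                       preserve="the background"))
    print(build_prompt("text_editing", old="joy", new="BFL"))
```

Raising on a missing field is deliberate: a template with an unfilled slot would otherwise reach the model as a vague prompt.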

Change

Basic Modifications

Simple and direct: "Change the car color to red"
Maintain style: "Change to daytime while maintaining the same style of the painting"
Change the subject’s clothing into traditional Japanese attire
Change the direction of the man’s head so the camera captures it from the exact side profile
Change the camera angle so it faces directly towards the dog
Change the wall with a brick wall

Character Replacement Example

Change the flower pot with a canvas

Change the character to a male one. He should also be fashionable and wear sunglasses. Add the text 'Comfy Creating in ComfyUI' at the top of the image. Pay attention to the capitalization of the text. The character should not block the text.

Perspective Change Example

Change the perspective to view the scene of the statue drinking beer from behind. Note that one of the statue's arms is missing.

Add

Add purple lipstick to the lips
Add afro hairstyle to the woman
Add Spinosaurus to the background
Add an armchair and a coffee table next to the lamp

Remove

Object Removal Example

Remove the blush from the face
Remove hair from the woman
Remove the hat and sunglasses from the subject
Remove the fog
Remove the headphones and cables

Remove all UI text elements from the image. Keep the feeling that the characters and scene are in water. Also, remove the green UI elements at the bottom.

Style Transfer

Style change

Turn this illustration into a realistic portrait photography style. Use young characters, and keep their green eye color and black lipstick. The characters have snow-white skin, with their eyelids slightly drooping and their eyes looking a little forward, showing an elegant and quiet expression.

Clearly name style: "Transform to Bauhaus art style"
Describe characteristics: "Transform to oil painting with visible brushstrokes, thick paint texture"
Preserve composition: "Change to Bauhaus style while maintaining the original composition"
Convert the image into a 3D animated style
Turn the hummingbird into a polished glass bird, transform the flower into a delicate crystal sculpture

Text Editing

Change the text ‘STOP’ to ‘DUR’ on the sign, keeping the same style and deformation
Write ‘Happy New Year’ on the blackboard
Use quotes: "Replace 'joy' with 'BFL'"
Maintain format: "Replace text while maintaining the same font style"
Multiple rounds of editing:
Round 1: Change "ComfyUI News" to "Qwen Image Edit"
Round 2: Change "Qwen Image Edit is now available in ComfyUI" to "Edit the image and keep the style consistent"
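
Multi-round editing means the output of one round becomes the input of the next. The loop below sketches that chaining. The `edit_fn` argument stands in for whatever actually runs the workflow (for example, one queued ComfyUI run per round); it is a hypothetical callable, and the stand-in used in the demo only records prompts.

```python
def edit_in_rounds(image, prompts, edit_fn):
    """Apply prompts one at a time, feeding each result into the next round.

    Returns every intermediate result, so a bad round can be inspected
    or redone without repeating the earlier ones.
    """
    history = [image]
    for prompt in prompts:
        history.append(edit_fn(history[-1], prompt))
    return history


if __name__ == "__main__":
    rounds = [
        'Change "ComfyUI News" to "Qwen Image Edit"',
        'Change "Qwen Image Edit is now available in ComfyUI" '
        'to "Edit the image and keep the style consistent"',
    ]

    # Stand-in editor: the "image" is just a tuple of the prompts applied so far.
    def fake_edit(img, prompt):
        return img + (prompt,)

    for step, result in enumerate(edit_in_rounds((), rounds, fake_edit)):
        print(step, len(result))
```

Keeping the intermediate results matches the step-by-step principle: each round makes one small change that can be checked before moving on.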

Character Consistency

Specific description: "The woman with short black hair" instead of “she”
Preserve features: "while maintaining the same facial features, hairstyle, and expression"
Step-by-step modifications: Change background first, then actions
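
The "specific description instead of a pronoun" advice can be turned into a quick pre-flight check on a prompt. The sketch below is only a heuristic: the pronoun list and the function name `vague_references` are assumptions, and a flagged word is a hint to reread the prompt, not an error.

```python
import re

# Pronouns that leave the subject unspecified; the list is an assumption.
VAGUE_PRONOUNS = {
    "he", "she", "it", "they", "him", "her", "them",
    "his", "hers", "its", "their",
}


def vague_references(prompt):
    """Return the sorted set of pronouns found in the prompt (case-insensitive)."""
    words = re.findall(r"[a-z]+", prompt.lower())
    return sorted({w for w in words if w in VAGUE_PRONOUNS})


if __name__ == "__main__":
    for p in ("Make her smile",
              "Make the woman with short black hair smile"):
        print(p, vague_references(p))
```

A pronoun is fine once the subject has been named in the same prompt, so treat the result as a prompt-writing reminder only.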

Image Editing Example:

Make the man look happy
Make the man look sad
Make the man strong and muscular
Make the dog slim
Replace all the bullets with shimmering, multi-colored butterflies
Show this man eating breakfast at a table
Replace the Taj Mahal with a detailed medieval castle, ensuring all reflections in water, glass, or surrounding surfaces match the new structure accurately. Keep lighting, shadows, and perspective consistent with the original scene for full realism
Transform the scene into a peaceful winter landscape. Cover the ground and trees with a fresh layer of snow. Add gentle snowflakes falling from the sky, with a soft white overcast atmosphere

Have the character seated at the bar of a busy pub. Behind her, the scene is lively, but the background stays mostly dark. She’s holding a wine glass, facing the bar, with the light lighting up the foreground—giving the whole image a cinematic look and feel.
