Qwen-Image-Edit
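Introduction
Qwen-Image-Edit is the image-editing version of Qwen-Image. It is trained on top of the 20B Qwen-Image model and supports editing images with text instructions. The input image is passed both to Qwen2.5-VL, which extracts semantics, and to the VAE encoder, which encodes appearance, giving the model dual-path (semantic and appearance) editing capability.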
Features include:
- Precise Text Editing: Qwen-Image-Edit supports bilingual (Chinese and English) text editing, allowing direct addition, deletion, and modification of text in images while preserving the original text size, font, and style.
- Dual Semantic/Appearance Editing: Qwen-Image-Edit supports not only low-level visual appearance editing (such as style transfer, addition, deletion, modification, etc.) but also high-level visual semantic editing (such as IP creation, object rotation, etc.).
- Strong Cross-Benchmark Performance: Evaluations on multiple public benchmarks show that Qwen-Image-Edit achieves SOTA in editing tasks, making it a powerful foundational model for image generation.
Models
Model storage location
📂 ComfyUI/
├── 📂 models/
│ ├── 📂 diffusion_models/
│ │ └── qwen_image_edit_fp8_e4m3fn.safetensors
│ ├── 📂 loras/
│ │ └── Qwen-Image-Lightning-4steps-V1.0.safetensors
│ ├── 📂 vae/
│ │ └── qwen_image_vae.safetensors
│ └── 📂 text_encoders/
│   └── qwen_2.5_vl_7b_fp8_scaled.safetensors
- GGUF models
Model storage location
| Type | Name | Location | Download |
|---|---|---|---|
| Main Model | Qwen-Image | ComfyUI/models/unet | GGUF (this repo) |
| Main Text Encoder | Qwen2.5-VL-7B | ComfyUI/models/text_encoders | Safetensors / GGUF |
| Text_Encoder (mmproj) | Qwen2.5-VL-7B-Instruct-mmproj-BF16 | ComfyUI/models/text_encoders (same folder as your main text encoder) | GGUF (this repo) |
| VAE | Qwen-Image VAE | ComfyUI/models/vae | Safetensors (this repo) |
📂 ComfyUI/
├── 📂 models/
│ ├── 📂 unet/
│ │ └── Qwen_Image_Edit-Q8_0.gguf
│ ├── 📂 loras/
│ │ └── Qwen-Image-Lightning-4steps-V1.0.safetensors
│ ├── 📂 vae/
│ │ └── qwen_image_vae.safetensors
│ └── 📂 text_encoders/
│   └── qwen_2.5_vl_7b_fp8_scaled.safetensors
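Before opening the workflow, it can help to confirm that the files sit where the trees above expect them. The following is a minimal sketch, not part of ComfyUI: the helper name `find_missing_models` and the `"ComfyUI"` root path are assumptions for illustration, and the file list is copied from the safetensors layout above.

```python
from pathlib import Path

# Expected files relative to the ComfyUI root, copied from the tree above.
# For the GGUF layout, swap the first entry for
# "models/unet/Qwen_Image_Edit-Q8_0.gguf".
QWEN_IMAGE_EDIT_FILES = [
    "models/diffusion_models/qwen_image_edit_fp8_e4m3fn.safetensors",
    "models/loras/Qwen-Image-Lightning-4steps-V1.0.safetensors",
    "models/vae/qwen_image_vae.safetensors",
    "models/text_encoders/qwen_2.5_vl_7b_fp8_scaled.safetensors",
]


def find_missing_models(comfyui_root, expected=QWEN_IMAGE_EDIT_FILES):
    """Return the expected relative paths that are not files under comfyui_root."""
    root = Path(comfyui_root)
    return [rel for rel in expected if not (root / rel).is_file()]


if __name__ == "__main__":
    # "ComfyUI" is a placeholder; point it at your own install directory.
    for rel in find_missing_models("ComfyUI"):
        print("missing:", rel)
```

An empty result means every listed file was found; anything printed is a path to download or move.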
Workflow
- Model loading
  - Use the Load Diffusion Model node to load qwen_image_edit_fp8_e4m3fn.safetensors, or use the Unet Loader (GGUF) node to load Qwen_Image_Edit-Q8_0.gguf.
  - Use the Load CLIP node to load qwen_2.5_vl_7b_fp8_scaled.safetensors.
  - Use the Load VAE node to load qwen_image_vae.safetensors.
- Image loading
  - Use the Load Image node to load the image you want to edit.
- Prompts
  - Use two TextEncoderQwenImageEdit nodes, one for the positive prompt and one for the negative prompt.
- Image scaling node: keep the total pixel count under one million.
- LoRA
  - To speed up generation with the 4-step Lightning LoRA, use the LoraLoaderModelOnly node to load Qwen-Image-Lightning-4steps-V1.0.safetensors and connect it into the model path.
- For the steps and cfg values in the KSampler node, refer to the table below:
| Model | Steps | CFG |
|---|---|---|
| Official | 50 | 4.0 |
| fp8_e4m3fn | 20 | 2.5 |
| fp8_e4m3fn + 4steps LoRA | 4 | 1.0 |
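The two numeric constraints in this workflow, the roughly one-megapixel image budget and the steps/CFG pairs in the table, can be sketched in Python. This is only an illustration under stated assumptions: the function name, the dictionary keys, and rounding down to a multiple of 8 are all hypothetical choices, not ComfyUI behavior. The budget is treated as "at or below 1,000,000 pixels".

```python
import math

# Steps/CFG pairs copied from the table above; the keys are just labels.
SAMPLER_PRESETS = {
    "official": {"steps": 50, "cfg": 4.0},
    "fp8_e4m3fn": {"steps": 20, "cfg": 2.5},
    "fp8_e4m3fn+4steps_lora": {"steps": 4, "cfg": 1.0},
}


def fit_to_pixel_budget(width, height, max_pixels=1_000_000, multiple=8):
    """Downscale (never upscale) toward a pixel budget, roughly keeping aspect ratio.

    Rounding each side down to a multiple of 8 is an assumption for illustration.
    """
    if width * height <= max_pixels:
        return width, height  # already within budget, leave untouched
    scale = math.sqrt(max_pixels / (width * height))
    new_w = max(multiple, int(width * scale) // multiple * multiple)
    new_h = max(multiple, int(height * scale) // multiple * multiple)
    return new_w, new_h


print(fit_to_pixel_budget(2048, 1536))  # (1152, 864)
print(SAMPLER_PRESETS["fp8_e4m3fn+4steps_lora"])
```

In the actual workflow the image scaling node does this work; the sketch only shows the arithmetic behind the limit.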
Flux.1 Kontext
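Introduction
Flux.1 Kontext is a multimodal image-editing model. It takes a prompt and an image together, understands the image content, and performs precise edits. Its key features are: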
- Character Consistency: Preserves unique elements in images across multiple scenes and environments, such as reference characters or objects in the image.
- Editing: Makes targeted modifications to specific elements in the image without affecting other parts.
- Style Reference: Generates novel scenes while preserving the unique style of the reference image according to text prompts.
- Interactive Speed: Minimal latency in image generation and editing.
Version Information
- FLUX.1 Kontext [pro]
  - Commercial version, focused on rapid iterative editing
- FLUX.1 Kontext [max]
  - Experimental version with stronger prompt adherence
- FLUX.1 Kontext [dev]
  - Open source version (used in this tutorial), 12B parameters, mainly for research
Models
Flux.1 Kontext Dev original model weights and community versions
- Original Black Forest Labs version: flux1-kontext-dev.safetensors
- FP8 version provided by ComfyOrg: flux1-dev-kontext_fp8_scaled.safetensors
- Community GGUF version: FLUX.1-Kontext-dev-GGUF
- Nunchaku accelerated-inference version: nunchaku-flux.1-kontext-dev
Text Encoder
VAE
Model storage location
📂 ComfyUI/
├── 📂 models/
│ ├── 📂 diffusion_models/
│ │ └── flux1-dev-kontext_fp8_scaled.safetensors or flux1-kontext-dev.safetensors
│ ├── 📂 unet/
│ │ └── e.g. flux1-kontext-dev-Q4_K_M.gguf  # download only if you need the GGUF version
│ ├── 📂 vae/
│ │ └── ae.safetensors
│ └── 📂 text_encoders/
│   ├── clip_l.safetensors
│   └── t5xxl_fp16.safetensors or t5xxl_fp8_e4m3fn_scaled.safetensors
Workflow
- Model loading
  - In the Load Diffusion Model node, load flux1-dev-kontext_fp8_scaled.safetensors, or in the Unet Loader (GGUF) node, load flux1-kontext-dev-Q4_K_M.gguf (or another version).
  - In the DualCLIP Load node, make sure clip_l.safetensors and either t5xxl_fp16.safetensors or t5xxl_fp8_e4m3fn_scaled.safetensors are loaded.
  - In the Load VAE node, make sure ae.safetensors is loaded.
- Image loading
  - In the Load Image(from output) node, load the provided input image.
- Prompt
  - In the CLIP Text Encode node, edit the prompt. Only English is supported.
Hands-on practice
Basic prompt principles
- Be Specific and Clear - Use precise descriptions and avoid vague terms.
- Step-by-step Editing - Break complex modifications into multiple simple steps.
- Explicit Preservation - State what should remain unchanged.
- Verb Selection - Use "change" or "replace" rather than "transform".
Best-practice templates
Object Modification: "Change [object] to [new state], keep [content to preserve] unchanged"
Style Transfer: "Transform to [specific style], while maintaining [composition/character/other] unchanged"
Background Replacement: "Change the background to [new background], keep the subject in the exact same position and pose"
Text Editing: "Replace '[original text]' with '[new text]', maintain the same font style"
Remember: the more specific, the better. The model is good at following detailed instructions and maintaining consistency.
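The templates above are plain fill-in-the-blank strings, so they can be kept in code and reused. A minimal sketch follows; the `TEMPLATES` keys, the placeholder field names, and the `build_prompt` helper are assumptions made up for illustration.

```python
# Templates copied from the patterns above; the field names are illustrative.
TEMPLATES = {
    "object": "Change {target} to {new_state}, keep {preserve} unchanged",
    "style": "Transform to {style}, while maintaining {preserve} unchanged",
    "background": (
        "Change the background to {background}, keep the subject "
        "in the exact same position and pose"
    ),
    "text": "Replace '{old}' with '{new}', maintain the same font style",
}


def build_prompt(kind, **fields):
    """Fill one template; raises KeyError if the kind or a field is missing."""
    return TEMPLATES[kind].format(**fields)


print(build_prompt("text", old="joy", new="BFL"))
# Replace 'joy' with 'BFL', maintain the same font style
```

Keeping the preservation clause inside the template makes it harder to forget to say what should stay unchanged.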
Change
Basic Modifications
- Simple and direct: "Change the car color to red"
- Maintain style: "Change to daytime while maintaining the same style of the painting"
- Change the subject’s clothing into traditional Japanese attire
- Change the direction of the man’s head so the camera captures it from the exact side profile
- Change the camera angle so it faces directly towards the dog
- Change the wall with a brick wall
Character Replacement Example
- Change the flower pot with a canvas
- Change the character to a male one. He should also be fashionable and wear sunglasses. Add the text 'Comfy Creating in ComfyUI' at the top of the image. Pay attention to the capitalization of the text. The character should not block the text.
Perspective Change Example
- Change the perspective to view the scene of the statue drinking beer from behind. Note that one of the statue's arms is missing.
Add
- Add purple lipstick to the lips
- Add afro hairstyle to the woman
- Add Spinosaurus to the background
- Add an armchair and a coffee table next to the lamp
Remove
Object Removal Example
- Remove the blush from the face
- Remove hair from the woman
- Remove the hat and sunglasses from the subject
- Remove the fog
- Remove the headphones and cables
- Remove all UI text elements from the image. Keep the feeling that the characters and scene are in water. Also, remove the green UI elements at the bottom.
Style Transfer
- Style change: Turn this illustration into a realistic portrait photography style. Use young characters, and keep their green eye color and black lipstick. The characters have snow-white skin, with their eyelids slightly drooping and their eyes looking a little forward, showing an elegant and quiet expression.
- Clearly name style: "Transform to Bauhaus art style"
- Describe characteristics: "Transform to oil painting with visible brushstrokes, thick paint texture"
- Preserve composition: "Change to Bauhaus style while maintaining the original composition"
- Convert the image into a 3D animated style
- Turn the hummingbird into a polished glass bird, transform the flower into a delicate crystal sculpture
Text Editing
- Change the text ‘STOP’ to ‘DUR’ on the sign, keeping the same style and deformation
- Write ‘Happy New Year’ on the blackboard
- Use quotes: "Replace 'joy' with 'BFL'"
- Maintain format: "Replace text while maintaining the same font style"
- Multiple rounds of editing:
  - Round 1: Change "ComfyUI News" to "Qwen Image Edit"
  - Round 2: Change "Qwen Image Edit is now available in ComfyUI" to "Edit the image and keep the style consistent"
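Multi-round editing amounts to feeding each round's output back in as the next round's input. The sketch below shows only that control flow. `edit_fn` is a hypothetical callable standing in for whatever runs one edit (for example, a single run of the workflow); it is not a ComfyUI API, and the stub used here just records the prompts.

```python
def run_edit_rounds(image, prompts, edit_fn):
    """Apply prompts in order, feeding each result into the next round.

    Returns every intermediate result so an earlier round can be revisited.
    """
    history = [image]
    for prompt in prompts:
        history.append(edit_fn(history[-1], prompt))
    return history


# Stub editor for demonstration only: it appends the prompt to a list
# instead of editing pixels.
def fake_edit(image, prompt):
    return image + [prompt]


rounds = [
    'Change "ComfyUI News" to "Qwen Image Edit"',
    'Change "Qwen Image Edit is now available in ComfyUI" '
    'to "Edit the image and keep the style consistent"',
]
results = run_edit_rounds([], rounds, fake_edit)
print(len(results))  # 3: the original plus one result per round
```

Keeping the intermediate results makes it easy to go back one round if a later edit damages something that was already correct.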
Character Consistency
- Specific description: "The woman with short black hair" instead of "she"
- Preserve features: "while maintaining the same facial features, hairstyle, and expression"
- Step-by-step modifications: Change background first, then actions
Image Editing Example:
- Make the man look happy
- Make the man look sad
- Make the man strong and muscular
- Make the dog slim
- Replace all the bullets with shimmering, multi-colored butterflies
- Show this man eating breakfast at a table
- Replace the Taj Mahal with a detailed medieval castle, ensuring all reflections in water, glass, or surrounding surfaces match the new structure accurately. Keep lighting, shadows, and perspective consistent with the original scene for full realism
- Transform the scene into a peaceful winter landscape. Cover the ground and trees with a fresh layer of snow. Add gentle snowflakes falling from the sky, with a soft white overcast atmosphere
- Have the character seated at the bar of a busy pub. Behind her, the scene is lively, but the background stays mostly dark. She’s holding a wine glass, facing the bar, with the light lighting up the foreground, giving the whole image a cinematic look and feel.