FLUX.1 Tools Suite, Kontext Editing & Nunchaku Quantization

FLUX.1 Tools Fill

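Introduction

Flux.1 Fill dev is an open-weight image editing model from Black Forest Labs and one of the core components of the FLUX.1 Tools suite. It is designed for image inpainting and outpainting, and is mainly used for:

Inpainting: filling in missing or removed regions of an image
Outpainting: seamlessly extending an image beyond its existing borders
Precise control over the generated content through masks and prompts

Key features of Flux.1 Fill dev:

Strong inpainting and outpainting capabilities
Excellent prompt understanding and adherence: it captures the user's intent closely while staying highly consistent with the original image
Trained with guidance distillation, which makes the model more efficient while preserving output quality
Flexible licensing: generated outputs can be used for personal, scientific, and commercial purposes, as described in the FLUX.1 [dev] Non-Commercial License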

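Models

Flux.1 Fill dev model repository: Flux.1 Fill dev

Manual model installation

If the models fail to download automatically, use the file list below to download each file manually and place it in the corresponding ComfyUI directory.

You need to download the following model files:

Model	File name	Install location	Download link
CLIP model	`clip_l.safetensors`	`ComfyUI/models/text_encoders`	Download
T5 model	`t5xxl_fp16.safetensors`	`ComfyUI/models/text_encoders`	Download
VAE model	`ae.safetensors`	`ComfyUI/models/vae`	Download
Flux Fill model	`flux1-fill-dev.safetensors`	`ComfyUI/models/diffusion_models`	Download

File locations: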

├── models/
│   ├── text_encoders/
│   │    ├── clip_l.safetensors
│   │    └── t5xxl_fp16.safetensors
│   ├── vae/
│   │    └── ae.safetensors
│   └── diffusion_models/
│        └── flux1-fill-dev.safetensors
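
Before opening the workflow, it can help to confirm that every file sits where the table says it should. The following is a minimal sketch using only the Python standard library; the helper name `missing_model_files` and the `ComfyUI` root path are illustrative assumptions, not part of ComfyUI itself.

```python
from pathlib import Path

# Expected files relative to the ComfyUI root, taken from the table above.
EXPECTED_FILES = [
    "models/text_encoders/clip_l.safetensors",
    "models/text_encoders/t5xxl_fp16.safetensors",
    "models/vae/ae.safetensors",
    "models/diffusion_models/flux1-fill-dev.safetensors",
]


def missing_model_files(comfyui_root: str) -> list[str]:
    """Return the expected files that are not present under comfyui_root."""
    root = Path(comfyui_root)
    return [rel for rel in EXPECTED_FILES if not (root / rel).is_file()]


if __name__ == "__main__":
    # "ComfyUI" is a placeholder; point it at your own installation directory.
    for rel in missing_model_files("ComfyUI"):
        print(f"missing: {rel}")
```

An empty result means all four files were found; any path that is printed still needs to be downloaded or moved.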

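Inpainting

Make sure the Load Diffusion Model node has loaded flux1-fill-dev.safetensors
Make sure the following models are loaded in the DualCLIPLoader node:
- clip_name1: t5xxl_fp16.safetensors
- clip_name2: clip_l.safetensors
Make sure the Load VAE node has loaded ae.safetensors
Upload the input image provided in the documentation to the Load Image node; if you are using the version without a mask, remember to draw the mask with the mask editor
In the CLIP Text Encode (Positive Prompt) node, describe the content you want in the masked area of the image
Click the Run button, or press Ctrl (Cmd) + Enter, to run the workflow

This is the complete inpainting workflow. Because the Flux models understand prompts so well, a simple prompt is enough to get a very good result.

If you are new to inpainting workflows, note how the input was prepared:

The input image was uploaded in the Load Image node, and the mask was drawn with the MaskEditor tool to mark the region the model should modify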
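
To make the role of the mask concrete, here is a conceptual sketch in plain Python. It only illustrates what a mask means (1.0 marks pixels that may change, 0.0 marks pixels that are kept); it is not how Flux.1 Fill works internally, since the model receives the image and mask as conditioning. The helper names `rect_mask` and `composite` are hypothetical.

```python
def rect_mask(width, height, x0, y0, x1, y1):
    """Build a mask that is 1.0 inside the rectangle [x0, x1) x [y0, y1)."""
    return [
        [1.0 if x0 <= x < x1 and y0 <= y < y1 else 0.0 for x in range(width)]
        for y in range(height)
    ]


def composite(original, generated, mask):
    """Blend two same-sized grayscale images using a mask.

    Mask values are floats in [0, 1]: 1.0 takes the generated pixel,
    0.0 keeps the original pixel.
    """
    return [
        [m * g + (1.0 - m) * o for o, g, m in zip(o_row, g_row, m_row)]
        for o_row, g_row, m_row in zip(original, generated, mask)
    ]


# Tiny 4x4 example: only the central 2x2 block is replaced.
original = [[10.0] * 4 for _ in range(4)]
generated = [[200.0] * 4 for _ in range(4)]
result = composite(original, generated, rect_mask(4, 4, 1, 1, 3, 3))
```

In the result, the corner pixels keep the original value while the central block takes the generated value, which is the behaviour the mask drawn in the mask editor is meant to express.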

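Outpainting

Make sure the Load Diffusion Model node has loaded flux1-fill-dev.safetensors
Make sure the following models are loaded in the DualCLIPLoader node:
- clip_name1: t5xxl_fp16.safetensors
- clip_name2: clip_l.safetensors
Make sure the Load VAE node has loaded ae.safetensors
Upload the input image provided in the documentation to the Load Image node
In the Pad Image for Outpainting node, set how far you want to extend the image in each direction
In the CLIP Text Encode (Positive Prompt) node, enter a matching description
Click the Run button, or press Ctrl (Cmd) + Enter, to run the workflow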
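
The padding step can be pictured as enlarging the canvas and marking the new border as the area to fill. The sketch below is a simplified stand-in written for illustration, not the node's actual implementation; the function name `pad_for_outpainting` is hypothetical, and refinements such as feathering the mask edge are left out.

```python
def pad_for_outpainting(width, height, left=0, top=0, right=0, bottom=0):
    """Return (new_width, new_height, mask) for an outpainting canvas.

    The mask is a list of rows: 1 marks padded pixels the model should
    fill, 0 marks pixels that come from the original image.
    """
    if min(left, top, right, bottom) < 0:
        raise ValueError("padding must be non-negative")
    new_w = width + left + right
    new_h = height + top + bottom
    mask = [
        [
            0 if left <= x < left + width and top <= y < top + height else 1
            for x in range(new_w)
        ]
        for y in range(new_h)
    ]
    return new_w, new_h, mask


# Extend a 2x2 image by 1 pixel on the left and 2 pixels on the right.
new_w, new_h, mask = pad_for_outpainting(2, 2, left=1, right=2)
```

Here the canvas grows to 5x2 and each mask row marks the first and the last two columns as the region to generate.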

FLUX.1 Tools Redux

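Introduction

Flux Redux is an adapter model dedicated to generating image variations. It produces variations in a similar style from an input image, with no text prompt required. This tutorial walks through the whole process, from installation to use.

Flux Redux is mainly used for:

Generating image variations: creating new images in a style similar to the input image
Prompt-free operation: style features are extracted directly from the image
Works with both the Flux.1 [dev] and [schnell] versions
Multi-image blending: the styles of several input images can be mixed

Models

Flux Redux model repository: Flux Redux

You need to download the following model files: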

Model	File name	Install location	Download link
CLIP Vision model	`sigclip_vision_patch14_384.safetensors`	`ComfyUI/models/clip_vision`	Download
Redux model	`flux1-redux-dev.safetensors`	`ComfyUI/models/style_models`	Download
CLIP model	`clip_l.safetensors`	`ComfyUI/models/clip`	Download
T5 model	`t5xxl_fp16.safetensors`	`ComfyUI/models/clip`	Download
Flux Dev model	`flux1-dev.safetensors`	`ComfyUI/models/unet`	Download
VAE model	`ae.safetensors`	`ComfyUI/models/vae`	Download

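Workflow node overview

The workflow consists mainly of the following key nodes:

Model loading nodes

CLIPVisionLoader: loads the CLIP Vision model
StyleModelLoader: loads the Redux model
UNETLoader: loads the Flux Dev/Schnell model
DualCLIPLoader: loads the CLIP text encoder models
VAELoader: loads the VAE model

Image processing nodes

LoadImage: loads the reference image
CLIPVisionEncode: encodes the reference image
StyleModelApply: applies the Redux model
FluxGuidance: controls the guidance strength
BasicGuider: basic guider

Sampling nodes

KSamplerSelect: selects the sampler
BasicScheduler: sets the sampling schedule
SamplerCustomAdvanced: advanced sampling settings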
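
The way the image-side nodes feed into each other can be sketched as a small graph. The dictionary below imitates the general shape of a ComfyUI API-format workflow (a `class_type` plus `inputs`, with links written as `[source_node_id, output_index]`), but it is an illustration only: the input names, the `reference.png` file name, and the omission of the text conditioning and sampling nodes are all assumptions, so export a real workflow rather than copying this one.

```python
# Node ids are arbitrary strings; a link is written as [source_node_id, output_index].
# Input names are illustrative assumptions, not copied from a real export.
REDUX_GRAPH = {
    "1": {"class_type": "CLIPVisionLoader",
          "inputs": {"clip_name": "sigclip_vision_patch14_384.safetensors"}},
    "2": {"class_type": "StyleModelLoader",
          "inputs": {"style_model_name": "flux1-redux-dev.safetensors"}},
    "3": {"class_type": "LoadImage",
          "inputs": {"image": "reference.png"}},
    "4": {"class_type": "CLIPVisionEncode",
          "inputs": {"clip_vision": ["1", 0], "image": ["3", 0]}},
    "5": {"class_type": "StyleModelApply",
          "inputs": {"style_model": ["2", 0], "clip_vision_output": ["4", 0]}},
}


def dangling_links(graph):
    """Return (node_id, input_name) pairs whose link targets a missing node."""
    problems = []
    for node_id, node in graph.items():
        for name, value in node["inputs"].items():
            is_link = (
                isinstance(value, list)
                and len(value) == 2
                and isinstance(value[0], str)
            )
            if is_link and value[0] not in graph:
                problems.append((node_id, name))
    return problems
```

Reading the links from bottom to top shows the data flow described above: the reference image is encoded by the CLIP Vision model, and the encoded result is then passed to the Redux style model.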

FLUX.1 ControlNet

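Introduction

This section shows how to use the official Flux ControlNet models in ComfyUI. It covers the two official control models, FLUX.1 Depth and FLUX.1 Canny, in turn.

FLUX.1 Depth [dev]

12-billion-parameter rectified flow transformer
Structural guidance based on depth maps
Trained with guidance distillation for better efficiency
Generated outputs can be used for personal, scientific, and commercial purposes, as described in the FLUX.1 [dev] Non-Commercial License

FLUX.1 Canny [dev]

12-billion-parameter rectified flow transformer
Structural guidance based on Canny edge detection
Also trained with guidance distillation
Released under the FLUX.1 [dev] Non-Commercial License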
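
To give a feel for what an edge-based control image contains, here is a deliberately simplified sketch. It is not Canny: full Canny also smooths the image, thins edges with non-maximum suppression, and applies hysteresis thresholding. This version only computes a Sobel gradient magnitude and applies a single threshold, and the function names are hypothetical.

```python
def sobel_magnitude(img):
    """Approximate gradient magnitude of a grayscale image (list of rows).

    Border pixels are left at 0 because the 3x3 kernel does not fit there.
    """
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            gx = (img[y - 1][x + 1] + 2 * img[y][x + 1] + img[y + 1][x + 1]
                  - img[y - 1][x - 1] - 2 * img[y][x - 1] - img[y + 1][x - 1])
            gy = (img[y + 1][x - 1] + 2 * img[y + 1][x] + img[y + 1][x + 1]
                  - img[y - 1][x - 1] - 2 * img[y - 1][x] - img[y - 1][x + 1])
            out[y][x] = (gx * gx + gy * gy) ** 0.5
    return out


def edge_map(img, threshold):
    """Binary edge map: 1 where the gradient magnitude exceeds threshold."""
    return [[1 if v > threshold else 0 for v in row] for row in sobel_magnitude(img)]


# A dark left half next to a bright right half produces one vertical edge.
image = [[0, 0, 0, 255, 255, 255] for _ in range(5)]
edges = edge_map(image, threshold=100)
```

The resulting map is non-zero only along the boundary between the dark and bright halves, which is the kind of structural outline the Canny model is guided by.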

Models

Model versions: the Flux ControlNet models can be used in two ways, as full models or as LoRA models.

Full model downloads

Model	File name	Install location	Download link	Notes
CLIP model	`clip_l.safetensors`	`ComfyUI/models/clip/`	Download	Standard CLIP encoder
T5 model	`t5xxl_fp16.safetensors`	`ComfyUI/models/clip/`	Download	Standard-precision version
T5 model	`t5xxl_fp8_e4m3fn.safetensors`	`ComfyUI/models/clip/`	Download	Lower-precision version
VAE model	`ae.safetensors`	`ComfyUI/models/vae/`	Download	VAE encoder/decoder
Flux Depth	`flux1-depth-dev.safetensors`	`ComfyUI/models/diffusion_models/`	Download	Depth control model
Flux Canny	`flux1-canny-dev.safetensors`	`ComfyUI/models/diffusion_models/`	Download	Edge control model

LoRA model downloads

Model	File name	Install location	Download link	Notes
Flux base model	`flux1-dev.safetensors`	`ComfyUI/models/diffusion_models/`	Download	Base model for the LoRAs
Depth LoRA	`flux1-depth-dev-lora.safetensors`	`ComfyUI/models/loras/`	Download	Depth control LoRA
Canny LoRA	`flux1-canny-dev-lora.safetensors`	`ComfyUI/models/loras/`	Download	Edge control LoRA

Workflow

Full models

In the Load Diffusion Model node, load flux1-depth-dev.safetensors
In the DualCLIPLoader node, make sure clip_l.safetensors and either t5xxl_fp16.safetensors or t5xxl_fp8_e4m3fn_scaled.safetensors are loaded
In the Load VAE node, make sure ae.safetensors is loaded

LoRA models

Kontext

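Introduction

FLUX.1 Kontext is a suite of generative models built for text- and image-driven editing. Unlike traditional text-to-image (T2I) models, Kontext supports in-context image processing: it understands image and text content together, which enables more precise image editing.

FLUX.1 Kontext features

Iterative editing of the same image: keeps the image consistent across multiple editing steps
Precise object modification: accurately changes specific objects in the image
Character consistency: preserves a character's features throughout multi-step edits
Style preservation and transfer: can keep the original style or apply a new one
Text editing in images: directly edits text that appears in the image
Composition control: precise control over composition, camera angle, and pose
Fast inference: efficient image generation and editing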

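Models

Apart from the diffusion model itself, Flux.1 Kontext Dev uses the same models (text encoders and VAE) as the rest of the Flux family. If you have already run Flux workflows, you only need to download the Flux.1 Kontext Dev model.

Kontext model versions

Three versions of the model are available; download whichever one suits you. The original and FP8 versions are used and stored the same way in ComfyUI, whereas the GGUF version must be saved to ComfyUI/models/unet/ and loaded with the Unet Loader (GGUF) node from ComfyUI-GGUF.

Flux.1 Kontext Dev original weights and community versions

Original Black Forest Labs version: flux1-kontext-dev.safetensors
FP8 version provided by ComfyOrg: flux1-dev-kontext_fp8_scaled.safetensors
Community GGUF version: FLUX.1-Kontext-dev-GGUF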
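
The storage rule above can be summarised as a small lookup keyed on the file extension. This is a sketch for illustration only: the helper name `kontext_placement` is hypothetical, and the `diffusion_models` directory for `.safetensors` files is an assumption based on the install locations used for the other Flux diffusion models in this article.

```python
from pathlib import PurePosixPath


def kontext_placement(filename: str) -> tuple[str, str]:
    """Return (directory, loader node) for a Kontext model file."""
    suffix = PurePosixPath(filename).suffix.lower()
    if suffix == ".gguf":
        # GGUF files need the loader node from the ComfyUI-GGUF extension.
        return "ComfyUI/models/unet", "Unet Loader (GGUF)"
    if suffix == ".safetensors":
        # Assumed location, matching the other Flux diffusion models here.
        return "ComfyUI/models/diffusion_models", "Load Diffusion Model"
    raise ValueError(f"unsupported model file: {filename}")
```

For example, `kontext_placement("flux1-kontext-dev-Q4_K_M.gguf")` points to the GGUF loader, while both the original and FP8 `.safetensors` files map to the standard diffusion model loader.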

Comparison of output quality and VRAM requirements across Flux.1 Kontext Dev versions

Text Encoder

VAE

ae.safetensors

Workflow

Original

In the Load Diffusion Model node, load flux1-dev-kontext_fp8_scaled.safetensors
In the DualCLIPLoader node, make sure clip_l.safetensors and either t5xxl_fp16.safetensors or t5xxl_fp8_e4m3fn_scaled.safetensors are loaded
In the Load VAE node, make sure ae.safetensors is loaded
In the Load Image (from output) node, load the provided input image
In the CLIP Text Encode node, edit the prompt; only English prompts are supported
Click the Queue button, or press Ctrl (Cmd) + Enter, to run the workflow

GGUF

In the Unet Loader (GGUF) node, load flux1-kontext-dev-Q4_K_M.gguf (or another quantization)
In the DualCLIPLoader node, make sure clip_l.safetensors and either t5xxl_fp16.safetensors or t5xxl_fp8_e4m3fn_scaled.safetensors are loaded
In the Load VAE node, make sure ae.safetensors is loaded
In the Load Image (from output) node, load the provided input image
In the CLIP Text Encode node, edit the prompt; only English prompts are supported
Click the Queue button, or press Ctrl (Cmd) + Enter, to run the workflow
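
Instead of clicking the button, a workflow can also be queued programmatically through the ComfyUI server's HTTP interface. The sketch below assumes a local server on the default address `127.0.0.1:8188` and a workflow that has been exported in API format; the function names are hypothetical, and the request is only built, not sent, until `queue_workflow` is called.

```python
import json
import urllib.request
import uuid

# Assumption: a local ComfyUI server running on its default address.
COMFYUI_URL = "http://127.0.0.1:8188"


def build_queue_request(workflow: dict, client_id: str) -> urllib.request.Request:
    """Build (but do not send) a POST /prompt request for an API-format workflow."""
    body = json.dumps({"prompt": workflow, "client_id": client_id}).encode("utf-8")
    return urllib.request.Request(
        f"{COMFYUI_URL}/prompt",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def queue_workflow(path: str) -> dict:
    """Load a workflow exported in API format and queue it on the server."""
    with open(path, "r", encoding="utf-8") as f:
        workflow = json.load(f)
    request = build_queue_request(workflow, client_id=str(uuid.uuid4()))
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read())
```

Keeping request construction separate from sending makes it possible to inspect the payload without a running server.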

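Nunchaku quantized models

Introduction

GitHub: github.com/mit-han-lab…

SVDQuant, the method behind Nunchaku, is a post-training quantization paradigm for diffusion models. It enables accurate 4-bit quantization and runs the 12B FLUX model on a laptop with a 16GB 4090 GPU, about 3× faster than a 4-bit weight-only quantized baseline. This makes it practical to deploy large diffusion models on edge devices such as laptops while keeping performance high.

Nunchaku SVDQuant models quantize both the weights and the activations of FLUX to 4 bits. On a laptop with a 16GB 4090 GPU, this yields a 3.5× reduction in memory use and an 8.7× reduction in latency relative to the 16-bit model.

Models

hf-mirror.com/mit-han-lab

dev: hf-mirror.com/mit-han-lab…
schnell: hf-mirror.com/mit-han-lab…

GPUs older than the 50 series: use the int4 models

50-series GPUs: use the fp4 models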
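
The int4/fp4 rule can be written as a small helper that reads the series from the GPU name. This is a rough heuristic sketched for illustration: the function name `pick_precision` is hypothetical, it only recognises consumer names of the form "RTX 4090" or "RTX 5090", and it raises an error for anything it cannot classify rather than guessing.

```python
import re


def pick_precision(gpu_name: str) -> str:
    """Return "fp4" for RTX 50-series names and "int4" for older RTX series."""
    match = re.search(r"RTX\s*(\d{2})\d{2}", gpu_name)
    if match is None:
        raise ValueError(f"cannot infer GPU series from {gpu_name!r}")
    series = int(match.group(1))
    if series == 50:
        return "fp4"
    if series < 50:
        return "int4"
    raise ValueError(f"no rule for series {series}")
```

The GPU name could come from a call such as `torch.cuda.get_device_name(0)` if PyTorch is installed; that step is left out here to keep the sketch free of dependencies.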

Kontext: hf-mirror.com/mit-han-lab…

ComfyUI Node

github.com/mit-han-lab…

Nunchaku Flux DiT Loader: A node for loading the FLUX diffusion model.
- model_path: Path to the model folder. You must manually download the model from our Hugging Face collection or ModelScope collection. Once downloaded, set model_path to the corresponding directory.
  
  Note: Legacy model folders are still supported but will be deprecated in v0.4. To migrate, use our merge_safetensors.json workflow to merge your legacy folder into a single .safetensors file or redownload the model from the above collections.
- cache_threshold: Controls the First-Block Cache tolerance, similar to residual_diff_threshold in WaveSpeed. Increasing this value improves speed but may reduce quality. A typical value is 0.12. Setting it to 0 disables the effect.
- attention: Defines the attention implementation method. You can choose between flash-attention2 or nunchaku-fp16. Our nunchaku-fp16 is approximately 1.2× faster than flash-attention2 without compromising precision. For Turing GPUs (20-series), where flash-attention2 is unsupported, you must use nunchaku-fp16.
- cpu_offload: Enables CPU offloading for the transformer model. While this reduces GPU memory usage, it may slow down inference.
  - When set to auto, it will automatically detect your available GPU memory. If your GPU has more than 14GiB of memory, offloading will be disabled. Otherwise, it will be enabled.
  - Memory usage will be further optimized in node later.
- device_id: Indicates the GPU ID for running the model.
- data_type: Defines the data type for the dequantized tensors. Turing GPUs (20-series) do not support bfloat16 and can only use float16.
- i2f_mode: For Turing (20-series) GPUs, this option controls the GEMM implementation mode. enabled and always modes exhibit minor differences. This option is ignored on other GPU architectures.
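
The rules described for these options can be sketched as plain functions. This is an illustration of the documented behaviour, not the node's actual code: the function names, the `enable`/`disable` setting values, and in particular the relative-difference metric used for the cache check are assumptions.

```python
def resolve_cpu_offload(setting: str, gpu_memory_gib: float) -> bool:
    """Mirror the documented rule: "auto" disables offloading above 14 GiB."""
    if setting == "auto":
        return gpu_memory_gib <= 14
    if setting in ("enable", "disable"):  # assumed names for the manual modes
        return setting == "enable"
    raise ValueError(f"unknown cpu_offload setting: {setting}")


def turing_safe_options(attention: str, data_type: str, is_turing: bool) -> tuple[str, str]:
    """On Turing (20-series), force nunchaku-fp16 attention and float16 tensors."""
    if is_turing:
        return "nunchaku-fp16", "float16"
    return attention, data_type


def can_reuse_cache(prev_residual, cur_residual, cache_threshold: float) -> bool:
    """First-block-cache style check on two equal-length lists of floats.

    Reuse is allowed when the total absolute change, relative to the total
    absolute previous value, is below cache_threshold. A threshold of 0
    disables reuse. The exact metric is an assumption.
    """
    if cache_threshold <= 0 or not prev_residual:
        return False
    diff = sum(abs(c - p) for p, c in zip(prev_residual, cur_residual))
    scale = sum(abs(p) for p in prev_residual)
    if scale == 0:
        return False
    return diff / scale < cache_threshold
```

A larger `cache_threshold` lets more steps pass the check and reuse cached results, which is why raising it trades quality for speed.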

Nunchaku FLUX LoRA Loader: A node for loading LoRA modules for SVDQuant FLUX models.
- Place your LoRA checkpoints in the models/loras directory. These will appear as selectable options under lora_name.
- lora_strength: Controls the strength of the LoRA module.
- You can connect multiple LoRA nodes together.
- Note: Starting from version 0.2.0, there is no need to convert LoRAs. Simply provide the original LoRA files to the loader.

Nunchaku Text Encoder Loader V2: A node for loading the text encoders.
- Select the CLIP and T5 models to use as text_encoder1 and text_encoder2, following the same convention as in DualCLIPLoader. In addition, you may choose to use our enhanced 4-bit T5XXL model for saving more GPU memory.
- t5_min_length: Sets the minimum sequence length for T5 text embeddings. The default in DualCLIPLoader is hardcoded to 256, but for better image quality, use 512 here.
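
What a minimum sequence length means in practice is that short prompts are padded up to that length before being encoded. The sketch below shows only that padding step under stated assumptions: the function name is hypothetical, and the pad token id of 0 is assumed for illustration.

```python
def pad_to_min_length(token_ids: list[int], min_length: int, pad_id: int = 0) -> list[int]:
    """Right-pad token_ids with pad_id until it has at least min_length items.

    Sequences that are already long enough are returned unchanged in length.
    """
    padding = max(0, min_length - len(token_ids))
    return token_ids + [pad_id] * padding
```

With `min_length=512`, a prompt of 20 tokens would be padded with 492 pad tokens, whereas a prompt longer than 512 tokens is left as it is by this step.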
