ComfyUI from Zero: Installation to Images and Video, a Complete Guide to Node-Based Workflows
In AI generation, flexibility and ease of use are hard to get at the same time: WebUI is quick to learn but weakly customizable, while coding everything yourself has a steep barrier. ComfyUI fills exactly this gap. Built around node-based visual workflows, it lets beginners reuse workflows by drag-and-drop while letting developers freely decompose and rebuild the generation logic, covering Stable Diffusion image generation, SVD/LTXV video generation, LoRA fine-tuning, and more.
This guide goes from environment setup to finished output, with diagrams throughout and no filler theory. It focuses on the four questions beginners hit most often: installation errors, where models go, how nodes connect, and how to generate video. It covers both Windows (NVIDIA GPU) and Mac (M-series), and is written so the first run works.
1. Core Concept: How ComfyUI Thinks (a 1-Minute Read)
Unlike WebUI's form-style operation, ComfyUI's core idea is **workflow = nodes + wires**: the full generation pipeline (load model → encode prompts → create latents → sample → decode → save) is split into independent nodes, and wires define how data flows between them.
To build intuition fast, start with the core workflow diagram below; everything that follows revolves around it:
Figure 1: ComfyUI's core workflow (memorize this)
[Load base model] → [Encode prompts] → [Create latent] → [Sample] → [Decode] → [Save result]
  (Checkpoint)        (CLIPText)       (EmptyLatent)    (KSampler)  (VAEDecode)  (SaveImage/Video)
- Every step is done by dragging nodes and connecting ports; no code required.
- A workflow can be saved as JSON and reloaded or shared with one click; this is ComfyUI's single biggest advantage.
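The node graph in Figure 1 is plain dataflow, so it can be sketched as ordinary function composition. The stand-in functions below are toys, not ComfyUI's actual API; they only show how each node's output feeds the next node's input:

```python
# Toy stand-ins for the six core nodes (NOT ComfyUI's real API):
def load_checkpoint(name):             # CheckpointLoaderSimple
    return "model", "clip", "vae"

def clip_encode(clip, text):           # CLIPTextEncode
    return f"cond({text})"

def empty_latent(width, height):       # EmptyLatentImage
    return {"w": width, "h": height}

def ksample(model, pos, neg, latent):  # KSampler
    return latent

def vae_decode(vae, latent):           # VAEDecode -> SaveImage
    return f"image {latent['w']}x{latent['h']}"

model, clip, vae = load_checkpoint("v1-5-pruned-emaonly.safetensors")
latent = ksample(model,
                 clip_encode(clip, "a cute cat"),
                 clip_encode(clip, "blurry, low quality"),
                 empty_latent(512, 512))
print(vae_decode(vae, latent))  # -> image 512x512
```

Wiring nodes on the canvas is exactly this composition, just drawn visually instead of written as calls.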
2. Installation: One-Shot Setup on Both Platforms (Pitfalls Flagged)
Installation follows three steps: environment → ComfyUI itself → models. Both a manual install (for readers comfortable with the command line) and a portable package (for complete beginners) are covered, with the key pitfalls flagged along the way.
2.1 Prerequisites (Required)
| Dependency | Windows (NVIDIA) | Mac (M-series) | Key requirement |
|---|---|---|---|
| Python | 3.10.x (required; 3.11+ has compatibility issues) | System 3.9+, or 3.10 via Homebrew | Check "Add Python to PATH" during install |
| GPU acceleration | NVIDIA driver (CUDA 11.8+) | None needed (Metal accelerates automatically) | CPU-only works without an NVIDIA card, but is slow |
| Tools | Git (optional, for cloning plugins) | Homebrew (optional, to install Git) | Needed later for plugins and models |
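A quick sanity check of the table above, runnable with whichever interpreter you plan to use for ComfyUI; the PyTorch check is optional and simply reports "not installed" before step 2.3:

```python
import shutil
import sys

print("Python:", sys.version.split()[0])                # 3.10.x recommended
print("git on PATH:", shutil.which("git") is not None)  # needed for plugins later
try:
    import torch  # only present after the dependencies are installed
    print("CUDA available:", torch.cuda.is_available())
    print("MPS (Metal) available:", torch.backends.mps.is_available())
except ImportError:
    print("PyTorch not installed yet - expected before step 2.3")
```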
2.2 Option 1: Portable Package for Beginners (Recommended, Zero Commands)
No environment setup; unzip and run. Best for readers who have never touched a terminal or command line.
1. Download the portable package:
   - Windows: ComfyUI_windows_portable.zip
   - Mac: ComfyUI_mac.zip
2. Unzip to a path with no non-ASCII characters and no spaces (e.g. `D:\AI\ComfyUI` or `~/Desktop/ComfyUI`).
3. Launch:
   - Windows: double-click `run_nvidia_gpu.bat` (NVIDIA) or `run_cpu.bat` (no discrete GPU);
   - Mac: open Terminal and run `cd ~/Desktop/ComfyUI && python3 main.py`.
2.3 Option 2: Manual Install (For Technical Readers)
Best if you want a custom environment or plan to develop plugins later:
# 1. Clone the repository
git clone https://github.com/comfyanonymous/ComfyUI.git
cd ComfyUI
# 2. Create a virtual environment (avoids dependency conflicts)
python3 -m venv venv
# Activate on Windows: venv\Scripts\activate
# Activate on Mac:     source venv/bin/activate
# 3. Install PyTorch first, picking the build for your GPU at pytorch.org,
#    then the remaining dependencies (the Tsinghua mirror speeds this up in China)
pip install -r requirements.txt -i https://pypi.tuna.tsinghua.edu.cn/simple
2.4 Verify the Launch (Key Step)
After starting, the terminal prints logs like the ones below. Open your browser to http://127.0.0.1:8188 (the portable builds open it for you) and the install is done!
# run
python main.py
Starting server on port 8188
Open your browser to http://127.0.0.1:8188
Figure 2: ComfyUI after a successful launch (key areas labeled)
+---------------------------------------------------------+
| Top: toolbar (Load / Save / Queue Prompt / Manager)     | <- main actions
| +--------------+ +------------------------------------+ |
| | Left: node   | | Center: canvas (workflow area)     | | <- main workspace
| | panel (drag  | | (drag nodes and wire them up)      | |
| | nodes here)  | |                                    | |
| +--------------+ +------------------------------------+ |
| | Right: output| | Bottom: log area (for debugging)   | | <- results / debugging
| | panel        | | (shows progress / error messages)  | |
| | (previews)   | |                                    | |
| +--------------+ +------------------------------------+ |
+---------------------------------------------------------+
3. Model Setup: Putting Models in the Right Place
ComfyUI ships with no models; everything it generates depends on base models, vision encoders, LoRAs, and so on, and wrong paths are the single most common beginner mistake. Start with the path overview below, then download the core models.
Figure 3: ComfyUI model paths (🔴 = create manually, ✅ = exists by default)
ComfyUI/
└── models/
    ├── checkpoints/   # base models (SD1.5/SDXL/SVD) ✅
    ├── lora/          # LoRA models (style/character fine-tunes) ✅
    ├── clip_vision/   # vision encoders (required for SVD/LTXV/IP-Adapter) 🔴
    ├── animatediff/   # AnimateDiff motion models 🔴
    ├── ipadapter/     # IP-Adapter models 🔴
    └── vae/           # standalone VAEs (optional, improves quality) ✅
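The folders marked 🔴 can be created in one go. The snippet assumes `ComfyUI/` is the relative install path; adjust `comfy_root` to your own location:

```python
from pathlib import Path

comfy_root = Path("ComfyUI")  # adjust to your install path
for sub in ("clip_vision", "animatediff", "ipadapter"):
    # parents/exist_ok make this safe to re-run on an existing install
    (comfy_root / "models" / sub).mkdir(parents=True, exist_ok=True)

print(sorted(p.name for p in (comfy_root / "models").iterdir()))
```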
3.1 Core Models to Install First (These 3 Cover Both Images and Video)
| Model type | Recommended model | Source | Path | Role |
|---|---|---|---|---|
| Base model | SD 1.5 (v1-5-pruned-emaonly.safetensors) | liblib.art / Civitai | models/checkpoints | Core image model, broadest compatibility |
| Vision encoder | clip_vision_vit_l.safetensors | h94/IP-Adapter (Hugging Face) | models/clip_vision | The "eyes" of SVD/LTXV/IP-Adapter |
| Lightweight video model | SVD XT 1.1 | stabilityai (Hugging Face) | models/checkpoints | Beginner-friendly video generation, relatively compact (13 GB) |
3.2 Fast Downloads with Resume Support
huggingface-cli supports resumable downloads, avoiding broken browser downloads; the commands work on both Mac and Windows:
# enter the clip_vision directory (create it first: mkdir -p models/clip_vision)
cd ~/Desktop/ComfyUI/models/clip_vision
# download clip_vision_vit_l (shared by IP-Adapter and SVD)
huggingface-cli download h94/IP-Adapter models/clip_vision_vit_l.safetensors --local-dir . --local-dir-use-symlinks False
4. Hands-On 1: Your First Image in 5 Minutes (Basic Workflow)
No manual node building needed: load the official preset workflow, change the prompt, and generate, with every step illustrated.
Step 1: Load the preset workflow
Click "Load" in the top toolbar and select examples/basic_workflow.json; the canvas loads a complete, fully wired image workflow.
Figure 4: Basic image workflow (wiring labeled)
[CheckpointLoaderSimple] → [CLIPTextEncode (positive)] → [KSampler]
          ↓                            ↓
[CLIPTextEncode (negative)] → [KSampler] → [VAEDecode] → [SaveImage]
          ↓
[EmptyLatentImage] → [KSampler]
- Red node: `CheckpointLoaderSimple` (selects the SD 1.5 model);
- Blue nodes: `CLIPTextEncode` (positive/negative prompts);
- Green node: `KSampler` (the core sampler; determines image quality);
- Yellow node: `SaveImage` (writes results to the output folder).
Step 2: Adjust the core parameters (defaults are fine for beginners)
| Node | Parameter | Beginner value | Purpose |
|---|---|---|---|
| CheckpointLoaderSimple | ckpt_name | v1-5-pruned-emaonly.safetensors | Selects the installed base model |
| CLIPTextEncode (positive) | text | a cute cat, sitting on a windowsill, sunlight, watercolor style | Describes what you want |
| CLIPTextEncode (negative) | text | blurry, ugly, distorted, low resolution, bad anatomy | Excludes what you don't want |
| EmptyLatentImage | width/height | 512x512 | Resolution; keep it small at first (larger is slower) |
| KSampler | sampler_name/steps/cfg | dpmpp_2m / 20 / 7 | Sampler (most stable) / steps (speed-quality balance) / prompt strength |
Step 3: Run the workflow and generate!
Click "Queue Prompt" at the top. The terminal shows generation progress; when done, the output panel on the right previews the image, and a copy is saved automatically to ComfyUI/output.
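Under the hood, "Queue Prompt" simply POSTs the workflow to ComfyUI's built-in HTTP API (`/prompt` on port 8188), so generation can also be scripted. The sketch below assumes a workflow dict exported via the UI's "Save (API Format)" option; it returns `None` when the server is not reachable:

```python
import json
import urllib.request

def queue_prompt(workflow: dict, host: str = "127.0.0.1", port: int = 8188):
    """POST a workflow to ComfyUI's /prompt endpoint; None if unreachable."""
    req = urllib.request.Request(
        f"http://{host}:{port}/prompt",
        data=json.dumps({"prompt": workflow}).encode(),
        headers={"Content-Type": "application/json"},
    )
    try:
        with urllib.request.urlopen(req, timeout=5) as resp:
            return json.loads(resp.read())  # includes the queued prompt_id
    except OSError:  # covers URLError, refused connections, timeouts
        return None

print(queue_prompt({}))  # None unless ComfyUI is listening on 8188
```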
5. Hands-On 2: A Classical-Poem Character Video in 10 Minutes (Lightweight LTXV)
Using a "poem character smiles and blinks" example, reuse a lightweight LTXV workflow for image-to-video, with special attention to the shift parameters that control how natural the motion looks.
Step 1: Install the required plugins (one click via ComfyUI-Manager)
- Install Manager: in a terminal, enter the `custom_nodes` directory, run `git clone https://github.com/ltdrdata/ComfyUI-Manager.git`, then restart ComfyUI;
- Install the LTXV plugin: click "Manager" → "Custom Nodes Manager" → search for "LTX-Video" → click "Install", then restart for it to take effect.
Step 2: Load the LTXV beginner workflow (core nodes labeled)
Download the poem-character video workflow JSON (the full structure is in the appendix) and click "Load" to import it. The core nodes:
Figure 5: LTXV image-to-video workflow (core nodes and parameters)
[LoadImage] → [CLIPVisionEncode] → [LTXVImgToVideo] → [VAEDecode] → [SaveVideo]
      ↓                  ↓
[CheckpointLoaderSimple (LTXV model, see appendix)] → [LTXVImgToVideo]
Key parameters (these determine how natural the motion looks):
- `Length`: 72 (3 seconds at 24 FPS, enough for a smile plus a blink);
- `Max Shift`: 4 (caps motion amplitude so movements don't get exaggerated);
- `Base Shift`: 1 (guarantees a slight baseline of motion);
- `ltxv_strength`: 0.9 (locks onto the character's features so they don't drift).
Step 3: Run and refine
- Upload the poem-character illustration (click "Upload" on the `LoadImage` node);
- Click "Queue Prompt"; generation takes roughly 5-8 minutes on a Mac M-series;
- If the motion looks unnatural, lower `Max Shift` to 3 and raise `ltxv_strength` to 1.0.
6. Three Efficiency Tips Worth Learning
6.1 Finding plugins and models fast (avoiding dead ends)
- Missing plugin: search the node name (e.g. "LTXV", "IP-Adapter") in Manager and install with one click;
- Missing model: when the error log says "model not found", check the path against Figure 3 and drop the matching model into that folder.
6.2 Saving and reusing workflows (the core advantage)
- Save: once a workflow is built, click "Save" and name it (e.g. `poem_character_video.json`) to reload it any time;
- Share: send the JSON file to someone else; once they have the same models, they can reproduce the result with one click.
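The saved JSON is also easy to inspect programmatically: it is just `nodes` plus `links`, where each link row wires one node's output slot to another node's input slot (the same shape as the appendix at the end of this article). A minimal fragment:

```python
import json

workflow = {
    "nodes": [
        {"id": 16, "type": "CLIPLoader"},
        {"id": 3, "type": "CLIPTextEncode"},
    ],
    # each link: [link_id, from_node, from_output_slot, to_node, to_input_slot, type]
    "links": [[24, 16, 0, 3, 0, "CLIP"]],
}

print(json.dumps([n["type"] for n in workflow["nodes"]]))
for link_id, src, src_slot, dst, dst_slot, ltype in workflow["links"]:
    print(f"link {link_id}: node {src}[{src_slot}] -> node {dst}[{dst_slot}] ({ltype})")
```

This is why sharing works: the JSON carries the whole graph, and only the model files need to be supplied on the other end.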
6.3 Mac M-series notes
On Apple Silicon, PyTorch's Metal (MPS) backend is used automatically, so no special flag is needed. If you run into precision or memory issues, forcing half precision can help:
python3 main.py --force-fp16
7. Pitfall Guide (Frequent Beginner Questions)
| Problem | Cause | Fix |
|---|---|---|
| Startup error "No module named xxx" | Missing dependency | Run pip install xxx (with a mirror if needed), or reinstall the requirements |
| Black or garbled images | Wrong model path / bad parameters | Check paths against Figure 3; set steps to 20+ and cfg to 6-8 |
| SVD/LTXV reports missing CLIP Vision | Vision encoder in the wrong folder | Put the model in models/clip_vision and restart ComfyUI |
| Mac: port 8188 unreachable after launch | Port already in use | Run lsof -i :8188, kill the occupying process, and restart |
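For the last row, here is a quick way to check whether 8188 is already taken before reaching for `lsof` (works on both platforms); if it is, ComfyUI can also be moved to another port with `python main.py --port 8189`:

```python
import socket

def port_in_use(port: int, host: str = "127.0.0.1") -> bool:
    """True if something is already listening on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        s.settimeout(1)
        # connect_ex returns 0 on success, an errno on failure
        return s.connect_ex((host, port)) == 0

print(port_in_use(8188))  # True -> free the port or use --port to move ComfyUI
```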
8. Summary and What to Learn Next
This guide covered the full beginner path: install → models → images → video. From here, a sensible basic → intermediate → advanced progression is:
- Basics: master samplers, prompts, and LoRA weights to raise image quality;
- Intermediate: learn ControlNet for composition control and IP-Adapter for locking character features, enabling precise generation;
- Advanced: build custom nodes and AI-agent video workflows on top of ComfyUI for production use.
As a general-purpose tool for AI generation, ComfyUI's flexibility goes well beyond what fits in one article. I'll keep sharing follow-ups on Juejin covering production deployment, hands-on LoRA training, and multimodal workflows; follow along to explore more of what AI generation can do!
Appendix: Core Workflow JSON for the LTXV Poem-Character Video (Simplified)
- Source image and result video: (images omitted)
- Workflow: copy the JSON below into ComfyUI, supply the models, and it is ready to run:
{
"id": "00000000-0000-0000-0000-000000000000",
"revision": 0,
"last_node_id": 16,
"last_link_id": 46,
"nodes": [
{
"id": 1,
"type": "CheckpointLoaderSimple",
"pos": [
100,
130
],
"size": [
273.4576171875,
98
],
"flags": {},
"order": 0,
"mode": 0,
"inputs": [],
"outputs": [
{
"name": "MODEL",
"type": "MODEL",
"links": [
32
]
},
{
"name": "CLIP",
"type": "CLIP",
"links": null
},
{
"name": "VAE",
"type": "VAE",
"links": [
30,
44
]
}
],
"properties": {
"cnr_id": "comfy-core",
"ver": "0.12.2",
"Node name for S&R": "CheckpointLoaderSimple"
},
"widgets_values": [
"ltx-video-2b-v0.9.5.safetensors"
]
},
{
"id": 5,
"type": "LTXVConditioning",
"pos": [
973.4576171875,
130
],
"size": [
270,
78
],
"flags": {},
"order": 7,
"mode": 0,
"inputs": [
{
"name": "positive",
"type": "CONDITIONING",
"link": 26
},
{
"name": "negative",
"type": "CONDITIONING",
"link": 27
}
],
"outputs": [
{
"name": "positive",
"type": "CONDITIONING",
"links": [
28
]
},
{
"name": "negative",
"type": "CONDITIONING",
"links": [
29
]
}
],
"properties": {
"cnr_id": "comfy-core",
"ver": "0.12.2",
"Node name for S&R": "LTXVConditioning"
},
"widgets_values": [
25
]
},
{
"id": 7,
"type": "ModelSamplingLTXV",
"pos": [
1713.4576171875,
130
],
"size": [
270,
102
],
"flags": {},
"order": 9,
"mode": 0,
"inputs": [
{
"name": "model",
"type": "MODEL",
"link": 32
},
{
"name": "latent",
"shape": 7,
"type": "LATENT",
"link": 33
}
],
"outputs": [
{
"name": "MODEL",
"type": "MODEL",
"links": [
35
]
}
],
"properties": {
"cnr_id": "comfy-core",
"ver": "0.12.2",
"Node name for S&R": "ModelSamplingLTXV"
},
"widgets_values": [
2.05,
0.95
]
},
{
"id": 10,
"type": "RandomNoise",
"pos": [
100,
590
],
"size": [
270,
82
],
"flags": {},
"order": 1,
"mode": 0,
"inputs": [],
"outputs": [
{
"name": "NOISE",
"type": "NOISE",
"links": [
38
]
}
],
"properties": {
"cnr_id": "comfy-core",
"ver": "0.12.2",
"Node name for S&R": "RandomNoise"
},
"widgets_values": [
137920484619877,
"randomize"
]
},
{
"id": 11,
"type": "KSamplerSelect",
"pos": [
100,
802
],
"size": [
270,
58
],
"flags": {},
"order": 2,
"mode": 0,
"inputs": [],
"outputs": [
{
"name": "SAMPLER",
"type": "SAMPLER",
"links": [
40
]
}
],
"properties": {
"cnr_id": "comfy-core",
"ver": "0.12.2",
"Node name for S&R": "KSamplerSelect"
},
"widgets_values": [
"euler"
]
},
{
"id": 12,
"type": "SamplerCustomAdvanced",
"pos": [
2453.4576171875,
130
],
"size": [
179.9,
106
],
"flags": {},
"order": 12,
"mode": 0,
"inputs": [
{
"name": "noise",
"type": "NOISE",
"link": 38
},
{
"name": "guider",
"type": "GUIDER",
"link": 39
},
{
"name": "sampler",
"type": "SAMPLER",
"link": 40
},
{
"name": "sigmas",
"type": "SIGMAS",
"link": 41
},
{
"name": "latent_image",
"type": "LATENT",
"link": 42
}
],
"outputs": [
{
"name": "output",
"type": "LATENT",
"links": [
43
]
},
{
"name": "denoised_output",
"type": "LATENT",
"links": null
}
],
"properties": {
"cnr_id": "comfy-core",
"ver": "0.12.2",
"Node name for S&R": "SamplerCustomAdvanced"
}
},
{
"id": 13,
"type": "VAEDecode",
"pos": [
2733.3576171875,
130
],
"size": [
140,
46
],
"flags": {},
"order": 13,
"mode": 0,
"inputs": [
{
"name": "samples",
"type": "LATENT",
"link": 43
},
{
"name": "vae",
"type": "VAE",
"link": 44
}
],
"outputs": [
{
"name": "IMAGE",
"type": "IMAGE",
"links": [
45
]
}
],
"properties": {
"cnr_id": "comfy-core",
"ver": "0.12.2",
"Node name for S&R": "VAEDecode"
}
},
{
"id": 14,
"type": "CreateVideo",
"pos": [
2973.3576171875,
130
],
"size": [
270,
78
],
"flags": {},
"order": 14,
"mode": 0,
"inputs": [
{
"name": "images",
"type": "IMAGE",
"link": 45
},
{
"name": "audio",
"shape": 7,
"type": "AUDIO",
"link": null
}
],
"outputs": [
{
"name": "VIDEO",
"type": "VIDEO",
"links": [
46
]
}
],
"properties": {
"cnr_id": "comfy-core",
"ver": "0.12.2",
"Node name for S&R": "CreateVideo"
},
"widgets_values": [
25
]
},
{
"id": 15,
"type": "SaveVideo",
"pos": [
3343.3576171875,
130
],
"size": [
270,
368
],
"flags": {},
"order": 15,
"mode": 0,
"inputs": [
{
"name": "video",
"type": "VIDEO",
"link": 46
}
],
"outputs": [],
"properties": {
"cnr_id": "comfy-core",
"ver": "0.12.2"
},
"widgets_values": [
"poetry/poetry_char",
"auto",
"auto"
]
},
{
"id": 16,
"type": "CLIPLoader",
"pos": [
100,
990
],
"size": [
270,
106
],
"flags": {},
"order": 3,
"mode": 0,
"inputs": [],
"outputs": [
{
"name": "CLIP",
"type": "CLIP",
"links": [
24,
25
]
}
],
"properties": {
"cnr_id": "comfy-core",
"ver": "0.12.2",
"Node name for S&R": "CLIPLoader"
},
"widgets_values": [
"t5xxl_fp16.safetensors",
"ltxv",
"default"
]
},
{
"id": 2,
"type": "LoadImage",
"pos": [
100,
358
],
"size": [
270,
314
],
"flags": {},
"order": 4,
"mode": 0,
"inputs": [],
"outputs": [
{
"name": "IMAGE",
"type": "IMAGE",
"links": [
31
]
},
{
"name": "MASK",
"type": "MASK",
"links": null
}
],
"properties": {
"cnr_id": "comfy-core",
"ver": "0.12.2",
"Node name for S&R": "LoadImage"
},
"widgets_values": [
"Q8zD8BqXRJzjcYQGCtZdsA8edc252b044daa9c18a4160a62c9c54d.jpg",
"image"
]
},
{
"id": 6,
"type": "LTXVImgToVideo",
"pos": [
1343.4576171875,
130
],
"size": [
270,
214
],
"flags": {},
"order": 8,
"mode": 0,
"inputs": [
{
"name": "positive",
"type": "CONDITIONING",
"link": 28
},
{
"name": "negative",
"type": "CONDITIONING",
"link": 29
},
{
"name": "vae",
"type": "VAE",
"link": 30
},
{
"name": "image",
"type": "IMAGE",
"link": 31
}
],
"outputs": [
{
"name": "positive",
"type": "CONDITIONING",
"links": [
36
]
},
{
"name": "negative",
"type": "CONDITIONING",
"links": [
37
]
},
{
"name": "latent",
"type": "LATENT",
"links": [
33,
34,
42
]
}
],
"properties": {
"cnr_id": "comfy-core",
"ver": "0.12.2",
"Node name for S&R": "LTXVImgToVideo"
},
"widgets_values": [
448,
448,
57,
1,
0.9
]
},
{
"id": 8,
"type": "LTXVScheduler",
"pos": [
1714.337744140625,
362.515625
],
"size": [
270,
154
],
"flags": {},
"order": 10,
"mode": 0,
"inputs": [
{
"name": "latent",
"shape": 7,
"type": "LATENT",
"link": 34
}
],
"outputs": [
{
"name": "SIGMAS",
"type": "SIGMAS",
"links": [
41
]
}
],
"properties": {
"cnr_id": "comfy-core",
"ver": "0.12.2",
"Node name for S&R": "LTXVScheduler"
},
"widgets_values": [
25,
2.05,
0.95,
true,
0.1
]
},
{
"id": 3,
"type": "CLIPTextEncode",
"pos": [
473.4576171875,
130
],
"size": [
400,
200
],
"flags": {},
"order": 5,
"mode": 0,
"inputs": [
{
"name": "clip",
"type": "CLIP",
"link": 24
}
],
"outputs": [
{
"name": "CONDITIONING",
"type": "CONDITIONING",
"links": [
26
]
}
],
"properties": {
"cnr_id": "comfy-core",
"ver": "0.12.2",
"Node name for S&R": "CLIPTextEncode"
},
"widgets_values": [
"natural blinking, gentle blink, subtle smile, soft smile, slight smile, calm expression, subtle facial movement, minimal motion, eyes close and open gently"
]
},
{
"id": 4,
"type": "CLIPTextEncode",
"pos": [
473.4576171875,
460
],
"size": [
400,
200
],
"flags": {},
"order": 6,
"mode": 0,
"inputs": [
{
"name": "clip",
"type": "CLIP",
"link": 25
}
],
"outputs": [
{
"name": "CONDITIONING",
"type": "CONDITIONING",
"links": [
27
]
}
],
"properties": {
"cnr_id": "comfy-core",
"ver": "0.12.2",
"Node name for S&R": "CLIPTextEncode"
},
"widgets_values": [
"no expression, blank stare, stiff face, frozen, wide open eyes, exaggerated smile, open mouth, distorted face, deformation"
]
},
{
"id": 9,
"type": "CFGGuider",
"pos": [
2083.4576171875,
130
],
"size": [
270,
98
],
"flags": {},
"order": 11,
"mode": 0,
"inputs": [
{
"name": "model",
"type": "MODEL",
"link": 35
},
{
"name": "positive",
"type": "CONDITIONING",
"link": 36
},
{
"name": "negative",
"type": "CONDITIONING",
"link": 37
}
],
"outputs": [
{
"name": "GUIDER",
"type": "GUIDER",
"links": [
39
]
}
],
"properties": {
"cnr_id": "comfy-core",
"ver": "0.12.2",
"Node name for S&R": "CFGGuider"
},
"widgets_values": [
5
]
}
],
"links": [
[
24,
16,
0,
3,
0,
"CLIP"
],
[
25,
16,
0,
4,
0,
"CLIP"
],
[
26,
3,
0,
5,
0,
"CONDITIONING"
],
[
27,
4,
0,
5,
1,
"CONDITIONING"
],
[
28,
5,
0,
6,
0,
"CONDITIONING"
],
[
29,
5,
1,
6,
1,
"CONDITIONING"
],
[
30,
1,
2,
6,
2,
"VAE"
],
[
31,
2,
0,
6,
3,
"IMAGE"
],
[
32,
1,
0,
7,
0,
"MODEL"
],
[
33,
6,
2,
7,
1,
"LATENT"
],
[
34,
6,
2,
8,
0,
"LATENT"
],
[
35,
7,
0,
9,
0,
"MODEL"
],
[
36,
6,
0,
9,
1,
"CONDITIONING"
],
[
37,
6,
1,
9,
2,
"CONDITIONING"
],
[
38,
10,
0,
12,
0,
"NOISE"
],
[
39,
9,
0,
12,
1,
"GUIDER"
],
[
40,
11,
0,
12,
2,
"SAMPLER"
],
[
41,
8,
0,
12,
3,
"SIGMAS"
],
[
42,
6,
2,
12,
4,
"LATENT"
],
[
43,
12,
0,
13,
0,
"LATENT"
],
[
44,
1,
2,
13,
1,
"VAE"
],
[
45,
13,
0,
14,
0,
"IMAGE"
],
[
46,
14,
0,
15,
0,
"VIDEO"
]
],
"groups": [],
"config": {},
"extra": {
"ds": {
"scale": 1,
"offset": [
-623.4166412353516,
181.94795989990234
]
},
"frontendVersion": "1.37.11"
},
"version": 0.4
}