Preface: I did a fair amount of deep learning work back in graduate school, but at the time I only half understood much of it. It felt hard, I didn't stick with it, and in the two years I've been working since then I've had essentially no contact with AI.
But the outside world moves fast. Large models arrived, then ChatGPT, then AIGC, and I felt increasingly out of touch with all of it, so it was clearly time to pick the subject back up.
By chance I discovered that TuSimple, the company behind the Tusimple dataset I used so often, has pivoted entirely away from autonomous driving into an AI business (iamcreate.ai/en-US) and released an open-source project, the Ruyi model (github.com/IamCreateAI…). That seemed like the perfect occasion to start learning again.
My machine is a Mac with an M2 Pro chip. I expected hands-on practice to be difficult, but not this difficult: in the end I found the Mac simply could not run the model...
1 The straightforward approach
My initial assumption was that running a model just means cloning the repo and executing the corresponding .py script, so I started by following the official GitHub instructions:
git clone https://github.com/IamCreateAI/Ruyi-Models
cd Ruyi-Models
pip install -r requirements.txt
But the install failed with an error:
(.venv) ~/ruyi/Ruyi-Models git:[main]
pip install decord
ERROR: Could not find a version that satisfies the requirement decord (from versions: none)
ERROR: No matching distribution found for decord
After reading through a pile of pages on this, the cause turned out to be that decord no longer ships an installable build for Python 3.9 and later in this environment. Many fixes are suggested online, but most of them didn't work; what finally solved it for me was replacing decord with pip install eva-decord, a fork that installs under the same module name.
But actually running the script then failed with a CUDA error:
(.venv) ~/ruyi/Ruyi-Models git:[main]
python3 predict_i2v.py
/Users/bytedance/ruyi/Ruyi-Models/.venv/lib/python3.9/site-packages/urllib3/__init__.py:35: NotOpenSSLWarning: urllib3 v2 only supports OpenSSL 1.1.1+, currently the 'ssl' module is compiled with 'LibreSSL 2.8.3'. See: https://github.com/urllib3/urllib3/issues/3020
warnings.warn(
Checking for Ruyi-Mini-7B updates ...
Fetching 12 files: 100%|███████████████████████████████████████████████████████████████████████| 12/12 [00:00<00:00, 2948.54it/s]
Vae loaded ...
Transformer loaded ...
Loading pipeline components...: 0it [00:00, ?it/s]
Pipeline loaded ...
Traceback (most recent call last):
File "/Users/bytedance/ruyi/Ruyi-Models/predict_i2v.py", line 220, in <module>
generator= torch.Generator(device).manual_seed(seed)
RuntimeError: Device type CUDA is not supported for torch.Generator() api.
A Mac has no NVIDIA GPU and cannot install CUDA, so I changed line 58 from device = torch.device("cuda") to device = torch.device("mps"): CUDA is NVIDIA's GPU compute platform, while MPS (Metal Performance Shaders) is Apple's own equivalent. But execution still failed:
.venv➜ Ruyi-Models git:(main) ✗ python3 predict_i2v.py
/Users/bytedance/ruyi/Ruyi-Models/.venv/lib/python3.9/site-packages/urllib3/__init__.py:35: NotOpenSSLWarning: urllib3 v2 only supports OpenSSL 1.1.1+, currently the 'ssl' module is compiled with 'LibreSSL 2.8.3'. See: https://github.com/urllib3/urllib3/issues/3020
warnings.warn(
Checking for Ruyi-Mini-7B updates ...
Fetching 12 files: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████| 12/12 [00:00<00:00, 2122.71it/s]
Vae loaded ...
Transformer loaded ...
Loading pipeline components...: 0it [00:00, ?it/s]
Pipeline loaded ...
Traceback (most recent call last):
File "/Users/bytedance/ruyi/Ruyi-Models/predict_i2v.py", line 232, in <module>
sample = pipeline(
File "/Users/bytedance/ruyi/Ruyi-Models/.venv/lib/python3.9/site-packages/torch/utils/_contextlib.py", line 116, in decorate_context
return func(*args, **kwargs)
File "/Users/bytedance/ruyi/Ruyi-Models/ruyi/pipeline/pipeline_ruyi_inpaint.py", line 726, in __call__
self.scheduler.set_timesteps(num_inference_steps, device=device)
File "/Users/bytedance/ruyi/Ruyi-Models/.venv/lib/python3.9/site-packages/diffusers/schedulers/scheduling_ddim.py", line 340, in set_timesteps
self.timesteps = torch.from_numpy(timesteps).to(device)
File "/Users/bytedance/ruyi/Ruyi-Models/.venv/lib/python3.9/site-packages/torch/cuda/__init__.py", line 310, in _lazy_init
raise AssertionError("Torch not compiled with CUDA enabled")
AssertionError: Torch not compiled with CUDA enabled
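In hindsight, that one-line edit can be made more robust by picking the device at runtime instead of hardcoding it (though, as the second traceback shows, other code paths in the repo still assume CUDA). A minimal sketch, assuming PyTorch is installed; pick_device is my own helper, not part of the Ruyi repo:

```python
import torch

# My own helper (not in the Ruyi repo): prefer CUDA, fall back to
# Apple's MPS backend, and finally to plain CPU.
def pick_device() -> torch.device:
    if torch.cuda.is_available():
        return torch.device("cuda")
    if torch.backends.mps.is_available():
        return torch.device("mps")
    return torch.device("cpu")

device = pick_device()
# torch.Generator works on "cpu" everywhere and on "mps" in recent
# PyTorch builds; only "cuda" fails on a build without CUDA support.
generator = torch.Generator(device).manual_seed(42)
```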
Seeing these errors, I realized that a Mac is probably not a good choice for running deep learning workloads. I had already sunk a lot of time in, and further patching looked endless, so I abandoned this approach.
2 ComfyUI
The straightforward approach was a dead end, but fortunately the official instructions also cover a ComfyUI-based method.
Having been away from deep learning for over two years, my first reaction was pure confusion: what is this thing, what is it for? But the deeper I read, the more impressive it looked. A lot of people doing AI work use it, so it seemed well worth installing and learning.
2.1 Installing ComfyUI
My machine is an M2 Pro Mac.
For the installation I followed the pointers to the ComfyUI GitHub repo and found the relevant section.
It doesn't look like much text, but none of the steps are entirely painless.
2.1.1 Installing PyTorch
developer.apple.com/metal/pytor…
Following the link leads to Apple's page. The gist is that Apple developed MPS so that PyTorch can also run on Macs.
Steps to install the MPS-enabled build of PyTorch:
Step 1: run xcode-select --install in Terminal.
Step 2: install PyTorch: pip3 install --pre torch torchvision torchaudio --extra-index-url https://download.pytorch.org/whl/nightly/cpu
The install went smoothly, and you can verify it following the official guide.
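The official verification amounts to creating a tensor on the MPS device. A rough check along those lines (my own sketch, assuming the install above succeeded):

```python
import torch

# Confirm the MPS backend is usable: is_built() says whether this PyTorch
# build includes MPS at all, is_available() whether it can run on this machine.
if torch.backends.mps.is_available():
    x = torch.ones(1, device="mps")
    print(x)  # e.g. tensor([1.], device='mps:0')
else:
    print("MPS unavailable; built with MPS support:",
          torch.backends.mps.is_built())
```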
2.1.2 Installing ComfyUI
This really is just cloning the repository:
git clone https://github.com/comfyanonymous/ComfyUI.git
No problems here.
2.1.3 Installing the dependencies
I opened the repo in an IDE, in my case JetBrains PyCharm, which automatically creates a .venv folder for a Python project, giving the project its own isolated environment.
I'm using Python 3.9.
In a terminal at the project root, run pip install -r requirements.txt.
No problems here either.
2.1.4 Running it
Still in that terminal, run python main.py.
The terminal then printed:
(.venv) ~/ruyi/ComfyUI git:[master]
python main.py
Total VRAM 32768 MB, total RAM 32768 MB
pytorch version: 2.5.1
Set vram state to: SHARED
Device: mps
Using sub quadratic optimization for attention, if you have memory or speed issues try using: --use-split-cross-attention
/Users/bytedance/ruyi/ComfyUI/.venv/lib/python3.9/site-packages/urllib3/__init__.py:35: NotOpenSSLWarning: urllib3 v2 only supports OpenSSL 1.1.1+, currently the 'ssl' module is compiled with 'LibreSSL 2.8.3'. See: https://github.com/urllib3/urllib3/issues/3020
warnings.warn(
****** User settings have been changed to be stored on the server instead of browser storage. ******
****** For multi-user setups add the --multi-user CLI argument to enable multiple user profiles. ******
[Prompt Server] web root: /Users/bytedance/ruyi/ComfyUI/web
Import times for custom nodes:
0.0 seconds: /Users/bytedance/ruyi/ComfyUI/custom_nodes/websocket_image_save.py
Starting server
To see the GUI go to: http://127.0.0.1:8188
Opening that local address brings up the ComfyUI interface.
Clicking run at this point produced an error, roughly saying the ckpt content doesn't exist. Fine, skip that for now and install the rest first.
Honestly I'm still lost here: I don't know what a checkpoint is, nor what Model, CLIP, or VAE mean. I'll learn as I go.
2.1.5 Other references
I also found another ComfyUI installation tutorial at comfyui-wiki.com/en/install/…
but it was far too long, so I didn't use it.
2.2 Installing ComfyUI Manager
As the official Ruyi instructions say, besides ComfyUI itself you also need ComfyUI Manager.
What is it? It's a plugin for ComfyUI, used to manage ComfyUI itself (its custom nodes, models, and so on).
Installation is simple:
Step 1: cd ComfyUI/custom_nodes
Step 2: clone the repo: git clone https://github.com/ltdrdata/ComfyUI-Manager.git
Step 3: install its dependencies: pip install -r ComfyUI-Manager/requirements.txt
Step 4: run python main.py
On my first run there was an error saying the port was already in use, because the instance I had started earlier was still running:
(.venv) ~/ruyi/ComfyUI git:[master]
python main.py
[START] Security scan
[DONE] Security scan
## ComfyUI-Manager: installing dependencies done.
** ComfyUI startup time: 2024-12-28 00:47:22.526382
** Platform: Darwin
** Python version: 3.9.6 (default, Aug 11 2023, 19:44:49)
[Clang 15.0.0 (clang-1500.0.40.1)]
** Python executable: /Users/bytedance/ruyi/ComfyUI/.venv/bin/python
** ComfyUI Path: /Users/bytedance/ruyi/ComfyUI
** Log path: /Users/bytedance/ruyi/ComfyUI/comfyui.log
[notice] A new release of pip is available: 23.2.1 -> 24.3.1
[notice] To update, run: pip install --upgrade pip
[notice] A new release of pip is available: 23.2.1 -> 24.3.1
[notice] To update, run: pip install --upgrade pip
Prestartup times for custom nodes:
1.2 seconds: /Users/bytedance/ruyi/ComfyUI/custom_nodes/ComfyUI-Manager
Total VRAM 32768 MB, total RAM 32768 MB
pytorch version: 2.5.1
Set vram state to: SHARED
Device: mps
Using sub quadratic optimization for attention, if you have memory or speed issues try using: --use-split-cross-attention
[Prompt Server] web root: /Users/bytedance/ruyi/ComfyUI/web
### Loading: ComfyUI-Manager (V2.55.5)
### ComfyUI Version: v0.3.10-4-g4b5bcd8 | Released on '2024-12-27'
Import times for custom nodes:
0.0 seconds: /Users/bytedance/ruyi/ComfyUI/custom_nodes/websocket_image_save.py
0.1 seconds: /Users/bytedance/ruyi/ComfyUI/custom_nodes/ComfyUI-Manager
Starting server
Traceback (most recent call last):
File "/Users/bytedance/ruyi/ComfyUI/main.py", line 295, in <module>
event_loop.run_until_complete(start_all_func())
File "/Library/Developer/CommandLineTools/Library/Frameworks/Python3.framework/Versions/3.9/lib/python3.9/asyncio/base_events.py", line 642, in run_until_complete
return future.result()
File "/Users/bytedance/ruyi/ComfyUI/main.py", line 285, in start_all
await run(prompt_server, address=args.listen, port=args.port, verbose=not args.dont_print_server, call_on_start=call_on_start)
File "/Users/bytedance/ruyi/ComfyUI/main.py", line 214, in run
await asyncio.gather(server_instance.start_multi_address(addresses, call_on_start), server_instance.publish_loop())
File "/Users/bytedance/ruyi/ComfyUI/server.py", line 822, in start_multi_address
await site.start()
File "/Users/bytedance/ruyi/ComfyUI/.venv/lib/python3.9/site-packages/aiohttp/web_runner.py", line 121, in start
self._server = await loop.create_server(
File "/Library/Developer/CommandLineTools/Library/Frameworks/Python3.framework/Versions/3.9/lib/python3.9/asyncio/base_events.py", line 1494, in create_server
raise OSError(err.errno, 'error while attempting '
OSError: [Errno 48] error while attempting to bind on address ('127.0.0.1', 8188): address already in use
[ComfyUI-Manager] default cache updated: https://raw.githubusercontent.com/ltdrdata/ComfyUI-Manager/main/alter-list.json
[ComfyUI-Manager] default cache updated: https://raw.githubusercontent.com/ltdrdata/ComfyUI-Manager/main/model-list.json
[ComfyUI-Manager] default cache updated: https://raw.githubusercontent.com/ltdrdata/ComfyUI-Manager/main/github-stats.json
[ComfyUI-Manager] default cache updated: https://raw.githubusercontent.com/ltdrdata/ComfyUI-Manager/main/custom-node-list.json
[ComfyUI-Manager] default cache updated: https://raw.githubusercontent.com/ltdrdata/ComfyUI-Manager/main/extension-node-map.json
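The [Errno 48] above simply means the first ComfyUI instance was still holding port 8188. Either stop that process, or launch the new one elsewhere with python main.py --port 8189 (ComfyUI accepts a --port argument). A small stdlib check, where port_in_use is my own helper:

```python
import socket

# True if something is already listening on host:port, i.e. the situation
# behind "[Errno 48] ... address already in use".
def port_in_use(port: int, host: str = "127.0.0.1") -> bool:
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        s.settimeout(1.0)
        return s.connect_ex((host, port)) == 0

if port_in_use(8188):
    print("Port 8188 is taken; stop the other instance or use --port.")
```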
I then closed PyCharm, reopened it, and found a new Manager tab in the top-right corner of the UI. So that's what installing the plugin gets you.
2.3 Installing Ruyi
With ComfyUI Manager in place, installing Ruyi is very easy: open Custom Nodes Manager (the button in the middle row), search for ruyi, and click Install.
After Ruyi is installed, search for ComfyUI-VideoHelperSuite and install it the same way.
Once both are installed, restart as instructed and refresh the browser page.
2.4 Running Ruyi
Up to this point I had not actually downloaded the Ruyi model itself; I wanted to walk through the rest of the flow first,
still following the official instructions.
2.4.1 Understanding Ruyi's nodes
In the node library in the far-left sidebar there are three Ruyi nodes, namely:
Ruyi
- Load Model
- Load LoRA
- Sampler for Image to Video
2.4.2 Load Model
Clicking it adds a Load Model card to the main canvas. For the details of each field, the official explanations are clear enough; quoting them here:
Load Model
- model: Select which model to use. Currently, Ruyi-Mini-7B is the only option.
- auto_download: Whether to automatically download. Defaults to yes. If the model is detected as missing or incomplete, it will automatically download the model to the ComfyUI/models/Ruyi path.
- auto_update: Whether to automatically check for and update the current model. Defaults to yes. When auto_download is enabled, the system will automatically check for updates to the model and download any updates to the ComfyUI/models/Ruyi directory. Please note that this feature relies on the caching mechanism of huggingface_hub, so do not delete the .cache folder in the model directory to ensure a smooth update process.
2.4.3 Load LoRA
This section left me confused. I don't actually know what a LoRA model is, or how it differs from the previous node's Model. The ComfyUI/models/loras
folder contains no models, I don't know where to download any, and unlike the previous node there is no auto_download option.
2.4.4 Sampler for Image to Video
This card has far too many parameters to understand in one sitting. What I've grasped so far is that start_image is required, while end_image is optional.
Questions and all, I went ahead and ran it anyway. Click Workflow > Open in the top-left corner and navigate to comfyui/workflows/. There are actually several workflows under this path. They are all similar, differing mainly in parameters, and the official notes (copied below) explain the differences; pick whichever fits. I chose workflow-ruyi-i2v-start-frame.json.
Workflow options
Image to Video (Starting Frame)
The workflow corresponds to the workflow-ruyi-i2v-start-frame.json file. For users with larger GPU memory, you can also use workflow-ruyi-i2v-start-frame-80g.json to enhance the generation speed.
Image to Video (Starting and Ending Frames)
The workflow corresponds to the workflow-ruyi-i2v-start-end-frames.json file. For users with larger GPU memory, you can also use workflow-ruyi-i2v-start-end-frames-80g.json to enhance the generation speed.
Next, set up the Load Image card. There are sample images in the assets
folder.
Then click run. Sure enough, it errored:
the expected folder contained no model, and it supposedly couldn't auto-download because of a network problem, which was odd, since my network was fine.
I later realized the cause was probably that the ComfyUI/models/Ruyi
folder didn't exist, so I created it.
Clicking run again got further, but after only a few dozen seconds it failed with the same kind of error again.
So I ran
git clone git@hf.co:IamCreateAI/Ruyi-Mini-7B
but that failed too. Frustrating.
2.4.5 Downloading the model manually
This part is not hard to understand: download the model from the website (which was itself hard to reach) and put it in the ComfyUI/models/Ruyi
folder.
But as someone completely unfamiliar with the site, I opened it and had no idea where to download anything qaq
It's under the download options on the right: Clone repository.
2.4.5.1 Installing git lfs
What is git lfs? Git LFS is an extension to Git for versioning large files: it stores small pointer data in the repository describing the large files committed to it. It installs via Homebrew.
Step 1: run git --version
and make sure your version >= 1.8.3.1.
Step 2: run brew update
and brew install git-lfs.
Step 3: run git lfs install to set up the hooks.
Step 4: if you ever want to remove it, run git lfs uninstall.
2.4.5.2 Downloading the files without the large model files
Run:
GIT_LFS_SKIP_SMUDGE=1 git clone git@hf.co:IamCreateAI/Ruyi-Mini-7B
This step succeeded, but on its own it's useless to me, because I need the complete model.
2.4.5.3 Downloading the files including the large model files
Run:
git clone git@hf.co:IamCreateAI/Ruyi-Mini-7B
But this failed. Frustrating:
➜ ruyi git clone git@hf.co:IamCreateAI/Ruyi-Mini-7B
Cloning into 'Ruyi-Mini-7B'...
Enter passphrase for key '/Users/bytedance/.ssh/id_ed25519':
Enter passphrase for key '/Users/bytedance/.ssh/id_ed25519':
remote: Enumerating objects: 48, done.
remote: Counting objects: 100% (48/48), done.
remote: Compressing objects: 100% (28/28), done.
remote: Total 48 (delta 17), reused 48 (delta 17), pack-reused 0 (from 0)
Receiving objects: 100% (48/48), 10.13 KiB | 1.69 MiB/s, done.
Resolving deltas: 100% (17/17), done.
Enter passphrase for key '/Users/bytedance/.ssh/id_ed25519':
Updating files: 100% (12/12), done.
Enter passphrase for key '/Users/bytedance/.ssh/id_ed25519':
Downloading embeddings.safetensors (75 MB)
Error downloading object: embeddings.safetensors (b2e2591): Smudge error: Error downloading embeddings.safetensors (b2e2591798abd3b934815a2c737b75b1a6555728ca19b68af9ce1d53cc7878d5): batch response: Post "https://huggingface.co/IamCreateAI/Ruyi-Mini-7B.git/info/lfs/objects/batch": read tcp 10.254.154.129:60703->13.35.202.34:443: read: connection reset by peer
Errors logged to '/Users/bytedance/ruyi/Ruyi-Mini-7B/.git/lfs/logs/20241228T024727.290984.log'.
Use `git lfs logs last` to view the log.
error: external filter 'git-lfs filter-process' failed
fatal: embeddings.safetensors: smudge filter lfs failed
warning: Clone succeeded, but checkout failed.
You can inspect what was checked out with 'git status'
and retry with 'git restore --source=HEAD :/'
Solution: discuss.huggingface.co/t/problem-w…
The fix is to not use git lfs and use huggingface-cli instead.
2.4.5.4 Downloading the complete model with huggingface-cli
After much more searching, using huggingface-cli directly seemed best.
Step 1: install it: brew install huggingface-cli
Step 2: log in. This is where I got stuck; I could not log in no matter what I tried.
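Worth noting: Ruyi-Mini-7B is a public repository, so in principle no login is needed at all. The huggingface_hub Python API (which the CLI is built on) can download a complete snapshot anonymously; a sketch, where download_ruyi and the target path are my own assumptions:

```python
from huggingface_hub import snapshot_download

# Download every file of the public repo (no token/login required) into
# the folder ComfyUI expects. Adjust the path for a non-ComfyUI setup.
def download_ruyi(target: str = "ComfyUI/models/Ruyi/Ruyi-Mini-7B") -> str:
    return snapshot_download(
        repo_id="IamCreateAI/Ruyi-Mini-7B",
        local_dir=target,
    )

if __name__ == "__main__":
    print(download_ruyi())  # prints the local path once the download finishes
```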
2.4.5.5 A more convenient method
First run GIT_LFS_SKIP_SMUDGE=1 git clone git@hf.co:IamCreateAI/Ruyi-Mini-7B
to fetch everything except the large model files,
then download the large model files manually from the website.
2.4.5.6 Moving the downloaded files into the target folder
Target folder: ComfyUI/models/Ruyi
Once everything is in place, the file tree (per the official GitHub) is:
📦 Ruyi-Models/models/ or ComfyUI/models/Ruyi/
├── 📂 Ruyi-Mini-7B/
│ ├── 📂 transformers/
│ ├── 📂 vae/
│ └── 📂 ...
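Before restarting ComfyUI, a couple of lines of Python can sanity-check that layout; model_complete is my own helper, and the subfolder names it checks are taken from the tree above:

```python
from pathlib import Path

# My own check: the Ruyi-Mini-7B folder should sit under the models path
# and contain at least the transformers/ and vae/ subfolders.
def model_complete(models_root: str) -> bool:
    base = Path(models_root) / "Ruyi-Mini-7B"
    return all((base / sub).is_dir() for sub in ("transformers", "vae"))

print(model_complete("ComfyUI/models/Ruyi"))
```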
2.4.6 Checking run progress
You can watch the real-time progress in the Queue tab of the left sidebar.
2.5 Running Ruyi fails
Even with everything set up, running Ruyi through ComfyUI still failed, again with a CUDA-related error.
I did notice, though, that while the model was loading my machine became noticeably sluggish; even typing lagged.
3 Afterword
Although running the Ruyi model failed this time, my enthusiasm for AI has not been extinguished, and the attempt was not fruitless: at the very least I came away with a todo list.
- Comfy itself has plenty worth studying: www.comfy.org/
- Hugging Face: worth learning where they came from and their history. huggingface.co
- MPS and CUDA: get a rough idea of what each one is, and whether a Mac really cannot support model inference and training.
- The basic concepts behind these generation models: start from the word "diffusion", since so many models mention diffusion models.
- VAE and LoRA: shelved for now; these concepts will probably come up anyway while learning about diffusion models.