Relearning AI, Part 1: Starting with Ruyi


Preface: back in grad school I did a lot of deep learning work, but I only half understood much of it. It felt hard, I didn't stick with it, and in the two years I've been working since then I've had essentially no contact with AI.

But the outside world changes by the day — along came large models, ChatGPT, AIGC — and I feel more and more out of touch, so it's high time to pick the AI knowledge back up.

By chance I discovered that TuSimple, the company behind the Tusimple dataset I used all the time, has completely moved away from its autonomous driving business and pivoted to AI content generation (iamcreate.ai/en-US), releasing its own open-source model, Ruyi (github.com/IamCreateAI…). That seemed like the perfect occasion to start learning again.

My own machine is a Mac with an M2 Pro chip. I expected hands-on practice to be difficult, but not this difficult — in the end I discovered my Mac simply could not run the model at all…

1 The conventional way

My initial assumption was that running a model simply means cloning it and executing the right .py file, so I started by following the official GitHub instructions:

git clone https://github.com/IamCreateAI/Ruyi-Models
cd Ruyi-Models
pip install -r requirements.txt

But this hit an error:

(.venv) ~/ruyi/Ruyi-Models git:[main]
pip install decord
ERROR: Could not find a version that satisfies the requirement decord (from versions: none)
ERROR: No matching distribution found for decord

I dug through quite a few pages on this. The cause is that decord is reportedly no longer supported on Python 3.9 and later in this setup. Many of the proposed fixes didn't work; what finally solved it for me was installing pip install eva-decord in place of the decord package.

But actually running the script then failed with a CUDA error:

(.venv) ~/ruyi/Ruyi-Models git:[main]
python3 predict_i2v.py
/Users/bytedance/ruyi/Ruyi-Models/.venv/lib/python3.9/site-packages/urllib3/__init__.py:35: NotOpenSSLWarning: urllib3 v2 only supports OpenSSL 1.1.1+, currently the 'ssl' module is compiled with 'LibreSSL 2.8.3'. See: https://github.com/urllib3/urllib3/issues/3020
  warnings.warn(
Checking for Ruyi-Mini-7B updates ...
Fetching 12 files: 100%|███████████████████████████████████████████████████████████████████████| 12/12 [00:00<00:00, 2948.54it/s]
Vae loaded ...
Transformer loaded ...
Loading pipeline components...: 0it [00:00, ?it/s]
Pipeline loaded ...
Traceback (most recent call last):
  File "/Users/bytedance/ruyi/Ruyi-Models/predict_i2v.py", line 220, in <module>
    generator= torch.Generator(device).manual_seed(seed)
RuntimeError: Device type CUDA is not supported for torch.Generator() api.

A Mac has no NVIDIA GPU and cannot install CUDA, so on line 58 I changed device = torch.device("cuda") to device = torch.device("mps") — CUDA is the compute architecture for NVIDIA GPUs, while MPS is Apple's own backend for its GPUs. But running it still failed:

.venv➜  Ruyi-Models git:(main) ✗ python3 predict_i2v.py
/Users/bytedance/ruyi/Ruyi-Models/.venv/lib/python3.9/site-packages/urllib3/__init__.py:35: NotOpenSSLWarning: urllib3 v2 only supports OpenSSL 1.1.1+, currently the 'ssl' module is compiled with 'LibreSSL 2.8.3'. See: https://github.com/urllib3/urllib3/issues/3020
  warnings.warn(
Checking for Ruyi-Mini-7B updates ...
Fetching 12 files: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████| 12/12 [00:00<00:00, 2122.71it/s]
Vae loaded ...
Transformer loaded ...
Loading pipeline components...: 0it [00:00, ?it/s]
Pipeline loaded ...
Traceback (most recent call last):
  File "/Users/bytedance/ruyi/Ruyi-Models/predict_i2v.py", line 232, in <module>
    sample = pipeline(
  File "/Users/bytedance/ruyi/Ruyi-Models/.venv/lib/python3.9/site-packages/torch/utils/_contextlib.py", line 116, in decorate_context
    return func(*args, **kwargs)
  File "/Users/bytedance/ruyi/Ruyi-Models/ruyi/pipeline/pipeline_ruyi_inpaint.py", line 726, in __call__
    self.scheduler.set_timesteps(num_inference_steps, device=device)
  File "/Users/bytedance/ruyi/Ruyi-Models/.venv/lib/python3.9/site-packages/diffusers/schedulers/scheduling_ddim.py", line 340, in set_timesteps
    self.timesteps = torch.from_numpy(timesteps).to(device)
  File "/Users/bytedance/ruyi/Ruyi-Models/.venv/lib/python3.9/site-packages/torch/cuda/__init__.py", line 310, in _lazy_init
    raise AssertionError("Torch not compiled with CUDA enabled")
AssertionError: Torch not compiled with CUDA enabled

Looking at these errors, I realized that a Mac is probably not a great choice for running deep learning workloads. I had already sunk a lot of time into this, and further tinkering looked endless, so I abandoned this approach.
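For future reference, a more portable pattern is to probe for an available backend at runtime instead of hard-coding cuda. A minimal sketch (my own, not code from the Ruyi repo):

```python
import torch

def pick_device() -> torch.device:
    """Prefer CUDA, then Apple's MPS backend, then CPU."""
    if torch.cuda.is_available():
        return torch.device("cuda")
    if torch.backends.mps.is_available():
        return torch.device("mps")
    return torch.device("cpu")

device = pick_device()
# Some torch builds cannot create a Generator on MPS, so seed on CPU there;
# a CPU generator is commonly passed to pipelines even when the model runs on MPS.
gen_device = "cpu" if device.type == "mps" else device
generator = torch.Generator(gen_device).manual_seed(42)
print(device.type, generator.initial_seed())
```

This wouldn't have saved the later "Torch not compiled with CUDA enabled" crash deep inside the scheduler, but it avoids the very first Generator error.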

2 ComfyUI

The conventional route was a dead end, but fortunately the official instructions also cover a ComfyUI-based approach.

Having been away from deep learning for over two years, my first reaction was total bewilderment — what is this, what is it for? But the deeper I looked, the more impressive it seemed; a lot of people doing AI work use this tool, so it felt worth installing and studying.

2.1 Installing ComfyUI

Again, I'm on a Mac with an M2 Pro.

Following the instructions, I went to ComfyUI's GitHub and found the installation section.

It doesn't look like much text, but none of the steps are trivial.

2.1.1 Installing PyTorch

developer.apple.com/metal/pytor…

Following the instructions leads to this page, which explains that Apple built the MPS (Metal Performance Shaders) backend so that PyTorch can run on Macs.

Steps to install the MPS-enabled build of PyTorch:

Step 1: run xcode-select --install in Terminal

Step 2: install PyTorch: pip3 install --pre torch torchvision torchaudio --extra-index-url https://download.pytorch.org/whl/nightly/cpu

The install went smoothly, and you can verify it following the official guide.
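Apple's verification step boils down to allocating a tensor on the MPS device. A slightly defensive version (it falls back to CPU, so it also runs on machines without MPS):

```python
import torch

# Try to allocate a tensor on the MPS (Metal) backend; fall back to CPU
# if this build of PyTorch or this machine does not support it.
if torch.backends.mps.is_available():
    x = torch.ones(1, device="mps")
else:
    print("MPS not available; using CPU")
    x = torch.ones(1)
print(x)
```

If MPS is working, the printed tensor reports device='mps:0'.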

2.1.2 Installing ComfyUI

This is really just a matter of cloning the repository:

Run git clone https://github.com/comfyanonymous/ComfyUI.git

No problems here.

2.1.3 Installing the dependencies

Open the repo in an IDE — I use JetBrains PyCharm, which automatically creates a .venv folder for a Python project, giving the project its own isolated environment.

I'm using Python 3.9.

In a terminal at the project root, run pip install -r requirements.txt

Again, no problems.

2.1.4 Running it

Still in that terminal, run python main.py

The terminal then printed:

(.venv) ~/ruyi/ComfyUI git:[master]
python main.py
Total VRAM 32768 MB, total RAM 32768 MB
pytorch version: 2.5.1
Set vram state to: SHARED
Device: mps
Using sub quadratic optimization for attention, if you have memory or speed issues try using: --use-split-cross-attention
/Users/bytedance/ruyi/ComfyUI/.venv/lib/python3.9/site-packages/urllib3/__init__.py:35: NotOpenSSLWarning: urllib3 v2 only supports OpenSSL 1.1.1+, currently the 'ssl' module is compiled with 'LibreSSL 2.8.3'. See: https://github.com/urllib3/urllib3/issues/3020
  warnings.warn(
****** User settings have been changed to be stored on the server instead of browser storage. ******
****** For multi-user setups add the --multi-user CLI argument to enable multiple user profiles. ******
[Prompt Server] web root: /Users/bytedance/ruyi/ComfyUI/web

Import times for custom nodes:
   0.0 seconds: /Users/bytedance/ruyi/ComfyUI/custom_nodes/websocket_image_save.py

Starting server

To see the GUI go to: http://127.0.0.1:8188

Opening that local address brings up the ComfyUI interface.

At this point I clicked run and immediately got an error, roughly saying that some ckpt content doesn't exist. Fine — skip that for now and install the rest first.

Honestly I'm still pretty lost at this stage: I don't know what a checkpoint is, nor what Model, CLIP, or VAE are. I'll learn them bit by bit.

2.1.5 Some other references

I also found another ComfyUI installation guide at comfyui-wiki.com/en/install/…

But it's far too long, I didn't feel like reading it, and I didn't end up using it.

2.2 Installing ComfyUI Manager

As the Ruyi instructions say, besides ComfyUI you also need to install ComfyUI Manager.

github.com/ltdrdata/Co…

First, what is ComfyUI Manager? It's a plugin for ComfyUI that manages ComfyUI itself — installing and updating custom nodes and the like.

Installation is simple:

Step 1: cd ComfyUI/custom_nodes

Step 2: clone the repo: git clone https://github.com/ltdrdata/ComfyUI-Manager.git

Step 3: install its dependencies: pip install -r ComfyUI-Manager/requirements.txt

Step 4: run python main.py

On my first run there was an error saying the port was already in use — because I had already started the server once myself:

(.venv) ~/ruyi/ComfyUI git:[master]
python main.py
[START] Security scan
[DONE] Security scan
## ComfyUI-Manager: installing dependencies done.
** ComfyUI startup time: 2024-12-28 00:47:22.526382
** Platform: Darwin
** Python version: 3.9.6 (default, Aug 11 2023, 19:44:49) 
[Clang 15.0.0 (clang-1500.0.40.1)]
** Python executable: /Users/bytedance/ruyi/ComfyUI/.venv/bin/python
** ComfyUI Path: /Users/bytedance/ruyi/ComfyUI
** Log path: /Users/bytedance/ruyi/ComfyUI/comfyui.log

[notice] A new release of pip is available: 23.2.1 -> 24.3.1
[notice] To update, run: pip install --upgrade pip

[notice] A new release of pip is available: 23.2.1 -> 24.3.1
[notice] To update, run: pip install --upgrade pip

Prestartup times for custom nodes:
   1.2 seconds: /Users/bytedance/ruyi/ComfyUI/custom_nodes/ComfyUI-Manager

Total VRAM 32768 MB, total RAM 32768 MB
pytorch version: 2.5.1
Set vram state to: SHARED
Device: mps
Using sub quadratic optimization for attention, if you have memory or speed issues try using: --use-split-cross-attention
[Prompt Server] web root: /Users/bytedance/ruyi/ComfyUI/web
### Loading: ComfyUI-Manager (V2.55.5)
### ComfyUI Version: v0.3.10-4-g4b5bcd8 | Released on '2024-12-27'

Import times for custom nodes:
   0.0 seconds: /Users/bytedance/ruyi/ComfyUI/custom_nodes/websocket_image_save.py
   0.1 seconds: /Users/bytedance/ruyi/ComfyUI/custom_nodes/ComfyUI-Manager

Starting server

Traceback (most recent call last):
  File "/Users/bytedance/ruyi/ComfyUI/main.py", line 295, in <module>
    event_loop.run_until_complete(start_all_func())
  File "/Library/Developer/CommandLineTools/Library/Frameworks/Python3.framework/Versions/3.9/lib/python3.9/asyncio/base_events.py", line 642, in run_until_complete
    return future.result()
  File "/Users/bytedance/ruyi/ComfyUI/main.py", line 285, in start_all
    await run(prompt_server, address=args.listen, port=args.port, verbose=not args.dont_print_server, call_on_start=call_on_start)
  File "/Users/bytedance/ruyi/ComfyUI/main.py", line 214, in run
    await asyncio.gather(server_instance.start_multi_address(addresses, call_on_start), server_instance.publish_loop())
  File "/Users/bytedance/ruyi/ComfyUI/server.py", line 822, in start_multi_address
    await site.start()
  File "/Users/bytedance/ruyi/ComfyUI/.venv/lib/python3.9/site-packages/aiohttp/web_runner.py", line 121, in start
    self._server = await loop.create_server(
  File "/Library/Developer/CommandLineTools/Library/Frameworks/Python3.framework/Versions/3.9/lib/python3.9/asyncio/base_events.py", line 1494, in create_server
    raise OSError(err.errno, 'error while attempting '
OSError: [Errno 48] error while attempting to bind on address ('127.0.0.1', 8188): address already in use
[ComfyUI-Manager] default cache updated: https://raw.githubusercontent.com/ltdrdata/ComfyUI-Manager/main/alter-list.json
[ComfyUI-Manager] default cache updated: https://raw.githubusercontent.com/ltdrdata/ComfyUI-Manager/main/model-list.json
[ComfyUI-Manager] default cache updated: https://raw.githubusercontent.com/ltdrdata/ComfyUI-Manager/main/github-stats.json
[ComfyUI-Manager] default cache updated: https://raw.githubusercontent.com/ltdrdata/ComfyUI-Manager/main/custom-node-list.json
[ComfyUI-Manager] default cache updated: https://raw.githubusercontent.com/ltdrdata/ComfyUI-Manager/main/extension-node-map.json

Then I closed PyCharm and reopened it, and a Manager tab had appeared in the top-right corner — presumably that's what installing this plugin does.

2.3 Installing Ruyi

With ComfyUI Manager installed, installing Ruyi is very easy: open the Custom Nodes Manager button in the middle row, search for ruyi, and click install.

After installing Ruyi, search for ComfyUI-VideoHelperSuite and install it the same way.

Once both are installed, follow the prompt to restart, then refresh the browser page.

2.4 Running Ruyi

Up to this point I had not downloaded the Ruyi model itself — first, a look at the rest of the flow.

Still following the official instructions:

github.com/IamCreateAI…

2.4.1 Understanding Ruyi's node structure

In the node library in the left sidebar there are three Ruyi nodes:

Ruyi

  • Load Model
  • Load LoRA
  • Sampler for Image to Video

2.4.2 Load Model

Clicking it places a Load Model card on the main canvas. The details of each field are explained well enough in the official docs — not hard to follow — so I'll just quote them here:

Load Model

  • model: Select which model to use. Currently, Ruyi-Mini-7B is the only option.
  • auto_download: Whether to automatically download. Defaults to yes. If the model is detected as missing or incomplete, it will automatically download the model to the ComfyUI/models/Ruyi path.
  • auto_update: Whether to automatically check for and update the current model. Defaults to yes. When auto_download is enabled, the system will automatically check for updates to the model and download any updates to the ComfyUI/models/Ruyi directory. Please note that this feature relies on the caching mechanism of huggingface_hub, so do not delete the .cache folder in the model directory to ensure a smooth update process.

2.4.3 Load LoRA

This section left me confused. I don't actually know what a LoRA model is, or how it differs from the Model in the previous node; the ComfyUI/models/loras folder contains no models, I don't know where to download one, and unlike the previous node there is no auto_download option.

2.4.4 Sampler for Image to Video

This card has far too many parameters to grasp all at once. What I've figured out so far: start_image is required, while end_image is optional.

2.4.5 Running it, questions and all

Click Workflow → Open in the top-left corner.

Navigate to comfyui/workflows/. There are actually several workflows in there; they are all broadly similar, differing mainly in parameters, and the docs below explain the differences (quoted here as well) — pick whichever fits. I chose workflow-ruyi-i2v-start-frame.json.

workflow option

Image to Video (Starting Frame)

The workflow corresponds to the workflow-ruyi-i2v-start-frame.json file. For users with larger GPU memory, you can also use workflow-ruyi-i2v-start-frame-80g.json to enhance the generation speed.

Image to Video (Starting and Ending Frames)

The workflow corresponds to the workflow-ruyi-i2v-start-end-frames.json file. For users with larger GPU memory, you can also use workflow-ruyi-i2v-start-end-frames-80g.json to enhance the generation speed.

Next, set up the Load Image card; there are sample images in the assets folder.

Then click run — and sure enough, an error.

It complained that there was no model in the expected folder and that it couldn't auto-download it due to a network problem — odd, since my network was fine.

I eventually realized the likely cause was that the folder ComfyUI/models/Ruyi didn't exist, so I created it.

Clicking run again worked — but only for a few dozen seconds before the model error came back, so it was time to download the model by hand.

2.4.6 Manually downloading the model

This part isn't hard to understand in itself, but the download site is hard to reach. Once downloaded, the model goes into the ComfyUI/models/Ruyi folder.

huggingface.co/IamCreateAI…

As someone completely unfamiliar with this site, though, I opened the page and had no idea where the download even was. qaq

The answer is the clone repository option among the download options on the right.

2.4.6.1 Installing git lfs

And what is git lfs? Git LFS is a Git extension for versioning large files: the repository stores small pointer files, while the actual file contents live on an LFS server. It's installed via Homebrew.

Step 1: run git --version and make sure your version is >= 1.8.3.1

Step 2: run brew update, then brew install git-lfs

Step 3: enable it with git lfs install

Step 4: to disable it later, run git lfs uninstall

2.4.6.2 Cloning the repo without the large model files

Run

GIT_LFS_SKIP_SMUDGE=1 git clone git@hf.co:IamCreateAI/Ruyi-Mini-7B

This succeeded, but by itself it isn't enough — I need the complete model.

2.4.6.3 Cloning the repo including the large model files

Run

git clone git@hf.co:IamCreateAI/Ruyi-Mini-7B

But this failed, frustratingly:

➜  ruyi git clone git@hf.co:IamCreateAI/Ruyi-Mini-7B
Cloning into 'Ruyi-Mini-7B'...
Enter passphrase for key '/Users/bytedance/.ssh/id_ed25519':
Enter passphrase for key '/Users/bytedance/.ssh/id_ed25519':
remote: Enumerating objects: 48, done.
remote: Counting objects: 100% (48/48), done.
remote: Compressing objects: 100% (28/28), done.
remote: Total 48 (delta 17), reused 48 (delta 17), pack-reused 0 (from 0)
Receiving objects: 100% (48/48), 10.13 KiB | 1.69 MiB/s, done.
Resolving deltas: 100% (17/17), done.
Enter passphrase for key '/Users/bytedance/.ssh/id_ed25519':
Updating files: 100% (12/12), done.
Enter passphrase for key '/Users/bytedance/.ssh/id_ed25519':
Downloading embeddings.safetensors (75 MB)
Error downloading object: embeddings.safetensors (b2e2591): Smudge error: Error downloading embeddings.safetensors (b2e2591798abd3b934815a2c737b75b1a6555728ca19b68af9ce1d53cc7878d5): batch response: Post "https://huggingface.co/IamCreateAI/Ruyi-Mini-7B.git/info/lfs/objects/batch": read tcp 10.254.154.129:60703->13.35.202.34:443: read: connection reset by peer

Errors logged to '/Users/bytedance/ruyi/Ruyi-Mini-7B/.git/lfs/logs/20241228T024727.290984.log'.
Use `git lfs logs last` to view the log.
error: external filter 'git-lfs filter-process' failed
fatal: embeddings.safetensors: smudge filter lfs failed
warning: Clone succeeded, but checkout failed.
You can inspect what was checked out with 'git status'
and retry with 'git restore --source=HEAD :/'

A solution: discuss.huggingface.co/t/problem-w…

The fix is to stop using git lfs and use huggingface-cli instead.

2.4.6.4 Downloading the full model with huggingface-cli

After a lot of searching, using huggingface-cli directly seemed like the better route.

huggingface.co/docs/huggin…

Step 1: install it: brew install huggingface-cli

Step 2: log in. This is where I got stuck — I could not log in no matter what I tried.

2.4.6.5 A more convenient method

First run GIT_LFS_SKIP_SMUDGE=1 git clone git@hf.co:IamCreateAI/Ruyi-Mini-7B

to fetch the repository files without the large model files,

then download the large model files by hand.
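The manual step can actually stay inside git: after a skip-smudge clone, git lfs pull replaces the pointer files with the real LFS objects, and its --include filter lets you restrict which files to fetch. A sketch, assuming the SSH clone works for you:

```shell
# Clone only the small files; LFS objects stay as tiny pointer files
GIT_LFS_SKIP_SMUDGE=1 git clone git@hf.co:IamCreateAI/Ruyi-Mini-7B
cd Ruyi-Mini-7B
# Replace the pointers with the actual large files; if the connection
# resets partway through, re-running resumes with the objects already fetched
git lfs pull --include="*.safetensors"
```

This at least turns the flaky multi-gigabyte download into something you can retry piecemeal.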

2.4.6.6 Moving the downloaded files into place

Target folder: ComfyUI/models/Ruyi

Once everything is in place, the file tree looks like this (per the official GitHub):

📦 Ruyi-Models/models/ or ComfyUI/models/Ruyi/
├── 📂 Ruyi-Mini-7B/
│   ├── 📂 transformers/
│   ├── 📂 vae/
│   └── 📂 ...
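To avoid guessing whether the files landed in the right place, a small check helps. ruyi_layout_ok is my own hypothetical helper; the folder names come from the tree above:

```python
from pathlib import Path

def ruyi_layout_ok(models_root: str) -> bool:
    """Return True if models_root contains Ruyi-Mini-7B with the
    transformers/ and vae/ subfolders shown in the official tree."""
    model_dir = Path(models_root) / "Ruyi-Mini-7B"
    return all((model_dir / sub).is_dir() for sub in ("transformers", "vae"))

print(ruyi_layout_ok("ComfyUI/models/Ruyi"))
```

Run it against your ComfyUI/models/Ruyi directory before restarting ComfyUI.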

2.4.7 Watching the run's progress

The Queue tab in the left sidebar shows the run's progress in real time.

2.5 Running Ruyi: still a failure

Even with everything set up, running Ruyi through ComfyUI still failed — once again with a CUDA-related error.

I did notice that while the model was loading my machine became noticeably sluggish; even typing lagged.

3 Afterword

This attempt at running the Ruyi model failed, but my enthusiasm for AI hasn't been dampened, and it wasn't fruitless — at the very least I came away with a todo list:

  1. ComfyUI itself is well worth studying: www.comfy.org/
  2. Hugging Face: worth learning where it came from and its history: huggingface.co
  3. MPS and CUDA: get a rough idea of what each one is, and whether Macs really cannot support model inference and training
  4. The basics of generative models, starting from the word "diffusion" — I've noticed that many models mention diffusion models
  5. VAE and LoRA: set aside for now; these concepts will probably come up while learning about diffusion models