Stable-diffusion-webui offers a lot of useful features: support for the community checkpoints on civitai, support for Lora, support for all kinds of extensions, and so on. But if we want to expose it as a service that many people can use at once, we need a low-cost deployment that autoscales with the number of online users, runs stable-diffusion-webui on a server behind a complete API, and still supports everything the webui itself supports (custom checkpoints, Lora, extensions, etc.).
This article walks through deploying the stable-diffusion-webui API on the Replicate platform. The advantages of this deployment are:
- The deployed service accepts the webui's native API parameters, uses a recent branch of the webui, and is upgraded to torch 2.0;
- Because you build and push the image yourself, you can include whatever webui extensions, community checkpoints, Lora, etc. you want; the example in this article ships the ControlNet extension;
- Replicate autoscales: when many requests come in it adds machines to handle them, and when traffic drops it reclaims them, which keeps costs down;
- The billing model is friendly: Replicate only charges for the time the model actually runs. To run a model in the cloud you normally go through model initialization, then the run itself, and after the run the loaded model is usually kept around for a while so that a new request doesn't trigger another cold start. Replicate bills only for the run in the middle, which saves a lot of GPU cost;
I have deployed an example service using the majicMIX realistic checkpoint, which you can try out here:
Example service: replicate.com/wolverinn/w…
The rest of this article explains in detail how to deploy a model to Replicate. At the end I also provide the packaged code.
1. Environment Setup
First, we need a Linux environment with a GPU (reserve about 50 GB of disk space); it can be a local machine or a rented server. Inside that environment, install a few dependencies. Start with docker: its installation takes several steps, so just follow the official installation guide. Then install the cog tool:
sudo curl -o /usr/local/bin/cog -L https://github.com/replicate/cog/releases/latest/download/cog_`uname -s`_`uname -m`
sudo chmod +x /usr/local/bin/cog
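To confirm both tools are available before going further, a quick sanity check (this assumes docker was already installed per the official guide and its daemon is running):
docker --version
cog --version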
If git is not installed locally, install it as well:
sudo apt-get update
sudo apt-get install git
Then clone AUTOMATIC1111's sd-webui repository:
git clone https://github.com/AUTOMATIC1111/stable-diffusion-webui.git
cd stable-diffusion-webui
git checkout bcb6ad5fab6fb59fc79c8b6d94699cbabec34297 # the version of the code I used
If your network connection is poor, you can also simply copy the code files over directly.
Then, still inside the stable-diffusion-webui directory, run its startup script:
bash webui.sh
This step lets the script automatically download several repositories for us, which simplifies building the image. Once the script has started successfully, we can shut it down and delete the venv/ folder in the directory to reduce the image size; what we really need is the repositories/ folder the script downloaded.
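Concretely, after stopping the script the cleanup looks like this (run from the stable-diffusion-webui directory):
rm -rf venv/          # the virtualenv is not needed inside the image
ls repositories/      # these downloaded repos are what we keep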
Next, exactly as in normal sd-webui usage, download the models you want to deploy into the corresponding folders: checkpoints go into models/Stable-diffusion/, and Lora/VAE files go into their own directories (see the example below). Note that one checkpoint per deployment is enough; if you want to serve several checkpoints, deploy each one separately and pick between them by calling the corresponding API. That avoids the time cost of switching models, and deploying multiple models does not increase cost.
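For example, assuming you have already downloaded a checkpoint, a Lora and a VAE locally (the filenames below are placeholders for whatever files you actually use):
cp majicmixRealistic.safetensors stable-diffusion-webui/models/Stable-diffusion/
cp my_lora.safetensors stable-diffusion-webui/models/Lora/
cp my_vae.vae.pt stable-diffusion-webui/models/VAE/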
Extensions work the same way: put them under the extensions/ folder. For example, to deploy ControlNet, run:
cd stable-diffusion-webui/extensions
git clone https://github.com/Mikubill/sd-webui-controlnet.git
# then download the corresponding ControlNet models into the local folder as well
cd sd-webui-controlnet/models
# taking the canny model as an example
wget https://huggingface.co/lllyasviel/ControlNet-v1-1/resolve/main/control_v11p_sd15_canny.pth
At this point the environment is ready. Next we define the dependencies of the Replicate image and write a simple function that handles requests.
2. Defining the Image Dependencies
Replicate's cog tool means we don't need to learn docker or write a dockerfile; we only declare the dependencies we want installed. In the stable-diffusion-webui/ directory, create a cog.yaml file with the following contents:
# Configuration for Cog ⚙️
# https://replicate.com/docs/guides/push-a-model
# prerequisite: https://docs.docker.com/engine/install/ubuntu/#set-up-the-repository , then run `dockerd` to start docker
# Reference: https://github.com/replicate/cog/blob/main/docs/yaml.md
# !!!! recommend 60G disk space for cog docker
build:
  # set to true if your model requires a GPU
  gpu: true
  # a list of ubuntu apt packages to install
  system_packages:
    - "libgl1-mesa-glx"
    - "libglib2.0-0"
  # python version in the form '3.8' or '3.8.12'
  python_version: "3.10.4"
  # a list of packages in the format <package-name>==<version>
  python_requirements: requirements.txt
  # commands run after the environment is setup
  run:
    # - "pip3 install torch==2.0.1 torchvision==0.15.2 --extra-index-url https://download.pytorch.org/whl/cu118"
    - "echo env is ready!"
# https://replicate.com/wolverinn/chill_watcher
image: "r8.im/wolverinn/simple-background"
# predict.py defines how predictions are run on your model
predict: "predict.py:Predictor"
As you can see, the yaml file points to the python dependencies in requirements.txt and to the destination of the model we are deploying: r8.im/wolverinn/simple-background. That path belongs to my own model and needs to be replaced; to get your own path, first create a model on the Replicate platform and its path will be displayed there.
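For instance, if your Replicate username were alice and you had created a model named sd-webui-api (both names are placeholders), the line would become:
image: "r8.im/alice/sd-webui-api"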
As for requirements.txt, it is simply sd-webui's own dependencies combined with ControlNet's:
torch==2.0.1
--extra-index-url https://download.pytorch.org/whl/cu118
torchvision==0.15.2
--extra-index-url https://download.pytorch.org/whl/cu118
GitPython==3.1.30
Pillow==9.5.0
accelerate==0.18.0
basicsr==1.4.2
blendmodes==2022
clean-fid==0.1.35
einops==0.4.1
fastapi==0.94.0
gfpgan==1.3.8
gradio==3.32.0
httpcore<=0.15
httpx==0.24.1
inflection==0.5.1
jsonmerge==1.8.0
kornia==0.6.7
lark==1.1.2
numpy==1.23.5
omegaconf==2.2.3
piexif==1.1.3
psutil~=5.9.5
pytorch_lightning==1.9.4
realesrgan==0.3.0
resize-right==0.0.2
safetensors==0.3.1
scikit-image==0.20.0
timm==0.6.7
tomesd==0.1.2
torchdiffeq==0.2.3
torchsde==0.2.5
transformers==4.25.1
lpips==0.1.3
gdown==4.5.1
addict
future
lmdb
opencv-python
Pillow
pyyaml
scikit-image
scipy
tb-nightly
tqdm
yapf
mediapipe
svglib
fvcore
xformers==0.0.20
The image dependencies are now in place. Next we write the predict.py file referenced in the yaml, which tells the Replicate platform how to handle requests.
3. The Prediction Function
We'll use an img2img call as the example, ControlNet parameters included; all the other APIs work in exactly the same way. Create a predict.py file in the stable-diffusion-webui/ directory.
First we need to initialize the model. That code already lives in the initialize() function in webui.py, so we just call it. Add the following to predict.py:
from cog import BasePredictor, Input, Path
import os
import time
import uuid
import base64
from io import BytesIO
from PIL import Image

os.environ["IGNORE_CMD_ARGS_ERRORS"] = "true"

from webui import initialize, img2imgapi
from modules.api.models import StableDiffusionTxt2ImgProcessingAPI, StableDiffusionImg2ImgProcessingAPI


class Predictor(BasePredictor):
    def setup(self):
        """Load the model into memory to make running multiple predictions efficient"""
        initialize()

    def predict(
        self,
        image: Path = Input(description="Image to replace background"),
        prompt: str = Input(description="prompt en", default="RAW photo, 8k uhd, dslr, soft lighting, high quality, film grain, Fujifilm XT3"),
        negative_prompt: str = Input(description="negative prompt", default="(deformed iris, deformed pupils, semi-realistic, cgi, 3d, render, sketch, cartoon, drawing, anime, mutated hands and fingers:1.4), (deformed, distorted, disfigured:1.3), poorly drawn, bad anatomy, wrong anatomy, extra limb, missing limb, floating limbs, disconnected limbs, mutation, mutated, ugly, disgusting, amputation"),
        sampler_name: str = Input(description="sampler name", default="DPM++ SDE Karras", choices=["DPM++ SDE Karras", "DPM++ 2M Karras", "DPM++ 2S a Karras", "DPM2 a Karras", "DPM2 Karras", "LMS Karras", "DPM adaptive", "DPM fast", "DPM++ SDE", "DPM++ 2M", "DPM++ 2S a", "DPM2 a", "DPM2", "Heun", "LMS", "Euler", "Euler a"]),
        seed: int = Input(description="seed", default=-1),
    ) -> Path:
        """Run a single prediction on the model"""
        # encode the input image as base64 PNG for the A1111 payload
        img_data = Image.open(image)
        img_bytes = None
        with BytesIO() as output_bytes:
            img_data.save(output_bytes, format="PNG")
            img_bytes = output_bytes.getvalue()
        encoded_image = base64.b64encode(img_bytes).decode('utf-8')
        # A1111 payload
        payload = {
            "init_images": [encoded_image],
            "prompt": prompt,
            "negative_prompt": negative_prompt,
            "batch_size": 1,
            "steps": 20,
            "cfg_scale": 7,
            "denoising_strength": 0.75,
            "seed": seed,
            "do_not_save_samples": True,
            "sampler_name": sampler_name,
            "width": img_data.size[0],
            "height": img_data.size[1],
            "alwayson_scripts": {
                "controlnet": {
                    "args": [
                        {
                            "input_image": encoded_image,
                            "module": "canny",
                            "model": "control_v11p_sd15_canny [d14c016b]",
                            "processor_res": max(img_data.size[0], img_data.size[1]),
                            "threshold_a": 1,
                            "threshold_b": 200,
                        }
                    ]
                }
            }
        }
        req = StableDiffusionImg2ImgProcessingAPI(**payload)
        # generate
        resp = img2imgapi(req)
        cnres_img = None
        if len(resp.images) > 0:
            cnres_img = resp.images[0]
        # decode the first returned image and write it to disk
        gen_bytes = BytesIO(base64.b64decode(cnres_img))
        gen_data = Image.open(gen_bytes)
        filename = "{}.png".format(uuid.uuid1())
        gen_data.save(fp=filename, format="PNG")
        return Path(filename)
As you can see, to make the API accept more input parameters we only need to declare them in predict() in the same fixed format and pass them into the payload variable.
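As a minimal sketch (the parameter names here are just illustrative, and the rest of predict() stays as shown above), exposing the sampling steps and CFG scale would look like this:
from cog import BasePredictor, Input, Path

class Predictor(BasePredictor):
    def predict(
        self,
        image: Path = Input(description="input image"),
        prompt: str = Input(description="prompt en", default=""),
        # two extra knobs exposed through the API; any img2img field can be forwarded the same way
        steps: int = Input(description="sampling steps", default=20, ge=1, le=100),
        cfg_scale: float = Input(description="CFG scale", default=7.0),
    ) -> Path:
        payload = {
            # ...same fields as in the full example above...
            "steps": steps,
            "cfg_scale": cfg_scale,
        }
        ...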
The img2imgapi used above is a function copied (and lightly adapted) from modules/api/api.py in sd-webui:
from fastapi import HTTPException
import gradio as gr

from modules.api.api import script_name_to_index, validate_sampler_name, encode_pil_to_base64, decode_base64_to_image


def get_script(script_name, script_runner):
    if script_name is None or script_name == "":
        return None, None
    script_idx = script_name_to_index(script_name, script_runner.scripts)
    return script_runner.scripts[script_idx]


def init_default_script_args(script_runner):
    # find max idx from the scripts in runner and generate a none array to init script_args
    last_arg_index = 1
    for script in script_runner.scripts:
        if last_arg_index < script.args_to:
            last_arg_index = script.args_to
    # None everywhere except position 0 to initialize script args
    script_args = [None] * last_arg_index
    script_args[0] = 0
    # get default values
    with gr.Blocks():  # will throw errors calling ui function without this
        for script in script_runner.scripts:
            if script.ui(script.is_img2img):
                ui_default_values = []
                for elem in script.ui(script.is_img2img):
                    ui_default_values.append(elem.value)
                script_args[script.args_from:script.args_to] = ui_default_values
    return script_args


def get_selectable_script(script_name, script_runner):
    if script_name is None or script_name == "":
        return None, None
    script_idx = script_name_to_index(script_name, script_runner.selectable_scripts)
    script = script_runner.selectable_scripts[script_idx]
    return script, script_idx


def init_script_args(request, default_script_args, selectable_scripts, selectable_idx, script_runner):
    script_args = default_script_args.copy()
    # position 0 in script_arg is the idx+1 of the selectable script that is going to be run when using scripts.scripts_*2img.run()
    if selectable_scripts:
        script_args[selectable_scripts.args_from:selectable_scripts.args_to] = request.script_args
        script_args[0] = selectable_idx + 1
    if request.alwayson_scripts:
        for alwayson_script_name in request.alwayson_scripts.keys():
            alwayson_script = get_script(alwayson_script_name, script_runner)
            if alwayson_script is None:
                raise HTTPException(status_code=422, detail=f"always on script {alwayson_script_name} not found")
            # Selectable script in always on script param check
            if alwayson_script.alwayson is False:
                raise HTTPException(status_code=422, detail="Cannot have a selectable script in the always on scripts params")
            # always on script with no arg should always run so you don't really need to add them to the requests
            if "args" in request.alwayson_scripts[alwayson_script_name]:
                # min between arg length in scriptrunner and arg length in the request
                for idx in range(0, min((alwayson_script.args_to - alwayson_script.args_from), len(request.alwayson_scripts[alwayson_script_name]["args"]))):
                    script_args[alwayson_script.args_from + idx] = request.alwayson_scripts[alwayson_script_name]["args"][idx]
    return script_args


from modules.api import models
from modules.api.models import PydanticModelGenerator, StableDiffusionTxt2ImgProcessingAPI, StableDiffusionImg2ImgProcessingAPI
from modules import scripts, ui, shared
from modules.processing import process_images, StableDiffusionProcessingTxt2Img, StableDiffusionProcessingImg2Img


def img2imgapi(img2imgreq: models.StableDiffusionImg2ImgProcessingAPI):
    init_images = img2imgreq.init_images
    if init_images is None:
        return
    mask = img2imgreq.mask
    if mask:
        mask = decode_base64_to_image(mask)
    script_runner = scripts.scripts_img2img
    if not script_runner.scripts:
        script_runner.initialize_scripts(True)
        ui.create_ui()
    default_script_arg_img2img = []
    if not default_script_arg_img2img:
        default_script_arg_img2img = init_default_script_args(script_runner)
    selectable_scripts, selectable_script_idx = get_selectable_script(img2imgreq.script_name, script_runner)
    populate = img2imgreq.copy(update={  # Override __init__ params
        "sampler_name": validate_sampler_name(img2imgreq.sampler_name or img2imgreq.sampler_index),
        "do_not_save_samples": not img2imgreq.save_images,
        "do_not_save_grid": not img2imgreq.save_images,
        "mask": mask,
    })
    if populate.sampler_name:
        populate.sampler_index = None  # prevent a warning later on
    args = vars(populate)
    args.pop('include_init_images', None)  # this is meant to be done by "exclude": True in model, but it's for a reason that I cannot determine.
    args.pop('script_name', None)
    args.pop('script_args', None)  # will refeed them to the pipeline directly after initializing them
    args.pop('alwayson_scripts', None)
    script_args = init_script_args(img2imgreq, default_script_arg_img2img, selectable_scripts, selectable_script_idx, script_runner)
    send_images = args.pop('send_images', True)
    args.pop('save_images', None)
    p = StableDiffusionProcessingImg2Img(sd_model=shared.sd_model, **args)
    p.init_images = [decode_base64_to_image(x) for x in init_images]
    p.scripts = script_runner
    shared.state.begin()
    if selectable_scripts is not None:
        p.script_args = script_args
        processed = scripts.scripts_img2img.run(p, *p.script_args)  # Need to pass args as list here
    else:
        p.script_args = tuple(script_args)  # Need to pass args as tuple here
        processed = process_images(p)
    shared.state.end()
    b64images = list(map(encode_pil_to_base64, processed.images)) if send_images else []
    if not img2imgreq.include_init_images:
        img2imgreq.init_images = None
        img2imgreq.mask = None
    return models.ImageToImageResponse(images=b64images, parameters=vars(img2imgreq), info=processed.js())
Just append the code above to webui.py; that is where predict.py imports img2imgapi from.
4. Pushing the Image
The final step is to upload our local files to the Replicate platform. From the directory containing cog.yaml, run:
cog login
cog push
Once the upload finishes, go to your model's page on Replicate: you can run the API from the web UI, or follow Replicate's documentation to call the API from code.
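For example, with Replicate's official Python client (the model identifier and version below are placeholders; use the ones shown on your model's page, and set the REPLICATE_API_TOKEN environment variable first):
import replicate

# run one prediction against the deployed model
output = replicate.run(
    "your-username/your-model:version-hash",
    input={
        "image": open("input.png", "rb"),
        "prompt": "RAW photo, 8k uhd, dslr, soft lighting",
        "seed": -1,
    },
)
print(output)  # URL of the generated image (exact return type depends on the client version)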
Finally, the deployment code from this article is also available in a code repository: clone it, download your model files, install docker, and then run the push commands above to deploy.
I have also built a deployment option that requires writing or running no code at all: just upload the model you want to deploy and you get an autoscaling stable diffusion API service. You're welcome to read about it.