Stable-diffusion-webui offers a lot of useful features: support for the community checkpoints on civitai, support for Lora, support for all kinds of extensions, and so on. But if we want to expose it as a service that many people can use at once, we need a low-cost deployment that autoscales with the number of online users, runs stable-diffusion-webui on a server behind a complete API, and still supports everything the webui itself supports (custom checkpoints, Lora, extensions, etc.).
This article walks through deploying the stable-diffusion-webui API on the Replicate platform. The advantages of this deployment are:
- The deployed service accepts the webui's native API parameters, uses a recent branch of the webui, and is upgraded to torch 2.0;
- Because you build and push the image yourself, you can include whatever webui extensions, community checkpoints, Lora, etc. you want; the example in this article ships the ControlNet extension;
- Replicate autoscales: when many requests come in it adds machines to handle them, and when traffic drops it reclaims them, which keeps costs down;
- The billing model is friendly: Replicate only charges for the time the model actually runs. To run a model in the cloud you normally go through model initialization, then the run itself, and after the run the loaded model is usually kept around for a while so that a new request doesn't trigger another cold start. Replicate bills only for the run in the middle, which saves a lot of GPU cost;
I have deployed an example service using the majicMIX realistic checkpoint, which you can try out here:
Example service: replicate.com/wolverinn/w…
The rest of this article explains in detail how to deploy a model to Replicate. At the end I also provide the packaged code.
1. Environment Setup
First, we need a Linux environment with a GPU (reserve about 50 GB of disk space); it can be a local machine or a rented server. Inside that environment, install a few dependencies. Start with docker: its installation takes several steps, so just follow the official installation guide. Then install the cog tool:
sudo curl -o /usr/local/bin/cog -L https://github.com/replicate/cog/releases/latest/download/cog_`uname -s`_`uname -m`
sudo chmod +x /usr/local/bin/cog
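To confirm both tools are available before going further, a quick sanity check (this assumes docker was already installed per the official guide and its daemon is running):
docker --version
cog --version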
If git is not installed locally, install it as well:
sudo apt-get update
sudo apt-get install git
Then clone AUTOMATIC1111's sd-webui repository:
git clone https://github.com/AUTOMATIC1111/stable-diffusion-webui.git
cd stable-diffusion-webui
git checkout bcb6ad5fab6fb59fc79c8b6d94699cbabec34297 # the version of the code I used
If your network connection is poor, you can also simply copy the code files over directly.
Then, still inside the stable-diffusion-webui directory, run its startup script:
bash webui.sh
This step lets the script automatically download several repositories for us, which simplifies building the image. Once the script has started successfully, we can shut it down and delete the venv/ folder in the directory to reduce the image size; what we really need is the repositories/ folder the script downloaded.
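Concretely, after stopping the script the cleanup looks like this (run from the stable-diffusion-webui directory):
rm -rf venv/          # the virtualenv is not needed inside the image
ls repositories/      # these downloaded repos are what we keep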
Next, exactly as in normal sd-webui usage, download the models you want to deploy into the corresponding folders: checkpoints go into models/Stable-diffusion/, and Lora/VAE files go into their own directories (see the example below). Note that one checkpoint per deployment is enough; if you want to serve several checkpoints, deploy each one separately and pick between them by calling the corresponding API. That avoids the time cost of switching models, and deploying multiple models does not increase cost.
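For example, assuming you have already downloaded a checkpoint, a Lora and a VAE locally (the filenames below are placeholders for whatever files you actually use):
cp majicmixRealistic.safetensors stable-diffusion-webui/models/Stable-diffusion/
cp my_lora.safetensors stable-diffusion-webui/models/Lora/
cp my_vae.vae.pt stable-diffusion-webui/models/VAE/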
Extensions work the same way: put them under the extensions/ folder. For example, to deploy ControlNet, run:
cd stable-diffusion-webui/extensions
git clone https://github.com/Mikubill/sd-webui-controlnet.git
# then download the corresponding ControlNet models into the local folder as well
cd sd-webui-controlnet/models
# taking the canny model as an example
wget https://huggingface.co/lllyasviel/ControlNet-v1-1/resolve/main/control_v11p_sd15_canny.pth
At this point the environment is ready. Next we define the dependencies of the Replicate image and write a simple function that handles requests.
2. Defining the Image Dependencies
Replicate's cog tool means we don't need to learn docker or write a dockerfile; we only declare the dependencies we want installed. In the stable-diffusion-webui/ directory, create a cog.yaml file with the following contents:
# Configuration for Cog ⚙️
# https://replicate.com/docs/guides/push-a-model
# prerequisite: https://docs.docker.com/engine/install/ubuntu/#set-up-the-repository , then run `dockerd` to start docker
# Reference: https://github.com/replicate/cog/blob/main/docs/yaml.md
# !!!! recommend 60G disk space for cog docker
build:
  # set to true if your model requires a GPU
  gpu: true
  # a list of ubuntu apt packages to install
  system_packages:
    - "libgl1-mesa-glx"
    - "libglib2.0-0"
  # python version in the form '3.8' or '3.8.12'
  python_version: "3.10.4"
  # a list of packages in the format <package-name>==<version>
  python_requirements: requirements.txt
  # commands run after the environment is setup
  run:
    # - "pip3 install torch==2.0.1 torchvision==0.15.2 --extra-index-url https://download.pytorch.org/whl/cu118"
    - "echo env is ready!"
# https://replicate.com/wolverinn/chill_watcher
image: "r8.im/wolverinn/simple-background"
# predict.py defines how predictions are run on your model
predict: "predict.py:Predictor"
As you can see, the yaml file points to the python dependencies in requirements.txt and to the destination of the model we are deploying: r8.im/wolverinn/simple-background. That path belongs to my own model and needs to be replaced; to get your own path, first create a model on the Replicate platform and its path will be displayed there.
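For instance, if your Replicate username were alice and you had created a model named sd-webui-api (both names are placeholders), the line would become:
image: "r8.im/alice/sd-webui-api"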
As for requirements.txt, it is simply sd-webui's own dependencies combined with ControlNet's:
torch==2.0.1
--extra-index-url https://download.pytorch.org/whl/cu118
torchvision==0.15.2
--extra-index-url https://download.pytorch.org/whl/cu118
GitPython==3.1.30
Pillow==9.5.0
accelerate==0.18.0
basicsr==1.4.2
blendmodes==2022
clean-fid==0.1.35
einops==0.4.1
fastapi==0.94.0
gfpgan==1.3.8
gradio==3.32.0
httpcore<=0.15
httpx==0.24.1
inflection==0.5.1
jsonmerge==1.8.0
kornia==0.6.7
lark==1.1.2
numpy==1.23.5
omegaconf==2.2.3
piexif==1.1.3
psutil~=5.9.5
pytorch_lightning==1.9.4
realesrgan==0.3.0
resize-right==0.0.2
safetensors==0.3.1
scikit-image==0.20.0
timm==0.6.7
tomesd==0.1.2
torchdiffeq==0.2.3
torchsde==0.2.5
transformers==4.25.1
lpips==0.1.3
gdown==4.5.1
addict
future
lmdb
opencv-python
Pillow
pyyaml
scikit-image
scipy
tb-nightly
tqdm
yapf
mediapipe
svglib
fvcore
xformers==0.0.20
The image dependencies are now in place. Next we write the predict.py file referenced in the yaml, which tells the Replicate platform how to handle requests.
3. The Prediction Function
We'll use an img2img call as the example, ControlNet parameters included; all the other APIs work in exactly the same way. Create a predict.py file in the stable-diffusion-webui/ directory.
First we need to initialize the model. That code already lives in the initialize() function in webui.py, so we just call it. Add the following to predict.py:
from cog import BasePredictor, Input, Path
import os
import time
import uuid
import base64
from io import BytesIO
from PIL import Image

os.environ["IGNORE_CMD_ARGS_ERRORS"] = "true"

from webui import initialize, img2imgapi
from modules.api.models import StableDiffusionTxt2ImgProcessingAPI, StableDiffusionImg2ImgProcessingAPI


class Predictor(BasePredictor):
    def setup(self):
        """Load the model into memory to make running multiple predictions efficient"""
        initialize()

    def predict(
        self,
        image: Path = Input(description="Image to replace background"),
        prompt: str = Input(description="prompt en", default="RAW photo, 8k uhd, dslr, soft lighting, high quality, film grain, Fujifilm XT3"),
        negative_prompt: str = Input(description="negative prompt", default="(deformed iris, deformed pupils, semi-realistic, cgi, 3d, render, sketch, cartoon, drawing, anime, mutated hands and fingers:1.4), (deformed, distorted, disfigured:1.3), poorly drawn, bad anatomy, wrong anatomy, extra limb, missing limb, floating limbs, disconnected limbs, mutation, mutated, ugly, disgusting, amputation"),
        sampler_name: str = Input(description="sampler name", default="DPM++ SDE Karras", choices=["DPM++ SDE Karras", "DPM++ 2M Karras", "DPM++ 2S a Karras", "DPM2 a Karras", "DPM2 Karras", "LMS Karras", "DPM adaptive", "DPM fast", "DPM++ SDE", "DPM++ 2M", "DPM++ 2S a", "DPM2 a", "DPM2", "Heun", "LMS", "Euler", "Euler a"]),
        seed: int = Input(description="seed", default=-1),
    ) -> Path:
        """Run a single prediction on the model"""
        # encode the input image as base64 PNG for the A1111 payload
        img_data = Image.open(image)
        img_bytes = None
        with BytesIO() as output_bytes:
            img_data.save(output_bytes, format="PNG")
            img_bytes = output_bytes.getvalue()
        encoded_image = base64.b64encode(img_bytes).decode('utf-8')
        # A1111 payload
        payload = {
            "init_images": [encoded_image],
            "prompt": prompt,
            "negative_prompt": negative_prompt,
            "batch_size": 1,
            "steps": 20,
            "cfg_scale": 7,
            "denoising_strength": 0.75,
            "seed": seed,
            "do_not_save_samples": True,
            "sampler_name": sampler_name,
            "width": img_data.size[0],
            "height": img_data.size[1],
            "alwayson_scripts": {
                "controlnet": {
                    "args": [
                        {
                            "input_image": encoded_image,
                            "module": "canny",
                            "model": "control_v11p_sd15_canny [d14c016b]",
                            "processor_res": max(img_data.size[0], img_data.size[1]),
                            "threshold_a": 1,
                            "threshold_b": 200,
                        }
                    ]
                }
            }
        }
        req = StableDiffusionImg2ImgProcessingAPI(**payload)
        # generate
        resp = img2imgapi(req)
        cnres_img = None
        if len(resp.images) > 0:
            cnres_img = resp.images[0]
        # decode the first returned image and write it to disk
        gen_bytes = BytesIO(base64.b64decode(cnres_img))
        gen_data = Image.open(gen_bytes)
        filename = "{}.png".format(uuid.uuid1())
        gen_data.save(fp=filename, format="PNG")
        return Path(filename)
As you can see, to make the API accept more input parameters we only need to declare them in predict() in the same fixed format and pass them into the payload variable.
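As a minimal sketch (the parameter names here are just illustrative, and the rest of predict() stays as shown above), exposing the sampling steps and CFG scale would look like this:
from cog import BasePredictor, Input, Path

class Predictor(BasePredictor):
    def predict(
        self,
        image: Path = Input(description="input image"),
        prompt: str = Input(description="prompt en", default=""),
        # two extra knobs exposed through the API; any img2img field can be forwarded the same way
        steps: int = Input(description="sampling steps", default=20, ge=1, le=100),
        cfg_scale: float = Input(description="CFG scale", default=7.0),
    ) -> Path:
        payload = {
            # ...same fields as in the full example above...
            "steps": steps,
            "cfg_scale": cfg_scale,
        }
        ...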
The img2imgapi used above is a function copied (and lightly adapted) from modules/api/api.py in sd-webui:
from fastapi import HTTPException
import gradio as gr

from modules.api.api import script_name_to_index, validate_sampler_name, encode_pil_to_base64, decode_base64_to_image


def get_script(script_name, script_runner):
    if script_name is None or script_name == "":
        return None, None
    script_idx = script_name_to_index(script_name, script_runner.scripts)
    return script_runner.scripts[script_idx]


def init_default_script_args(script_runner):
    # find max idx from the scripts in runner and generate a none array to init script_args
    last_arg_index = 1
    for script in script_runner.scripts:
        if last_arg_index < script.args_to:
            last_arg_index = script.args_to
    # None everywhere except position 0 to initialize script args
    script_args = [None] * last_arg_index
    script_args[0] = 0
    # get default values
    with gr.Blocks():  # will throw errors calling ui function without this
        for script in script_runner.scripts:
            if script.ui(script.is_img2img):
                ui_default_values = []
                for elem in script.ui(script.is_img2img):
                    ui_default_values.append(elem.value)
                script_args[script.args_from:script.args_to] = ui_default_values
    return script_args


def get_selectable_script(script_name, script_runner):
    if script_name is None or script_name == "":
        return None, None
    script_idx = script_name_to_index(script_name, script_runner.selectable_scripts)
    script = script_runner.selectable_scripts[script_idx]
    return script, script_idx


def init_script_args(request, default_script_args, selectable_scripts, selectable_idx, script_runner):
    script_args = default_script_args.copy()
    # position 0 in script_arg is the idx+1 of the selectable script that is going to be run when using scripts.scripts_*2img.run()
    if selectable_scripts:
        script_args[selectable_scripts.args_from:selectable_scripts.args_to] = request.script_args
        script_args[0] = selectable_idx + 1
    if request.alwayson_scripts:
        for alwayson_script_name in request.alwayson_scripts.keys():
            alwayson_script = get_script(alwayson_script_name, script_runner)
            if alwayson_script is None:
                raise HTTPException(status_code=422, detail=f"always on script {alwayson_script_name} not found")
            # Selectable script in always on script param check
            if alwayson_script.alwayson is False:
                raise HTTPException(status_code=422, detail="Cannot have a selectable script in the always on scripts params")
            # always on script with no arg should always run so you don't really need to add them to the requests
            if "args" in request.alwayson_scripts[alwayson_script_name]:
                # min between arg length in scriptrunner and arg length in the request
                for idx in range(0, min((alwayson_script.args_to - alwayson_script.args_from), len(request.alwayson_scripts[alwayson_script_name]["args"]))):
                    script_args[alwayson_script.args_from + idx] = request.alwayson_scripts[alwayson_script_name]["args"][idx]
    return script_args


from modules.api import models
from modules.api.models import PydanticModelGenerator, StableDiffusionTxt2ImgProcessingAPI, StableDiffusionImg2ImgProcessingAPI
from modules import scripts, ui, shared
from modules.processing import process_images, StableDiffusionProcessingTxt2Img, StableDiffusionProcessingImg2Img


def img2imgapi(img2imgreq: models.StableDiffusionImg2ImgProcessingAPI):
    init_images = img2imgreq.init_images
    if init_images is None:
        return
    mask = img2imgreq.mask
    if mask:
        mask = decode_base64_to_image(mask)
    script_runner = scripts.scripts_img2img
    if not script_runner.scripts:
        script_runner.initialize_scripts(True)
        ui.create_ui()
    default_script_arg_img2img = []
    if not default_script_arg_img2img:
        default_script_arg_img2img = init_default_script_args(script_runner)
    selectable_scripts, selectable_script_idx = get_selectable_script(img2imgreq.script_name, script_runner)
    populate = img2imgreq.copy(update={  # Override __init__ params
        "sampler_name": validate_sampler_name(img2imgreq.sampler_name or img2imgreq.sampler_index),
        "do_not_save_samples": not img2imgreq.save_images,
        "do_not_save_grid": not img2imgreq.save_images,
        "mask": mask,
    })
    if populate.sampler_name:
        populate.sampler_index = None  # prevent a warning later on
    args = vars(populate)
    args.pop('include_init_images', None)  # this is meant to be done by "exclude": True in model, but it's for a reason that I cannot determine.
    args.pop('script_name', None)
    args.pop('script_args', None)  # will refeed them to the pipeline directly after initializing them
    args.pop('alwayson_scripts', None)
    script_args = init_script_args(img2imgreq, default_script_arg_img2img, selectable_scripts, selectable_script_idx, script_runner)
    send_images = args.pop('send_images', True)
    args.pop('save_images', None)
    p = StableDiffusionProcessingImg2Img(sd_model=shared.sd_model, **args)
    p.init_images = [decode_base64_to_image(x) for x in init_images]
    p.scripts = script_runner
    shared.state.begin()
    if selectable_scripts is not None:
        p.script_args = script_args
        processed = scripts.scripts_img2img.run(p, *p.script_args)  # Need to pass args as list here
    else:
        p.script_args = tuple(script_args)  # Need to pass args as tuple here
        processed = process_images(p)
    shared.state.end()
    b64images = list(map(encode_pil_to_base64, processed.images)) if send_images else []
    if not img2imgreq.include_init_images:
        img2imgreq.init_images = None
        img2imgreq.mask = None
    return models.ImageToImageResponse(images=b64images, parameters=vars(img2imgreq), info=processed.js())
Just append the code above to webui.py; that is where predict.py imports img2imgapi from.
4. Pushing the Image
The final step is to upload our local files to the Replicate platform. From the directory containing cog.yaml, run:
cog login
cog push
Once the upload finishes, go to your model's page on Replicate: you can run the API from the web UI, or follow Replicate's documentation to call the API from code.
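For example, with Replicate's official Python client (the model identifier and version below are placeholders; use the ones shown on your model's page, and set the REPLICATE_API_TOKEN environment variable first):
import replicate

# run one prediction against the deployed model
output = replicate.run(
    "your-username/your-model:version-hash",
    input={
        "image": open("input.png", "rb"),
        "prompt": "RAW photo, 8k uhd, dslr, soft lighting",
        "seed": -1,
    },
)
print(output)  # URL of the generated image (exact return type depends on the client version)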
Finally, the deployment code from this article is also available in a code repository: clone it, download your model files, install docker, and then run the push commands above to deploy.
I have also built a deployment option that requires writing or running no code at all: just upload the model you want to deploy and you get an autoscaling stable diffusion API service. You're welcome to read about it.