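1. Install Python. Python 3.8 is used here.
In the installer, check this option (the "Add Python to PATH" checkbox) so that Python is added to the environment variables.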
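To confirm that the interpreter is installed and reachable, a small check along these lines may help. This is only a sketch: the helper name meets_minimum is hypothetical, and the 3.8 minimum is taken from the version chosen above.

```python
import shutil
import sys


def meets_minimum(version_info, minimum=(3, 8)):
    """Return True if the (major, minor, ...) tuple is at least `minimum`."""
    return tuple(version_info[:2]) >= tuple(minimum)


# Report the interpreter that is running this script.
print("Python version:", ".".join(str(part) for part in sys.version_info[:3]))
print("Meets 3.8 minimum:", meets_minimum(sys.version_info))
# shutil.which returns None when no `python` executable is found on PATH.
print("python on PATH:", shutil.which("python"))
```

If the last line prints None, the PATH option was probably not checked during installation.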
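2. Install CUDA
Download link: CUDA Toolkit 11.2 Downloads | NVIDIA Developer
I downloaded CUDA 11.2; choose the installer that matches your own operating system.
The installation itself is straightforward and is not covered in detail here: click Next all the way through, then restart the computer once it finishes.
After rebooting, open the NVIDIA Control Panel and follow the steps marked by the red boxes in the screenshot. If the corresponding CUDA version is listed, the installation succeeded.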
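As an alternative to the control panel, the toolkit version can also be checked from a script. The sketch below assumes that the CUDA Toolkit installer put nvcc on PATH and that nvcc --version prints a line containing "release X.Y"; both function names are hypothetical.

```python
import re
import subprocess


def parse_cuda_release(text):
    """Extract (major, minor) from `nvcc --version` style output, or None."""
    match = re.search(r"release\s+(\d+)\.(\d+)", text)
    return (int(match.group(1)), int(match.group(2))) if match else None


def installed_cuda_release():
    """Run `nvcc --version`; return (major, minor), or None if unavailable."""
    try:
        result = subprocess.run(
            ["nvcc", "--version"], capture_output=True, text=True, check=False
        )
    except OSError:
        # nvcc is not installed or not on PATH.
        return None
    return parse_cuda_release(result.stdout)


print("CUDA toolkit release:", installed_cuda_release())
```

A result of None only means nvcc could not be run or its output was not recognized; it does not prove that the driver is missing.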
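3. Set up the environment required by Stable Diffusion
3.1). Install PyTorch
PyTorch 1.9.1+cu111 is used here. Installing the CUDA 11.1 build directly is fine: the PyTorch package bundles its own CUDA libraries, so it does not need to match the CUDA version installed on your machine; your GPU driver only needs to be compatible with CUDA 11.1.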
pip install torch==1.9.1+cu111 torchvision==0.10.1+cu111 torchaudio==0.9.1 -f https://download.pytorch.org/whl/torch_stable.html
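This may take a while, depending on your network speed.
If installing PyTorch from the command line fails, use the method below. If it succeeded, skip ahead to 3.2.
Find the version you want to install at this link: PyTorch download page
After downloading the version you need, open a terminal in the download folder and run the command below to install PyTorch. The argument after --no-deps is the name of the file you downloaded. The file name below is only an example; the cp tag in the name must match your Python version (for example cp38 for Python 3.8).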
pip install --no-deps torch-1.13.1+cu117-cp310-cp310-win_amd64.whl
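Picking the wrong wheel is a common reason for a manual install to fail, so it can help to read the tags in the file name before installing. The sketch below assumes the usual wheel naming layout (name-version-python tag-abi tag-platform tag, with no optional build tag); the function names are hypothetical.

```python
import sys


def parse_wheel_name(filename):
    """Split a wheel file name into its tags (assumes no optional build tag)."""
    if not filename.endswith(".whl"):
        raise ValueError("not a wheel file: " + filename)
    parts = filename[: -len(".whl")].split("-")
    if len(parts) != 5:
        raise ValueError("unexpected wheel name layout: " + filename)
    name, version, python_tag, abi_tag, platform_tag = parts
    return {
        "name": name,
        "version": version,
        "python": python_tag,
        "abi": abi_tag,
        "platform": platform_tag,
    }


def matches_running_python(python_tag):
    """True if a CPython tag such as 'cp38' matches the current interpreter."""
    return python_tag == "cp{}{}".format(*sys.version_info[:2])


info = parse_wheel_name("torch-1.13.1+cu117-cp310-cp310-win_amd64.whl")
print(info)
print("Matches this Python:", matches_running_python(info["python"]))
```

For the example file name, the python tag is cp310, so that wheel targets Python 3.10 rather than the Python 3.8 installed in step 1.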
After installation, you can verify the result as follows:
import torch
print(torch.__version__)
print(torch.cuda.is_available())
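A slightly fuller version of the check above can also report which CUDA version the PyTorch build was compiled against and which GPU it sees. This is a sketch; the function name describe_torch_install is hypothetical, and it returns early if torch cannot be imported.

```python
def describe_torch_install():
    """Collect basic facts about the local PyTorch install in a dict."""
    try:
        import torch
    except ImportError:
        return {"installed": False}
    info = {
        "installed": True,
        "torch_version": torch.__version__,
        # CUDA version the wheel was built with; None for CPU-only builds.
        "bundled_cuda": torch.version.cuda,
        "cuda_available": torch.cuda.is_available(),
    }
    if info["cuda_available"]:
        info["gpu_name"] = torch.cuda.get_device_name(0)
    return info


for key, value in describe_torch_install().items():
    print(f"{key}: {value}")
```

If cuda_available is False while bundled_cuda is set, the GPU driver is the likely place to look, since the wheel itself carries the CUDA libraries.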
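3.2). Install the other packages the environment needs
// Install diffusers, the diffusion pipelines library
pip install diffusers
// Install transformers, which provides models for tasks across different modalities (such as text, vision, and audio)
pip install transformers
// Accelerate is built for PyTorch users who like writing their own training loops but do not want to write and maintain the boilerplate needed for multi-GPU/TPU/fp16
pip install accelerate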
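To check that the three packages were actually installed into the interpreter you are using, something like the following sketch can be run. It relies on importlib.metadata from the standard library (available from Python 3.8); the function name installed_versions is hypothetical.

```python
from importlib import metadata


def installed_versions(packages):
    """Map each package name to its installed version, or None if missing."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


for name, version in installed_versions(
    ["torch", "diffusers", "transformers", "accelerate"]
).items():
    print(f"{name}: {version if version else 'NOT INSTALLED'}")
```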
4. Run the Stable Diffusion image-to-image code
Save the code below as img2img.py and run python img2img.py (remember to replace the input image path) to generate an image.
import torch
from PIL import Image
from diffusers import StableDiffusionImg2ImgPipeline
# load the pipeline
pipe = StableDiffusionImg2ImgPipeline.from_pretrained(
"runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16)
# or download via git clone https://huggingface.co/runwayml/stable-diffusion-v1-5
# and pass `model_id_or_path="./stable-diffusion-v1-5"`.
pipe = pipe.to("cuda")
# Replace this path with the path of an image that exists on your machine
init_img = Image.open("./inputs/2022-12-30/DEF8D9C80B.png").convert("RGB")
init_img = init_img.resize((768, 1024))
prompt = "A cute cat"
images = pipe(prompt=prompt, image=init_img, strength=0.75, guidance_scale=7.5).images
images[0].save("fantasy_landscape.png")
Comparison of the image before and after generation (the image was enlarged, with higher quality and resolution).
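The script above resizes every input to a fixed (768, 1024), which stretches images that are not 3:4. One possible alternative, sketched below, computes a target size that keeps the aspect ratio and rounds each side down to a multiple of 8. The multiple-of-8 requirement is an assumption about what Stable Diffusion pipelines expect, and fit_size is a hypothetical helper, not part of diffusers.

```python
def fit_size(width, height, long_side=1024, multiple=8):
    """Scale (width, height) so the longer side becomes `long_side`,
    keeping the aspect ratio and rounding each side down to `multiple`."""
    longest = max(width, height)
    # Integer arithmetic avoids floating-point rounding surprises.
    new_w = width * long_side // longest
    new_h = height * long_side // longest
    # Round down to the nearest multiple, but never below one multiple.
    new_w = max(multiple, new_w // multiple * multiple)
    new_h = max(multiple, new_h // multiple * multiple)
    return new_w, new_h


print(fit_size(600, 800))        # portrait input
print(fit_size(1000, 500, 768))  # landscape input
```

In the script, the result could be passed to the resize call, for example init_img.resize(fit_size(*init_img.size)).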
Below are all the parameters supported by StableDiffusionImg2ImgPipeline; for reference, see the linked documentation: Stable diffusion pipelines
- prompt (str or List[str]): The prompt or prompts to guide the image generation.
- init_image (torch.FloatTensor or PIL.Image.Image): Image, or tensor representing an image batch, that will be used as the starting point for the process.
- strength (float, optional, defaults to 0.8): Conceptually, indicates how much to transform the reference init_image. Must be between 0 and 1. init_image will be used as a starting point, adding more noise to it the larger the strength. The number of denoising steps depends on the amount of noise initially added. When strength is 1, added noise will be maximum and the denoising process will run for the full number of iterations specified in num_inference_steps. A value of 1, therefore, essentially ignores init_image.
- num_inference_steps (int, optional, defaults to 50): The number of denoising steps. More denoising steps usually lead to a higher quality image at the expense of slower inference. This parameter will be modulated by strength.
- guidance_scale (float, optional, defaults to 7.5): Guidance scale as defined in Classifier-Free Diffusion Guidance. guidance_scale is defined as w of equation 2. of Imagen Paper. Guidance scale is enabled by setting guidance_scale > 1. Higher guidance scale encourages to generate images that are closely linked to the text prompt, usually at the expense of lower image quality.
- negative_prompt (str or List[str], optional): The prompt or prompts not to guide the image generation. Ignored when not using guidance (i.e., ignored if guidance_scale is less than 1).
- num_images_per_prompt (int, optional, defaults to 1): The number of images to generate per prompt.
- eta (float, optional, defaults to 0.0): Corresponds to parameter eta (η) in the DDIM paper: arxiv.org/abs/2010.02…. Only applies to schedulers.DDIMScheduler, will be ignored for others.
- generator (torch.Generator, optional): A torch generator to make generation deterministic.
- output_type (str, optional, defaults to "pil"): The output format of the generated image. Choose between PIL: PIL.Image.Image or np.array.
- return_dict (bool, optional, defaults to True): Whether or not to return a StableDiffusionPipelineOutput instead of a plain tuple.
- callback (Callable, optional): A function that will be called every callback_steps steps during inference. The function will be called with the following arguments: callback(step: int, timestep: int, latents: torch.FloatTensor).
- callback_steps (int, optional, defaults to 1): The frequency at which the callback function will be called. If not specified, the callback will be called at every step.
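The interaction between strength and num_inference_steps described in the list can be illustrated with a little arithmetic. The sketch below assumes that the number of denoising steps actually run is roughly num_inference_steps multiplied by strength, capped at num_inference_steps; the exact formula may differ between diffusers versions, and effective_denoising_steps is a hypothetical helper rather than a library function.

```python
def effective_denoising_steps(num_inference_steps, strength):
    """Approximate how many denoising steps img2img runs for a given strength."""
    if not 0.0 <= strength <= 1.0:
        raise ValueError("strength must be between 0 and 1")
    return min(int(num_inference_steps * strength), num_inference_steps)


# With the values used in the script above (strength=0.75, default 50 steps):
print(effective_denoising_steps(50, 0.75))
# strength=1.0 runs the full schedule and essentially ignores the input image:
print(effective_denoising_steps(50, 1.0))
```

In practice this means that lowering strength both preserves more of the input image and shortens the run, so a low strength may call for a larger num_inference_steps. Note also that the script above passes the input as image= while the list calls it init_image; which name is accepted likely depends on the installed diffusers version.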