Fine-tuning with DreamBooth LoRA on PPDiffusers to Generate the Chinese Landscape Painting (Shanshui) Style【livingbody/Chinese_ShanShui_Style】

This tutorial walks through the whole process in two parts.

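  • 1. Preparation
    • 1.1 Environment setup
    • 1.2 Hugging Face Space registration and login
  • 2. How to train
    • 2.1 Upload images
    • 2.2 Adjust training parameters
    • 2.3 Upload the chosen weights to Hugging Face
    • 2.4 Generate another image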

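1. Preparation

1.1 Environment setup


Before starting, prepare the required environment by running the command below to install the dependencies. To make sure the installation takes effect, restart the kernel once it finishes! (Note: this only needs to be run once!)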


pip install "paddlenlp>=2.5.2" "ppdiffusers>=0.11.1" safetensors --user
# Run this cell to install the required dependencies!
!pip install "paddlenlp>=2.5.2" safetensors "ppdiffusers>=0.11.1" --user
from IPython.display import clear_output
clear_output() # clear the long installation output
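
After restarting the kernel, it can be useful to confirm that the installed versions satisfy the minimums given in the pip command. The following is a minimal sketch using only the Python standard library; the helper names `parse_version`, `meets_minimum` and `check_requirements` are hypothetical and not part of PaddleNLP or PPDiffusers, and the version parsing is deliberately simple (it ignores suffixes such as `rc1`).

```python
from importlib import metadata


def parse_version(text):
    """Turn '2.5.2' into (2, 5, 2); a non-numeric suffix such as 'rc1' is ignored."""
    parts = []
    for piece in text.split("."):
        digits = ""
        for ch in piece:
            if ch.isdigit():
                digits += ch
            else:
                break
        if digits == "":
            break
        parts.append(int(digits))
    return tuple(parts)


def meets_minimum(installed, minimum):
    """Return True if the installed version is at least the minimum version."""
    return parse_version(installed) >= parse_version(minimum)


def check_requirements(requirements):
    """Map each package name to (installed version or None, whether it is new enough)."""
    report = {}
    for name, minimum in requirements.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            report[name] = (None, False)
            continue
        report[name] = (installed, meets_minimum(installed, minimum))
    return report


# Minimum versions copied from the pip command above.
print(check_requirements({"paddlenlp": "2.5.2", "ppdiffusers": "0.11.1"}))
```

A package reported as `(None, False)` is not visible to the current kernel, which usually means the kernel has not been restarted yet.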

1.2 Hugging Face Space registration and login

The task requires uploading the model to Hugging Face, so you first need to register and log in.

  • Register and log in: huggingface.co/join

  • Get a login token

  • Log in to the Hugging Face Hub from AI Studio


Tips: To make uploading the weights easier later, log in to the Hugging Face Hub first. For more information, see the official documentation.

!git config --global credential.helper store
from huggingface_hub import login
login()
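
The call above opens an interactive widget that asks for the token. When the notebook is run without interaction, the token can instead be passed to `login` directly. The sketch below assumes the token has been stored in an environment variable named `HF_TOKEN`; that variable name and the helpers `read_hf_token` and `login_from_env` are choices made for this illustration, not something the tutorial prescribes.

```python
import os


def read_hf_token(env=None, var_name="HF_TOKEN"):
    """Return the token stored in the given environment mapping, or None if missing or blank."""
    env = os.environ if env is None else env
    token = env.get(var_name, "").strip()
    return token or None


def login_from_env():
    """Log in to the Hugging Face Hub with a token taken from the environment."""
    token = read_hf_token()
    if token is None:
        raise RuntimeError("HF_TOKEN is not set; create a token in your Hugging Face account settings")
    # Imported lazily so that the helper above can be used without huggingface_hub installed.
    from huggingface_hub import login

    login(token=token)
```

Calling `login_from_env()` replaces the interactive `login()` call; keep the token out of the notebook itself so it is not shared by accident.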

2. How to train the model and upload it to HF

The dataset consists of Chinese landscape paintings.


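2.1 Upload images

# Unzip the dataset
!unzip -qoa data/data107231/Chinese_art_dataset.zip -d Chinese_art_dataset
# Create the training folder if it does not exist, then copy the landscape (shanshui) images into it
!mkdir -p train_dataset
!cp Chinese_art_dataset/Chinese_art_dataset/style_images/shanshui*  train_dataset/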
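
Before training, it is worth checking that the copy step actually placed images in `train_dataset`. The following is a small sketch using only the standard library; the helper name `list_training_images` and the set of accepted file extensions are assumptions made for this example.

```python
from pathlib import Path

# File extensions treated as images in this sketch (an assumption, adjust as needed).
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png", ".bmp", ".webp"}


def list_training_images(folder):
    """Return the sorted image files directly inside the folder, or [] if it does not exist."""
    folder = Path(folder)
    if not folder.is_dir():
        return []
    return sorted(
        p for p in folder.iterdir()
        if p.is_file() and p.suffix.lower() in IMAGE_SUFFIXES
    )


images = list_training_images("train_dataset")
print(f"found {len(images)} training images")
```

If the count is zero, the `shanshui*` pattern in the copy command probably did not match any files in the unzipped dataset.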

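2.2 Adjust training parameters

During training you can try changing the default parameters. Some of them are introduced below, grouped by purpose.

Main parameters to modify:

  • pretrained_model_name_or_path: the name of the model to train, or the path to a local model, for example "runwayml/stable-diffusion-v1-5". For more models, see the PaddleNLP documentation.
  • instance_data_dir: the folder containing the training images; the images can be uploaded to the AI Studio project.
  • instance_prompt: the prompt text used for training.
  • resolution: the image resolution used during training; 512 is recommended.
  • output_dir: the directory where the model is saved during training.
  • checkpointing_steps: how often, in steps, a checkpoint is saved; the default is 100.
  • learning_rate: the learning rate used for training. Training with LoRA needs a larger learning rate, so 1e-4 is used here instead of 2e-6.
  • max_train_steps: the maximum number of training steps; the default is 500.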

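Optional parameters:

  • train_batch_size: the batch size used during training; it can be increased when the GPU has plenty of memory. The default is 4.
  • gradient_accumulation_steps: the number of gradient accumulation steps. When GPU memory is limited but a larger training batch is still wanted, increase this value accordingly. The default is 1.
  • seed: the random seed; setting it makes the training result reproducible.
  • lora_rank: the rank of the LoRA layers. The default is 4, which yields a final model of about 3.5MB; the value can be changed as appropriate, for example to 32, 64, 128 or 256.
  • lr_scheduler: the learning rate decay strategy, for example "linear", "constant" or "cosine".
  • lr_warmup_steps: the number of warmup steps needed to reach the maximum learning rate before the learning rate decay begins.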
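
The interplay between train_batch_size, gradient_accumulation_steps and max_train_steps can be made concrete with a little arithmetic. The sketch below follows the usual convention in DreamBooth-style training scripts (one optimizer step per accumulated group of batches); whether train_dreambooth_lora.py counts steps in exactly this way is an assumption, and the dataset size of 12 images is purely hypothetical.

```python
import math


def effective_batch_size(train_batch_size, gradient_accumulation_steps):
    """Number of images that contribute to a single optimizer step."""
    return train_batch_size * gradient_accumulation_steps


def optimizer_steps_per_epoch(num_images, train_batch_size, gradient_accumulation_steps):
    """Optimizer steps needed to see every image once."""
    batches = math.ceil(num_images / train_batch_size)
    return math.ceil(batches / gradient_accumulation_steps)


def epochs_needed(max_train_steps, steps_per_epoch):
    """Number of passes over the dataset needed to reach max_train_steps."""
    return math.ceil(max_train_steps / steps_per_epoch)


# Hypothetical dataset of 12 images with the settings used later in this tutorial.
num_images = 12
steps = optimizer_steps_per_epoch(num_images, train_batch_size=2, gradient_accumulation_steps=1)
print(effective_batch_size(2, 1), steps, epochs_needed(800, steps))
```

Setting train_batch_size=1 with gradient_accumulation_steps=4 gives an effective batch of 4 while only one image is held in GPU memory per forward pass, which is the trade-off described for gradient accumulation above.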

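Parameters used for evaluation during training:

  • num_validation_images: how many images to generate at each evaluation during training; the default is 4.
  • validation_prompt: the model is evaluated during training to see how well it is doing, so a prompt text for evaluation needs to be set.
  • validation_steps: how often, in steps, the model is evaluated; the training progress bar shows the current step.

Tips: During training, the generated images are saved every validation_steps steps to {your output directory}/validation_images/{step}.jpg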
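
To know in advance which validation images to look for, the path pattern from the tip can be expanded for a given configuration. This is a small sketch; the helper names are hypothetical, and it assumes that validation runs whenever the step count is a multiple of validation_steps.

```python
from pathlib import Path


def validation_steps_reached(max_train_steps, validation_steps):
    """Steps at which validation is assumed to run: every multiple of validation_steps."""
    return list(range(validation_steps, max_train_steps + 1, validation_steps))


def validation_image_paths(output_dir, max_train_steps, validation_steps):
    """Expected image paths, following {output_dir}/validation_images/{step}.jpg."""
    base = Path(output_dir) / "validation_images"
    return [
        base / f"{step}.jpg"
        for step in validation_steps_reached(max_train_steps, validation_steps)
    ]


for path in validation_image_paths("lora_outputs", max_train_steps=800, validation_steps=100):
    print(path.as_posix(), "exists" if path.exists() else "missing")
```

Comparing these images across steps is a practical way to decide which checkpoint to keep.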


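Parameters for uploading the weights:

  • push_to_hub: whether to upload the model to the Hugging Face Hub; the default is False.
  • hub_token: the token used for uploading to the Hugging Face Hub; it can be left empty if you are already logged in.
  • hub_model_id: the name of the model repository on the Hugging Face Hub; if it is None, the name of output_dir is used as the repository name.

Since we have already logged in, push_to_hub can be turned on so that the final trained model is also uploaded to huggingface.co. Note that the command shown below passes --push_to_hub=False; the weights are then uploaded manually in section 2.3.

Once push_to_hub is enabled, the weights are uploaded automatically after the program finishes, to a path of the form huggingface.co/{your username}/{your… , for example: huggingface.co/junnyu/lora…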
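
When experimenting with several parameter combinations, it can be convenient to keep the settings in a dictionary and build the command line from it. The following is a sketch; `build_training_command` is a hypothetical helper, the flag names are the ones described above, and values are kept as strings so they appear on the command line exactly as written.

```python
import shlex


def build_training_command(script, params):
    """Turn a {flag: value} mapping into an argument list of the form --flag=value."""
    parts = ["python", script]
    for name, value in params.items():
        parts.append(f"--{name}={value}")
    return parts


params = {
    "pretrained_model_name_or_path": "runwayml/stable-diffusion-v1-5",
    "instance_data_dir": "train_dataset",
    "output_dir": "lora_outputs",
    "instance_prompt": "Chinese_ShanShui_Style",
    "resolution": "512",
    "train_batch_size": "2",
    "learning_rate": "1e-4",
    "max_train_steps": "800",
    "lora_rank": "4",
}

command = build_training_command("train_dreambooth_lora.py", params)
# shlex.join quotes any value that contains spaces, so the line can be pasted into a shell.
print(shlex.join(command))
```

The resulting list can also be passed to `subprocess.run(command, check=True)` instead of being pasted into a notebook cell.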

!python train_dreambooth_lora.py \
    --pretrained_model_name_or_path="runwayml/stable-diffusion-v1-5"  \
    --instance_data_dir="train_dataset" \
    --output_dir="lora_outputs" \
    --instance_prompt="Chinese_ShanShui_Style" \
    --resolution=512 \
    --train_batch_size=2 \
    --gradient_accumulation_steps=1 \
    --checkpointing_steps=100 \
    --learning_rate=1e-4 \
    --lr_scheduler="constant" \
    --lr_warmup_steps=0 \
    --max_train_steps=800 \
    --seed=0 \
    --lora_rank=4 \
    --push_to_hub=False \
    --validation_prompt="A little black cat is playing in the woods with Chinese_ShanShui_Style" \
    --validation_steps=100 \
    --num_validation_images=4
W0323 16:10:06.002939  5675 gpu_resources.cc:85] Please NOTE: device: 0, GPU Compute Capability: 7.0, Driver API Version: 11.2, Runtime API Version: 11.2
W0323 16:10:06.007860  5675 gpu_resources.cc:115] device: 0, cuDNN Version: 8.2.
Downloading model weights, please wait patiently...
[2023-03-23 16:10:08,262] [ WARNING] - You are using a model of type clip_text_model to instantiate a model of type . This is not supported for all configurations of models and can yield errors.
Train Steps:  12%|█       | 100/800 [00:57<06:29,  1.80it/s, epoch=0016, lr=0.0001, step_loss=0.261]
 Saved lora weights to lora_outputs/checkpoint-100
100%|███████████████████████████████████████████| 601/601 [00:00<00:00, 171kB/s]
100%|███████████████████████████████████████████| 342/342 [00:00<00:00, 113kB/s]
Train Steps:  25%|█▊     | 200/800 [02:16<05:41,  1.76it/s, epoch=0033, lr=0.0001, step_loss=0.0311]
 Saved lora weights to lora_outputs/checkpoint-200
Train Steps:  38%|███     | 300/800 [03:35<04:38,  1.80it/s, epoch=0049, lr=0.0001, step_loss=0.113]
 Saved lora weights to lora_outputs/checkpoint-300
Train Steps:  50%|████    | 400/800 [04:53<03:44,  1.78it/s, epoch=0066, lr=0.0001, step_loss=0.118]
 Saved lora weights to lora_outputs/checkpoint-400
Train Steps:  62%|█████   | 500/800 [06:11<02:50,  1.76it/s, epoch=0083, lr=0.0001, step_loss=0.167]
 Saved lora weights to lora_outputs/checkpoint-500
Train Steps:  75%|██████▊  | 600/800 [07:30<01:52,  1.78it/s, epoch=0099, lr=0.0001, step_loss=0.11]
 Saved lora weights to lora_outputs/checkpoint-600
Train Steps:  88%|█████▎| 700/800 [08:49<00:56,  1.78it/s, epoch=0116, lr=0.0001, step_loss=0.00746]
 Saved lora weights to lora_outputs/checkpoint-700
Train Steps: 100%|███████| 800/800 [10:08<00:00,  1.74it/s, epoch=0133, lr=0.0001, step_loss=0.0411]
 Saved lora weights to lora_outputs/checkpoint-800
Model weights saved in lora_outputs/paddle_lora_weights.pdparams
Train Steps: 100%|███████| 800/800 [11:05<00:00,  1.20it/s, epoch=0133, lr=0.0001, step_loss=0.0411]
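
The loss values in the progress lines above fluctuate considerably from step to step, so it can help to extract them and view them together. The sketch below parses progress lines of the format shown in this log with a regular expression; `parse_progress_line` is a hypothetical helper, and the pattern is only known to match the lines shown here.

```python
import re

# Matches "| 100/800 [ ... step_loss=0.261]" in a progress line.
PROGRESS_RE = re.compile(r"\|\s*(\d+)/(\d+)\s*\[.*?step_loss=([0-9.eE+-]+)\]")


def parse_progress_line(line):
    """Return (step, total_steps, step_loss) for a progress line, or None for other lines."""
    match = PROGRESS_RE.search(line)
    if match is None:
        return None
    step, total, loss = match.groups()
    return int(step), int(total), float(loss)


log_lines = [
    "Train Steps:  12%|█       | 100/800 [00:57<06:29,  1.80it/s, epoch=0016, lr=0.0001, step_loss=0.261]",
    " Saved lora weights to lora_outputs/checkpoint-100",
    "Train Steps:  25%|█▊     | 200/800 [02:16<05:41,  1.76it/s, epoch=0033, lr=0.0001, step_loss=0.0311]",
]

for line in log_lines:
    parsed = parse_progress_line(line)
    if parsed is not None:
        print(parsed)
```

Because step_loss is the loss of a single batch, individual values are noisy; the validation images are usually a better guide to quality than any one loss value.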


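2.3 Upload the chosen weights to Hugging Face

Parameter description:

  • upload_dir: the folder to upload.
  • repo_name: the name of the repo to upload to; the files end up at huggingface.co/{your username}/{your… for example: huggingface.co/junnyu/lora….
  • pretrained_model_name_or_path: the base model used to train this model.
  • prompt: the prompt text to use together with these weights.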
from utils import upload_lora_folder
upload_dir                    = "lora_outputs"                   # the folder to upload
repo_name                     = "Chinese_ShanShui_Style"         # the name of the repo to upload to
pretrained_model_name_or_path = "runwayml/stable-diffusion-v1-5" # the base model used to train this model
prompt                        = "Chinese_ShanShui_Style"         # the prompt text to use with these weights
upload_lora_folder(
    upload_dir=upload_dir,
    repo_name=repo_name,
    pretrained_model_name_or_path=pretrained_model_name_or_path,
    prompt=prompt,
)
Pushing to livingbody/Chinese_ShanShui_Style
Upload 1 LFS files:   0%|          | 0/1 [00:00<?, ?it/s]
paddle_lora_weights.pdparams:   0%|          | 0.00/3.23M [00:00<?, ?B/s]
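
The `upload_lora_folder` helper comes from the project's own `utils` module, and its internals are not shown here. For readers without that module, a folder can also be uploaded with the `huggingface_hub` client directly. The sketch below is an alternative rather than a reimplementation of `upload_lora_folder`: it only uploads the files and does not record the base model or prompt. The helper names `build_repo_id` and `upload_folder_to_hub` are hypothetical.

```python
def build_repo_id(username, repo_name):
    """Combine a user name and repo name into the '{username}/{repo_name}' form used by the Hub."""
    username = username.strip()
    repo_name = repo_name.strip()
    if not username or not repo_name:
        raise ValueError("username and repo_name must not be empty")
    if "/" in username or "/" in repo_name:
        raise ValueError("username and repo_name must not contain '/'")
    return f"{username}/{repo_name}"


def upload_folder_to_hub(upload_dir, repo_id):
    """Create the model repo if needed and upload every file in upload_dir to it."""
    # Imported lazily; requires huggingface_hub and a prior login.
    from huggingface_hub import HfApi

    api = HfApi()
    api.create_repo(repo_id=repo_id, repo_type="model", exist_ok=True)
    api.upload_folder(folder_path=upload_dir, repo_id=repo_id, repo_type="model")


print(build_repo_id("livingbody", "Chinese_ShanShui_Style"))
```

Calling `upload_folder_to_hub("lora_outputs", build_repo_id("livingbody", "Chinese_ShanShui_Style"))` would target the same repository as the "Pushing to" line in the output above.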

2.4 Generate another image

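from ppdiffusers import DiffusionPipeline, DPMSolverMultistepScheduler
import paddle
# Load the base model that the LoRA weights were trained on
pipe = DiffusionPipeline.from_pretrained("runwayml/stable-diffusion-v1-5")
# Replace the default scheduler with the DPM-Solver multistep scheduler
pipe.scheduler = DPMSolverMultistepScheduler.from_config(pipe.scheduler.config)
# Load the trained LoRA attention weights from the training output directory
pipe.unet.load_attn_procs("lora_outputs/", from_hf_hub=True)
# The prompt includes the instance prompt used during training to trigger the style
prompt = "2 man are walking in the woods with Chinese_ShanShui_Style"
image = pipe(prompt, num_inference_steps=25).images[0]
image.save("demo.png")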
[2023-03-23 17:07:14,171] [    INFO] - Already cached /home/aistudio/.paddlenlp/models/runwayml/stable-diffusion-v1-5/model_index.json
[2023-03-23 17:07:14,176] [    INFO] - Already cached /home/aistudio/.paddlenlp/models/runwayml/stable-diffusion-v1-5/vae/model_state.pdparams
[2023-03-23 17:07:14,179] [    INFO] - Already cached /home/aistudio/.paddlenlp/models/runwayml/stable-diffusion-v1-5/vae/config.json
[2023-03-23 17:07:14,870] [    INFO] - Found /home/aistudio/.paddlenlp/models/runwayml/stable-diffusion-v1-5/safety_checker/config.json
[2023-03-23 17:07:14,875] [    INFO] - loading configuration file /home/aistudio/.paddlenlp/models/runwayml/stable-diffusion-v1-5/safety_checker/config.json
[2023-03-23 17:07:14,878] [    INFO] - Model config CLIPVisionConfig {
  "architectures": [
    "StableDiffusionSafetyChecker"
  ],
  "attention_dropout": 0.0,
  "dropout": 0.0,
  "hidden_act": "quick_gelu",
  "hidden_size": 1024,
  "image_size": 224,
  "initializer_factor": 1.0,
  "initializer_range": 0.02,
  "intermediate_size": 4096,
  "layer_norm_eps": 1e-05,
  "model_type": "clip_vision_model",
  "num_attention_heads": 16,
  "num_channels": 3,
  "num_hidden_layers": 24,
  "paddlenlp_version": null,
  "patch_size": 14,
  "projection_dim": 768,
  "return_dict": true
}
[2023-03-23 17:07:14,987] [    INFO] - Found /home/aistudio/.paddlenlp/models/runwayml/stable-diffusion-v1-5/safety_checker/model_state.pdparams
[2023-03-23 17:07:17,520] [    INFO] - All model checkpoint weights were used when initializing StableDiffusionSafetyChecker.
[2023-03-23 17:07:17,525] [    INFO] - All the weights of StableDiffusionSafetyChecker were initialized from the model checkpoint at runwayml/stable-diffusion-v1-5/safety_checker.
If your task is similar to the task the model of the checkpoint was trained on, you can already use StableDiffusionSafetyChecker for predictions without further training.
[2023-03-23 17:07:17,531] [    INFO] - Already cached /home/aistudio/.paddlenlp/models/runwayml/stable-diffusion-v1-5/tokenizer/vocab.json
[2023-03-23 17:07:17,533] [    INFO] - Already cached /home/aistudio/.paddlenlp/models/runwayml/stable-diffusion-v1-5/tokenizer/merges.txt
[2023-03-23 17:07:17,536] [    INFO] - Already cached /home/aistudio/.paddlenlp/models/runwayml/stable-diffusion-v1-5/tokenizer/added_tokens.json
[2023-03-23 17:07:17,538] [    INFO] - Already cached /home/aistudio/.paddlenlp/models/runwayml/stable-diffusion-v1-5/tokenizer/special_tokens_map.json
[2023-03-23 17:07:17,541] [    INFO] - Already cached /home/aistudio/.paddlenlp/models/runwayml/stable-diffusion-v1-5/tokenizer/tokenizer_config.json
[2023-03-23 17:07:17,724] [    INFO] - Found /home/aistudio/.paddlenlp/models/runwayml/stable-diffusion-v1-5/text_encoder/config.json
[2023-03-23 17:07:17,728] [    INFO] - loading configuration file /home/aistudio/.paddlenlp/models/runwayml/stable-diffusion-v1-5/text_encoder/config.json
[2023-03-23 17:07:17,731] [    INFO] - Model config CLIPTextConfig {
  "_name_or_path": "openai/clip-vit-large-patch14",
  "architectures": [
    "CLIPTextModel"
  ],
  "attention_dropout": 0.0,
  "bos_token_id": 0,
  "dropout": 0.0,
  "eos_token_id": 2,
  "hidden_act": "quick_gelu",
  "hidden_size": 768,
  "initializer_factor": 1.0,
  "initializer_range": 0.02,
  "intermediate_size": 3072,
  "layer_norm_eps": 1e-05,
  "max_position_embeddings": 77,
  "model_type": "clip_text_model",
  "num_attention_heads": 12,
  "num_hidden_layers": 12,
  "pad_token_id": 1,
  "paddlenlp_version": null,
  "projection_dim": 512,
  "return_dict": true,
  "torch_dtype": "float32",
  "transformers_version": "4.21.0.dev0",
  "vocab_size": 49408
}
[2023-03-23 17:07:17,891] [    INFO] - Found /home/aistudio/.paddlenlp/models/runwayml/stable-diffusion-v1-5/text_encoder/model_state.pdparams
[2023-03-23 17:07:18,926] [    INFO] - All model checkpoint weights were used when initializing CLIPTextModel.
[2023-03-23 17:07:18,930] [    INFO] - All the weights of CLIPTextModel were initialized from the model checkpoint at runwayml/stable-diffusion-v1-5/text_encoder.
If your task is similar to the task the model of the checkpoint was trained on, you can already use CLIPTextModel for predictions without further training.
[2023-03-23 17:07:18,936] [    INFO] - Found /home/aistudio/.paddlenlp/models/runwayml/stable-diffusion-v1-5/feature_extractor/preprocessor_config.json
[2023-03-23 17:07:18,940] [    INFO] - loading configuration file https://bj.bcebos.com/paddlenlp/models/community/runwayml/stable-diffusion-v1-5/feature_extractor/preprocessor_config.json from cache at /home/aistudio/.paddlenlp/models/runwayml/stable-diffusion-v1-5/feature_extractor/preprocessor_config.json
[2023-03-23 17:07:18,943] [    INFO] - size should be a dictionary on of the following set of keys: ({'width', 'height'}, {'shortest_edge'}, {'shortest_edge', 'longest_edge'}), got 224. Converted to {'shortest_edge': 224}.
[2023-03-23 17:07:18,946] [    INFO] - crop_size should be a dictionary on of the following set of keys: ({'width', 'height'}, {'shortest_edge'}, {'shortest_edge', 'longest_edge'}), got 224. Converted to {'height': 224, 'width': 224}.
[2023-03-23 17:07:18,949] [    INFO] - Image processor CLIPFeatureExtractor {
  "crop_size": {
    "height": 224,
    "width": 224
  },
  "do_center_crop": true,
  "do_convert_rgb": true,
  "do_normalize": true,
  "do_rescale": true,
  "do_resize": true,
  "feature_extractor_type": "CLIPFeatureExtractor",
  "image_mean": [
    0.48145466,
    0.4578275,
    0.40821073
  ],
  "image_processor_type": "CLIPFeatureExtractor",
  "image_std": [
    0.26862954,
    0.26130258,
    0.27577711
  ],
  "resample": 3,
  "rescale_factor": 0.00392156862745098,
  "size": {
    "shortest_edge": 224
  }
}
[2023-03-23 17:07:18,951] [    INFO] - Already cached /home/aistudio/.paddlenlp/models/runwayml/stable-diffusion-v1-5/unet/model_state.pdparams
[2023-03-23 17:07:18,954] [    INFO] - Already cached /home/aistudio/.paddlenlp/models/runwayml/stable-diffusion-v1-5/unet/config.json
[2023-03-23 17:07:28,517] [    INFO] - Already cached /home/aistudio/.paddlenlp/models/runwayml/stable-diffusion-v1-5/scheduler/scheduler_config.json
  0%|          | 0/25 [00:00<?, ?it/s]
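
To generate several images in the trained style, the steps above can be wrapped in a loop. The sketch below reuses only the PPDiffusers calls that appear in this section; the helpers `build_prompt`, `output_filename` and `generate_images` are hypothetical, and the way the style token is appended simply mirrors the prompts used in this tutorial.

```python
def build_prompt(subject, style_token="Chinese_ShanShui_Style"):
    """Append the instance prompt used in training to a subject description."""
    return f"{subject.strip()} with {style_token}"


def output_filename(index, prefix="demo"):
    """File name for the image with the given index, for example demo_03.png."""
    return f"{prefix}_{index:02d}.png"


def generate_images(subjects, lora_dir="lora_outputs/"):
    """Generate one image per subject with the base model plus the trained LoRA weights."""
    # Imported lazily; requires ppdiffusers and the weights produced by the training step.
    from ppdiffusers import DiffusionPipeline, DPMSolverMultistepScheduler

    pipe = DiffusionPipeline.from_pretrained("runwayml/stable-diffusion-v1-5")
    pipe.scheduler = DPMSolverMultistepScheduler.from_config(pipe.scheduler.config)
    # Same loading call as in the cell above.
    pipe.unet.load_attn_procs(lora_dir, from_hf_hub=True)
    for index, subject in enumerate(subjects):
        image = pipe(build_prompt(subject), num_inference_steps=25).images[0]
        image.save(output_filename(index))


print(build_prompt("A little black cat is playing in the woods"))
```

For example, `generate_images(["2 man are walking in the woods", "A little black cat is playing in the woods"])` would write demo_00.png and demo_01.png.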


The code is available at: aistudio.baidu.com/aistudio/pr…
