[PPDiffusers] ppdiffuser LDM weight to original LDM weight script (#3809)

* Convert PPDiffusers LDM weights to original LDM weights

* typo

* update args

Co-authored-by: gongenlei <gongel@qq.com>
JunnYu and gongenlei authored Nov 19, 2022
1 parent de3509c commit cd78a9a
Showing 6 changed files with 395 additions and 14 deletions.
13 changes: 9 additions & 4 deletions ppdiffusers/examples/text_to_image_laion400m/README.md
@@ -2,18 +2,23 @@

This tutorial walks through training the 32-layer **Latent Diffusion Model** (switching between the `Chinese` and `English` tokenizers is supported).

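___Note___:
___The official 32-layer `CompVis/ldm-text2im-large-256` Latent Diffusion Model uses a vae, not a vqvae! When the Hugging Face team designed the directory layout, they mistakenly named the folder vqvae. To stay consistent with Hugging Face, we use the vqvae folder name as well!___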

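## 1 Running locally
### 1.1 Installing dependencies

Before running the training code, install the training dependencies below.

___Note___:
___This code currently requires the develop branch of paddlenlp and the develop branch of ppdiffusers to run correctly!___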

```bash
# Install the develop build of paddle for cuda11.2 and python 3.7, at commit b96a21df4e7a42b2445104426e2be407534705e6.
wget https://paddlenlp.bj.bcebos.com/models/community/CompVis/paddlepaddle_gpu-0.0.0.post112-cp37-cp37m-linux_x86_64.whl
pip install paddlepaddle_gpu-0.0.0.post112-cp37-cp37m-linux_x86_64.whl
# Install the pinned versions of paddlenlp and ppdiffusers.
pip install paddlenlp==2.4.2 ppdiffusers==0.6.2
pip install -U visualdl fastcore Pillow
# Note: training here currently requires the develop branch of paddlenlp and the develop branch of ppdiffusers.
pip install -U paddlenlp ppdiffusers visualdl fastcore Pillow
```

### 1.2 Preparing data
@@ -239,7 +244,7 @@ python generate_pipelines.py \
```shell
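├── ldm_pipelines  # the output directory we specified
    ├── model_index.json  # model index file
    ├── vqvae  # vae weights folder
    ├── vqvae  # vae weights folder! This is actually a vae model; the folder name is kept consistent with HF!
        ├── model_state.pdparams
        ├── config.json
    ├── bert  # ldmbert weights folder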
10 changes: 4 additions & 6 deletions ppdiffusers/examples/text_to_image_laion400m/ldm/ldm_args.py
@@ -61,17 +61,15 @@ class DataArguments:
"""
Arguments pertaining to what data we are going to input our model for training.
"""
file_list: Optional[str] = field(
default="./data/filelist/train.filelist.list",
metadata={"help": "The name of the file_list."})
resolution: Optional[str] = field(
file_list: str = field(default="./data/filelist/train.filelist.list",
metadata={"help": "The name of the file_list."})
resolution: int = field(
default=256,
metadata={
"help":
"The resolution for input images, all the images in the train/validation dataset will be resized to this resolution."
})
num_records: Optional[str] = field(default=10000000,
metadata={"help": "num_records"})
num_records: int = field(default=10000000, metadata={"help": "num_records"})
buffer_size: int = field(
default=100,
metadata={"help": "Buffer size"},
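
The annotation change in this hunk matters when a parser derives each command-line option's type from the dataclass field, which is the pattern `HfArgumentParser`-style parsers follow. The sketch below is a minimal standard-library stand-in, not the parser this repository uses; `build_parser`, `parse_data_args`, and the trimmed `DataArguments` are illustrative assumptions. With `resolution: int`, the option `--resolution 512` arrives as an integer rather than a string.

```python
import argparse
from dataclasses import MISSING, dataclass, field, fields


@dataclass
class DataArguments:
    # Trimmed copy of the updated fields from the hunk above.
    file_list: str = field(
        default="./data/filelist/train.filelist.list",
        metadata={"help": "The name of the file_list."},
    )
    resolution: int = field(
        default=256,
        metadata={"help": "The resolution for input images."},
    )
    num_records: int = field(default=10000000, metadata={"help": "num_records"})


def build_parser(dataclass_type):
    """Create one --option per dataclass field, typed by the field annotation."""
    parser = argparse.ArgumentParser()
    for f in fields(dataclass_type):
        parser.add_argument(
            f"--{f.name}",
            type=f.type,  # assumes the annotation is a plain callable such as int or str
            default=None if f.default is MISSING else f.default,
            help=f.metadata.get("help"),
        )
    return parser


def parse_data_args(argv):
    """Parse a list of CLI tokens into a DataArguments instance."""
    namespace = build_parser(DataArguments).parse_args(argv)
    return DataArguments(**vars(namespace))
```

Calling `parse_data_args(["--resolution", "512"])` yields a `DataArguments` whose `resolution` can be passed straight to image-resizing code without a manual `int(...)` conversion.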
2 changes: 1 addition & 1 deletion ppdiffusers/examples/text_to_image_laion400m/ldm/model.py
@@ -49,7 +49,7 @@ def __init__(self, model_args):

# init vae
vae_name_or_path = model_args.vae_name_or_path if model_args.pretrained_model_name_or_path is None else os.path.join(
model_args.pretrained_model_name_or_path, "vae")
model_args.pretrained_model_name_or_path, "vqvae")
self.vae = AutoencoderKL.from_pretrained(vae_name_or_path)
freeze_params(self.vae.parameters())
logger.info("Freeze vae parameters!")
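
The path selection in this hunk can be read in isolation as the following sketch. `resolve_vae_path` is a hypothetical helper written for illustration, not a function in `model.py`; it mirrors the updated logic, in which a pipeline directory takes precedence and the VAE is looked up under its `vqvae` subfolder.

```python
import os
from typing import Optional


def resolve_vae_path(vae_name_or_path: Optional[str],
                     pretrained_model_name_or_path: Optional[str]) -> Optional[str]:
    """Pick the location the AutoencoderKL weights are loaded from."""
    if pretrained_model_name_or_path is None:
        # No full pipeline given: fall back to the standalone VAE argument.
        return vae_name_or_path
    # Pipeline given: the VAE lives in the "vqvae" subfolder (name kept for HF parity).
    return os.path.join(pretrained_model_name_or_path, "vqvae")
```

For example, `resolve_vae_path(None, "./ldm_pipelines")` points at the `vqvae` folder inside `./ldm_pipelines`, which is the folder name `generate_pipelines.py` writes.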
45 changes: 42 additions & 3 deletions ppdiffusers/examples/text_to_image_laion400m/scripts/README.md
@@ -1,6 +1,10 @@
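# Converting original PyTorch LDM weights to PPDiffusers weights
# LDM weight conversion scripts
This directory contains two scripts:
- **convert_orig_ldm_ckpt_to_ppdiffusers.py**: converts original PyTorch LDM weights to PPDiffusers LDM weights.
- **convert_ppdiffusers_to_orig_ldm_ckpt.py**: converts PPDiffusers LDM weights to original LDM weights.

## 1. Converting weights
## 1. Converting original PyTorch LDM weights to PPDiffusers LDM weights
### 1.1 Converting weights
Assume the original weights `"ldm_1p4b_init0.ckpt"` are already available.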
```bash
python convert_orig_ldm_ckpt_to_ppdiffusers.py \
@@ -9,7 +13,7 @@ python convert_orig_ldm_ckpt_to_ppdiffusers.py \
--original_config_file text2img_L32H1280_unet800M.yaml
```

## 2. Inference
### 1.2 Inference
```python
import paddle
from ppdiffusers import LDMTextToImagePipeline
@@ -19,3 +23,38 @@ prompt = "a blue tshirt"
image = pipe(prompt, guidance_scale=7.5)[0][0]
image.save("demo.jpg")
```

## 2. Converting PPDiffusers LDM weights to original LDM weights
### 2.1 Converting weights
Assume we have already generated the `ldm_pipelines` directory with `../generate_pipelines.py`.
```shell
├── ldm_pipelines  # the output directory we specified
    ├── model_index.json  # model index file
    ├── vqvae  # vae weights folder! This is actually a vae model; the folder name is kept consistent with HF!
        ├── model_state.pdparams
        ├── config.json
    ├── bert  # ldmbert weights folder
        ├── model_config.json
        ├── model_state.pdparams
    ├── unet  # unet weights folder
        ├── model_state.pdparams
        ├── config.json
    ├── scheduler  # ddim scheduler folder
        ├── scheduler_config.json
    ├── tokenizer  # bert tokenizer folder
        ├── tokenizer_config.json
        ├── special_tokens_map.json
        ├── vocab.txt
```

```bash
python convert_ppdiffusers_to_orig_ldm_ckpt.py \
--model_name_or_path ./ldm_pipelines \
--dump_path ldm_19w.ckpt
```
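
Before running the conversion, it may help to confirm that the pipeline directory matches the layout listed above. The sketch below is a convenience check under stated assumptions, not part of the conversion script: `missing_pipeline_files` is a hypothetical helper, and the expected file list is copied from the directory tree in this README.

```python
from pathlib import Path

# Relative paths copied from the directory tree above.
EXPECTED_FILES = [
    "model_index.json",
    "vqvae/model_state.pdparams",
    "vqvae/config.json",
    "bert/model_config.json",
    "bert/model_state.pdparams",
    "unet/model_state.pdparams",
    "unet/config.json",
    "scheduler/scheduler_config.json",
    "tokenizer/tokenizer_config.json",
    "tokenizer/special_tokens_map.json",
    "tokenizer/vocab.txt",
]


def missing_pipeline_files(root):
    """Return the expected files that are absent under the pipeline directory."""
    root = Path(root)
    return [rel for rel in EXPECTED_FILES if not (root / rel).is_file()]
```

An empty list from `missing_pipeline_files("./ldm_pipelines")` means every file in the tree is present; any returned entries name the files to regenerate before converting.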

### 2.2 Inference
Generate images with the `CompVis` [original txt2img.py](https://github.com/CompVis/latent-diffusion/blob/main/scripts/txt2img.py) script.
```shell
python ./txt2img.py --prompt "a blue t shirt" --ddim_eta 0.0 --n_samples 1 --n_iter 1 --scale 7.5 --ddim_steps 50
```
@@ -1,5 +1,6 @@
# Copyright (c) 2022 PaddlePaddle Authors. All Rights Reserved.
# Copyright 2022 The HuggingFace Inc. team.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at