[Contribs] Add CPS #3075

Merged 8 commits on Apr 23, 2023
148 changes: 148 additions & 0 deletions contrib/CrossPseudoSupervision/README.md
@@ -0,0 +1,148 @@
English | [简体中文](README_CN.md)

# Semi-Supervised Semantic Segmentation with Cross Pseudo Supervision

Unlike image classification, **data annotation for semantic segmentation is difficult and costly**: every pixel in the image must be labeled, including objects with fine details such as electric poles. Compared with such dense pixel-level annotation, collecting raw RGB data is relatively simple. Thus, **how to exploit large amounts of unlabeled data to improve model performance is a research hotspot in semi-supervised semantic segmentation**.

[Cross pseudo supervision (CPS)](https://arxiv.org/abs/2106.01226) is a **concise and high-performance** semi-supervised semantic segmentation algorithm. During training, two networks with the same structure but different initializations are used, and a consistency constraint encourages the two networks to produce similar outputs for the same sample. Specifically, the one-hot pseudo labels generated by one network serve as the training target for the other network, supervised with the cross-entropy loss, just as in conventional fully supervised semantic segmentation. **The algorithm achieved state-of-the-art (SOTA) results on two commonly used benchmarks (PASCAL VOC and Cityscapes)**.
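
For intuition, the following is a minimal sketch of the cross pseudo supervision term described above, not the implementation in this PR. It assumes two segmentation networks whose logits `logits_a` and `logits_b` have shape `[N, C, H, W]`; each network is supervised by the other's detached hard pseudo labels with ordinary cross entropy.

```python
import paddle
import paddle.nn.functional as F


def cps_loss(logits_a, logits_b):
    # Hard (argmax / one-hot) pseudo labels from each network; detach so no
    # gradient flows back through the pseudo-label branch.
    pseudo_a = paddle.argmax(logits_a.detach(), axis=1)
    pseudo_b = paddle.argmax(logits_b.detach(), axis=1)
    # Each network is trained against the other's pseudo labels with the
    # same cross entropy used in fully supervised segmentation.
    loss_a = F.cross_entropy(logits_a, pseudo_b, axis=1)
    loss_b = F.cross_entropy(logits_b, pseudo_a, axis=1)
    return loss_a + loss_b
```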

Some visualization results are as follows (RGB image on the left, prediction image in the middle, and ground-truth segmentation map on the right):

![](https://user-images.githubusercontent.com/52785738/229003524-103fb081-dd36-4b19-b070-156d58467fe2.png)

![](https://user-images.githubusercontent.com/52785738/229003602-05cb2be1-8224-4600-8f6a-1ec58b909e47.png)

## Contents
- [Installation](#Installation)
- [Models](#Models)
- [Data Preparation](#Data-Preparation)
- [Training, Evaluation and Prediction](#Training-Evaluation-and-Prediction)

## Installation

- [PaddlePaddle Installation](https://www.paddlepaddle.org.cn/install/quick)
- Versions: PaddlePaddle develop (Nightly build), Python >= 3.7

- To install PaddleSeg, use the following commands:

```shell
git clone -b develop https://github.com/PaddlePaddle/PaddleSeg
cd PaddleSeg
pip install -r requirements.txt
pip install -v -e .
```

## Models

We reproduce the CPS.resnet50.deeplabv3+ (1/2 Cityscapes) setting from the original paper, in which 1/2 of the Cityscapes training samples are labeled. The reproduced model reaches an mIoU of **78.39%**. The comparison with the original paper is shown in the following table:

| CPS.resnet50.deeplabv3+ (1/2 Cityscapes) | mIoU |
| --- | --- |
| original paper | 78.77% |
| reproduced | 78.39% |

Please download the pretrained weights from [this link](https://paddleseg.bj.bcebos.com/dygraph/cross_pseudo_supervision/cityscapes/deeplabv3p_resnet50_cityscapes0.5.pdparams).

## Data Preparation

Use the Cityscapes dataset provided by the CPS source code. Download the `city` dataset from the [OneDrive link](https://pkueducn-my.sharepoint.com/:f:/g/personal/pkucxk_pku_edu_cn/EtjNKU0oVMhPkOKf9HTPlVsBIHYbACel6LSvcUeP4MXWVg?e=139icd) and place it in the `contrib/CrossPseudoSupervision/data` folder.

The dataset should be organized as follows:

```
data/
|-- city
├── config_new
│ ├── coarse_split
│ │ ├── train_extra_3000.txt
│ │ ├── train_extra_6000.txt
│ │ └── train_extra_9000.txt
│ ├── subset_train
│ │ ├── train_aug_labeled_1-16.txt
│ │ ├── train_aug_labeled_1-2.txt
│ │ ├── train_aug_labeled_1-4.txt
│ │ ├── train_aug_labeled_1-8.txt
│ │ ├── train_aug_unlabeled_1-16.txt
│ │ ├── train_aug_unlabeled_1-2.txt
│ │ ├── train_aug_unlabeled_1-4.txt
│ │ └── train_aug_unlabeled_1-8.txt
│ ├── test.txt
│ ├── train.txt
│ ├── train_val.txt
│ └── val.txt
├── generate_colored_gt.py
├── images
│ ├── test
│ ├── train
│ └── val
└── segmentation
├── test
├── train
└── val
```
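
The labeled/unlabeled split is defined by the text files under `config_new/subset_train`, whose suffixes (`1-2`, `1-4`, `1-8`, `1-16`) encode the labeled ratio. As a hedged illustration (assuming each non-empty line of a split file holds one sample path; the helper below is not part of the repo), the two files for a given ratio can be read as complementary subsets of the training set:

```python
def read_split(split_file):
    # Read one split file, assuming one sample path per non-empty line.
    with open(split_file) as f:
        return [line.strip() for line in f if line.strip()]


labeled = read_split("data/city/config_new/subset_train/train_aug_labeled_1-2.txt")
unlabeled = read_split("data/city/config_new/subset_train/train_aug_unlabeled_1-2.txt")
# The two lists are expected to partition the Cityscapes training set.
print(len(labeled), len(unlabeled))
```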

## Training, Evaluation and Prediction

Execute the following command to enter the `CrossPseudoSupervision` folder:
```shell
cd ./contrib/CrossPseudoSupervision
```

### Training

After preparing the environment and data, execute the following command to launch training:

```shell
export CUDA_VISIBLE_DEVICES=0
python train.py --config ./configs/deeplabv3p/deeplabv3p_resnet50_0.5cityscapes_800x800_240e.yml --log_iters 10 --save_dir ./output/ --batch_size 2
```

We recommend training the model with multiple GPUs on a single machine. Execute the following command to start training with four GPUs:

```shell
python -m paddle.distributed.launch --gpus="0,1,2,3" train.py --config ./configs/deeplabv3p/deeplabv3p_resnet50_0.5cityscapes_800x800_240e.yml \
--log_iters 10 --save_dir $SAVE_PATH$ --batch_size 8
```

- `SAVE_PATH`: The path to save files such as weights and logs.

**Note**:
1. The default configuration uses 1/2 labeled data. To change the labeled ratio, modify the `labeled_ratio` parameter in the configuration file. When the ratio of labeled data changes, the number of training epochs also needs to be adjusted according to the following table (modify the `nepochs` parameter in the configuration file to adjust the number of training epochs):

| Ratio | 1/16 | 1/8 | 1/4 | 1/2 |
| ---------- | ---- | ---- | ---- | ---- |
| nepochs | 128 | 137 | 160 | 240 |


### Evaluation

After training, execute the following commands to evaluate the model accuracy:

```shell
export CUDA_VISIBLE_DEVICES=0
python val.py \
--config ./configs/deeplabv3p/deeplabv3p_resnet50_0.5cityscapes_800x800_240e.yml \
--model_path $MODEL_PATH$
```

- `MODEL_PATH`: The path of model weights to load.

### Prediction

Execute the following commands to use sliding window inference for prediction:

```shell
export CUDA_VISIBLE_DEVICES=0
python predict.py \
--config ./configs/deeplabv3p/deeplabv3p_resnet50_0.5cityscapes_800x800_240e.yml \
--model_path $MODEL_PATH$ \
--image_path $IMG_PATH$ \
--save_dir $SAVE_PATH$ \
--is_slide \
--crop_size 800 800 \
--stride 532 532
```

- `IMG_PATH`: The path of the image or folder to be predicted.

You can also download the [pretrained weights](https://paddleseg.bj.bcebos.com/dygraph/cross_pseudo_supervision/cityscapes/deeplabv3p_resnet50_cityscapes0.5.pdparams) provided by us for prediction.
150 changes: 150 additions & 0 deletions contrib/CrossPseudoSupervision/README_CN.md
@@ -0,0 +1,150 @@
Simplified Chinese | [English](README.md)

# CPS: Semi-Supervised Semantic Segmentation with Cross Pseudo Supervision

Unlike image classification, **data annotation for semantic segmentation is difficult and costly**: every pixel in the image must be labeled, including objects with fine details such as electric poles. Compared with such dense pixel-level annotation, collecting raw RGB data is relatively simple. Thus, **how to exploit large amounts of unlabeled data to improve model performance is a research hotspot in semi-supervised semantic segmentation**.

[Cross pseudo supervision (CPS)](https://arxiv.org/abs/2106.01226) is a **concise and high-performance** semi-supervised semantic segmentation algorithm. During training, two networks with the same structure but different initializations are used, and a consistency constraint **encourages the two networks to produce similar outputs for the same sample**. Specifically, the one-hot pseudo labels generated by one network serve as the training target for the other network, supervised with the cross-entropy loss, just as in conventional fully supervised semantic segmentation. **The algorithm achieved state-of-the-art results on two benchmarks (PASCAL VOC and Cityscapes)**.

Some visualization results are shown below (RGB image on the left, prediction in the middle, and ground truth on the right):

![](https://user-images.githubusercontent.com/52785738/229003524-103fb081-dd36-4b19-b070-156d58467fe2.png)

![](https://user-images.githubusercontent.com/52785738/229003602-05cb2be1-8224-4600-8f6a-1ec58b909e47.png)



## Contents
- [Installation](#Installation)
- [Models](#Models)
- [Data Preparation](#Data-Preparation)
- [Training, Evaluation and Prediction](#Training-Evaluation-and-Prediction)

## Installation


Collaborator:
> For the environment setup, it would be best to state explicitly which Paddle version training and validation have been verified to work with.

Contributor Author:
> Using the develop version in the script task I reached 78.39%; version 2.4.0 cannot reach this accuracy, so I have changed this to the PaddlePaddle develop (Nightly build) version.
- [PaddlePaddle installation](https://www.paddlepaddle.org.cn/install/quick)
- Versions: PaddlePaddle develop (Nightly build), Python >= 3.7

- To install PaddleSeg, run the following commands:

```shell
git clone -b develop https://github.com/PaddlePaddle/PaddleSeg
cd PaddleSeg
pip install -r requirements.txt
pip install -v -e .
```

## Models

The default configuration of this project reproduces the CPS.resnet50.deeplabv3+ (1/2 Cityscapes) setting from the original paper, in which 50% of the samples are labeled. The reproduced model reaches an mIoU of **78.39%**. The comparison with the original paper is shown in the following table:

| CPS.resnet50.deeplabv3+ (1/2 Cityscapes) | mIoU |
| --- | --- |
| original paper | 78.77% |
| reproduced | 78.39% |

Please download the pretrained weights from [this link](https://paddleseg.bj.bcebos.com/dygraph/cross_pseudo_supervision/cityscapes/deeplabv3p_resnet50_cityscapes0.5.pdparams).

## Data Preparation

Use the Cityscapes dataset provided by the CPS source code. Download the `city` dataset from the [OneDrive link](https://pkueducn-my.sharepoint.com/:f:/g/personal/pkucxk_pku_edu_cn/EtjNKU0oVMhPkOKf9HTPlVsBIHYbACel6LSvcUeP4MXWVg?e=139icd) and place it in the `contrib/CrossPseudoSupervision/data` folder. The prepared data is organized as follows:

```
data/
|-- city
├── config_new
│ ├── coarse_split
│ │ ├── train_extra_3000.txt
│ │ ├── train_extra_6000.txt
│ │ └── train_extra_9000.txt
│ ├── subset_train
│ │ ├── train_aug_labeled_1-16.txt
│ │ ├── train_aug_labeled_1-2.txt
│ │ ├── train_aug_labeled_1-4.txt
│ │ ├── train_aug_labeled_1-8.txt
│ │ ├── train_aug_unlabeled_1-16.txt
│ │ ├── train_aug_unlabeled_1-2.txt
│ │ ├── train_aug_unlabeled_1-4.txt
│ │ └── train_aug_unlabeled_1-8.txt
│ ├── test.txt
│ ├── train.txt
│ ├── train_val.txt
│ └── val.txt
├── generate_colored_gt.py
├── images
│ ├── test
│ ├── train
│ └── val
└── segmentation
├── test
├── train
└── val
```

## Training, Evaluation and Prediction

Execute the following command to enter the `CrossPseudoSupervision` folder:

```shell
cd ./contrib/CrossPseudoSupervision
```

### Training

After preparing the environment and data, execute the following command to launch training:

```shell
export CUDA_VISIBLE_DEVICES=0
python train.py --config ./configs/deeplabv3p/deeplabv3p_resnet50_0.5cityscapes_800x800_240e.yml --log_iters 10 --save_dir ./output/ --batch_size 2
```

We recommend training with multiple GPUs on a single machine. Execute the following command to start training with four GPUs:

```shell
python -m paddle.distributed.launch --gpus="0,1,2,3" train.py --config ./configs/deeplabv3p/deeplabv3p_resnet50_0.5cityscapes_800x800_240e.yml \
--log_iters 10 --save_dir $SAVE_PATH$ --batch_size 8
```

- `SAVE_PATH`: The directory where weights, logs, and other files are saved.

**Note**:
1. The default configuration trains with 1/2 labeled data. To use another ratio, modify the `labeled_ratio` parameter in the configuration file. When the ratio of labeled data changes, the number of training epochs also needs to be adjusted according to the following table (modify the `nepochs` parameter in the configuration file to adjust the number of training epochs):

| Ratio | 1/16 | 1/8 | 1/4 | 1/2 |
| ---------- | ---- | ---- | ---- | ---- |
| nepochs | 128 | 137 | 160 | 240 |


### Evaluation

After training, execute the following command to evaluate the model accuracy:

```shell
export CUDA_VISIBLE_DEVICES=0
python val.py \
--config ./configs/deeplabv3p/deeplabv3p_resnet50_0.5cityscapes_800x800_240e.yml \
--model_path $MODEL_PATH$
```

- `MODEL_PATH`: The path of the model weights to load.

### Prediction

Execute the following command to run prediction with sliding-window inference:

```shell
export CUDA_VISIBLE_DEVICES=0
python predict.py \
--config ./configs/deeplabv3p/deeplabv3p_resnet50_0.5cityscapes_800x800_240e.yml \
--model_path $MODEL_PATH$ \
--image_path $IMG_PATH$ \
--save_dir $SAVE_PATH$ \
--is_slide \
--crop_size 800 800 \
--stride 532 532
```

- `IMG_PATH`: The path of the image or folder to be predicted.

This project also provides [pretrained weights](https://paddleseg.bj.bcebos.com/dygraph/cross_pseudo_supervision/cityscapes/deeplabv3p_resnet50_cityscapes0.5.pdparams) that can be used directly for prediction.
16 changes: 16 additions & 0 deletions contrib/CrossPseudoSupervision/batch_transforms/__init__.py
@@ -0,0 +1,16 @@
# Copyright (c) 2023 PaddlePaddle Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

from .mask_gen import BoxMaskGenerator, AddMaskParamsToBatch
from .custom_collate import SegCollate
25 changes: 25 additions & 0 deletions contrib/CrossPseudoSupervision/batch_transforms/custom_collate.py
@@ -0,0 +1,25 @@
# Copyright (c) 2023 PaddlePaddle Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

from paddle.fluid.dataloader.collate import default_collate_fn


class SegCollate(object):
    """Collate callable that optionally applies a batch-level augmentation
    (e.g. attaching CutMix mask parameters) before the default collation."""

    def __init__(self, batch_aug_fn=None):
        self.batch_aug_fn = batch_aug_fn

    def __call__(self, batch):
        # Apply the batch-level augmentation to the list of samples first,
        # then collate them into batched tensors.
        if self.batch_aug_fn is not None:
            batch = self.batch_aug_fn(batch)
        return default_collate_fn(batch)
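
As a hedged usage sketch (not part of this PR), `SegCollate` can be passed to a `paddle.io.DataLoader` as its `collate_fn`, with an optional batch-level hook applied to the list of samples before default collation. The dataset and augmentation below are hypothetical placeholders for illustration only.

```python
import numpy as np
from paddle.io import DataLoader, Dataset


class ToySegDataset(Dataset):  # hypothetical dataset, for illustration only
    def __len__(self):
        return 8

    def __getitem__(self, idx):
        image = np.zeros((3, 64, 64), dtype="float32")
        label = np.zeros((64, 64), dtype="int64")
        return image, label


def batch_aug(batch):
    # Placeholder for a batch-level augmentation (e.g. attaching mask
    # parameters); it must return the (possibly modified) list of samples.
    return batch


collate_fn = SegCollate(batch_aug_fn=batch_aug)
loader = DataLoader(ToySegDataset(), batch_size=4, collate_fn=collate_fn)
for images, labels in loader:
    pass  # images: [4, 3, 64, 64], labels: [4, 64, 64]
```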