By Luis Felipe Zeni and Claudio Jung.
Institute of Informatics, Federal University of Rio Grande do Sul, Brazil
This repository contains the PyTorch implementation of our paper Distilling Knowledge from Refinement in Multiple Instance Detection Networks, published in the Deep Vision 2020 CVPR workshop. (Go to the Contents section if you are interested in how to run the code.)
21-sep-2020: I reverted the code to an older version. I did a considerable refactoring to release the code, and some of those changes slightly impacted the final mAP. As I am short on time, I decided to go back to an older version (which is not as clean as the refactored one, but ends up with a better mAP). I also added a reproducibility section to this document where I explain why the results are not identical after training with the same seed.
25-may-2020: We finally received the results from the VOC 2012 evaluation server, and we beat C-MIL in detection mAP :). To the best of my knowledge, this is the best WSOD result on VOC so far. http://host.robots.ox.ac.uk:8080/anonymous/E7JSMD.html
In this work, we claim that carefully selecting the aggregation criteria can considerably improve the accuracy of the learned detector. We start by proposing an additional refinement step to an existing approach (OICR), which we call refinement knowledge distillation. Then, we present an adaptive supervision aggregation function that dynamically changes the aggregation criteria used to decide whether each box is assigned to one of the ground-truth classes, to the background, or is ignored when generating the supervision of each refinement module. We call these improvements "Boosted-OICR". A rough sketch of this style of pseudo-label aggregation is shown below.
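For intuition only, here is a minimal sketch of a generic OICR-style supervision aggregation for one refinement branch, written in PyTorch. It is not the exact adaptive function from the paper: the function name, the thresholds (fg_thresh, ignore_thresh), and the seed-box selection below are illustrative assumptions.

```python
# Illustrative sketch of OICR-style pseudo-label aggregation (NOT the paper's exact function).
import torch
from torchvision.ops import box_iou

def aggregate_supervision(proposals, prev_scores, image_labels,
                          fg_thresh=0.5, ignore_thresh=0.1):
    """proposals: (R, 4) boxes; prev_scores: (R, C) scores from the previous branch;
    image_labels: (C,) binary image-level labels.
    Returns per-proposal pseudo-labels: 0 = background, c + 1 = class c, -1 = ignored."""
    num_props = proposals.shape[0]
    labels = torch.zeros(num_props, dtype=torch.long)   # start everything as background
    best_iou = torch.zeros(num_props)
    for c in torch.nonzero(image_labels).flatten():      # only classes present in the image
        seed = prev_scores[:, c].argmax()                 # top-scoring proposal acts as a seed box
        iou = box_iou(proposals, proposals[seed].unsqueeze(0)).squeeze(1)
        take = (iou >= fg_thresh) & (iou > best_iou)      # high overlap with this class's seed
        labels[take] = int(c) + 1
        best_iou = torch.max(best_iou, iou)
    # proposals with intermediate overlap are ignored instead of forced to background
    ignore = (best_iou >= ignore_thresh) & (best_iou < fg_thresh)
    labels[ignore] = -1
    return labels
```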
Our code is under the MIT License (refer to the LICENSE file for details).
If you find our paper or our implementation useful in your research, please consider citing:
@inproceedings{zeni2020distilling,
title={Distilling Knowledge From Refinement in Multiple Instance Detection Networks},
author={Felipe Zeni, Luis and Jung, Claudio R},
booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops},
pages={768--769},
year={2020}
}
- Requirements: software
- Requirements: hardware
- Basic installation
- Installation for training and testing
- Extra Downloads (Models trained on PASCAL VOC)
- Usage
- About the training reproducibility
- Linux OS (I have not tested it on other operating systems)
- octave
- Python 3 packages and versions used (listed using pip freeze):
- certifi==2020.6.20
- cycler==0.10.0
- Cython==0.29.21
- kiwisolver==1.2.0
- matplotlib==3.3.2
- numpy==1.19.2
- opencv-python==4.2.0.34
- Pillow==7.2.0
- protobuf==3.13.0
- pycocotools==2.0.2
- pyparsing==2.4.7
- python-dateutil==2.8.1
- PyYAML==5.3.1
- six==1.15.0
- tensorboardX==2.1
- torch==1.2.0+cu92
- torchvision==0.4.0+cu92
- tqdm==4.49.0
- An NVIDIA GPU with CUDA support
- We used CUDA 10.0 and cuDNN 7.0
- We used an NVIDIA Titan Xp with 12 GB of memory, but it should be fine to train on a GPU with at least 8 GB.
- NOTICE: different versions of PyTorch have different memory usage.
- Docker
- If you are not using Docker to run your experiments, we highly recommend that you start. The 'docker' folder contains the Dockerfile to build a Docker container that runs our code ;)
1. Clone this repository
git clone https://github.com/luiszeni/Boosted-OICR && cd Boosted-OICR
2. [Optional] Build the Docker image and start a container. You need nvidia-docker installed on your host machine.
2.1. Enter the docker folder inside the repo
cd docker
2.2. Build the docker image
docker build . -t boicr
2.3. Return to the root of the repo ($BOOSTED_OICR_ROOT)
cd ..
2.4. Create a container from the image. I prefer to mount the code as an external volume from a folder on the host machine, which makes it easier to edit the code with a GUI text editor or IDE. This command will drop you into the container shell.
docker run --gpus all -v $(pwd):/root/Boosted-OICR --shm-size 12G -ti --name boicr boicr
2.5. If you exit the container at any point in the future, you can enter it again with this command.
docker start -ai boicr
Observation: I will not cover how to display windows from the container on the host X server using X11 forwarding. You will need it if you want to use the visualization scripts; there are many tutorials on the internet about X11 forwarding with Docker. A typical starting point is sketched below.
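For reference only, one common approach is to share the host's X socket and DISPLAY with the container when creating it; the exact flags are an assumption here and depend on your host's X setup.
xhost +local:docker
docker run --gpus all -v $(pwd):/root/Boosted-OICR --shm-size 12G -ti \
-e DISPLAY=$DISPLAY -v /tmp/.X11-unix:/tmp/.X11-unix:rw \
--name boicr boicr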
- Create a "data" folder in $BOOSTED_OICR_ROOT and enter it
mkdir data && cd data
- Download the training, validation, and test data, and the VOCdevkit
wget http://host.robots.ox.ac.uk/pascal/VOC/voc2007/VOCtrainval_06-Nov-2007.tar
wget http://host.robots.ox.ac.uk/pascal/VOC/voc2007/VOCtest_06-Nov-2007.tar
Optional mirror links for VOC (from darknet), normally faster to download:
wget https://pjreddie.com/media/files/VOCtrainval_06-Nov-2007.tar
wget https://pjreddie.com/media/files/VOCtest_06-Nov-2007.tar
- Extract all of these tars into one directory named VOCdevkit
tar xvf VOCtrainval_06-Nov-2007.tar
tar xvf VOCtest_06-Nov-2007.tar
- Download the VOCdevkit evaluation code adapted to Octave
wget http://inf.ufrgs.br/~lfazeni/CVPR_deepvision2020/VOCeval_octave.tar
- Extract VOCeval_octave
tar xvf VOCeval_octave.tar
- Download the PASCAL VOC annotations in the COCO format
wget http://inf.ufrgs.br/~lfazeni/CVPR_deepvision2020/coco_annotations_VOC.tar
- Extract the annotations
tar xvf coco_annotations_VOC.tar
- It should have this basic structure
$VOC2007/
$VOC2007/annotations
$VOC2007/JPEGImages
$VOC2007/VOCdevkit
# ... and several other directories ...
- [Optional] Download and extract PASCAL VOC 2012
wget http://host.robots.ox.ac.uk/pascal/VOC/voc2012/VOCtrainval_11-May-2012.tar
tar xvf VOCtrainval_11-May-2012.tar
or
wget https://pjreddie.com/media/files/VOCtrainval_11-May-2012.tar
tar xvf VOCtrainval_11-May-2012.tar
Observation: the 2012 test set is only available for download from the PASCAL VOC evaluation server. You must create an account and download it yourself; after downloading, you can extract it into the data folder.
- Download the proposal data generated by selective search
wget http://inf.ufrgs.br/~lfazeni/CVPR_deepvision2020/selective_search_data.tar
- Extract the proposals
tar xvf selective_search_data.tar
- Download the pre-trained VGG16 model
wget http://inf.ufrgs.br/~lfazeni/CVPR_deepvision2020/pretrained_model.tar
- Extract the pre-trained VGG16 model
tar xvf pretrained_model.tar
- [Optional] Delete the downloaded tar files to free space
rm *.tar
- Return to the root folder $BOOSTED_OICR_ROOT
cd ..
- Download the trained models to the root folder ($BOOSTED_OICR_ROOT)
wget http://inf.ufrgs.br/~lfazeni/CVPR_deepvision2020/trained_models.tar
- Extract it
tar xvf trained_models.tar
- Delete the tar file to free space
rm trained_models.tar
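To evaluate the downloaded model on the VOC 2007 test set (detection mAP):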
python3 code/tasks/test.py --cfg configs/baselines/vgg16_voc2007.yaml \
--dataset voc2007test \
--model oicr_lambda_log_distillation \
--load_ckpt snapshots/deepvision2020/oicr_lambda_log_distillation/final.pth \
--use_matlab
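To evaluate the downloaded model on the VOC 2007 trainval set, which is used to compute CorLoc: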
python3 code/tasks/test.py --cfg configs/baselines/vgg16_voc2007.yaml \
--dataset voc2007trainval \
--model oicr_lambda_log_distillation \
--load_ckpt snapshots/deepvision2020/oicr_lambda_log_distillation/final.pth
To train the Boosted-OICR network on the VOC 2007 trainval set:
python3 code/tasks/train.py --dataset voc2007 \
--cfg configs/baselines/vgg16_voc2007.yaml \
--bs 1 --nw 4 --iter_size 4 --model oicr_lambda_log_distillation
To evaluate the trained Boosted-OICR network on the VOC 2007 test set:
python3 code/tasks/test.py --cfg configs/baselines/vgg16_voc2007.yaml \
--dataset voc2007test \
--model oicr_lambda_log_distillation \
--load_ckpt snapshots/oicr_lambda_log_distillation/<some-running-date-time>/ckpt/model_step24999.pth \
--use_matlab
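To evaluate the trained network on the VOC 2007 trainval set (CorLoc):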
python3 code/tasks/test.py --cfg configs/baselines/vgg16_voc2007.yaml \
--dataset voc2007trainval \
--model oicr_lambda_log_distillation \
--load_ckpt snapshots/oicr_lambda_log_distillation/<some-running-date-time>/ckpt/model_step24999.pth
You can run the visualization script to show the results in an OpenCV window
python3 code/tasks/visualize.py --cfg configs/baselines/vgg16_voc2007.yaml \
--dataset voc2007test \
--detections snapshots/deepvision2020/test/final/detections.pkl
...or you can save the visualizations as images. First create a folder to save the outputs
mkdir img_out
and pass it with the --output_dir argument
python3 code/tasks/visualize.py --cfg configs/baselines/vgg16_voc2007.yaml \
--dataset voc2007test \
--detections snapshots/deepvision2020/test/final/detections.pkl \
--output_dir img_out
We used the code available here
If you use the model weights available for download, you will reproduce the same mAP and CorLoc reported in the paper on the PASCAL VOC 2007 dataset. However, if you retrain the model, the final mAP and CorLoc may differ from those reported in the article.
I tried my best to make retraining with the same seed produce results that are as similar as possible. Even so, with the seed fixed, the results differ slightly between training runs. I am not sure where this non-determinism comes from; my best guess is the RoI pooling implementation in torchvision.
I retrained the model five times with the same seed, and the final mAP oscillated between 49.0 and 49.9.
It is also important to be aware that completely reproducible results are not guaranteed across PyTorch versions (https://pytorch.org/docs/stable/notes/randomness.html), so make sure to use the same version that we use here.
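For reference, a typical way to fix the seeds in PyTorch, following the randomness notes linked above, looks like the sketch below (the helper name and seed value are illustrative; the training script may seed things in a different place):

```python
import random
import numpy as np
import torch

def set_seed(seed=3):
    # seed every RNG the training pipeline may touch
    random.seed(seed)
    np.random.seed(seed)
    torch.manual_seed(seed)
    torch.cuda.manual_seed_all(seed)
    # ask cuDNN for deterministic kernels (slower, and still not a full guarantee)
    torch.backends.cudnn.deterministic = True
    torch.backends.cudnn.benchmark = False
```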
We would like to thank Peng Tang and his colleagues for making the PCL and OICR code publicly available.