This is the official PyTorch implementation of our paper Discrepant Multiple Instance Learning for Weakly Supervised Object Detection, which has been accepted by Pattern Recognition.
This implementation is based on jwyang's pytorch-faster-rcnn and ppengtang's pcl.pytorch.
Please switch to the other branches if you want to train D-MIL on the COCO dataset or use ResNet as the backbone. For retraining on the PASCAL VOC 2007 and 2012 datasets based on Fast R-CNN, you can also go to the corresponding branch.
With VGG16 as the backbone, the trained model reaches a detection mAP of 53.5 on PASCAL VOC 2007 and 49.6 on PASCAL VOC 2012.
1). On PASCAL VOC 2007 dataset
| model | #GPUs | batch size | lr | lr_decay | max_epoch | time/epoch | mAP | CorLoc |
|---|---|---|---|---|---|---|---|---|
| VGG-16 | 1 | 2 | 5e-4 | 10 | 18 | 2 hr | 53.5 | 68.7 |
2). On PASCAL VOC 2012 dataset
| model | #GPUs | batch size | lr | lr_decay | max_epoch | time/epoch | mAP | CorLoc |
|---|---|---|---|---|---|---|---|---|
| VGG-16 | 1 | 2 | 5e-4 | 10 | 18 | - | 49.6 | 70.1 |
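The tables do not say how `lr` and `lr_decay` interact. One plausible reading is a step schedule in which the learning rate is reduced once every `lr_decay` epochs. The sketch below illustrates that reading only: the function name `lr_at_epoch` and the reduction factor of 0.1 are assumptions, not values taken from this repository.

```python
def lr_at_epoch(epoch, base_lr=5e-4, decay_step=10, gamma=0.1):
    """Step-decay learning rate for a 1-indexed epoch.

    ASSUMPTION: `decay_step` mirrors the `lr_decay` column and `gamma`
    (the reduction factor) is a guess; check the training scripts for
    the values actually used.
    """
    if epoch < 1:
        raise ValueError("epoch is 1-indexed")
    # Number of completed decay periods before this epoch starts.
    periods = (epoch - 1) // decay_step
    return base_lr * gamma ** periods


if __name__ == "__main__":
    for epoch in (1, 10, 11, 18):
        print(epoch, lr_at_epoch(epoch))
```

Under these assumptions, epochs 1 to 10 run at the base rate and epochs 11 to 18 at one tenth of it.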
- Nvidia GPU Tesla V100
- Ubuntu 16.04 LTS
- Python 3.6
- PyTorch 1.0 to 1.4 is required.
- TensorFlow, TensorBoard and tensorboardX for visualizing training and validation curves.
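The PyTorch version constraint can be checked before compiling anything. The following is a minimal sketch rather than part of the repository: `is_supported_torch` is a hypothetical helper, and it assumes the 1.0 to 1.4 range is inclusive at both ends.

```python
import re


def is_supported_torch(version: str) -> bool:
    """Return True if `version` lies in the 1.0 to 1.4 range.

    ASSUMPTION: both ends of the range are inclusive. Only the leading
    "major.minor" part is inspected, so local suffixes such as "+cu100"
    are ignored.
    """
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        return False
    major, minor = int(match.group(1)), int(match.group(2))
    return major == 1 and 0 <= minor <= 4


if __name__ == "__main__":
    try:
        import torch  # only needed for the live check
    except ImportError:
        print("PyTorch is not installed")
    else:
        ok = is_supported_torch(torch.__version__)
        print(f"PyTorch {torch.__version__}:", "supported" if ok else "unsupported")
```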
- Clone the repository
git clone
- Compile the modules (nms, roi_pooling, roi_ring_pooling and roi_align)
cd D-MIL.pytorch/lib
- Download the training, validation, test data and the VOCdevkit
cd D-MIL.pytorch/
mkdir data
cd data/
- Extract all of these tars into one directory named VOCdevkit
tar xvf VOCtrainval_06-Nov-2007.tar
tar xvf VOCtest_06-Nov-2007.tar
tar xvf VOCdevkit_08-Jun-2007.tar
- Create a symlink for the PASCAL VOC dataset, or just rename VOCdevkit to VOCdevkit2007
cd D-MIL.pytorch/data
ln -s VOCdevkit VOCdevkit2007
- It should have this basic structure
$VOCdevkit2007/ # development kit
$VOCdevkit2007/VOC2007/ # image sets, annotations, etc.
$VOCdevkit2007/VOCcode/ # VOC utility code
For PASCAL VOC 2010 and PASCAL VOC 2012, follow similar steps.
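A quick way to confirm the layout before training is to check for the two subdirectories named in the listing. This is a minimal sketch, not part of the repository: `missing_voc_dirs` is a hypothetical helper, and the `data/VOCdevkit2007` path assumes the script is run from the repository root.

```python
import os

# Subdirectories named in the layout listing of this README.
EXPECTED_SUBDIRS = ("VOC2007", "VOCcode")


def missing_voc_dirs(devkit_root: str) -> list:
    """Return the expected subdirectories that are absent under `devkit_root`."""
    return [
        name
        for name in EXPECTED_SUBDIRS
        if not os.path.isdir(os.path.join(devkit_root, name))
    ]


if __name__ == "__main__":
    # ASSUMPTION: run from the repository root; adjust the path otherwise.
    root = os.path.join("data", "VOCdevkit2007")
    missing = missing_voc_dirs(root)
    if missing:
        print(f"missing under {root}: {missing}")
    else:
        print("layout OK")
```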
VGG16: download it from Dropbox or VT Server, put it in data/pretrained_model, and rename it vgg16_caffe.pth. The folder should have the following form:
$ data/pretrained_model/vgg16_caffe.pth
Download the selective search data and unzip it. The final folder should have the following form:
$ data/selective_search_data/voc_2007_test.mat
$ data/selective_search_data/voc_2007_trainval.mat
$ data/selective_search_data/voc_2012_test.mat
$ data/selective_search_data/voc_2012_trainval.mat
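Missing files tend to surface only once a training run has started, so it can help to verify them up front. The sketch below checks for the pretrained model and the four proposal files listed in this README. `missing_data_files` is a hypothetical helper, and the default `data` directory assumes the script is run from the repository root.

```python
import os

# File paths taken from this README, relative to the data directory.
REQUIRED_FILES = (
    "pretrained_model/vgg16_caffe.pth",
    "selective_search_data/voc_2007_test.mat",
    "selective_search_data/voc_2007_trainval.mat",
    "selective_search_data/voc_2012_test.mat",
    "selective_search_data/voc_2012_trainval.mat",
)


def missing_data_files(data_dir: str = "data") -> list:
    """Return the required files that do not exist under `data_dir`."""
    return [
        rel_path
        for rel_path in REQUIRED_FILES
        if not os.path.isfile(os.path.join(data_dir, *rel_path.split("/")))
    ]


if __name__ == "__main__":
    for rel_path in missing_data_files():
        print("missing:", rel_path)
```

If only VOC 2007 is used, the two `voc_2012_*` entries can be dropped from the tuple.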
For the VGG16 backbone, we can train and evaluate the model using the following commands
bash $prefix $GPU_ID
To evaluate detection mAP, we can use the following commands
bash $prefix
To evaluate CorLoc, we can use the following commands
bash $prefix
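For readers unfamiliar with the metric, CorLoc is commonly defined as the fraction of images in which the top-scoring box for a class overlaps a ground-truth box of that class with IoU of at least 0.5. The sketch below illustrates that definition for a single class. It is not the repository's evaluation code: the function names are hypothetical, and it uses plain continuous box areas, whereas the VOC tools may use a different pixel convention.

```python
def iou(box_a, box_b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def corloc(top_boxes, gt_boxes, threshold=0.5):
    """Fraction of images whose top-scoring box hits a ground-truth box.

    top_boxes: one predicted box per image, for a single class.
    gt_boxes:  list of ground-truth boxes per image, in the same order.
    """
    if not top_boxes:
        return 0.0
    hits = sum(
        any(iou(pred, gt) >= threshold for gt in gts)
        for pred, gts in zip(top_boxes, gt_boxes)
    )
    return hits / len(top_boxes)
```

The reported CorLoc is normally the mean of this per-class value over all classes.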
First, run the following commands to get the pseudo ground-truths
bash $prefix
Then we will get the pseudo ground-truth annotations for retraining Fast RCNN. These annotations are located in the following folder:
$VOCdevkit2007/VOC2007/retrain_annotation_score_top1 # pseudo ground-truth annotations
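The folder name `retrain_annotation_score_top1` suggests that the highest-scoring detection per class is kept as the pseudo ground-truth. The sketch below illustrates that selection under this assumption. It is not the repository's code: `top1_pseudo_gt` and the input format are hypothetical.

```python
def top1_pseudo_gt(detections, image_labels):
    """Keep the highest-scoring box for each class present in the image.

    detections:   dict mapping class name -> list of (score, box) tuples.
    image_labels: class names known to appear in the image
                  (the image-level labels used in weak supervision).

    ASSUMPTION: one box per labelled class, chosen by score alone.
    """
    pseudo_gt = {}
    for cls in image_labels:
        candidates = detections.get(cls, [])
        if candidates:
            _, best_box = max(candidates, key=lambda item: item[0])
            pseudo_gt[cls] = best_box
    return pseudo_gt


if __name__ == "__main__":
    dets = {
        "cat": [(0.2, (0, 0, 5, 5)), (0.9, (1, 1, 8, 8))],
        "dog": [(0.7, (2, 2, 4, 4))],
    }
    # Only "cat" is labelled for this image, so the "dog" detection is ignored.
    print(top1_pseudo_gt(dets, ["cat"]))
```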
To retrain Fast RCNN on PASCAL VOC 2012, change lines 8, 9, 18 and 19 of the file so that the dataset is switched from VOC 2007 to VOC 2012.
The code for retraining Fast RCNN is in the branches fast-rcnn-retrain-07 and fast-rcnn-retrain-12. Please go to the corresponding branch for the relevant configurations.
The code for training and testing on the COCO dataset is in the branch D-MIL-COCO. Please go to that branch for the relevant settings.
As noted in the DRN paper, it is not trivial to train a WSOD model on a non-plain backbone (e.g., ResNet, DenseNet). To evaluate the effectiveness of D-MIL on ResNet, we implemented our model based on DRN. See the corresponding branch D-MIL-ResNet for more details.
If you find this repository useful and use this code for a paper, please cite:
title={Discrepant Multiple Instance Learning for Weakly Supervised Object Detection},
author={Gao, Wei and Wan, Fang and Yue, Jun and Xu, Songcen and Ye, Qixiang},
journal={Pattern Recognition},