SeqCapsGAN

Data

We pretrain our models on the Microsoft COCO Dataset, then train them on the SentiCap Dataset.

Requirements

python 3.7.4
numpy 1.18.1
hickle 3.4.6
scikit-image 0.16.2
tensorflow 1.14 or tensorflow-gpu 1.14
tqdm 4.44.1
torch 1.4.0
matplotlib 3.1.3
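
The pinned versions above can be checked against the current environment before training. The following is a minimal sketch, not part of the repository: the names REQUIRED, installed_version, and find_mismatches are hypothetical, and the check assumes a pin such as 1.14 should also accept patch releases like 1.14.0.

```python
# Pinned versions copied from the Requirements list above.
# Note: if tensorflow-gpu is installed instead of tensorflow, change the key accordingly.
REQUIRED = {
    "numpy": "1.18.1",
    "hickle": "3.4.6",
    "scikit-image": "0.16.2",
    "tensorflow": "1.14",
    "tqdm": "4.44.1",
    "torch": "1.4.0",
    "matplotlib": "3.1.3",
}


def installed_version(name):
    """Return the installed version of a distribution, or None if it is missing."""
    try:
        from importlib.metadata import PackageNotFoundError, version  # Python 3.8+
    except ImportError:
        # Python 3.7 has no importlib.metadata; fall back to setuptools.
        import pkg_resources
        try:
            return pkg_resources.get_distribution(name).version
        except pkg_resources.DistributionNotFound:
            return None
    try:
        return version(name)
    except PackageNotFoundError:
        return None


def find_mismatches(required, lookup=installed_version):
    """Return {package: found_version} for every package that does not match its pin."""
    mismatches = {}
    for name, pinned in required.items():
        found = lookup(name)
        # Accept an exact match or a patch release of the pin (assumption).
        if found is None or not (found == pinned or found.startswith(pinned + ".")):
            mismatches[name] = found
    return mismatches


if __name__ == "__main__":
    for name, found in find_mismatches(REQUIRED).items():
        print(f"{name}: expected {REQUIRED[name]}, found {found}")
```

A package reported with "found None" is not installed under that distribution name.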

TODO

COCO Dataset loader and build pre-processing engine
Build LSTM Generator
Incorporate emotions into the Generator
Generator Logger
Build Conventional Discriminator
Discriminator Logger
GAN train engine
Validation engines
Record examples of generated captions in GAN structure
SentiCap Dataset loader and build pre-processing engine
Build CapsNet Discriminator
Inference engine
Train and evaluate
Plots

Train

1. Run ./download.sh and skip to step 4; otherwise, continue with step 2.
2. Download the Microsoft COCO Dataset, which provides the neutral image caption data. Images: 2014 Train images [83K/13GB] (download), 2014 Val images [41K/6GB] (download), 2014 Test images [41K/6GB] (download). Captions: 2014 Train/Val annotations [241MB] (download). Extract them to the folder data/images.
3. Download the SentiCap Dataset, which provides the sentiment-bearing image caption data: captions (download). Extract only the file data/senticap_dataset.json to data/annotations.
4. Download the VGG network used for feature extraction (download) and move it to the folder data/.
5. Run python resize.py --input_folder_dir ./data/images/train2014/ --output_folder_dir ./data/images/train2014_resized/ && python resize.py --input_folder_dir ./data/images/val2014/ --output_folder_dir ./data/images/val2014_resized/ (resizes the downloaded images to [224, 224] and puts them in data/images).
6. Run python prepro.py --coco_dataset_portions 1. 0.8 0.2 --senticap_dataset_portions 0.8 0.19 0.01, where the first, second, and third entries of each argument are the split portions taken from the original dataset.
7. Run python train.py --gen_train --gen_save_model_dir ./model/generator/ --gen_dataset coco --batchsize 8 --gen_epochs 10 to pretrain the generator.
8. Run python train.py --disc_train --disc_network capsnet --gen_load_model_dir ./model/generator/ --disc_save_model_dir ./model/discriminator/ --disc_dataset coco --batchsize 8 --disc_epochs 10 to pretrain the discriminator.
9. Run python train.py --gan_train --disc_network capsnet --gen_load_model_dir ./model/generator/ --disc_load_model_dir ./model/discriminator/ --gan_save_model_dir ./model/gan/ --gan_dataset senticap --batchsize 8 --gan_epochs 10 to train the GAN. The arguments --gen_load_model_dir and --disc_load_model_dir are optional; they initialize the GAN with a pretrained generator and/or discriminator.
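
The three train.py invocations above (generator pretraining, discriminator pretraining, GAN training) can be chained from a single script. The following is a minimal sketch, not part of the repository: build_stage_commands and run_pipeline are hypothetical helper names, and the flags and paths are copied from the commands listed above.

```python
import subprocess
import sys

# Model directories as used in the commands above.
GEN_DIR = "./model/generator/"
DISC_DIR = "./model/discriminator/"
GAN_DIR = "./model/gan/"


def build_stage_commands(batchsize=8, epochs=10, python=sys.executable):
    """Return the three train.py argument lists in the order they should run."""
    base = [python, "train.py"]
    batch = ["--batchsize", str(batchsize)]
    n = str(epochs)
    pretrain_generator = (
        base
        + ["--gen_train", "--gen_save_model_dir", GEN_DIR, "--gen_dataset", "coco"]
        + batch
        + ["--gen_epochs", n]
    )
    pretrain_discriminator = (
        base
        + ["--disc_train", "--disc_network", "capsnet"]
        + ["--gen_load_model_dir", GEN_DIR, "--disc_save_model_dir", DISC_DIR]
        + ["--disc_dataset", "coco"]
        + batch
        + ["--disc_epochs", n]
    )
    train_gan = (
        base
        + ["--gan_train", "--disc_network", "capsnet"]
        + ["--gen_load_model_dir", GEN_DIR, "--disc_load_model_dir", DISC_DIR]
        + ["--gan_save_model_dir", GAN_DIR, "--gan_dataset", "senticap"]
        + batch
        + ["--gan_epochs", n]
    )
    return [pretrain_generator, pretrain_discriminator, train_gan]


def run_pipeline(commands):
    """Run each stage in order and stop at the first failing stage."""
    for cmd in commands:
        print("Running:", " ".join(cmd))
        subprocess.run(cmd, check=True)
```

Calling run_pipeline(build_stage_commands()) from the repository root would run the stages back to back; each stage reads the model directory written by the previous one, so the order matters.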

Test

Run python inference.py --word_to_idx_dir data/word_to_idx.pkl --image "test.jpg" --load_model_dir model/gan/ to describe an image.
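
The inference command takes a pickled vocabulary (data/word_to_idx.pkl). The following is a minimal sketch of how such a vocabulary could turn generated token indices back into a caption. It is not the repository's implementation: it assumes the pickle holds a dict mapping each word to an integer index, and the special token names <START>, <END>, and <NULL> as well as the helper names are hypothetical.

```python
import pickle


def load_vocab(path):
    """Load a pickled word-to-index dict (assumed format of word_to_idx.pkl)."""
    with open(path, "rb") as f:
        return pickle.load(f)


def decode_caption(indices, word_to_idx, end_token="<END>",
                   skip_tokens=("<START>", "<NULL>")):
    """Convert a sequence of token indices into a caption string.

    Decoding stops at the first end token; start and padding tokens are dropped.
    The token names are assumptions, not taken from the repository.
    """
    idx_to_word = {idx: word for word, idx in word_to_idx.items()}
    words = []
    for idx in indices:
        word = idx_to_word[idx]
        if word == end_token:
            break
        if word not in skip_tokens:
            words.append(word)
    return " ".join(words)
```

For example, with a toy vocabulary {"<NULL>": 0, "<START>": 1, "<END>": 2, "a": 3, "happy": 4, "dog": 5}, decode_caption([1, 3, 4, 5, 2, 0, 0], vocab) returns "a happy dog".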
