This repository contains for the paper Bootstrap State Representation using Style Transfer for Better Generalization in Deep Reinforcement Learning
The initial code structure is from NeurIPS 2020 - Procgen competition.
Clone this repository:
git clone https://github.com/masud99r/thinker.git
mkdir thinker_results
cd thinker
conda create -n thinker python=3.7
conda activate thinker
Install pytorch:
pip install torch==1.9.0+cu111 torchvision==0.10.0+cu111 torchaudio==0.9.0 -f https://download.pytorch.org/whl/torch_stable.html
Install Ray:
pip install ray[rllib]==1.8.0
Install sklearn:
pip install sklearn
Install procgen environment:
pip install gym==0.21.0
Install procgen environment:
pip install procgen==0.9.2
To run Thinker
on Procgen Maze environment run:
python trainer_thinker_stargan.py --algo thinker --env maze --seed 100 --epoch 0 --n_cluster 3 --n_epochs 500 --max_timestep 25000000 --result_dir ../thinker_results
To run PPO
on Procgen Maze environment run:
python trainer_baselines.py --algo ppo --env maze --seed 100 --max_timestep 25000000 --result_dir ../thinker_results
To run RAD
with random crop
data augmentation on Procgen Maze environment run:
python trainer_baselines_rad.py --algo rad_crop --env maze --seed 100 --max_timestep 25000000 --result_dir ../thinker_results
To run RAD
with random cutout color
data augmentation on Procgen Maze environment run:
python trainer_baselines_rad.py --algo rad_cutout --env maze --seed 100 --max_timestep 25000000 --result_dir ../thinker_results
- We conducted our experiments on A100 GPU with 40GB memory. Code can be run on other GPU with low memory. A common error is Cuda out of memory in low memory case. In that case, try reducing
sgd_minibatch_size
andtrain_batch_size
. A sample is given in inexperiments/thinker-stargan-procgen-small.yaml
. - In case of limited memory, the training can stop after training generator. In that case, we suggest to rerun the code using the
--epoch
flag to 1, which will load the saved generator modules and start RL training. - Issue with rate in resnet torch hub: pytorch/pytorch#61755 (comment)
Please contact the author at rahman64@purdue.edu if you have any queries.
If you use this code or data, please consider citing this paper:
@inproceedings{rahman2022bootstrap,
title={Bootstrap State Representation using Style Transfer for Better Generalization in Deep Reinforcement Learning},
author={Rahman, Md Masudur and Xue, Yexiang},
booktitle={European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECML-PKDD 2022)},
year={2022}
}