A highly modularized implementation of popular deep RL algorithms in PyTorch. My principle here is to reuse as many components as I can across different algorithms, use as few tricks as possible, and switch easily between classical control tasks like CartPole and Atari games with raw pixel inputs.
Implemented algorithms:
- Deep Q-Learning (DQN)
- Double DQN (a sketch of its target computation follows this list)
- Dueling DQN
- Asynchronous Advantage Actor-Critic (A3C)
- Asynchronous One-Step Q-Learning
- Asynchronous One-Step Sarsa
- Asynchronous N-Step Q-Learning
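
As one example of how these variants differ, here is a minimal sketch of the Double DQN target computation in PyTorch. The function name and arguments are illustrative, not the repo's actual API; `online_net` and `target_net` are assumed to be `nn.Module` Q-networks mapping a batch of states to per-action Q-values.

```python
import torch

def double_dqn_target(online_net, target_net, rewards, next_states, dones, gamma=0.99):
    # Hypothetical helper, not the repo's actual code.
    with torch.no_grad():
        # Vanilla DQN lets the target net both select and evaluate the next action.
        # Double DQN decouples them: the online net selects the argmax action...
        next_actions = online_net(next_states).argmax(dim=1, keepdim=True)
        # ...and the target net evaluates it, which reduces overestimation bias.
        next_q = target_net(next_states).gather(1, next_actions).squeeze(1)
    # dones is a 0/1 float mask; terminal transitions get no bootstrapped value.
    return rewards + gamma * (1 - dones) * next_q
```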
Training curves for CartPole are trivial, so I didn't include them here.
The network and hyperparameters are exactly the same as in the DeepMind Nature paper. The training curve is smoothed with a window of size 100. All the models were trained on a server with a Xeon E5-2620 v3 and a Titan X. For Breakout, a test is triggered every 1000 episodes with 50 repetitions; in total, 16M frames took about 4 days and 10 hours. For Pong, a test is triggered every 10 episodes with no repetition; in total, 4M frames took about 18 hours.
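
For reference, a minimal sketch of that Nature network in PyTorch, assuming the standard Atari preprocessing of 4 stacked 84 x 84 grayscale frames. The class name is illustrative, not the repo's actual code.

```python
import torch.nn as nn

class NatureDQN(nn.Module):
    # Architecture from "Human Level Control through Deep Reinforcement Learning".
    def __init__(self, num_actions):
        super(NatureDQN, self).__init__()
        self.features = nn.Sequential(
            nn.Conv2d(4, 32, kernel_size=8, stride=4), nn.ReLU(),
            nn.Conv2d(32, 64, kernel_size=4, stride=2), nn.ReLU(),
            nn.Conv2d(64, 64, kernel_size=3, stride=1), nn.ReLU(),
        )
        self.head = nn.Sequential(
            nn.Linear(64 * 7 * 7, 512), nn.ReLU(),  # 84x84 input -> 7x7 feature maps
            nn.Linear(512, num_actions),
        )

    def forward(self, x):
        x = self.features(x / 255.0)  # scale raw pixels to [0, 1]
        return self.head(x.view(x.size(0), -1))
```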
The network I used here is the same as the DQN network, except that the activation function is ELU rather than ReLU. The optimizer is Adam with non-shared parameters (each worker keeps its own Adam statistics). To the best of my knowledge, this architecture is not the most suitable one for A3C: if you use a 42 * 42 input and add an LSTM layer at the end, you will get much better training speed than this. GAE can also improve performance.
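
A minimal sketch of that setup, assuming the Nature DQN trunk with ELU activations plus separate policy and value heads. The class name and layer names are illustrative, not the repo's actual code.

```python
import torch.nn as nn
import torch.nn.functional as F

class A3CNet(nn.Module):
    # Hypothetical actor-critic network: DQN trunk with ELU instead of ReLU.
    def __init__(self, num_actions):
        super(A3CNet, self).__init__()
        self.conv1 = nn.Conv2d(4, 32, kernel_size=8, stride=4)
        self.conv2 = nn.Conv2d(32, 64, kernel_size=4, stride=2)
        self.conv3 = nn.Conv2d(64, 64, kernel_size=3, stride=1)
        self.fc = nn.Linear(64 * 7 * 7, 512)
        self.policy = nn.Linear(512, num_actions)  # actor head (action logits)
        self.value = nn.Linear(512, 1)             # critic head (state value)

    def forward(self, x):
        x = F.elu(self.conv1(x / 255.0))
        x = F.elu(self.conv2(x))
        x = F.elu(self.conv3(x))
        x = F.elu(self.fc(x.view(x.size(0), -1)))
        return self.policy(x), self.value(x)
```

With non-shared Adam, each worker process would construct its own `torch.optim.Adam` over the shared model's parameters, so only the model weights are shared HOGWILD!-style while the optimizer statistics stay local to each worker.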
The first 15M frames took about 5 hours (16 processes) on a server with two Xeon E5-2620 v3 CPUs. This is the test curve; a test is triggered in a separate deterministic test process every 50K frames.
- OpenAI Gym
- PyTorch
- PIL (pip install Pillow)
- Python 2.7 (I didn't test with Python 3)
Detailed usage and all training details can be found in `main.py`.
- Human Level Control through Deep Reinforcement Learning
- Asynchronous Methods for Deep Reinforcement Learning
- Deep Reinforcement Learning with Double Q-learning
- Dueling Network Architectures for Deep Reinforcement Learning
- Playing Atari with Deep Reinforcement Learning
- HOGWILD!: A Lock-Free Approach to Parallelizing Stochastic Gradient Descent
- transedward/pytorch-dqn
- ikostrikov/pytorch-a3c