@@ -90,8 +90,6 @@ python cleanrl/ppo_pettingzoo.py -->
     * For discrete action space.
   * [dqn_atari.py](https://github.com/vwxyzjn/cleanrl/blob/master/cleanrl/dqn_atari.py)
     * For playing Atari games. It uses convolutional layers and common Atari-based pre-processing techniques.
-  * [dqn_atari_visual.py](https://github.com/vwxyzjn/cleanrl/blob/master/cleanrl/dqn_atari_visual.py)
-    * Adds Q-value visualization for `dqn_atari.py`.
 - [x] Categorical DQN (C51)
   * [c51.py](https://github.com/vwxyzjn/cleanrl/blob/master/cleanrl/c51.py)
     * For discrete action space.
@@ -107,16 +105,6 @@ python cleanrl/ppo_pettingzoo.py -->
     * For continuous action space. Also implements MuJoCo-specific code-level optimizations.
   * [ppo_atari.py](https://github.com/vwxyzjn/cleanrl/blob/master/cleanrl/ppo_atari.py)
     * For playing Atari games. It uses convolutional layers and common Atari-based pre-processing techniques.
-  * [ppo_atari_visual.py](https://github.com/vwxyzjn/cleanrl/blob/master/cleanrl/ppo_atari_visual.py)
-    * Adds action-probability visualization for `ppo_atari.py`.
-  * [experiments/ppo_self_play.py](https://github.com/vwxyzjn/cleanrl/blob/master/cleanrl/experiments/ppo_self_play.py)
-    * Implements a self-play agent for https://github.com/hardmaru/slimevolleygym
-  * [experiments/ppo_microrts.py](https://github.com/vwxyzjn/cleanrl/blob/master/cleanrl/experiments/ppo_microrts.py)
-    * Implements invalid action masking and handling of the `MultiDiscrete` action space for https://github.com/vwxyzjn/gym-microrts
-  * [experiments/ppo_simple.py](https://github.com/vwxyzjn/cleanrl/blob/master/cleanrl/experiments/ppo_simple.py)
-    * (Not recommended for use) Naive implementation for discrete action space. Kept for educational purposes: it is what most people would implement after just reading the paper, usually unaware of the many implementation details that a well-tuned PPO implementation requires.
-  * [experiments/ppo_simple_continuous_action.py](https://github.com/vwxyzjn/cleanrl/blob/master/cleanrl/experiments/ppo_simple_continuous_action.py)
-    * (Not recommended for use) Naive implementation for continuous action space.
 - [x] Soft Actor Critic (SAC)
   * [sac_continuous_action.py](https://github.com/vwxyzjn/cleanrl/blob/master/cleanrl/sac_continuous_action.py)
     * For continuous action space.
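The invalid action masking that `experiments/ppo_microrts.py` implemented (removed in this diff) has a simple core: set the logits of invalid actions to negative infinity before the softmax, so those actions receive exactly zero probability and contribute no gradient. A minimal pure-Python sketch of that idea — the `masked_softmax` helper is hypothetical for illustration, not the repo's actual code:

```python
import math

def masked_softmax(logits, mask):
    """Softmax where positions with mask[i] == False get zero probability.

    Invalid action masking: replace logits of invalid actions with -inf,
    then apply a numerically stable softmax over the remaining actions.
    """
    masked = [l if m else float("-inf") for l, m in zip(logits, mask)]
    mx = max(masked)  # subtract the max for numerical stability
    exps = [math.exp(l - mx) for l in masked]  # exp(-inf) == 0.0
    total = sum(exps)
    return [e / total for e in exps]

# A MultiDiscrete action space factors into independent Discrete components;
# each component gets its own mask and its own categorical distribution.
probs = masked_softmax([2.0, 1.0, 0.5], [True, False, True])
```

The masked action ends up with probability exactly 0.0 (not merely small), which is what makes the masking "invalid" rather than a soft penalty.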