The commands below run the different experiments from the command line using main.py.
Run a simple Q-learning agent on the Willemsen gridworld as a baseline for later improvements:
python main.py A SIMPLE_Q WILLEMSEN
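The internals of the SIMPLE_Q agent are not shown here. As a rough sketch of what a tabular Q-learning baseline typically does, the update and an epsilon-greedy action choice might look like the following. The function names, the dictionary-based Q-table, the action labels, and the default values of alpha, gamma, and epsilon are all assumptions made for illustration, not taken from main.py.

```python
import random
from collections import defaultdict


def q_learning_update(q, state, action, reward, next_state, actions,
                      alpha=0.1, gamma=0.99, done=False):
    """Apply one tabular Q-learning update in place and return the new value.

    Hypothetical helper: q maps (state, action) pairs to value estimates.
    """
    # Bootstrap from the greedy value of the next state, or 0 at episode end.
    best_next = 0.0 if done else max(q[(next_state, a)] for a in actions)
    target = reward + gamma * best_next
    q[(state, action)] += alpha * (target - q[(state, action)])
    return q[(state, action)]


def epsilon_greedy(q, state, actions, epsilon=0.1, rng=random):
    """Pick a random action with probability epsilon, else a greedy one."""
    if rng.random() < epsilon:
        return rng.choice(actions)
    return max(actions, key=lambda a: q[(state, a)])


# Illustrative usage on a made-up gridworld transition.
q = defaultdict(float)                       # unseen pairs default to 0.0
actions = ["up", "down", "left", "right"]    # assumed action set
new_value = q_learning_update(q, (0, 0), "right", 1.0, (0, 1), actions,
                              alpha=0.5, gamma=0.9)
chosen = epsilon_greedy(q, (0, 0), actions, epsilon=0.0)
```

With an all-zero table, the first update moves the estimate halfway toward the reward, and the greedy choice in that state then becomes the action that was just rewarded.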
Run a multi-estimator Q-learning agent on the Willemsen gridworld to compare its behavior against the baseline:
python main.py A ME_Q WILLEMSEN
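How the ME_Q agent combines its estimators is not described here. As one well-known illustration of using more than one estimator, the sketch below shows a double Q-learning style update with two tables: one table selects the greedy next action and the other evaluates it. This is an assumed stand-in, not necessarily the method implemented in main.py, and all names and default values are hypothetical.

```python
import random
from collections import defaultdict


def double_q_update(q_a, q_b, state, action, reward, next_state, actions,
                    alpha=0.1, gamma=0.99, done=False, rng=random):
    """Apply one double Q-learning style update in place (hypothetical helper)."""
    # Randomly choose which table is updated; the other one evaluates.
    if rng.random() < 0.5:
        update, evaluate = q_a, q_b
    else:
        update, evaluate = q_b, q_a

    if done:
        next_value = 0.0
    else:
        # Select the greedy action with the table being updated,
        # but read its value from the other table.
        best = max(actions, key=lambda a: update[(next_state, a)])
        next_value = evaluate[(next_state, best)]

    target = reward + gamma * next_value
    update[(state, action)] += alpha * (target - update[(state, action)])


# Illustrative usage on a made-up gridworld transition.
q_a, q_b = defaultdict(float), defaultdict(float)
actions = ["up", "down", "left", "right"]    # assumed action set
key = ((0, 0), "right")
double_q_update(q_a, q_b, (0, 0), "right", 1.0, (0, 1), actions,
                alpha=0.5, gamma=0.9, rng=random.Random(0))
total = q_a[key] + q_b[key]
```

Separating action selection from evaluation is the usual motivation for this kind of scheme, since a single table that both selects and evaluates the maximum tends to overestimate values under noisy rewards.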
to be defined
to be defined