Requirements: Python 2.7, tkinter

Play as a human (difficulty is 0, 1, or 2):
python human_player.py --difficulty 0|1|2

Run the AI player:
python ai_player.py --difficulty 0|1|2 --max-iteration 20
The q-learning-study branch is for teaching Q-learning. The key functions have been removed from it.
This code is adapted from https://github.com/llSourcell/q_learning_demo.git. I wrapped agent.py in a cleaner API so that the Q-learning algorithm can call it easily.
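
For anyone working through the q-learning-study branch, here is a minimal sketch of a tabular Q-learning update. It is not the code removed from that branch and it does not use this repo's agent.py API. The action names, the state representation (any hashable value), and the function names make_q_table, choose_action and q_update are all assumptions made for illustration.

```python
import random
from collections import defaultdict

# Hypothetical action set; the real game may use different names.
ACTIONS = ["up", "down", "left", "right"]


def make_q_table():
    """Return a table mapping state -> {action: value}, defaulting to 0.0."""
    return defaultdict(lambda: dict((a, 0.0) for a in ACTIONS))


def choose_action(q, state, epsilon, rng=random):
    """Epsilon-greedy choice: explore with probability epsilon, else exploit."""
    if rng.random() < epsilon:
        return rng.choice(ACTIONS)
    values = q[state]
    # Ties are broken in favour of the earliest entry in ACTIONS.
    return max(ACTIONS, key=lambda a: values[a])


def q_update(q, state, action, reward, next_state, alpha=0.5, gamma=0.9):
    """Apply Q(s,a) += alpha * (r + gamma * max_a' Q(s',a') - Q(s,a))."""
    best_next = max(q[next_state].values())
    target = reward + gamma * best_next
    q[state][action] += alpha * (target - q[state][action])
    return q[state][action]
```

A training loop would call choose_action to pick a move, apply it to the game, then pass the observed reward and next state to q_update. The values alpha=0.5 and gamma=0.9 are placeholder defaults, not settings taken from this repo.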