Total-RD · ahalev · Dec 6, 2022 · Dec 6, 2022 · Dec 6, 2022
diff --git a/notebooks/rl-example.ipynb b/notebooks/rl-example.ipynb
@@ -0,0 +1,159 @@
+{
+ "cells": [
+  {
+   "cell_type": "markdown",
+   "id": "a12bb3ff",
+   "metadata": {},
+   "source": [
+    "## Reinforcement Learning\n",
+    "\n",
+    "This example displays how to use reinforcement learning (RL) to train a policy to control a simple microgrid. We will train and deploy a simple Discrete Q-Network (DQN) policy on one of the *pymgrid25* benchmark microgrids.\n",
+    "\n",
+    "Algorithms for reinforcement learning are not built into *pymgrid*, nor are they a dependency. We recommend using one of [RLlib](https://docs.ray.io/en/latest/rllib/index.html) and [garage](https://garage.readthedocs.io/en/latest/); RLlib is better supported and has a wider variety of algorithms but can be less developer-friendly in some scenarios. This example will use garage; the API for RLlib is similar.\n",
+    "\n",
+    "To install garage, see the [garage documentation](https://garage.readthedocs.io/en/latest/user/installation.html)."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 1,
+   "id": "230beca4",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "import pandas as pd\n",
+    "\n",
+    "from pymgrid.envs import DiscreteMicrogridEnv"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "92121e85",
+   "metadata": {},
+   "source": [
+    "### Defining the Environment\n",
+    "\n",
+    "Defining an RL environment is extremely straightforward. To define an environment on one of the benchmark microgrids, we simply call ``from_scenario`` on our choice of the ``DiscreteMicrogridEnv`` and the ``ContinuousMicrogridEnv``. \n",
+    "\n",
+    "Here, we will use the discrete environment and train a DQN on it."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 2,
+   "id": "ca0707cf",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "env = DiscreteMicrogridEnv.from_scenario(microgrid_number=0)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "1453812d",
+   "metadata": {},
+   "source": [
+    "Environments subclass [pymgrid.Microgrid](../reference/microgrid.rst) and thus have the same attribute and logging functionality:"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 3,
+   "id": "a9d10e03",
+   "metadata": {},
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "LoadModule(time_series=<class 'numpy.ndarray'>, forecaster=OracleForecaster, forecast_horizon=23, forecaster_increase_uncertainty=False, raise_errors=False)\n",
+      "\n",
+      "RenewableModule(time_series=<class 'numpy.ndarray'>, raise_errors=False, forecaster=OracleForecaster, forecast_horizon=23, forecaster_increase_uncertainty=False, provided_energy_name=renewable_used)\n",
+      "\n",
+      "UnbalancedEnergyModule(raise_errors=False, loss_load_cost=10, overgeneration_cost=1)\n",
+      "\n",
+      "BatteryModule(min_capacity=290.40000000000003, max_capacity=1452, max_charge=363, max_discharge=363, efficiency=0.9, battery_cost_cycle=0.02, battery_transition_model=None, init_charge=None, init_soc=0.2, raise_errors=False)\n",
+      "\n",
+      "GridModule(max_import=1920, max_export=1920)\n",
+      "\n"
+     ]
+    }
+   ],
+   "source": [
+    "for module in env.modules.module_list():\n",
+    "    print(f'{module}\\n')"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "64eda080",
+   "metadata": {},
+   "source": [
+    "### Setting Up the RL Algorithm\n",
+    "\n",
+    "As we mentioned, we are planning on deploying a simple DQN in this case.\n",
+    "\n",
+    "For ease of use, we will employ a simple ``LocalSampler`` that does not parallelize sampling. We will also use an ``EpsilonGreedyPolicy`` for exploration."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 6,
+   "id": "1d174e6c",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "from garage.experiment.deterministic import set_seed\n",
+    "\n",
+    "from garage.np.exploration_policies import EpsilonGreedyPolicy\n",
+    "\n",
+    "from garage.replay_buffer import PathBuffer\n",
+    "\n",
+    "from garage.sampler import LocalSampler, RaySampler\n",
+    "\n",
+    "from garage.torch.algos.dqn import DQN\n",
+    "from garage.torch.policies import DiscreteQFArgmaxPolicy\n",
+    "from garage.torch.q_functions import DiscreteMLPQFunction\n",
+    "\n",
+    "from garage.trainer import Trainer"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "ace8f2ae",
+   "metadata": {},
+   "source": [
+    "Remainder Coming Soon."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "c56b3d25",
+   "metadata": {},
+   "outputs": [],
+   "source": []
+  }
+ ],
+ "metadata": {
+  "kernelspec": {
+   "display_name": "Python 3 (ipykernel)",
+   "language": "python",
+   "name": "python3"
+  },
+  "language_info": {
+   "codemirror_mode": {
+    "name": "ipython",
+    "version": 3
+   },
+   "file_extension": ".py",
+   "mimetype": "text/x-python",
+   "name": "python",
+   "nbconvert_exporter": "python",
+   "pygments_lexer": "ipython3",
+   "version": "3.9.13"
+  }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 5
+}
diff --git a/src/pymgrid/version.py b/src/pymgrid/version.py
@@ -1 +1 @@
-__version__ = '1.1.0'
+__version__ = '1.1.1'