This repository hosts the PyTorch implementation of "Efficient Multi-Task Reinforcement Learning with Cross-Task Policy Guidance" (CTPG) on two benchmarks: the HalfCheetah Locomotion Benchmark and the MetaWorld Manipulation Benchmark.
NOTE:
- The code is based on the MTRL codebase.
- The HalfCheetah Locomotion Benchmark is already integrated into the code and does not require additional installation.
- The MetaWorld Manipulation Benchmark requires extra installation. Since MetaWorld is under active development, all experiments are performed on the stable release version v2.0.0: https://github.com/Farama-Foundation/Metaworld/tree/v2.0.0.
- Set up the working environment:
```bash
pip install -r requirements.txt
```
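The repository does not mandate a particular environment manager; the snippet below is a minimal sketch that assumes conda is available (the environment name `ctpg` and the Python version are illustrative assumptions, not taken from the repository).

```bash
# Hypothetical isolated environment; run the pip command above inside it.
conda create -n ctpg python=3.8 -y
conda activate ctpg
```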
- Set up the MetaWorld benchmark:
First, install the `mujoco-py` package by following its official installation instructions; a rough sketch of the typical prerequisites is given below.
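The sketch below shows the layout typically expected by mujoco-py 2.0.x; the exact MuJoCo version, paths, and package pin are assumptions here, so defer to the official mujoco-py instructions and the versions pinned by this repository.

```bash
# Assumed layout for mujoco-py 2.0.x:
#   MuJoCo 2.0 binaries unpacked to ~/.mujoco/mujoco200
#   license key placed at           ~/.mujoco/mjkey.txt
mkdir -p ~/.mujoco

# Make the MuJoCo shared libraries visible when mujoco-py is built and imported.
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:$HOME/.mujoco/mujoco200/bin

# Install mujoco-py itself (the version constraint ultimately comes from the
# MetaWorld / requirements.txt pins).
pip install mujoco-py
```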
Then, install MetaWorld:
```bash
pip install git+https://github.com/Farama-Foundation/Metaworld.git@v2.0.0
```
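To confirm that the pinned release installed correctly, a quick import check can help; `metaworld.MT10` is part of the MetaWorld v2 API, but the check itself is only an illustrative sketch, not something the repository requires.

```bash
# Sanity check (assumes the MetaWorld v2 API): MT10 should expose 10 training task classes.
python -c "import metaworld; print(len(metaworld.MT10().train_classes))"
```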
Use the `scripts/start.sh` script to quickly run the code as follows:
```bash
bash scripts/start.sh $alg $env $map
```
- `$alg` includes: `guide_mtsac`, `guide_mhsac`, `guide_pcgrad`, `guide_sm`, `guide_paco`
- `$env` includes: `metaworld` and `gym_extensions`
- `$map` includes: `mt10`, `mt50` (for `metaworld`); `halfcheetah_gravity-mt5`, `halfcheetah_body-mt8` (for `gym_extensions`)
For example, to run MHSAC w/ CTPG on the MetaWorld-MT10 setup:
```bash
bash scripts/start.sh guide_mhsac metaworld mt10
```
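Similarly, the HalfCheetah locomotion setups use the `gym_extensions` environment flag; for instance, the following should launch MTSAC w/ CTPG on the gravity variant:

```bash
bash scripts/start.sh guide_mtsac gym_extensions halfcheetah_gravity-mt5
```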
All results will be saved in the `log` folder.
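The exact directory layout under `log` is determined by the MTRL-style logger configuration, so the paths below are only assumptions for illustration:

```bash
# List the per-run output directories (names depend on the logger configuration).
ls log/

# Follow a run's training metrics as they are written (the file name is hypothetical).
tail -f log/<run_id>/train.log
```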
Refer to the MTRL, Gym-extensions, MetaWorld, and mujoco-py documentation for additional instructions.