[Feature] Updated TD-MPC2 evaluation and fixed some bugs #538
#439
- Separated `eval_env` from the training `env` (Align evaluation setups for different online RL algorithms #486). `eval_env` is now used for evaluation. The only difference between `env` and `eval_env` is `num_envs`: `num_eval_envs` in configs now defines the `num_envs` of `eval_env`.
- Fixed `max_episode_steps` (tdmpc2 baseline AttributeError: 'ManiSkillVectorEnv' object has no attribute 'max_episode_steps' #528). `ManiSkillVectorEnv` does not have `max_episode_steps` as an attribute. Therefore, the default `max_episode_steps` is now determined by `gym_utils.find_max_episode_steps_value(env)`.
- Fixed `eval_episodes`. The number of evaluation episodes did not always equal `eval_episodes` (it was a multiple of `num_eval_envs`). Furthermore, only `num_eval_envs` videos were logged at each evaluation. Videos are now logged for `eval_episodes` episodes at each evaluation.
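To illustrate the `eval_episodes` point, here is a minimal sketch of how an exact episode count can be reached with `num_eval_envs` parallel environments. It is not the code from this PR: `plan_eval_rollouts` is a hypothetical helper, and it assumes every environment finishes exactly one episode per batched rollout (fixed-length episodes).

```python
import math


def plan_eval_rollouts(eval_episodes: int, num_eval_envs: int) -> list[int]:
    """Hypothetical helper: number of episodes to keep from each batched rollout.

    Assumes each of the num_eval_envs environments completes exactly one
    episode per rollout, so a rollout yields num_eval_envs episodes.
    """
    if eval_episodes <= 0 or num_eval_envs <= 0:
        raise ValueError("eval_episodes and num_eval_envs must be positive")

    # Number of batched rollouts needed to cover eval_episodes.
    num_rollouts = math.ceil(eval_episodes / num_eval_envs)

    kept = []
    remaining = eval_episodes
    for _ in range(num_rollouts):
        # Keep a full batch, except in the last rollout where only the
        # remaining episodes are kept so the total is exact.
        take = min(num_eval_envs, remaining)
        kept.append(take)
        remaining -= take
    return kept


if __name__ == "__main__":
    # 10 requested episodes on 4 parallel envs: 3 rollouts, last one truncated.
    print(plan_eval_rollouts(10, 4))
```

With `eval_episodes=10` and `num_eval_envs=4`, keeping every episode from three rollouts would give 12 episodes (a multiple of `num_eval_envs`), while truncating the last rollout gives exactly 10.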