[Feature] RayReplayBuffer #2835

Merged 11 commits on Mar 12, 2025

vmoens (Contributor) commented Mar 6, 2025

[ghstack-poisoned]

pytorch-bot bot commented Mar 6, 2025

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/rl/2835

Note: Links to docs will display an error until the docs builds have been completed.

❗ 1 Active SEVs

There is 1 currently active SEV. If your PR is affected, please view it below:

❌ 5 New Failures, 3 Unrelated Failures

As of commit 8af7635 with merge base 9cd95d5:

NEW FAILURES - The following jobs have failed:

BROKEN TRUNK - The following jobs failed but were present on the merge base:

👉 Rebase onto the `viable/strict` branch to avoid these failures

This comment was automatically generated by Dr. CI and updates every 15 minutes.

vmoens added a commit that referenced this pull request Mar 6, 2025
ghstack-source-id: 77a9fd94512041a2df4cb740c7c35977ff54361e
Pull Request resolved: #2835
facebook-github-bot added the CLA Signed label Mar 6, 2025
[ghstack-poisoned]
vmoens added a commit that referenced this pull request Mar 6, 2025
ghstack-source-id: 20f432208706acb7a066f62e7145f7610864e63f
Pull Request resolved: #2835
[ghstack-poisoned]
vmoens added a commit that referenced this pull request Mar 6, 2025
ghstack-source-id: 42e965438fa2d95ed457ab6927cf54f32ad69206
Pull Request resolved: #2835
[ghstack-poisoned]
vmoens added the enhancement label Mar 10, 2025
vmoens added 3 commits March 10, 2025 11:21
[ghstack-poisoned]
[ghstack-poisoned]
[ghstack-poisoned]

github-actions bot commented Mar 10, 2025

⚠ Warning: Result of CPU Benchmark Tests

Total Benchmarks: 149. Improved: 11. Worsened: 19.

| Name | Max | Mean | Ops | Ops on Repo HEAD | Change |
| --- | --- | --- | --- | --- | --- |
| test_simple | 0.6148s | 0.5335s | 1.8744 Ops/s | 1.8530 Ops/s | +1.16% |
| test_transformed | 1.1702s | 1.0817s | 0.9245 Ops/s | 0.9414 Ops/s | -1.80% |
| test_serial | 1.5745s | 1.5713s | 0.6364 Ops/s | 0.6408 Ops/s | -0.68% |
| test_parallel | 1.4099s | 1.3299s | 0.7520 Ops/s | 0.7482 Ops/s | +0.51% |
| test_step_mdp_speed[True-True-True-True-True] | 0.1991ms | 31.5423μs | 31.7035 KOps/s | 33.4212 KOps/s | **-5.14%** |
| test_step_mdp_speed[True-True-True-True-False] | 70.4010μs | 18.4162μs | 54.3001 KOps/s | 55.8227 KOps/s | -2.73% |
| test_step_mdp_speed[True-True-True-False-True] | 50.9850μs | 17.7894μs | 56.2133 KOps/s | 58.4447 KOps/s | -3.82% |
| test_step_mdp_speed[True-True-True-False-False] | 63.2980μs | 10.4171μs | 95.9964 KOps/s | 100.7339 KOps/s | -4.70% |
| test_step_mdp_speed[True-True-False-True-True] | 80.6210μs | 33.7229μs | 29.6535 KOps/s | 31.2291 KOps/s | **-5.05%** |
| test_step_mdp_speed[True-True-False-True-False] | 63.7290μs | 20.4370μs | 48.9309 KOps/s | 50.5150 KOps/s | -3.14% |
| test_step_mdp_speed[True-True-False-False-True] | 63.2280μs | 19.7226μs | 50.7033 KOps/s | 53.2849 KOps/s | -4.84% |
| test_step_mdp_speed[True-True-False-False-False] | 39.8950μs | 12.3190μs | 81.1751 KOps/s | 85.0514 KOps/s | -4.56% |
| test_step_mdp_speed[True-False-True-True-True] | 85.2890μs | 36.2506μs | 27.5857 KOps/s | 29.4088 KOps/s | **-6.20%** |
| test_step_mdp_speed[True-False-True-True-False] | 50.5750μs | 22.4713μs | 44.5012 KOps/s | 47.1680 KOps/s | **-5.65%** |
| test_step_mdp_speed[True-False-True-False-True] | 71.2630μs | 19.9721μs | 50.0700 KOps/s | 53.3557 KOps/s | **-6.16%** |
| test_step_mdp_speed[True-False-True-False-False] | 40.0250μs | 12.4242μs | 80.4883 KOps/s | 85.0607 KOps/s | **-5.38%** |
| test_step_mdp_speed[True-False-False-True-True] | 85.9100μs | 37.4855μs | 26.6770 KOps/s | 28.2187 KOps/s | **-5.46%** |
| test_step_mdp_speed[True-False-False-True-False] | 75.0400μs | 24.4939μs | 40.8265 KOps/s | 42.8586 KOps/s | -4.74% |
| test_step_mdp_speed[True-False-False-False-True] | 56.4150μs | 21.5020μs | 46.5073 KOps/s | 48.7914 KOps/s | -4.68% |
| test_step_mdp_speed[True-False-False-False-False] | 64.2730μs | 13.9979μs | 71.4394 KOps/s | 73.4023 KOps/s | -2.67% |
| test_step_mdp_speed[False-True-True-True-True] | 93.1600μs | 35.5341μs | 28.1419 KOps/s | 29.6604 KOps/s | **-5.12%** |
| test_step_mdp_speed[False-True-True-True-False] | 70.9520μs | 22.3773μs | 44.6882 KOps/s | 46.3355 KOps/s | -3.56% |
| test_step_mdp_speed[False-True-True-False-True] | 74.9300μs | 22.2318μs | 44.9806 KOps/s | 47.0012 KOps/s | -4.30% |
| test_step_mdp_speed[False-True-True-False-False] | 45.9160μs | 13.6949μs | 73.0198 KOps/s | 75.5521 KOps/s | -3.35% |
| test_step_mdp_speed[False-True-False-True-True] | 97.4310μs | 37.3901μs | 26.7451 KOps/s | 28.3689 KOps/s | **-5.72%** |
| test_step_mdp_speed[False-True-False-True-False] | 2.7041ms | 24.3603μs | 41.0504 KOps/s | 42.7021 KOps/s | -3.87% |
| test_step_mdp_speed[False-True-False-False-True] | 67.1150μs | 24.0711μs | 41.5436 KOps/s | 43.4524 KOps/s | -4.39% |
| test_step_mdp_speed[False-True-False-False-False] | 70.3210μs | 15.6294μs | 63.9822 KOps/s | 67.4028 KOps/s | **-5.07%** |
| test_step_mdp_speed[False-False-True-True-True] | 95.0470μs | 39.0053μs | 25.6375 KOps/s | 26.8278 KOps/s | -4.44% |
| test_step_mdp_speed[False-False-True-True-False] | 74.7900μs | 26.3621μs | 37.9332 KOps/s | 39.8302 KOps/s | -4.76% |
| test_step_mdp_speed[False-False-True-False-True] | 57.2260μs | 23.9217μs | 41.8031 KOps/s | 43.4540 KOps/s | -3.80% |
| test_step_mdp_speed[False-False-True-False-False] | 66.4840μs | 15.6815μs | 63.7692 KOps/s | 67.5927 KOps/s | **-5.66%** |
| test_step_mdp_speed[False-False-False-True-True] | 94.7670μs | 40.7757μs | 24.5244 KOps/s | 25.9089 KOps/s | **-5.34%** |
| test_step_mdp_speed[False-False-False-True-False] | 57.9280μs | 27.8682μs | 35.8832 KOps/s | 37.6318 KOps/s | -4.65% |
| test_step_mdp_speed[False-False-False-False-True] | 75.8110μs | 25.7268μs | 38.8700 KOps/s | 40.5433 KOps/s | -4.13% |
| test_step_mdp_speed[False-False-False-False-False] | 44.7240μs | 17.2818μs | 57.8643 KOps/s | 60.4659 KOps/s | -4.30% |
| test_values[generalized_advantage_estimate-True-True] | 10.5314ms | 9.5160ms | 105.0864 Ops/s | 101.1057 Ops/s | +3.94% |
| test_values[vec_generalized_advantage_estimate-True-True] | 25.0956ms | 24.1955ms | 41.3300 Ops/s | 41.1093 Ops/s | +0.54% |
| test_values[td0_return_estimate-False-False] | 0.2485ms | 0.1775ms | 5.6343 KOps/s | 4.6705 KOps/s | **+20.63%** |
| test_values[td1_return_estimate-False-False] | 24.0869ms | 23.5162ms | 42.5239 Ops/s | 40.1562 Ops/s | **+5.90%** |
| test_values[vec_td1_return_estimate-False-False] | 25.0752ms | 24.2953ms | 41.1603 Ops/s | 40.9509 Ops/s | +0.51% |
| test_values[td_lambda_return_estimate-True-False] | 38.2550ms | 33.8138ms | 29.5738 Ops/s | 28.2809 Ops/s | +4.57% |
| test_values[vec_td_lambda_return_estimate-True-False] | 27.1919ms | 24.3873ms | 41.0050 Ops/s | 40.9774 Ops/s | +0.07% |
| test_gae_speed[generalized_advantage_estimate-False-1-512] | 12.5690ms | 8.2947ms | 120.5589 Ops/s | 117.3780 Ops/s | +2.71% |
| test_gae_speed[vec_generalized_advantage_estimate-True-1-512] | 2.2587ms | 1.8990ms | 526.5951 Ops/s | 528.5431 Ops/s | -0.37% |
| test_gae_speed[vec_generalized_advantage_estimate-False-1-512] | 0.5171ms | 0.3684ms | 2.7145 KOps/s | 2.7019 KOps/s | +0.46% |
| test_gae_speed[vec_generalized_advantage_estimate-True-32-512] | 46.6434ms | 42.5917ms | 23.4788 Ops/s | 24.3430 Ops/s | -3.55% |
| test_gae_speed[vec_generalized_advantage_estimate-False-32-512] | 4.4860ms | 3.4637ms | 288.7077 Ops/s | 285.2263 Ops/s | +1.22% |
| test_dqn_speed[False-None] | 5.4886ms | 1.4295ms | 699.5261 Ops/s | 695.8457 Ops/s | +0.53% |
| test_dqn_speed[False-backward] | 1.9898ms | 1.9070ms | 524.3827 Ops/s | 521.6437 Ops/s | +0.53% |
| test_dqn_speed[True-None] | 0.9086ms | 0.5677ms | 1.7614 KOps/s | 1.7782 KOps/s | -0.94% |
| test_dqn_speed[True-backward] | 1.0303ms | 0.9707ms | 1.0302 KOps/s | 699.7247 Ops/s | **+47.23%** |
| test_dqn_speed[reduce-overhead-None] | 0.8577ms | 0.5570ms | 1.7954 KOps/s | 1.7592 KOps/s | +2.06% |
| test_dqn_speed[reduce-overhead-backward] | 1.0531ms | 0.9835ms | 1.0167 KOps/s | 997.2393 Ops/s | +1.96% |
| test_ddpg_speed[False-None] | 5.4549ms | 2.9667ms | 337.0747 Ops/s | 342.8984 Ops/s | -1.70% |
| test_ddpg_speed[False-backward] | 4.1664ms | 4.0691ms | 245.7545 Ops/s | 245.2024 Ops/s | +0.23% |
| test_ddpg_speed[True-None] | 1.6711ms | 1.4326ms | 698.0233 Ops/s | 680.4574 Ops/s | +2.58% |
| test_ddpg_speed[True-backward] | 2.4155ms | 2.3522ms | 425.1283 Ops/s | 424.0946 Ops/s | +0.24% |
| test_ddpg_speed[reduce-overhead-None] | 2.1471ms | 1.4575ms | 686.0913 Ops/s | 687.5033 Ops/s | -0.21% |
| test_ddpg_speed[reduce-overhead-backward] | 2.4571ms | 2.3927ms | 417.9364 Ops/s | 422.5447 Ops/s | -1.09% |
| test_sac_speed[False-None] | 8.9783ms | 8.4283ms | 118.6480 Ops/s | 119.1928 Ops/s | -0.46% |
| test_sac_speed[False-backward] | 11.9373ms | 10.8960ms | 91.7770 Ops/s | 91.2041 Ops/s | +0.63% |
| test_sac_speed[True-None] | 3.6399ms | 2.5937ms | 385.5512 Ops/s | 382.6802 Ops/s | +0.75% |
| test_sac_speed[True-backward] | 4.2839ms | 4.2405ms | 235.8224 Ops/s | 233.4544 Ops/s | +1.01% |
| test_sac_speed[reduce-overhead-None] | 3.3620ms | 2.6010ms | 384.4605 Ops/s | 391.4323 Ops/s | -1.78% |
| test_sac_speed[reduce-overhead-backward] | 5.1452ms | 4.2978ms | 232.6767 Ops/s | 233.5506 Ops/s | -0.37% |
| test_redq_speed[False-None] | 16.4659ms | 13.1176ms | 76.2336 Ops/s | 76.5231 Ops/s | -0.38% |
| test_redq_speed[False-backward] | 24.5761ms | 22.5025ms | 44.4396 Ops/s | 43.3992 Ops/s | +2.40% |
| test_redq_speed[True-None] | 8.2349ms | 7.2714ms | 137.5259 Ops/s | 134.9906 Ops/s | +1.88% |
| test_redq_speed[True-backward] | 14.9015ms | 13.9123ms | 71.8788 Ops/s | 69.2674 Ops/s | +3.77% |
| test_redq_speed[reduce-overhead-None] | 7.8245ms | 6.5605ms | 152.4266 Ops/s | 144.5544 Ops/s | **+5.45%** |
| test_redq_speed[reduce-overhead-backward] | 14.3022ms | 13.9290ms | 71.7929 Ops/s | 68.5822 Ops/s | +4.68% |
| test_redq_deprec_speed[False-None] | 13.9797ms | 13.0733ms | 76.4919 Ops/s | 73.6796 Ops/s | +3.82% |
| test_redq_deprec_speed[False-backward] | 20.0937ms | 18.6297ms | 53.6776 Ops/s | 51.1929 Ops/s | +4.85% |
| test_redq_deprec_speed[True-None] | 5.9455ms | 5.1584ms | 193.8588 Ops/s | 185.2126 Ops/s | +4.67% |
| test_redq_deprec_speed[True-backward] | 10.9525ms | 9.8962ms | 101.0488 Ops/s | 95.9033 Ops/s | **+5.37%** |
| test_redq_deprec_speed[reduce-overhead-None] | 6.1936ms | 5.2589ms | 190.1539 Ops/s | 182.1918 Ops/s | +4.37% |
| test_redq_deprec_speed[reduce-overhead-backward] | 10.9079ms | 10.2899ms | 97.1831 Ops/s | 89.8922 Ops/s | **+8.11%** |
| test_td3_speed[False-None] | 8.7555ms | 8.2465ms | 121.2631 Ops/s | 117.9157 Ops/s | +2.84% |
| test_td3_speed[False-backward] | 11.0656ms | 10.6221ms | 94.1429 Ops/s | 92.1800 Ops/s | +2.13% |
| test_td3_speed[True-None] | 2.5084ms | 2.2507ms | 444.3047 Ops/s | 436.7827 Ops/s | +1.72% |
| test_td3_speed[True-backward] | 4.1131ms | 3.9957ms | 250.2717 Ops/s | 228.3336 Ops/s | **+9.61%** |
| test_td3_speed[reduce-overhead-None] | 2.6977ms | 2.2731ms | 439.9243 Ops/s | 426.5111 Ops/s | +3.14% |
| test_td3_speed[reduce-overhead-backward] | 4.0673ms | 3.9627ms | 252.3507 Ops/s | 227.2664 Ops/s | **+11.04%** |
| test_cql_speed[False-None] | 41.3028ms | 37.6189ms | 26.5824 Ops/s | 25.6809 Ops/s | +3.51% |
| test_cql_speed[False-backward] | 50.7031ms | 47.3346ms | 21.1262 Ops/s | 20.4120 Ops/s | +3.50% |
| test_cql_speed[True-None] | 23.5507ms | 22.7348ms | 43.9854 Ops/s | 43.8947 Ops/s | +0.21% |
| test_cql_speed[True-backward] | 31.5542ms | 29.8854ms | 33.4612 Ops/s | 33.3396 Ops/s | +0.36% |
| test_cql_speed[reduce-overhead-None] | 24.2033ms | 23.0348ms | 43.4126 Ops/s | 42.9750 Ops/s | +1.02% |
| test_cql_speed[reduce-overhead-backward] | 37.1751ms | 30.5797ms | 32.7014 Ops/s | 32.9677 Ops/s | -0.81% |
| test_a2c_speed[False-None] | 9.3820ms | 7.7722ms | 128.6644 Ops/s | 129.4090 Ops/s | -0.58% |
| test_a2c_speed[False-backward] | 18.6190ms | 15.4874ms | 64.5684 Ops/s | 65.0689 Ops/s | -0.77% |
| test_a2c_speed[True-None] | 5.7392ms | 4.9283ms | 202.9077 Ops/s | 210.7010 Ops/s | -3.70% |
| test_a2c_speed[True-backward] | 12.7345ms | 11.8188ms | 84.6109 Ops/s | 85.7773 Ops/s | -1.36% |
| test_a2c_speed[reduce-overhead-None] | 5.7491ms | 4.7790ms | 209.2469 Ops/s | 207.6340 Ops/s | +0.78% |
| test_a2c_speed[reduce-overhead-backward] | 11.5280ms | 11.1555ms | 89.6422 Ops/s | 85.6108 Ops/s | +4.71% |
| test_ppo_speed[False-None] | 8.4500ms | 7.5516ms | 132.4222 Ops/s | 131.6407 Ops/s | +0.59% |
| test_ppo_speed[False-backward] | 17.8639ms | 15.5744ms | 64.2081 Ops/s | 66.6993 Ops/s | -3.73% |
| test_ppo_speed[True-None] | 5.7561ms | 5.3366ms | 187.3835 Ops/s | 192.6244 Ops/s | -2.72% |
| test_ppo_speed[True-backward] | 12.9505ms | 11.4198ms | 87.5673 Ops/s | 88.2224 Ops/s | -0.74% |
| test_ppo_speed[reduce-overhead-None] | 6.1005ms | 5.2123ms | 191.8525 Ops/s | 190.0987 Ops/s | +0.92% |
| test_ppo_speed[reduce-overhead-backward] | 13.1238ms | 11.7725ms | 84.9435 Ops/s | 84.7128 Ops/s | +0.27% |
| test_reinforce_speed[False-None] | 7.6844ms | 6.8242ms | 146.5374 Ops/s | 145.9925 Ops/s | +0.37% |
| test_reinforce_speed[False-backward] | 10.5736ms | 10.3538ms | 96.5826 Ops/s | 98.5009 Ops/s | -1.95% |
| test_reinforce_speed[True-None] | 4.9915ms | 4.2394ms | 235.8843 Ops/s | 239.6294 Ops/s | -1.56% |
| test_reinforce_speed[True-backward] | 10.7796ms | 10.3653ms | 96.4760 Ops/s | 95.0254 Ops/s | +1.53% |
| test_reinforce_speed[reduce-overhead-None] | 4.8478ms | 4.1984ms | 238.1835 Ops/s | 240.4901 Ops/s | -0.96% |
| test_reinforce_speed[reduce-overhead-backward] | 11.5308ms | 10.6679ms | 93.7395 Ops/s | 96.5999 Ops/s | -2.96% |
| test_iql_speed[False-None] | 40.2801ms | 34.3729ms | 29.0927 Ops/s | 29.9491 Ops/s | -2.86% |
| test_iql_speed[False-backward] | 55.8037ms | 47.1941ms | 21.1891 Ops/s | 21.5276 Ops/s | -1.57% |
| test_iql_speed[True-None] | 17.7780ms | 16.1586ms | 61.8864 Ops/s | 60.7987 Ops/s | +1.79% |
| test_iql_speed[True-backward] | 28.3468ms | 26.7574ms | 37.3729 Ops/s | 36.7737 Ops/s | +1.63% |
| test_iql_speed[reduce-overhead-None] | 16.6797ms | 15.5598ms | 64.2681 Ops/s | 61.1391 Ops/s | **+5.12%** |
| test_iql_speed[reduce-overhead-backward] | 37.0658ms | 27.3141ms | 36.6111 Ops/s | 36.7904 Ops/s | -0.49% |
| test_rb_sample[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] | 0.6581s | 7.9298ms | 126.1061 Ops/s | 210.2728 Ops/s | **-40.03%** |
| test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] | 1.1266ms | 0.5215ms | 1.9175 KOps/s | 1.9652 KOps/s | -2.43% |
| test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] | 0.7529ms | 0.4901ms | 2.0404 KOps/s | 2.0281 KOps/s | +0.61% |
| test_rb_sample[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] | 7.5486ms | 4.5657ms | 219.0228 Ops/s | 217.8801 Ops/s | +0.52% |
| test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] | 3.2307ms | 0.5144ms | 1.9441 KOps/s | 1.9617 KOps/s | -0.90% |
| test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] | 0.7280ms | 0.4854ms | 2.0603 KOps/s | 2.0591 KOps/s | +0.06% |
| test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-sampler6-10000] | 2.2746ms | 1.6835ms | 594.0115 Ops/s | 604.6716 Ops/s | -1.76% |
| test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-sampler7-10000] | 1.9196ms | 1.5870ms | 630.1396 Ops/s | 635.4550 Ops/s | -0.84% |
| test_rb_sample[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] | 7.6522ms | 4.8662ms | 205.5009 Ops/s | 213.2353 Ops/s | -3.63% |
| test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] | 2.1910ms | 0.6614ms | 1.5118 KOps/s | 1.5403 KOps/s | -1.85% |
| test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] | 0.8635ms | 0.6439ms | 1.5530 KOps/s | 1.5971 KOps/s | -2.76% |
| test_rb_iterate[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] | 6.4447ms | 4.6820ms | 213.5822 Ops/s | 217.4559 Ops/s | -1.78% |
| test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] | 2.5692ms | 0.5272ms | 1.8970 KOps/s | 1.9134 KOps/s | -0.86% |
| test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] | 0.6987ms | 0.5048ms | 1.9809 KOps/s | 2.0435 KOps/s | -3.06% |
| test_rb_iterate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] | 5.9100ms | 4.5278ms | 220.8573 Ops/s | 220.8374 Ops/s | +0.01% |
| test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] | 0.9352ms | 0.5067ms | 1.9737 KOps/s | 1.9508 KOps/s | +1.17% |
| test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] | 0.7960ms | 0.4879ms | 2.0494 KOps/s | 2.0343 KOps/s | +0.74% |
| test_rb_iterate[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] | 5.2510ms | 4.6385ms | 215.5868 Ops/s | 211.4097 Ops/s | +1.98% |
| test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] | 0.8733ms | 0.6520ms | 1.5337 KOps/s | 1.5407 KOps/s | -0.45% |
| test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] | 0.8771ms | 0.6278ms | 1.5928 KOps/s | 1.5745 KOps/s | +1.16% |
| test_rb_populate[TensorDictReplayBuffer-ListStorage-RandomSampler-400] | 5.5636ms | 4.3185ms | 231.5614 Ops/s | 235.4285 Ops/s | -1.64% |
| test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-400] | 8.4354ms | 2.3865ms | 419.0189 Ops/s | 437.7997 Ops/s | -4.29% |
| test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-400] | 7.8979ms | 1.5570ms | 642.2413 Ops/s | 708.7485 Ops/s | **-9.38%** |
| test_rb_populate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-400] | 0.8741s | 21.7548ms | 45.9668 Ops/s | 231.5686 Ops/s | **-80.15%** |
| test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-400] | 3.2607ms | 2.1292ms | 469.6658 Ops/s | 426.8408 Ops/s | **+10.03%** |
| test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-400] | 6.6914ms | 1.4787ms | 676.2518 Ops/s | 641.3627 Ops/s | **+5.44%** |
| test_rb_populate[TensorDictPrioritizedReplayBuffer-ListStorage-None-400] | 5.9499ms | 4.4993ms | 222.2546 Ops/s | 220.0501 Ops/s | +1.00% |
| test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-400] | 7.4260ms | 2.4956ms | 400.6996 Ops/s | 398.3134 Ops/s | +0.60% |
| test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-400] | 6.2721ms | 1.6454ms | 607.7468 Ops/s | 648.2246 Ops/s | **-6.24%** |
| test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-True] | 63.6960ms | 51.5733ms | 19.3899 Ops/s | 82.1667 Ops/s | **-76.40%** |
| test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-False] | 15.1973ms | 14.1332ms | 70.7553 Ops/s | 71.2113 Ops/s | -0.64% |
| test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-True] | 63.3219ms | 51.0758ms | 19.5787 Ops/s | 48.0074 Ops/s | **-59.22%** |
| test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-False] | 23.1787ms | 14.3480ms | 69.6960 Ops/s | 70.5730 Ops/s | -1.24% |
| test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-True] | 65.2474ms | 50.9797ms | 19.6157 Ops/s | 48.2254 Ops/s | **-59.33%** |
| test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-False] | 17.7353ms | 15.5315ms | 64.3853 Ops/s | 63.7578 Ops/s | +0.98% |
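
For readers checking the numbers: the Change column appears to be the relative difference between the Ops and Ops on Repo HEAD columns, and the entries marked in bold are consistent with a cutoff of roughly 5% in either direction. Here is a minimal Python sketch under those two assumptions. The function names and the 5% threshold are illustrative guesses, not taken from the benchmark tooling, and the published percentages seem to be computed from unrounded throughputs, so the last digit can differ.

```python
def relative_change(ops_pr: float, ops_head: float) -> float:
    """Percent change in throughput of the PR relative to repo HEAD."""
    return (ops_pr / ops_head - 1.0) * 100.0


def classify(change_pct: float, threshold_pct: float = 5.0) -> str:
    """Label a change; the 5% threshold is an assumption inferred from the table."""
    if change_pct >= threshold_pct:
        return "improved"
    if change_pct <= -threshold_pct:
        return "worsened"
    return "within noise"


# Two rows copied from the CPU table, converted to Ops/s (1 KOps/s = 1000 Ops/s).
rows = {
    "test_dqn_speed[True-backward]": (1030.2, 699.7247),
    "test_rb_populate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-400]": (
        45.9668,
        231.5686,
    ),
}

for name, (ops_pr, ops_head) in rows.items():
    change = relative_change(ops_pr, ops_head)
    print(f"{name}: {change:+.2f}% ({classify(change)})")
```

Both rows reproduce the table's reported change to within 0.01 percentage points. Note that Max can be far above Mean for some tests (for example 0.8741s against 21.7548ms), so a few slow iterations may be pulling the mean throughput down.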


github-actions bot commented Mar 10, 2025

⚠ Warning: Result of GPU Benchmark Tests

Total Benchmarks: 149. Improved: 11. Worsened: 19.

| Name | Max | Mean | Ops | Ops on Repo HEAD | Change |
| --- | --- | --- | --- | --- | --- |
| test_simple | 0.9115s | 0.8405s | 1.1898 Ops/s | 1.2411 Ops/s | -4.13% |
| test_transformed | 1.5017s | 1.4170s | 0.7057 Ops/s | 0.6839 Ops/s | +3.19% |
| test_serial | 2.4312s | 2.3449s | 0.4265 Ops/s | 0.4263 Ops/s | +0.04% |
| test_parallel | 2.1696s | 1.9386s | 0.5158 Ops/s | 0.5316 Ops/s | -2.96% |
| test_step_mdp_speed[True-True-True-True-True] | 0.2389ms | 40.0743μs | 24.9537 KOps/s | 25.4723 KOps/s | -2.04% |
| test_step_mdp_speed[True-True-True-True-False] | 48.4800μs | 23.4138μs | 42.7099 KOps/s | 43.2363 KOps/s | -1.22% |
| test_step_mdp_speed[True-True-True-False-True] | 47.9010μs | 22.0218μs | 45.4096 KOps/s | 45.7695 KOps/s | -0.79% |
| test_step_mdp_speed[True-True-True-False-False] | 42.1110μs | 12.9121μs | 77.4467 KOps/s | 77.8838 KOps/s | -0.56% |
| test_step_mdp_speed[True-True-False-True-True] | 70.7810μs | 42.6161μs | 23.4653 KOps/s | 23.9913 KOps/s | -2.19% |
| test_step_mdp_speed[True-True-False-True-False] | 54.7210μs | 25.5245μs | 39.1780 KOps/s | 39.4151 KOps/s | -0.60% |
| test_step_mdp_speed[True-True-False-False-True] | 66.7610μs | 24.3168μs | 41.1238 KOps/s | 41.1580 KOps/s | -0.08% |
| test_step_mdp_speed[True-True-False-False-False] | 47.1910μs | 15.3514μs | 65.1408 KOps/s | 65.4098 KOps/s | -0.41% |
| test_step_mdp_speed[True-False-True-True-True] | 70.3010μs | 44.4556μs | 22.4944 KOps/s | 22.5084 KOps/s | -0.06% |
| test_step_mdp_speed[True-False-True-True-False] | 60.9920μs | 28.0728μs | 35.6217 KOps/s | 35.8416 KOps/s | -0.61% |
| test_step_mdp_speed[True-False-True-False-True] | 52.3810μs | 24.7079μs | 40.4729 KOps/s | 40.7201 KOps/s | -0.61% |
| test_step_mdp_speed[True-False-True-False-False] | 46.4010μs | 15.2840μs | 65.4277 KOps/s | 64.8528 KOps/s | +0.89% |
| test_step_mdp_speed[True-False-False-True-True] | 84.2720μs | 46.2726μs | 21.6111 KOps/s | 21.2876 KOps/s | +1.52% |
| test_step_mdp_speed[True-False-False-True-False] | 62.0710μs | 29.7224μs | 33.6447 KOps/s | 33.0183 KOps/s | +1.90% |
| test_step_mdp_speed[True-False-False-False-True] | 59.4120μs | 26.8299μs | 37.2719 KOps/s | 37.3504 KOps/s | -0.21% |
| test_step_mdp_speed[True-False-False-False-False] | 43.3210μs | 17.5123μs | 57.1026 KOps/s | 56.5236 KOps/s | +1.02% |
| test_step_mdp_speed[False-True-True-True-True] | 77.0410μs | 44.8859μs | 22.2787 KOps/s | 22.1870 KOps/s | +0.41% |
| test_step_mdp_speed[False-True-True-True-False] | 57.0410μs | 27.9850μs | 35.7335 KOps/s | 35.4122 KOps/s | +0.91% |
| test_step_mdp_speed[False-True-True-False-True] | 2.5732ms | 28.3993μs | 35.2121 KOps/s | 34.8760 KOps/s | +0.96% |
| test_step_mdp_speed[False-True-True-False-False] | 47.7710μs | 17.0146μs | 58.7730 KOps/s | 57.9367 KOps/s | +1.44% |
| test_step_mdp_speed[False-True-False-True-True] | 88.4220μs | 46.4240μs | 21.5406 KOps/s | 21.2109 KOps/s | +1.55% |
| test_step_mdp_speed[False-True-False-True-False] | 61.7410μs | 29.7019μs | 33.6679 KOps/s | 32.5796 KOps/s | +3.34% |
| test_step_mdp_speed[False-True-False-False-True] | 61.7610μs | 30.3431μs | 32.9565 KOps/s | 32.3626 KOps/s | +1.84% |
| test_step_mdp_speed[False-True-False-False-False] | 47.4510μs | 19.2189μs | 52.0322 KOps/s | 51.8435 KOps/s | +0.36% |
| test_step_mdp_speed[False-False-True-True-True] | 81.4120μs | 48.7797μs | 20.5003 KOps/s | 21.0336 KOps/s | -2.54% |
| test_step_mdp_speed[False-False-True-True-False] | 65.7520μs | 32.3370μs | 30.9243 KOps/s | 31.3251 KOps/s | -1.28% |
| test_step_mdp_speed[False-False-True-False-True] | 65.3020μs | 30.3350μs | 32.9653 KOps/s | 33.1540 KOps/s | -0.57% |
| test_step_mdp_speed[False-False-True-False-False] | 56.4110μs | 19.3483μs | 51.6842 KOps/s | 51.8593 KOps/s | -0.34% |
| test_step_mdp_speed[False-False-False-True-True] | 88.3920μs | 51.0758μs | 19.5788 KOps/s | 20.0253 KOps/s | -2.23% |
| test_step_mdp_speed[False-False-False-True-False] | 67.5310μs | 34.9383μs | 28.6219 KOps/s | 29.3320 KOps/s | -2.42% |
| test_step_mdp_speed[False-False-False-False-True] | 64.9410μs | 32.7066μs | 30.5749 KOps/s | 31.3186 KOps/s | -2.37% |
| test_step_mdp_speed[False-False-False-False-False] | 50.5910μs | 21.4254μs | 46.6736 KOps/s | 47.0122 KOps/s | -0.72% |
| test_values[generalized_advantage_estimate-True-True] | 27.2249ms | 26.1755ms | 38.2037 Ops/s | 39.6866 Ops/s | -3.74% |
| test_values[vec_generalized_advantage_estimate-True-True] | 0.1103s | 3.1176ms | 320.7576 Ops/s | 351.8115 Ops/s | **-8.83%** |
| test_values[td0_return_estimate-False-False] | 0.1076ms | 81.7364μs | 12.2344 KOps/s | 12.6986 KOps/s | -3.65% |
| test_values[td1_return_estimate-False-False] | 60.2636ms | 59.1666ms | 16.9014 Ops/s | 18.0552 Ops/s | **-6.39%** |
| test_values[vec_td1_return_estimate-False-False] | 1.3178ms | 1.1020ms | 907.4708 Ops/s | 924.5759 Ops/s | -1.85% |
| test_values[td_lambda_return_estimate-True-False] | 92.2334ms | 90.0949ms | 11.0994 Ops/s | 11.1994 Ops/s | -0.89% |
| test_values[vec_td_lambda_return_estimate-True-False] | 1.4272ms | 1.1046ms | 905.3270 Ops/s | 928.7350 Ops/s | -2.52% |
| test_gae_speed[generalized_advantage_estimate-False-1-512] | 25.9836ms | 25.7011ms | 38.9089 Ops/s | 38.2277 Ops/s | +1.78% |
| test_gae_speed[vec_generalized_advantage_estimate-True-1-512] | 1.0494ms | 0.7763ms | 1.2882 KOps/s | 1.3339 KOps/s | -3.42% |
| test_gae_speed[vec_generalized_advantage_estimate-False-1-512] | 0.7276ms | 0.6811ms | 1.4683 KOps/s | 1.4436 KOps/s | +1.71% |
| test_gae_speed[vec_generalized_advantage_estimate-True-32-512] | 1.5628ms | 1.4985ms | 667.3544 Ops/s | 674.0079 Ops/s | -0.99% |
| test_gae_speed[vec_generalized_advantage_estimate-False-32-512] | 0.7507ms | 0.6939ms | 1.4411 KOps/s | 1.4111 KOps/s | +2.13% |
| test_dqn_speed[False-None] | 1.5928ms | 1.5040ms | 664.9049 Ops/s | 657.2728 Ops/s | +1.16% |
| test_dqn_speed[False-backward] | 2.2549ms | 2.1325ms | 468.9260 Ops/s | 467.3727 Ops/s | +0.33% |
| test_dqn_speed[True-None] | 0.6807ms | 0.5745ms | 1.7407 KOps/s | 1.7307 KOps/s | +0.58% |
| test_dqn_speed[True-backward] | 1.2884ms | 1.2359ms | 809.1339 Ops/s | 886.2625 Ops/s | **-8.70%** |
| test_dqn_speed[reduce-overhead-None] | 0.6821ms | 0.6007ms | 1.6647 KOps/s | 1.7612 KOps/s | **-5.48%** |
| test_dqn_speed[reduce-overhead-backward] | 1.1154ms | 1.0706ms | 934.0691 Ops/s | 1.0121 KOps/s | **-7.71%** |
| test_ddpg_speed[False-None] | 3.1303ms | 2.7875ms | 358.7410 Ops/s | 353.4213 Ops/s | +1.51% |
| test_ddpg_speed[False-backward] | 4.6109ms | 4.2010ms | 238.0375 Ops/s | 239.6948 Ops/s | -0.69% |
| test_ddpg_speed[True-None] | 1.4266ms | 1.3404ms | 746.0212 Ops/s | 739.6339 Ops/s | +0.86% |
| test_ddpg_speed[True-backward] | 2.8112ms | 2.6552ms | 376.6174 Ops/s | 408.7596 Ops/s | **-7.86%** |
| test_ddpg_speed[reduce-overhead-None] | 1.7859ms | 1.3535ms | 738.8061 Ops/s | 743.5589 Ops/s | -0.64% |
| test_ddpg_speed[reduce-overhead-backward] | 2.1019ms | 2.0560ms | 486.3728 Ops/s | 523.3494 Ops/s | **-7.07%** |
| test_sac_speed[False-None] | 8.4249ms | 8.0037ms | 124.9427 Ops/s | 123.7922 Ops/s | +0.93% |
| test_sac_speed[False-backward] | 11.7174ms | 11.2336ms | 89.0185 Ops/s | 90.3515 Ops/s | -1.48% |
| test_sac_speed[True-None] | 2.0465ms | 1.8686ms | 535.1593 Ops/s | 523.6458 Ops/s | +2.20% |
| test_sac_speed[True-backward] | 4.2303ms | 3.8191ms | 261.8445 Ops/s | 278.7838 Ops/s | **-6.08%** |
| test_sac_speed[reduce-overhead-None] | 21.5186ms | 12.0951ms | 82.6781 Ops/s | 82.1400 Ops/s | +0.66% |
| test_sac_speed[reduce-overhead-backward] | 1.8304ms | 1.7790ms | 562.1152 Ops/s | 553.9360 Ops/s | +1.48% |
| test_redq_speed[False-None] | 8.1019ms | 7.6160ms | 131.3021 Ops/s | 125.4087 Ops/s | +4.70% |
| test_redq_speed[False-backward] | 12.4307ms | 11.9194ms | 83.8967 Ops/s | 81.4067 Ops/s | +3.06% |
| test_redq_speed[True-None] | 2.4689ms | 2.3459ms | 426.2692 Ops/s | 428.8015 Ops/s | -0.59% |
| test_redq_speed[True-backward] | 4.3960ms | 4.2958ms | 232.7881 Ops/s | 232.4450 Ops/s | +0.15% |
| test_redq_speed[reduce-overhead-None] | 2.5147ms | 2.3707ms | 421.8169 Ops/s | 426.0412 Ops/s | -0.99% |
| test_redq_speed[reduce-overhead-backward] | 4.3959ms | 4.3255ms | 231.1852 Ops/s | 235.5505 Ops/s | -1.85% |
| test_redq_deprec_speed[False-None] | 9.4288ms | 9.0577ms | 110.4031 Ops/s | 109.6760 Ops/s | +0.66% |
| test_redq_deprec_speed[False-backward] | 12.6728ms | 12.2995ms | 81.3040 Ops/s | 79.7876 Ops/s | +1.90% |
| test_redq_deprec_speed[True-None] | 2.8340ms | 2.6628ms | 375.5434 Ops/s | 379.0330 Ops/s | -0.92% |
| test_redq_deprec_speed[True-backward] | 4.7996ms | 4.3978ms | 227.3855 Ops/s | 218.4706 Ops/s | +4.08% |
| test_redq_deprec_speed[reduce-overhead-None] | 2.8317ms | 2.6570ms | 376.3653 Ops/s | 363.2288 Ops/s | +3.62% |
| test_redq_deprec_speed[reduce-overhead-backward] | 4.6470ms | 4.5792ms | 218.3798 Ops/s | 220.3945 Ops/s | -0.91% |
| test_td3_speed[False-None] | 8.1743ms | 7.9602ms | 125.6247 Ops/s | 124.5805 Ops/s | +0.84% |
| test_td3_speed[False-backward] | 10.9360ms | 10.5492ms | 94.7939 Ops/s | 94.7158 Ops/s | +0.08% |
| test_td3_speed[True-None] | 1.7588ms | 1.6917ms | 591.1062 Ops/s | 606.7664 Ops/s | -2.58% |
| test_td3_speed[True-backward] | 3.4108ms | 3.2654ms | 306.2433 Ops/s | 297.8655 Ops/s | +2.81% |
| test_td3_speed[reduce-overhead-None] | 77.0782ms | 26.4671ms | 37.7828 Ops/s | 37.9485 Ops/s | -0.44% |
| test_td3_speed[reduce-overhead-backward] | 1.4069ms | 1.3344ms | 749.4143 Ops/s | 671.5384 Ops/s | **+11.60%** |
| test_cql_speed[False-None] | 17.5370ms | 16.7764ms | 59.6076 Ops/s | 58.9104 Ops/s | +1.18% |
| test_cql_speed[False-backward] | 22.5298ms | 21.9684ms | 45.5200 Ops/s | 44.4357 Ops/s | +2.44% |
| test_cql_speed[True-None] | 3.3881ms | 3.2812ms | 304.7636 Ops/s | 309.8582 Ops/s | -1.64% |
| test_cql_speed[True-backward] | 6.2619ms | 5.7354ms | 174.3565 Ops/s | 181.0801 Ops/s | -3.71% |
| test_cql_speed[reduce-overhead-None] | 0.5934s | 16.3080ms | 61.3195 Ops/s | 75.9268 Ops/s | **-19.24%** |
| test_cql_speed[reduce-overhead-backward] | 2.0630ms | 1.9901ms | 502.4844 Ops/s | 504.9713 Ops/s | -0.49% |
| test_a2c_speed[False-None] | 3.3511ms | 3.1637ms | 316.0901 Ops/s | 309.4479 Ops/s | +2.15% |
| test_a2c_speed[False-backward] | 6.9619ms | 6.3719ms | 156.9379 Ops/s | 155.9870 Ops/s | +0.61% |
| test_a2c_speed[True-None] | 1.4565ms | 1.3457ms | 743.1072 Ops/s | 739.8074 Ops/s | +0.45% |
| test_a2c_speed[True-backward] | 3.1747ms | 3.0964ms | 322.9542 Ops/s | 337.0165 Ops/s | -4.17% |
| test_a2c_speed[reduce-overhead-None] | 16.3537ms | 9.2114ms | 108.5615 Ops/s | 108.7299 Ops/s | -0.15% |
| test_a2c_speed[reduce-overhead-backward] | 1.6769ms | 1.6115ms | 620.5394 Ops/s | 671.1044 Ops/s | **-7.53%** |
| test_ppo_speed[False-None] | 3.9388ms | 3.6941ms | 270.6990 Ops/s | 268.8428 Ops/s | +0.69% |
| test_ppo_speed[False-backward] | 7.6473ms | 7.1454ms | 139.9500 Ops/s | 145.2602 Ops/s | -3.66% |
| test_ppo_speed[True-None] | 1.7305ms | 1.4296ms | 699.4784 Ops/s | 701.3954 Ops/s | -0.27% |
| test_ppo_speed[True-backward] | 3.7453ms | 3.2823ms | 304.6650 Ops/s | 322.3451 Ops/s | **-5.48%** |
| test_ppo_speed[reduce-overhead-None] | 1.1395ms | 0.9678ms | 1.0333 KOps/s | 1.0386 KOps/s | -0.51% |
| test_ppo_speed[reduce-overhead-backward] | 1.6213ms | 1.5642ms | 639.3180 Ops/s | 686.5771 Ops/s | **-6.88%** |
| test_reinforce_speed[False-None] | 2.3848ms | 2.2511ms | 444.2285 Ops/s | 428.3896 Ops/s | +3.70% |
| test_reinforce_speed[False-backward] | 3.8498ms | 3.4727ms | 287.9575 Ops/s | 300.6180 Ops/s | -4.21% |
| test_reinforce_speed[True-None] | 1.3837ms | 1.2857ms | 777.7597 Ops/s | 773.9020 Ops/s | +0.50% |
| test_reinforce_speed[True-backward] | 3.2677ms | 3.1242ms | 320.0862 Ops/s | 327.4716 Ops/s | -2.26% |
| test_reinforce_speed[reduce-overhead-None] | 19.6199ms | 10.5124ms | 95.1256 Ops/s | 94.3531 Ops/s | +0.82% |
| test_reinforce_speed[reduce-overhead-backward] | 1.6950ms | 1.6411ms | 609.3469 Ops/s | 653.4953 Ops/s | **-6.76%** |
| test_iql_speed[False-None] | 9.7633ms | 9.1808ms | 108.9232 Ops/s | 106.7337 Ops/s | +2.05% |
| test_iql_speed[False-backward] | 13.7122ms | 13.1584ms | 75.9970 Ops/s | 74.5671 Ops/s | +1.92% |
| test_iql_speed[True-None] | 2.6771ms | 2.2275ms | 448.9431 Ops/s | 432.0256 Ops/s | +3.92% |
| test_iql_speed[True-backward] | 5.4510ms | 5.0035ms | 199.8585 Ops/s | 198.6945 Ops/s | +0.59% |
| test_iql_speed[reduce-overhead-None] | 0.5279s | 13.1271ms | 76.1785 Ops/s | 88.5110 Ops/s | **-13.93%** |
test_iql_speed[reduce-overhead-backward] 2.1547ms 2.0526ms 487.1838 Ops/s 481.9221 Ops/s $\color{#35bf28}+1.09\%$
test_rb_sample[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] 7.6615ms 6.2251ms 160.6413 Ops/s 157.6912 Ops/s $\color{#35bf28}+1.87\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] 0.5253ms 0.3134ms 3.1911 KOps/s 3.3106 KOps/s $\color{#d91a1a}-3.61\%$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] 0.4661ms 0.2846ms 3.5138 KOps/s 3.0685 KOps/s $\textbf{\color{#35bf28}+14.51\%}$
test_rb_sample[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] 6.2681ms 5.9063ms 169.3105 Ops/s 166.4537 Ops/s $\color{#35bf28}+1.72\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] 1.2510ms 0.3392ms 2.9484 KOps/s 3.8547 KOps/s $\textbf{\color{#d91a1a}-23.51\%}$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] 0.5010ms 0.3162ms 3.1622 KOps/s 3.1906 KOps/s $\color{#d91a1a}-0.89\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-sampler6-10000] 1.6372ms 1.4458ms 691.6444 Ops/s 729.6298 Ops/s $\textbf{\color{#d91a1a}-5.21\%}$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-sampler7-10000] 1.5700ms 1.3326ms 750.4374 Ops/s 795.0510 Ops/s $\textbf{\color{#d91a1a}-5.61\%}$
test_rb_sample[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] 6.4963ms 6.1734ms 161.9845 Ops/s 160.8284 Ops/s $\color{#35bf28}+0.72\%$
test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] 1.1011ms 0.4287ms 2.3326 KOps/s 2.1091 KOps/s $\textbf{\color{#35bf28}+10.60\%}$
test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] 0.7069ms 0.4476ms 2.2339 KOps/s 2.1588 KOps/s $\color{#35bf28}+3.48\%$
test_rb_iterate[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] 10.1276ms 6.1677ms 162.1353 Ops/s 164.0204 Ops/s $\color{#d91a1a}-1.15\%$
test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] 0.9852ms 0.3452ms 2.8969 KOps/s 3.7956 KOps/s $\textbf{\color{#d91a1a}-23.68\%}$
test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] 1.4861ms 0.2758ms 3.6252 KOps/s 3.1383 KOps/s $\textbf{\color{#35bf28}+15.52\%}$
test_rb_iterate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] 6.5045ms 5.9472ms 168.1454 Ops/s 165.4432 Ops/s $\color{#35bf28}+1.63\%$
test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] 0.6039ms 0.3245ms 3.0820 KOps/s 2.8442 KOps/s $\textbf{\color{#35bf28}+8.36\%}$
test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] 0.6733ms 0.2835ms 3.5267 KOps/s 3.1755 KOps/s $\textbf{\color{#35bf28}+11.06\%}$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] 6.3323ms 6.1162ms 163.5001 Ops/s 157.1358 Ops/s $\color{#35bf28}+4.05\%$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] 1.9140ms 0.4636ms 2.1568 KOps/s 2.2370 KOps/s $\color{#d91a1a}-3.58\%$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] 0.6416ms 0.4430ms 2.2571 KOps/s 2.2552 KOps/s $\color{#35bf28}+0.09\%$
test_rb_populate[TensorDictReplayBuffer-ListStorage-RandomSampler-400] 7.0400ms 5.4713ms 182.7717 Ops/s 177.7294 Ops/s $\color{#35bf28}+2.84\%$
test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-400] 8.7445ms 2.0196ms 495.1397 Ops/s 448.3826 Ops/s $\textbf{\color{#35bf28}+10.43\%}$
test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-400] 1.9876ms 0.9307ms 1.0745 KOps/s 842.0753 Ops/s $\textbf{\color{#35bf28}+27.60\%}$
test_rb_populate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-400] 7.1703ms 5.6001ms 178.5673 Ops/s 175.6941 Ops/s $\color{#35bf28}+1.64\%$
test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-400] 10.0998ms 2.0491ms 488.0259 Ops/s 421.6200 Ops/s $\textbf{\color{#35bf28}+15.75\%}$
test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-400] 3.5629ms 1.1385ms 878.3839 Ops/s 926.1711 Ops/s $\textbf{\color{#d91a1a}-5.16\%}$
test_rb_populate[TensorDictPrioritizedReplayBuffer-ListStorage-None-400] 0.5305s 16.2663ms 61.4769 Ops/s 30.1374 Ops/s $\textbf{\color{#35bf28}+103.99\%}$
test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-400] 3.9726ms 1.8653ms 536.1164 Ops/s 509.7406 Ops/s $\textbf{\color{#35bf28}+5.17\%}$
test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-400] 2.1757ms 1.1920ms 838.8942 Ops/s 806.9027 Ops/s $\color{#35bf28}+3.96\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-True] 57.0958ms 54.9086ms 18.2121 Ops/s 18.2798 Ops/s $\color{#d91a1a}-0.37\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-False] 18.7965ms 17.0687ms 58.5868 Ops/s 56.8586 Ops/s $\color{#35bf28}+3.04\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-True] 56.5558ms 54.4095ms 18.3791 Ops/s 17.8966 Ops/s $\color{#35bf28}+2.70\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-False] 19.3870ms 17.3347ms 57.6877 Ops/s 56.3494 Ops/s $\color{#35bf28}+2.37\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-True] 56.6669ms 54.9547ms 18.1968 Ops/s 17.9544 Ops/s $\color{#35bf28}+1.35\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-False] 19.0205ms 18.1004ms 55.2473 Ops/s 53.8597 Ops/s $\color{#35bf28}+2.58\%$

[ghstack-poisoned]
def env_maker():
    return GymEnv("Pendulum-v1", device="cpu")

policy = TensorDictModule(

Perhaps the example should be updated to use .from_policy_factory from the PR above once that is merged.

ray_init_kwargs = DEFAULT_RAY_INIT_CONFIG
ray.init(**ray_init_kwargs)

remote_cls = ReplayBuffer.as_remote(DEFAULT_REMOTE_CLASS_CONFIG).remote

Should we allow DEFAULT_REMOTE_CLASS_CONFIG to be passed as an argument?

Unlike storages, writers and samplers, transform constructors must
be passed as separate keyword argument :attr:`transform_factory`,
as it is impossible to distinguish a constructor from a transform.
transform_factory (Callable[[], Callable], optional): a factory for the

If we do this here for all of the buffers it kind of feels like the API for collector should match 😄

Contributor Author

Yeah I agree
The more I think about it the more it's obvious we should just pass a factory to the collectors
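
To illustrate the point made in the docstring quoted above, here is a minimal sketch of why a transform factory needs its own keyword argument. `Scale`, `make_scale` and `build_buffer_transform` are hypothetical names invented for this example and are not part of TorchRL; the sketch only assumes that both a transform and its constructor are callables.

```python
from typing import Callable, Optional


class Scale:
    """Toy transform: a callable object that doubles its input."""

    def __call__(self, x: float) -> float:
        return 2 * x


def make_scale() -> Scale:
    """Toy factory: a zero-argument callable that builds a transform."""
    return Scale()


def build_buffer_transform(
    transform: Optional[Callable] = None,
    transform_factory: Optional[Callable[[], Callable]] = None,
) -> Optional[Callable]:
    """Resolve the transform from exactly one of the two keyword arguments."""
    if transform is not None and transform_factory is not None:
        raise TypeError("pass either transform or transform_factory, not both")
    if transform_factory is not None:
        # The factory is called on the receiving side to build the transform.
        return transform_factory()
    return transform


# Both objects are callable, so callable() cannot tell them apart.
both_callable = callable(Scale()) and callable(make_scale)

from_instance = build_buffer_transform(transform=Scale())
from_factory = build_buffer_transform(transform_factory=make_scale)
```

With a single argument, the receiver would have to guess whether to call the object once to build a transform or to apply it to data directly; separate keywords remove that ambiguity.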


def sample(self, *args, **kwargs):
    pending_task = self._rb.sample.remote(*args, **kwargs)
    return ray.get(pending_task)

A very basic question about how sharing this across distributed workers works:

Is the idea that RayReplayBuffer is initialized in main and then passed to each DistributedActorWorker, so that when .sample() is called in one of the DistributedActorWorker's methods, the .get runs on the resources allocated to that specific DistributedActorWorker?

Contributor Author

Yes, the .get runs within the worker, not on main.
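
As a rough analogy for the exchange above, the following standard-library sketch shows a shared handle whose blocking wait happens in whichever thread calls `sample()`. It does not use Ray: `ToyRemoteBuffer` and `worker` are hypothetical, a single-thread executor stands in for the remote actor, threads stand in for distributed workers, and `sample` simply returns the first `n` items rather than drawing at random.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class ToyRemoteBuffer:
    """Stand-in for a remote buffer: one executor thread owns the data."""

    def __init__(self):
        self._executor = ThreadPoolExecutor(
            max_workers=1, thread_name_prefix="buffer"
        )
        self._data = []

    def extend(self, items):
        # Submit the write to the owning thread and wait for it to finish.
        self._executor.submit(self._data.extend, items).result()

    def sample(self, n):
        # Submit the read, then block in the *calling* thread for the result.
        # This plays the role of `ray.get(pending_task)` in the diff above.
        pending = self._executor.submit(lambda: list(self._data[:n]))
        return pending.result(), threading.current_thread().name

    def close(self):
        self._executor.shutdown()


def worker(buffer, out):
    # Each worker receives the same handle and waits on its own request.
    out.append(buffer.sample(2))


shared = ToyRemoteBuffer()  # created once, in "main"
shared.extend([1, 2, 3])

results = []
threads = [
    threading.Thread(target=worker, args=(shared, results), name=f"worker-{i}")
    for i in range(2)
]
for t in threads:
    t.start()
for t in threads:
    t.join()
shared.close()
```

Each entry in `results` records the name of the thread that waited for the batch, which is the worker thread rather than the main thread that created the handle.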

@@ -312,7 +314,9 @@ def __init__(
num_collectors: int = None,
update_after_each_batch=False,
max_weight_update_interval=-1,
replay_buffer: ReplayBuffer = None,

nit: this is not documented

@@ -577,17 +591,33 @@ def stop_remote_collectors(self):
) # This will interrupt any running tasks on the actor, causing them to fail immediately

def iterator(self):

Should iterator raise an error in the async case? Otherwise, what happens when iterating over this collector after a replay buffer has been passed?

Contributor Author

You should get None outputs, and the buffer is filled each time you call next()
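
The behavior described in this reply can be sketched with toy stand-ins. `ToyBuffer` and `ToyCollector` are hypothetical classes written for illustration only, not the collector changed in this PR; the sketch assumes nothing beyond what the reply states, namely that each `next()` fills the buffer and yields `None`.

```python
class ToyBuffer:
    """Minimal buffer that only supports appending batches."""

    def __init__(self):
        self.storage = []

    def extend(self, batch):
        self.storage.extend(batch)


class ToyCollector:
    """Yields batches, or None when it writes into a replay buffer instead."""

    def __init__(self, batch_size, total, replay_buffer=None):
        self.batch_size = batch_size
        self.total = total
        self.replay_buffer = replay_buffer

    def __iter__(self):
        produced = 0
        while produced < self.total:
            batch = list(range(produced, produced + self.batch_size))
            produced += self.batch_size
            if self.replay_buffer is not None:
                # Data goes to the buffer; the caller gets a placeholder.
                self.replay_buffer.extend(batch)
                yield None
            else:
                yield batch


buffer = ToyBuffer()
outputs = list(ToyCollector(batch_size=2, total=6, replay_buffer=buffer))
plain_outputs = list(ToyCollector(batch_size=2, total=4))
```

In this sketch the training loop would read data from `buffer` rather than from the iterator's return values.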

vmoens added 2 commits March 11, 2025 15:52
[ghstack-poisoned]
[ghstack-poisoned]
[ghstack-poisoned]
@vmoens vmoens merged commit 8af7635 into gh/vmoens/108/base Mar 12, 2025
60 of 72 checks passed
vmoens added a commit that referenced this pull request Mar 12, 2025
ghstack-source-id: 32eff06494037a1a30e532539794035c035f1e81
Pull Request resolved: #2835
@vmoens vmoens deleted the gh/vmoens/108/head branch March 12, 2025 15:08
Labels: CLA Signed, enhancement