[BugFix] Fix PEnv device copies #2840

Merged: 2 commits, Mar 8, 2025


@vmoens (Contributor) commented Mar 7, 2025

Stack from ghstack (oldest at bottom):

[ghstack-poisoned]

pytorch-bot bot commented Mar 7, 2025

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/rl/2840

Note: Links to docs will display an error until the docs builds have been completed.

❌ 2 New Failures, 18 Pending

As of commit 392d3e6 with merge base 73c7b0a (image):

NEW FAILURES - The following jobs have failed:

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@facebook-github-bot added the "CLA Signed" label Mar 7, 2025
vmoens added a commit that referenced this pull request Mar 7, 2025
ghstack-source-id: c47d67ca714e143bbe1fcb3605eeb53f504a5958
Pull Request resolved: #2840
@vmoens added the "bug" label Mar 7, 2025
[ghstack-poisoned]
vmoens added a commit that referenced this pull request Mar 8, 2025
ghstack-source-id: df39fd2e4cd72f24c645b0ac32b46ab3e8d847fc
Pull Request resolved: #2840
@vmoens vmoens merged commit 392d3e6 into gh/vmoens/109/base Mar 8, 2025
63 of 68 checks passed
vmoens added a commit that referenced this pull request Mar 8, 2025
ghstack-source-id: df39fd2e4cd72f24c645b0ac32b46ab3e8d847fc
Pull Request resolved: #2840
@vmoens vmoens deleted the gh/vmoens/109/head branch March 8, 2025 00:31

github-actions bot commented Mar 8, 2025

⚠ Warning: Result of CPU Benchmark Tests

Total Benchmarks: 149. Improved: 20. Worsened: 3.

| Name | Max | Mean | Ops | Ops on Repo HEAD | Change |
| --- | --- | --- | --- | --- | --- |
| test_simple | 0.6223s | 0.5261s | 1.9008 Ops/s | 1.9530 Ops/s | -2.67% |
| test_transformed | 1.0830s | 0.9959s | 1.0041 Ops/s | 0.9466 Ops/s | **+6.08%** |
| test_serial | 1.6058s | 1.5045s | 0.6647 Ops/s | 0.6448 Ops/s | +3.08% |
| test_parallel | 1.4038s | 1.3237s | 0.7555 Ops/s | 0.7542 Ops/s | +0.17% |
| test_step_mdp_speed[True-True-True-True-True] | 0.4082ms | 30.5383μs | 32.7457 KOps/s | 33.6264 KOps/s | -2.62% |
| test_step_mdp_speed[True-True-True-True-False] | 64.7380μs | 17.9291μs | 55.7753 KOps/s | 56.0999 KOps/s | -0.58% |
| test_step_mdp_speed[True-True-True-False-True] | 64.3300μs | 17.2152μs | 58.0883 KOps/s | 58.8769 KOps/s | -1.34% |
| test_step_mdp_speed[True-True-True-False-False] | 55.5650μs | 10.0972μs | 99.0378 KOps/s | 99.6048 KOps/s | -0.57% |
| test_step_mdp_speed[True-True-False-True-True] | 81.4430μs | 32.2544μs | 31.0035 KOps/s | 31.2993 KOps/s | -0.95% |
| test_step_mdp_speed[True-True-False-True-False] | 62.5080μs | 19.8864μs | 50.2856 KOps/s | 51.1250 KOps/s | -1.64% |
| test_step_mdp_speed[True-True-False-False-True] | 49.9140μs | 18.9660μs | 52.7260 KOps/s | 53.2039 KOps/s | -0.90% |
| test_step_mdp_speed[True-True-False-False-False] | 66.1340μs | 11.9705μs | 83.5388 KOps/s | 83.5303 KOps/s | +0.01% |
| test_step_mdp_speed[True-False-True-True-True] | 90.1990μs | 34.4499μs | 29.0276 KOps/s | 29.7684 KOps/s | -2.49% |
| test_step_mdp_speed[True-False-True-True-False] | 98.0430μs | 21.8324μs | 45.8036 KOps/s | 46.6852 KOps/s | -1.89% |
| test_step_mdp_speed[True-False-True-False-True] | 71.2840μs | 19.1341μs | 52.2628 KOps/s | 53.1544 KOps/s | -1.68% |
| test_step_mdp_speed[True-False-True-False-False] | 42.4390μs | 11.9624μs | 83.5949 KOps/s | 84.8874 KOps/s | -1.52% |
| test_step_mdp_speed[True-False-False-True-True] | 79.8890μs | 36.1191μs | 27.6862 KOps/s | 28.1811 KOps/s | -1.76% |
| test_step_mdp_speed[True-False-False-True-False] | 79.4690μs | 23.3685μs | 42.7926 KOps/s | 43.3154 KOps/s | -1.21% |
| test_step_mdp_speed[True-False-False-False-True] | 0.5220ms | 20.8396μs | 47.9855 KOps/s | 49.0658 KOps/s | -2.20% |
| test_step_mdp_speed[True-False-False-False-False] | 59.5180μs | 13.6582μs | 73.2159 KOps/s | 73.2203 KOps/s | -0.01% |
| test_step_mdp_speed[False-True-True-True-True] | 90.7290μs | 34.0196μs | 29.3949 KOps/s | 29.7531 KOps/s | -1.20% |
| test_step_mdp_speed[False-True-True-True-False] | 66.7440μs | 21.7058μs | 46.0705 KOps/s | 46.3718 KOps/s | -0.65% |
| test_step_mdp_speed[False-True-True-False-True] | 79.5780μs | 21.7939μs | 45.8845 KOps/s | 46.5559 KOps/s | -1.44% |
| test_step_mdp_speed[False-True-True-False-False] | 68.4480μs | 13.5003μs | 74.0725 KOps/s | 75.4596 KOps/s | -1.84% |
| test_step_mdp_speed[False-True-False-True-True] | 2.4596ms | 35.9747μs | 27.7973 KOps/s | 28.2761 KOps/s | -1.69% |
| test_step_mdp_speed[False-True-False-True-False] | 77.3210μs | 23.6040μs | 42.3658 KOps/s | 42.5306 KOps/s | -0.39% |
| test_step_mdp_speed[False-True-False-False-True] | 56.9360μs | 23.8107μs | 41.9979 KOps/s | 43.3259 KOps/s | -3.07% |
| test_step_mdp_speed[False-True-False-False-False] | 59.3910μs | 15.3216μs | 65.2675 KOps/s | 66.7750 KOps/s | -2.26% |
| test_step_mdp_speed[False-False-True-True-True] | 87.2730μs | 38.0533μs | 26.2789 KOps/s | 26.8857 KOps/s | -2.26% |
| test_step_mdp_speed[False-False-True-True-False] | 63.2880μs | 25.4117μs | 39.3520 KOps/s | 39.8052 KOps/s | -1.14% |
| test_step_mdp_speed[False-False-True-False-True] | 72.5860μs | 23.8659μs | 41.9008 KOps/s | 43.1945 KOps/s | -3.00% |
| test_step_mdp_speed[False-False-True-False-False] | 43.6820μs | 15.2577μs | 65.5406 KOps/s | 66.3701 KOps/s | -1.25% |
| test_step_mdp_speed[False-False-False-True-True] | 0.1112ms | 40.1367μs | 24.9149 KOps/s | 25.7857 KOps/s | -3.38% |
| test_step_mdp_speed[False-False-False-True-False] | 81.9530μs | 26.9642μs | 37.0863 KOps/s | 37.5684 KOps/s | -1.28% |
| test_step_mdp_speed[False-False-False-False-True] | 0.5699ms | 25.2088μs | 39.6688 KOps/s | 40.4897 KOps/s | -2.03% |
| test_step_mdp_speed[False-False-False-False-False] | 91.3210μs | 16.8317μs | 59.4118 KOps/s | 60.1307 KOps/s | -1.20% |
| test_values[generalized_advantage_estimate-True-True] | 10.3557ms | 9.7393ms | 102.6771 Ops/s | 101.5947 Ops/s | +1.07% |
| test_values[vec_generalized_advantage_estimate-True-True] | 26.4686ms | 24.3019ms | 41.1490 Ops/s | 37.3204 Ops/s | **+10.26%** |
| test_values[td0_return_estimate-False-False] | 0.2309ms | 0.1801ms | 5.5514 KOps/s | 5.5000 KOps/s | +0.93% |
| test_values[td1_return_estimate-False-False] | 25.2572ms | 24.6736ms | 40.5291 Ops/s | 40.6978 Ops/s | -0.41% |
| test_values[vec_td1_return_estimate-False-False] | 26.6804ms | 24.3294ms | 41.1026 Ops/s | 38.0430 Ops/s | **+8.04%** |
| test_values[td_lambda_return_estimate-True-False] | 38.4226ms | 35.2960ms | 28.3318 Ops/s | 28.7210 Ops/s | -1.36% |
| test_values[vec_td_lambda_return_estimate-True-False] | 25.6855ms | 24.2453ms | 41.2450 Ops/s | 38.2510 Ops/s | **+7.83%** |
| test_gae_speed[generalized_advantage_estimate-False-1-512] | 8.8078ms | 8.5918ms | 116.3895 Ops/s | 117.1181 Ops/s | -0.62% |
| test_gae_speed[vec_generalized_advantage_estimate-True-1-512] | 3.0838ms | 1.9014ms | 525.9357 Ops/s | 548.7819 Ops/s | -4.16% |
| test_gae_speed[vec_generalized_advantage_estimate-False-1-512] | 0.6604ms | 0.3697ms | 2.7052 KOps/s | 2.7052 KOps/s | +0.00% |
| test_gae_speed[vec_generalized_advantage_estimate-True-32-512] | 41.7813ms | 40.0815ms | 24.9492 Ops/s | 22.2217 Ops/s | **+12.27%** |
| test_gae_speed[vec_generalized_advantage_estimate-False-32-512] | 4.5781ms | 3.4477ms | 290.0489 Ops/s | 289.3277 Ops/s | +0.25% |
| test_dqn_speed[False-None] | 6.6938ms | 1.3835ms | 722.8073 Ops/s | 702.8140 Ops/s | +2.84% |
| test_dqn_speed[False-backward] | 1.9728ms | 1.8577ms | 538.3028 Ops/s | 520.3613 Ops/s | +3.45% |
| test_dqn_speed[True-None] | 0.6765ms | 0.5570ms | 1.7953 KOps/s | 1.7697 KOps/s | +1.44% |
| test_dqn_speed[True-backward] | 1.0165ms | 0.9649ms | 1.0364 KOps/s | 1.0065 KOps/s | +2.96% |
| test_dqn_speed[reduce-overhead-None] | 0.7042ms | 0.5540ms | 1.8049 KOps/s | 1.7809 KOps/s | +1.35% |
| test_dqn_speed[reduce-overhead-backward] | 1.0276ms | 0.9753ms | 1.0253 KOps/s | 997.3298 Ops/s | +2.81% |
| test_ddpg_speed[False-None] | 3.6469ms | 2.8699ms | 348.4477 Ops/s | 343.5375 Ops/s | +1.43% |
| test_ddpg_speed[False-backward] | 4.0800ms | 3.9762ms | 251.4964 Ops/s | 245.5327 Ops/s | +2.43% |
| test_ddpg_speed[True-None] | 1.9697ms | 1.4437ms | 692.6496 Ops/s | 685.6979 Ops/s | +1.01% |
| test_ddpg_speed[True-backward] | 2.3908ms | 2.3153ms | 431.9076 Ops/s | 389.6483 Ops/s | **+10.85%** |
| test_ddpg_speed[reduce-overhead-None] | 2.0056ms | 1.4333ms | 697.6693 Ops/s | 676.4530 Ops/s | +3.14% |
| test_ddpg_speed[reduce-overhead-backward] | 2.6629ms | 2.3872ms | 418.8933 Ops/s | 422.7929 Ops/s | -0.92% |
| test_sac_speed[False-None] | 12.7763ms | 8.1035ms | 123.4037 Ops/s | 117.8345 Ops/s | +4.73% |
| test_sac_speed[False-backward] | 12.5511ms | 11.1725ms | 89.5055 Ops/s | 91.0629 Ops/s | -1.71% |
| test_sac_speed[True-None] | 2.8267ms | 2.5745ms | 388.4215 Ops/s | 373.5717 Ops/s | +3.98% |
| test_sac_speed[True-backward] | 4.4510ms | 4.2533ms | 235.1127 Ops/s | 235.1219 Ops/s | -0.00% |
| test_sac_speed[reduce-overhead-None] | 3.3099ms | 2.5899ms | 386.1128 Ops/s | 368.2695 Ops/s | +4.85% |
| test_sac_speed[reduce-overhead-backward] | 6.3491ms | 5.1550ms | 193.9850 Ops/s | 231.3807 Ops/s | **-16.16%** |
| test_redq_speed[False-None] | 20.0424ms | 13.7988ms | 72.4702 Ops/s | 74.6347 Ops/s | -2.90% |
| test_redq_speed[False-backward] | 23.6084ms | 22.5412ms | 44.3632 Ops/s | 40.8454 Ops/s | **+8.61%** |
| test_redq_speed[True-None] | 8.6628ms | 7.4117ms | 134.9223 Ops/s | 140.6508 Ops/s | -4.07% |
| test_redq_speed[True-backward] | 18.1446ms | 15.2184ms | 65.7101 Ops/s | 66.4026 Ops/s | -1.04% |
| test_redq_speed[reduce-overhead-None] | 7.5178ms | 6.8340ms | 146.3264 Ops/s | 140.9480 Ops/s | +3.82% |
| test_redq_speed[reduce-overhead-backward] | 15.3203ms | 14.3777ms | 69.5522 Ops/s | 67.3203 Ops/s | +3.32% |
| test_redq_deprec_speed[False-None] | 13.8013ms | 12.8896ms | 77.5822 Ops/s | 75.5689 Ops/s | +2.66% |
| test_redq_deprec_speed[False-backward] | 20.5779ms | 18.5491ms | 53.9111 Ops/s | 52.3700 Ops/s | +2.94% |
| test_redq_deprec_speed[True-None] | 6.2183ms | 5.1780ms | 193.1245 Ops/s | 186.4063 Ops/s | +3.60% |
| test_redq_deprec_speed[True-backward] | 11.4551ms | 10.2010ms | 98.0297 Ops/s | 95.0341 Ops/s | +3.15% |
| test_redq_deprec_speed[reduce-overhead-None] | 6.9093ms | 5.5683ms | 179.5864 Ops/s | 187.2814 Ops/s | -4.11% |
| test_redq_deprec_speed[reduce-overhead-backward] | 14.0133ms | 11.1114ms | 89.9973 Ops/s | 94.7564 Ops/s | **-5.02%** |
| test_td3_speed[False-None] | 9.0667ms | 8.1684ms | 122.4231 Ops/s | 122.9965 Ops/s | -0.47% |
| test_td3_speed[False-backward] | 13.4369ms | 11.0689ms | 90.3431 Ops/s | 94.5934 Ops/s | -4.49% |
| test_td3_speed[True-None] | 2.4317ms | 2.2632ms | 441.8563 Ops/s | 428.8376 Ops/s | +3.04% |
| test_td3_speed[True-backward] | 4.0368ms | 3.8935ms | 256.8391 Ops/s | 248.7713 Ops/s | +3.24% |
| test_td3_speed[reduce-overhead-None] | 2.7876ms | 2.2665ms | 441.2071 Ops/s | 432.1933 Ops/s | +2.09% |
| test_td3_speed[reduce-overhead-backward] | 6.9335ms | 4.0715ms | 245.6126 Ops/s | 250.0583 Ops/s | -1.78% |
| test_cql_speed[False-None] | 37.4286ms | 36.1514ms | 27.6615 Ops/s | 27.0258 Ops/s | +2.35% |
| test_cql_speed[False-backward] | 51.6295ms | 46.8878ms | 21.3275 Ops/s | 21.2883 Ops/s | +0.18% |
| test_cql_speed[True-None] | 24.1364ms | 22.7811ms | 43.8959 Ops/s | 44.5422 Ops/s | -1.45% |
| test_cql_speed[True-backward] | 31.5443ms | 29.8931ms | 33.4526 Ops/s | 33.9211 Ops/s | -1.38% |
| test_cql_speed[reduce-overhead-None] | 24.5805ms | 23.0837ms | 43.3207 Ops/s | 44.2128 Ops/s | -2.02% |
| test_cql_speed[reduce-overhead-backward] | 30.3181ms | 29.2984ms | 34.1315 Ops/s | 33.0544 Ops/s | +3.26% |
| test_a2c_speed[False-None] | 8.5544ms | 7.2028ms | 138.8352 Ops/s | 131.8339 Ops/s | **+5.31%** |
| test_a2c_speed[False-backward] | 15.9622ms | 14.4997ms | 68.9670 Ops/s | 64.6152 Ops/s | **+6.73%** |
| test_a2c_speed[True-None] | 5.4864ms | 4.6497ms | 215.0667 Ops/s | 200.4058 Ops/s | **+7.32%** |
| test_a2c_speed[True-backward] | 12.1833ms | 11.2987ms | 88.5061 Ops/s | 84.3576 Ops/s | +4.92% |
| test_a2c_speed[reduce-overhead-None] | 5.4558ms | 4.6441ms | 215.3273 Ops/s | 211.1536 Ops/s | +1.98% |
| test_a2c_speed[reduce-overhead-backward] | 13.1818ms | 11.2767ms | 88.6785 Ops/s | 83.8230 Ops/s | **+5.79%** |
| test_ppo_speed[False-None] | 9.1202ms | 7.5215ms | 132.9526 Ops/s | 128.6593 Ops/s | +3.34% |
| test_ppo_speed[False-backward] | 16.3844ms | 14.9448ms | 66.9130 Ops/s | 63.3486 Ops/s | **+5.63%** |
| test_ppo_speed[True-None] | 5.7130ms | 5.0610ms | 197.5891 Ops/s | 190.2329 Ops/s | +3.87% |
| test_ppo_speed[True-backward] | 11.6495ms | 11.1135ms | 89.9804 Ops/s | 85.3433 Ops/s | **+5.43%** |
| test_ppo_speed[reduce-overhead-None] | 5.8183ms | 5.0156ms | 199.3783 Ops/s | 177.2241 Ops/s | **+12.50%** |
| test_ppo_speed[reduce-overhead-backward] | 11.8662ms | 10.9582ms | 91.2560 Ops/s | 89.7815 Ops/s | +1.64% |
| test_reinforce_speed[False-None] | 7.9670ms | 6.5234ms | 153.2948 Ops/s | 149.5353 Ops/s | +2.51% |
| test_reinforce_speed[False-backward] | 10.1400ms | 9.7987ms | 102.0543 Ops/s | 99.6331 Ops/s | +2.43% |
| test_reinforce_speed[True-None] | 4.7907ms | 4.0337ms | 247.9135 Ops/s | 241.2722 Ops/s | +2.75% |
| test_reinforce_speed[True-backward] | 10.8781ms | 10.2276ms | 97.7745 Ops/s | 97.2988 Ops/s | +0.49% |
| test_reinforce_speed[reduce-overhead-None] | 4.6505ms | 4.0408ms | 247.4764 Ops/s | 243.3562 Ops/s | +1.69% |
| test_reinforce_speed[reduce-overhead-backward] | 11.1184ms | 10.7225ms | 93.2618 Ops/s | 95.2949 Ops/s | -2.13% |
| test_iql_speed[False-None] | 34.6735ms | 32.6250ms | 30.6514 Ops/s | 29.6091 Ops/s | +3.52% |
| test_iql_speed[False-backward] | 48.1262ms | 45.8870ms | 21.7927 Ops/s | 21.6299 Ops/s | +0.75% |
| test_iql_speed[True-None] | 16.6060ms | 15.9119ms | 62.8459 Ops/s | 60.6288 Ops/s | +3.66% |
| test_iql_speed[True-backward] | 28.7983ms | 27.3953ms | 36.5026 Ops/s | 35.6911 Ops/s | +2.27% |
| test_iql_speed[reduce-overhead-None] | 16.9509ms | 15.8450ms | 63.1116 Ops/s | 62.1479 Ops/s | +1.55% |
| test_iql_speed[reduce-overhead-backward] | 27.8539ms | 27.5068ms | 36.3547 Ops/s | 36.0980 Ops/s | +0.71% |
| test_rb_sample[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] | 5.3705ms | 4.9237ms | 203.0998 Ops/s | 201.7453 Ops/s | +0.67% |
| test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] | 2.3890ms | 0.5478ms | 1.8256 KOps/s | 1.8568 KOps/s | -1.68% |
| test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] | 0.8838ms | 0.5231ms | 1.9118 KOps/s | 1.9368 KOps/s | -1.29% |
| test_rb_sample[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] | 5.5970ms | 4.8952ms | 204.2828 Ops/s | 210.5911 Ops/s | -3.00% |
| test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] | 1.1516ms | 0.5379ms | 1.8590 KOps/s | 1.8435 KOps/s | +0.84% |
| test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] | 0.8855ms | 0.5134ms | 1.9479 KOps/s | 1.9668 KOps/s | -0.96% |
| test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-sampler6-10000] | 2.3404ms | 1.6990ms | 588.5734 Ops/s | 579.5179 Ops/s | +1.56% |
| test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-sampler7-10000] | 2.3471ms | 1.6122ms | 620.2784 Ops/s | 613.2695 Ops/s | +1.14% |
| test_rb_sample[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] | 5.0713ms | 4.6647ms | 214.3740 Ops/s | 184.5304 Ops/s | **+16.17%** |
| test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] | 1.0549ms | 0.6633ms | 1.5076 KOps/s | 1.4498 KOps/s | +3.99% |
| test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] | 0.8791ms | 0.6423ms | 1.5569 KOps/s | 1.4979 KOps/s | +3.94% |
| test_rb_iterate[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] | 5.3121ms | 4.8017ms | 208.2613 Ops/s | 199.7904 Ops/s | +4.24% |
| test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] | 1.3913ms | 0.5549ms | 1.8020 KOps/s | 1.7891 KOps/s | +0.72% |
| test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] | 0.7660ms | 0.5241ms | 1.9081 KOps/s | 1.9528 KOps/s | -2.29% |
| test_rb_iterate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] | 5.4653ms | 4.6302ms | 215.9711 Ops/s | 220.1064 Ops/s | -1.88% |
| test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] | 1.3409ms | 0.5220ms | 1.9157 KOps/s | 1.8528 KOps/s | +3.40% |
| test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] | 0.8860ms | 0.5046ms | 1.9819 KOps/s | 1.9805 KOps/s | +0.07% |
| test_rb_iterate[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] | 5.1840ms | 4.7667ms | 209.7866 Ops/s | 209.4368 Ops/s | +0.17% |
| test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] | 1.1621ms | 0.6716ms | 1.4890 KOps/s | 1.4629 KOps/s | +1.79% |
| test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] | 0.8454ms | 0.6427ms | 1.5561 KOps/s | 1.5114 KOps/s | +2.95% |
| test_rb_populate[TensorDictReplayBuffer-ListStorage-RandomSampler-400] | 5.5210ms | 4.2688ms | 234.2582 Ops/s | 243.5497 Ops/s | -3.82% |
| test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-400] | 6.9361ms | 2.3572ms | 424.2280 Ops/s | 435.9205 Ops/s | -2.68% |
| test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-400] | 1.8205ms | 1.1999ms | 833.3843 Ops/s | 760.5258 Ops/s | **+9.58%** |
| test_rb_populate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-400] | 0.4450s | 13.1385ms | 76.1122 Ops/s | 231.2433 Ops/s | **-67.09%** |
| test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-400] | 8.6020ms | 2.3441ms | 426.5943 Ops/s | 429.2829 Ops/s | -0.63% |
| test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-400] | 1.9966ms | 1.2897ms | 775.3534 Ops/s | 735.9864 Ops/s | **+5.35%** |
| test_rb_populate[TensorDictPrioritizedReplayBuffer-ListStorage-None-400] | 6.0214ms | 4.4733ms | 223.5483 Ops/s | 32.6932 Ops/s | **+583.78%** |
| test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-400] | 6.5397ms | 2.5421ms | 393.3830 Ops/s | 366.0745 Ops/s | **+7.46%** |
| test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-400] | 7.4037ms | 1.6246ms | 615.5442 Ops/s | 621.1252 Ops/s | -0.90% |
| test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-True] | 12.1604ms | 11.6257ms | 86.0163 Ops/s | 81.5944 Ops/s | **+5.42%** |
| test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-False] | 14.9424ms | 14.0655ms | 71.0958 Ops/s | 70.5714 Ops/s | +0.74% |
| test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-True] | 21.4276ms | 20.5394ms | 48.6870 Ops/s | 47.3773 Ops/s | +2.76% |
| test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-False] | 16.1272ms | 14.3431ms | 69.7198 Ops/s | 69.4730 Ops/s | +0.36% |
| test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-True] | 22.3293ms | 20.6077ms | 48.5255 Ops/s | 47.8615 Ops/s | +1.39% |
| test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-False] | 17.8779ms | 15.7956ms | 63.3088 Ops/s | 63.6750 Ops/s | -0.58% |
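
For readers checking individual rows: the reported numbers are consistent with Ops being the reciprocal of Mean, and Change being the relative difference between Ops and Ops on Repo HEAD. That relationship is inferred from the table values, not taken from the benchmark workflow's source. The helper names below are hypothetical, and small mismatches in the last digit are expected because the table shows rounded values. A minimal sketch:

```python
def ops_per_second(mean_seconds: float) -> float:
    """Throughput implied by a mean runtime (assumes Ops = 1 / Mean)."""
    return 1.0 / mean_seconds


def relative_change(ops: float, ops_head: float) -> float:
    """Percent change of this PR's throughput relative to repo HEAD.

    Assumed formula: (ops - ops_head) / ops_head * 100.
    """
    return (ops - ops_head) / ops_head * 100.0


# Values copied from the test_simple row of the CPU table.
mean_s, ops, ops_head = 0.5261, 1.9008, 1.9530

print(f"ops from mean:  {ops_per_second(mean_s):.4f} Ops/s")
print(f"change vs HEAD: {relative_change(ops, ops_head):+.2f}%")
```

The same check explains the largest swing in the CPU table: for `test_rb_populate[TensorDictPrioritizedReplayBuffer-ListStorage-None-400]`, 223.5483 Ops/s against 32.6932 Ops/s on HEAD is roughly a 6.8x ratio, which matches the reported +583.78%.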


github-actions bot commented Mar 8, 2025

⚠ Warning: Result of GPU Benchmark Tests

Total Benchmarks: 149. Improved: 15. Worsened: 8.

| Name | Max | Mean | Ops | Ops on Repo HEAD | Change |
| --- | --- | --- | --- | --- | --- |
| test_simple | 0.9011s | 0.8145s | 1.2278 Ops/s | 1.2653 Ops/s | -2.97% |
| test_transformed | 1.5105s | 1.4215s | 0.7035 Ops/s | 0.7025 Ops/s | +0.14% |
| test_serial | 2.3884s | 2.3003s | 0.4347 Ops/s | 0.4329 Ops/s | +0.42% |
| test_parallel | 1.9402s | 1.8969s | 0.5272 Ops/s | 0.5343 Ops/s | -1.33% |
| test_step_mdp_speed[True-True-True-True-True] | 0.1457ms | 38.9033μs | 25.7048 KOps/s | 24.6324 KOps/s | +4.35% |
| test_step_mdp_speed[True-True-True-True-False] | 0.1702ms | 23.3333μs | 42.8573 KOps/s | 42.2115 KOps/s | +1.53% |
| test_step_mdp_speed[True-True-True-False-True] | 0.1007ms | 22.5887μs | 44.2698 KOps/s | 44.2300 KOps/s | +0.09% |
| test_step_mdp_speed[True-True-True-False-False] | 51.5610μs | 12.9212μs | 77.3924 KOps/s | 75.6425 KOps/s | +2.31% |
| test_step_mdp_speed[True-True-False-True-True] | 80.9010μs | 42.6947μs | 23.4221 KOps/s | 22.9284 KOps/s | +2.15% |
| test_step_mdp_speed[True-True-False-True-False] | 58.0210μs | 25.7884μs | 38.7771 KOps/s | 38.0291 KOps/s | +1.97% |
| test_step_mdp_speed[True-True-False-False-True] | 95.6420μs | 24.5408μs | 40.7485 KOps/s | 39.8371 KOps/s | +2.29% |
| test_step_mdp_speed[True-True-False-False-False] | 49.9810μs | 15.3647μs | 65.0841 KOps/s | 63.2550 KOps/s | +2.89% |
| test_step_mdp_speed[True-False-True-True-True] | 0.1193ms | 45.4344μs | 22.0098 KOps/s | 21.5924 KOps/s | +1.93% |
| test_step_mdp_speed[True-False-True-True-False] | 65.1310μs | 28.2547μs | 35.3923 KOps/s | 34.2870 KOps/s | +3.22% |
| test_step_mdp_speed[True-False-True-False-True] | 66.4420μs | 25.0051μs | 39.9919 KOps/s | 39.7409 KOps/s | +0.63% |
| test_step_mdp_speed[True-False-True-False-False] | 84.7320μs | 15.4448μs | 64.7466 KOps/s | 64.0896 KOps/s | +1.03% |
| test_step_mdp_speed[True-False-False-True-True] | 88.9010μs | 46.8743μs | 21.3337 KOps/s | 20.9709 KOps/s | +1.73% |
| test_step_mdp_speed[True-False-False-True-False] | 64.1810μs | 30.5293μs | 32.7554 KOps/s | 32.4654 KOps/s | +0.89% |
| test_step_mdp_speed[True-False-False-False-True] | 62.4210μs | 26.9858μs | 37.0565 KOps/s | 36.4917 KOps/s | +1.55% |
| test_step_mdp_speed[True-False-False-False-False] | 49.2610μs | 17.7273μs | 56.4100 KOps/s | 55.9546 KOps/s | +0.81% |
| test_step_mdp_speed[False-True-True-True-True] | 0.1222ms | 45.1666μs | 22.1402 KOps/s | 21.7233 KOps/s | +1.92% |
| test_step_mdp_speed[False-True-True-True-False] | 62.0310μs | 28.4176μs | 35.1895 KOps/s | 34.6742 KOps/s | +1.49% |
| test_step_mdp_speed[False-True-True-False-True] | 2.5285ms | 29.1547μs | 34.2998 KOps/s | 34.1422 KOps/s | +0.46% |
| test_step_mdp_speed[False-True-True-False-False] | 49.5510μs | 17.3408μs | 57.6674 KOps/s | 57.1014 KOps/s | +0.99% |
| test_step_mdp_speed[False-True-False-True-True] | 80.2010μs | 47.3705μs | 21.1102 KOps/s | 20.5692 KOps/s | +2.63% |
| test_step_mdp_speed[False-True-False-True-False] | 68.3320μs | 30.8000μs | 32.4675 KOps/s | 31.8353 KOps/s | +1.99% |
| test_step_mdp_speed[False-True-False-False-True] | 96.3820μs | 31.1782μs | 32.0737 KOps/s | 31.7046 KOps/s | +1.16% |
| test_step_mdp_speed[False-True-False-False-False] | 0.1450ms | 19.4943μs | 51.2970 KOps/s | 50.2925 KOps/s | +2.00% |
| test_step_mdp_speed[False-False-True-True-True] | 88.0820μs | 50.5050μs | 19.8000 KOps/s | 19.7342 KOps/s | +0.33% |
| test_step_mdp_speed[False-False-True-True-False] | 67.3910μs | 33.1162μs | 30.1967 KOps/s | 29.4299 KOps/s | +2.61% |
| test_step_mdp_speed[False-False-True-False-True] | 76.8420μs | 30.7384μs | 32.5326 KOps/s | 31.5058 KOps/s | +3.26% |
| test_step_mdp_speed[False-False-True-False-False] | 44.9510μs | 19.4317μs | 51.4624 KOps/s | 51.0050 KOps/s | +0.90% |
| test_step_mdp_speed[False-False-False-True-True] | 87.1620μs | 51.3900μs | 19.4590 KOps/s | 18.9697 KOps/s | +2.58% |
| test_step_mdp_speed[False-False-False-True-False] | 76.9510μs | 35.4571μs | 28.2031 KOps/s | 27.7365 KOps/s | +1.68% |
| test_step_mdp_speed[False-False-False-False-True] | 0.1020ms | 32.8386μs | 30.4519 KOps/s | 29.8678 KOps/s | +1.96% |
| test_step_mdp_speed[False-False-False-False-False] | 53.0410μs | 21.6242μs | 46.2446 KOps/s | 45.6760 KOps/s | +1.24% |
| test_values[generalized_advantage_estimate-True-True] | 25.5737ms | 23.7798ms | 42.0525 Ops/s | 42.8836 Ops/s | -1.94% |
| test_values[vec_generalized_advantage_estimate-True-True] | 99.4048ms | 2.8775ms | 347.5290 Ops/s | 361.5924 Ops/s | -3.89% |
| test_values[td0_return_estimate-False-False] | 0.1001ms | 76.0814μs | 13.1438 KOps/s | 13.0399 KOps/s | +0.80% |
| test_values[td1_return_estimate-False-False] | 52.7642ms | 52.4382ms | 19.0701 Ops/s | 19.2370 Ops/s | -0.87% |
| test_values[vec_td1_return_estimate-False-False] | 1.3418ms | 1.0683ms | 936.0887 Ops/s | 939.2093 Ops/s | -0.33% |
| test_values[td_lambda_return_estimate-True-False] | 83.8048ms | 83.3250ms | 12.0012 Ops/s | 12.0693 Ops/s | -0.56% |
| test_values[vec_td_lambda_return_estimate-True-False] | 1.3292ms | 1.0662ms | 937.9116 Ops/s | 940.5058 Ops/s | -0.28% |
| test_gae_speed[generalized_advantage_estimate-False-1-512] | 23.4945ms | 23.2943ms | 42.9290 Ops/s | 43.4674 Ops/s | -1.24% |
| test_gae_speed[vec_generalized_advantage_estimate-True-1-512] | 1.0152ms | 0.7492ms | 1.3348 KOps/s | 1.3683 KOps/s | -2.45% |
| test_gae_speed[vec_generalized_advantage_estimate-False-1-512] | 0.7870ms | 0.6473ms | 1.5450 KOps/s | 1.5543 KOps/s | -0.60% |
| test_gae_speed[vec_generalized_advantage_estimate-True-32-512] | 1.6262ms | 1.4705ms | 680.0335 Ops/s | 682.6832 Ops/s | -0.39% |
| test_gae_speed[vec_generalized_advantage_estimate-False-32-512] | 0.8286ms | 0.6612ms | 1.5124 KOps/s | 1.5185 KOps/s | -0.40% |
| test_dqn_speed[False-None] | 1.6668ms | 1.4865ms | 672.6995 Ops/s | 666.3357 Ops/s | +0.96% |
| test_dqn_speed[False-backward] | 2.2114ms | 2.0790ms | 480.9914 Ops/s | 481.3993 Ops/s | -0.08% |
| test_dqn_speed[True-None] | 0.7389ms | 0.5638ms | 1.7737 KOps/s | 1.7431 KOps/s | +1.76% |
| test_dqn_speed[True-backward] | 1.2802ms | 1.2180ms | 821.0381 Ops/s | 807.6731 Ops/s | +1.65% |
| test_dqn_speed[reduce-overhead-None] | 0.7649ms | 0.5682ms | 1.7601 KOps/s | 1.7585 KOps/s | +0.09% |
| test_dqn_speed[reduce-overhead-backward] | 1.2045ms | 1.0644ms | 939.4545 Ops/s | 929.0959 Ops/s | +1.11% |
| test_ddpg_speed[False-None] | 3.0928ms | 2.7827ms | 359.3668 Ops/s | 356.2543 Ops/s | +0.87% |
| test_ddpg_speed[False-backward] | 4.6237ms | 4.1173ms | 242.8795 Ops/s | 243.7255 Ops/s | -0.35% |
| test_ddpg_speed[True-None] | 1.5097ms | 1.3570ms | 736.9384 Ops/s | 743.8653 Ops/s | -0.93% |
| test_ddpg_speed[True-backward] | 2.6934ms | 2.6109ms | 383.0073 Ops/s | 384.7131 Ops/s | -0.44% |
| test_ddpg_speed[reduce-overhead-None] | 2.3818ms | 1.3609ms | 734.7810 Ops/s | 737.4233 Ops/s | -0.36% |
| test_ddpg_speed[reduce-overhead-backward] | 2.1365ms | 2.0403ms | 490.1319 Ops/s | 489.7495 Ops/s | +0.08% |
| test_sac_speed[False-None] | 8.3685ms | 7.9270ms | 126.1517 Ops/s | 126.1600 Ops/s | -0.01% |
| test_sac_speed[False-backward] | 11.4396ms | 11.0267ms | 90.6893 Ops/s | 91.1444 Ops/s | -0.50% |
| test_sac_speed[True-None] | 2.0364ms | 1.8513ms | 540.1487 Ops/s | 535.2624 Ops/s | +0.91% |
| test_sac_speed[True-backward] | 3.6894ms | 3.5627ms | 280.6824 Ops/s | 262.8395 Ops/s | **+6.79%** |
| test_sac_speed[reduce-overhead-None] | 20.1832ms | 11.5965ms | 86.2329 Ops/s | 81.7777 Ops/s | **+5.45%** |
| test_sac_speed[reduce-overhead-backward] | 1.7320ms | 1.5834ms | 631.5515 Ops/s | 613.7480 Ops/s | +2.90% |
| test_redq_speed[False-None] | 8.1137ms | 7.5746ms | 132.0197 Ops/s | 131.3103 Ops/s | +0.54% |
| test_redq_speed[False-backward] | 11.8886ms | 11.1871ms | 89.3888 Ops/s | 87.9872 Ops/s | +1.59% |
| test_redq_speed[True-None] | 2.4702ms | 2.3036ms | 434.0941 Ops/s | 425.3072 Ops/s | +2.07% |
| test_redq_speed[True-backward] | 4.4363ms | 4.0015ms | 249.9061 Ops/s | 235.8384 Ops/s | **+5.96%** |
| test_redq_speed[reduce-overhead-None] | 2.6418ms | 2.3395ms | 427.4369 Ops/s | 424.0560 Ops/s | +0.80% |
| test_redq_speed[reduce-overhead-backward] | 4.3211ms | 4.0278ms | 248.2756 Ops/s | 234.9015 Ops/s | **+5.69%** |
| test_redq_deprec_speed[False-None] | 9.2116ms | 8.8623ms | 112.8380 Ops/s | 111.7476 Ops/s | +0.98% |
| test_redq_deprec_speed[False-backward] | 12.2419ms | 11.7339ms | 85.2234 Ops/s | 82.6768 Ops/s | +3.08% |
| test_redq_deprec_speed[True-None] | 2.7593ms | 2.6022ms | 384.2953 Ops/s | 377.3585 Ops/s | +1.84% |
| test_redq_deprec_speed[True-backward] | 4.6303ms | 4.3043ms | 232.3263 Ops/s | 227.1226 Ops/s | +2.29% |
| test_redq_deprec_speed[reduce-overhead-None] | 2.7786ms | 2.6005ms | 384.5373 Ops/s | 378.5326 Ops/s | +1.59% |
| test_redq_deprec_speed[reduce-overhead-backward] | 4.7237ms | 4.3286ms | 231.0219 Ops/s | 230.6935 Ops/s | +0.14% |
| test_td3_speed[False-None] | 8.0887ms | 7.8280ms | 127.7458 Ops/s | 127.7959 Ops/s | -0.04% |
| test_td3_speed[False-backward] | 10.6817ms | 10.1504ms | 98.5187 Ops/s | 99.5941 Ops/s | -1.08% |
| test_td3_speed[True-None] | 1.6715ms | 1.6413ms | 609.2763 Ops/s | 604.5574 Ops/s | +0.78% |
| test_td3_speed[True-backward] | 3.3869ms | 3.2082ms | 311.7030 Ops/s | 310.4580 Ops/s | +0.40% |
| test_td3_speed[reduce-overhead-None] | 76.9113ms | 26.2680ms | 38.0691 Ops/s | 38.5228 Ops/s | -1.18% |
| test_td3_speed[reduce-overhead-backward] | 1.3783ms | 1.3071ms | 765.0713 Ops/s | 753.3192 Ops/s | +1.56% |
| test_cql_speed[False-None] | 16.8484ms | 16.4041ms | 60.9605 Ops/s | 60.0111 Ops/s | +1.58% |
| test_cql_speed[False-backward] | 21.7610ms | 21.3420ms | 46.8559 Ops/s | 46.5055 Ops/s | +0.75% |
| test_cql_speed[True-None] | 3.6041ms | 3.3667ms | 297.0302 Ops/s | 305.7826 Ops/s | -2.86% |
| test_cql_speed[True-backward] | 6.3192ms | 5.7279ms | 174.5827 Ops/s | 180.5890 Ops/s | -3.33% |
| test_cql_speed[reduce-overhead-None] | 0.5885s | 16.3420ms | 61.1921 Ops/s | 76.8046 Ops/s | **-20.33%** |
| test_cql_speed[reduce-overhead-backward] | 2.1063ms | 1.9634ms | 509.3130 Ops/s | 552.5936 Ops/s | **-7.83%** |
| test_a2c_speed[False-None] | 3.4954ms | 3.0924ms | 323.3745 Ops/s | 314.8990 Ops/s | +2.69% |
| test_a2c_speed[False-backward] | 6.7507ms | 6.1179ms | 163.4537 Ops/s | 168.0897 Ops/s | -2.76% |
| test_a2c_speed[True-None] | 1.7512ms | 1.3539ms | 738.6213 Ops/s | 716.4577 Ops/s | +3.09% |
| test_a2c_speed[True-backward] | 3.1767ms | 3.0544ms | 327.3960 Ops/s | 321.2618 Ops/s | +1.91% |
| test_a2c_speed[reduce-overhead-None] | 16.0075ms | 9.0387ms | 110.6358 Ops/s | 109.6451 Ops/s | +0.90% |
| test_a2c_speed[reduce-overhead-backward] | 1.7282ms | 1.6012ms | 624.5292 Ops/s | 609.7170 Ops/s | +2.43% |
| test_ppo_speed[False-None] | 3.7511ms | 3.5872ms | 278.7726 Ops/s | 274.4483 Ops/s | +1.58% |
| test_ppo_speed[False-backward] | 7.1567ms | 6.8258ms | 146.5039 Ops/s | 146.1759 Ops/s | +0.22% |
| test_ppo_speed[True-None] | 1.5823ms | 1.4234ms | 702.5625 Ops/s | 691.7229 Ops/s | +1.57% |
| test_ppo_speed[True-backward] | 3.4494ms | 3.2356ms | 309.0638 Ops/s | 303.3332 Ops/s | +1.89% |
| test_ppo_speed[reduce-overhead-None] | 1.1214ms | 0.9651ms | 1.0362 KOps/s | 1.0299 KOps/s | +0.61% |
| test_ppo_speed[reduce-overhead-backward] | 1.5418ms | 1.4156ms | 706.4029 Ops/s | 618.9616 Ops/s | **+14.13%** |
| test_reinforce_speed[False-None] | 2.4742ms | 2.2237ms | 449.7053 Ops/s | 448.1754 Ops/s | +0.34% |
| test_reinforce_speed[False-backward] | 3.5760ms | 3.1971ms | 312.7824 Ops/s | 303.7154 Ops/s | +2.99% |
| test_reinforce_speed[True-None] | 1.4575ms | 1.2984ms | 770.1975 Ops/s | 760.4488 Ops/s | +1.28% |
| test_reinforce_speed[True-backward] | 3.1206ms | 2.9445ms | 339.6131 Ops/s | 322.8599 Ops/s | **+5.19%** |
| test_reinforce_speed[reduce-overhead-None] | 18.3309ms | 10.1606ms | 98.4190 Ops/s | 97.8610 Ops/s | +0.57% |
| test_reinforce_speed[reduce-overhead-backward] | 1.6553ms | 1.4824ms | 674.5959 Ops/s | 594.5980 Ops/s | **+13.45%** |
| test_iql_speed[False-None] | 9.3990ms | 8.9961ms | 111.1594 Ops/s | 109.4823 Ops/s | +1.53% |
| test_iql_speed[False-backward] | 13.0236ms | 12.4616ms | 80.2468 Ops/s | 76.8274 Ops/s | +4.45% |
| test_iql_speed[True-None] | 2.4665ms | 2.2521ms | 444.0339 Ops/s | 436.6906 Ops/s | +1.68% |
| test_iql_speed[True-backward] | 5.0001ms | 4.7840ms | 209.0284 Ops/s | 203.1946 Ops/s | +2.87% |
| test_iql_speed[reduce-overhead-None] | 0.5266s | 13.1275ms | 76.1758 Ops/s | 88.9346 Ops/s | **-14.35%** |
| test_iql_speed[reduce-overhead-backward] | 1.9386ms | 1.8608ms | 537.3941 Ops/s | 483.4216 Ops/s | **+11.16%** |
| test_rb_sample[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] | 7.7333ms | 6.2335ms | 160.4244 Ops/s | 158.6363 Ops/s | +1.13% |
| test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] | 0.5767ms | 0.3444ms | 2.9032 KOps/s | 3.7303 KOps/s | **-22.17%** |
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] 0.6744ms 0.3310ms 3.0209 KOps/s 4.0364 KOps/s $\textbf{\color{#d91a1a}-25.16\%}$
test_rb_sample[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] 6.2986ms 5.9707ms 167.4854 Ops/s 165.5467 Ops/s $\color{#35bf28}+1.17\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] 0.6225ms 0.2932ms 3.4101 KOps/s 3.2599 KOps/s $\color{#35bf28}+4.61\%$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] 0.5317ms 0.2864ms 3.4917 KOps/s 3.6258 KOps/s $\color{#d91a1a}-3.70\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-sampler6-10000] 1.6540ms 1.3589ms 735.9141 Ops/s 735.0805 Ops/s $\color{#35bf28}+0.11\%$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-sampler7-10000] 1.4890ms 1.2641ms 791.0674 Ops/s 790.4726 Ops/s $\color{#35bf28}+0.08\%$
test_rb_sample[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] 6.3377ms 6.1034ms 163.8434 Ops/s 161.0553 Ops/s $\color{#35bf28}+1.73\%$
test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] 2.1594ms 0.4101ms 2.4384 KOps/s 2.4736 KOps/s $\color{#d91a1a}-1.42\%$
test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] 0.6776ms 0.4287ms 2.3325 KOps/s 2.6061 KOps/s $\textbf{\color{#d91a1a}-10.50\%}$
test_rb_iterate[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] 6.2526ms 6.0751ms 164.6052 Ops/s 164.9051 Ops/s $\color{#d91a1a}-0.18\%$
test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] 1.9419ms 0.3110ms 3.2154 KOps/s 3.1560 KOps/s $\color{#35bf28}+1.88\%$
test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] 0.5012ms 0.3012ms 3.3200 KOps/s 3.3611 KOps/s $\color{#d91a1a}-1.23\%$
test_rb_iterate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] 8.2274ms 5.9629ms 167.7039 Ops/s 165.9193 Ops/s $\color{#35bf28}+1.08\%$
test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] 1.5141ms 0.2704ms 3.6982 KOps/s 3.1483 KOps/s $\textbf{\color{#35bf28}+17.46\%}$
test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] 0.5490ms 0.2494ms 4.0097 KOps/s 3.7153 KOps/s $\textbf{\color{#35bf28}+7.92\%}$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] 6.3462ms 6.1471ms 162.6774 Ops/s 161.1287 Ops/s $\color{#35bf28}+0.96\%$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] 2.2083ms 0.4403ms 2.2710 KOps/s 1.9919 KOps/s $\textbf{\color{#35bf28}+14.01\%}$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] 0.6143ms 0.4183ms 2.3906 KOps/s 2.2839 KOps/s $\color{#35bf28}+4.67\%$
test_rb_populate[TensorDictReplayBuffer-ListStorage-RandomSampler-400] 7.0305ms 5.3854ms 185.6861 Ops/s 178.8546 Ops/s $\color{#35bf28}+3.82\%$
test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-400] 9.5134ms 2.0604ms 485.3362 Ops/s 418.6858 Ops/s $\textbf{\color{#35bf28}+15.92\%}$
test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-400] 7.2791ms 1.2131ms 824.3460 Ops/s 863.5066 Ops/s $\color{#d91a1a}-4.54\%$
test_rb_populate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-400] 7.1191ms 5.5297ms 180.8407 Ops/s 176.3301 Ops/s $\color{#35bf28}+2.56\%$
test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-400] 9.1613ms 2.0558ms 486.4299 Ops/s 445.1599 Ops/s $\textbf{\color{#35bf28}+9.27\%}$
test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-400] 6.9462ms 1.1573ms 864.0497 Ops/s 769.0764 Ops/s $\textbf{\color{#35bf28}+12.35\%}$
test_rb_populate[TensorDictPrioritizedReplayBuffer-ListStorage-None-400] 0.5259s 16.0071ms 62.4723 Ops/s 30.3997 Ops/s $\textbf{\color{#35bf28}+105.50\%}$
test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-400] 10.6011ms 2.3325ms 428.7255 Ops/s 459.9849 Ops/s $\textbf{\color{#d91a1a}-6.80\%}$
test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-400] 2.4057ms 1.2670ms 789.2529 Ops/s 872.0106 Ops/s $\textbf{\color{#d91a1a}-9.49\%}$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-True] 13.5002ms 13.1932ms 75.7964 Ops/s 73.5789 Ops/s $\color{#35bf28}+3.01\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-False] 18.0830ms 16.6228ms 60.1582 Ops/s 57.9994 Ops/s $\color{#35bf28}+3.72\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-True] 18.0245ms 17.7148ms 56.4499 Ops/s 55.3184 Ops/s $\color{#35bf28}+2.05\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-False] 19.3105ms 16.9706ms 58.9255 Ops/s 59.9899 Ops/s $\color{#d91a1a}-1.77\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-True] 18.4967ms 17.8209ms 56.1137 Ops/s 55.2563 Ops/s $\color{#35bf28}+1.55\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-False] 20.1151ms 18.4728ms 54.1336 Ops/s 55.4983 Ops/s $\color{#d91a1a}-2.46\%$
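
For readers checking the last column: the percentages appear to be the relative difference between the two Ops/s columns. The sketch below assumes the first Ops/s value is this PR's run and the second is the baseline, which is inferred from the numbers and not stated in these rows. The helper name `relative_change` is hypothetical and not part of the benchmark tooling.

```python
def relative_change(ops_pr: float, ops_base: float) -> float:
    """Percent change in throughput of the PR run relative to the baseline run.

    Assumption: the first Ops/s column is the PR run, the second the baseline.
    """
    return (ops_pr - ops_base) / ops_base * 100.0


# Two rows copied from the table above: (Ops/s for the PR, Ops/s for the baseline).
rows = {
    "test_cql_speed[reduce-overhead-None]": (61.1921, 76.8046),
    "test_rb_populate[TensorDictPrioritizedReplayBuffer-ListStorage-None-400]": (62.4723, 30.3997),
}

for name, (ops_pr, ops_base) in rows.items():
    print(f"{name}: {relative_change(ops_pr, ops_base):+.2f}%")
```

Under that assumption the two rows reproduce the reported -20.33% and +105.50%. Some of the large swings come with a maximum time far above the mean (0.5885s against 16.3420ms in the first row), so they may reflect noise in a single run more than a real change.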

vmoens added a commit that referenced this pull request Mar 8, 2025
ghstack-source-id: df39fd2e4cd72f24c645b0ac32b46ab3e8d847fc
Pull Request resolved: #2840

(cherry picked from commit 6e40548)
vmoens added a commit that referenced this pull request Mar 10, 2025
ghstack-source-id: df39fd2e4cd72f24c645b0ac32b46ab3e8d847fc
Pull Request resolved: #2840

(cherry picked from commit 6e40548)