Actions: deepspeedai/DeepSpeed

hpu-gaudi2

1,571 workflow runs

Add DeepseekV3 AutoTP.
hpu-gaudi2 #1700: Pull request #7045 synchronize by tjruwase
February 23, 2025 20:04 58m 9s Yejing-Lai:lyj/deepseekv3
hpu-gaudi2
hpu-gaudi2 #1699: Scheduled
February 23, 2025 00:12 1h 53m 40s master
hpu-gaudi2
hpu-gaudi2 #1698: Scheduled
February 22, 2025 00:11 2h 3m 1s master
Improve overflow handling in ZeRO
hpu-gaudi2 #1697: Pull request #6976 synchronize by tjruwase
February 21, 2025 22:51 56m 59s olruwase/ds_5241
Enable ZeRO set/get APIs for NVMe offload
hpu-gaudi2 #1696: Pull request #7046 synchronize by loadams
February 21, 2025 20:59 58m 30s olruwase/update_nvme_offload_states
Enable ZeRO set/get APIs for NVMe offload
hpu-gaudi2 #1695: Pull request #7046 synchronize by tjruwase
February 21, 2025 12:06 57m 2s olruwase/update_nvme_offload_states
Enabled configurable auto Tensor Parallelism (TP) for the inference of diverse models
hpu-gaudi2 #1694: Pull request #6553 synchronize by gyou2021
February 21, 2025 06:01 Action required gyou2021:configurable_autoTP
hpu-gaudi2
hpu-gaudi2 #1693: Scheduled
February 21, 2025 00:11 2h 4m 12s master
Bug Fix for offload_states API
hpu-gaudi2 #1692: Pull request #7050 synchronize by tohtana
February 20, 2025 18:23 1h 52m 17s U-rara:bugfix_reload_states
Fix, pipeline model with moe cause error when send grad
hpu-gaudi2 #1691: Pull request #7055 synchronize by hwchen2017
February 20, 2025 18:12 1h 57m 43s wukong1992:fix-pipe-act-grad-comm
Bug Fix for offload_states API
hpu-gaudi2 #1689: Pull request #7050 synchronize by U-rara
February 20, 2025 15:30 59m 44s U-rara:bugfix_reload_states
Enabled configurable auto Tensor Parallelism (TP) for the inference of diverse models
hpu-gaudi2 #1688: Pull request #6553 synchronize by loadams
February 20, 2025 15:29 Action required gyou2021:configurable_autoTP
Fix, bf16 optimizer remove dup loop
hpu-gaudi2 #1683: Pull request #7054 synchronize by hwchen2017
February 20, 2025 05:49 56m 30s wukong1992:fix-bf16-moe-refresh-params
Fix, bf16 optimizer remove dup loop
hpu-gaudi2 #1681: Pull request #7054 synchronize by wukong1992
February 20, 2025 03:09 Action required wukong1992:fix-bf16-moe-refresh-params
hpu-gaudi2
hpu-gaudi2 #1680: Scheduled
February 20, 2025 00:11 2h 30m 10s master
Training multiple models
hpu-gaudi2 #1679: Pull request #7018 synchronize by loadams
February 19, 2025 23:37 57m 30s olruwase/zero_multi_models
add autoTP training zero2 tests
hpu-gaudi2 #1678: Pull request #7049 synchronize by tjruwase
February 19, 2025 18:50 1h 11m 53s inkcherry:minor_fix_version2
Fix, bf16 optimizer remove dup loop
hpu-gaudi2 #1677: Pull request #7054 synchronize by tjruwase
February 19, 2025 18:44 57m 4s wukong1992:fix-bf16-moe-refresh-params
Enable ZeRO set/get APIs for NVMe offload
hpu-gaudi2 #1676: Pull request #7046 synchronize by loadams
February 19, 2025 17:47 57m 19s olruwase/update_nvme_offload_states
Variable batch size and LR scheduler (moved to #7104)
hpu-gaudi2 #1674: Pull request #7020 synchronize by bm-synth
February 19, 2025 15:44 Action required bm-synth:variable_batch_size_and_lr
Fix, pipeline model with moe cause error when send grad
hpu-gaudi2 #1673: Pull request #7055 opened by wukong1992
February 19, 2025 11:53 Action required wukong1992:fix-pipe-act-grad-comm