-
Notifications
You must be signed in to change notification settings - Fork 4.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
autotp training(fix dco) #7004
autotp training(fix dco) #7004
Conversation
Signed-off-by: inkcherry <mingzhi.liu@intel.com>
Kudos @inkcherry for contributing AutoTP training! It's a nice feature make tensor parallel training/finetuning more available to HF model users. I think a tutorial page would help user discover and learn how to use this feature in DeepSpeed. Is it possible to write a tutorial and add it under https://github.com/deepspeedai/DeepSpeed/tree/master/docs/_tutorials introducing steps how to use this feature? I remember you have an example training alpaca with DeepSpeed AutoTP. |
Same as [this PR](#6922). [affeb88](affeb88) I noticed the CI updated the DCO check recently. Using the suggested rebase method for sign-off would reintroduce many conflicts, so I opted for a squash merge with sign-off instead. thanks: ) Signed-off-by: inkcherry <mingzhi.liu@intel.com> Signed-off-by: Olatunji Ruwase <olruwase@microsoft.com>
Same as [this PR](deepspeedai#6922). [affeb88](deepspeedai@affeb88) I noticed the CI updated the DCO check recently. Using the suggested rebase method for sign-off would reintroduce many conflicts, so I opted for a squash merge with sign-off instead. thanks: ) Signed-off-by: inkcherry <mingzhi.liu@intel.com>
Same as [this PR](deepspeedai#6922). [affeb88](deepspeedai@affeb88) I noticed the CI updated the DCO check recently. Using the suggested rebase method for sign-off would reintroduce many conflicts, so I opted for a squash merge with sign-off instead. thanks: ) Signed-off-by: inkcherry <mingzhi.liu@intel.com> Signed-off-by: siqi <siqi@tecorigin.com>
Yes, I will add some document soon~ |
@inkcherry, I think a blog would be appropriate to publicize this amazing technology. Although blogs can be a bit work, we will be glad to collaborate and jointly advertise. |
Same as [this PR](#6922). [affeb88](affeb88) I noticed the CI updated the DCO check recently. Using the suggested rebase method for sign-off would reintroduce many conflicts, so I opted for a squash merge with sign-off instead. thanks: ) Signed-off-by: inkcherry <mingzhi.liu@intel.com> Signed-off-by: Logan Adams <loadams@microsoft.com>
Same as [this PR](deepspeedai#6922). [affeb88](deepspeedai@affeb88) I noticed the CI updated the DCO check recently. Using the suggested rebase method for sign-off would reintroduce many conflicts, so I opted for a squash merge with sign-off instead. thanks: ) Signed-off-by: inkcherry <mingzhi.liu@intel.com>
Same as [this PR](deepspeedai#6922). [affeb88](deepspeedai@affeb88) I noticed the CI updated the DCO check recently. Using the suggested rebase method for sign-off would reintroduce many conflicts, so I opted for a squash merge with sign-off instead. thanks: ) Signed-off-by: inkcherry <mingzhi.liu@intel.com>
Same as [this PR](deepspeedai#6922). [affeb88](deepspeedai@affeb88) I noticed the CI updated the DCO check recently. Using the suggested rebase method for sign-off would reintroduce many conflicts, so I opted for a squash merge with sign-off instead. thanks: ) Signed-off-by: inkcherry <mingzhi.liu@intel.com> Signed-off-by: gyou2021 <ganmei.you@intel.com>
@inkcherry @loadams @tjruwase keep_module_on_host is currently ignored and not used in auto_tp. |
@inkcherry, can you please help address issues raised by @oelayan7. It seems we need to re-apply the lost changes from #6846. |
@oelayan7 I will fix it later. Thanks! |
Same as [this PR](deepspeedai#6922). [affeb88](deepspeedai@affeb88) I noticed the CI updated the DCO check recently. Using the suggested rebase method for sign-off would reintroduce many conflicts, so I opted for a squash merge with sign-off instead. thanks: ) Signed-off-by: inkcherry <mingzhi.liu@intel.com>
Same as [this PR](#6922). [affeb88](affeb88) I noticed the CI updated the DCO check recently. Using the suggested rebase method for sign-off would reintroduce many conflicts, so I opted for a squash merge with sign-off instead. thanks: ) Signed-off-by: inkcherry <mingzhi.liu@intel.com> Signed-off-by: Masahiro Tanaka <mtanaka@microsoft.com>
Same as [this PR](deepspeedai#6922). [affeb88](deepspeedai@affeb88) I noticed the CI updated the DCO check recently. Using the suggested rebase method for sign-off would reintroduce many conflicts, so I opted for a squash merge with sign-off instead. thanks: ) Signed-off-by: inkcherry <mingzhi.liu@intel.com> Signed-off-by: yisheng <yi.sheng@intel.com>
Same as this PR. affeb88
I noticed the CI updated the DCO check recently. Using the suggested rebase method for sign-off would reintroduce many conflicts, so I opted for a squash merge with sign-off instead. thanks: )