[XPU] use allgather and fp32 multinomial for XPU #8787
Thanks for your contribution!
Codecov Report
Attention: Patch coverage is
Additional details and impacted files
@@            Coverage Diff             @@
##           develop    #8787     +/-   ##
===========================================
+ Coverage    55.44%   55.49%   +0.05%
===========================================
  Files          626      626
  Lines        98064    98069       +5
===========================================
+ Hits         54367    54427      +60
+ Misses       43697    43642      -55
@@ -42,7 +42,7 @@
     load_state_dict,
 )
 from ...transformers.utils import get_checkpoint_shard_files, weight_name_suffix
-from ...utils.distributed import distributed_gather
+from ...utils.distributed import distributed_allgather, distributed_gather
 from ...utils.env import LORA_WEIGHTS_NAME, SAFE_PEFT_WEIGHTS_INDEX_NAME
Can distributed_allgather and distributed_gather both be imported directly in the interface shared by XPU and GPU?
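Reference code: #8697
The reason for this change is that XPU does not currently support gather.
gather and allgather give the same result on rank 0 and differ only on the other ranks. The code path is guarded by is_dst, so whatever the non-zero ranks receive is discarded anyway. That is why gather was simply replaced with allgather.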
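The equivalence argued in this comment can be sketched with a pure-Python simulation. No real collective communication happens here, and `simulated_gather`, `simulated_allgather`, and `keep_on_dst` are hypothetical stand-ins written for illustration; they are not the PaddleNLP helpers `distributed_gather` and `distributed_allgather`, whose signatures are not shown in this PR.

```python
# Pure-Python simulation of the two collectives across a few "ranks".
# All names below are hypothetical illustrations, not PaddleNLP APIs.

def simulated_gather(per_rank_values, dst=0):
    """Only the destination rank receives the full list; other ranks get None."""
    world_size = len(per_rank_values)
    return [list(per_rank_values) if rank == dst else None
            for rank in range(world_size)]


def simulated_allgather(per_rank_values):
    """Every rank receives the full list."""
    return [list(per_rank_values) for _ in per_rank_values]


def keep_on_dst(results, dst=0):
    """Mimic an is_dst-style guard: non-destination ranks drop their result."""
    return [res if rank == dst else None for rank, res in enumerate(results)]


values = ["shard0", "shard1", "shard2"]  # one value contributed by each rank
after_gather = keep_on_dst(simulated_gather(values))
after_allgather = keep_on_dst(simulated_allgather(values))

# Once the guard is applied, both paths leave the same data on every rank.
print(after_gather == after_allgather)  # True
```

The trade-off is that allgather delivers the full result to every rank, so the non-destination ranks receive data they then throw away.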
LGTM
PR types
New features
PR changes
Models
Description
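Two changes are made to accommodate XPU:
- Inputs to the multinomial operator are cast to float32 before the call, similar to: [GCU] Support llama for GCU #8445.
- Calls to distributed_gather are replaced with distributed_allgather, similar to: xpu use allgather #8697.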