Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[XPU] avoid malloc redundant memory when creating context in comm man… #64139

Merged
merged 1 commit into from
May 10, 2024

Conversation

cqulilujia
Copy link
Contributor

@cqulilujia cqulilujia commented May 9, 2024

PR Category

Communication Library

PR Types

Bug fixes

Description

fix the redundant malloc caused my XPUAPI_DEFAULT_SIZE when creating context in comm manager.
related PR: #63817

Copy link

paddle-bot bot commented May 9, 2024

你的PR提交成功,感谢你对开源项目的贡献!
请关注后续CI自动化测试结果,详情请参考Paddle-CI手册
Your PR has been submitted. Thanks for your contribution!
Please wait for the result of CI firstly. See Paddle CI Manual for details.

Copy link
Contributor

@houj04 houj04 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

我感觉没问题,但还需要更熟悉这块的同学一起看下。

@@ -38,7 +38,8 @@ class XPUContext : public DeviceContext,
public:
XPUContext();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

默认构造函数是不是也一起加上这个选项

@@ -38,7 +38,8 @@ class XPUContext : public DeviceContext,
public:
XPUContext();

explicit XPUContext(const XPUPlace&);
// is_comm_context = 1 for init comm context with gm_size=1 and l3_size=1
explicit XPUContext(const XPUPlace&, bool is_comm_context = 0);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

这里为啥默认值不直接置为false而要用int类型0呢?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

编译的时候应该会被优化掉

@@ -211,8 +211,9 @@ void CommContextManager::CreateBKCLCommContext(
std::make_unique<BKCLCommContext>(rank, size, bkcl_id);

if (CommContextManager::device_id != -1) {
std::unique_ptr<phi::XPUContext> dev_ctx(
new phi::XPUContext(phi::XPUPlace(CommContextManager::device_id)));
bool is_comm_context = 1;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

同上,另外感觉这个变量只是作为参数传进构造函数好像没啥必要

Copy link
Contributor Author

@cqulilujia cqulilujia May 10, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

这里可以不创建变量,我改掉

@XiaociZhang
Copy link
Contributor

可以先这么绕开,回头我看看是不是直接把dev_ctx拿掉。

@houj04 houj04 merged commit e7208bb into PaddlePaddle:develop May 10, 2024
30 of 31 checks passed
co63oc pushed a commit to co63oc/Paddle that referenced this pull request May 10, 2024
co63oc pushed a commit to co63oc/Paddle that referenced this pull request May 11, 2024
@XiaociZhang
Copy link
Contributor

可以先这么绕开,回头我看看是不是直接把dev_ctx拿掉。

update: 不太容易拿掉,unified_comm下有个地方需要取这个ctx,为了后续修改方便还是和gpu保持一致
去掉xdnn里bufferMgr的报错log:#64396

@houj04 houj04 added the XPU label Sep 4, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants