-
Notifications
You must be signed in to change notification settings - Fork 436
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
UCT: Removed XGVMI BF2 support (umem) #10476
base: master
Are you sure you want to change the base?
Conversation
src/uct/ib/mlx5/dv/ib_mlx5dv_md.c
Outdated
} | ||
|
||
return uct_ib_mlx5_devx_xgvmi_umem_mr(md, memh); | ||
return UCS_ERR_UNSUPPORTED; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- can return UNSUPPORTED after the if in line 2728
- shall we check UCT_IB_MLX5_MD_FLAG_INDIRECT_XGVMI support in order to report if the MD supports exported key?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
-
Yes, that's the desired behavior
We noticed that in some cases UCT_IB_MLX5_MD_FLAG_INDIRECT_XGVMI capability is advertised, but not actually supported. In this case we detect this and revoke this capability (line 2743). In this case we return UNSUPPORTED -
Ok, we already kind of do that:
if (uct_ib_mlx5_devx_check_xgvmi(cap_2, uct_ib_device_name(dev))) {
md->flags |= UCT_IB_MLX5_MD_FLAG_INDIRECT_XGVMI;
md->super.cap_flags |= UCT_MD_FLAG_EXPORTED_MKEY;
}
There is a subtle difference between INDIRECT_XGVMI and EXPORTED_MKEY. The first one could be revoked if later on we detect that XGVMI is not supported at runtime
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
so it means we remove the fallback approach for supporting exported mkey, right?
and we expose exported mkey support in MD caps, while it may not be really possible to support.
Would it be possible to check somehow whether we can create ksms for exported key during MD creation? Then we would not need UCT_IB_MLX5_MD_FLAG_INDIRECT_XGVMI
at all
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's right
can create ksms for exported key during MD creation?
I have another idea.
If I understand correctly it is only BF2 device who advertizes INDIRECT_XGVMI
support, but then fails to register it. Maybe we just disable EXPORTED_MKEY
on BF2 (it is no-op anyway without UMEM fallback)?
And then we can remove UCT_IB_MLX5_MD_FLAG_INDIRECT_XGVMI
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sounds good to me
2ea3965
to
04c837f
Compare
/azp run UCX PR |
Azure Pipelines successfully started running 1 pipeline(s). |
/* | ||
* This check does not guarantee that XGVMI works with indirect mkeys. | ||
* We can filter out devices that does not support XGVMI at all. | ||
*/ | ||
result = uct_ib_mlx5_devx_check_xgvmi(cap); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
will it guarantee?
/* | |
* This check does not guarantee that XGVMI works with indirect mkeys. | |
* We can filter out devices that does not support XGVMI at all. | |
*/ | |
result = uct_ib_mlx5_devx_check_xgvmi(cap); | |
result = uct_ib_mlx5_devx_check_xgvmi(cap) && | |
uct_ib_mlx5_devx_allow_xgvmi_access(device, uct_ib_mlx5_flush_rkey_make(), 1); |
+minor refactoring of uct_ib_mlx5_devx_allow_xgvmi_access
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would love it to be that simple, but of course it doesn't fly.
We need to allow XGVMI on some real valid mkey, not a dummy one like this
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
that's the difference with
if ((cap_2 != NULL) && (md->flush_mr != NULL) &&
(md->super.flush_rkey != UCT_IB_MD_INVALID_FLUSH_RKEY) &&
uct_ib_mlx5_devx_check_xgvmi(cap_2)) {
status = uct_ib_mlx5_devx_allow_xgvmi_access(
md, md->super.flush_rkey, 1);
in md_open?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The difference is that flush_rkey is a real mkey created in md_open in uct_ib_mlx5_devx_init_flush_mr
.
I just reuse that key for XGVMI check
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok, I missed devx path.. but looking into uct_ib_mlx5_devx_init_flush_mr
and uct_ib_mlx5_md_buf_alloc
it seems like MD is not really required, we have ibv_context
and can reg/dereg local buffer to build temporary key. Do I miss something else?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you pls check the case with ucs_info -d
described in #10372 which I fixed recently? I guess this PR in current state is a kind of regression..
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm trying to implement it the way you proposed, it should be consistent then
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, it's implemented, with a little hack to avoid massive refactoring.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
so it LGTM if uct_ib_mlx5_devx_dummy_md_init/clenup will be removed in next PR with corresponding refactoring, I think it could be useful for IB MD as well for other caps which can be detected only in runtime.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is another tricky issue with this approach..
I see that many fork tests are failing in CI, and the root cause is the initialization order: we must call uct_ib_fork_init
before any ibv_create_*
call:
#686
/*
169 * ibv_fork_init testing - if fork support is requested then ibv_fork_init
170 * should be called right at the beginning of the verbs initialization flow, before ibv_create_* call.
171 *
172 * Known limitations:
173 * If ibv_fork_init is called after ibv_create_* functions - it will have no effect.
174 * OMPI initializes verbs many times during initialization in the following verbs components:
175 * oob/ud, btl/openib, mtl/mxm, pml/yalla, oshmem/ikrit, oshmem/yoda, ompi/mca/coll/{fca,hcoll}
176 *
177 * So, ibv_fork_init should be called once, in the beginning of the init flow of every verb component
178 * to proper request fork support.
179 *
180 */
Violation of this rule leads to warnings in tests and potential fork errors at runtime.
So I tend to revert the change for detecting XGVMI support before MR is created: it was already a bit hack-ish, and with this fork limitation it is just not worth it
e59abc5
to
e2baa4f
Compare
* indirect mkeys is actually supported. Therefore we do this check by | ||
* allowing XGVMI on indirect KSM flush_rkey | ||
*/ | ||
if ((cap_2 != NULL) && (md->flush_mr != NULL) && |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
in which case md->flush_mr is NULL? maybe we can assert it is not NULL?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Possible error outputs of uct_ib_mlx5_devx_init_flush_mr
:
- flush_rkey=INVALID, flush_mr=NULL - can be if
UCT_IB_MLX5_MD_FLAG_KSM
is not supported (which is also a correct precondition for XGVMI) - flush_rkey=DUMMY, flush_mr=NULL - if uct_ib_reg_mr or KSM registration fails
So assertion on flush_mr!=NULL
is not good, because it can be a valid device without KSM reg. At least currently we handle this situation gracefully in uct_ib_mlx5_devx_init_flush_mr
.
In principle it's enough to check that flush_mr!=NULL
, but I also check flush_rkey
is valid just for extra safety
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
so maybe assert flush_rkey is not invalid? instead of checking it as part of the if condition
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But if we do assert(flaush_rkey != INVALID)
then it will crash on HW where KSM is not supported, right?
I guess this is BF2. It is actually the first point that I described in my comment - KSM is not supported is pretty valid use case IMO
/azp run UCX PR |
Azure Pipelines successfully started running 1 pipeline(s). |
What?
This PR discontinues XGVMI support on BF2 devices, by removing umem fallback XGVMI registration
However the tricky part here is that FW capability does not allow to detect whether indirect XGVMI mkeys are actually supported. So we do this check by trying to allow XGVMI access on flush_rkey during MD open.
This has consequences for GGA as well: see the changes