Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

cgroups: ebpf: use link.Anchor to check for BPF_F_REPLACE support #4548

Merged
merged 3 commits into from
Dec 7, 2024

Conversation

cyphar
Copy link
Member

@cyphar cyphar commented Dec 6, 2024

In v0.13.0, cilium/ebpf stopped supporting setting BPF_F_REPLACE as an
explicit flag and instead requires us to use link.Anchor to specify
where the program should be attached.

Commit 216175a ("Upgrade Cilium's eBPF library version to 0.16")
did update this correctly for the actual attaching logic, but when
checking for kernel support we still passed BPF_F_REPLACE. This would
result in a generic error being returned, which our feature-support
checking logic would treat as being an error the indicates that
BPF_F_REPLACE is supported, resulting in a regression on pre-5.6
kernels.

It turns out that our debug logging saying that this unexpected error
was happening was being output as a result of this change, but nobody
noticed...

Fixes: 216175a ("Upgrade Cilium's eBPF library version to 0.16")
Fixes #3008
Signed-off-by: Aleksa Sarai cyphar@cyphar.com

In v0.13.0, cilium/ebpf stopped supporting setting BPF_F_REPLACE as an
explicit flag and instead requires us to use link.Anchor to specify
where the program should be attached.

Commit 216175a ("Upgrade Cilium's eBPF library version to 0.16")
did update this correctly for the actual attaching logic, but when
checking for kernel support we still passed BPF_F_REPLACE. This would
result in a generic error being returned, which our feature-support
checking logic would treat as being an error the indicates that
BPF_F_REPLACE *is* supported, resulting in a regression on pre-5.6
kernels.

It turns out that our debug logging saying that this unexpected error
was happening was being output as a result of this change, but nobody
noticed...

Fixes: 216175a ("Upgrade Cilium's eBPF library version to 0.16")
Signed-off-by: Aleksa Sarai <cyphar@cyphar.com>
It is possible for LinkAttachProgram to return ErrNotSupported if
program attachment is not supported at all (which doesn't matter in this
case), but it seems possible that upstream will start returning
ErrNotSupported for BPF_F_REPLACE at some point so it's best to make
sure we don't cause additional regressions here.

Signed-off-by: Aleksa Sarai <cyphar@cyphar.com>
If we get an unexpected error here, it is probably because of a library
or kernel change that could cause our detection logic to be invalid. As
a result, these warnings should be louder so users have a chance to tell
us about them sooner (or so we might notice them before doing a release,
as happened with the 1.2.0 regression).

Signed-off-by: Aleksa Sarai <cyphar@cyphar.com>
@cyphar cyphar added the backport/1.2-todo A PR in main branch which needs to be backported to release-1.2 label Dec 6, 2024
@cyphar cyphar added this to the 1.2.3 milestone Dec 6, 2024
@cyphar
Copy link
Member Author

cyphar commented Dec 6, 2024

It turns out we've had this debug logging message since #4397 was merged (which indicated that we had a regression on #3008), but nobody noticed:

DEBU[0000]libcontainer/cgroups/devices/ebpf_linux.go:139 libcontainer/cgroups/devices.haveBpfProgReplace.func1() checking for BPF_F_REPLACE: got unexpected (not EBADF or EINVAL) error: disallowed flags: use Anchor to specify attach target

@rafaelroquetto
Copy link
Contributor

Nice catch, thanks for spotting this. If there's anything in particular I can do to help let me know.

Copy link
Contributor

@kolyshkin kolyshkin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM except a minor nit

// not supported
return
}
// attach_flags test succeeded.
if !errors.Is(err, unix.EBADF) {
logrus.Debugf("checking for BPF_F_REPLACE: got unexpected (not EBADF or EINVAL) error: %v", err)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: the message is not entirely correct now (I guess we can just drop the (not EBADF or EINVAL) part).

@kolyshkin
Copy link
Contributor

Looks like in order to catch this in CI we'd need a combination of kernel < v5.6 and cgroup v2? This is why it went unnoticed -- we don't have that (and it's hard to get in CI -- something like Ubuntu 20.04 with an explicit switch to cgroup v2 and kernel not upgraded to HWE one).

@cyphar
Copy link
Member Author

cyphar commented Dec 7, 2024

Yeah, though we would've caught that something was going wrong if the debug messages had been warnings (because all systems were returning the wrong error since the update patch).

Copy link
Member

@thaJeztah thaJeztah left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@thaJeztah thaJeztah merged commit e075206 into opencontainers:main Dec 7, 2024
40 checks passed
@cyphar cyphar deleted the ebpf-enotsup branch December 7, 2024 13:20
@lifubang lifubang added backport/1.2-done A PR in main branch which has been backported to release-1.2 and removed backport/1.2-todo A PR in main branch which needs to be backported to release-1.2 labels Dec 7, 2024
@kolyshkin kolyshkin removed this from the 1.2.3 milestone Dec 9, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backport/1.2-done A PR in main branch which has been backported to release-1.2
Projects
None yet
5 participants