-
Notifications
You must be signed in to change notification settings - Fork 2.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
cgroups: ebpf: use link.Anchor to check for BPF_F_REPLACE support #4548
Conversation
In v0.13.0, cilium/ebpf stopped supporting setting BPF_F_REPLACE as an explicit flag and instead requires us to use link.Anchor to specify where the program should be attached. Commit 216175a ("Upgrade Cilium's eBPF library version to 0.16") did update this correctly for the actual attaching logic, but when checking for kernel support we still passed BPF_F_REPLACE. This would result in a generic error being returned, which our feature-support checking logic would treat as being an error the indicates that BPF_F_REPLACE *is* supported, resulting in a regression on pre-5.6 kernels. It turns out that our debug logging saying that this unexpected error was happening was being output as a result of this change, but nobody noticed... Fixes: 216175a ("Upgrade Cilium's eBPF library version to 0.16") Signed-off-by: Aleksa Sarai <cyphar@cyphar.com>
It is possible for LinkAttachProgram to return ErrNotSupported if program attachment is not supported at all (which doesn't matter in this case), but it seems possible that upstream will start returning ErrNotSupported for BPF_F_REPLACE at some point so it's best to make sure we don't cause additional regressions here. Signed-off-by: Aleksa Sarai <cyphar@cyphar.com>
If we get an unexpected error here, it is probably because of a library or kernel change that could cause our detection logic to be invalid. As a result, these warnings should be louder so users have a chance to tell us about them sooner (or so we might notice them before doing a release, as happened with the 1.2.0 regression). Signed-off-by: Aleksa Sarai <cyphar@cyphar.com>
It turns out we've had this debug logging message since #4397 was merged (which indicated that we had a regression on #3008), but nobody noticed:
|
Nice catch, thanks for spotting this. If there's anything in particular I can do to help let me know. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM except a minor nit
// not supported | ||
return | ||
} | ||
// attach_flags test succeeded. | ||
if !errors.Is(err, unix.EBADF) { | ||
logrus.Debugf("checking for BPF_F_REPLACE: got unexpected (not EBADF or EINVAL) error: %v", err) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: the message is not entirely correct now (I guess we can just drop the (not EBADF or EINVAL)
part).
Looks like in order to catch this in CI we'd need a combination of kernel < v5.6 and cgroup v2? This is why it went unnoticed -- we don't have that (and it's hard to get in CI -- something like Ubuntu 20.04 with an explicit switch to cgroup v2 and kernel not upgraded to HWE one). |
Yeah, though we would've caught that something was going wrong if the debug messages had been warnings (because all systems were returning the wrong error since the update patch). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
In v0.13.0, cilium/ebpf stopped supporting setting BPF_F_REPLACE as an
explicit flag and instead requires us to use link.Anchor to specify
where the program should be attached.
Commit 216175a ("Upgrade Cilium's eBPF library version to 0.16")
did update this correctly for the actual attaching logic, but when
checking for kernel support we still passed BPF_F_REPLACE. This would
result in a generic error being returned, which our feature-support
checking logic would treat as being an error the indicates that
BPF_F_REPLACE is supported, resulting in a regression on pre-5.6
kernels.
It turns out that our debug logging saying that this unexpected error
was happening was being output as a result of this change, but nobody
noticed...
Fixes: 216175a ("Upgrade Cilium's eBPF library version to 0.16")
Fixes #3008
Signed-off-by: Aleksa Sarai cyphar@cyphar.com