Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add note on epoll versus poll for FD lifetime #9

Open
mikael-s-persson opened this issue Mar 8, 2024 · 1 comment
Open

Add note on epoll versus poll for FD lifetime #9

mikael-s-persson opened this issue Mar 8, 2024 · 1 comment

Comments

@mikael-s-persson
Copy link
Contributor

In the comment section here:
https://github.com/dallison/co/blob/main/coroutine.h#L24

It only talks about the poll/epoll difference in terms of duplicate FDs. But, after a bit of painful debugging of an issue while trying to update my code, I narrowed my issue down to the fact that with epoll each coroutine Wait involves adding FDs and then removing them from the epoll set when waking up. This means that if one coroutine entered a wait on an FD and that during that wait, another coroutine is woken up and closes that FD, you end up with an assert-failure when trying to remove that FD from the epoll set (EBADF error from epoll_ctl). My fix was easy enough: make a copy of all the toolbelt::FileDescriptor fds before entering Wait so that even if clients disconnect / close their FDs, they are temporarily kept open at least until Wait comes back.

It might be worth adding a comment on that, or changing that behavior if that makes sense (maybe, just tolerate EBADF when removing FDs from the epoll set).

@dallison
Copy link
Owner

dallison commented Mar 8, 2024

Yeah, closing an fd while a coroutine is waiting for it sounds like a bug. It would be better to add another fd to the wait set that can be used to interrupt the wait. Saying that, I think that your solution is fine (keep the fd open while something is waiting on it). I also think that tolerating the EBADF would be fine since it's an internal operation asserting is confusing. The worst that could happen is that the coroutine wakes up and gets an error from a read.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants