Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Notify panics in handlers to the scheduler thread promptly #1574

Merged
merged 2 commits into from
Jun 10, 2024

Conversation

ryoqun
Copy link
Collaborator

@ryoqun ryoqun commented Jun 2, 2024

Problem

From #1211:

Specifically, there are following several termination conditions which it must handle respectively:

...
2. One of the handler threads could panic for extreme situations due to newly discovered LoA-like bugs. In that case, unified scheduler should propagate panic! promptly to terminate the whole process. This is an edge case variant of 1. (note: I'm against resuming normal operation after panic): #1574
...

Summary of Changes

This pr addresses (2).

@ryoqun ryoqun requested a review from apfitzge June 2, 2024 05:45
@ryoqun ryoqun force-pushed the graceful-abort-on-panic branch 2 times, most recently from 62520f2 to 5b09500 Compare June 2, 2024 06:09
@ryoqun ryoqun changed the title Gracefully abort scheduler on a panic in handlers Propagate panics in handlers to scheduler Jun 2, 2024
@ryoqun
Copy link
Collaborator Author

ryoqun commented Jun 2, 2024

oops, fixed the pr title... branch name is complete misnomer. I won't fix it... ;)

@ryoqun ryoqun changed the title Propagate panics in handlers to scheduler Propagate panics in handlers to unified scheduler Jun 2, 2024
Comment on lines +1802 to +1809
let progress = sleepless_testing::setup(&[
&TestCheckPoint::BeforeNewTask,
&CheckPoint::NewTask(0),
&PanickingHanlderCheckPoint::BeforeNotifiedPanic,
&CheckPoint::SchedulerThreadAborted,
&PanickingHanlderCheckPoint::BeforeIgnoredPanic,
&TestCheckPoint::BeforeEndSession,
]);
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm satisfied with sleepless_testing. :)

@ryoqun ryoqun changed the title Propagate panics in handlers to unified scheduler Notify panics in handlers to the scheduler thread promptly Jun 2, 2024
@ryoqun ryoqun force-pushed the graceful-abort-on-panic branch from 5b09500 to 0067ad6 Compare June 8, 2024 13:23
Copy link

@apfitzge apfitzge left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

only a nit on this one!

Comment on lines 1066 to 1068
// It seems that scheduler has been aborted...
// This branch is deliberately tested by using 2 transactions with
// different timings in test_scheduler_schedule_execution_panic

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: seems odd to me to comment how the code is tested in the code itself. test name could change, etc.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hm, good point. done: d83571b

@ryoqun ryoqun requested a review from apfitzge June 10, 2024 05:21
@ryoqun ryoqun merged commit 10e814e into anza-xyz:master Jun 10, 2024
53 checks passed
samkim-crypto pushed a commit to samkim-crypto/agave that referenced this pull request Jul 31, 2024
…1574)

* Gracefully abort scheduler on a panic in handlers

* Move test-related mention out of the tested code itself
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants