Make unified scheduler abort on tx execution errors #1211
Conversation
Codecov Report
Attention: Patch coverage is
Additional details and impacted files

@@            Coverage Diff            @@
##           master    #1211     +/-   ##
=========================================
  Coverage    81.6%    81.6%
=========================================
  Files         867      868       +1
  Lines      368900   369593     +693
=========================================
+ Hits       301052   301656     +604
- Misses      67848    67937      +89
Force-pushed from ff13fc3 to a8d8736.
a few nits. I need to take another pass, specifically to review the trashed
scheduler concept.
thanks for review! I've addressed all comments so far. re-requesting code-review.
I'm not sure I understand the concept of trashed schedulers. When we return schedulers to the pool, if all execution threads are not joined it is considered "trashed". We will clean it up later. Can you give an example of when a scheduler would become trashed? Think I'm still missing something here.
Force-pushed from 7bf597a to 8212761.
Thanks for the question. Seems more explanation is needed... I intentionally didn't give such an example, to verify that others can decipher my code by themselves. So, I instead added comments rather extensively, as much as possible: 8212761. Could you give another try to grasp the trashed schedulers?
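For readers of this thread, here is a purely illustrative sketch of the idea as paraphrased in the question above; the type and method names are invented for this example and the real pool logic differs. A scheduler whose execution threads could not all be joined by the time it is returned to the pool is parked in a "trashed" list instead of being reused, and a background sweep drops it later.

// Invented names, for illustration only.
struct Scheduler {
    threads_joined: bool,
}

#[derive(Default)]
struct SchedulerPool {
    reusable: Vec<Scheduler>,
    trashed: Vec<Scheduler>,
}

impl SchedulerPool {
    fn return_scheduler(&mut self, scheduler: Scheduler) {
        if scheduler.threads_joined {
            // Healthy scheduler: keep it around for reuse.
            self.reusable.push(scheduler);
        } else {
            // Joining the threads here would block the caller, so park the
            // scheduler and let a background cleanup thread drop it later.
            self.trashed.push(scheduler);
        }
    }
}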
Force-pushed from 450aa8c to 649bce3.
all of the test sleeps are gone: 649bce3
dunno how you like this test library. but I'm planning to publish this as a crate in the future. so, the mod name is fancy. ;)
Force-pushed from 649bce3 to 66b9f32.
Largely happy with the scheduler changes themselves.
The additional sleepless testing utilities are a nice addition; had some nits on the implementation, but since you mentioned this may be split out into a separate crate/library I'm not going to block on it. (would love to be involved with the library if/when you split it out!)
} else if current().name().unwrap_or_default().starts_with("test_") {
    panic!("seems setup() isn't called yet?");
}
Seems a bit hacky to me to have this rely on Rust's current test-thread naming scheme.
Why not just panic if this is called when the registry is not set up? Is there a reason why at() would be called without the thread being registered?
yeah, this could be more strict.
pub(crate) fn at<T: Debug>(check_point: T) {
    let mut registry = THREAD_REGISTRY.lock().unwrap();
    if let Some(progress) = registry.get_mut(&current().id()).cloned() {
        drop(registry);
is this intentional? we drop the lock before the checkpoint is actually changed?
yeah, i think deadlock is possible otherwise iirc
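For readers following along, a minimal self-contained sketch of the clone-then-drop pattern being discussed; Progress, THREAD_REGISTRY, and at() are simplified stand-ins here, not the actual implementation. The registry lock is held only for the lookup and released before the per-thread state is touched, so another thread that holds a Progress lock can still take the registry lock without the two threads waiting on each other.

use std::collections::HashMap;
use std::fmt::Debug;
use std::sync::{Arc, Condvar, Mutex};
use std::thread::{current, ThreadId};

// Simplified stand-ins for the types discussed above.
#[derive(Default)]
struct Progress {
    reached: Mutex<Option<String>>,
    condvar: Condvar,
}

static THREAD_REGISTRY: Mutex<Option<HashMap<ThreadId, Arc<Progress>>>> = Mutex::new(None);

fn at<T: Debug>(check_point: T) {
    // Hold the registry lock only long enough to fetch this thread's entry...
    let registry = THREAD_REGISTRY.lock().unwrap();
    let progress = registry
        .as_ref()
        .and_then(|map| map.get(&current().id()))
        .cloned();
    // ...and release it *before* taking the per-thread lock. If at() kept the
    // registry locked here while a coordinating thread held a Progress lock
    // and then tried to lock the registry, the two threads would deadlock.
    drop(registry);

    if let Some(progress) = progress {
        *progress.reached.lock().unwrap() = Some(format!("{check_point:?}"));
        progress.condvar.notify_all();
    }
}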
lazy_static! {
    static ref THREAD_REGISTRY: Mutex<HashMap<ThreadId, Arc<Progress>>> =
        Mutex::new(HashMap::new());
}
this is a global. won't it cause contention between all tests using this?
yes, I'm leaning towards switching to thread_local!
impl soon.
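A minimal sketch of what a thread_local!-based registration could look like; the names below are placeholders, not the planned implementation. The test driver would keep its own Arc clone when it spawns the worker, so it can still observe the worker's progress, while checkpoint recording itself no longer goes through one process-wide mutex shared by all tests.

use std::cell::RefCell;
use std::fmt::Debug;
use std::sync::{Arc, Mutex};

// Placeholder for the real per-thread progress type.
#[derive(Default, Debug)]
struct Progress {
    reached: Mutex<Vec<String>>,
}

thread_local! {
    // Each thread owns (a clone of) its own entry; no global registry lock.
    static MY_PROGRESS: RefCell<Option<Arc<Progress>>> = RefCell::new(None);
}

// Called once per spawned test thread, passing in an Arc the driver also keeps.
fn register(progress: Arc<Progress>) {
    MY_PROGRESS.with(|slot| *slot.borrow_mut() = Some(progress));
}

fn at<T: Debug>(check_point: T) {
    MY_PROGRESS.with(|slot| {
        if let Some(progress) = slot.borrow().as_ref() {
            progress.reached.lock().unwrap().push(format!("{check_point:?}"));
        }
    });
}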
#[derive(Debug)]
struct JustCreated;
From a library perspective, it is not obvious that the user cannot use "JustCreated" as one of their check-points.
+1
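A sketch of one possible way to avoid the collision (illustrative only, not what this pr or the planned crate actually does): wrap every user-supplied checkpoint in a private enum, so the library's initial state can never be confused with a user value.

use std::fmt::Debug;

// A private wrapper keeps the internal sentinel out of the user's namespace.
#[derive(Debug, PartialEq)]
enum CheckPoint<T> {
    JustCreated,      // library-internal initial state
    User(T),          // whatever the test author passes to at()
}

fn at<T: Debug + PartialEq>(check_point: T) {
    let wrapped = CheckPoint::User(check_point);
    // The registry stores and compares CheckPoint<T> values, so even a user
    // checkpoint named "JustCreated" is still CheckPoint::User(..) and cannot
    // be mistaken for CheckPoint::JustCreated.
    assert_ne!(wrapped, CheckPoint::JustCreated);
}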
let check_points_set = check_points.iter().collect::<HashSet<_>>();
assert_eq!(check_points.len(), check_points_set.len());
Seems a bit odd that we cannot have duplicate checkpoint names. If I had some sort of state machine, it would seem reasonable to me that I could use at(NAME_OF_STATE) to test the state transitions.
so true. the current impl is too primitive. this is another improvement which i'd like to have when creating a crate.
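For illustration, one way such a limitation could be lifted (only a sketch under assumed names; this is not how the actual fix is implemented): keep the full expected sequence plus a cursor instead of a de-duplicated set, so a state name may legitimately repeat as long as the order of transitions matches.

use std::fmt::Debug;

struct ExpectedSequence {
    check_points: Vec<String>,
    cursor: usize,
}

impl ExpectedSequence {
    fn new<T: Debug>(check_points: impl IntoIterator<Item = T>) -> Self {
        Self {
            check_points: check_points.into_iter().map(|c| format!("{c:?}")).collect(),
            cursor: 0,
        }
    }

    // A state name like "Waiting" may now appear any number of times, as long
    // as each at() call matches the next element of the expected sequence.
    fn at<T: Debug>(&mut self, check_point: T) {
        let reached = format!("{check_point:?}");
        assert_eq!(Some(&reached), self.check_points.get(self.cursor));
        self.cursor += 1;
    }
}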
this limitation will be lifted by this pr: #1797
Problem
As a quick background, the unified scheduler currently doesn't terminate at all. In other words, it leaks shamelessly. :p This means its memory usage is unbounded, and it indeed causes the process to be reaped by the out-of-memory killer after a while (like a week), even while it's pooled for reuse.
Memory aside, it needs a proper shutdown mechanism for other reasons as well. Specifically, there are several termination conditions which it must handle, respectively:
1. Transaction execution errors: the unified scheduler needs to abort instead of continuing to run the remaining tasks (this is what this pr addresses).
2. Panics in the handler threads: the panic! needs to be propagated promptly to terminate the whole process. This is an edge case variant of 1. (note: I'm against resuming normal operation after panic)
3. UsageQueueLoader never evicts unused entries, so it can become too big with many entries. Its design relies on some external mechanism to mitigate the unbounded growth. Thus, the unified scheduler needs to drop the scheduler itself depending on the size of UsageQueueLoader.
As for (1), there's currently also no way for the unified scheduler to propagate errors back to the callers (the replay stage) until bank freezing. So, the dead-block marking by the replay stage could be delayed by maliciously-crafted blocks even if the unified scheduler immediately aborts internally.
Summary of Changes
This pr specifically addresses (1).
To that end, this pr makes the new-task code path return Results to abruptly propagate a previously-scheduled transaction error when unrelated new tasks are about to be submitted to the unified scheduler, in order to notify the replay stage earlier than reaching block boundaries.
After that, this pr introduces crossbeam-channel-disconnection-based cross-thread coordination for graceful termination of the unified scheduler instance itself. In this way, there's almost no runtime overhead other than a bunch of additional ifs. Also, this means there's no new potential bottleneck and no new synchronization. On the other hand, this incurs some code complexity.
Lastly, a pool-owned (i.e. singleton) auxiliary background thread is introduced to actually drop the aborted scheduler.
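To make the two mechanisms concrete, here is a condensed, purely illustrative sketch; every name in it (ExampleScheduler, schedule_task, execute, TransactionError, and so on) is invented for this example, and the real unified scheduler code is substantially more involved. The handler thread records the failing transaction's error and exits, which disconnects the crossbeam channel; the next attempt to submit an unrelated task then fails to send and returns that error to the caller instead of waiting for the block boundary.

use std::sync::{Arc, Mutex};
use std::thread;

use crossbeam_channel::{unbounded, Sender};

type Task = u64; // stand-in for a real sanitized-transaction task

#[derive(Debug, Clone)]
struct TransactionError(String); // stand-in for the real error type

struct ExampleScheduler {
    new_task_sender: Sender<Task>,
    // Written by the handler thread right before it aborts.
    aborted_with: Arc<Mutex<Option<TransactionError>>>,
    // Joined elsewhere during cleanup (cf. the "trashed scheduler" discussion).
    _handler_thread: thread::JoinHandle<()>,
}

impl ExampleScheduler {
    fn new() -> Self {
        let (new_task_sender, new_task_receiver) = unbounded::<Task>();
        let aborted_with = Arc::new(Mutex::new(None));
        let aborted_for_handler = Arc::clone(&aborted_with);
        let handler_thread = thread::spawn(move || {
            // recv() stops in two cases: every sender was dropped (graceful
            // shutdown, signalled purely by channel disconnection) or a task
            // failed (abort). Either way the thread simply returns.
            while let Ok(task) = new_task_receiver.recv() {
                if let Err(error) = execute(task) {
                    *aborted_for_handler.lock().unwrap() = Some(error);
                    // Returning drops the receiver and disconnects the channel,
                    // so later send()s fail and the error surfaces to callers.
                    return;
                }
            }
        });
        Self {
            new_task_sender,
            aborted_with,
            _handler_thread: handler_thread,
        }
    }

    // The new-task code path returns a Result: if the handler thread already
    // aborted, the previously-scheduled transaction's error is handed to the
    // caller (the replay stage) right away instead of at bank freezing.
    fn schedule_task(&self, task: Task) -> Result<(), TransactionError> {
        self.new_task_sender.send(task).map_err(|_disconnected| {
            self.aborted_with
                .lock()
                .unwrap()
                .clone()
                .expect("disconnection implies a recorded abort error")
        })
    }
}

fn execute(task: Task) -> Result<(), TransactionError> {
    // Trivial placeholder "execution": odd tasks fail.
    if task % 2 == 1 {
        Err(TransactionError(format!("task {task} failed")))
    } else {
        Ok(())
    }
}

Note that in this sketch the same disconnection also doubles as the graceful-termination signal: dropping all senders makes recv() return Err, letting the handler thread exit without any extra flags or locks on the hot path, which matches the "no runtime overhead beyond a few additional ifs" point above.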
Other termination conditions will be addressed in upcoming prs.
context: extracted from #1122