-
Notifications
You must be signed in to change notification settings - Fork 847
beacon_node, consensus: fix possible deadlocks when comparing with itself #1241
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Great find, thanks @BurtonQin! |
@@ -408,6 +408,9 @@ fn prune_validator_hash_map<T, F, E: EthSpec>( | |||
/// Compare two operation pools. | |||
impl<T: EthSpec + Default> PartialEq for OperationPool<T> { | |||
fn eq(&self, other: &Self) -> bool { | |||
if self as *const _ == other as *const _ { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm wondering if we should be using std::ptr::eq
as per Why can comparing two seemingly equal pointers with == return false?
@michaelsproul do you have thoughts here? I'm not particularly well versed here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah ptr::eq
looks like it might be cleaner. Given this PartialEq
impl is only used in tests, I'd be interested in potentially removing it in future.
Other alternatives that I could think of are:
- Make the RHS of the
PartialEq
&mut Self
, so that self comparisons are statically prevented (although that would be a bit gross to use) - Use a timeout on the RwLock, knowing that it's only used for tests
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Use ptr::eq instead.
CI failed with a weird permissions error on MacOS. Rerunning to see if it happens again. |
I saw this somewhere else as well, so perhaps not just a one-off. I have attempted a fix in #1246 |
Proposed Changes
This PR fixes possible deadlock bugs in beacon_node/operation_pool and consensus/proto_array_fork_choice.
Both are in fn
eq(&self, other: &Self)
.When
self
andother
refers to the same obj, (i.e. comparing with itself),the first Read lock is not released before calling the second Read lock.
The lock is a
parking_lot::RwLock
.According to the doc of
parking_lot::RwLock
:“readers trying to acquire the lock will block even if the lock is unlocked when there are writers waiting to acquire the lock.”
“attempts to recursively acquire a read lock within a single thread may result in a deadlock.”
Therefore, a deadlock may happen when a write lock interleaves the two read locks.
This PR compares the ptr addresses first to avoid comparing with itself.