-
Notifications
You must be signed in to change notification settings - Fork 75
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Not clear that lock is always taken for init_scan_list #78
Comments
On reflection, it also looks like the scan thread can call POSIX "cancellation point" APIs while the mutex is locked, which would mean that the thread could be cancelled without unlocking the mutex. Is there something preventing this that I've missed? (I would expect this to produce deadlocks etc. and I have not seen them, so I'm simply curious about this one.) Edit: Also, where is the scan result mutex actually initialized? The code appears to rely on "all zeroes" being a legal mutex, which happens to work on glibc (apparently!) but is not correct per the POSIX standard, which states, "If mutex does not refer to an initialized mutex object, the behavior of pthread_mutex_lock(), pthread_mutex_trylock(), and pthread_mutex_unlock() is undefined." |
So ... what is the solution you are suggesting? |
It took me a while to reconstruct what is going on. The simplification in c74e52d removed the use of mutexes when destroying the scan list, since the thread can be canceled with a mutex held. The change was to use a static initializer ( Your first comment above identifies the bug that c74e52d did not fix: I will rework the original fix and revisit the mutex handling. Thank you for reporting this. |
This reverts commit c74e52d. The fix was not right, using deferred pthread_cancel (the default). Also the use of locks needs to be revised.
This addresses several issues in handling the scan thread: 1. pthread_cancel was called without pthread_join, allowing the detached thread to continue to run. 2. iw_nl80211_get_scan_data wiped the whole structure, including the mutex, causing undefined behaviour. 3. Address the problem of what happens if the thread is canceled with the mutex held: use a static allocator and disable thread cancellation during the time the mutex is held. 4. Sorting and computing metrics is done on a temporary struct to minimize the time the lock is held. 5. The main thread should not allocate/clear the scan data. This is now retained when switching windows.
@cbiffle - I have made comprehensive changes following your suggestions. Would you be able to compile from |
This updates the mutex handling to match how it was done for #78 (re-initalization of the static mutex is not required).
This updates the mutex handling to match how it was done for #78 (re-initalization of the static mutex is not required).
I've noticed wavemon segfaulting a lot, and took the time to rebuild with debug symbols and capture a coredump.
It looks like it's interacting with bogus contents in the
scan_result
ininit_scan_list
. To reproduce this, startwavemon
0.9.1 (I am on Arch, specifically, running the 0.9.1-1 package version) and spam the F1-F3-F1-F3 keys until it crashes. Usually takes less than ten seconds for me.I noticed this line in the source:
/** Initialize scan results. Requires lock to be taken. */
Nothing about the structure of the program ensures that the lock is taken, and in particular, it looks like the code path through
main
intoscr_aplst_init
callsinit_scan_list
on a global variable without any synchronization. It's also not clear howfini
synchronizes with the mutex management in the scan thread; I don't see any calls topthread_setcanceltype
, so thepthread_cancel
will take effect at some point later, possibly leading to data races. (I notice the mutex management changed in c74e52d and some synchronization was removed; I don't totally understand the change.)The text was updated successfully, but these errors were encountered: