-
Notifications
You must be signed in to change notification settings - Fork 1.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Mempool fix #2741
Mempool fix #2741
Conversation
This adds some more complexity :( Why not simply revert d4f909c? Do we have numbers showing the optimization is significant? |
Here's my results on Hikey:
Revert "mempool: optimize reference counting":
Using this script for a in 4006 4007 4008 4009 ; do \
echo -n $a " " >> time.txt ;\
time -o time.txt.tmp xtest -l 15 $a || break ;\
grep real time.txt.tmp >> time.txt
done
cat time.txt |
Thanks for the numbers. So there is quite a big difference indeed... How does the code in this PR compare? Assuming the code proposed here is still much faster, that worries me about the performance of our mutexes. The original code, with a mutex and a condvar, is a typical pattern... Is there a way we could optimize the mutex common case (locking, unlocked state)? With an |
"mempool: fix race in get_pool()"
Second run
So I'd say it's if anything slightly faster. I think we should try to improve the fast path of the mutex. I'm not so sure we'll get numbers that can compete with these, but it will be useful anyway. I guess we could base the fast path on the atomic_cas_ushort() function. |
What we need here is a recursive mutex. |
OK, thanks for running the benchmark again. I agree that manipulating
|
* any value but our thread id. | ||
*/ | ||
if (atomic_load_int(&pool->owner) == thread_get_id()) { | ||
if (!refcount_inc(&pool->refc)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think, that this code is still prone to race. You have two atomic operations there, but you need only one.
Suppose that Thread 0 is at line 107 and Tread 1 is at L82. Thread 1 passes check and thinks that it owns the pool. But Thread 0 resets pool owner.
I can't see how this can be fixed with atomic variables. Basically, you need to execute operation "check owner and increase refcount" atomically. But you are are doing two atomic operations instead - "check owner" and "increase refcount".
I propose to drop atomics at all and introduce spinlock. With spinlock you really can operate atomically on owner and refcount.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Only the thread owning the pool may set it to free. The case you're describing is not supposed to happen, it's like unlocking a mutex with wrong thread. The code doesn't depend on refcount_inc()
and refcount_dec()
being atomic, it just happens to be a convenient interface.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah yes, you are right.
acc6c5a
to
7d734b7
Compare
Updated commit message and added tag. Please wait with the merge a bit to see that we can agree with @lorc . |
@jenswi-linaro of course. |
|
Adds atomic_load_int() and atomic_store_int(). Reviewed-by: Volodymyr Babchuk <vlad.babchuk@gmail.com> Acked-by: Jerome Forissier <jerome.forissier@linaro.org> Signed-off-by: Jens Wiklander <jens.wiklander@linaro.org>
Fixes a race in get_pool() which could leave the pool with zero refences but still owned by the last thread using the pool. Some performance number on Hikey with default configuration: github/master (edbb89f, before this commit): 4006 real 1m 41.11s 4007 real 1m 14.51s 4008 real 0m 0.13s 4009 real 1m 5.68s Revert "mempool: optimize reference counting", before this commit: 4006 real 3m 27.78s 4007 real 0m 50.03s 4008 real 0m 0.13s 4009 real 2m 24.07s With this commit, two runs: 4006 real 1m 37.51s 4007 real 0m 56.67s 4008 real 0m 0.09s 4009 real 1m 3.18s 4006 real 1m 37.61s 4007 real 0m 35.32s 4008 real 0m 0.13s 4009 real 1m 3.15s Numbers are gathered with this script: for a in 4006 4007 4008 4009 ; do \ echo -n $a " " >> time.txt ;\ time -o time.txt.tmp xtest -l 15 $a || break ;\ grep real time.txt.tmp >> time.txt done cat time.txt Reviewed-by: Volodymyr Babchuk <vlad.babchuk@gmail.com> Acked-by: Jerome Forissier <jerome.forissier@linaro.org> Signed-off-by: Jens Wiklander <jens.wiklander@linaro.org>
7d734b7
to
d1644ff
Compare
Tag applie |
No description provided.