High GC CPU consumption when running sysbench point select #25573
Comments
In this profile zip, two TiDB instances (38_135.prof, 39_41.prof) encounter the issue, while the other one (35_167.prof) does not.
Any update?
Who is following this issue?
The cause is that the strategy for triggering a GC in Go is too simplistic:
When running a workload with low memory usage (e.g. point select), the Go runtime triggers GC collections too frequently, leading to high CPU usage. It seems impossible to fully solve this problem in the real world as long as we use Go. Adjusting GOGC to a very high value is unreasonable in practice because TiDB also needs to support workloads that use lots of memory, and Go does not support advanced GC-triggering strategies. That is by design: the Go team wants to keep it simple. Related to golang/go#42430
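The pacing problem described above can be observed directly. The following is a minimal sketch (not TiDB code) that runs the same allocation-heavy, low-live-heap loop under two GOGC values and counts the GC cycles each one triggers via `runtime.MemStats`:

```go
package main

import (
	"fmt"
	"runtime"
	"runtime/debug"
)

var sink []byte // package-level sink so the allocations below escape to the heap

// gcCycles runs an allocation-heavy loop with a tiny live heap and
// returns how many GC cycles it triggered under the given GOGC value.
func gcCycles(gogc int) uint32 {
	prev := debug.SetGCPercent(gogc)
	defer debug.SetGCPercent(prev)

	var before, after runtime.MemStats
	runtime.ReadMemStats(&before)
	for i := 0; i < 500_000; i++ {
		sink = make([]byte, 256) // short-lived garbage, almost nothing stays live
	}
	runtime.ReadMemStats(&after)
	return after.NumGC - before.NumGC
}

func main() {
	// With the default GOGC=100 the heap only needs to grow a small amount
	// past the tiny live set before the next collection, so low-memory
	// workloads collect very often; a higher GOGC spaces collections out.
	fmt.Printf("GOGC=100: %d cycles; GOGC=800: %d cycles\n",
		gcCycles(100), gcCycles(800))
}
```

Exact cycle counts depend on the Go version and machine, but the GOGC=100 run should collect noticeably more often.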
@sticnarf Can we close this issue?
This is a case that needs to be optimized, but we don't have a good solution yet. So I would like to keep this issue open, but we can lower the severity to
What we should do is optimize the object allocation ... |
@tiancaiamao My thought is that it is actually unnecessary for the Go runtime to collect garbage that fast. For a server-side application, which usually owns all the resources of the machine, keeping the actual memory usage low is not very meaningful. In this case, it makes sense to lower the frequency of GC by increasing GOGC. It is more like a workaround to optimize object allocation...
Keeping GC CPU below 25% of the total throughput is very, very important. I have found some places to optimize allocation, but it's more important to find a way to observe those performance changes. Without micro performance benchmarks, even though we can make tiny optimizations when the problem emerges, we can't stop it from happening. Everyone changes the code base, and every change might be suspect; we can't rely on someone optimizing it occasionally.
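The micro-benchmarks asked for here are directly supported by Go's `testing` package: `b.ReportAllocs()` surfaces allocations per operation, so allocation regressions become visible in CI (e.g. by comparing runs with benchstat) before they show up as GC CPU in production. A sketch with a hypothetical hot-path function `formatKey` (not a TiDB function), driven via `testing.Benchmark` so it is self-contained:

```go
package main

import (
	"fmt"
	"strconv"
	"testing"
)

// formatKey stands in for a hot path that allocates per call.
func formatKey(id int) string {
	return "key_" + strconv.Itoa(id)
}

func main() {
	res := testing.Benchmark(func(b *testing.B) {
		b.ReportAllocs() // record allocation stats for this benchmark
		for i := 0; i < b.N; i++ {
			_ = formatKey(i)
		}
	})
	// AllocsPerOp is meaningful because ReportAllocs was called above.
	fmt.Printf("%s, %d allocs/op\n", res, res.AllocsPerOp())
}
```

In a real repository this would live in a `_test.go` file as `func BenchmarkFormatKey(b *testing.B)` and run under `go test -bench=. -benchmem`.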
Increase
Should we increase GOGC from the default value in order to mitigate this issue? I think in most cases we want TiDB to be more aggressive in memory allocation.
I don't think this is the best solution. We should optimize TiDB to reduce unnecessary object allocation.
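One common way to reduce unnecessary object allocation in Go servers is to reuse objects through `sync.Pool`. An illustrative sketch (not TiDB code; `handle` is a made-up request handler) that recycles `bytes.Buffer` values instead of allocating one per request:

```go
package main

import (
	"bytes"
	"fmt"
	"sync"
)

// bufPool hands out reusable buffers; Get falls back to New when empty.
var bufPool = sync.Pool{
	New: func() any { return new(bytes.Buffer) },
}

// handle builds a response using a pooled buffer, returning it to the
// pool (reset) when done so later requests allocate nothing.
func handle(req string) string {
	buf := bufPool.Get().(*bytes.Buffer)
	defer func() {
		buf.Reset()
		bufPool.Put(buf)
	}()
	buf.WriteString("resp:")
	buf.WriteString(req)
	return buf.String()
}

func main() {
	fmt.Println(handle("ping"))
}
```

Pooling trades a little code complexity for fewer short-lived heap objects, which is exactly the garbage driving the GC frequency in this issue.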
Some hacks to check how far we can go.
I do not think this is a bug and changed it to 'enhancement' instead.
I think we can close it now.
Bug Report
1. Minimal reproduce step (Required)
3 TiDB instances behind one HAProxy, 3 TiKV instances.
sysbench with 16 tables, 10,000,000 rows per table.
Run sysbench point select.
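The steps above correspond roughly to the following sysbench invocation; the host, port, user, thread count, and duration are assumptions for illustration, not values from the report:

```shell
# Load 16 tables with 10M rows each, then run the point-select workload
# against the HAProxy endpoint in front of the TiDB instances.
sysbench oltp_point_select \
  --mysql-host=<haproxy-host> --mysql-port=4000 \
  --mysql-user=root --mysql-db=sbtest \
  --tables=16 --table-size=10000000 \
  --threads=256 --time=600 \
  prepare

sysbench oltp_point_select \
  --mysql-host=<haproxy-host> --mysql-port=4000 \
  --mysql-user=root --mysql-db=sbtest \
  --tables=16 --table-size=10000000 \
  --threads=256 --time=600 \
  run
```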
2. What did you expect to see? (Required)
GC should not consume much CPU
3. What did you see instead (Required)
It is possible that `gcBgMarkWorker` consumes a lot of CPU (a CPU profile screenshot was attached here), but it cannot be reproduced every time.
4. What is your TiDB version? (Required)
TiDB 5.1 compiled with Go 1.16.4