
Performance hit of ServerLogInterceptor #3272

Closed

kamilkloch opened this issue Oct 25, 2023 · 3 comments

@kamilkloch (Contributor)

Follow-up on https://www.reddit.com/r/scala/comments/17c9lpg/comment/k68ffix

Disabling the default ServerLogInterceptor reduces tapir's share of the async-profiler flame graph from 25.5% to 16.5%.
This is despite the fact that no logging actually occurs in the code.

1. Server options with the default ServerLogInterceptor:

```scala
import cats.effect.IO
import sttp.tapir.server.http4s.Http4sServerOptions

private val serverOptions = Http4sServerOptions
  .customiseInterceptors[IO]
  .options
```

[Screenshot: async-profiler flame graph with the default ServerLogInterceptor (2023-10-25 15-38-17)]

2. Server options without the default ServerLogInterceptor:

```scala
private val serverOptions = Http4sServerOptions
  .customiseInterceptors[IO]
  .serverLog(None)
  .options
```

[Screenshot: async-profiler flame graph without the ServerLogInterceptor (2023-10-25 15-37-39)]

Test setup: https://github.com/kamilkloch/rest-benchmark

ZIP file with async-profiler results: tapir-flamegraphs.zip

h/t @plokhotnyuk

@kciesielski (Member) commented Feb 12, 2024

@kamilkloch I'd like to ask about your comment in the Reddit thread:

it turned out to be costly both in terms of CPU cycles and GC pressure

I'd like to investigate both issues and see them reflected in metrics. Your flame graphs are very interesting; I haven't run async-profiler with my tests yet, but I have made some initial attempts with JFR recordings.
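(For readers following along: such a JFR recording can also be captured programmatically rather than via the `-XX:StartFlightRecording` JVM flag. A minimal sketch, not from this thread, assuming JDK 11+ and an arbitrary output file name:)

```scala
// Sketch: bracket a load-test window with a programmatic JFR recording.
import jdk.jfr.{Configuration, Recording}
import java.nio.file.Path

val recording = new Recording(Configuration.getConfiguration("default"))
recording.start()
// ... drive load against the server under test ...
recording.stop()
recording.dump(Path.of("serverlog-comparison.jfr")) // file name is an assumption
```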
For example, running a pure http4s + Blaze ("tapirless") server shows the following characteristics:
[Screenshot: JFR recording, vanilla http4s + Blaze]
while on http4s-tapir without interceptors the recording is:
[Screenshot: JFR recording, tapir + http4s without interceptors]
and with the ServerLog interceptor:
[Screenshot: JFR recording, tapir + http4s with the ServerLog interceptor]

These high-level recordings don't show any alarming differences in CPU or memory load, so I suspect my testing methodology may need some adjustments. If you're seeing such a big difference in the async-profiler CPU flame graph, I would expect to notice some difference in overall CPU consumption as well, but maybe my understanding is flawed. The memory profile and GC pauses aren't noticeably different either, but maybe I should compare pause lengths in milliseconds in depth?
I would be grateful for any hints!

P.S. The simulations are: a GET with 128 concurrent users on the /path0 endpoint (where there are /path0, /path1, ..., /path127 endpoints), then again 128 users sending a GET to the /path127 endpoint. Both simulations are preceded by 5 seconds of warmup requests sent by 3 users, which are visible as the smaller regions before the high regions arise.
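(For context, the /path0 ... /path127 endpoint family described above could be defined roughly along these lines in tapir. This is an illustrative sketch, not the actual code; the real definitions live in the rest-benchmark repo:)

```scala
// Sketch of 128 sibling GET endpoints, /path0 through /path127,
// each returning a constant string body.
import cats.effect.IO
import sttp.tapir._
import sttp.tapir.server.http4s.Http4sServerInterpreter

val serverEndpoints = (0 until 128).toList.map { i =>
  endpoint.get
    .in(s"path$i")                 // fixed path segment, e.g. GET /path42
    .out(stringBody)
    .serverLogicSuccess[IO](_ => IO.pure(s"response$i"))
}

val routes = Http4sServerInterpreter[IO]().toRoutes(serverEndpoints)
```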

@kciesielski (Member)

@kamilkloch Update: I ran these comparisons using the IntelliJ profiler, and the difference in CPU load is visible there.

vanilla http4s
"Run" view:
[Screenshot: CPU load, Run view]
Profiler timeline view:
[Screenshot: profiler timeline]

tapir-http4s
"Run" view:
[Screenshot: CPU load, Run view]
Profiler timeline view:
[Screenshot: profiler timeline]

tapir-http4s + ServerLog
"Run" view:
[Screenshot: CPU load, Run view]
Profiler timeline view:
[Screenshot: profiler timeline]

And the flame graph shows a certain amount of CPU time consumed by ServerLogEndpointInterceptor:
[Screenshot: flame graph highlighting ServerLogEndpointInterceptor]

It's not as significant as in your case, but this may depend on test parameters and environment.
Still, I'd love to understand better where I can see the GC pressure.
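(One simple way to put numbers on the GC side, as a sketch rather than anything from this thread, is to diff the JVM's GC MXBean counters around a test window. `runLoadTest` below is a hypothetical placeholder for the Gatling run:)

```scala
// Sketch: compare total GC collections and accumulated pause time (ms)
// before and after a load-test window using the standard GC MXBeans.
import java.lang.management.ManagementFactory
import scala.jdk.CollectionConverters._

final case class GcSnapshot(collections: Long, pauseMs: Long)

def gcSnapshot(): GcSnapshot = {
  val beans = ManagementFactory.getGarbageCollectorMXBeans.asScala
  GcSnapshot(beans.map(_.getCollectionCount).sum, beans.map(_.getCollectionTime).sum)
}

// Usage (runLoadTest is a placeholder):
//   val before = gcSnapshot(); runLoadTest(); val after = gcSnapshot()
//   println(s"GC pauses: ${after.pauseMs - before.pauseMs} ms over " +
//           s"${after.collections - before.collections} collections")
```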

@kamilkloch (Contributor, Author)

@kciesielski We will update our benchmarks and let you know. We'll need to find a free time slot, though.
