-
Notifications
You must be signed in to change notification settings - Fork 324
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Reduce memory usage when reading response body #451
Reduce memory usage when reading response body #451
Conversation
Currently for each chunk read from the socket a new string is being allocated. This is not necessary, because we're feeding that string object into the HTTP parser and never using it again. So we initialize a buffer string and pass it in when reading chunks. This will make each chunk be read into the same buffer object, no new string will be allocated in that operation. In my benchmark this halves the allocated memory when downloading large response bodies.
5d2e5c6
to
ddaa8a4
Compare
Seems reasonable to me. We have enough specs on request states that I'd expect it to have broken if the http_parser gem was somehow reusing the string, and otherwise it does look like |
Looks good to me as well. Merging this in! |
I know that the HTTP
.timeout(read: 2)
.get("http://example.com/a-50MB-file")
.body.each { |chunk| chunk.clear }
|
@janko-m I'll work on this. |
Released as 3.2.0 (somehow I merged EVERY bugfix from master to 3-x-stable but this and released as 3.1.0, so had to release 3.2.0 immediately after that one). :D |
@ixti Awesome, thank you very much! 🙏 The memory difference isn't as drastic as I initially posted (I read that from the MemoryProfiler output). I now tried watching how much the actual process memory grew when GC was disabled. So, testing with a 50MB file, before this change the amount allocated memory was about 150MB, while after this change it's only about 15MB. So, while not as drastic as 500MB to 15MB, it's still 10x less which is a lot 😃 I also compared this with |
Update ruby-http to 4.4.1. ## 4.4.1 (2020-03-29) * Backport [#590](httprb/http#590) Fix parser failing on some edge cases. ([@ixti]) ## 4.4.0 (2020-03-25) * Backport [#587](httprb/http#587) Fix redirections when server responds with multiple Location headers. ([@ixti]) * Backport [#599](httprb/http#599) Allow passing HTTP::FormData::{Multipart,UrlEncoded} object directly. ([@ixti]) ## 4.3.0 (2020-01-09) * Backport [#581](httprb/http#581) Add Ruby-2.7 compatibility. ([@ixti], [@janko]) ## 4.2.0 (2019-10-22) * Backport [#489](httprb/http#489) Fix HTTP parser. ([@ixti], [@fxposter]) ## 4.1.1 (2019-03-12) * Add `HTTP::Headers::ACCEPT_ENCODING` constant. ([@ixti]) ## 4.1.0 (2019-03-11) * [#533](httprb/http#533) Add URI normalizer feature that allows to swap default URI normalizer. ([@mamoonraja]) ## 4.0.5 (2019-02-15) * Backport [#532](httprb/http#532) from master. Fix pipes support in request bodies. ([@ixti]) ## 4.0.4 (2019-02-12) * Backport [#506](httprb/http#506) from master. Skip auto-deflate when there is no body. ([@Bonias]) ## 4.0.3 (2019-01-18) * Fix missing URL in response wrapped by auto inflate. ([@ixti]) * Provide `HTTP::Request#inspect` method for debugging purposes. ([@ixti]) ## 4.0.2 (2019-01-15) * [#506](httprb/http#506) Fix instrumentation feature. ([@paul]) ## 4.0.1 (2019-01-14) * [#515](httprb/http#515) Fix `#build_request` and `#request` to respect default options. ([@RickCSong]) ## 4.0.0 (2018-10-15) * [#482](httprb/http#482) [#499](httprb/http#499) Introduce new features injection API with 2 new feaures: instrumentation (compatible with ActiveSupport::Notification) and logging. ([@paul]) * [#473](httprb/http#473) Handle early responses. ([@janko-m]) * [#468](httprb/http#468) Rewind `HTTP::Request::Body#source` once `#each` is complete. ([@ixti]) * [#467](httprb/http#467) Drop Ruby 2.2 support. ([@ixti]) * [#436](httprb/http#436) Raise ConnectionError when writing to socket fails. ([@janko-m]) * [#438](httprb/http#438) Expose `HTTP::Request::Body#source`. ([@janko-m]) * [#446](httprb/http#446) Simplify setting a timeout. ([@mikegee]) * [#451](httprb/http#451) Reduce memory usage when reading response body. ([@janko-m]) * [#458](httprb/http#458) Extract HTTP::Client#build_request method. ([@tycoon]) * [#462](httprb/http#462) Fix HTTP::Request#headline to allow two leading slashes in path. ([@scarfacedeb]) * [#454](httprb/http#454) [#464](httprb/http#464) [#384](httprb/http#384) Fix #readpartial not respecting max length argument. ([@janko-m], [@marshall-lee])
Currently for each chunk read from the socket a new string is being allocated. This is not necessary, because we're feeding that string object into the HTTP parser and never using it again.
So we initialize a buffer string and pass it in when reading chunks. This will make each chunk be read into the same buffer object, no new string will be allocated in that operation.
When profiling download of a 18MB file, before this change there was 48MB of strings being allocated, and after this change it's about 18MB. This is more than 50% reduction in memory allocation.