Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Do not merge yet] Remove alloc_jemalloc, switch the default global allocator to System #52020

Closed
wants to merge 5 commits into from

Conversation

SimonSapin
Copy link
Contributor

Fixes #36963

Do not merge yet. This PR by itself is likely to regress rustc performance. The purpose of opening it now is to measure by how much. We’ll likely want to figure out #51038 and land them around the same time.

@SimonSapin SimonSapin added A-allocators Area: Custom and system allocators T-libs-api Relevant to the library API team, which will review and decide on the PR/issue. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. relnotes Marks issues that should be documented in the release notes of the next release. S-blocked Status: Blocked on something else such as an RFC or other implementation work. S-waiting-on-perf Status: Waiting on a perf run to be completed. labels Jul 3, 2018
@rust-highfive
Copy link
Collaborator

r? @eddyb

(rust_highfive has picked a reviewer for you, use r? to override)

@rust-highfive rust-highfive added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Jul 3, 2018
@SimonSapin
Copy link
Contributor Author

@bors try

@Mark-Simulacrum, could you run this on perf.rlo?

@bors
Copy link
Contributor

bors commented Jul 3, 2018

⌛ Trying commit 54fa990 with merge 91eac4d...

bors added a commit that referenced this pull request Jul 3, 2018
[Do not merge yet] Remove alloc_jemalloc, switch the default global allocator to System

Fixes #36963

**Do not merge** yet. This PR by itself is likely to regress rustc performance. The purpose of opening it now is to measure by how much. We’ll likely want to figure out #51038 and land them around the same time.
@cuviper
Copy link
Member

cuviper commented Jul 3, 2018

Please specifically note the versions of the system allocator (presumably glibc) when reporting performance. If possible, it would be nice to have results for both current and older glibc, say Fedora 28 versus RHEL/CentOS 7. If there's a way I can help generate results, let me know -- I don't have hardware to dedicate to this, but I can do manual runs for this evaluation.

@SimonSapin
Copy link
Contributor Author

Indeed, results on perf.rust-lang.org are presumably collected on Linux with a single version of glibc. Measuring impact on macOS would be nice too.

@bors
Copy link
Contributor

bors commented Jul 3, 2018

☀️ Test successful - status-travis
State: approved= try=True

@Mark-Simulacrum
Copy link
Member

Perf results, once ready, will have been run with Ubuntu GLIBC 2.23-0ubuntu10.

@cuviper
Copy link
Member

cuviper commented Jul 3, 2018

OK. The most notable change for glibc came in 2.26:
https://sourceware.org/ml/libc-alpha/2017-01/msg00524.html

For Ubuntu, that would be in artful (17.10) and later.

@Mark-Simulacrum
Copy link
Member

We might be able to upgrade the collector -- it seems like a relatively decent idea to have recent software on it, and there is a new LTS of Ubuntu (18.04) so shouldn't be too difficult.

@SimonSapin
Copy link
Contributor Author

In this case, getting both sets of results would be interesting.

@SimonSapin
Copy link
Contributor Author

So I tried running the benchmarks on a mac and got this:

thread 'main' panicked at 'assertion failed: has_perf', collector/src/bin/rustc-perf-collector/execute.rs:313:9

I assume has_perf refers to Linux’s perf tool. Is there a mode for the data collector to get the subset of available measures, possibly only wall clock time?

I also ran it on an Ubuntu 18.04 machine with the libc6 package at version 2.27-3ubuntu1 and an i7 CPU with 4 cores / 8 threads at 4.6 GHz. http://localhost:2346/compare.html?start=64f8ae08fd5294718466a61f74f6de0135b88035&end=91eac4dd1e41932c605075305758d84406abfaf4 does show results, but there’s so much green (where I would have expected at least some red) that I wonder if I’m comparing data sets measured on different machines.

compare	2018-07-03 (64f8ae08)	2000-01-01 (91eac4dd)	% change
 
tokio-webpush-simple-opt
	avg: -22.2%	min: -65.7%	max: -0.4%
unify-linearly-opt
	avg: -16.1%	min: -26.7%	max: -12.0%
helloworld-opt
	avg: -17.2%	min: -22.2%	max: -15.1%
helloworld-debug
	avg: -16.3%	min: -17.5%	max: -15.5%
unify-linearly-debug
	avg: -13.7%	min: -16.3%	max: -12.1%
issue-46449-opt
	avg: -5.1%	min: -9.3%	max: -0.8%
coercions-opt
	avg: -5.3%?	min: -8.9%?	max: -3.2%?
unused-warnings-opt
	avg: -6.3%	min: -6.9%	max: -5.3%
unused-warnings-debug
	avg: -5.9%	min: -6.9%	max: -5.2%
coercions-debug
	avg: -4.6%?	min: -6.4%?	max: -3.6%?
html5ever-opt
	avg: -3.3%	min: -5.4%	max: -1.5%
html5ever-debug
	avg: -3.5%	min: -5.3%	max: -1.6%
html5ever-check
	avg: -3.7%	min: -5.2%	max: -1.7%
deep-vector-debug
	avg: -2.9%	min: -4.4%	max: -2.1%
deep-vector-opt
	avg: -2.3%	min: -4.4%	max: -1.4%
tuple-stress-opt
	avg: -2.3%	min: -3.8%	max: -0.0%
script-servo-debug
	avg: -1.6%	min: -3.7%	max: -0.7%
coercions-check
	avg: -2.6%?	min: -3.7%?	max: -1.7%?
tuple-stress-check
	avg: -1.9%	min: -3.3%	max: -0.0%
style-servo-debug
	avg: -2.1%?	min: -3.2%?	max: -0.9%?
tokio-webpush-simple-debug
	avg: -1.8%	min: -3.1%	max: -1.2%
tuple-stress-debug
	avg: -2.0%	min: -3.1%	max: -0.0%
sentry-cli-opt
	avg: -1.2%	min: -3.0%	max: -0.4%
inflate-opt
	avg: -1.9%	min: -2.9%	max: -1.2%
encoding-debug
	avg: -1.6%	min: -2.7%	max: -0.7%
ripgrep-opt
	avg: -1.3%	min: -2.7%	max: -0.4%
unused-warnings-check
	avg: -2.3%	min: -2.7%	max: -1.8%
style-servo-opt
	avg: -1.0%?	min: -2.7%?	max: -0.4%?
syn-debug
	avg: -1.7%	min: -2.6%	max: -1.0%
issue-46449-debug
	avg: -1.8%	min: -2.5%	max: -1.5%
ripgrep-debug
	avg: -1.5%	min: -2.5%	max: -0.6%
regression-31157-opt
	avg: -1.2%	min: -2.5%	max: -0.1%
regex-debug
	avg: -1.4%	min: -2.4%	max: -0.3%
syn-opt
	avg: -0.7%	min: -2.3%	max: 1.9%
deep-vector-check
	avg: -1.9%	min: -2.3%	max: -1.6%
regex-opt
	avg: -1.1%	min: -2.2%	max: -0.3%
sentry-cli-check
	avg: -1.9%	min: -2.1%	max: -1.5%
inflate-debug
	avg: -0.1%	min: -2.1%	max: 0.6%
encoding-check
	avg: -1.7%	min: -2.0%	max: -1.6%
regression-31157-debug
	avg: -1.3%	min: -2.0%	max: -0.7%
encoding-opt
	avg: -1.0%	min: -2.0%	max: -0.2%
ripgrep-check
	avg: -1.6%	min: -2.0%	max: -1.3%
futures-debug
	avg: -1.3%	min: -1.9%	max: -0.9%
serde-debug
	avg: -1.2%	min: -1.9%	max: -0.9%
deeply-nested-debug
	avg: -1.1%	min: -1.9%	max: -0.8%
issue-46449-check
	avg: -1.6%	min: -1.9%	max: -1.4%
piston-image-debug
	avg: -1.1%	min: -1.9%	max: -0.3%
serde-opt
	avg: -1.2%	min: -1.8%	max: -0.8%
regression-31157-check
	avg: -1.3%	min: -1.8%	max: -0.9%
sentry-cli-debug
	avg: -1.0%	min: -1.8%	max: -0.4%
regex-check
	avg: -1.7%	min: -1.8%	max: -1.5%
syn-check
	avg: -1.6%	min: -1.8%	max: -1.4%
deeply-nested-check
	avg: -1.2%	min: -1.7%	max: -0.9%
futures-opt
	avg: -1.2%	min: -1.7%	max: -0.8%
tokio-webpush-simple-check
	avg: -1.4%	min: -1.7%	max: -1.3%
cargo-opt
	avg: -0.7%	min: -1.7%	max: -0.4%
deeply-nested-opt
	avg: -0.6%	min: -1.6%	max: -0.1%
cargo-check
	avg: -1.1%	min: -1.6%	max: -0.8%
webrender-check
	avg: -1.2%	min: -1.6%	max: -1.1%
webrender-debug
	avg: -1.0%	min: -1.5%	max: -0.4%
piston-image-opt
	avg: -0.9%	min: -1.5%	max: -0.3%
style-servo-check
	avg: -1.2%?	min: -1.5%?	max: -0.9%?
clap-rs-opt
	avg: -0.8%	min: -1.4%	max: -0.5%
futures-check
	avg: -1.1%	min: -1.4%	max: -0.8%
piston-image-check
	avg: -1.3%	min: -1.4%	max: -1.1%
clap-rs-check
	avg: -0.9%	min: -1.4%	max: -0.5%
unify-linearly-check
	avg: -1.0%	min: -1.4%	max: -0.6%
cargo-debug
	avg: -0.8%	min: -1.4%	max: -0.5%
webrender-opt
	avg: -1.0%	min: -1.3%	max: -0.7%
helloworld-check
	avg: -1.0%	min: -1.3%	max: -0.6%
inflate-check
	avg: 0.7%	min: -0.9%	max: 1.3%
script-servo-check
	avg: -0.5%	min: -1.2%	max: -0.0%
serde-check
	avg: -0.8%	min: -1.0%	max: -0.5%
clap-rs-debug
	avg: -0.6%	min: -0.8%	max: -0.4%
script-servo-opt
	avg: -0.5%	min: -0.7%	max: -0.2%

@SimonSapin
Copy link
Contributor Author

Ah, this was the instructions:u metric. wall-time shows a lot of both red and green, but still large differences in absolute numbers:

tokio-webpush-simple-opt
	avg: -12.4%	min: -41.5%	max: 3.2%
issue-46449-check
	avg: 2.7%	min: -0.5%	max: 23.6%
unify-linearly-opt
	avg: -9.9%	min: -15.2%	max: -7.4%
coercions-opt
	avg: -5.3%?	min: -14.0%?	max: -2.0%?
helloworld-opt
	avg: -10.7%	min: -13.3%	max: -9.3%
script-servo-debug
	avg: 8.8%	min: 5.9%	max: 11.5%
helloworld-debug
	avg: -10.5%	min: -11.2%	max: -10.0%
webrender-opt
	avg: 5.6%	min: 2.4%	max: 10.6%
tuple-stress-check
	avg: 5.0%	min: 2.3%	max: 10.6%
unify-linearly-debug
	avg: -8.6%	min: -10.1%	max: -7.5%
style-servo-debug
	avg: 6.7%?	min: 5.6%?	max: 8.7%?
futures-debug
	avg: 3.6%	min: 0.7%	max: 8.2%
futures-opt
	avg: 3.6%	min: 0.5%	max: 8.2%
unused-warnings-debug
	avg: -6.0%	min: -7.2%	max: -4.9%
encoding-opt
	avg: 3.2%	min: 1.1%	max: 7.1%
tuple-stress-debug
	avg: 2.4%	min: 0.4%	max: 6.9%
cargo-debug
	avg: 4.1%	min: -1.8%	max: 6.6%
tuple-stress-opt
	avg: 2.3%	min: -0.7%	max: 6.4%
regex-check
	avg: 2.0%	min: -0.9%	max: 6.1%
unused-warnings-opt
	avg: -5.6%	min: -6.1%	max: -4.7%
cargo-opt
	avg: 4.5%	min: 3.2%	max: 5.8%
script-servo-opt
	avg: 3.0%	min: -1.6%	max: 5.7%
inflate-opt
	avg: 4.3%	min: 1.1%	max: 5.7%
deep-vector-check
	avg: -3.6%	min: -5.6%	max: -1.4%
inflate-check
	avg: 3.7%	min: 1.8%	max: 5.5%
style-servo-opt
	avg: 4.0%?	min: 2.4%?	max: 5.3%?
piston-image-debug
	avg: 3.2%	min: 1.4%	max: 5.2%
clap-rs-opt
	avg: 3.9%	min: 1.1%	max: 5.1%
coercions-debug
	avg: -2.2%?	min: -5.1%?	max: 1.6%?
webrender-debug
	avg: 0.0%	min: -4.7%	max: 5.0%
tokio-webpush-simple-check
	avg: 2.5%	min: 1.7%	max: 5.0%
webrender-check
	avg: 3.0%	min: 1.2%	max: 4.9%
piston-image-opt
	avg: 3.2%	min: 1.3%	max: 4.8%
serde-opt
	avg: 4.3%	min: 3.9%	max: 4.7%
serde-check
	avg: 4.3%	min: 3.6%	max: 4.7%
script-servo-check
	avg: 3.8%	min: 2.8%	max: 4.7%
clap-rs-debug
	avg: 3.5%	min: 0.9%	max: 4.7%
deep-vector-opt
	avg: 3.0%	min: -2.8%	max: 4.7%
ripgrep-opt
	avg: 1.9%	min: -1.3%	max: 4.6%
ripgrep-check
	avg: 2.5%	min: 1.0%	max: 4.4%
serde-debug
	avg: 4.1%	min: 3.8%	max: 4.4%
style-servo-check
	avg: 3.4%?	min: 2.7%?	max: 4.2%?
regression-31157-opt
	avg: 2.0%	min: -0.6%	max: 4.0%
syn-opt
	avg: 2.7%	min: 1.1%	max: 4.0%
sentry-cli-opt
	avg: 2.5%	min: 0.5%	max: 3.9%
regression-31157-debug
	avg: 2.4%	min: 0.6%	max: 3.9%
regex-opt
	avg: 2.3%	min: 0.9%	max: 3.9%
sentry-cli-debug
	avg: 2.3%	min: -0.2%	max: 3.8%
sentry-cli-check
	avg: 3.0%	min: 2.0%	max: 3.7%
html5ever-debug
	avg: 1.9%	min: 1.0%	max: 3.5%
coercions-check
	avg: -2.2%?	min: -3.5%?	max: -1.0%?
cargo-check
	avg: 2.7%	min: 2.2%	max: 3.5%
syn-debug
	avg: 2.1%	min: 0.8%	max: 3.5%
futures-check
	avg: 2.3%	min: 1.1%	max: 3.5%
piston-image-check
	avg: 2.5%	min: 1.5%	max: 3.3%
unused-warnings-check
	avg: -2.5%	min: -3.3%	max: -2.1%
html5ever-opt
	avg: 2.0%	min: 0.9%	max: 3.3%
regex-debug
	avg: 1.9%	min: 0.8%	max: 3.1%
issue-46449-opt
	avg: 0.5%	min: -0.6%	max: 3.1%
deep-vector-debug
	avg: -2.1%	min: -3.1%	max: -1.2%
syn-check
	avg: 2.8%	min: 2.6%	max: 3.0%
inflate-debug
	avg: 2.1%	min: 1.0%	max: 3.0%
tokio-webpush-simple-debug
	avg: 1.8%	min: -1.3%	max: 2.8%
clap-rs-check
	avg: 2.0%	min: 1.3%	max: 2.7%
encoding-check
	avg: 1.9%	min: 0.9%	max: 2.6%
regression-31157-check
	avg: 1.4%	min: -0.6%	max: 2.6%
ripgrep-debug
	avg: 0.9%	min: -0.7%	max: 2.6%
encoding-debug
	avg: 1.8%	min: 1.0%	max: 2.4%
deeply-nested-opt
	avg: 1.6%	min: 0.0%	max: 2.3%
deeply-nested-debug
	avg: 0.9%	min: 0.0%	max: 1.6%
helloworld-check
	avg: -0.4%	min: -1.6%	max: 0.0%
issue-46449-debug
	avg: 0.8%	min: -0.4%	max: 1.5%
html5ever-check
	avg: 1.1%	min: 1.0%	max: 1.2%
unify-linearly-check
	avg: 0.1%	min: -0.5%	max: 0.7%
deeply-nested-check
	avg: 0.1%	min: -0.6%	max: 0.5%

@SimonSapin
Copy link
Contributor Author

@SimonSapin
Copy link
Contributor Author

TL;DR of the numbers: depending on what is being compiled in what mode, compilation wall clock time difference due do disabling jemalloc is between -42.5% (time nearly halved!) and +10.6%

@alexcrichton
Copy link
Member

@SimonSapin oh the perf links you gisted actually were for a pretty wide range of commits, but this comparison URL is just for the jemalloc changes.

From that URL some conclusions are:

  • Tiny programs tend to compile faster (0.17s -> 0.15s) ish
  • Big programs compile a little slower (187s -> 211s for script servo debug)

Overall I think the losses here outweigh the gains (as the gains are all very small and only on small programs) and we probably want to stick with jemalloc on Linux for now. It'd be interesting though to compare with the glibc 2.26 allocator as well though!

@nnethercote
Copy link
Contributor

I agree with @alexcrichton. The ones that improved were toy or stress programs; all the more interesting benchmarks regressed :(

@michaelwoerister
Copy link
Member

I agree with @alexcrichton and @nnethercote, we should not merge this as is. Thanks a lot for measuring though, @SimonSapin!

@mati865
Copy link
Contributor

mati865 commented Jul 9, 2018

For redis glibc 2.26 can make huge difference

It should be revaluated again once perf.rust-lang.org is updated to something more recent.

@michaelwoerister
Copy link
Member

For redis glibc 2.26 can make huge difference

Those numbers look great but wouldn't that mean that people on older systems would still experience a noticeable regression?

@mati865
Copy link
Contributor

mati865 commented Jul 9, 2018

Those numbers look great but wouldn't that mean that people on older systems would still experience a noticeable regression?

Unfortunately that's the case however distributions like Ubuntu, Fedora, openSUSE/SUSE already provide Rust built without jemalloc in their repositories.

There is no need to hurry but at some the focus should be shifted for more recent systems.

@steveklabnik
Copy link
Member

steveklabnik commented Jul 9, 2018 via email

@SimonSapin
Copy link
Contributor Author

@steveklabnik There are people using rustc packaged by their distro, but also users of various distros with various versions of glibc using our rustc builds through rustup.

Re binary size I assume you mean changing default allocator for Rust programs to System, #36963. I think there’s strong consensus to do that, but not to do it without also making the compiler continue to use jemalloc somehow. So I think we’ll end up merging most of the changes in this PR eventually, but likely not before fixing #51038.

@steveklabnik
Copy link
Member

@SimonSapin ah; yes, that was what I was talking about; I got my threads mixed up. Thanks.

@SimonSapin
Copy link
Contributor Author

So I think we’ll end up merging most of the changes in this PR eventually, but likely not before fixing #51038.

There’s no point keeping this PR open in the meantime, closing to make triage easier. Please direct conversation to #51038 or #36963 as appropriate.

@SimonSapin SimonSapin closed this Jul 9, 2018
@SimonSapin SimonSapin deleted the no-jemalloc branch July 9, 2018 21:42
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-allocators Area: Custom and system allocators relnotes Marks issues that should be documented in the release notes of the next release. S-blocked Status: Blocked on something else such as an RFC or other implementation work. S-waiting-on-perf Status: Waiting on a perf run to be completed. S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. T-libs-api Relevant to the library API team, which will review and decide on the PR/issue.
Projects
None yet
Development

Successfully merging this pull request may close these issues.