-
Notifications
You must be signed in to change notification settings - Fork 13.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
rustdoc-json: switch from HashMap to FxHashMap to fix non-determinism #108626
Conversation
r? @jsha (rustbot has picked a reviewer for you, use r? to override) |
rustdoc-json-types is a public (although nightly-only) API. If possible, consider changing |
e921fe9
to
02b32e6
Compare
3f26808
to
098cf6d
Compare
I'm a bit nervous about the performance impact of this, as it changes the core data structure of rustdoc-json. Unfortunately we don't have any automated way to see if this actually matters (cc rust-lang/rustc-perf#1512) Could you use a |
I don't think this is true? FxHashMap is just changing the default hasher for performance, it doesn't affect the order. I vaguely recall there's a StableHasher type of something but it's not the same as the FxHasher. |
The relevant difference is that two FxHashMaps with the same order of inserts will always iterate in the same order, as they don't randomize the hash seed (unlike |
As part of my performance work on I found that using This doesn't say that we shouldn't replace If anyone would like to repeat my experiments, I'd be happy to help you get set up. And if |
I didn't want to involve For the performance concerns, here is a basic demonstration from my desktop machine. using `BTreeMap`Build completed successfully in 0:00:15
Performance counter stats for '../../rust/x doc --stage 1 std --json':
15,873.45 msec task-clock:u # 1.000 CPUs utilized
0 context-switches:u # 0.000 K/sec
0 cpu-migrations:u # 0.000 K/sec
419,374 page-faults:u # 0.026 M/sec
62,660,633,799 cycles:u # 3.948 GHz (83.38%)
512,794,289 stalled-cycles-frontend:u # 0.82% frontend cycles idle (83.39%)
80,806,547 stalled-cycles-backend:u # 0.13% backend cycles idle (83.38%)
82,024,064,071 instructions:u # 1.31 insn per cycle
# 0.01 stalled cycles per insn (83.36%)
16,035,947,363 branches:u # 1010.237 M/sec (83.38%)
201,578,639 branch-misses:u # 1.26% of all branches (83.32%)
15.867989639 seconds time elapsed
14.909949000 seconds user
0.826553000 seconds sys using `FxHashMap`Build completed successfully in 0:00:15
Performance counter stats for '../../rust/x doc --stage 1 std --json':
15,548.93 msec task-clock:u # 1.000 CPUs utilized
0 context-switches:u # 0.000 K/sec
0 cpu-migrations:u # 0.000 K/sec
423,520 page-faults:u # 0.027 M/sec
61,458,365,022 cycles:u # 3.953 GHz (83.37%)
505,423,215 stalled-cycles-frontend:u # 0.82% frontend cycles idle (83.37%)
710,815,053 stalled-cycles-backend:u # 1.16% backend cycles idle (83.40%)
81,826,900,373 instructions:u # 1.33 insn per cycle
# 0.01 stalled cycles per insn (83.31%)
15,968,492,682 branches:u # 1026.984 M/sec (83.35%)
201,022,039 branch-misses:u # 1.26% of all branches (83.40%)
15.542970326 seconds time elapsed
14.672585000 seconds user
0.740135000 seconds sys
using `HashMap`Build completed successfully in 0:00:15
Performance counter stats for '../../rust/x doc --stage 1 std --json':
15,617.23 msec task-clock:u # 1.000 CPUs utilized
0 context-switches:u # 0.000 K/sec
0 cpu-migrations:u # 0.000 K/sec
425,083 page-faults:u # 0.027 M/sec
61,592,835,803 cycles:u # 3.944 GHz (83.39%)
490,297,782 stalled-cycles-frontend:u # 0.80% frontend cycles idle (83.37%)
727,341,887 stalled-cycles-backend:u # 1.18% backend cycles idle (83.38%)
81,851,581,044 instructions:u # 1.33 insn per cycle
# 0.01 stalled cycles per insn (83.37%)
15,975,646,762 branches:u # 1022.950 M/sec (83.38%)
202,179,644 branch-misses:u # 1.27% of all branches (83.31%)
15.609654475 seconds time elapsed
14.611848000 seconds user
0.869682000 seconds sys Those runs were on the precompiled bootstrap/compiler/library. All went directly to generating the json files. Machine specs Regarding to the results, |
It's fine, as this is only ment for use in the rust-lang/rust repo. The reexport I maintain for crates.io already handles this. While it's important that this library can be used on crates.io and stable, adding |
I will push it using |
These commits modify the If this was intentional then you can ignore this comment. |
278535c
to
df061c8
Compare
I'm not sure what the policy on this is. I've asked on zulip |
Turns out it's fine. Thanks for fixing this. @bors r+ |
Signed-off-by: ozkanonur <work@onurozkan.dev>
3f8d3ab
to
52c71e6
Compare
r=me when CI's green |
Could not assign reviewer from: |
@bors r+ |
…iaskrgr Rollup of 7 pull requests Successful merges: - rust-lang#106440 (Ignore files in .gitignore in tidy) - rust-lang#108613 (Remove `llvm.skip-rebuild` option) - rust-lang#108616 (Sync codegen defaults with compiler defaults and add a ping message so they stay in sync) - rust-lang#108618 (Rename `src/etc/vscode_settings.json` to `rust_analyzer_settings.json`) - rust-lang#108626 (rustdoc-json: switch from HashMap to FxHashMap to fix non-determinism) - rust-lang#108744 (Don't ICE when encountering bound var in builtin copy/clone bounds) - rust-lang#108749 (Clean up rustdoc-js tester.js file) Failed merges: r? `@ghost` `@rustbot` modify labels: rollup
`rustc_data_structures` rust-lang/rust#108626
The original motivation was me trying to remove the `#![allow(rustc::default_hash_types)]`, as after rust-lang#108626, we should be using `FxHashMap` here. I then realized I should also be able to remove the `.into_iter().collect()`, as we no longer need to convert from `FxHashMap` to `std::HashMap`. However, this didn't work, and I got the following error ``` error[E0308]: mismatched types --> src/librustdoc/json/mod.rs:235:13 | 235 | index, | ^^^^^ expected `rustc_hash::FxHasher`, found `FxHasher` | = note: `FxHasher` and `rustc_hash::FxHasher` have similar names, but are actually distinct types note: `FxHasher` is defined in crate `rustc_hash` --> /cargo/registry/src/index.crates.io-6f17d22bba15001f/rustc-hash-1.1.0/src/lib.rs:60:1 note: `rustc_hash::FxHasher` is defined in crate `rustc_hash` --> /home/alona/.cargo/registry/src/index.crates.io-6f17d22bba15001f/rustc-hash-1.1.0/src/lib.rs:60:1 | 60 | pub struct FxHasher { | ^^^^^^^^^^^^^^^^^^^ = note: perhaps two different versions of crate `rustc_hash` are being used? ``` This is because `librustdoc` got it's `FxHashMap` via the sysroot `rustc-data-strucures`, whereas `rustdoc-json-types` got it via `rustc-hash` on crates.io. To avoid this, `rustdoc-json-types` now uses `#![feature(rustc_private)]` to load the same version as `librustdoc`. However, this needs to be placed behind a feature, as `rustdoc-json-types` is also dependency of `src/tools/jsondocck`, which means need needs to be buildable without nightly features.
Use same `FxHashMap` in `rustdoc-json-types` and `librustdoc`. The original motivation was me trying to remove the `#![allow(rustc::default_hash_types)]`, as after rust-lang#108626, we should be using `FxHashMap` here. I then realized I should also be able to remove the `.into_iter().collect()`, as we no longer need to convert from `FxHashMap` to `std::HashMap`. However, this didn't work, and I got the following error ``` error[E0308]: mismatched types --> src/librustdoc/json/mod.rs:235:13 | 235 | index, | ^^^^^ expected `rustc_hash::FxHasher`, found `FxHasher` | = note: `FxHasher` and `rustc_hash::FxHasher` have similar names, but are actually distinct types note: `FxHasher` is defined in crate `rustc_hash` --> /cargo/registry/src/index.crates.io-6f17d22bba15001f/rustc-hash-1.1.0/src/lib.rs:60:1 note: `rustc_hash::FxHasher` is defined in crate `rustc_hash` --> /home/alona/.cargo/registry/src/index.crates.io-6f17d22bba15001f/rustc-hash-1.1.0/src/lib.rs:60:1 | 60 | pub struct FxHasher { | ^^^^^^^^^^^^^^^^^^^ = note: perhaps two different versions of crate `rustc_hash` are being used? ``` This is because `librustdoc` got it's `FxHashMap` via the sysroot `rustc-data-strucures`, whereas `rustdoc-json-types` got it via `rustc-hash` on crates.io. To avoid this, `rustdoc-json-types` now uses `#![feature(rustc_private)]` to load the same version as `librustdoc`. However, this needs to be placed behind a feature, as `rustdoc-json-types` is also dependency of `src/tools/jsondocck`, which means need needs to be buildable without nightly features. r? `@jyn514`
The original motivation was me trying to remove the `#![allow(rustc::default_hash_types)]`, as after rust-lang#108626, we should be using `FxHashMap` here. I then realized I should also be able to remove the `.into_iter().collect()`, as we no longer need to convert from `FxHashMap` to `std::HashMap`. However, this didn't work, and I got the following error ``` error[E0308]: mismatched types --> src/librustdoc/json/mod.rs:235:13 | 235 | index, | ^^^^^ expected `rustc_hash::FxHasher`, found `FxHasher` | = note: `FxHasher` and `rustc_hash::FxHasher` have similar names, but are actually distinct types note: `FxHasher` is defined in crate `rustc_hash` --> /cargo/registry/src/index.crates.io-6f17d22bba15001f/rustc-hash-1.1.0/src/lib.rs:60:1 note: `rustc_hash::FxHasher` is defined in crate `rustc_hash` --> /home/alona/.cargo/registry/src/index.crates.io-6f17d22bba15001f/rustc-hash-1.1.0/src/lib.rs:60:1 | 60 | pub struct FxHasher { | ^^^^^^^^^^^^^^^^^^^ = note: perhaps two different versions of crate `rustc_hash` are being used? ``` This is because `librustdoc` got it's `FxHashMap` via the sysroot `rustc-data-strucures`, whereas `rustdoc-json-types` got it via `rustc-hash` on crates.io. To avoid this, `rustdoc-json-types` now uses `#![feature(rustc_private)]` to load the same version as `librustdoc`. However, this needs to be placed behind a feature, as `rustdoc-json-types` is also dependency of `src/tools/jsondocck`, which means need needs to be buildable without nightly features.
Use same `FxHashMap` in `rustdoc-json-types` and `librustdoc`. The original motivation was me trying to remove the `#![allow(rustc::default_hash_types)]`, as after rust-lang#108626, we should be using `FxHashMap` here. I then realized I should also be able to remove the `.into_iter().collect()`, as we no longer need to convert from `FxHashMap` to `std::HashMap`. However, this didn't work, and I got the following error ``` error[E0308]: mismatched types --> src/librustdoc/json/mod.rs:235:13 | 235 | index, | ^^^^^ expected `rustc_hash::FxHasher`, found `FxHasher` | = note: `FxHasher` and `rustc_hash::FxHasher` have similar names, but are actually distinct types note: `FxHasher` is defined in crate `rustc_hash` --> /cargo/registry/src/index.crates.io-6f17d22bba15001f/rustc-hash-1.1.0/src/lib.rs:60:1 note: `rustc_hash::FxHasher` is defined in crate `rustc_hash` --> /home/alona/.cargo/registry/src/index.crates.io-6f17d22bba15001f/rustc-hash-1.1.0/src/lib.rs:60:1 | 60 | pub struct FxHasher { | ^^^^^^^^^^^^^^^^^^^ = note: perhaps two different versions of crate `rustc_hash` are being used? ``` This is because `librustdoc` got it's `FxHashMap` via the sysroot `rustc-data-strucures`, whereas `rustdoc-json-types` got it via `rustc-hash` on crates.io. To avoid this, `rustdoc-json-types` now uses `#![feature(rustc_private)]` to load the same version as `librustdoc`. However, this needs to be placed behind a feature, as `rustdoc-json-types` is also dependency of `src/tools/jsondocck`, which means need needs to be buildable without nightly features. r? `@jyn514`
Use same `FxHashMap` in `rustdoc-json-types` and `librustdoc`. The original motivation was me trying to remove the `#![allow(rustc::default_hash_types)]`, as after rust-lang#108626, we should be using `FxHashMap` here. I then realized I should also be able to remove the `.into_iter().collect()`, as we no longer need to convert from `FxHashMap` to `std::HashMap`. However, this didn't work, and I got the following error ``` error[E0308]: mismatched types --> src/librustdoc/json/mod.rs:235:13 | 235 | index, | ^^^^^ expected `rustc_hash::FxHasher`, found `FxHasher` | = note: `FxHasher` and `rustc_hash::FxHasher` have similar names, but are actually distinct types note: `FxHasher` is defined in crate `rustc_hash` --> /cargo/registry/src/index.crates.io-6f17d22bba15001f/rustc-hash-1.1.0/src/lib.rs:60:1 note: `rustc_hash::FxHasher` is defined in crate `rustc_hash` --> /home/alona/.cargo/registry/src/index.crates.io-6f17d22bba15001f/rustc-hash-1.1.0/src/lib.rs:60:1 | 60 | pub struct FxHasher { | ^^^^^^^^^^^^^^^^^^^ = note: perhaps two different versions of crate `rustc_hash` are being used? ``` This is because `librustdoc` got it's `FxHashMap` via the sysroot `rustc-data-strucures`, whereas `rustdoc-json-types` got it via `rustc-hash` on crates.io. To avoid this, `rustdoc-json-types` now uses `#![feature(rustc_private)]` to load the same version as `librustdoc`. However, this needs to be placed behind a feature, as `rustdoc-json-types` is also dependency of `src/tools/jsondocck`, which means need needs to be buildable without nightly features. r? ``@jyn514``
The original motivation was me trying to remove the `#![allow(rustc::default_hash_types)]`, as after rust-lang#108626, we should be using `FxHashMap` here. I then realized I should also be able to remove the `.into_iter().collect()`, as we no longer need to convert from `FxHashMap` to `std::HashMap`. However, this didn't work, and I got the following error ``` error[E0308]: mismatched types --> src/librustdoc/json/mod.rs:235:13 | 235 | index, | ^^^^^ expected `rustc_hash::FxHasher`, found `FxHasher` | = note: `FxHasher` and `rustc_hash::FxHasher` have similar names, but are actually distinct types note: `FxHasher` is defined in crate `rustc_hash` --> /cargo/registry/src/index.crates.io-6f17d22bba15001f/rustc-hash-1.1.0/src/lib.rs:60:1 note: `rustc_hash::FxHasher` is defined in crate `rustc_hash` --> /home/alona/.cargo/registry/src/index.crates.io-6f17d22bba15001f/rustc-hash-1.1.0/src/lib.rs:60:1 | 60 | pub struct FxHasher { | ^^^^^^^^^^^^^^^^^^^ = note: perhaps two different versions of crate `rustc_hash` are being used? ``` This is because `librustdoc` got it's `FxHashMap` via the sysroot `rustc-data-strucures`, whereas `rustdoc-json-types` got it via `rustc-hash` on crates.io. To avoid this, `rustdoc-json-types` now uses `#![feature(rustc_private)]` to load the same version as `librustdoc`. However, this needs to be placed behind a feature, as `rustdoc-json-types` is also dependency of `src/tools/jsondocck`, which means need needs to be buildable without nightly features.
The original motivation was me trying to remove the `#![allow(rustc::default_hash_types)]`, as after rust-lang#108626, we should be using `FxHashMap` here. I then realized I should also be able to remove the `.into_iter().collect()`, as we no longer need to convert from `FxHashMap` to `std::HashMap`. However, this didn't work, and I got the following error ``` error[E0308]: mismatched types --> src/librustdoc/json/mod.rs:235:13 | 235 | index, | ^^^^^ expected `rustc_hash::FxHasher`, found `FxHasher` | = note: `FxHasher` and `rustc_hash::FxHasher` have similar names, but are actually distinct types note: `FxHasher` is defined in crate `rustc_hash` --> /cargo/registry/src/index.crates.io-6f17d22bba15001f/rustc-hash-1.1.0/src/lib.rs:60:1 note: `rustc_hash::FxHasher` is defined in crate `rustc_hash` --> /home/alona/.cargo/registry/src/index.crates.io-6f17d22bba15001f/rustc-hash-1.1.0/src/lib.rs:60:1 | 60 | pub struct FxHasher { | ^^^^^^^^^^^^^^^^^^^ = note: perhaps two different versions of crate `rustc_hash` are being used? ``` This is because `librustdoc` got it's `FxHashMap` via the sysroot `rustc-data-strucures`, whereas `rustdoc-json-types` got it via `rustc-hash` on crates.io. To avoid this, `rustdoc-json-types` now uses `#![feature(rustc_private)]` to load the same version as `librustdoc`. However, this needs to be placed behind a feature, as `rustdoc-json-types` is also dependency of `src/tools/jsondocck`, which means need needs to be buildable without nightly features.
The original motivation was me trying to remove the `#![allow(rustc::default_hash_types)]`, as after rust-lang#108626, we should be using `FxHashMap` here. I then realized I should also be able to remove the `.into_iter().collect()`, as we no longer need to convert from `FxHashMap` to `std::HashMap`. However, this didn't work, and I got the following error ``` error[E0308]: mismatched types --> src/librustdoc/json/mod.rs:235:13 | 235 | index, | ^^^^^ expected `rustc_hash::FxHasher`, found `FxHasher` | = note: `FxHasher` and `rustc_hash::FxHasher` have similar names, but are actually distinct types note: `FxHasher` is defined in crate `rustc_hash` --> /cargo/registry/src/index.crates.io-6f17d22bba15001f/rustc-hash-1.1.0/src/lib.rs:60:1 note: `rustc_hash::FxHasher` is defined in crate `rustc_hash` --> /home/alona/.cargo/registry/src/index.crates.io-6f17d22bba15001f/rustc-hash-1.1.0/src/lib.rs:60:1 | 60 | pub struct FxHasher { | ^^^^^^^^^^^^^^^^^^^ = note: perhaps two different versions of crate `rustc_hash` are being used? ``` This is because `librustdoc` got it's `FxHashMap` via the sysroot `rustc-data-strucures`, whereas `rustdoc-json-types` got it via `rustc-hash` on crates.io. To avoid this, `rustdoc-json-types` now uses `#![feature(rustc_private)]` to load the same version as `librustdoc`. However, this needs to be placed behind a feature, as `rustdoc-json-types` is also dependency of `src/tools/jsondocck`, which means need needs to be buildable without nightly features.
Using
HashMap
inrustdoc_json_types::Crate
were causing creating randomly ordered objects in the json doc files. Which might cause problems to people who are doing comparison on those files specially in CI pipelines. See #103785 (comment)This PR fixes that issue and extends the coverage of
tests/run-make/rustdoc-verify-output-files
testing ability.