Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Calls to BitSlice::set_aliased and BitSlice::set_aliased_unchecked on Cell<u8> is deleted completely by optimizer in release mode #283

Open
royaltm opened this issue Dec 6, 2024 · 3 comments

Comments

@royaltm
Copy link

royaltm commented Dec 6, 2024

Hello @myrrlyn

bitvec: v1.0.1

I think I stumbled upon a strange and nasty bug, where the outcome is different in rustc DEV versus RELEASE compilation profile.

The minimal test example which passes in DEV while fails in RELEASE:

use std::cell::Cell;
use bitvec::prelude::*;

#[test]
fn test_cell_bitvec() {
    let cell: Cell<u8> = Cell::new(0);

    for index in 0..8 {
        assert_eq!(*cell.view_bits::<Lsb0>().get(index).unwrap(), false);
    }

    for index in 0..8 {
        cell.view_bits::<Lsb0>().set_aliased(index, true);
    }

    for index in 0..8 {
        assert_eq!(*cell.view_bits::<Lsb0>().get(index).unwrap(), true);
    }
}

#[test]
fn test_cell_bitvec_unchecked() {
    let cell: Cell<u8> = Cell::new(0);

    for index in 0..8 {
        assert_eq!(*unsafe { cell.view_bits::<Lsb0>().get_unchecked(index) }, false);
    }

    for index in 0..8 {
        unsafe { cell.view_bits::<Lsb0>().set_aliased_unchecked(index, true) }
    }

    for index in 0..8 {
        assert_eq!(*unsafe { cell.view_bits::<Lsb0>().get_unchecked(index) }, true);
    }
}

When tested with --release flag both tests fail as if set_alias and set_alias_unchecked was never called:

running 2 tests
test tests::test_cell_bitvec ... FAILED
test tests::test_cell_bitvec_unchecked ... FAILED

failures:

---- tests::test_cell_bitvec stdout ----
thread 'tests::test_cell_bitvec' panicked at src/main.rs:36:13:
assertion `left == right` failed
  left: false
 right: true
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace

---- tests::test_cell_bitvec_unchecked stdout ----
thread 'tests::test_cell_bitvec_unchecked' panicked at src/main.rs:53:13:
assertion `left == right` failed
  left: false
 right: true


failures:
    tests::test_cell_bitvec
    tests::test_cell_bitvec_unchecked

test result: FAILED. 0 passed; 2 failed; 0 ignored; 0 measured; 0 filtered out; finished in 0.00s

If I replace Cell<u8> with any other primitive type: Cell<u16|u32|u64|usize> the test does not fail in RELEASE mode. This only happens for Cell<u8>!

The only workaround I found so far is to replace Cell<u8> with AtomicU8 which is not ideal as I'm using it in a single threaded environment.

This happens regardless of the platform, as I first detected the problem on thumbv7em but later I confirmed the same is happening on x86_64.

The bug is quite nasty and can give you headaches when trying to debug this, especially that it works fine in DEV and only manifests when optimizations are turned on.

Moreover it does not matter what kind of opt-level is enabled as long as it's equal to or above opt-level = 1 for profile.release.

@zachs18
Copy link

zachs18 commented Feb 7, 2025

This appears to have started happening with LLVM 16.

cargo bisect-rustc result
searched toolchains nightly-2022-09-17 through nightly-2024-04-28


********************************************************************************
Regression in nightly-2023-06-15
********************************************************************************

fetching https://static.rust-lang.org/dist/2023-06-14/channel-rust-nightly-git-commit-hash.txt
nightly manifest 2023-06-14: 40 B / 40 B [===================================================================================================================] 100.00 % 542.65 KB/s converted 2023-06-14 to 371994e0d8380600ddda78ca1be937c7fb179b49
fetching https://static.rust-lang.org/dist/2023-06-15/channel-rust-nightly-git-commit-hash.txt
nightly manifest 2023-06-15: 40 B / 40 B [===================================================================================================================] 100.00 % 951.88 KB/s converted 2023-06-15 to 8c74a5d27c644a0f7a22bb2fa8dd3ff8257bc220
looking for regression commit between 2023-06-14 and 2023-06-15
fetching (via remote github) commits from max(371994e0d8380600ddda78ca1be937c7fb179b49, 2023-06-12) to 8c74a5d27c644a0f7a22bb2fa8dd3ff8257bc220
ending github query because we found starting sha: 371994e0d8380600ddda78ca1be937c7fb179b49
get_commits_between returning commits, len: 9
  commit[0] 2023-06-13: Auto merge of #112314 - ferrocene:pa-core-alloc-abort, r=bjorn3
  commit[1] 2023-06-13: Auto merge of #112062 - lukas-code:unsized-layout, r=wesleywiser
  commit[2] 2023-06-14: Auto merge of #112448 - nnethercote:no-tiny-cgus, r=wesleywiser
  commit[3] 2023-06-14: Auto merge of #112609 - matthiaskrgr:rollup-er6weld, r=matthiaskrgr
  commit[4] 2023-06-14: Auto merge of #110662 - bryangarza:safe-transmute-reference-types, r=compiler-errors
  commit[5] 2023-06-14: Auto merge of #112400 - WaffleLapkin:vtable_stats, r=compiler-errors
  commit[6] 2023-06-14: Auto merge of #112418 - ferrocene:pa-mir-opt-panic, r=ozkanonur,saethlin
  commit[7] 2023-06-14: Auto merge of #112624 - matthiaskrgr:rollup-db6ta1b, r=matthiaskrgr
  commit[8] 2023-06-14: Auto merge of #112625 - matthiaskrgr:rollup-jcobj3g, r=matthiaskrgr
ERROR: no CI builds available between 371994e0d8380600ddda78ca1be937c7fb179b49 and 8c74a5d27c644a0f7a22bb2fa8dd3ff8257bc220 within last 167 days

The CI artifacts from that far back are gone, and I'm not going to build LLVM locally to test which of these may have caused the difference, but rust-lang/rust#112062 seems relevant maybe? Or maybe rust-lang/rust#112448 which changed codegen-unit partitioning.

Actually, the codegen-unit partition change made me think to try the code with RUSTFLAGS="-C codegen-units=1, which makes the bisection go to nightly-2023-03-19, which contained the first try LLVM bump rust-lang/rust#107224 that was reverted in rust-lang/rust#109326 and the re-landed in rust-lang/rust#109474 .

Manually trying all nightlies in that range gives the expected result for this being caused by the LLVM 16 bump:

  • nightly-2023-03-18 and before: success (before first try LLVM 16 bump)
  • nightly-2023-03-19: fail (after first bump, before revert)
  • nightly-2023-03-20 through nightly-2023-03-25: success (after revert, before re-land)
  • nightly-2023-03-26 and after: fail (after second try LLVM 16 bump)

A simpler example that doesn't require -C codegen-units=1

use std::cell::Cell;
use bitvec::prelude::*;

#[inline(never)]
#[no_mangle]
pub extern "C" fn do_the_thing(cell: &Cell<u8>) {
    cell.view_bits::<Lsb0>().set_aliased(0, true);
}

In release mode, in the "success" nightlies as listed above, this code compiles to:

do_the_thing:
  or byte ptr [rdi], 1
  ret

and in the "fail" nightlies, compiles to:

do_the_thing:
  ret

@zachs18
Copy link

zachs18 commented Feb 7, 2025

Changing _mem: [()] to _mem: [UnsafeCell<()>] in BitSlice seems to fix the possible miscompilation, but makes Miri mad under Stacked Borrows.

@zachs18
Copy link

zachs18 commented Feb 8, 2025

I was going to post the following as an issue on rust-lang/rust, but now I think this is probably not a miscompilation but UB because bitvec::BitSlice has only non-interior-mutable fields, even if T: Radium and thus the BitSlice should be writable.
Leaving this here for extra info:

would-be Rust issue

I tried this code: (using any of bitvec 1.0.1 from crates.io or the current main branch at https://github.com/ferrilab/bitvec or https://github.com/ferrilab/ferrilab).

use std::cell::Cell;
use bitvec::prelude::*;

#[inline(never)]
#[no_mangle]
pub extern "C" fn do_the_thing(cell: &Cell<u8>) {
    cell.view_bits::<Lsb0>().set_aliased(0, true);
}

I expected to see this happen: In release mode, in the "success" nightlies as listed below, this code compiles to:

do_the_thing:
  or byte ptr [rdi], 1
  ret

Instead, this happened: in the "fail" nightlies (and since), this compiles to:

do_the_thing:
  ret
Original linked issue

#283

Put in a lib.rs in a cargo workspace and run with RUSTFLAGS="-Ccodegen-units=1" cargo +toolchain test --release

use std::cell::Cell;
use bitvec::prelude::*;

#[test]
fn test_cell_bitvec() {
    let cell: Cell<u8> = Cell::new(0);

    for index in 0..8 {
        assert_eq!(*cell.view_bits::<Lsb0>().get(index).unwrap(), false);
    }

    for index in 0..8 {
        cell.view_bits::<Lsb0>().set_aliased(index, true);
    }

    for index in 0..8 {
        assert_eq!(*cell.view_bits::<Lsb0>().get(index).unwrap(), true);
    }
}

#[test]
fn test_cell_bitvec_unchecked() {
    let cell: Cell<u8> = Cell::new(0);

    for index in 0..8 {
        assert_eq!(*unsafe { cell.view_bits::<Lsb0>().get_unchecked(index) }, false);
    }

    for index in 0..8 {
        unsafe { cell.view_bits::<Lsb0>().set_aliased_unchecked(index, true) }
    }

    for index in 0..8 {
        assert_eq!(*unsafe { cell.view_bits::<Lsb0>().get_unchecked(index) }, true);
    }
}

I expected to see this happen: Either the tests pass Miri and suceeds in both debug and release mode, or the tests don't pass Miri.

Instead, this happened: The tests pass Miri (though bitvec does use integer-to-pointer casts, which Miri points out may cause it to miss UB). and succeed in debug mode, but fail in release mode since the LLVM 16 bump (depending on -Ccodegen-units).

bitvec basically uses &[()] references, so if those shrink provenance to zero bytes or restrict provenance to read-only, then this is probably UB and not a miscompilation. CC rust-lang/unsafe-code-guidelines#314, rust-lang/unsafe-code-guidelines#256, etc.

Meta

  • nightly-2023-03-18 and before: success (before first try LLVM 16 bump)
  • nightly-2023-03-19: fail (after first bump, before revert)
  • nightly-2023-03-20 through nightly-2023-03-25: success (after revert, before re-land)
  • nightly-2023-03-26 and after: fail (after second try LLVM 16 bump)

nightly-2023-03-18 version:

rustc 1.70.0-nightly (13afbdaa0 2023-03-17)
binary: rustc
commit-hash: 13afbdaa0655dda23d7129e59ac48f1ec88b2084
commit-date: 2023-03-17
host: x86_64-unknown-linux-gnu
release: 1.70.0-nightly
LLVM version: 15.0.7

nightly-2023-03-26 version:

rustc 1.70.0-nightly (0c61c7a97 2023-03-25)
binary: rustc
commit-hash: 0c61c7a978fe9f7b77a1d667c77d2202dadd1c10
commit-date: 2023-03-25
host: x86_64-unknown-linux-gnu
release: 1.70.0-nightly
LLVM version: 16.0.0

Without -Ccodegen-units=1, this only started failing later (probably due to #112448):

First nightly with failing cargo test --release without -Ccodegen-units=1:

rustc 1.72.0-nightly (8c74a5d27 2023-06-14)
binary: rustc
commit-hash: 8c74a5d27c644a0f7a22bb2fa8dd3ff8257bc220
commit-date: 2023-06-14
host: x86_64-unknown-linux-gnu
release: 1.72.0-nightly
LLVM version: 16.0.5
cargo bisect-rustc output

cargo bisect-rustc --script bisect-script.sh --start 1.71.0 --end 1.73.0

#!/bin/bash
#!/bin/bash
cargo test --release
exit $?
searched toolchains nightly-2023-05-27 through nightly-2023-08-20


********************************************************************************
Regression in nightly-2023-06-15
********************************************************************************

fetching https://static.rust-lang.org/dist/2023-06-14/channel-rust-nightly-git-commit-hash.txt
nightly manifest 2023-06-14: 40 B / 40 B [===================================================================================================================] 100.00 % 402.12 KB/s converted 2023-06-14 to 371994e0d8380600ddda78ca1be937c7fb179b49
fetching https://static.rust-lang.org/dist/2023-06-15/channel-rust-nightly-git-commit-hash.txt
nightly manifest 2023-06-15: 40 B / 40 B [===================================================================================================================] 100.00 % 530.32 KB/s converted 2023-06-15 to 8c74a5d27c644a0f7a22bb2fa8dd3ff8257bc220
looking for regression commit between 2023-06-14 and 2023-06-15
fetching (via remote github) commits from max(371994e0d8380600ddda78ca1be937c7fb179b49, 2023-06-12) to 8c74a5d27c644a0f7a22bb2fa8dd3ff8257bc220
ending github query because we found starting sha: 371994e0d8380600ddda78ca1be937c7fb179b49
get_commits_between returning commits, len: 9
  commit[0] 2023-06-13: Auto merge of #112314 - ferrocene:pa-core-alloc-abort, r=bjorn3
  commit[1] 2023-06-13: Auto merge of #112062 - lukas-code:unsized-layout, r=wesleywiser
  commit[2] 2023-06-14: Auto merge of #112448 - nnethercote:no-tiny-cgus, r=wesleywiser
  commit[3] 2023-06-14: Auto merge of #112609 - matthiaskrgr:rollup-er6weld, r=matthiaskrgr
  commit[4] 2023-06-14: Auto merge of #110662 - bryangarza:safe-transmute-reference-types, r=compiler-errors
  commit[5] 2023-06-14: Auto merge of #112400 - WaffleLapkin:vtable_stats, r=compiler-errors
  commit[6] 2023-06-14: Auto merge of #112418 - ferrocene:pa-mir-opt-panic, r=ozkanonur,saethlin
  commit[7] 2023-06-14: Auto merge of #112624 - matthiaskrgr:rollup-db6ta1b, r=matthiaskrgr
  commit[8] 2023-06-14: Auto merge of #112625 - matthiaskrgr:rollup-jcobj3g, r=matthiaskrgr
ERROR: no CI builds available between 371994e0d8380600ddda78ca1be937c7fb179b49 and 8c74a5d27c644a0f7a22bb2fa8dd3ff8257bc220 within last 167 days

Also: rust-lang/unsafe-code-guidelines#256 (comment)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants