-
Notifications
You must be signed in to change notification settings - Fork 59
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Collection of cases of deliberate UB #158
Comments
Unwinding through FFI boundaries is used in |
smallvec deliberately triggers UB as a hint for the compiler that the code is unreachable: https://github.com/servo/rust-smallvec/blob/3ee6d1e3e5a8b5a6eafa5f396c397d5038a9c97e/lib.rs#L130-L139 Some people are confused about why it's done that way: servo/rust-smallvec#158 |
That does not look like triggering UB to me. Like, no UB actually happens. By "triggering UB" I mean to actually hit that execution path (on the Abstract Machine). It's not UB it it's in an |
For the bytes example, perhaps link this LLVM bug report about optimizations that, if implemented, would probably remove the penalty for relaxed atomics compared to normal loads. |
So, here's something that's definitely UB that I just re-discovered. Should probably fix it, though I'm not sure if I believe this makes it okay? @@ -90,19 +90,21 @@ impl Handle {
use std::mem;
use std::ptr;
unsafe {
- let mut stack_key: (DataType, DataType) = mem::uninitialized();
+ let mut stack_key: (mem::MaybeUninit<DataType>, mem::MaybeUninit<DataType>) =
+ unsafe { (mem::MaybeUninit::uninit(), mem::MaybeUninit::uninit()) };
ptr::copy_nonoverlapping(
&key[0] as *const DataType,
- &mut stack_key.0 as *mut DataType,
+ stack_key.0.as_mut_ptr(),
1,
);
ptr::copy_nonoverlapping(
&key[1] as *const DataType,
- &mut stack_key.1 as *mut DataType,
+ stack_key.1.as_mut_ptr(),
1,
);
+ let stack_key =
+ unsafe { mem::transmute::<_, &(DataType, DataType)>(&stack_key) };
let v = h.meta_get_and(&stack_key, then);
- mem::forget(stack_key);
v
}
} |
Yeah, they should just use That's not really "crate causes UB because it cannot be avoided", it's more "crate hasn't been updated to |
Not sure what you mean? Why would transmuting the value ( |
it's just unnecessary to transmute references, and usually preferred to do raw pointer casts instead. And usually I see people doing value-transmutes for the "assume_init" step. So I wondered why you followed a different style. Either way though, this doesn't look like a candidate for the list above. There is a good alternative available to this crate, and AFAIK the only reason they are not doing this is that the crate hasn't been updated to follow the Rust-1.36+ style. |
I have recently tried to reduce the amount of UB in Deserializing an "abomonated" object entails fixing dangling pointers inside of it to have them point into other data in the serialized bytes. By definition, objects with dangling pointers are not valid, so constructing a reference to them is not okay according to our current UB rules. But I have no idea how to, say, access the inside of an invalid In the case of structs, this will be resolved by See the FIXMEs in https://github.com/TimelyDataflow/abomonation/pull/22/files for more details. |
That is only true for pointers at reference type. Like, a dangling I think a concrete example would help.
Why should tuples be any different than structs?
For slices, given a |
You can find many examples near the "FIXME" comments of the linked pull request. Regarding tuples, the problem is basically that it is difficult to generate the sequence tup.0, tup.1, tup.2... in a declarative macro, and that the most commonly used workaround uses references. This one might be soluble by introducing a private procedural macro utility, but the associated splitting in two crates will make the code even more obscure. I think some sort of variadic generics is needed to cleanly resolve this, but improvements to macros might do the trick as well. For slices, the problem is that abomonation gives me, say, a |
Ah, yes -- we definitely need a
The issue is, that's in the middle of tons of code that I don't know anything about. I was asking for something self-contained -- in typical "minimal reproducible example" style. My guess is, you have a |
Also, can we continue that discussion in #77 ? That fits better for this particular case of UB. |
Another example of |
Notice that such code is only used in |
Would llvm's unordered atomics help here? |
Well... "help" for some notion of help. See this forum discussion about "unordered". |
The Ruffle project uses an unsound custom DST (implement as a |
That's unsound under stacked borrows, but not under other potential Rust aliasing models — some of the discussions in #256 mention ways it (stacked borrows) might be fixed to accommodate such code. |
1: Remove miri hack r=taiki-e a=taiki-e Use currently use a hack to avoid rust-lang/rust#69488 and to make sure that Miri errors for atomic load/store of integers containing uninitialized bytes (which is probably not a problem and uncharted territory at best [1] [2] [3], and can be detected by `-Zmiri-check-number-validity` [4]), do not mask Miri errors for the use of uninitialized bytes (which is definitely a problem). https://github.com/taiki-e/atomic-memcpy/blob/3507fef17534e4825b2b303d04702b4678e29dd0/src/lib.rs#L426-L450 [1]: crossbeam-rs/crossbeam#315 [2]: rust-lang/unsafe-code-guidelines#158 [3]: rust-lang/unsafe-code-guidelines#71 [4]: rust-lang/miri#1904 However, this actually causes another "unsupported operation" Miri error. ``` error: unsupported operation: unable to turn pointer into raw bytes --> /Users/taiki/.rustup/toolchains/nightly-x86_64-apple-darwin/lib/rustlib/src/rust/library/core/src/ptr/mod.rs:701:9 | 701 | copy_nonoverlapping(src, tmp.as_mut_ptr(), 1); | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ unable to turn pointer into raw bytes | = help: this is likely not a bug in the program; it indicates that the program performed an operation that the interpreter does not support ``` Co-authored-by: Taiki Endo <te316e89@gmail.com>
I'll turn this into a markdown file in the repo, and also add this. |
Sometimes crates deliberately cause UB because there is no other way to do what they need, or because the alternatives are unacceptably worse in some sense. This is not great, but it is valuable feedback for us -- we should try to either make those things not UB, or provide UB-free alternatives that are good enough!
Some cases:
AtomicCell
does various things with uninitialized memory. Most of this would be fine if we allowed uninitialized integers and adapted LLVM's handling of data races, butcompare_exchange
is worse.bytes
does a non-atomic plain load that races, because relaxed loads cost too much performance. (Note that LLVM's handling of data races is not enough here, data races still return garbage data. Also see this thread on using "unordered".)offset_of
macro everywhere, in particularmemoffset
field-offset
mozjpeg-sys
and possibly other image-related libs. One of many long discussions on the issue is here.In this issue, please let's only discuss whether something is UB according to current rules. if you want to talk about ways to improve the code or the spec to avoid UB, open a new issue (or there might already be one).
The text was updated successfully, but these errors were encountered: