-
-
Notifications
You must be signed in to change notification settings - Fork 2.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
introduce a compile error for when a result location provides integer widening ambiguity #16310
Comments
This proposal lacks behavior test cases. I'll re-open if you add some concrete examples. Give me something I can pass to |
Test cases showing valid conversions var a: i16 = 0;
var b: i16 = 0;
var c: i32 = 0;
var d: i32 = 0;
var e: *i16 = &a;
c = b;
b = a * a;
c = b * d;
c = (a + d) * (b + d) - a;
c = e.*;
c = @intFromBool(false);
c = c + b;
c = (c + b) * a;
c = @intCast(b + a);
c = @intCast(b + a * 1); Invalid conversions according to this proposal: var a: i16 = 0;
var b: i16 = 0;
var c: i32 = 0;
c = b + a;
// ^^^^^ Error, i16 + i16 cannot be implicitly widened
c = b + a * 1;
// ^^^^^^^^^ Error, i16 + (i16 + i16) cannot be implicitly widened
c = c + b * 1;
// ^^^^^ Error, i16 * i16 cannot be implicitly widened
c = b << 1;
// ^^^^^^ Error, i16 << i1 cannot be implicitly widened
c = -b;
// ^^ Error, -i16 cannot be implicitly widened
c = ~b;
// ^^ Error, ~i16 cannot be implicitly widened
c = c + (b + 1);
// ^^^^^ Error, i16 + i16 cannot be implicitly widened
c = b + 1;
// ^^^^^ Error, i16 + i16 cannot be implicitly widened
c = @intCast(c + (b + 1));
// ^^^^^ Error, i16 + i16 cannot be implicitly widened
c = (a + b) - c;
// ^^^^^ Error, i16 + i16 cannot be implicitly widened I've underlined the part which is detected as invalid, to demonstrate when in the semantic checking process the error is detected. Let me know if this is sufficiently clear. |
Kinda-of naive suggestion, but could we do the following:
? Implementation wise, this would require looking at the entire arithmetic expression, noting the types of input, computing the least-wide type to hold all intermediate results, and doing arithmetic in that wide type. The end result would be that, eg, I think I seen such semantics in this post. |
You can look up AIIR. There are several unsolved practical issues with this. Up to i32/u32, it is pretty ok, but once we use i64, we're flipping over to i128 which has quite different perf characteristics. In addition to this the fact that intermediates may have somewhat unclear type size, affecting bit operations. My thoughts and attempts at better semantics are collected here, as you see I made several aborted attempts before settling with this trade-off. https://c3.handmade.network/blog/p/7651-overflow_trapping_in_practice |
Do I understand this correctly that
If you have an expression like |
So to me the main thing is to prevent the silent error of starting with this: var a: i32 = 0;
var b: i32 = 0;
...
var c: i32 = a + b; Then going to var a: i16 = 0;
var b: i16 = 0;
...
var c: i32 = a + b; Without the compiler requiring a cast. It is correct that once the cast is written, the detection is broken. I think this might be inevitable. After all, a cast essentially means "silence this error". So if one applies a very broad cast, then yes that will create a bug down the line. But as you noted, it's also possible to do the right thing. We could envision a clunky cast that explicitly says what we're converting from for extra safety: var c: i32 = @intCastFrom(i16, a + b); But I do think that widening |
I've taken a look at the example cases. I think it could be a nice compromise, if #3806 does not work out. However, if it works the way I want it to, then all of these lines that have proposed compile errors will become fully safe arithmetic that preserves the mathematical value of all operands and results. I want to explore if we can get to a place where there is no hidden safety checks on arithmetic, they would all be at |
See #7416 by spexguy. It's basically the same idea. Make it so intermediate operations don't overflow its only when the final assignment occurs is overflow considered. In my opinion this is a much more sane (for the programmer) way to go about mathematics. I think it should be possible for additions. But maybe not multiplication and division. Think about it like this: we recently fixed the comparision operators to work for every integer type. < > don't care what signed or unsigned nonsense is going on they just do what is mathematically correct. (They work like https://en.cppreference.com/w/cpp/utility/intcmp) |
@AssortedFantasy signed<->unsigned comparison is trivial to solve with minimal overhead. Overflow is a problem of a different magnitude and cannot be solved the same way. |
This is a different proposal than #7967. The idea is to give Zig integer promotion semantics that is safe from spooky action at a distance.
The problem
Given some code:
We can break the last line by changing the type of
a
andb
Note here that
a
andb
does not need to be variables, but can be things like fields in a struct, possibly an anonymous one resulting from a call.The lack of safety comes from the binary expression, we note that
var c: i32 = a;
is completely safe. It is only when there is an operation in which the subexpression's type matters that the problem arises. Stated in a different way, the problem arises whenever the widening may occur in more than one semantically different way. So in this case, we either (1) widena
andb
to i32, then add or (2) first adda
andb
then widen to i32. The ambiguity is what causes the "spooky action at a distance" when changinga
andb
.While increasing the width of
a
andb
to - for example - i64 would cause a compilation error, a narrowing to i16 like in the example is impossible for the compiler to detect, as both possibilities may be desired.The proposed solution
Other solutions
1. Last step widening
This is the status quo, corresponding to always doing the widening at the last step, when it's needed. In this case, it would mean
a + b
is done first, then the widening. This is what C does as well, but there is a big difference: C first does widening to int/unsigned, which makes this only dangerous if the type widened to is wider than int, i.e. size_t, long, long long etc. Because this is less common, it's a less common problem. Nevertheless, this is a source of security vulnerabilities and bugs.2. Push down type widening
In this solution, the widening type is "pushed down" into the sub expressions. In our case above, this would mean
a
andb
are first widened, then added, then the widening is applied. While this solves the issue at hand, it also creates some very subtle changes to an expression when the left hand side is widened. For example, say we wantc = a << b
and we deliberately pickeda
to be u16? If so then we must now try to explicitly opt out of the widening toc
's type. Pushing down the type is also not 100% trivial.C3 tested this type of widening and the killer problem was how to understand the actual type of sub expressions and whether they would be top down widened or not. It was not retained.
Summary
Disallowing implicit widening of sub expressions seems like a simple change which prevents vulnerabilities in the current Zig integer promotion semantics that cannot be detected by the compiler.
Example
d
ande
. If they do not have the same size, we try to widen the other expression. Asd
ande
are both simple expressions. This is always allowed.b
and(d + e)
ifb
needs to be widened, this is allowed, but ifb
is wider than(d + e)
this is an error.a
andb * (d + e)
, ifa
needs to be widened, this is allowed, but if it is wider thanb * (d + e)
this is an error.c
anda + b * (d + e)
, sincea + b * (d + e)
is non-simple, we don't allow any widening of it. Nor may it be wider thanc
, since Zig does not allow implicit narrowing.Example with constant folding
b * 6
is resolved by typing 6 to the type ofb
a
is compared to the type ofb * 6
(which is the type ofb
) if it is wider it is an error (becauseb * 6
is not simple), if it is more narrow,a
is widened.c
is checked whether it has the type ofa + b * 6
(i.e. doesc
andb
have the same type) otherwise it is an error.Example with simple expressions
b
is more narrow thanc
widen it. Ifc
is more narrow thanb
then this is an error.The text was updated successfully, but these errors were encountered: