introduce a compile error for when a result location provides integer widening ambiguity #16310

lerno · 2023-07-03T14:37:32Z

This is a different proposal than #7967. The idea is to give Zig integer promotion semantics that are safe from spooky action at a distance.

The problem

Given some code:

var a: i32 = ...;
var b: i32 = ...;
// Lots of code in between
var c: i32 = a + b; // Overflow exceeding i32

We can break the last line by changing the type of a and b:

var a: i8 = ...;
var b: i8 = ...;
// Lots of code in between
var c: i32 = a + b; // Overflow exceeding i8

Note here that a and b do not need to be variables; they can be things like fields in a struct, possibly an anonymous one resulting from a call.

The lack of safety comes from the binary expression: note that var c: i32 = a; is completely safe. The problem arises only when there is an operation in which the subexpression's type matters. Stated differently, the problem arises whenever the widening may occur in more than one semantically different way. In this case, we either (1) widen a and b to i32 and then add, or (2) first add a and b and then widen the result to i32. This ambiguity is what causes the "spooky action at a distance" when changing a and b.

While increasing the width of a and b to, for example, i64 would cause a compilation error, a narrowing to i8 like in the example is impossible for the compiler to detect, as both possibilities may be desired.
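The two readings can be written out explicitly in today's Zig. This is only a sketch: the function names are made up for illustration, and wrapping addition (`+%`) is used in the second one so that the overflow shows up as a value rather than as a safety panic.

```zig
const std = @import("std");

// Reading (1): widen both operands to i32, then add in i32.
fn widenThenAdd(a: i8, b: i8) i32 {
    return @as(i32, a) + @as(i32, b);
}

// Reading (2): add in i8, then widen the i8 result to i32.
// `+%` wraps on overflow; a plain `+` would trip a safety check instead.
fn addThenWiden(a: i8, b: i8) i32 {
    return @as(i32, a +% b);
}

test "the two widening orders disagree once i8 overflows" {
    try std.testing.expectEqual(@as(i32, 200), widenThenAdd(100, 100));
    try std.testing.expectEqual(@as(i32, -56), addThenWiden(100, 100));
}
```

For operands whose sum fits in i8 the two functions agree, which is why the narrowing change can go unnoticed.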

The proposed solution

First we define "simple" vs "non-simple" expressions. A simple expression is one that can only be widened in one way: a constant, a variable, a call, a dereference, or taking the address of a value. A "non-simple" expression is typically any binary expression, negation, bit negation, ternary, etc. The full list is fairly easy to work out.
When encountering a situation where widening should occur, check whether the (constant-folded) expression is simple or non-simple. Widening a non-simple expression is a compile-time error that can be fixed with a cast.
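As a rough illustration of the rule, here is a toy checker over a hypothetical expression model. `Expr`, `widthOf`, and `checkAssign` are invented for this sketch and have nothing to do with the compiler's actual data structures; signedness and constant folding are left out, and only binary operations are modeled as non-simple.

```zig
const std = @import("std");

// Hypothetical model of a typed expression; not the compiler's real AST.
const Expr = union(enum) {
    // Simple expression (variable, call, dereference, ...) of this bit width.
    simple: u16,
    // Non-simple expression: a binary operation on two sub-expressions.
    binary: [2]*const Expr,
};

const CheckError = error{ AmbiguousWidening, ImplicitNarrowing };

// Returns the bit width of `e`, or AmbiguousWidening if a non-simple
// operand would have to be widened to match its sibling.
fn widthOf(e: *const Expr) CheckError!u16 {
    switch (e.*) {
        .simple => |bits| return bits,
        .binary => |ops| {
            const lhs = try widthOf(ops[0]);
            const rhs = try widthOf(ops[1]);
            if (lhs < rhs and ops[0].* == .binary) return error.AmbiguousWidening;
            if (rhs < lhs and ops[1].* == .binary) return error.AmbiguousWidening;
            return if (lhs > rhs) lhs else rhs;
        },
    }
}

// Checks `dst = e`, where `dst` is `dst_bits` wide.
fn checkAssign(dst_bits: u16, e: *const Expr) CheckError!void {
    const src_bits = try widthOf(e);
    if (src_bits > dst_bits) return error.ImplicitNarrowing;
    if (src_bits < dst_bits and e.* == .binary) return error.AmbiguousWidening;
}

test "c = a + b is rejected, c = b and c = c + b are accepted" {
    const a = Expr{ .simple = 16 };
    const b = Expr{ .simple = 16 };
    const c = Expr{ .simple = 32 };
    const a_plus_b = Expr{ .binary = .{ &a, &b } };
    const c_plus_b = Expr{ .binary = .{ &c, &b } };
    try std.testing.expectError(error.AmbiguousWidening, checkAssign(32, &a_plus_b));
    try checkAssign(32, &b);
    try checkAssign(32, &c_plus_b);
}
```

The check is purely local: each node only looks at the widths of its two operands and whether the narrower one is simple.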

Summary

Disallowing implicit widening of sub expressions seems like a simple change which prevents vulnerabilities in the current Zig integer promotion semantics that cannot be detected by the compiler.

Example

c = a + b * (d + e);

We first start by checking d and e. If they do not have the same size, we widen the narrower one. As d and e are both simple expressions, this is always allowed.
We then check b and (d + e). If b needs to be widened, this is allowed, but if b is wider than (d + e), this is an error, because the non-simple (d + e) would have to be widened.
We then check a and b * (d + e). If a needs to be widened, this is allowed, but if it is wider than b * (d + e), this is an error.
We finally check c and a + b * (d + e). Since a + b * (d + e) is non-simple, we don't allow any widening of it. Nor may it be wider than c, since Zig does not allow implicit narrowing.

Example with constant folding

c = a + b * (1 + 5);

First we constant fold 1 + 5 to 6
b * 6 is resolved by typing 6 to the type of b
a is compared to the type of b * 6 (which is the type of b). If a is wider, it is an error (because b * 6 is not simple); if it is narrower, a is widened.
c is checked for whether it has the type of a + b * 6 (i.e. do c and b have the same type?); otherwise it is an error.

Example with simple expressions

c = b;

If b is narrower than c, widen it. If c is narrower than b, this is an error.


andrewrk · 2023-07-03T18:53:46Z

This proposal lacks behavior test cases. I'll re-open if you add some concrete examples. Give me something I can pass to zig test and find out if the proposal is implemented or not.

lerno · 2023-07-03T19:25:32Z

Test cases showing valid conversions

var a: i16 = 0;
var b: i16 = 0;
var c: i32 = 0;
var d: i32 = 0;
var e: *i16 = &a;
c = b;
b = a * a;
c = b * d;
c = (a + d) * (b + d) - a;
c = e.*;
c = @intFromBool(false);
c = c + b;
c = (c + b) * a;
c = @intCast(b + a);
c = @intCast(b + a * 1);

Invalid conversions according to this proposal:

var a: i16 = 0;
var b: i16 = 0;
var c: i32 = 0;
c = b + a;
//  ^^^^^ Error, i16 + i16 cannot be implicitly widened
c = b + a * 1;
//  ^^^^^^^^^ Error, i16 + (i16 * i16) cannot be implicitly widened
c = c + b * 1;
//      ^^^^^ Error, i16 * i16 cannot be implicitly widened 
c = b << 1; 
//  ^^^^^^ Error, i16 << i1 cannot be implicitly widened
c = -b; 
//  ^^ Error, -i16 cannot be implicitly widened
c = ~b;
//  ^^ Error, ~i16 cannot be implicitly widened
c = c + (b + 1); 
//       ^^^^^ Error, i16 + i16 cannot be implicitly widened
c = b + 1;
//  ^^^^^ Error, i16 + i16 cannot be implicitly widened
c = @intCast(c + (b + 1));
//                ^^^^^ Error, i16 + i16 cannot be implicitly widened
c = (a + b) - c;
//   ^^^^^ Error, i16 + i16 cannot be implicitly widened

I've underlined the part which is detected as invalid, to demonstrate when in the semantic checking process the error is detected.
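For comparison, here is how a few of the rejected lines could be rewritten so that the widening happens before the operation. This is a sketch that compiles under current semantics; the function names are illustrative only.

```zig
const std = @import("std");

// Each function widens an i16 operand first, so the operation itself
// is carried out in i32.
fn addWide(a: i16, b: i16) i32 {
    return @as(i32, b) + a; // `a` is simple, so widening it is unambiguous
}

fn shiftWide(b: i16) i32 {
    return @as(i32, b) << 1;
}

fn negateWide(b: i16) i32 {
    return -@as(i32, b);
}

test "widen-first forms keep the mathematical result" {
    try std.testing.expectEqual(@as(i32, 60000), addWide(30000, 30000));
    try std.testing.expectEqual(@as(i32, 40000), shiftWide(20000));
    try std.testing.expectEqual(@as(i32, 32768), negateWide(-32768));
}
```

All three inputs would overflow i16 if the operation were performed before widening.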

Let me know if this is sufficiently clear.

matklad · 2023-07-04T09:45:42Z

Kind-of naive suggestion, but could we do the following:

compute all intermediate results as if all values are arbitrary precision
emit overflow checks when using intermediate result as a value of a specific type (so, on assignment, function call, or a cast)

?

Implementation wise, this would require looking at the entire arithmetic expression, noting the types of input, computing the least-wide type to hold all intermediate results, and doing arithmetic in that wide type.

The end result would be that, eg, (a + b) / 2 and a + (b - a) / 2 are equivalent.
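A hand-written emulation of that idea, as a sketch: the intermediate sum is computed in u64 and the only check sits at the final narrowing. Under the suggestion the compiler would pick the wide type itself; here it is chosen manually, and unsigned operands with a <= b are assumed so that both forms are mathematically equal. The function names are made up.

```zig
const std = @import("std");

// Midpoint computed in a type wide enough for every intermediate (u64 here).
// The narrowing back to u32 is the only place a check is needed.
fn midpointWide(a: u32, b: u32) u32 {
    const sum = @as(u64, a) + b;
    return @intCast(sum / 2);
}

// Overflow-avoiding form written by hand; assumes a <= b.
fn midpointManual(a: u32, b: u32) u32 {
    return a + (b - a) / 2;
}

test "both midpoint forms agree even when a + b exceeds u32" {
    const a: u32 = 4_000_000_000;
    const b: u32 = 4_200_000_000;
    try std.testing.expectEqual(@as(u32, 4_100_000_000), midpointWide(a, b));
    try std.testing.expectEqual(@as(u32, 4_100_000_000), midpointManual(a, b));
}
```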

I think I've seen such semantics in this post.

lerno · 2023-07-04T11:44:24Z

You can look up AIIR. There are several unsolved practical issues with this. Up to i32/u32, it is pretty ok, but once we use i64, we're flipping over to i128 which has quite different perf characteristics.

In addition, intermediates may have a somewhat unclear type size, which affects bit operations.

My thoughts and attempts at better semantics are collected here. As you can see, I made several aborted attempts before settling on this trade-off.

https://c3.handmade.network/blog/p/7651-overflow_trapping_in_practice
https://c3.handmade.network/blog/p/7656-c3__handling_casts_and_overflows_part_1
https://c3.handmade.network/blog/p/7661-c3__handling_casts_and_overflows_part_2
https://c3.handmade.network/blog/p/8138-fixing_bugs_in_our_proposal
https://c3.handmade.network/blog/p/8134-attempting_new_c3_type_conversion_semantics

lerno · 2023-07-04T20:25:18Z

Also @matklad, your proposal is similar to the proposal I retracted: #7967

judofyr · 2023-07-07T09:10:16Z

Test cases showing valid conversions

var a: i16 = 0;
var b: i16 = 0;
var c: i32 = 0;
var d: i32 = 0;
var e: *i16 = &a;

Do I understand this correctly that c = @intCast(b + a) will do the addition in i16 and then widen the answer? If so, I'm not sure if this is actually explicit enough to handle the case described. I suspect that users will just "blindly" add @intCast without actually realizing the implications. I can imagine the following scenario:

I start by defining a and b to be i16.
I try c = a + b, but I'll get a compiler error (due to the proposed solution here).
I tweak it into c = @intCast(a + b) (since that's suggested in the manual). Now my code compiles and I'm happy.
Turns out that a and b are always <127 so I realize I can save some bytes by changing them to i8.
Oops! Now my code overflows!

If you have an expression like c = a + b (where a and b are a narrower type than c) isn't it also more likely that you'd like to do the calculation in the type of c? With this proposal I would have to write c = @as(i32, a) + b which is a bit clunky (especially since I'd have to repeat the type of c).

lerno · 2023-07-07T12:15:25Z

So to me the main thing is to prevent the silent error of starting with this:

var a: i32 = 0;
var b: i32 = 0;
...
var c: i32 = a + b;

Then going to

var a: i16 = 0;
var b: i16 = 0;
...
var c: i32 = a + b;

Without the compiler requiring a cast.

It is correct that once the cast is written, the detection is broken. I think this might be inevitable. After all, a cast essentially means "silence this error". So if one applies a very broad cast, then yes that will create a bug down the line. But as you noted, it's also possible to do the right thing.

We could envision a clunky cast that explicitly says what we're converting from for extra safety:

var c: i32 = @intCastFrom(i16, a + b);

But I do think that widening a is the safe solution here.
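The @intCastFrom idea can be approximated in userland today. The helper below is hypothetical, not a Zig builtin; it takes the destination type as an extra parameter because an ordinary function cannot see its result type, and it refuses to compile if the operand's type is not the declared source type.

```zig
const std = @import("std");

// Hypothetical helper, not part of Zig. If `a + b` later becomes i8
// instead of i16, the call site stops compiling instead of silently
// changing where the overflow happens.
fn intCastFrom(comptime From: type, comptime To: type, value: anytype) To {
    if (@TypeOf(value) != From) {
        @compileError("expected " ++ @typeName(From) ++ ", found " ++ @typeName(@TypeOf(value)));
    }
    return @intCast(value);
}

test "intCastFrom accepts the declared source type" {
    const a: i16 = 1000;
    const b: i16 = 2000;
    const c = intCastFrom(i16, i32, a + b);
    try std.testing.expectEqual(@as(i32, 3000), c);
}
```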

andrewrk · 2023-07-09T17:57:04Z

I've taken a look at the example cases. I think it could be a nice compromise, if #3806 does not work out. However, if it works the way I want it to, then all of these lines that have proposed compile errors will become fully safe arithmetic that preserves the mathematical value of all operands and results. I want to explore if we can get to a place where there are no hidden safety checks on arithmetic; they would all be at @intCast sites (or other similar builtins such as @enumFromInt).

AssortedFantasy · 2023-07-10T01:03:52Z

@matklad

See #7416 by spexguy. It's basically the same idea: make it so intermediate operations don't overflow; it's only when the final assignment occurs that overflow is considered.

In my opinion this is a much more sane (for the programmer) way to go about mathematics.

I think it should be possible for additions. But maybe not multiplication and division.

Think about it like this: we recently fixed the comparison operators to work for every integer type. < > don't care what signed or unsigned nonsense is going on; they just do what is mathematically correct. (They work like https://en.cppreference.com/w/cpp/utility/intcmp)

lerno · 2023-07-10T16:27:39Z

@AssortedFantasy signed<->unsigned comparison is trivial to solve with minimal overhead. Overflow is a problem of a different magnitude and cannot be solved the same way.

lerno mentioned this issue Jul 3, 2023

Change zig int promotion / cast rules #7967

Closed

andrewrk added the proposal label Jul 3, 2023

andrewrk added this to the 0.12.0 milestone Jul 3, 2023

andrewrk closed this as not planned Jul 3, 2023

andrewrk reopened this Jul 3, 2023

andrewrk mentioned this issue Jul 4, 2023

allow integer types to be any range #3806

Open

andrewrk changed the title ~~New integer promotion semantics for Zig~~ introduce a compile error for when a result location provides integer widening ambiguity Jul 9, 2023

mlugg mentioned this issue Dec 7, 2024

Proposal: remove Peer Type Resolution from the language #22182

Open

mlugg modified the milestones: 0.14.0, 0.15.0 Feb 5, 2025
