-
Notifications
You must be signed in to change notification settings - Fork 4.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
unreliable codegen around vbroadcastss in .NET 8 release builds #96156
Comments
Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch Issue DetailsDescriptionCode that runs fine in .NET 7 fails in .NET 8. Reproduction StepsThis affects a couple helper methods which have been in place for years.
Expected behaviorNumerical simulations remain correct. Actual behaviorSimulations experience numerical divergence as timesteps proceed, ultimately either completing with wrong results or crashing on Regression?Yes. .NET 8 debug builds work fine. Flipping projects back to .NET 7 removes this problem. Rolling either of the affected repos I've detected so far back to .NET 7 and then moving projects forward to .NET 8 with no other changes also reproduces the error. Known WorkaroundsUse of Configuration.NET 8, Win 10 22H2, AMD Zen 3. Yes, specific to .NET 8 and x64 intrinsics. Other informationIn .NET 7 the disassembly is
In .NET 8 it's of the form
However, the JIT from the apparent super quick workaround of
appears to be functionally identical, both locally
and in the surrounding call sites I've checked. After several hours' investigation my current best guess is there's some small fraction out of hundreds of calls where I'm not particularly concerned about this issue—looking at what .NET 8 JIT does, removing the entire helper class as legacy code and changing to inline
|
@twest820 do you have a repro project? |
Not a simplified repro at this time, unfortunately. But, if this isn't something going sideways on my dev machine, the There's another repo that's also affected (at commit 80ca9ef) since it contains the same legacy helper methods. We don't have permission to release the data the tests run on, though, so it'd be harder to work with. |
@tannergooding @dotnet/avx512-contrib |
We have encountered similar issue. After switching to .NET8, code that was using |
@twest820 sorry it has taken us so long to look into this. Is this the kind of failure you are seeing with 8.0?
|
@threbejk if you have a repro you can share that would also be helpful. |
The bug in the original repro seems to be the call at line 84 of Codegen (good on left, bad on left) shows the The issue is still present in .Net9. Looks like the broadcast ends up getting marked as contained by the multiply and then gets dropped entirely. |
Some forms of |
FF15CF029600 call [BayesianPG.Extensions.DateTimeExtensions:DaysInMonth(System.DateTime):int]
C5F857C0 vxorps xmm0, xmm0, xmm0
C5FA2AC0 vcvtsi2ss xmm0, xmm0, eax
C5FA104D0C vmovss xmm1, dword ptr [rbp+0x0C]
C5F25EC0 vdivss xmm0, xmm1, xmm0
C5FA11457C vmovss dword ptr [rbp+0x7C], xmm0
488B4B48 mov rcx, gword ptr [rbx+0x48]
488B4958 mov rcx, gword ptr [rcx+0x58]
488B4960 mov rcx, gword ptr [rcx+0x60]
418BD4 mov edx, r12d
8B85EC020000 mov eax, dword ptr [rbp+0x2EC]
448BC0 mov r8d, eax
C58859457C vmulps xmm0, xmm14, xmmword ptr [rbp+0x7C] Seems like the broadcast versions require AVX512, and we're not checking for that. Relevant code is Looks like this changed in #84821 |
Simple repro vs 9p6 using System;
using System.Runtime.Intrinsics;
using System.Runtime.Intrinsics.X86;
using System.Runtime.CompilerServices;
class Runtime_96156
{
[MethodImpl(MethodImplOptions.AggressiveInlining)]
public unsafe static Vector128<float> BroadcastScalarToVector128(float value)
{
return Avx.BroadcastScalarToVector128(&value);
}
public static void Main()
{
Vector128<float> c = Vector128.Create(1.0f);
Vector128<float> r = Problem(2.0f, 0.5f, c);
Console.WriteLine($"{r}");
}
[MethodImpl(MethodImplOptions.NoInlining)]
public static Vector128<float> Problem(float a, float b, Vector128<float> c)
{
return Avx.Multiply(c, BroadcastScalarToVector128(a / b));
}
}
; Assembly listing for method Runtime_96156:Problem(float,float,System.Runtime.Intrinsics.Vector128`1[float]):System.Runtime.Intrinsics.Vector128`1[float] (FullOpts)
; Emitting BLENDED_CODE for X64 with AVX - Windows
; FullOpts code
; optimized code
; rsp based frame
; partially interruptible
; No PGO data
; 0 inlinees with PGO data; 1 single block inlinees; 0 inlinees without PGO data
G_M000_IG01: ;; offset=0x0000
push rax
G_M000_IG02: ;; offset=0x0001
vdivss xmm0, xmm1, xmm2
vmovss dword ptr [rsp+0x04], xmm0
vmovups xmm0, xmmword ptr [r9]
lea rax, [rsp+0x04]
vmulps xmm0, xmm0, xmmword ptr [rax]
vmovups xmmword ptr [rcx], xmm0
mov rax, rcx
G_M000_IG03: ;; offset=0x0020
add rsp, 8
ret
; Total bytes of code 37 |
Ah, looks like this just got fixed in main by #104517 (so should be fixed in P7). @tannergooding you think we should try and backport a fix to 8.0 (given all the refactoring, maybe not a straight port)? |
@tannergooding sorry to nag, curious what you think here about a backport. |
Ah sorry, I had seen this over the weekend and forgot to respond when I got in this morning. I think a backport would be good, I'll see if I can get a minimal fix so we don't have to take the larger refactoring. |
Since this is fixed in .NET 9, I am going to change the milestone to .NET 8. |
Got the backport up here: #105659 |
@twest820 this should be fixed in 8.0 servicing -- can you verify if you get a chance? |
Hi @AndyAyersMS, I moved the build to .NET 9.0 a while ago (can confirm reverting to the original unsafe code works there). @tannergooding's unit test coverage looks good to me, though. FWIW, I've left the build here with the safe alternative public static Vector128<float> BroadcastScalarToVector128(float value)
{
Vector128<float> value128 = Vector128.CreateScalarUnsafe(value);
return Avx2.BroadcastScalarToVector128(value128);
} as we've stopped supporting processors without FMA. |
Description
Code that runs fine in .NET 7 fails in .NET 8.
Reproduction Steps
This affects a couple helper methods which have been in place for years.
Expected behavior
Numerical simulations remain correct.
Actual behavior
Simulations experience numerical divergence as timesteps proceed, ultimately either completing with wrong results or crashing on
ArgumentOutOfRangeException
s raised by internal consistency checks detectingNaN
s or other incorrect values resulting from things like numerical overflows in exponentiation.Regression?
Yes. .NET 8 debug builds work fine. Flipping projects back to .NET 7 removes this problem. Rolling either of the affected repos I've detected so far back to .NET 7 and then moving projects forward to .NET 8 with no other changes also reproduces the error.
Known Workarounds
Use of
Vector128.Create(value)
orVector256.Create(value)
instead of callingAvx.BroadcastScalarToVectorNNN()
directly has, so far, been successful in avoiding this issue.Configuration
.NET 8, Win 10 22H2, AMD Zen 3. Yes, specific to .NET 8 and x64 intrinsics.
Other information
In .NET 7 the disassembly is
In .NET 8 it's of the form
However, the JIT from the apparent super quick workaround of
appears to be functionally identical, both locally
and in the surrounding call sites I've checked. After several hours' investigation my current best guess is there's some small fraction out of hundreds of calls where
rbp
isn't stable, leadingvbroadcastss
to sometimes load the wrong value.I'm not particularly concerned about this issue—looking at what .NET 8 JIT does, removing the entire helper class as legacy code and changing to inline
VectorNNN.Create()
calls makes sense—but it strikes me as strange enough to be worth documenting, particularly should someone else happen to hit it. Our code's nearly all floating point so I don't have a read on whether thevpbroadcast
instructions are also affected.The text was updated successfully, but these errors were encountered: