Use Avx10.2 Instructions in Floating Point Conversions #111775
…Lower Avx10.2 nodes accordingly.
…DE." This reverts commit 067e31e.
…embedded rounding" This reverts commit 493572f.
This reverts commit 61719f8.
Co-authored-by: Bruce Forstall <brucefo@microsoft.com>
…1996/runtime into kcm-avx102-api-public-pr
Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch
Co-authored-by: Michał Petryka <35800402+MichalPetryka@users.noreply.github.com>
@tannergooding can you help get this PR reviewed? It contains the optimization for the saturating conversion instructions introduced with AVX10.2.
@tannergooding can you please help review this PR? The preliminary review changes have been made and CI looks good here.
LGTM.
CC. @dotnet/jit-contrib for secondary review
@BruceForstall @En3Tho rectified the same issue with images here as well. (#112535 (comment))
/ba-g timeout
* main: (27 commits)
  Fold null checks against known non-null values (dotnet#109164)
  JIT: Always track the context for late devirt (dotnet#112396)
  JIT: array allocation fixes (dotnet#112676)
  [H/3] Fix test closing connection too fast (dotnet#112691)
  Fix LINQ handling of iterator.Take(...).Last(...) (dotnet#112680)
  [browser][MT] move wasm MT CI legs to extra-platforms (dotnet#112690)
  JIT: Don't use `Compiler::compFloatingPointUsed` to check if FP kills are needed (dotnet#112668)
  [LoongArch64] Fix a typo within PR#112166. (dotnet#112672)
  Fix new EH hang on DebugBreak (dotnet#112640)
  Use encode callback instead of renting a buffer to write to in DSAKeyFormatHelper
  Move some links to other doc (dotnet#112574)
  Reflection-based XmlSerializer - Deserialize empty collections and allow for sub-types in collection items. (dotnet#111723)
  JIT: Use `fgCalledCount` for OSR method entry weight (dotnet#112662)
  Use Avx10.2 Instructions in Floating Point Conversions (dotnet#111775)
  Expose StressLog via CDAC and port StressLogAnalyzer to managed code (dotnet#104999)
  JIT: Use linear block order for MinOpts in LSRA (dotnet#108147)
  Update dependencies from https://github.com/dotnet/arcade build 20250213.2 (dotnet#112625)
  JIT: Clean up and optimize call arg lowering (dotnet#112639)
  Update dependencies from https://github.com/dotnet/emsdk build 20250217.1 (dotnet#112645)
  JIT: Support `FIELD_LIST` for returns (dotnet#112308)
  ...
Overview
This PR optimizes x64 floating-point to integer conversions using the new saturating conversion instructions introduced in AVX10.2. We follow the spec doc to add the new instructions and optimize the x64/x86 conversions.
Addresses #109080
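As a rough reference for what "saturating" means in this context, the sketch below models a truncating floating-point to integer conversion that clamps out-of-range inputs to the target range and maps NaN to 0. It is an illustrative Python model under those assumptions, not code from this PR or the JIT; `saturating_convert` and its parameters are hypothetical names, and the exact instruction behavior should be checked against the AVX10.2 spec.

```python
import math


def saturating_convert(x: float, bits: int, signed: bool) -> int:
    """Hypothetical reference model: truncate toward zero, clamp to range, NaN -> 0."""
    if math.isnan(x):
        return 0
    lo = -(1 << (bits - 1)) if signed else 0
    hi = (1 << (bits - 1)) - 1 if signed else (1 << bits) - 1
    if x <= lo:  # also covers -inf and any negative input for unsigned targets
        return lo
    if x >= hi:  # also covers +inf
        return hi
    return math.trunc(x)


if __name__ == "__main__":
    for value in (2.9, -2.9, 3e10, -3e10, float("nan"), float("inf")):
        print(value, saturating_convert(value, 32, True), saturating_convert(value, 32, False))
```

The same model covers the int, uint, long, and ulong targets by changing `bits` and `signed`.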
Testing
All of the changes made for testing are present in this branch
Step 1: Run superpmi.exe on the library mch files with JITLateDisasm to check whether any errors occur. JITLateDisasm is used to check that the emitted byte stream decodes validly through the LLVM disassembler.
For this step, a new coredistools built from the LLVM repo was used. After running superpmi with JITLateDisasm, no decoding failures were detected. Please reach out for access to the superpmi logs.
Step 2: Run superpmi and check for asmdiffs and assert errors.
Below is the summary of the superpmi run between this PR and PR #111209.
The diffs make sense here. All of the diffs in the superpmi logs belong to conversion scenarios. E.g.
Since these diffs are expected, we can conclude that the superpmi run is successful.
Step 3: Run the JIT test suite using a stable subset of tests on SDE
Results
[image]
Optimized ASM
Note: Below is a case-by-case comparison of the asm generated for `Avx512` vs `Avx10.2`. The `Avx10v2` asm has been collected in SDE.
Case: Float to Int packed
**Test code**
Left side is AVX512; right side is AVX10.2.
[image]
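One way to picture why a sequence without saturating instructions may need extra work: a plain truncating convert returns the "integer indefinite" value (0x80000000 for 32-bit lanes) for NaN and out-of-range inputs, so saturating semantics require fix-up steps afterwards. The Python sketch below models that idea lane by lane. All function names are hypothetical, Python floats stand in for single-precision lanes, and the fix-up shown is only conceptual; it does not mirror the instruction sequence the JIT actually emits.

```python
import math

INT32_MIN = -(1 << 31)
INT32_MAX = (1 << 31) - 1


def legacy_truncate_lane(x: float) -> int:
    """Model of a non-saturating truncating convert for one 32-bit lane.

    NaN, infinities, and out-of-range values all yield INT32_MIN (0x80000000).
    """
    if math.isnan(x) or math.isinf(x):
        return INT32_MIN
    t = math.trunc(x)
    return t if INT32_MIN <= t <= INT32_MAX else INT32_MIN


def saturating_with_fixup(lanes):
    """Apply the legacy convert, then patch NaN and positive-overflow lanes."""
    out = []
    for x in lanes:
        r = legacy_truncate_lane(x)
        if math.isnan(x):
            r = 0  # NaN must become 0
        elif r == INT32_MIN and x > 0:
            r = INT32_MAX  # positive overflow must clamp to the maximum
        out.append(r)
    return out


if __name__ == "__main__":
    print(saturating_with_fixup([1.9, -1.9, float("nan"), 3e9, -3e9, float("inf")]))
```

Negative overflow needs no patch in this model because the indefinite value already equals the minimum of the signed 32-bit range.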
Case: Float to UInt packed
**Test code**
Left side is AVX512; right side is AVX10.2.
[image]
Case: Double to Long packed
**Test code**
Left side is AVX512; right side is AVX10.2.
[image]
Case: Double to ULong packed
**Test code**
Left side is AVX512; right side is AVX10.2.
[image]
Case: Float to Int Scalar
**Test code**
Left side is AVX512; right side is AVX10.2.
[image]
Case: Float to UInt Scalar
**Test code**
Left side is AVX512; right side is AVX10.2.
[image]
Case: Float to Long Scalar
**Test code**
Left side is AVX512; right side is AVX10.2.
[image]
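When testing float to long conversions, the boundary inputs are worth a note: a single-precision value has a 24-bit significand, so no float equals the largest signed 64-bit integer exactly. The sketch below, which is an illustration and not part of the PR's test code, uses Python's `struct` module to find the largest single-precision value below 2^63. The helper names are hypothetical.

```python
import struct

INT64_MAX = (1 << 63) - 1


def single_bits(x: float) -> int:
    """Return the IEEE 754 single-precision bit pattern of x."""
    return struct.unpack("<I", struct.pack("<f", x))[0]


def single_from_bits(bits: int) -> float:
    """Reinterpret a 32-bit pattern as a single-precision value."""
    return struct.unpack("<f", struct.pack("<I", bits))[0]


def prev_single(x: float) -> float:
    """Largest single-precision value below a positive, finite, normal x."""
    return single_from_bits(single_bits(x) - 1)


two_63 = float(1 << 63)       # exactly representable as a single (power of two)
below = prev_single(two_63)   # last single that still fits in a signed 64-bit integer

if __name__ == "__main__":
    print(int(below), INT64_MAX - int(below))
```

Under this model, `below` converts exactly, while `two_63` is already one past the maximum and is the smallest positive input that a saturating conversion would clamp.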
Case: Float to ULong Scalar
**Test code**
Left side is AVX512; right side is AVX10.2.
[image]
Case: Double to Long Scalar
**Test code**
Left side is AVX512; right side is AVX10.2.
[image]
Case: Double to ULong Scalar
**Test code**
Left side is AVX512; right side is AVX10.2.
[image]
Case: Double to Int Scalar
**Test code**
Left side is AVX512; right side is AVX10.2.
[image]
Case: Double to UInt Scalar
**Test code**
Left side is AVX512; right side is AVX10.2.
[image]