This is a bit of a meta issue as it applies to many areas across the BCL, but I've marked it as intrinsics since they all have to do with updating the intrinsic code.
As part of the effort to convert target-dependent intrinsics in the .NET libraries to target-independent `Vector*` functions, I went through the intrinsics used in the .NET libraries. Below is the list, along with some possible options for switching to cross-platform vectors if we either expand the Vector API or have the JIT optimize certain patterns where multiple Vector functions can achieve the same result.
Base64 Encoder/Decoder
a. Has AVX512 path
b. runtime/src/libraries/System.Private.CoreLib/src/System/Buffers/Text/Base64Decoder.cs, line 40 in f94bab0
c. runtime/src/libraries/System.Private.CoreLib/src/System/Buffers/Text/Base64Encoder.cs, line 38 in f94bab0
d. Cannot convert everything to Vector* without expanding Vector surface area
ProbabilisticMap
a. Has AVX512 paths
b. runtime/src/libraries/System.Private.CoreLib/src/System/SearchValues/ProbabilisticMap.cs, lines 107 to 265 in f94bab0
c. Uses the following
i. Avx512BW.PackUnsignedSaturate
ii. Avx512Vbmi.PermuteVar64x8
iii. Avx512BW.Shuffle
d. Cannot upgrade: there is no way to replace PackUnsignedSaturate
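To illustrate why PackUnsignedSaturate has no direct Vector equivalent, here is a minimal sketch that emulates it for a single 128-bit vector using only cross-platform APIs. It assumes .NET 7 or later; `PackSketch` and its method are hypothetical names for illustration, not code from the runtime repository. `Vector128.Narrow` truncates instead of saturating, so the inputs are clamped first. The 256-bit and 512-bit x86 pack instructions also operate per 128-bit lane, which this sketch does not try to reproduce.

```csharp
using System;
using System.Runtime.Intrinsics;

public static class PackSketch
{
    // Hypothetical helper: signed 16-bit -> unsigned 8-bit narrowing with
    // saturation, i.e. the per-element behavior of Sse2.PackUnsignedSaturate.
    public static Vector128<byte> PackUnsignedSaturate(Vector128<short> lower, Vector128<short> upper)
    {
        Vector128<short> min = Vector128<short>.Zero;
        Vector128<short> max = Vector128.Create((short)255);

        // Clamp every element to [0, 255] so that the truncating Narrow
        // below cannot drop any significant bits.
        Vector128<short> lo = Vector128.Min(Vector128.Max(lower, min), max);
        Vector128<short> hi = Vector128.Min(Vector128.Max(upper, min), max);

        // Narrow(ushort, ushort) keeps the low byte of each element:
        // 'lo' fills result elements 0..7, 'hi' fills elements 8..15.
        return Vector128.Narrow(lo.AsUInt16(), hi.AsUInt16());
    }
}
```

The clamp costs two extra operations per input, which is presumably part of why a straight swap is unattractive without a dedicated saturating narrow in the Vector API or a JIT pattern match.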
XxHashShared.cs
a. No AVX512 path
b. runtime/src/libraries/System.IO.Hashing/src/System/IO/Hashing/XxHashShared.cs, line 519 in e9e33e1
c. Uses Avx2.Multiply
d. Cannot switch the intrinsic Multiply to the Vector multiply
BitArray.cs
a. Has AVX512 path
b. runtime/src/libraries/System.Collections/src/System/Collections/BitArray.cs, lines 840 to 888 in f94bab0
c. Uses the following
i. Avx2.Shuffle
ii. Avx2.And
iii. Avx2.Min
iv. Avx.Store
d. Shuffle with non-constant ‘indices’ will be problematic to convert, but should be fine once ShuffleUnsafe is implemented
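The difficulty with non-constant indices comes down to out-of-range handling. The x86 byte shuffle (pshufb, exposed as Ssse3.Shuffle and Avx2.Shuffle) zeroes an element when the index byte has its high bit set and otherwise uses only the low four bits, whereas `Vector128.Shuffle` zeroes an element for any index outside 0..15. The sketch below shows one way to reproduce the x86 behavior on a single 128-bit vector with the cross-platform API. It assumes .NET 7 or later, and `ShuffleSketch` is a hypothetical name. Avx2.Shuffle additionally shuffles within each 128-bit lane, which is not modeled here.

```csharp
using System;
using System.Runtime.Intrinsics;

public static class ShuffleSketch
{
    // Hypothetical helper: byte shuffle with pshufb-style index handling,
    // built on the cross-platform Vector128.Shuffle.
    public static Vector128<byte> ShuffleLikePshufb(Vector128<byte> vector, Vector128<byte> indices)
    {
        // Keep the high bit (the "zero this element" flag) and the low four
        // bits (the element selector). With the high bit set the index is
        // >= 128, so Vector128.Shuffle treats it as out of range and writes 0.
        // Otherwise the index is reduced to 0..15, as pshufb would do.
        Vector128<byte> normalized = indices & Vector128.Create((byte)0x8F);
        return Vector128.Shuffle(vector, normalized);
    }
}
```

When the indices are known to be in range already, the extra mask is unnecessary, which is the case a ShuffleUnsafe-style API is meant to cover.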
AsciiStringSearchValuesTeddyBase.cs/ TeddyHelper.cs
a. Has AVX512F path
b. runtime/src/libraries/System.Private.CoreLib/src/System/SearchValues/Strings/AsciiStringSearchValuesTeddyBase.cs, lines 427 to 479 in f94bab0
c. Related: TeddyHelper: runtime/src/libraries/System.Private.CoreLib/src/System/SearchValues/Strings/Helpers/TeddyHelper.cs, line 47 in f94bab0
d. Uses the following
i. PackUnsignedSaturate: no 1:1 equivalent
ii. Shuffle: possible with ShuffleUnsafe
iii. Permute2x128
iv. AlignRight: no 1:1 equivalent
v. PermuteVar8x64x2
SpanHelpers.cs: consider all span helpers under this umbrella
a. Has AVX512F path
b. runtime/src/libraries/System.Private.CoreLib/src/System/SpanHelpers.Byte.cs, line 462 in f94bab0
c. Uses the following
i. Shuffle
ii. Avx2.Permute2x128
iii. PermuteVar8x32
iv. Permute4x64
v. Avx2.And
vi. Avx2.MultiplyHigh
vii. Avx2.MultiplyLow
viii. Avx2.Or
ix. Avx2.SubtractSaturate
x. Avx2.CompareGreaterThan
xi. Avx2.Subtract
xii. Avx2.Add
IndexOfAnyAsciiSearcher
a. No AVX512F path: an implementation was tried but had issues
b. runtime/src/libraries/System.Private.CoreLib/src/System/SearchValues/IndexOfAnyAsciiSearcher.cs, line 237 in f94bab0
c. Uses the following
i. PackUnsignedSaturate
ii. Shuffle
Matrix4x4.Impl
a. No avx512 path and in some cases avx paths
b. runtime/src/libraries/System.Private.CoreLib/src/System/Numerics/Matrix4x4.Impl.cs, line 1422 in e9e33e1
c. Uses the following
i. Shuffle/Permute: constant indices, so possibly convertible?
ii. UnpackLow
iii. UnpackHigh
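As a rough illustration of how UnpackLow and UnpackHigh could be expressed with constant-index shuffles, the sketch below builds both from `Vector128.Shuffle` and `Vector128.ConditionalSelect`. It assumes .NET 7 or later; `UnpackSketch` and its members are hypothetical names, and whether the JIT folds this pattern back into a single unpack instruction is not something this sketch establishes.

```csharp
using System;
using System.Runtime.Intrinsics;

public static class UnpackSketch
{
    // Produces (left[0], right[0], left[1], right[1]), like Sse.UnpackLow.
    public static Vector128<float> UnpackLow(Vector128<float> left, Vector128<float> right)
    {
        Vector128<int> indices = Vector128.Create(0, 0, 1, 1);
        return Interleave(Vector128.Shuffle(left, indices), Vector128.Shuffle(right, indices));
    }

    // Produces (left[2], right[2], left[3], right[3]), like Sse.UnpackHigh.
    public static Vector128<float> UnpackHigh(Vector128<float> left, Vector128<float> right)
    {
        Vector128<int> indices = Vector128.Create(2, 2, 3, 3);
        return Interleave(Vector128.Shuffle(left, indices), Vector128.Shuffle(right, indices));
    }

    // Takes elements 0 and 2 from 'fromLeft' and elements 1 and 3 from 'fromRight'.
    private static Vector128<float> Interleave(Vector128<float> fromLeft, Vector128<float> fromRight)
    {
        // All bits set in the even elements, all bits clear in the odd ones.
        Vector128<float> evenLanes = Vector128.Create(-1, 0, -1, 0).AsSingle();
        return Vector128.ConditionalSelect(evenLanes, fromLeft, fromRight);
    }
}
```

This takes three operations where the intrinsic takes one, so it is only attractive if the pattern gets recognized or a two-input shuffle is added to the Vector surface area.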
Ascii.Equality
a. Avx512 path added
b. runtime/src/libraries/System.Private.CoreLib/src/System/Text/Ascii.Equality.cs, line 94 in e9e33e1
c. Already uses Vector – switch check?
Ascii.Utility
a. Has avx512 path
b. runtime/src/libraries/System.Private.CoreLib/src/System/Text/Ascii.Utility.cs, line 1557 in e9e33e1
c. Uses Testz/ PackUnsignedSaturate – can possibly move to more efficient patterns similar to ‘HasMatch’
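For the TestZ part, a cross-platform formulation is straightforward: TestZ(a, b) returns true when `a & b` has no bits set, which can be written as a comparison against zero. The sketch below assumes .NET 7 or later. `TestZSketch` is a hypothetical name, the 0xFF80 mask is just an illustrative "non-ASCII bits of a UTF-16 code unit" mask, and this is not necessarily the ‘HasMatch’ pattern referred to above.

```csharp
using System;
using System.Runtime.Intrinsics;

public static class TestZSketch
{
    // Cross-platform stand-in for Sse41.TestZ: true when (value & mask) is all zeros.
    public static bool TestZ(Vector128<ushort> value, Vector128<ushort> mask)
        => (value & mask) == Vector128<ushort>.Zero;

    // Example use: a UTF-16 code unit is ASCII when none of its upper nine bits are set.
    public static bool AllCharsAreAscii(Vector128<ushort> utf16)
        => TestZ(utf16, Vector128.Create((ushort)0xFF80));
}
```

The `==` operator on `Vector128<T>` returns a single bool that is true only when every element compares equal, so no explicit mask extraction is needed.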
BitArray is the only one where conversion is currently feasible, and that is dependent on #99596.
Some patterns we can consider:
`Sse2.Multiply`: the Vector multiply does not work the same way. The Vector version stores only the lower half of each product, whereas the intrinsic version (for both SSE and AVX) widens the element type, e.g. uint -> ulong. So `Widen -> Multiply` might work.
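A minimal sketch of the widening idea follows, assuming .NET 7 or later and a little-endian target. `WideningMultiplySketch` and its methods are hypothetical names. Note one detail: Sse2.Multiply on `Vector128<uint>` multiplies only elements 0 and 2 (the low 32 bits of each 64-bit lane), so the sketch masks those lanes rather than calling `Vector128.Widen`, which would take elements 0 and 1 instead.

```csharp
using System;
using System.Runtime.Intrinsics;

public static class WideningMultiplySketch
{
    // What the cross-platform operator does today: a 32-bit multiply per
    // element that keeps only the low 32 bits of each product.
    public static Vector128<uint> MultiplyTruncating(Vector128<uint> left, Vector128<uint> right)
        => left * right;

    // Hypothetical stand-in for Sse2.Multiply(Vector128<uint>, Vector128<uint>):
    // full 64-bit products of elements 0 and 2.
    public static Vector128<ulong> MultiplyEvenLanes(Vector128<uint> left, Vector128<uint> right)
    {
        // Reinterpret as two 64-bit lanes and clear the upper 32 bits of each,
        // leaving elements 0 and 2 zero-extended to 64 bits (little-endian assumed).
        Vector128<ulong> lowMask = Vector128.Create(0xFFFF_FFFFul);
        Vector128<ulong> l = left.AsUInt64() & lowMask;
        Vector128<ulong> r = right.AsUInt64() & lowMask;

        // Both operands fit in 32 bits, so the 64-bit product cannot overflow.
        return l * r;
    }
}
```

Whether this is competitive depends on how the 64-bit vector multiply is lowered on the target hardware, so it would need measuring before replacing the intrinsic.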