Add BFloat16 #98643

Open
wants to merge 77 commits into base: main
Changes from 63 commits
589afe0
Add api for BFloat16
huoyaoyuan Feb 18, 2024
312d051
Creating
huoyaoyuan Feb 18, 2024
5e1c981
Equals and GetHashCode
huoyaoyuan Feb 18, 2024
1fb4765
Comparison
huoyaoyuan Feb 18, 2024
fc05d3b
Constants and comment
huoyaoyuan Feb 18, 2024
152fe99
Xml doc
huoyaoyuan Feb 18, 2024
25a16e7
Using rounding for cast
huoyaoyuan Feb 18, 2024
50d90aa
Ref source
huoyaoyuan Feb 18, 2024
559f2e0
Simple tests
huoyaoyuan Feb 18, 2024
b24839c
Conversion tests
huoyaoyuan Feb 19, 2024
8284526
Stripping sign is redundant
huoyaoyuan Feb 19, 2024
8e32e71
Fix test copied from Half
huoyaoyuan Feb 19, 2024
4bd266e
Fix conversion test cases
huoyaoyuan Feb 19, 2024
6df00e6
Constants and well-known values
huoyaoyuan Feb 21, 2024
ff295fd
Categorizing methods
huoyaoyuan Feb 21, 2024
09af2b2
Reorder conversion members
huoyaoyuan Feb 21, 2024
1a8f0ad
Operators batch 1
huoyaoyuan Feb 21, 2024
c9fc867
Operators batch 2
huoyaoyuan Feb 21, 2024
e9fc0f8
TryConvert
huoyaoyuan Feb 21, 2024
c967aa5
Operators batch 3
huoyaoyuan Feb 21, 2024
17c13c0
Parsing and formatting
huoyaoyuan Feb 21, 2024
ad780a0
Add comments about how to determine parse and format info
huoyaoyuan Feb 21, 2024
c01949f
Add missing interface implementations
huoyaoyuan Feb 21, 2024
b63c1df
NumberBufferLength
huoyaoyuan Feb 21, 2024
754a3c8
Add more comment
huoyaoyuan Feb 22, 2024
bcc260f
Correct MinFastFloatDecimalExponent
huoyaoyuan Feb 24, 2024
5a3d200
Add explicit conversion to
huoyaoyuan Feb 24, 2024
c420dd3
Explicit convert from
huoyaoyuan Feb 24, 2024
13e65d1
Fullfill casting operators
huoyaoyuan Feb 24, 2024
8c5f546
Fullfill some formatting
huoyaoyuan Feb 24, 2024
0cb3932
Apply suggestions from code review
huoyaoyuan Apr 6, 2024
b615e68
Merge branch 'main' into BFloat16
huoyaoyuan Apr 6, 2024
8f70d91
Generic DiyFp
huoyaoyuan Feb 24, 2024
2458dd8
Generic Grisu3
huoyaoyuan Feb 24, 2024
a8bb94b
Generic Dragon4
huoyaoyuan Feb 24, 2024
9644914
Add MaxRoundTripDigits to MaxPrecisionCustomFormat to FormatInfo
huoyaoyuan Feb 25, 2024
a29db5c
Generic FormatFloat
huoyaoyuan Feb 25, 2024
a8a8a49
Adapt with existing FP types
huoyaoyuan Feb 25, 2024
2fd392f
Merge branch 'fp-formatting-generic' into BFloat16
huoyaoyuan Apr 6, 2024
f1582e7
Adapt formatting traits
huoyaoyuan Apr 6, 2024
eace3a6
Use generic format and delete Number.BFloat16
huoyaoyuan Apr 6, 2024
abd1e80
Update ref source
huoyaoyuan Apr 6, 2024
62156c9
Merge branch 'main'
huoyaoyuan May 3, 2024
e8012c9
Enable constant value tests
huoyaoyuan May 3, 2024
f711f8d
IsFinite/IsNaN
huoyaoyuan May 3, 2024
08168ff
IsPositive/IsNegative/IsSubnormal
huoyaoyuan May 3, 2024
d59a8c5
ToDouble
huoyaoyuan May 3, 2024
eb6dc47
Merge branch 'main' into BFloat16
huoyaoyuan May 23, 2024
25b7684
Merge branch 'main'
huoyaoyuan Jun 4, 2024
832651e
Merge branch 'main' into BFloat16
huoyaoyuan Jun 9, 2024
a07fe96
Fix test case
huoyaoyuan Jun 9, 2024
f9c35d3
Add double conversion test
huoyaoyuan Jun 9, 2024
4059b66
Parse tests
huoyaoyuan Jun 9, 2024
4b4d1a5
Formatting tests
huoyaoyuan Jun 9, 2024
6ed52f5
RoundTripping tests
huoyaoyuan Jun 9, 2024
14b0d85
Port float->Half conversion algorithm to double->BFloat16 to handle U…
huoyaoyuan Jun 10, 2024
ea1dd5f
Port function tests from Half
huoyaoyuan Jun 10, 2024
dfd49c8
Convert the precesion of test cases.
huoyaoyuan Jun 10, 2024
f5461ac
Merge branch 'main' into BFloat16
danmoseley Dec 2, 2024
daaec69
Merge branch 'main' into BFloat16
huoyaoyuan Dec 29, 2024
9938e8b
Align with TryWriteBig/LittleEndian
huoyaoyuan Dec 29, 2024
b889417
Remove redundant 'partial'
huoyaoyuan Dec 31, 2024
1282c85
Merge branch 'main' into BFloat16
tannergooding Jan 27, 2025
e6dd118
Merge branch 'main' into BFloat16
huoyaoyuan Feb 6, 2025
922f411
Use DefaultParseStyle
huoyaoyuan Feb 6, 2025
86dce2d
Fill conversion in signed integer
huoyaoyuan Feb 6, 2025
9f729a3
Fill conversion in unsigned integer and floating point
huoyaoyuan Feb 6, 2025
6baf940
Add conversion for S.R.Numerics
huoyaoyuan Feb 6, 2025
fb88f8e
Use float member function instead of MathF
huoyaoyuan Feb 6, 2025
d697344
Fill conversion in decimal
huoyaoyuan Feb 7, 2025
4e83ad9
Add conversion for NFloat
huoyaoyuan Feb 7, 2025
d58c80a
Use soft rounding for uint->bf16
huoyaoyuan Feb 9, 2025
8639fa3
Generic math rounding from unsigned and signed integer
huoyaoyuan Feb 10, 2025
34b1d07
Cleanup helper methods
huoyaoyuan Feb 10, 2025
8adbeb2
Add integer rounding tests
huoyaoyuan Feb 10, 2025
1f2653f
Move helpers and fix comment
huoyaoyuan Feb 11, 2025
9046622
Update comment
huoyaoyuan Feb 11, 2025
@@ -4325,6 +4325,9 @@
<data name="NotSupported_EmitDebugInfo" xml:space="preserve">
<value>Emitting debug info is not supported for this member.</value>
</data>
+ <data name="Arg_MustBeBFloat16" xml:space="preserve">
+ <value>Object must be of type BFloat16.</value>
+ </data>
<data name="NotSupported_ReferenceEnumOrPrimitiveTypeRequired" xml:space="preserve">
<value>The specified type must be a reference type, a primitive type, or an enum type.</value>
</data>
@@ -595,6 +595,7 @@
<Compile Include="$(MSBuildThisFileDirectory)System\Number.Grisu3.cs" />
<Compile Include="$(MSBuildThisFileDirectory)System\Number.NumberToFloatingPointBits.cs" />
<Compile Include="$(MSBuildThisFileDirectory)System\Number.Parsing.cs" />
+ <Compile Include="$(MSBuildThisFileDirectory)System\Numerics\BFloat16.cs" />
<Compile Include="$(MSBuildThisFileDirectory)System\Numerics\BitOperations.cs" />
<Compile Include="$(MSBuildThisFileDirectory)System\Numerics\Matrix3x2.cs" />
<Compile Include="$(MSBuildThisFileDirectory)System\Numerics\Matrix3x2.Impl.cs" />
16 changes: 8 additions & 8 deletions src/libraries/System.Private.CoreLib/src/System/Half.cs
@@ -738,8 +738,8 @@ public static explicit operator Half(float value)
const uint SingleBiasedExponentMask = float.BiasedExponentMask;
// Exponent displacement #2
const uint Exponent13 = 0x0680_0000u;
- // Maximum value that is not Infinity in Half
- const float MaxHalfValueBelowInfinity = 65520.0f;
+ // The value above Half.MaxValue
+ const float HalfAboveMaxValue = 65520.0f;
Member

nit: I'm not sure this name change is an improvement. The other name wasn't great, but I think it was a little clearer.

In particular, this is the maximum infinitely precise value that will round down to MaxValue rather than rounding up to PositiveInfinity.
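
To see where that boundary sits in practice, here is a minimal sketch using only the public `Half` conversion (not the internal code path touched by this diff). It assumes the conversion follows IEEE 754 round-to-nearest-even; under that assumption 65520 itself is the tie between 65504 and the unrepresentable 65536 and is expected to round up, so the constant behaves as an exclusive upper bound. The variable names are illustrative only.

```csharp
using System;

// Half.MaxValue is 65504. The next value at the same spacing would be 65536,
// which Half cannot represent, so the midpoint between the two is 65520.
float midpoint = 65520.0f;

// Largest float strictly below the midpoint.
float belowMidpoint = MathF.BitDecrement(midpoint);

Half roundedDown = (Half)belowMidpoint;
Half atMidpoint = (Half)midpoint;

// Expected under round-to-nearest-even: the first stays finite, the second overflows.
Console.WriteLine(roundedDown == Half.MaxValue);       // True
Console.WriteLine(Half.IsPositiveInfinity(atMidpoint)); // True
```

If both lines print `True`, any name for the constant should probably convey "first value that rounds to infinity" rather than "last value that stays finite".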

Member

This comment is still pending.

// Mask for exponent bits in Half
const uint ExponentMask = BiasedExponentMask;
uint bitValue = BitConverter.SingleToUInt32Bits(value);
@@ -750,7 +750,7 @@ public static explicit operator Half(float value)
// Clear sign bit
value = float.Abs(value);
// Rectify values that are Infinity in Half. (float.Min now emits vminps instruction if one of two arguments is a constant)
- value = float.Min(MaxHalfValueBelowInfinity, value);
+ value = float.Min(HalfAboveMaxValue, value);
// Rectify lower exponent
uint exponentOffset0 = BitConverter.SingleToUInt32Bits(float.Max(value, BitConverter.UInt32BitsToSingle(MinExp)));
// Extract exponent
@@ -1362,7 +1362,7 @@ int IFloatingPoint<Half>.GetExponentShortestBitLength()
int IFloatingPoint<Half>.GetSignificandByteCount() => sizeof(ushort);

/// <inheritdoc cref="IFloatingPoint{TSelf}.GetSignificandBitLength()" />
- int IFloatingPoint<Half>.GetSignificandBitLength() => 11;
+ int IFloatingPoint<Half>.GetSignificandBitLength() => SignificandLength;

/// <inheritdoc cref="IFloatingPoint{TSelf}.TryWriteExponentBigEndian(Span{byte}, out int)" />
bool IFloatingPoint<Half>.TryWriteExponentBigEndian(Span<byte> destination, out int bytesWritten)
@@ -2327,7 +2327,7 @@ public static bool TryParse(ReadOnlySpan<byte> utf8Text, NumberStyles style, IFo
static int IBinaryFloatParseAndFormatInfo<Half>.NumberBufferLength => Number.HalfNumberBufferLength;

static ulong IBinaryFloatParseAndFormatInfo<Half>.ZeroBits => 0;
- static ulong IBinaryFloatParseAndFormatInfo<Half>.InfinityBits => 0x7C00;
+ static ulong IBinaryFloatParseAndFormatInfo<Half>.InfinityBits => PositiveInfinityBits;

static ulong IBinaryFloatParseAndFormatInfo<Half>.NormalMantissaMask => (1UL << SignificandLength) - 1;
static ulong IBinaryFloatParseAndFormatInfo<Half>.DenormalMantissaMask => TrailingSignificandMask;
@@ -2339,15 +2339,15 @@ public static bool TryParse(ReadOnlySpan<byte> utf8Text, NumberStyles style, IFo
static int IBinaryFloatParseAndFormatInfo<Half>.MaxDecimalExponent => 5;

static int IBinaryFloatParseAndFormatInfo<Half>.ExponentBias => ExponentBias;
- static ushort IBinaryFloatParseAndFormatInfo<Half>.ExponentBits => 5;
+ static ushort IBinaryFloatParseAndFormatInfo<Half>.ExponentBits => BiasedExponentLength;

static int IBinaryFloatParseAndFormatInfo<Half>.OverflowDecimalExponent => (MaxExponent + (2 * SignificandLength)) / 3;
- static int IBinaryFloatParseAndFormatInfo<Half>.InfinityExponent => 0x1F;
+ static int IBinaryFloatParseAndFormatInfo<Half>.InfinityExponent => MaxBiasedExponent;

static ushort IBinaryFloatParseAndFormatInfo<Half>.NormalMantissaBits => SignificandLength;
static ushort IBinaryFloatParseAndFormatInfo<Half>.DenormalMantissaBits => TrailingSignificandLength;

- static int IBinaryFloatParseAndFormatInfo<Half>.MinFastFloatDecimalExponent => -8;
+ static int IBinaryFloatParseAndFormatInfo<Half>.MinFastFloatDecimalExponent => -26;
static int IBinaryFloatParseAndFormatInfo<Half>.MaxFastFloatDecimalExponent => 4;

static int IBinaryFloatParseAndFormatInfo<Half>.MinExponentRoundToEven => -21;
50 changes: 42 additions & 8 deletions src/libraries/System.Private.CoreLib/src/System/Number.Parsing.cs
@@ -50,6 +50,9 @@ internal interface IBinaryIntegerParseAndFormatInfo<TSelf> : IBinaryInteger<TSel
internal interface IBinaryFloatParseAndFormatInfo<TSelf> : IBinaryFloatingPointIeee754<TSelf>, IMinMaxValue<TSelf>
where TSelf : unmanaged, IBinaryFloatParseAndFormatInfo<TSelf>
{
+ /// <remarks>
+ /// Ceiling(Log10(5^(Abs(MinBinaryExponent) - 1))) + NormalMantissaBits + 1 + 1
+ /// </remarks>
static abstract int NumberBufferLength { get; }

static abstract ulong ZeroBits { get; }
@@ -61,7 +64,14 @@ internal interface IBinaryFloatParseAndFormatInfo<TSelf> : IBinaryFloatingPointI
static abstract int MinBinaryExponent { get; }
static abstract int MaxBinaryExponent { get; }

+ /// <remarks>
+ /// Floor(Log10(Epsilon))
+ /// </remarks>
static abstract int MinDecimalExponent { get; }

+ /// <remarks>
+ /// Ceiling(Log10(MaxValue))
+ /// </remarks>
static abstract int MaxDecimalExponent { get; }

static abstract int ExponentBias { get; }
@@ -73,29 +83,53 @@ internal interface IBinaryFloatParseAndFormatInfo<TSelf> : IBinaryFloatingPointI
static abstract ushort NormalMantissaBits { get; }
static abstract ushort DenormalMantissaBits { get; }

+ /// <remarks>
+ /// Ceiling(Log10(2^(MinBinaryExponent - 1 - DenormalMantissaBits - 64)))
+ /// </remarks>
static abstract int MinFastFloatDecimalExponent { get; }

+ /// <remarks>
+ /// MaxDecimalExponent - 1
+ /// </remarks>
static abstract int MaxFastFloatDecimalExponent { get; }

+ /// <remarks>
+ /// -Floor(Log5(2^(64 - NormalMantissaBits)))
+ /// </remarks>
static abstract int MinExponentRoundToEven { get; }

+ /// <remarks>
+ /// Floor(Log5(2^(NormalMantissaBits + 1)))
+ /// </remarks>
static abstract int MaxExponentRoundToEven { get; }

+ /// <summary>
+ /// Max(n) when 10^n can be precisely represented
+ /// </summary>
static abstract int MaxExponentFastPath { get; }
static abstract ulong MaxMantissaFastPath { get; }

static abstract TSelf BitsToFloat(ulong bits);

static abstract ulong FloatToBits(TSelf value);

- // Maximum number of digits required to guarantee that any given floating point
- // number can roundtrip. Some numbers may require less, but none will require more.
+ /// <summary>
+ /// Maximum number of digits required to guarantee that any given floating point
+ /// number can roundtrip. Some numbers may require less, but none will require more.
+ /// </summary>
+ /// <remarks>
+ /// Ceiling(Log10(2^NormalMantissaBits)) + 1
+ /// </remarks>
static abstract int MaxRoundTripDigits { get; }

- // SinglePrecisionCustomFormat and DoublePrecisionCustomFormat are used to ensure that
- // custom format strings return the same string as in previous releases when the format
- // would return x digits or less (where x is the value of the corresponding constant).
- // In order to support more digits, we would need to update ParseFormatSpecifier to pre-parse
- // the format and determine exactly how many digits are being requested and whether they
- // represent "significant digits" or "digits after the decimal point".
+ /// <summary>
+ /// MaxPrecisionCustomFormat is used to ensure that
+ /// custom format strings return the same string as in previous releases when the format
+ /// would return x digits or less (where x is the value of the corresponding constant).
+ /// In order to support more digits, we would need to update ParseFormatSpecifier to pre-parse
+ /// the format and determine exactly how many digits are being requested and whether they
+ /// represent "significant digits" or "digits after the decimal point".
+ /// </summary>
static abstract int MaxPrecisionCustomFormat { get; }
}
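
The formulas in the new remarks can be spot-checked against the `Half` values visible in the Half.cs hunk above. The following is a rough sketch, not code from this PR: the layout constants are assumptions based on IEEE 754 binary16, and in particular `MinBinaryExponent = -14` (taken as 1 minus an exponent bias of 15) does not appear in the diff itself.

```csharp
using System;

// Assumed Half (binary16) parameters; names mirror the interface members.
const int NormalMantissaBits = 11;    // significand length including the implicit bit
const int DenormalMantissaBits = 10;  // trailing significand length
const int MinBinaryExponent = -14;    // assumption: 1 - ExponentBias, with a bias of 15
const double MaxValue = 65504.0;      // Half.MaxValue

// Ceiling(Log10(MaxValue))
int maxDecimalExponent = (int)Math.Ceiling(Math.Log10(MaxValue));

// MaxDecimalExponent - 1
int maxFastFloatDecimalExponent = maxDecimalExponent - 1;

// Ceiling(Log10(2^(MinBinaryExponent - 1 - DenormalMantissaBits - 64)))
int minFastFloatDecimalExponent = (int)Math.Ceiling(
    Math.Log10(Math.Pow(2, MinBinaryExponent - 1 - DenormalMantissaBits - 64)));

// Ceiling(Log10(2^NormalMantissaBits)) + 1
int maxRoundTripDigits = (int)Math.Ceiling(Math.Log10(Math.Pow(2, NormalMantissaBits))) + 1;

Console.WriteLine($"MaxDecimalExponent          = {maxDecimalExponent}");          // 5
Console.WriteLine($"MaxFastFloatDecimalExponent = {maxFastFloatDecimalExponent}"); // 4
Console.WriteLine($"MinFastFloatDecimalExponent = {minFastFloatDecimalExponent}"); // -26
Console.WriteLine($"MaxRoundTripDigits          = {maxRoundTripDigits}");          // 5
```

Under these assumptions the first three results agree with the `Half` members shown in the diff (5, 4, and the corrected -26); the same approach could be applied to the BFloat16 constants once its exponent and significand widths are substituted.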
