There are some ISAs whose hardware FMA instructions are guaranteed to be available whenever hardware floating-point support is present, including PPC, RISC-V (on CPUs implementing the "F" and "D" extensions), NVIDIA GPUs, and AMD GPUs.
GCC and Clang also provide the __builtin_fma and __builtin_fmaf builtins, which are guaranteed to compile down to a single FMA instruction on ISAs whose hardware floating point can carry out FMA in a single instruction, even with optimizations disabled (-O0).
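For illustration, here is a minimal standalone example of those builtins; the function names and the chosen test values are mine, not from this issue. On a target with hardware FMA, GCC/Clang lower the calls directly to the fused instruction, and the single rounding can be observed by recovering the rounding error of an ordinary product:

```cpp
#include <cstdio>

// Single-rounding fused multiply-add via the GCC/Clang builtins.
// On ISAs with hardware FMA (e.g. RISC-V with F/D), these lower to
// fmadd.s / fmadd.d even at -O0.
static float FmaF32(float a, float b, float c) { return __builtin_fmaf(a, b, c); }
static double FmaF64(double a, double b, double c) { return __builtin_fma(a, b, c); }

int main() {
  // The fused operation rounds only once, so it can recover the rounding
  // error of an ordinary (separately rounded) multiply: err = x*x - fl(x*x).
  const double x = 1.0 + 1e-8;
  const double prod = x * x;               // rounded product
  const double err = FmaF64(x, x, -prod);  // exact residual, typically nonzero
  printf("x*x = %.17g, rounding error = %g\n", prod, err);

  const float y = 1.0f + 1e-4f;
  printf("float fma residual: %.9g\n", FmaF32(y, y, -(y * y)));
  return 0;
}
```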
There are use cases for implementing MulAdd/NegMulAdd/MulSub/NegMulSub using __builtin_fma and __builtin_fmaf for SCALAR/EMU128 on RISC-V CPUs that implement the F and D extensions (but do not necessarily support the "V" SIMD extension), as well as on NVIDIA GPUs and AMD GPUs; a sketch of what that could look like follows below.
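A minimal, self-contained sketch of the idea for the scalar path, not Highway's actual SCALAR/EMU128 code: the Scalar<T> wrapper (standing in for a Vec1-style type with a .raw member), the SKETCH_HAS_HW_FMA macro, and the __riscv_flen-based guard are all illustrative assumptions.

```cpp
#include <type_traits>

template <typename T>
struct Scalar {  // illustrative stand-in for a Vec1<T>-style scalar wrapper
  T raw;
};

// Illustrative guard: GCC/Clang define __riscv_flen when the RISC-V F
// extension is present (>= 64 with D), and F/D include fmadd/fmsub/
// fnmadd/fnmsub as single instructions.
#if defined(__riscv_flen)
#define SKETCH_HAS_HW_FMA 1
#else
#define SKETCH_HAS_HW_FMA 0
#endif

// Dispatch to __builtin_fmaf for float and __builtin_fma for double.
template <typename T>
inline T FusedMulAdd(T a, T b, T c) {
#if SKETCH_HAS_HW_FMA
  if constexpr (std::is_same_v<T, float>) {
    return __builtin_fmaf(a, b, c);  // can lower to fmadd.s
  } else {
    return __builtin_fma(a, b, c);   // can lower to fmadd.d
  }
#else
  return a * b + c;  // fallback: separate multiply and add (two roundings)
#endif
}

template <typename T>
Scalar<T> MulAdd(Scalar<T> a, Scalar<T> b, Scalar<T> c) {
  return {FusedMulAdd(a.raw, b.raw, c.raw)};    // a*b + c
}
template <typename T>
Scalar<T> NegMulAdd(Scalar<T> a, Scalar<T> b, Scalar<T> c) {
  return {FusedMulAdd(-a.raw, b.raw, c.raw)};   // c - a*b
}
template <typename T>
Scalar<T> MulSub(Scalar<T> a, Scalar<T> b, Scalar<T> c) {
  return {FusedMulAdd(a.raw, b.raw, -c.raw)};   // a*b - c
}
template <typename T>
Scalar<T> NegMulSub(Scalar<T> a, Scalar<T> b, Scalar<T> c) {
  return {FusedMulAdd(-a.raw, b.raw, -c.raw)};  // -(a*b) - c
}
```

Whether such a guard would live directly in the scalar/EMU128 targets or behind a target-level feature flag is a design choice for the maintainers; the sketch only shows that the builtins make the fused path expressible without any SIMD extension.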