Migrate away from nonfunctional fenv stubs #510

Merged
tgross35 merged 3 commits into rust-lang:master from replace-fenv
Feb 10, 2025

tgross35
Contributor

Many routines have some form of handling for rounding mode and floating point exceptions, implemented via a combination of stubs and `force_eval!` use. This is suboptimal, however, because:

  1. Rust does not interact with the floating point environment, so most of this code does nothing.
  2. The parts of the code that are not dead are not testable.
  3. `force_eval!` blocks optimizations, which is unnecessary because we do not rely on its side effects.
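
To illustrate the third point, here is a minimal sketch of the old pattern. The macro body is an assumption (a volatile read of the expression, which the compiler must not remove), and `trunc_with_force_eval` and `trunc_plain` are hypothetical names, not functions from this crate.

```rust
// Assumed shape of a `force_eval!`-style macro: evaluate the expression and
// read it back through a volatile load so the compiler cannot discard it.
macro_rules! force_eval {
    ($e:expr) => {
        unsafe { ::core::ptr::read_volatile(&$e) }
    };
}

/// Hypothetical old-style routine. In C, the extra addition would raise the
/// "inexact" exception. Rust never observes the floating point environment,
/// so here it is only work the optimizer is forbidden from removing.
fn trunc_with_force_eval(x: f64) -> f64 {
    let x1p120 = f64::from_bits(0x4770000000000000); // 2^120
    let t = x.trunc();
    if t != x {
        force_eval!(x + x1p120);
    }
    t
}

/// The same result without the forced evaluation, free to be optimized.
fn trunc_plain(x: f64) -> f64 {
    x.trunc()
}
```

Both functions return the same value for every input; the only difference is the volatile read that the first one forces the compiler to keep.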

We cannot ensure correct rounding and exception handling in all cases without some form of arithmetic operations that are aware of this behavior. However, the cases where rounding mode is explicitly handled or exceptions are explicitly raised are testable. This PR makes that possible for functions that depend on `math::fenv` by moving the implementation to a nonpublic function that takes a `Round` and returns a `Status`.
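
A rough sketch of that shape, not the actual code in this PR: only the names `Round` and `Status` come from the description above. Their variants and constants, the `rint_round` helper, and the `(value, status)` return type are assumptions made for illustration.

```rust
/// Hypothetical rounding-mode selector (only the name comes from the PR text).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum Round {
    Nearest,
    Negative,
    Positive,
    Zero,
}

/// Hypothetical set of exception flags (only the name comes from the PR text).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct Status(u8);

impl Status {
    pub const OK: Status = Status(0);
    pub const INEXACT: Status = Status(1);
}

/// Nonpublic implementation: the rounding mode is an explicit argument and
/// the raised exceptions are an explicit return value, so both are testable.
fn rint_round(x: f64, round: Round) -> (f64, Status) {
    let rounded = match round {
        Round::Nearest => x.round_ties_even(),
        Round::Negative => x.floor(),
        Round::Positive => x.ceil(),
        Round::Zero => x.trunc(),
    };
    // NaN compares unequal to itself, but passing it through is not inexact.
    let status = if rounded == x || x.is_nan() {
        Status::OK
    } else {
        Status::INEXACT
    };
    (rounded, status)
}

/// Public entry point: fixes the default rounding mode and drops the status.
pub fn rint(x: f64) -> f64 {
    rint_round(x, Round::Nearest).0
}
```

With this split, a test can assert on `rint_round(2.5, Round::Positive)` directly instead of trying to observe the floating point environment.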

Cc: #480

@tgross35 tgross35 force-pushed the replace-fenv branch 5 times, most recently from d87dc07 to 1266321 Compare February 10, 2025 12:15
@tgross35
Contributor Author

Changes on the directly changed functions:

icount::icount_bench_ceil_group::icount_bench_ceil logspace:setup_ceil()
  Baselines:                      softfloat|softfloat
  Instructions:                        8259|9292                 (-11.1171%) [-1.12508x]
  L1 Hits:                            10094|11911                (-15.2548%) [-1.18001x]
  L2 Hits:                                4|2                    (+100.000%) [+2.00000x]
  RAM Hits:                               7|8                    (-12.5000%) [-1.14286x]
  Total read+write:                   10105|11921                (-15.2336%) [-1.17971x]
  Estimated Cycles:                   10359|12201                (-15.0971%) [-1.17782x]
icount::icount_bench_ceilf128_group::icount_bench_ceilf128 logspace:setup_ceilf128()
  Baselines:                      softfloat|softfloat
  Instructions:                        9035|45279                (-80.0459%) [-5.01151x]
  L1 Hits:                            12128|62803                (-80.6888%) [-5.17835x]
  L2 Hits:                                1|0                    (+++inf+++) [+++inf+++]
  RAM Hits:                               9|26                   (-65.3846%) [-2.88889x]
  Total read+write:                   12138|62829                (-80.6809%) [-5.17622x]
  Estimated Cycles:                   12448|63713                (-80.4624%) [-5.11833x]
icount::icount_bench_ceilf16_group::icount_bench_ceilf16 logspace:setup_ceilf16()
  Baselines:                      softfloat|softfloat
  Instructions:                       10371|34275                (-69.7418%) [-3.30489x]
  L1 Hits:                            12045|43974                (-72.6088%) [-3.65081x]
  L2 Hits:                                1|2                    (-50.0000%) [-2.00000x]
  RAM Hits:                               5|10                   (-50.0000%) [-2.00000x]
  Total read+write:                   12051|43986                (-72.6026%) [-3.64999x]
  Estimated Cycles:                   12225|44334                (-72.4252%) [-3.62650x]
icount::icount_bench_ceilf_group::icount_bench_ceilf logspace:setup_ceilf()
  Baselines:                      softfloat|softfloat
  Instructions:                        8739|9863                 (-11.3961%) [-1.12862x]
  L1 Hits:                            10577|12575                (-15.8887%) [-1.18890x]
  L2 Hits:                                1|2                    (-50.0000%) [-2.00000x]
  RAM Hits:                               5|6                    (-16.6667%) [-1.20000x]
  Total read+write:                   10583|12583                (-15.8945%) [-1.18898x]
  Estimated Cycles:                   10757|12795                (-15.9281%) [-1.18946x]
icount::icount_bench_floor_group::icount_bench_floor logspace:setup_floor()
  Baselines:                      softfloat|softfloat
  Instructions:                        8783|9803                 (-10.4050%) [-1.11613x]
  L1 Hits:                            10494|12298                (-14.6691%) [-1.17191x]
  L2 Hits:                                3|1                    (+200.000%) [+3.00000x]
  RAM Hits:                               7|8                    (-12.5000%) [-1.14286x]
  Total read+write:                   10504|12307                (-14.6502%) [-1.17165x]
  Estimated Cycles:                   10754|12583                (-14.5355%) [-1.17008x]
icount::icount_bench_floorf128_group::icount_bench_floorf128 logspace:setup_floorf128()
  Baselines:                      softfloat|softfloat
  Instructions:                       15662|50281                (-68.8511%) [-3.21038x]
  L1 Hits:                            20000|68676                (-70.8777%) [-3.43380x]
  L2 Hits:                                3|1                    (+200.000%) [+3.00000x]
  RAM Hits:                              12|29                   (-58.6207%) [-2.41667x]
  Total read+write:                   20015|68706                (-70.8686%) [-3.43273x]
  Estimated Cycles:                   20435|69696                (-70.6798%) [-3.41062x]
icount::icount_bench_floorf16_group::icount_bench_floorf16 logspace:setup_floorf16()
  Baselines:                      softfloat|softfloat
  Instructions:                       15362|37601                (-59.1447%) [-2.44766x]
  L1 Hits:                            18516|47901                (-61.3453%) [-2.58701x]
  L2 Hits:                                1|1                    (No change)
  RAM Hits:                               5|10                   (-50.0000%) [-2.00000x]
  Total read+write:                   18522|47912                (-61.3416%) [-2.58676x]
  Estimated Cycles:                   18696|48256                (-61.2566%) [-2.58109x]
icount::icount_bench_floorf_group::icount_bench_floorf logspace:setup_floorf()
  Baselines:                      softfloat|softfloat
  Instructions:                        9327|10403                (-10.3432%) [-1.11536x]
  L1 Hits:                            11038|12992                (-15.0400%) [-1.17702x]
  L2 Hits:                                3|2                    (+50.0000%) [+1.50000x]
  RAM Hits:                               6|5                    (+20.0000%) [+1.20000x]
  Total read+write:                   11047|12999                (-15.0165%) [-1.17670x]
  Estimated Cycles:                   11263|13177                (-14.5253%) [-1.16994x]
icount::icount_bench_fma_group::icount_bench_fma logspace:setup_fma()
  Baselines:                      softfloat|softfloat
  Instructions:                       54873|53800                (+1.99442%) [+1.01994x]
  L1 Hits:                            61633|60563                (+1.76676%) [+1.01767x]
  L2 Hits:                                7|4                    (+75.0000%) [+1.75000x]
  RAM Hits:                              20|20                   (No change)
  Total read+write:                   61660|60587                (+1.77101%) [+1.01771x]
  Estimated Cycles:                   62368|61283                (+1.77047%) [+1.01770x]
icount::icount_bench_fmaf128_group::icount_bench_fmaf128 logspace:setup_fmaf128()
  Baselines:                      softfloat|softfloat
  Instructions:                      182151|175791               (+3.61793%) [+1.03618x]
  L1 Hits:                           239884|229938               (+4.32551%) [+1.04326x]
  L2 Hits:                               60|64                   (-6.25000%) [-1.06667x]
  RAM Hits:                              66|64                   (+3.12500%) [+1.03125x]
  Total read+write:                  240010|230066               (+4.32224%) [+1.04322x]
  Estimated Cycles:                  242494|232498               (+4.29939%) [+1.04299x]
icount::icount_bench_round_group::icount_bench_round logspace:setup_round()
  Baselines:                      softfloat|softfloat
  Instructions:                       13345|14131                (-5.56224%) [-1.05890x]
  L1 Hits:                            14932|16507                (-9.54141%) [-1.10548x]
  L2 Hits:                                4|0                    (+++inf+++) [+++inf+++]
  RAM Hits:                               5|6                    (-16.6667%) [-1.20000x]
  Total read+write:                   14941|16513                (-9.51977%) [-1.10521x]
  Estimated Cycles:                   15127|16717                (-9.51128%) [-1.10511x]
icount::icount_bench_roundf128_group::icount_bench_roundf128 logspace:setup_roundf128()
  Baselines:                      softfloat|softfloat
  Instructions:                       78823|111559               (-29.3441%) [-1.41531x]
  L1 Hits:                           102396|147205               (-30.4399%) [-1.43760x]
  L2 Hits:                                2|0                    (+++inf+++) [+++inf+++]
  RAM Hits:                              25|26                   (-3.84615%) [-1.04000x]
  Total read+write:                  102423|147231               (-30.4338%) [-1.43748x]
  Estimated Cycles:                  103281|148115               (-30.2697%) [-1.43410x]
icount::icount_bench_roundf16_group::icount_bench_roundf16 logspace:setup_roundf16()
  Baselines:                      softfloat|softfloat
  Instructions:                       51800|71839                (-27.8943%) [-1.38685x]
  L1 Hits:                            63350|89421                (-29.1553%) [-1.41154x]
  L2 Hits:                                1|1                    (No change)
  RAM Hits:                               9|8                    (+12.5000%) [+1.12500x]
  Total read+write:                   63360|89430                (-29.1513%) [-1.41146x]
  Estimated Cycles:                   63670|89706                (-29.0237%) [-1.40892x]
icount::icount_bench_roundf_group::icount_bench_roundf logspace:setup_roundf()
  Baselines:                      softfloat|softfloat
  Instructions:                       13227|14106                (-6.23139%) [-1.06645x]
  L1 Hits:                            14816|16574                (-10.6070%) [-1.11866x]
  L2 Hits:                                2|1                    (+100.000%) [+2.00000x]
  RAM Hits:                               5|6                    (-16.6667%) [-1.20000x]
  Total read+write:                   14823|16581                (-10.6025%) [-1.11860x]
  Estimated Cycles:                   15001|16789                (-10.6498%) [-1.11919x]
icount::icount_bench_sqrt_group::icount_bench_sqrt logspace:setup_sqrt()
  Baselines:                      softfloat|softfloat
  Instructions:                       43141|42141                (+2.37299%) [+1.02373x]
  L1 Hits:                            45715|44718                (+2.22953%) [+1.02230x]
  L2 Hits:                                4|2                    (+100.000%) [+2.00000x]
  RAM Hits:                              16|15                   (+6.66667%) [+1.06667x]
  Total read+write:                   45735|44735                (+2.23539%) [+1.02235x]
  Estimated Cycles:                   46295|45253                (+2.30261%) [+1.02303x]
icount::icount_bench_sqrtf128_group::icount_bench_sqrtf128 logspace:setup_sqrtf128()
  Baselines:                      softfloat|softfloat
  Instructions:                      248361|241364               (+2.89894%) [+1.02899x]
  L1 Hits:                           318804|308311               (+3.40338%) [+1.03403x]
  L2 Hits:                                5|4                    (+25.0000%) [+1.25000x]
  RAM Hits:                              42|41                   (+2.43902%) [+1.02439x]
  Total read+write:                  318851|308356               (+3.40353%) [+1.03404x]
  Estimated Cycles:                  320299|309766               (+3.40031%) [+1.03400x]

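The wins on the rounding functions are pretty significant: instruction counts drop by roughly 6-11% for the f32 and f64 versions and by roughly 28-80% for the f16 and f128 versions. It looks like LLVM is having a slightly harder time optimizing sqrt and fma, which regress by about 2-4%.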
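
For reading the table, the percentage and factor columns appear to be derived from each `new|old` pair as sketched below. This is inferred from the numbers above rather than taken from the benchmark tool's documentation, and `percent_change` and `factor` are hypothetical helper names.

```rust
/// Percent change of the new count relative to the old one.
/// Negative means the new code executes fewer of whatever is counted.
fn percent_change(new: u64, old: u64) -> f64 {
    (new as f64 - old as f64) / old as f64 * 100.0
}

/// Signed ratio between the two counts: `-old/new` when the new count is
/// smaller, `+new/old` when it is larger or equal.
fn factor(new: u64, old: u64) -> f64 {
    if new >= old {
        new as f64 / old as f64
    } else {
        -(old as f64 / new as f64)
    }
}
```

For the `ceil` instruction row (8259|9292) this gives about -11.12% and -1.125x, and for the `fma` row (54873|53800) about +1.99% and +1.020x, matching the table.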

@tgross35 tgross35 force-pushed the replace-fenv branch 2 times, most recently from 0d878df to e13e944 Compare February 10, 2025 12:32
@tgross35 tgross35 enabled auto-merge February 10, 2025 12:38
@tgross35 tgross35 merged commit 5151be0 into rust-lang:master Feb 10, 2025
35 checks passed
@tgross35 tgross35 deleted the replace-fenv branch February 10, 2025 13:28