[flang] Integrate the option -flang-experimental-integer-overflow into -fno-wrapv #110063

yus3710-fj · 2024-09-26T00:25:49Z

nsw is now added to do-variable increment when -fno-wrapv is enabled as GFortran seems to do.
That means the option introduced by #91579 isn't necessary any more.

Note that the feature of -flang-experimental-integer-overflow is enabled by default.

…o -fno-wrapv nsw is added to do-variable increment when -fno-wrapv is enabled as GFortran seems to do. Therefore, the option introduced by PR91579 isn't necessary any more. Note that the feature of -flang-experimental-integer-overflow is now enabled by default.

llvmbot · 2024-10-21T10:23:19Z

@llvm/pr-subscribers-flang-driver
@llvm/pr-subscribers-flang-fir-hlfir

@llvm/pr-subscribers-clang

Author: Yusuke MINATO (yus3710-fj)

Changes

nsw is added to do-variable increment when -fno-wrapv is enabled as GFortran seems to do.
Therefore, the option introduced by #91579 isn't necessary any more.

Note that the feature of -flang-experimental-integer-overflow is now enabled by default.

Patch is 101.96 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/110063.diff

40 Files Affected:

(modified) clang/include/clang/Driver/Options.td (-4)
(modified) clang/lib/Driver/ToolChains/Flang.cpp (-1)
(modified) flang/include/flang/Lower/LoweringOptions.def (-5)
(modified) flang/include/flang/Optimizer/Transforms/Passes.h (+2-2)
(modified) flang/include/flang/Optimizer/Transforms/Passes.td (+1-1)
(modified) flang/include/flang/Tools/CrossToolHelpers.h (+1-1)
(modified) flang/lib/Frontend/CompilerInvocation.cpp (-6)
(modified) flang/lib/Frontend/FrontendActions.cpp (+2-2)
(modified) flang/lib/Lower/Bridge.cpp (+1-1)
(modified) flang/lib/Lower/IO.cpp (+1-1)
(modified) flang/lib/Optimizer/Passes/Pipelines.cpp (+3-3)
(modified) flang/lib/Optimizer/Transforms/ControlFlowConverter.cpp (+4-4)
(modified) flang/test/Driver/frontend-forwarding.f90 (-2)
(modified) flang/test/Fir/convert-to-llvm-openmp-and-fir.fir (+5-5)
(modified) flang/test/Fir/loop01.fir (+11-219)
(modified) flang/test/Fir/loop02.fir (+2-2)
(modified) flang/test/Lower/HLFIR/goto-do-body.f90 (+2-2)
(modified) flang/test/Lower/OpenMP/parallel-private-clause-fixes.f90 (+2-2)
(modified) flang/test/Lower/OpenMP/wsloop-reduction-allocatable-array-minmax.f90 (+2-2)
(modified) flang/test/Lower/OpenMP/wsloop-variable.f90 (+2-2)
(modified) flang/test/Lower/array-character.f90 (+1-1)
(modified) flang/test/Lower/array-derived-assignments.f90 (+1-1)
(modified) flang/test/Lower/array-derived.f90 (+1-1)
(modified) flang/test/Lower/array-elemental-calls-char-byval.f90 (+1-1)
(modified) flang/test/Lower/array-elemental-calls-char.f90 (+1-1)
(modified) flang/test/Lower/array-expression-assumed-size.f90 (+1-1)
(modified) flang/test/Lower/array-expression-slice-1.f90 (+1-1)
(modified) flang/test/Lower/array-substring.f90 (+5-41)
(modified) flang/test/Lower/array-temp.f90 (+1-1)
(modified) flang/test/Lower/components.f90 (+1-1)
(modified) flang/test/Lower/do_loop.f90 (+17-56)
(modified) flang/test/Lower/do_loop_unstructured.f90 (+11-192)
(modified) flang/test/Lower/goto-do-body.f90 (+2-2)
(modified) flang/test/Lower/host-associated.f90 (+1-1)
(modified) flang/test/Lower/infinite_loop.f90 (+5-36)
(modified) flang/test/Lower/io-implied-do-fixes.f90 (+6-43)
(modified) flang/test/Lower/loops2.f90 (+1-1)
(modified) flang/test/Lower/mixed_loops.f90 (+3-3)
(modified) flang/test/Lower/vector-subscript-io.f90 (+17-17)
(modified) flang/tools/bbc/bbc.cpp (+1-8)

diff --git a/clang/include/clang/Driver/Options.td b/clang/include/clang/Driver/Options.td
index 152c43d7908ff8..6ac1b3beac8a07 100644
--- a/clang/include/clang/Driver/Options.td
+++ b/clang/include/clang/Driver/Options.td
@@ -6771,10 +6771,6 @@ def flang_deprecated_no_hlfir : Flag<["-"], "flang-deprecated-no-hlfir">,
   Flags<[HelpHidden]>, Visibility<[FlangOption, FC1Option]>,
   HelpText<"Do not use HLFIR lowering (deprecated)">;
 
-def flang_experimental_integer_overflow : Flag<["-"], "flang-experimental-integer-overflow">,
-  Flags<[HelpHidden]>, Visibility<[FlangOption, FC1Option]>,
-  HelpText<"Add nsw flag to internal operations such as do-variable increment (experimental)">;
-
 //===----------------------------------------------------------------------===//
 // FLangOption + CoreOption + NoXarchOption
 //===----------------------------------------------------------------------===//
diff --git a/clang/lib/Driver/ToolChains/Flang.cpp b/clang/lib/Driver/ToolChains/Flang.cpp
index a9d2b7a4dc48f9..366cadc2e54775 100644
--- a/clang/lib/Driver/ToolChains/Flang.cpp
+++ b/clang/lib/Driver/ToolChains/Flang.cpp
@@ -148,7 +148,6 @@ void Flang::addCodegenOptions(const ArgList &Args,
 
   Args.addAllArgs(CmdArgs, {options::OPT_flang_experimental_hlfir,
                             options::OPT_flang_deprecated_no_hlfir,
-                            options::OPT_flang_experimental_integer_overflow,
                             options::OPT_fno_ppc_native_vec_elem_order,
                             options::OPT_fppc_native_vec_elem_order});
 }
diff --git a/flang/include/flang/Lower/LoweringOptions.def b/flang/include/flang/Lower/LoweringOptions.def
index 231de533fbd30a..0b22e54b648e94 100644
--- a/flang/include/flang/Lower/LoweringOptions.def
+++ b/flang/include/flang/Lower/LoweringOptions.def
@@ -38,10 +38,5 @@ ENUM_LOWERINGOPT(Underscoring, unsigned, 1, 1)
 /// (i.e. wraps around as two's complement). Off by default.
 ENUM_LOWERINGOPT(IntegerWrapAround, unsigned, 1, 0)
 
-/// If true, add nsw flags to loop variable increments.
-/// Off by default.
-/// TODO: integrate this option with the above
-ENUM_LOWERINGOPT(NSWOnLoopVarInc, unsigned, 1, 0)
-
 #undef LOWERINGOPT
 #undef ENUM_LOWERINGOPT
diff --git a/flang/include/flang/Optimizer/Transforms/Passes.h b/flang/include/flang/Optimizer/Transforms/Passes.h
index 3b2af3a3398108..e4dcb0060c778a 100644
--- a/flang/include/flang/Optimizer/Transforms/Passes.h
+++ b/flang/include/flang/Optimizer/Transforms/Passes.h
@@ -65,7 +65,7 @@ namespace fir {
 std::unique_ptr<mlir::Pass> createAffineDemotionPass();
 std::unique_ptr<mlir::Pass>
 createArrayValueCopyPass(fir::ArrayValueCopyOptions options = {});
-std::unique_ptr<mlir::Pass> createCFGConversionPassWithNSW();
+std::unique_ptr<mlir::Pass> createCFGConversionPassWithoutNSW();
 std::unique_ptr<mlir::Pass> createMemDataFlowOptPass();
 std::unique_ptr<mlir::Pass> createPromoteToAffinePass();
 std::unique_ptr<mlir::Pass>
@@ -82,7 +82,7 @@ createVScaleAttrPass(std::pair<unsigned, unsigned> vscaleAttr);
 
 void populateCfgConversionRewrites(mlir::RewritePatternSet &patterns,
                                    bool forceLoopToExecuteOnce = false,
-                                   bool setNSW = false);
+                                   bool setNSW = true);
 
 // declarative passes
 #define GEN_PASS_REGISTRATION
diff --git a/flang/include/flang/Optimizer/Transforms/Passes.td b/flang/include/flang/Optimizer/Transforms/Passes.td
index af6bd41cbb71da..09eed385c2c931 100644
--- a/flang/include/flang/Optimizer/Transforms/Passes.td
+++ b/flang/include/flang/Optimizer/Transforms/Passes.td
@@ -153,7 +153,7 @@ def CFGConversion : Pass<"cfg-conversion"> {
            /*default=*/"false",
            "force the body of a loop to execute at least once">,
     Option<"setNSW", "set-nsw", "bool",
-           /*default=*/"false",
+           /*default=*/"true",
            "set nsw on loop variable increment">
   ];
 }
diff --git a/flang/include/flang/Tools/CrossToolHelpers.h b/flang/include/flang/Tools/CrossToolHelpers.h
index df4b21ada058fe..1626970600eff2 100644
--- a/flang/include/flang/Tools/CrossToolHelpers.h
+++ b/flang/include/flang/Tools/CrossToolHelpers.h
@@ -122,7 +122,7 @@ struct MLIRToLLVMPassPipelineConfig : public FlangEPCallBacks {
   bool NoSignedZerosFPMath =
       false; ///< Set no-signed-zeros-fp-math attribute for functions.
   bool UnsafeFPMath = false; ///< Set unsafe-fp-math attribute for functions.
-  bool NSWOnLoopVarInc = false; ///< Add nsw flag to loop variable increments.
+  bool NSWOnLoopVarInc = true; ///< Add nsw flag to loop variable increments.
 };
 
 struct OffloadModuleOpts {
diff --git a/flang/lib/Frontend/CompilerInvocation.cpp b/flang/lib/Frontend/CompilerInvocation.cpp
index 94d3d115417877..5da5236af2b0e1 100644
--- a/flang/lib/Frontend/CompilerInvocation.cpp
+++ b/flang/lib/Frontend/CompilerInvocation.cpp
@@ -1350,12 +1350,6 @@ bool CompilerInvocation::createFromArgs(
     invoc.loweringOpts.setNoPPCNativeVecElemOrder(true);
   }
 
-  // -flang-experimental-integer-overflow
-  if (args.hasArg(
-          clang::driver::options::OPT_flang_experimental_integer_overflow)) {
-    invoc.loweringOpts.setNSWOnLoopVarInc(true);
-  }
-
   // Preserve all the remark options requested, i.e. -Rpass, -Rpass-missed or
   // -Rpass-analysis. This will be used later when processing and outputting the
   // remarks generated by LLVM in ExecuteCompilerInvocation.cpp.
diff --git a/flang/lib/Frontend/FrontendActions.cpp b/flang/lib/Frontend/FrontendActions.cpp
index f2e460fc53a67f..3ea242315484fd 100644
--- a/flang/lib/Frontend/FrontendActions.cpp
+++ b/flang/lib/Frontend/FrontendActions.cpp
@@ -828,8 +828,8 @@ void CodeGenAction::generateLLVMIR() {
     config.VScaleMax = vsr->second;
   }
 
-  if (ci.getInvocation().getLoweringOpts().getNSWOnLoopVarInc())
-    config.NSWOnLoopVarInc = true;
+  if (ci.getInvocation().getLoweringOpts().getIntegerWrapAround())
+    config.NSWOnLoopVarInc = false;
 
   // Create the pass pipeline
   fir::createMLIRToLLVMPassPipeline(pm, config, getCurrentFile());
diff --git a/flang/lib/Lower/Bridge.cpp b/flang/lib/Lower/Bridge.cpp
index 877fe122265dd0..a3bd1ace11da21 100644
--- a/flang/lib/Lower/Bridge.cpp
+++ b/flang/lib/Lower/Bridge.cpp
@@ -2271,7 +2271,7 @@ class FirConverter : public Fortran::lower::AbstractConverter {
     assert(!incrementLoopNestInfo.empty() && "empty loop nest");
     mlir::Location loc = toLocation();
     mlir::arith::IntegerOverflowFlags flags{};
-    if (getLoweringOptions().getNSWOnLoopVarInc())
+    if (!getLoweringOptions().getIntegerWrapAround())
       flags = bitEnumSet(flags, mlir::arith::IntegerOverflowFlags::nsw);
     auto iofAttr = mlir::arith::IntegerOverflowFlagsAttr::get(
         builder->getContext(), flags);
diff --git a/flang/lib/Lower/IO.cpp b/flang/lib/Lower/IO.cpp
index 1894b0cfd1bec2..b534c81a605a90 100644
--- a/flang/lib/Lower/IO.cpp
+++ b/flang/lib/Lower/IO.cpp
@@ -929,7 +929,7 @@ static void genIoLoop(Fortran::lower::AbstractConverter &converter,
   fir::FirOpBuilder &builder = converter.getFirOpBuilder();
   mlir::Location loc = converter.getCurrentLocation();
   mlir::arith::IntegerOverflowFlags flags{};
-  if (converter.getLoweringOptions().getNSWOnLoopVarInc())
+  if (!converter.getLoweringOptions().getIntegerWrapAround())
     flags = bitEnumSet(flags, mlir::arith::IntegerOverflowFlags::nsw);
   auto iofAttr =
       mlir::arith::IntegerOverflowFlagsAttr::get(builder.getContext(), flags);
diff --git a/flang/lib/Optimizer/Passes/Pipelines.cpp b/flang/lib/Optimizer/Passes/Pipelines.cpp
index 3fa5c54403bd8c..fc13d77469f17b 100644
--- a/flang/lib/Optimizer/Passes/Pipelines.cpp
+++ b/flang/lib/Optimizer/Passes/Pipelines.cpp
@@ -35,11 +35,11 @@ void addCanonicalizerPassWithoutRegionSimplification(mlir::OpPassManager &pm) {
 void addCfgConversionPass(mlir::PassManager &pm,
                           const MLIRToLLVMPassPipelineConfig &config) {
   if (config.NSWOnLoopVarInc)
-    addNestedPassToAllTopLevelOperationsConditionally(
-        pm, disableCfgConversion, fir::createCFGConversionPassWithNSW);
-  else
     addNestedPassToAllTopLevelOperationsConditionally(pm, disableCfgConversion,
                                                       fir::createCFGConversion);
+  else
+    addNestedPassToAllTopLevelOperationsConditionally(
+        pm, disableCfgConversion, fir::createCFGConversionPassWithoutNSW);
 }
 
 void addAVC(mlir::PassManager &pm, const llvm::OptimizationLevel &optLevel) {
diff --git a/flang/lib/Optimizer/Transforms/ControlFlowConverter.cpp b/flang/lib/Optimizer/Transforms/ControlFlowConverter.cpp
index 3b79d6d311b71c..411bf7f364a602 100644
--- a/flang/lib/Optimizer/Transforms/ControlFlowConverter.cpp
+++ b/flang/lib/Optimizer/Transforms/ControlFlowConverter.cpp
@@ -332,8 +332,6 @@ class CfgConversion : public fir::impl::CFGConversionBase<CfgConversion> {
 public:
   using CFGConversionBase<CfgConversion>::CFGConversionBase;
 
-  CfgConversion(bool setNSW) { this->setNSW = setNSW; }
-
   void runOnOperation() override {
     auto *context = &this->getContext();
     mlir::RewritePatternSet patterns(context);
@@ -366,6 +364,8 @@ void fir::populateCfgConversionRewrites(mlir::RewritePatternSet &patterns,
       patterns.getContext(), forceLoopToExecuteOnce, setNSW);
 }
 
-std::unique_ptr<mlir::Pass> fir::createCFGConversionPassWithNSW() {
-  return std::make_unique<CfgConversion>(true);
+std::unique_ptr<mlir::Pass> fir::createCFGConversionPassWithoutNSW() {
+  fir::CFGConversionOptions options;
+  options.setNSW = false;
+  return fir::createCFGConversion(options);
 }
diff --git a/flang/test/Driver/frontend-forwarding.f90 b/flang/test/Driver/frontend-forwarding.f90
index ff2d6609521464..55a74ccf40467b 100644
--- a/flang/test/Driver/frontend-forwarding.f90
+++ b/flang/test/Driver/frontend-forwarding.f90
@@ -20,7 +20,6 @@
 ! RUN:     -fversion-loops-for-stride \
 ! RUN:     -flang-experimental-hlfir \
 ! RUN:     -flang-deprecated-no-hlfir \
-! RUN:     -flang-experimental-integer-overflow \
 ! RUN:     -fno-ppc-native-vector-element-order \
 ! RUN:     -fppc-native-vector-element-order \
 ! RUN:     -mllvm -print-before-all \
@@ -52,7 +51,6 @@
 ! CHECK: "-fversion-loops-for-stride"
 ! CHECK: "-flang-experimental-hlfir"
 ! CHECK: "-flang-deprecated-no-hlfir"
-! CHECK: "-flang-experimental-integer-overflow"
 ! CHECK: "-fno-ppc-native-vector-element-order"
 ! CHECK: "-fppc-native-vector-element-order"
 ! CHECK: "-Rpass"
diff --git a/flang/test/Fir/convert-to-llvm-openmp-and-fir.fir b/flang/test/Fir/convert-to-llvm-openmp-and-fir.fir
index 335877e7c9a872..00f8e6e6cc9a6b 100644
--- a/flang/test/Fir/convert-to-llvm-openmp-and-fir.fir
+++ b/flang/test/Fir/convert-to-llvm-openmp-and-fir.fir
@@ -358,10 +358,10 @@ func.func @_QPopenmp_target_data_region() {
       %9 = arith.subi %8, %c1_i64 : i64
       %10 = fir.coordinate_of %0, %9 : (!fir.ref<!fir.array<1024xi32>>, i64) -> !fir.ref<i32>
       fir.store %6 to %10 : !fir.ref<i32>
-      %11 = arith.addi %arg0, %c1 : index
+      %11 = arith.addi %arg0, %c1 overflow<nsw> : index
       %12 = fir.convert %c1 : (index) -> i32
       %13 = fir.load %1 : !fir.ref<i32>
-      %14 = arith.addi %13, %12 : i32
+      %14 = arith.addi %13, %12 overflow<nsw> : i32
       fir.result %11, %14 : index, i32
     }
     fir.store %5#1 to %1 : !fir.ref<i32>
@@ -404,11 +404,11 @@ func.func @_QPopenmp_target_data_region() {
 // CHECK:             %[[VAL_21:.*]] = llvm.sub %[[VAL_19]], %[[VAL_20]]  : i64
 // CHECK:             %[[VAL_22:.*]] = llvm.getelementptr %[[VAL_1]][0, %[[VAL_21]]] : (!llvm.ptr, i64) -> !llvm.ptr
 // CHECK:             llvm.store %[[VAL_17]], %[[VAL_22]] : i32, !llvm.ptr
-// CHECK:             %[[VAL_23:.*]] = llvm.add %[[VAL_12]], %[[VAL_8]]  : i64
+// CHECK:             %[[VAL_23:.*]] = llvm.add %[[VAL_12]], %[[VAL_8]] overflow<nsw> : i64
 // CHECK:             %[[VAL_24:.*]] = llvm.trunc %[[VAL_8]] : i64 to i32
 // CHECK:             %[[VAL_25:.*]] = llvm.load %[[VAL_3]] : !llvm.ptr -> i32
-// CHECK:             %[[VAL_26:.*]] = llvm.add %[[VAL_25]], %[[VAL_24]]  : i32
-// CHECK:             %[[VAL_27:.*]] = llvm.add %[[VAL_12]], %[[VAL_8]]  : i64
+// CHECK:             %[[VAL_26:.*]] = llvm.add %[[VAL_25]], %[[VAL_24]] overflow<nsw> : i32
+// CHECK:             %[[VAL_27:.*]] = llvm.add %[[VAL_12]], %[[VAL_8]] overflow<nsw> : i64
 // CHECK:             %[[VAL_28:.*]] = llvm.mlir.constant(1 : index) : i64
 // CHECK:             %[[VAL_29:.*]] = llvm.sub %[[VAL_14]], %[[VAL_28]]  : i64
 // CHECK:             llvm.br ^bb1(%[[VAL_27]], %[[VAL_26]], %[[VAL_29]] : i64, i32, i64)
diff --git a/flang/test/Fir/loop01.fir b/flang/test/Fir/loop01.fir
index c1cbb522c378c0..30d10b9bbdb979 100644
--- a/flang/test/Fir/loop01.fir
+++ b/flang/test/Fir/loop01.fir
@@ -1,5 +1,7 @@
 // RUN: fir-opt --split-input-file --cfg-conversion %s | FileCheck %s
-// RUN: fir-opt --split-input-file --cfg-conversion="set-nsw=true" %s | FileCheck %s --check-prefix=NSW
+// RUN: fir-opt --split-input-file --cfg-conversion="set-nsw=false" %s | FileCheck %s --check-prefix=NO-NSW
+
+// NO-NSW-NOT: overflow<nsw>
 
 func.func @x(%lb : index, %ub : index, %step : index, %b : i1, %addr : !fir.ref<index>) {
   fir.do_loop %iv = %lb to %ub step %step unordered {
@@ -35,7 +37,7 @@ func.func private @f2() -> i1
 // CHECK:       fir.store %[[VAL_12]] to %[[VAL_4]] : !fir.ref<index>
 // CHECK:       br ^bb5
 // CHECK:     ^bb5:
-// CHECK:       %[[VAL_13:.*]] = arith.addi %[[VAL_8]], %[[VAL_2]] : index
+// CHECK:       %[[VAL_13:.*]] = arith.addi %[[VAL_8]], %[[VAL_2]] overflow<nsw> : index
 // CHECK:       %[[VAL_14:.*]] = arith.constant 1 : index
 // CHECK:       %[[VAL_15:.*]] = arith.subi %[[VAL_9]], %[[VAL_14]] : index
 // CHECK:       br ^bb1(%[[VAL_13]], %[[VAL_15]] : index, index)
@@ -44,34 +46,6 @@ func.func private @f2() -> i1
 // CHECK:     }
 // CHECK:     func private @f2() -> i1
 
-// NSW:     func @x(%[[VAL_0:.*]]: index, %[[VAL_1:.*]]: index, %[[VAL_2:.*]]: index, %[[VAL_3:.*]]: i1, %[[VAL_4:.*]]: !fir.ref<index>) {
-// NSW:       %[[VAL_5:.*]] = arith.subi %[[VAL_1]], %[[VAL_0]] : index
-// NSW:       %[[VAL_6:.*]] = arith.addi %[[VAL_5]], %[[VAL_2]] : index
-// NSW:       %[[VAL_7:.*]] = arith.divsi %[[VAL_6]], %[[VAL_2]] : index
-// NSW:       br ^bb1(%[[VAL_0]], %[[VAL_7]] : index, index)
-// NSW:     ^bb1(%[[VAL_8:.*]]: index, %[[VAL_9:.*]]: index):
-// NSW:       %[[VAL_10:.*]] = arith.constant 0 : index
-// NSW:       %[[VAL_11:.*]] = arith.cmpi sgt, %[[VAL_9]], %[[VAL_10]] : index
-// NSW:       cond_br %[[VAL_11]], ^bb2, ^bb6
-// NSW:     ^bb2:
-// NSW:       cond_br %[[VAL_3]], ^bb3, ^bb4
-// NSW:     ^bb3:
-// NSW:       fir.store %[[VAL_8]] to %[[VAL_4]] : !fir.ref<index>
-// NSW:       br ^bb5
-// NSW:     ^bb4:
-// NSW:       %[[VAL_12:.*]] = arith.constant 0 : index
-// NSW:       fir.store %[[VAL_12]] to %[[VAL_4]] : !fir.ref<index>
-// NSW:       br ^bb5
-// NSW:     ^bb5:
-// NSW:       %[[VAL_13:.*]] = arith.addi %[[VAL_8]], %[[VAL_2]] overflow<nsw> : index
-// NSW:       %[[VAL_14:.*]] = arith.constant 1 : index
-// NSW:       %[[VAL_15:.*]] = arith.subi %[[VAL_9]], %[[VAL_14]] : index
-// NSW:       br ^bb1(%[[VAL_13]], %[[VAL_15]] : index, index)
-// NSW:     ^bb6:
-// NSW:       return
-// NSW:     }
-// NSW:     func private @f2() -> i1
-
 // -----
 
 func.func @x2(%lo : index, %up : index, %ok : i1) {
@@ -101,36 +75,13 @@ func.func private @f3(i16)
 // CHECK:     cond_br %[[VAL_14]], ^bb2, ^bb3
 // CHECK:   ^bb2:
 // CHECK:     %[[VAL_15:.*]] = fir.call @f2() : () -> i1
-// CHECK:     %[[VAL_16:.*]] = arith.addi %[[VAL_4]], %[[VAL_3]] : index
+// CHECK:     %[[VAL_16:.*]] = arith.addi %[[VAL_4]], %[[VAL_3]] overflow<nsw> : index
 // CHECK:     br ^bb1(%[[VAL_16]], %[[VAL_15]] : index, i1)
 // CHECK:   ^bb3:
 // CHECK:     return
 // CHECK:   }
 // CHECK:   func private @f3(i16)
 
-// NSW:   func @x2(%[[VAL_0:.*]]: index, %[[VAL_1:.*]]: index, %[[VAL_2:.*]]: i1) {
-// NSW:     %[[VAL_3:.*]] = arith.constant 1 : index
-// NSW:     br ^bb1(%[[VAL_0]], %[[VAL_2]] : index, i1)
-// NSW:   ^bb1(%[[VAL_4:.*]]: index, %[[VAL_5:.*]]: i1):
-// NSW:     %[[VAL_6:.*]] = arith.constant 0 : index
-// NSW:     %[[VAL_7:.*]] = arith.cmpi slt, %[[VAL_6]], %[[VAL_3]] : index
-// NSW:     %[[VAL_8:.*]] = arith.cmpi sle, %[[VAL_4]], %[[VAL_1]] : index
-// NSW:     %[[VAL_9:.*]] = arith.cmpi slt, %[[VAL_3]], %[[VAL_6]] : index
-// NSW:     %[[VAL_10:.*]] = arith.cmpi sle, %[[VAL_1]], %[[VAL_4]] : index
-// NSW:     %[[VAL_11:.*]] = arith.andi %[[VAL_7]], %[[VAL_8]] : i1
-// NSW:     %[[VAL_12:.*]] = arith.andi %[[VAL_9]], %[[VAL_10]] : i1
-// NSW:     %[[VAL_13:.*]] = arith.ori %[[VAL_11]], %[[VAL_12]] : i1
-// NSW:     %[[VAL_14:.*]] = arith.andi %[[VAL_5]], %[[VAL_13]] : i1
-// NSW:     cond_br %[[VAL_14]], ^bb2, ^bb3
-// NSW:   ^bb2:
-// NSW:     %[[VAL_15:.*]] = fir.call @f2() : () -> i1
-// NSW:     %[[VAL_16:.*]] = arith.addi %[[VAL_4]], %[[VAL_3]] overflow<nsw> : index
-// NSW:     br ^bb1(%[[VAL_16]], %[[VAL_15]] : index, i1)
-// NSW:   ^bb3:
-// NSW:     return
-// NSW:   }
-// NSW:   func private @f3(i16)
-
 // -----
 
 // do_loop with an extra loop-carried value
@@ -159,7 +110,7 @@ func.func @x3(%lo : index, %up : index) -> i1 {
 // CHECK:           cond_br %[[VAL_11]], ^bb2, ^bb3
 // CHECK:         ^bb2:
 // CHECK:           %[[VAL_12:.*]] = fir.call @f2() : () -> i1
-// CHECK:           %[[VAL_13:.*]] = arith.addi %[[VAL_7]], %[[VAL_2]] : index
+// CHECK:           %[[VAL_13:.*]] = arith.addi %[[VAL_7]], %[[VAL_2]] overflow<nsw> : index
 // CHECK:           %[[VAL_14:.*]] = arith.constant 1 : index
 // CHECK:           %[[VAL_15:.*]] = arith.subi %[[VAL_9]], %[[VAL_14]] : index
 // CHECK:           br ^bb1(%[[VAL_13]], %[[VAL_12]], %[[VAL_15]] : index, i1, index)
@@ -167,29 +118,6 @@ func.func @x3(%lo : index, %up : index) -> i1 {
 // CHECK:           return %[[VAL_8]] : i1
 // CHECK:         }
 
-// NSW-LABEL:   func @x3(
-// NSW-SAME:             %[[VAL_0:.*]]: index,
-// NSW-SAME:             %[[VAL_1:.*]]: index) -> i1 {
-// NSW:           %[[VAL_2:.*]] = arith.constant 1 : index
-// NSW:           %[[VAL_3:.*]] = arith.constant true
-// NSW:           %[[VAL_4:.*]] = arith.subi %[[VAL_1]], %[[VAL_0]] : index
-// NSW:           %[[VAL_5:.*]] = arith.addi %[[VAL_4]], %[[VAL_2]] : index
-// NSW:           %[[VAL_6:.*]] = arith.divsi %[[VAL_5]], %[[VAL_2]] : index
-// NSW:           br ^bb1(%[[VAL_0]], %[[VAL_3]], %[[VAL_6]] : index, i1, index)
-// NSW:         ^bb1(%[[VAL_7:.*]]: index, %[[VAL_8:.*]]: i1, %[[VAL_9:.*]]: index):
-// NSW:           %[[VAL_10:.*]] = arith.constant 0 : index
-// NSW:           %[[VAL_11:.*]] = arith.cmpi sgt, %[[VAL_9]], %[[VAL_10]] : index
-// NSW:           cond_br %[[VAL_11]], ^bb2, ^bb3
-// NSW:         ^bb2:
-// NSW:           %[[VAL_12:.*]] = fir.call @f2() : () -> i1
-// NSW:           %[[VAL_13:.*]] = arith.addi %[[VAL_7]], %[[VAL_2]] overflow<nsw> : index
-// NSW:           %[[VAL_14:.*]] = arith.constant 1 : index
-// NSW:           %[[VAL_15:.*]] = arith.subi %[[VAL_9]], %[[VAL_14]] : index
-// NSW:           br ^bb1(%[[VAL_13]], %[[VAL_12]], %[[VAL_15]] : index, i1, index)
-// NSW:         ^bb3:
-// NSW:           return %[[VAL_8]] : i1
-// NSW:         }
-
 // -----
 
 // iterate_while with an extra loop-carried value
@@ -227,7 +155,7 @@ func.func private @f4(i32) -> i1
 // CHECK:           cond_br %[[VAL_16]], ^bb2, ^bb3
 // CHECK:         ^bb2:
 // CHECK:           %[[VAL_17:.*]] = fir.call @f2() : () -> i1
-// CHECK:           %[[VAL_18:.*]] = arith.addi %[[VAL_5]], %[[VAL_2]] : index
+// CHECK:           %[[VAL_18:.*]] = arith.addi %[[VAL_5]], %[[VAL_2]] overflow<nsw> : index
 // CHECK:           br ^bb1(%[[VAL_18]], %[[VAL_6]], %[[VAL_17]] : index, i1, i1)
 // CHECK:         ^bb3:
 // CHECK:           %[[VAL_19:.*]] = arith.andi %[[VAL_6]], %[[VAL_7]] : i1
@@ -235,34 +163,6 @@ func.func private @f4(i32) -> i1
 // CHECK:         }
 // CHECK:         func private @f4(i32) -> i1
 
-// NSW-LABEL:   func @y3(
-// NSW-SAME:             %[[VAL_0:.*]]: index,
-// NSW-SAME:             %[[VAL_1:.*]]: index) -> i1 {
-// NSW:           %[[VAL_2:.*]] = arith.constant 1 : index
-// NSW:           %[[VAL_3:.*]] = arith.constant true
-// NSW:           %[[VAL_4:.*]] = fir.call @f2() : () -> i1
-// NSW:           br ^bb1(%[[VAL_0]], %[[VAL_3]], %[[VAL_4]] : index, i1, i1)
-// NSW:         ^bb1(%[[VAL_5:.*]]: index, %[[VAL_6:.*]]: i...
[truncated]

llvmbot · 2024-10-21T10:23:20Z

@llvm/pr-subscribers-clang-driver

Author: Yusuke MINATO (yus3710-fj)

Changes

nsw is added to do-variable increment when -fno-wrapv is enabled as GFortran seems to do.
Therefore, the option introduced by #91579 isn't necessary any more.

Note that the feature of -flang-experimental-integer-overflow is now enabled by default.

Patch is 101.96 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/110063.diff

40 Files Affected:

(modified) clang/include/clang/Driver/Options.td (-4)
(modified) clang/lib/Driver/ToolChains/Flang.cpp (-1)
(modified) flang/include/flang/Lower/LoweringOptions.def (-5)
(modified) flang/include/flang/Optimizer/Transforms/Passes.h (+2-2)
(modified) flang/include/flang/Optimizer/Transforms/Passes.td (+1-1)
(modified) flang/include/flang/Tools/CrossToolHelpers.h (+1-1)
(modified) flang/lib/Frontend/CompilerInvocation.cpp (-6)
(modified) flang/lib/Frontend/FrontendActions.cpp (+2-2)
(modified) flang/lib/Lower/Bridge.cpp (+1-1)
(modified) flang/lib/Lower/IO.cpp (+1-1)
(modified) flang/lib/Optimizer/Passes/Pipelines.cpp (+3-3)
(modified) flang/lib/Optimizer/Transforms/ControlFlowConverter.cpp (+4-4)
(modified) flang/test/Driver/frontend-forwarding.f90 (-2)
(modified) flang/test/Fir/convert-to-llvm-openmp-and-fir.fir (+5-5)
(modified) flang/test/Fir/loop01.fir (+11-219)
(modified) flang/test/Fir/loop02.fir (+2-2)
(modified) flang/test/Lower/HLFIR/goto-do-body.f90 (+2-2)
(modified) flang/test/Lower/OpenMP/parallel-private-clause-fixes.f90 (+2-2)
(modified) flang/test/Lower/OpenMP/wsloop-reduction-allocatable-array-minmax.f90 (+2-2)
(modified) flang/test/Lower/OpenMP/wsloop-variable.f90 (+2-2)
(modified) flang/test/Lower/array-character.f90 (+1-1)
(modified) flang/test/Lower/array-derived-assignments.f90 (+1-1)
(modified) flang/test/Lower/array-derived.f90 (+1-1)
(modified) flang/test/Lower/array-elemental-calls-char-byval.f90 (+1-1)
(modified) flang/test/Lower/array-elemental-calls-char.f90 (+1-1)
(modified) flang/test/Lower/array-expression-assumed-size.f90 (+1-1)
(modified) flang/test/Lower/array-expression-slice-1.f90 (+1-1)
(modified) flang/test/Lower/array-substring.f90 (+5-41)
(modified) flang/test/Lower/array-temp.f90 (+1-1)
(modified) flang/test/Lower/components.f90 (+1-1)
(modified) flang/test/Lower/do_loop.f90 (+17-56)
(modified) flang/test/Lower/do_loop_unstructured.f90 (+11-192)
(modified) flang/test/Lower/goto-do-body.f90 (+2-2)
(modified) flang/test/Lower/host-associated.f90 (+1-1)
(modified) flang/test/Lower/infinite_loop.f90 (+5-36)
(modified) flang/test/Lower/io-implied-do-fixes.f90 (+6-43)
(modified) flang/test/Lower/loops2.f90 (+1-1)
(modified) flang/test/Lower/mixed_loops.f90 (+3-3)
(modified) flang/test/Lower/vector-subscript-io.f90 (+17-17)
(modified) flang/tools/bbc/bbc.cpp (+1-8)

diff --git a/clang/include/clang/Driver/Options.td b/clang/include/clang/Driver/Options.td
index 152c43d7908ff8..6ac1b3beac8a07 100644
--- a/clang/include/clang/Driver/Options.td
+++ b/clang/include/clang/Driver/Options.td
@@ -6771,10 +6771,6 @@ def flang_deprecated_no_hlfir : Flag<["-"], "flang-deprecated-no-hlfir">,
   Flags<[HelpHidden]>, Visibility<[FlangOption, FC1Option]>,
   HelpText<"Do not use HLFIR lowering (deprecated)">;
 
-def flang_experimental_integer_overflow : Flag<["-"], "flang-experimental-integer-overflow">,
-  Flags<[HelpHidden]>, Visibility<[FlangOption, FC1Option]>,
-  HelpText<"Add nsw flag to internal operations such as do-variable increment (experimental)">;
-
 //===----------------------------------------------------------------------===//
 // FLangOption + CoreOption + NoXarchOption
 //===----------------------------------------------------------------------===//
diff --git a/clang/lib/Driver/ToolChains/Flang.cpp b/clang/lib/Driver/ToolChains/Flang.cpp
index a9d2b7a4dc48f9..366cadc2e54775 100644
--- a/clang/lib/Driver/ToolChains/Flang.cpp
+++ b/clang/lib/Driver/ToolChains/Flang.cpp
@@ -148,7 +148,6 @@ void Flang::addCodegenOptions(const ArgList &Args,
 
   Args.addAllArgs(CmdArgs, {options::OPT_flang_experimental_hlfir,
                             options::OPT_flang_deprecated_no_hlfir,
-                            options::OPT_flang_experimental_integer_overflow,
                             options::OPT_fno_ppc_native_vec_elem_order,
                             options::OPT_fppc_native_vec_elem_order});
 }
diff --git a/flang/include/flang/Lower/LoweringOptions.def b/flang/include/flang/Lower/LoweringOptions.def
index 231de533fbd30a..0b22e54b648e94 100644
--- a/flang/include/flang/Lower/LoweringOptions.def
+++ b/flang/include/flang/Lower/LoweringOptions.def
@@ -38,10 +38,5 @@ ENUM_LOWERINGOPT(Underscoring, unsigned, 1, 1)
 /// (i.e. wraps around as two's complement). Off by default.
 ENUM_LOWERINGOPT(IntegerWrapAround, unsigned, 1, 0)
 
-/// If true, add nsw flags to loop variable increments.
-/// Off by default.
-/// TODO: integrate this option with the above
-ENUM_LOWERINGOPT(NSWOnLoopVarInc, unsigned, 1, 0)
-
 #undef LOWERINGOPT
 #undef ENUM_LOWERINGOPT
diff --git a/flang/include/flang/Optimizer/Transforms/Passes.h b/flang/include/flang/Optimizer/Transforms/Passes.h
index 3b2af3a3398108..e4dcb0060c778a 100644
--- a/flang/include/flang/Optimizer/Transforms/Passes.h
+++ b/flang/include/flang/Optimizer/Transforms/Passes.h
@@ -65,7 +65,7 @@ namespace fir {
 std::unique_ptr<mlir::Pass> createAffineDemotionPass();
 std::unique_ptr<mlir::Pass>
 createArrayValueCopyPass(fir::ArrayValueCopyOptions options = {});
-std::unique_ptr<mlir::Pass> createCFGConversionPassWithNSW();
+std::unique_ptr<mlir::Pass> createCFGConversionPassWithoutNSW();
 std::unique_ptr<mlir::Pass> createMemDataFlowOptPass();
 std::unique_ptr<mlir::Pass> createPromoteToAffinePass();
 std::unique_ptr<mlir::Pass>
@@ -82,7 +82,7 @@ createVScaleAttrPass(std::pair<unsigned, unsigned> vscaleAttr);
 
 void populateCfgConversionRewrites(mlir::RewritePatternSet &patterns,
                                    bool forceLoopToExecuteOnce = false,
-                                   bool setNSW = false);
+                                   bool setNSW = true);
 
 // declarative passes
 #define GEN_PASS_REGISTRATION
diff --git a/flang/include/flang/Optimizer/Transforms/Passes.td b/flang/include/flang/Optimizer/Transforms/Passes.td
index af6bd41cbb71da..09eed385c2c931 100644
--- a/flang/include/flang/Optimizer/Transforms/Passes.td
+++ b/flang/include/flang/Optimizer/Transforms/Passes.td
@@ -153,7 +153,7 @@ def CFGConversion : Pass<"cfg-conversion"> {
            /*default=*/"false",
            "force the body of a loop to execute at least once">,
     Option<"setNSW", "set-nsw", "bool",
-           /*default=*/"false",
+           /*default=*/"true",
            "set nsw on loop variable increment">
   ];
 }
diff --git a/flang/include/flang/Tools/CrossToolHelpers.h b/flang/include/flang/Tools/CrossToolHelpers.h
index df4b21ada058fe..1626970600eff2 100644
--- a/flang/include/flang/Tools/CrossToolHelpers.h
+++ b/flang/include/flang/Tools/CrossToolHelpers.h
@@ -122,7 +122,7 @@ struct MLIRToLLVMPassPipelineConfig : public FlangEPCallBacks {
   bool NoSignedZerosFPMath =
       false; ///< Set no-signed-zeros-fp-math attribute for functions.
   bool UnsafeFPMath = false; ///< Set unsafe-fp-math attribute for functions.
-  bool NSWOnLoopVarInc = false; ///< Add nsw flag to loop variable increments.
+  bool NSWOnLoopVarInc = true; ///< Add nsw flag to loop variable increments.
 };
 
 struct OffloadModuleOpts {
diff --git a/flang/lib/Frontend/CompilerInvocation.cpp b/flang/lib/Frontend/CompilerInvocation.cpp
index 94d3d115417877..5da5236af2b0e1 100644
--- a/flang/lib/Frontend/CompilerInvocation.cpp
+++ b/flang/lib/Frontend/CompilerInvocation.cpp
@@ -1350,12 +1350,6 @@ bool CompilerInvocation::createFromArgs(
     invoc.loweringOpts.setNoPPCNativeVecElemOrder(true);
   }
 
-  // -flang-experimental-integer-overflow
-  if (args.hasArg(
-          clang::driver::options::OPT_flang_experimental_integer_overflow)) {
-    invoc.loweringOpts.setNSWOnLoopVarInc(true);
-  }
-
   // Preserve all the remark options requested, i.e. -Rpass, -Rpass-missed or
   // -Rpass-analysis. This will be used later when processing and outputting the
   // remarks generated by LLVM in ExecuteCompilerInvocation.cpp.
diff --git a/flang/lib/Frontend/FrontendActions.cpp b/flang/lib/Frontend/FrontendActions.cpp
index f2e460fc53a67f..3ea242315484fd 100644
--- a/flang/lib/Frontend/FrontendActions.cpp
+++ b/flang/lib/Frontend/FrontendActions.cpp
@@ -828,8 +828,8 @@ void CodeGenAction::generateLLVMIR() {
     config.VScaleMax = vsr->second;
   }
 
-  if (ci.getInvocation().getLoweringOpts().getNSWOnLoopVarInc())
-    config.NSWOnLoopVarInc = true;
+  if (ci.getInvocation().getLoweringOpts().getIntegerWrapAround())
+    config.NSWOnLoopVarInc = false;
 
   // Create the pass pipeline
   fir::createMLIRToLLVMPassPipeline(pm, config, getCurrentFile());
diff --git a/flang/lib/Lower/Bridge.cpp b/flang/lib/Lower/Bridge.cpp
index 877fe122265dd0..a3bd1ace11da21 100644
--- a/flang/lib/Lower/Bridge.cpp
+++ b/flang/lib/Lower/Bridge.cpp
@@ -2271,7 +2271,7 @@ class FirConverter : public Fortran::lower::AbstractConverter {
     assert(!incrementLoopNestInfo.empty() && "empty loop nest");
     mlir::Location loc = toLocation();
     mlir::arith::IntegerOverflowFlags flags{};
-    if (getLoweringOptions().getNSWOnLoopVarInc())
+    if (!getLoweringOptions().getIntegerWrapAround())
       flags = bitEnumSet(flags, mlir::arith::IntegerOverflowFlags::nsw);
     auto iofAttr = mlir::arith::IntegerOverflowFlagsAttr::get(
         builder->getContext(), flags);
diff --git a/flang/lib/Lower/IO.cpp b/flang/lib/Lower/IO.cpp
index 1894b0cfd1bec2..b534c81a605a90 100644
--- a/flang/lib/Lower/IO.cpp
+++ b/flang/lib/Lower/IO.cpp
@@ -929,7 +929,7 @@ static void genIoLoop(Fortran::lower::AbstractConverter &converter,
   fir::FirOpBuilder &builder = converter.getFirOpBuilder();
   mlir::Location loc = converter.getCurrentLocation();
   mlir::arith::IntegerOverflowFlags flags{};
-  if (converter.getLoweringOptions().getNSWOnLoopVarInc())
+  if (!converter.getLoweringOptions().getIntegerWrapAround())
     flags = bitEnumSet(flags, mlir::arith::IntegerOverflowFlags::nsw);
   auto iofAttr =
       mlir::arith::IntegerOverflowFlagsAttr::get(builder.getContext(), flags);
diff --git a/flang/lib/Optimizer/Passes/Pipelines.cpp b/flang/lib/Optimizer/Passes/Pipelines.cpp
index 3fa5c54403bd8c..fc13d77469f17b 100644
--- a/flang/lib/Optimizer/Passes/Pipelines.cpp
+++ b/flang/lib/Optimizer/Passes/Pipelines.cpp
@@ -35,11 +35,11 @@ void addCanonicalizerPassWithoutRegionSimplification(mlir::OpPassManager &pm) {
 void addCfgConversionPass(mlir::PassManager &pm,
                           const MLIRToLLVMPassPipelineConfig &config) {
   if (config.NSWOnLoopVarInc)
-    addNestedPassToAllTopLevelOperationsConditionally(
-        pm, disableCfgConversion, fir::createCFGConversionPassWithNSW);
-  else
     addNestedPassToAllTopLevelOperationsConditionally(pm, disableCfgConversion,
                                                       fir::createCFGConversion);
+  else
+    addNestedPassToAllTopLevelOperationsConditionally(
+        pm, disableCfgConversion, fir::createCFGConversionPassWithoutNSW);
 }
 
 void addAVC(mlir::PassManager &pm, const llvm::OptimizationLevel &optLevel) {
diff --git a/flang/lib/Optimizer/Transforms/ControlFlowConverter.cpp b/flang/lib/Optimizer/Transforms/ControlFlowConverter.cpp
index 3b79d6d311b71c..411bf7f364a602 100644
--- a/flang/lib/Optimizer/Transforms/ControlFlowConverter.cpp
+++ b/flang/lib/Optimizer/Transforms/ControlFlowConverter.cpp
@@ -332,8 +332,6 @@ class CfgConversion : public fir::impl::CFGConversionBase<CfgConversion> {
 public:
   using CFGConversionBase<CfgConversion>::CFGConversionBase;
 
-  CfgConversion(bool setNSW) { this->setNSW = setNSW; }
-
   void runOnOperation() override {
     auto *context = &this->getContext();
     mlir::RewritePatternSet patterns(context);
@@ -366,6 +364,8 @@ void fir::populateCfgConversionRewrites(mlir::RewritePatternSet &patterns,
       patterns.getContext(), forceLoopToExecuteOnce, setNSW);
 }
 
-std::unique_ptr<mlir::Pass> fir::createCFGConversionPassWithNSW() {
-  return std::make_unique<CfgConversion>(true);
+std::unique_ptr<mlir::Pass> fir::createCFGConversionPassWithoutNSW() {
+  fir::CFGConversionOptions options;
+  options.setNSW = false;
+  return fir::createCFGConversion(options);
 }
diff --git a/flang/test/Driver/frontend-forwarding.f90 b/flang/test/Driver/frontend-forwarding.f90
index ff2d6609521464..55a74ccf40467b 100644
--- a/flang/test/Driver/frontend-forwarding.f90
+++ b/flang/test/Driver/frontend-forwarding.f90
@@ -20,7 +20,6 @@
 ! RUN:     -fversion-loops-for-stride \
 ! RUN:     -flang-experimental-hlfir \
 ! RUN:     -flang-deprecated-no-hlfir \
-! RUN:     -flang-experimental-integer-overflow \
 ! RUN:     -fno-ppc-native-vector-element-order \
 ! RUN:     -fppc-native-vector-element-order \
 ! RUN:     -mllvm -print-before-all \
@@ -52,7 +51,6 @@
 ! CHECK: "-fversion-loops-for-stride"
 ! CHECK: "-flang-experimental-hlfir"
 ! CHECK: "-flang-deprecated-no-hlfir"
-! CHECK: "-flang-experimental-integer-overflow"
 ! CHECK: "-fno-ppc-native-vector-element-order"
 ! CHECK: "-fppc-native-vector-element-order"
 ! CHECK: "-Rpass"
diff --git a/flang/test/Fir/convert-to-llvm-openmp-and-fir.fir b/flang/test/Fir/convert-to-llvm-openmp-and-fir.fir
index 335877e7c9a872..00f8e6e6cc9a6b 100644
--- a/flang/test/Fir/convert-to-llvm-openmp-and-fir.fir
+++ b/flang/test/Fir/convert-to-llvm-openmp-and-fir.fir
@@ -358,10 +358,10 @@ func.func @_QPopenmp_target_data_region() {
       %9 = arith.subi %8, %c1_i64 : i64
       %10 = fir.coordinate_of %0, %9 : (!fir.ref<!fir.array<1024xi32>>, i64) -> !fir.ref<i32>
       fir.store %6 to %10 : !fir.ref<i32>
-      %11 = arith.addi %arg0, %c1 : index
+      %11 = arith.addi %arg0, %c1 overflow<nsw> : index
       %12 = fir.convert %c1 : (index) -> i32
       %13 = fir.load %1 : !fir.ref<i32>
-      %14 = arith.addi %13, %12 : i32
+      %14 = arith.addi %13, %12 overflow<nsw> : i32
       fir.result %11, %14 : index, i32
     }
     fir.store %5#1 to %1 : !fir.ref<i32>
@@ -404,11 +404,11 @@ func.func @_QPopenmp_target_data_region() {
 // CHECK:             %[[VAL_21:.*]] = llvm.sub %[[VAL_19]], %[[VAL_20]]  : i64
 // CHECK:             %[[VAL_22:.*]] = llvm.getelementptr %[[VAL_1]][0, %[[VAL_21]]] : (!llvm.ptr, i64) -> !llvm.ptr
 // CHECK:             llvm.store %[[VAL_17]], %[[VAL_22]] : i32, !llvm.ptr
-// CHECK:             %[[VAL_23:.*]] = llvm.add %[[VAL_12]], %[[VAL_8]]  : i64
+// CHECK:             %[[VAL_23:.*]] = llvm.add %[[VAL_12]], %[[VAL_8]] overflow<nsw> : i64
 // CHECK:             %[[VAL_24:.*]] = llvm.trunc %[[VAL_8]] : i64 to i32
 // CHECK:             %[[VAL_25:.*]] = llvm.load %[[VAL_3]] : !llvm.ptr -> i32
-// CHECK:             %[[VAL_26:.*]] = llvm.add %[[VAL_25]], %[[VAL_24]]  : i32
-// CHECK:             %[[VAL_27:.*]] = llvm.add %[[VAL_12]], %[[VAL_8]]  : i64
+// CHECK:             %[[VAL_26:.*]] = llvm.add %[[VAL_25]], %[[VAL_24]] overflow<nsw> : i32
+// CHECK:             %[[VAL_27:.*]] = llvm.add %[[VAL_12]], %[[VAL_8]] overflow<nsw> : i64
 // CHECK:             %[[VAL_28:.*]] = llvm.mlir.constant(1 : index) : i64
 // CHECK:             %[[VAL_29:.*]] = llvm.sub %[[VAL_14]], %[[VAL_28]]  : i64
 // CHECK:             llvm.br ^bb1(%[[VAL_27]], %[[VAL_26]], %[[VAL_29]] : i64, i32, i64)
diff --git a/flang/test/Fir/loop01.fir b/flang/test/Fir/loop01.fir
index c1cbb522c378c0..30d10b9bbdb979 100644
--- a/flang/test/Fir/loop01.fir
+++ b/flang/test/Fir/loop01.fir
@@ -1,5 +1,7 @@
 // RUN: fir-opt --split-input-file --cfg-conversion %s | FileCheck %s
-// RUN: fir-opt --split-input-file --cfg-conversion="set-nsw=true" %s | FileCheck %s --check-prefix=NSW
+// RUN: fir-opt --split-input-file --cfg-conversion="set-nsw=false" %s | FileCheck %s --check-prefix=NO-NSW
+
+// NO-NSW-NOT: overflow<nsw>
 
 func.func @x(%lb : index, %ub : index, %step : index, %b : i1, %addr : !fir.ref<index>) {
   fir.do_loop %iv = %lb to %ub step %step unordered {
@@ -35,7 +37,7 @@ func.func private @f2() -> i1
 // CHECK:       fir.store %[[VAL_12]] to %[[VAL_4]] : !fir.ref<index>
 // CHECK:       br ^bb5
 // CHECK:     ^bb5:
-// CHECK:       %[[VAL_13:.*]] = arith.addi %[[VAL_8]], %[[VAL_2]] : index
+// CHECK:       %[[VAL_13:.*]] = arith.addi %[[VAL_8]], %[[VAL_2]] overflow<nsw> : index
 // CHECK:       %[[VAL_14:.*]] = arith.constant 1 : index
 // CHECK:       %[[VAL_15:.*]] = arith.subi %[[VAL_9]], %[[VAL_14]] : index
 // CHECK:       br ^bb1(%[[VAL_13]], %[[VAL_15]] : index, index)
@@ -44,34 +46,6 @@ func.func private @f2() -> i1
 // CHECK:     }
 // CHECK:     func private @f2() -> i1
 
-// NSW:     func @x(%[[VAL_0:.*]]: index, %[[VAL_1:.*]]: index, %[[VAL_2:.*]]: index, %[[VAL_3:.*]]: i1, %[[VAL_4:.*]]: !fir.ref<index>) {
-// NSW:       %[[VAL_5:.*]] = arith.subi %[[VAL_1]], %[[VAL_0]] : index
-// NSW:       %[[VAL_6:.*]] = arith.addi %[[VAL_5]], %[[VAL_2]] : index
-// NSW:       %[[VAL_7:.*]] = arith.divsi %[[VAL_6]], %[[VAL_2]] : index
-// NSW:       br ^bb1(%[[VAL_0]], %[[VAL_7]] : index, index)
-// NSW:     ^bb1(%[[VAL_8:.*]]: index, %[[VAL_9:.*]]: index):
-// NSW:       %[[VAL_10:.*]] = arith.constant 0 : index
-// NSW:       %[[VAL_11:.*]] = arith.cmpi sgt, %[[VAL_9]], %[[VAL_10]] : index
-// NSW:       cond_br %[[VAL_11]], ^bb2, ^bb6
-// NSW:     ^bb2:
-// NSW:       cond_br %[[VAL_3]], ^bb3, ^bb4
-// NSW:     ^bb3:
-// NSW:       fir.store %[[VAL_8]] to %[[VAL_4]] : !fir.ref<index>
-// NSW:       br ^bb5
-// NSW:     ^bb4:
-// NSW:       %[[VAL_12:.*]] = arith.constant 0 : index
-// NSW:       fir.store %[[VAL_12]] to %[[VAL_4]] : !fir.ref<index>
-// NSW:       br ^bb5
-// NSW:     ^bb5:
-// NSW:       %[[VAL_13:.*]] = arith.addi %[[VAL_8]], %[[VAL_2]] overflow<nsw> : index
-// NSW:       %[[VAL_14:.*]] = arith.constant 1 : index
-// NSW:       %[[VAL_15:.*]] = arith.subi %[[VAL_9]], %[[VAL_14]] : index
-// NSW:       br ^bb1(%[[VAL_13]], %[[VAL_15]] : index, index)
-// NSW:     ^bb6:
-// NSW:       return
-// NSW:     }
-// NSW:     func private @f2() -> i1
-
 // -----
 
 func.func @x2(%lo : index, %up : index, %ok : i1) {
@@ -101,36 +75,13 @@ func.func private @f3(i16)
 // CHECK:     cond_br %[[VAL_14]], ^bb2, ^bb3
 // CHECK:   ^bb2:
 // CHECK:     %[[VAL_15:.*]] = fir.call @f2() : () -> i1
-// CHECK:     %[[VAL_16:.*]] = arith.addi %[[VAL_4]], %[[VAL_3]] : index
+// CHECK:     %[[VAL_16:.*]] = arith.addi %[[VAL_4]], %[[VAL_3]] overflow<nsw> : index
 // CHECK:     br ^bb1(%[[VAL_16]], %[[VAL_15]] : index, i1)
 // CHECK:   ^bb3:
 // CHECK:     return
 // CHECK:   }
 // CHECK:   func private @f3(i16)
 
-// NSW:   func @x2(%[[VAL_0:.*]]: index, %[[VAL_1:.*]]: index, %[[VAL_2:.*]]: i1) {
-// NSW:     %[[VAL_3:.*]] = arith.constant 1 : index
-// NSW:     br ^bb1(%[[VAL_0]], %[[VAL_2]] : index, i1)
-// NSW:   ^bb1(%[[VAL_4:.*]]: index, %[[VAL_5:.*]]: i1):
-// NSW:     %[[VAL_6:.*]] = arith.constant 0 : index
-// NSW:     %[[VAL_7:.*]] = arith.cmpi slt, %[[VAL_6]], %[[VAL_3]] : index
-// NSW:     %[[VAL_8:.*]] = arith.cmpi sle, %[[VAL_4]], %[[VAL_1]] : index
-// NSW:     %[[VAL_9:.*]] = arith.cmpi slt, %[[VAL_3]], %[[VAL_6]] : index
-// NSW:     %[[VAL_10:.*]] = arith.cmpi sle, %[[VAL_1]], %[[VAL_4]] : index
-// NSW:     %[[VAL_11:.*]] = arith.andi %[[VAL_7]], %[[VAL_8]] : i1
-// NSW:     %[[VAL_12:.*]] = arith.andi %[[VAL_9]], %[[VAL_10]] : i1
-// NSW:     %[[VAL_13:.*]] = arith.ori %[[VAL_11]], %[[VAL_12]] : i1
-// NSW:     %[[VAL_14:.*]] = arith.andi %[[VAL_5]], %[[VAL_13]] : i1
-// NSW:     cond_br %[[VAL_14]], ^bb2, ^bb3
-// NSW:   ^bb2:
-// NSW:     %[[VAL_15:.*]] = fir.call @f2() : () -> i1
-// NSW:     %[[VAL_16:.*]] = arith.addi %[[VAL_4]], %[[VAL_3]] overflow<nsw> : index
-// NSW:     br ^bb1(%[[VAL_16]], %[[VAL_15]] : index, i1)
-// NSW:   ^bb3:
-// NSW:     return
-// NSW:   }
-// NSW:   func private @f3(i16)
-
 // -----
 
 // do_loop with an extra loop-carried value
@@ -159,7 +110,7 @@ func.func @x3(%lo : index, %up : index) -> i1 {
 // CHECK:           cond_br %[[VAL_11]], ^bb2, ^bb3
 // CHECK:         ^bb2:
 // CHECK:           %[[VAL_12:.*]] = fir.call @f2() : () -> i1
-// CHECK:           %[[VAL_13:.*]] = arith.addi %[[VAL_7]], %[[VAL_2]] : index
+// CHECK:           %[[VAL_13:.*]] = arith.addi %[[VAL_7]], %[[VAL_2]] overflow<nsw> : index
 // CHECK:           %[[VAL_14:.*]] = arith.constant 1 : index
 // CHECK:           %[[VAL_15:.*]] = arith.subi %[[VAL_9]], %[[VAL_14]] : index
 // CHECK:           br ^bb1(%[[VAL_13]], %[[VAL_12]], %[[VAL_15]] : index, i1, index)
@@ -167,29 +118,6 @@ func.func @x3(%lo : index, %up : index) -> i1 {
 // CHECK:           return %[[VAL_8]] : i1
 // CHECK:         }
 
-// NSW-LABEL:   func @x3(
-// NSW-SAME:             %[[VAL_0:.*]]: index,
-// NSW-SAME:             %[[VAL_1:.*]]: index) -> i1 {
-// NSW:           %[[VAL_2:.*]] = arith.constant 1 : index
-// NSW:           %[[VAL_3:.*]] = arith.constant true
-// NSW:           %[[VAL_4:.*]] = arith.subi %[[VAL_1]], %[[VAL_0]] : index
-// NSW:           %[[VAL_5:.*]] = arith.addi %[[VAL_4]], %[[VAL_2]] : index
-// NSW:           %[[VAL_6:.*]] = arith.divsi %[[VAL_5]], %[[VAL_2]] : index
-// NSW:           br ^bb1(%[[VAL_0]], %[[VAL_3]], %[[VAL_6]] : index, i1, index)
-// NSW:         ^bb1(%[[VAL_7:.*]]: index, %[[VAL_8:.*]]: i1, %[[VAL_9:.*]]: index):
-// NSW:           %[[VAL_10:.*]] = arith.constant 0 : index
-// NSW:           %[[VAL_11:.*]] = arith.cmpi sgt, %[[VAL_9]], %[[VAL_10]] : index
-// NSW:           cond_br %[[VAL_11]], ^bb2, ^bb3
-// NSW:         ^bb2:
-// NSW:           %[[VAL_12:.*]] = fir.call @f2() : () -> i1
-// NSW:           %[[VAL_13:.*]] = arith.addi %[[VAL_7]], %[[VAL_2]] overflow<nsw> : index
-// NSW:           %[[VAL_14:.*]] = arith.constant 1 : index
-// NSW:           %[[VAL_15:.*]] = arith.subi %[[VAL_9]], %[[VAL_14]] : index
-// NSW:           br ^bb1(%[[VAL_13]], %[[VAL_12]], %[[VAL_15]] : index, i1, index)
-// NSW:         ^bb3:
-// NSW:           return %[[VAL_8]] : i1
-// NSW:         }
-
 // -----
 
 // iterate_while with an extra loop-carried value
@@ -227,7 +155,7 @@ func.func private @f4(i32) -> i1
 // CHECK:           cond_br %[[VAL_16]], ^bb2, ^bb3
 // CHECK:         ^bb2:
 // CHECK:           %[[VAL_17:.*]] = fir.call @f2() : () -> i1
-// CHECK:           %[[VAL_18:.*]] = arith.addi %[[VAL_5]], %[[VAL_2]] : index
+// CHECK:           %[[VAL_18:.*]] = arith.addi %[[VAL_5]], %[[VAL_2]] overflow<nsw> : index
 // CHECK:           br ^bb1(%[[VAL_18]], %[[VAL_6]], %[[VAL_17]] : index, i1, i1)
 // CHECK:         ^bb3:
 // CHECK:           %[[VAL_19:.*]] = arith.andi %[[VAL_6]], %[[VAL_7]] : i1
@@ -235,34 +163,6 @@ func.func private @f4(i32) -> i1
 // CHECK:         }
 // CHECK:         func private @f4(i32) -> i1
 
-// NSW-LABEL:   func @y3(
-// NSW-SAME:             %[[VAL_0:.*]]: index,
-// NSW-SAME:             %[[VAL_1:.*]]: index) -> i1 {
-// NSW:           %[[VAL_2:.*]] = arith.constant 1 : index
-// NSW:           %[[VAL_3:.*]] = arith.constant true
-// NSW:           %[[VAL_4:.*]] = fir.call @f2() : () -> i1
-// NSW:           br ^bb1(%[[VAL_0]], %[[VAL_3]], %[[VAL_4]] : index, i1, i1)
-// NSW:         ^bb1(%[[VAL_5:.*]]: index, %[[VAL_6:.*]]: i...
[truncated]

tblah

Thanks! LGTM

tblah · 2024-10-28T11:52:45Z

@yus3710-fj after this patch we get a 7-10% slowdown for 503.bwaves_r from SPEC2017. I built with -Ofast -mcpu=native -fuse-ld=lld -flto

As I understood it, the previous patch was supposed to enable your transformation by default, and this one was only supposed to clean up old flags. Did I understand wrong?

Anyway please could you revert this patch and/or fix the regression.

yus3710-fj · 2024-10-28T12:45:18Z

Thank you for the information, @tblah. I'll revert this.

As I understood it, the previous patch was supposed to enable your transformation by default, and this one was only supposed to clean up old flags. Did I understand wrong?

This patch also enables some transformation by default. They would affect the performance. I apologize that I didn't check performance regression.

…flow into -fno-wrapv" (#113901) Reverts #110063 due to the performance regression on 503.bwaves_r in SPEC2017.

yus3710-fj · 2024-10-29T09:09:05Z

@yus3710-fj after this patch we get a 7-10% slowdown for 503.bwaves_r from SPEC2017. I built with -Ofast -mcpu=native -fuse-ld=lld -flto

@tblah I'm attempting to reproduce the performance issue on my Grace machine, but have not yet been successful. Could you please provide the measurement conditions?

tblah · 2024-10-29T13:29:30Z

@yus3710-fj after this patch we get a 7-10% slowdown for 503.bwaves_r from SPEC2017. I built with -Ofast -mcpu=native -fuse-ld=lld -flto

@tblah I'm attempting to reproduce the performance issue on my Grace machine, but have not yet been successful. Could you please provide the measurement conditions?

Thanks for having a look. I tried on a Grace machine and found a smaller regression of ~2.7%. For this measurement I compared bd6ab32 and the commit immediately before. The flags were as I described above. I am comparing the "base rate" numbers reported by spec.

The previous mesaurement (with the larger regression) was on an aws graviton 3 machine.

yus3710-fj · 2024-11-01T02:36:53Z

I confirmed the performance regression on both Graviton3 and Grace when the value of copies is set to 1.

The result of the measurement on Graviton3:

	max	min	avg
before this patch	52.56	52.44	52.518
after this patch	48.90 (-7.48%)	48.79 (-7.48%)	48.86 (-7.49%)

The difference in optimization messages (-Rpass=.*):

zext is used instead of sext after this patch.
LICM is promoted by this patch.

I'm investigating why the performance gets worse even though the optimization is promoted.

…o -fno-wrapv (llvm#110063) nsw is now added to do-variable increment when -fno-wrapv is enabled as GFortran seems to do. That means the option introduced by llvm#91579 isn't necessary any more. Note that the feature of -flang-experimental-integer-overflow is enabled by default.

…flow into -fno-wrapv" (llvm#113901) Reverts llvm#110063 due to the performance regression on 503.bwaves_r in SPEC2017.

tblah · 2024-12-05T11:19:17Z

Hi @yus3710-fj, thanks for investigating the cause of the regression.

There was a recent patch which improved bwaves performance: #111853. It seems that since this patch, I no longer see a regression with the integer overflow enabled so I would be happy to re-land your patch. Please feel free to open a new PR.

I think #117318 is still a real issue, just we happen not to run out of registers anymore now that there is better constant propagation.

I tested this on an AWS graviton 3 (neoverse-v1) machine.

yus3710-fj · 2024-12-06T01:14:59Z

Thank you for the informantion, @tblah. I also confirmed that the regression doesn't happen after fbd89bc.
I'm a little concerned that the regression still occurs when LTO is diabled. However, I'd like to reland this patch because this isn't the root cause of the issue.

…flow into -fno-wrapv" (#118933) This relands #110063. The performance issue on 503.bwaves_r is found not to be related to the patch, and is resolved by fbd89bc when LTO is enabled.

vzakhari · 2024-12-17T19:10:34Z

Hello! Did anyone notice a performance regression after this PR on exchange2? I see it on both EPYC 9684X and grace. The slowdown is more than 1.5x.

I verified that performance restores when adding -fwrapv with the latest flang compiler. The compilation options are -Ofast -march=native. I will look through LLVM dumps, but please let me know if it is a known problem.

tblah · 2024-12-18T09:33:09Z

Hello! Did anyone notice a performance regression after this PR on exchange2? I see it on both EPYC 9684X and grace. The slowdown is more than 1.5x.

I verified that performance restores when adding -fwrapv with the latest flang compiler. The compilation options are -Ofast -march=native. I will look through LLVM dumps, but please let me know if it is a known problem.

My team doesn't closely monitor V2 performance. Another team looked into it. I'm told that initially there was a slowdown because function specialization was not performed on the last iteration of digits. Then that fixed after #118219 so we are happy on our end.

However if you aren't using LTO then you won't be seeing the function specialization anyway. I'm told that the unspecialized version of digits was worse after this patch because of extra spills in the hot part of the code. To me that sounds similar to the bwaves regression the first time this patch landed. More info here: #117318

vzakhari · 2024-12-19T17:36:33Z

Thank you for the pointers and the information, Tom! It looks like exchange2 is pretty much the same problem as with bwaves. I posted a note into #117318

yus3710-fj force-pushed the remove-experimental-integer-overflow branch from c14087b to 934a116 Compare October 21, 2024 10:18

yus3710-fj changed the title ~~[flang] integrate the option -flang-experimental-integer-overflow into -fno-wrapv~~ [flang] Integrate the option -flang-experimental-integer-overflow into -fno-wrapv Oct 21, 2024

yus3710-fj marked this pull request as ready for review October 21, 2024 10:22

llvmbot added clang Clang issues not falling into any other category clang:driver 'clang' and 'clang++' user-facing binaries. Not 'clang-cl' flang:driver flang Flang issues not falling into any other category flang:fir-hlfir flang:openmp labels Oct 21, 2024

yus3710-fj requested review from tblah, jeanPerier and vzakhari October 21, 2024 10:26

tblah approved these changes Oct 21, 2024

View reviewed changes

yus3710-fj merged commit 96bb375 into llvm:main Oct 25, 2024
16 checks passed

yus3710-fj deleted the remove-experimental-integer-overflow branch October 25, 2024 06:20

frobtech mentioned this pull request Oct 25, 2024

p/libc fexcept t #113664

Closed

yus3710-fj mentioned this pull request Oct 28, 2024

Revert "[flang] Integrate the option -flang-experimental-integer-overflow into -fno-wrapv" #113901

Merged

tblah pushed a commit that referenced this pull request Oct 28, 2024

Revert "[flang] Integrate the option -flang-experimental-integer-over…

bd6ab32

…flow into -fno-wrapv" (#113901) Reverts #110063 due to the performance regression on 503.bwaves_r in SPEC2017.

yus3710-fj mentioned this pull request Nov 22, 2024

[LSR][SCEV] Increases register spilling in 503.bwaves_r (SPEC CPU 2017) #117318

Open

yus3710-fj restored the remove-experimental-integer-overflow branch December 6, 2024 01:20

yus3710-fj mentioned this pull request Dec 6, 2024

Reland "[flang] Integrate the option -flang-experimental-integer-overflow into -fno-wrapv" #118933

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[flang] Integrate the option -flang-experimental-integer-overflow into -fno-wrapv #110063

[flang] Integrate the option -flang-experimental-integer-overflow into -fno-wrapv #110063

yus3710-fj commented Sep 26, 2024 •

edited

Loading

llvmbot commented Oct 21, 2024 •

edited

Loading

llvmbot commented Oct 21, 2024

tblah left a comment

tblah commented Oct 28, 2024

yus3710-fj commented Oct 28, 2024

yus3710-fj commented Oct 29, 2024

tblah commented Oct 29, 2024

yus3710-fj commented Nov 1, 2024

tblah commented Dec 5, 2024

yus3710-fj commented Dec 6, 2024

vzakhari commented Dec 17, 2024

tblah commented Dec 18, 2024

vzakhari commented Dec 19, 2024

[flang] Integrate the option -flang-experimental-integer-overflow into -fno-wrapv #110063

[flang] Integrate the option -flang-experimental-integer-overflow into -fno-wrapv #110063

Conversation

yus3710-fj commented Sep 26, 2024 • edited Loading

llvmbot commented Oct 21, 2024 • edited Loading

llvmbot commented Oct 21, 2024

tblah left a comment

Choose a reason for hiding this comment

tblah commented Oct 28, 2024

yus3710-fj commented Oct 28, 2024

yus3710-fj commented Oct 29, 2024

tblah commented Oct 29, 2024

yus3710-fj commented Nov 1, 2024

tblah commented Dec 5, 2024

yus3710-fj commented Dec 6, 2024

vzakhari commented Dec 17, 2024

tblah commented Dec 18, 2024

vzakhari commented Dec 19, 2024

yus3710-fj commented Sep 26, 2024 •

edited

Loading

llvmbot commented Oct 21, 2024 •

edited

Loading