Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[flang] Integrate the option -flang-experimental-integer-overflow into -fno-wrapv #110063

Merged

Conversation

yus3710-fj
Copy link
Contributor

@yus3710-fj yus3710-fj commented Sep 26, 2024

nsw is now added to do-variable increment when -fno-wrapv is enabled as GFortran seems to do.
That means the option introduced by #91579 isn't necessary any more.

Note that the feature of -flang-experimental-integer-overflow is enabled by default.

…o -fno-wrapv

nsw is added to do-variable increment when -fno-wrapv is enabled
as GFortran seems to do.
Therefore, the option introduced by PR91579 isn't necessary any
more.

Note that the feature of -flang-experimental-integer-overflow
is now enabled by default.
@yus3710-fj yus3710-fj force-pushed the remove-experimental-integer-overflow branch from c14087b to 934a116 Compare October 21, 2024 10:18
@yus3710-fj yus3710-fj changed the title [flang] integrate the option -flang-experimental-integer-overflow into -fno-wrapv [flang] Integrate the option -flang-experimental-integer-overflow into -fno-wrapv Oct 21, 2024
@yus3710-fj yus3710-fj marked this pull request as ready for review October 21, 2024 10:22
@llvmbot llvmbot added clang Clang issues not falling into any other category clang:driver 'clang' and 'clang++' user-facing binaries. Not 'clang-cl' flang:driver flang Flang issues not falling into any other category flang:fir-hlfir flang:openmp labels Oct 21, 2024
@llvmbot
Copy link
Member

llvmbot commented Oct 21, 2024

@llvm/pr-subscribers-flang-driver
@llvm/pr-subscribers-flang-fir-hlfir

@llvm/pr-subscribers-clang

Author: Yusuke MINATO (yus3710-fj)

Changes

nsw is added to do-variable increment when -fno-wrapv is enabled as GFortran seems to do.
Therefore, the option introduced by #91579 isn't necessary any more.

Note that the feature of -flang-experimental-integer-overflow is now enabled by default.


Patch is 101.96 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/110063.diff

40 Files Affected:

  • (modified) clang/include/clang/Driver/Options.td (-4)
  • (modified) clang/lib/Driver/ToolChains/Flang.cpp (-1)
  • (modified) flang/include/flang/Lower/LoweringOptions.def (-5)
  • (modified) flang/include/flang/Optimizer/Transforms/Passes.h (+2-2)
  • (modified) flang/include/flang/Optimizer/Transforms/Passes.td (+1-1)
  • (modified) flang/include/flang/Tools/CrossToolHelpers.h (+1-1)
  • (modified) flang/lib/Frontend/CompilerInvocation.cpp (-6)
  • (modified) flang/lib/Frontend/FrontendActions.cpp (+2-2)
  • (modified) flang/lib/Lower/Bridge.cpp (+1-1)
  • (modified) flang/lib/Lower/IO.cpp (+1-1)
  • (modified) flang/lib/Optimizer/Passes/Pipelines.cpp (+3-3)
  • (modified) flang/lib/Optimizer/Transforms/ControlFlowConverter.cpp (+4-4)
  • (modified) flang/test/Driver/frontend-forwarding.f90 (-2)
  • (modified) flang/test/Fir/convert-to-llvm-openmp-and-fir.fir (+5-5)
  • (modified) flang/test/Fir/loop01.fir (+11-219)
  • (modified) flang/test/Fir/loop02.fir (+2-2)
  • (modified) flang/test/Lower/HLFIR/goto-do-body.f90 (+2-2)
  • (modified) flang/test/Lower/OpenMP/parallel-private-clause-fixes.f90 (+2-2)
  • (modified) flang/test/Lower/OpenMP/wsloop-reduction-allocatable-array-minmax.f90 (+2-2)
  • (modified) flang/test/Lower/OpenMP/wsloop-variable.f90 (+2-2)
  • (modified) flang/test/Lower/array-character.f90 (+1-1)
  • (modified) flang/test/Lower/array-derived-assignments.f90 (+1-1)
  • (modified) flang/test/Lower/array-derived.f90 (+1-1)
  • (modified) flang/test/Lower/array-elemental-calls-char-byval.f90 (+1-1)
  • (modified) flang/test/Lower/array-elemental-calls-char.f90 (+1-1)
  • (modified) flang/test/Lower/array-expression-assumed-size.f90 (+1-1)
  • (modified) flang/test/Lower/array-expression-slice-1.f90 (+1-1)
  • (modified) flang/test/Lower/array-substring.f90 (+5-41)
  • (modified) flang/test/Lower/array-temp.f90 (+1-1)
  • (modified) flang/test/Lower/components.f90 (+1-1)
  • (modified) flang/test/Lower/do_loop.f90 (+17-56)
  • (modified) flang/test/Lower/do_loop_unstructured.f90 (+11-192)
  • (modified) flang/test/Lower/goto-do-body.f90 (+2-2)
  • (modified) flang/test/Lower/host-associated.f90 (+1-1)
  • (modified) flang/test/Lower/infinite_loop.f90 (+5-36)
  • (modified) flang/test/Lower/io-implied-do-fixes.f90 (+6-43)
  • (modified) flang/test/Lower/loops2.f90 (+1-1)
  • (modified) flang/test/Lower/mixed_loops.f90 (+3-3)
  • (modified) flang/test/Lower/vector-subscript-io.f90 (+17-17)
  • (modified) flang/tools/bbc/bbc.cpp (+1-8)
diff --git a/clang/include/clang/Driver/Options.td b/clang/include/clang/Driver/Options.td
index 152c43d7908ff8..6ac1b3beac8a07 100644
--- a/clang/include/clang/Driver/Options.td
+++ b/clang/include/clang/Driver/Options.td
@@ -6771,10 +6771,6 @@ def flang_deprecated_no_hlfir : Flag<["-"], "flang-deprecated-no-hlfir">,
   Flags<[HelpHidden]>, Visibility<[FlangOption, FC1Option]>,
   HelpText<"Do not use HLFIR lowering (deprecated)">;
 
-def flang_experimental_integer_overflow : Flag<["-"], "flang-experimental-integer-overflow">,
-  Flags<[HelpHidden]>, Visibility<[FlangOption, FC1Option]>,
-  HelpText<"Add nsw flag to internal operations such as do-variable increment (experimental)">;
-
 //===----------------------------------------------------------------------===//
 // FLangOption + CoreOption + NoXarchOption
 //===----------------------------------------------------------------------===//
diff --git a/clang/lib/Driver/ToolChains/Flang.cpp b/clang/lib/Driver/ToolChains/Flang.cpp
index a9d2b7a4dc48f9..366cadc2e54775 100644
--- a/clang/lib/Driver/ToolChains/Flang.cpp
+++ b/clang/lib/Driver/ToolChains/Flang.cpp
@@ -148,7 +148,6 @@ void Flang::addCodegenOptions(const ArgList &Args,
 
   Args.addAllArgs(CmdArgs, {options::OPT_flang_experimental_hlfir,
                             options::OPT_flang_deprecated_no_hlfir,
-                            options::OPT_flang_experimental_integer_overflow,
                             options::OPT_fno_ppc_native_vec_elem_order,
                             options::OPT_fppc_native_vec_elem_order});
 }
diff --git a/flang/include/flang/Lower/LoweringOptions.def b/flang/include/flang/Lower/LoweringOptions.def
index 231de533fbd30a..0b22e54b648e94 100644
--- a/flang/include/flang/Lower/LoweringOptions.def
+++ b/flang/include/flang/Lower/LoweringOptions.def
@@ -38,10 +38,5 @@ ENUM_LOWERINGOPT(Underscoring, unsigned, 1, 1)
 /// (i.e. wraps around as two's complement). Off by default.
 ENUM_LOWERINGOPT(IntegerWrapAround, unsigned, 1, 0)
 
-/// If true, add nsw flags to loop variable increments.
-/// Off by default.
-/// TODO: integrate this option with the above
-ENUM_LOWERINGOPT(NSWOnLoopVarInc, unsigned, 1, 0)
-
 #undef LOWERINGOPT
 #undef ENUM_LOWERINGOPT
diff --git a/flang/include/flang/Optimizer/Transforms/Passes.h b/flang/include/flang/Optimizer/Transforms/Passes.h
index 3b2af3a3398108..e4dcb0060c778a 100644
--- a/flang/include/flang/Optimizer/Transforms/Passes.h
+++ b/flang/include/flang/Optimizer/Transforms/Passes.h
@@ -65,7 +65,7 @@ namespace fir {
 std::unique_ptr<mlir::Pass> createAffineDemotionPass();
 std::unique_ptr<mlir::Pass>
 createArrayValueCopyPass(fir::ArrayValueCopyOptions options = {});
-std::unique_ptr<mlir::Pass> createCFGConversionPassWithNSW();
+std::unique_ptr<mlir::Pass> createCFGConversionPassWithoutNSW();
 std::unique_ptr<mlir::Pass> createMemDataFlowOptPass();
 std::unique_ptr<mlir::Pass> createPromoteToAffinePass();
 std::unique_ptr<mlir::Pass>
@@ -82,7 +82,7 @@ createVScaleAttrPass(std::pair<unsigned, unsigned> vscaleAttr);
 
 void populateCfgConversionRewrites(mlir::RewritePatternSet &patterns,
                                    bool forceLoopToExecuteOnce = false,
-                                   bool setNSW = false);
+                                   bool setNSW = true);
 
 // declarative passes
 #define GEN_PASS_REGISTRATION
diff --git a/flang/include/flang/Optimizer/Transforms/Passes.td b/flang/include/flang/Optimizer/Transforms/Passes.td
index af6bd41cbb71da..09eed385c2c931 100644
--- a/flang/include/flang/Optimizer/Transforms/Passes.td
+++ b/flang/include/flang/Optimizer/Transforms/Passes.td
@@ -153,7 +153,7 @@ def CFGConversion : Pass<"cfg-conversion"> {
            /*default=*/"false",
            "force the body of a loop to execute at least once">,
     Option<"setNSW", "set-nsw", "bool",
-           /*default=*/"false",
+           /*default=*/"true",
            "set nsw on loop variable increment">
   ];
 }
diff --git a/flang/include/flang/Tools/CrossToolHelpers.h b/flang/include/flang/Tools/CrossToolHelpers.h
index df4b21ada058fe..1626970600eff2 100644
--- a/flang/include/flang/Tools/CrossToolHelpers.h
+++ b/flang/include/flang/Tools/CrossToolHelpers.h
@@ -122,7 +122,7 @@ struct MLIRToLLVMPassPipelineConfig : public FlangEPCallBacks {
   bool NoSignedZerosFPMath =
       false; ///< Set no-signed-zeros-fp-math attribute for functions.
   bool UnsafeFPMath = false; ///< Set unsafe-fp-math attribute for functions.
-  bool NSWOnLoopVarInc = false; ///< Add nsw flag to loop variable increments.
+  bool NSWOnLoopVarInc = true; ///< Add nsw flag to loop variable increments.
 };
 
 struct OffloadModuleOpts {
diff --git a/flang/lib/Frontend/CompilerInvocation.cpp b/flang/lib/Frontend/CompilerInvocation.cpp
index 94d3d115417877..5da5236af2b0e1 100644
--- a/flang/lib/Frontend/CompilerInvocation.cpp
+++ b/flang/lib/Frontend/CompilerInvocation.cpp
@@ -1350,12 +1350,6 @@ bool CompilerInvocation::createFromArgs(
     invoc.loweringOpts.setNoPPCNativeVecElemOrder(true);
   }
 
-  // -flang-experimental-integer-overflow
-  if (args.hasArg(
-          clang::driver::options::OPT_flang_experimental_integer_overflow)) {
-    invoc.loweringOpts.setNSWOnLoopVarInc(true);
-  }
-
   // Preserve all the remark options requested, i.e. -Rpass, -Rpass-missed or
   // -Rpass-analysis. This will be used later when processing and outputting the
   // remarks generated by LLVM in ExecuteCompilerInvocation.cpp.
diff --git a/flang/lib/Frontend/FrontendActions.cpp b/flang/lib/Frontend/FrontendActions.cpp
index f2e460fc53a67f..3ea242315484fd 100644
--- a/flang/lib/Frontend/FrontendActions.cpp
+++ b/flang/lib/Frontend/FrontendActions.cpp
@@ -828,8 +828,8 @@ void CodeGenAction::generateLLVMIR() {
     config.VScaleMax = vsr->second;
   }
 
-  if (ci.getInvocation().getLoweringOpts().getNSWOnLoopVarInc())
-    config.NSWOnLoopVarInc = true;
+  if (ci.getInvocation().getLoweringOpts().getIntegerWrapAround())
+    config.NSWOnLoopVarInc = false;
 
   // Create the pass pipeline
   fir::createMLIRToLLVMPassPipeline(pm, config, getCurrentFile());
diff --git a/flang/lib/Lower/Bridge.cpp b/flang/lib/Lower/Bridge.cpp
index 877fe122265dd0..a3bd1ace11da21 100644
--- a/flang/lib/Lower/Bridge.cpp
+++ b/flang/lib/Lower/Bridge.cpp
@@ -2271,7 +2271,7 @@ class FirConverter : public Fortran::lower::AbstractConverter {
     assert(!incrementLoopNestInfo.empty() && "empty loop nest");
     mlir::Location loc = toLocation();
     mlir::arith::IntegerOverflowFlags flags{};
-    if (getLoweringOptions().getNSWOnLoopVarInc())
+    if (!getLoweringOptions().getIntegerWrapAround())
       flags = bitEnumSet(flags, mlir::arith::IntegerOverflowFlags::nsw);
     auto iofAttr = mlir::arith::IntegerOverflowFlagsAttr::get(
         builder->getContext(), flags);
diff --git a/flang/lib/Lower/IO.cpp b/flang/lib/Lower/IO.cpp
index 1894b0cfd1bec2..b534c81a605a90 100644
--- a/flang/lib/Lower/IO.cpp
+++ b/flang/lib/Lower/IO.cpp
@@ -929,7 +929,7 @@ static void genIoLoop(Fortran::lower::AbstractConverter &converter,
   fir::FirOpBuilder &builder = converter.getFirOpBuilder();
   mlir::Location loc = converter.getCurrentLocation();
   mlir::arith::IntegerOverflowFlags flags{};
-  if (converter.getLoweringOptions().getNSWOnLoopVarInc())
+  if (!converter.getLoweringOptions().getIntegerWrapAround())
     flags = bitEnumSet(flags, mlir::arith::IntegerOverflowFlags::nsw);
   auto iofAttr =
       mlir::arith::IntegerOverflowFlagsAttr::get(builder.getContext(), flags);
diff --git a/flang/lib/Optimizer/Passes/Pipelines.cpp b/flang/lib/Optimizer/Passes/Pipelines.cpp
index 3fa5c54403bd8c..fc13d77469f17b 100644
--- a/flang/lib/Optimizer/Passes/Pipelines.cpp
+++ b/flang/lib/Optimizer/Passes/Pipelines.cpp
@@ -35,11 +35,11 @@ void addCanonicalizerPassWithoutRegionSimplification(mlir::OpPassManager &pm) {
 void addCfgConversionPass(mlir::PassManager &pm,
                           const MLIRToLLVMPassPipelineConfig &config) {
   if (config.NSWOnLoopVarInc)
-    addNestedPassToAllTopLevelOperationsConditionally(
-        pm, disableCfgConversion, fir::createCFGConversionPassWithNSW);
-  else
     addNestedPassToAllTopLevelOperationsConditionally(pm, disableCfgConversion,
                                                       fir::createCFGConversion);
+  else
+    addNestedPassToAllTopLevelOperationsConditionally(
+        pm, disableCfgConversion, fir::createCFGConversionPassWithoutNSW);
 }
 
 void addAVC(mlir::PassManager &pm, const llvm::OptimizationLevel &optLevel) {
diff --git a/flang/lib/Optimizer/Transforms/ControlFlowConverter.cpp b/flang/lib/Optimizer/Transforms/ControlFlowConverter.cpp
index 3b79d6d311b71c..411bf7f364a602 100644
--- a/flang/lib/Optimizer/Transforms/ControlFlowConverter.cpp
+++ b/flang/lib/Optimizer/Transforms/ControlFlowConverter.cpp
@@ -332,8 +332,6 @@ class CfgConversion : public fir::impl::CFGConversionBase<CfgConversion> {
 public:
   using CFGConversionBase<CfgConversion>::CFGConversionBase;
 
-  CfgConversion(bool setNSW) { this->setNSW = setNSW; }
-
   void runOnOperation() override {
     auto *context = &this->getContext();
     mlir::RewritePatternSet patterns(context);
@@ -366,6 +364,8 @@ void fir::populateCfgConversionRewrites(mlir::RewritePatternSet &patterns,
       patterns.getContext(), forceLoopToExecuteOnce, setNSW);
 }
 
-std::unique_ptr<mlir::Pass> fir::createCFGConversionPassWithNSW() {
-  return std::make_unique<CfgConversion>(true);
+std::unique_ptr<mlir::Pass> fir::createCFGConversionPassWithoutNSW() {
+  fir::CFGConversionOptions options;
+  options.setNSW = false;
+  return fir::createCFGConversion(options);
 }
diff --git a/flang/test/Driver/frontend-forwarding.f90 b/flang/test/Driver/frontend-forwarding.f90
index ff2d6609521464..55a74ccf40467b 100644
--- a/flang/test/Driver/frontend-forwarding.f90
+++ b/flang/test/Driver/frontend-forwarding.f90
@@ -20,7 +20,6 @@
 ! RUN:     -fversion-loops-for-stride \
 ! RUN:     -flang-experimental-hlfir \
 ! RUN:     -flang-deprecated-no-hlfir \
-! RUN:     -flang-experimental-integer-overflow \
 ! RUN:     -fno-ppc-native-vector-element-order \
 ! RUN:     -fppc-native-vector-element-order \
 ! RUN:     -mllvm -print-before-all \
@@ -52,7 +51,6 @@
 ! CHECK: "-fversion-loops-for-stride"
 ! CHECK: "-flang-experimental-hlfir"
 ! CHECK: "-flang-deprecated-no-hlfir"
-! CHECK: "-flang-experimental-integer-overflow"
 ! CHECK: "-fno-ppc-native-vector-element-order"
 ! CHECK: "-fppc-native-vector-element-order"
 ! CHECK: "-Rpass"
diff --git a/flang/test/Fir/convert-to-llvm-openmp-and-fir.fir b/flang/test/Fir/convert-to-llvm-openmp-and-fir.fir
index 335877e7c9a872..00f8e6e6cc9a6b 100644
--- a/flang/test/Fir/convert-to-llvm-openmp-and-fir.fir
+++ b/flang/test/Fir/convert-to-llvm-openmp-and-fir.fir
@@ -358,10 +358,10 @@ func.func @_QPopenmp_target_data_region() {
       %9 = arith.subi %8, %c1_i64 : i64
       %10 = fir.coordinate_of %0, %9 : (!fir.ref<!fir.array<1024xi32>>, i64) -> !fir.ref<i32>
       fir.store %6 to %10 : !fir.ref<i32>
-      %11 = arith.addi %arg0, %c1 : index
+      %11 = arith.addi %arg0, %c1 overflow<nsw> : index
       %12 = fir.convert %c1 : (index) -> i32
       %13 = fir.load %1 : !fir.ref<i32>
-      %14 = arith.addi %13, %12 : i32
+      %14 = arith.addi %13, %12 overflow<nsw> : i32
       fir.result %11, %14 : index, i32
     }
     fir.store %5#1 to %1 : !fir.ref<i32>
@@ -404,11 +404,11 @@ func.func @_QPopenmp_target_data_region() {
 // CHECK:             %[[VAL_21:.*]] = llvm.sub %[[VAL_19]], %[[VAL_20]]  : i64
 // CHECK:             %[[VAL_22:.*]] = llvm.getelementptr %[[VAL_1]][0, %[[VAL_21]]] : (!llvm.ptr, i64) -> !llvm.ptr
 // CHECK:             llvm.store %[[VAL_17]], %[[VAL_22]] : i32, !llvm.ptr
-// CHECK:             %[[VAL_23:.*]] = llvm.add %[[VAL_12]], %[[VAL_8]]  : i64
+// CHECK:             %[[VAL_23:.*]] = llvm.add %[[VAL_12]], %[[VAL_8]] overflow<nsw> : i64
 // CHECK:             %[[VAL_24:.*]] = llvm.trunc %[[VAL_8]] : i64 to i32
 // CHECK:             %[[VAL_25:.*]] = llvm.load %[[VAL_3]] : !llvm.ptr -> i32
-// CHECK:             %[[VAL_26:.*]] = llvm.add %[[VAL_25]], %[[VAL_24]]  : i32
-// CHECK:             %[[VAL_27:.*]] = llvm.add %[[VAL_12]], %[[VAL_8]]  : i64
+// CHECK:             %[[VAL_26:.*]] = llvm.add %[[VAL_25]], %[[VAL_24]] overflow<nsw> : i32
+// CHECK:             %[[VAL_27:.*]] = llvm.add %[[VAL_12]], %[[VAL_8]] overflow<nsw> : i64
 // CHECK:             %[[VAL_28:.*]] = llvm.mlir.constant(1 : index) : i64
 // CHECK:             %[[VAL_29:.*]] = llvm.sub %[[VAL_14]], %[[VAL_28]]  : i64
 // CHECK:             llvm.br ^bb1(%[[VAL_27]], %[[VAL_26]], %[[VAL_29]] : i64, i32, i64)
diff --git a/flang/test/Fir/loop01.fir b/flang/test/Fir/loop01.fir
index c1cbb522c378c0..30d10b9bbdb979 100644
--- a/flang/test/Fir/loop01.fir
+++ b/flang/test/Fir/loop01.fir
@@ -1,5 +1,7 @@
 // RUN: fir-opt --split-input-file --cfg-conversion %s | FileCheck %s
-// RUN: fir-opt --split-input-file --cfg-conversion="set-nsw=true" %s | FileCheck %s --check-prefix=NSW
+// RUN: fir-opt --split-input-file --cfg-conversion="set-nsw=false" %s | FileCheck %s --check-prefix=NO-NSW
+
+// NO-NSW-NOT: overflow<nsw>
 
 func.func @x(%lb : index, %ub : index, %step : index, %b : i1, %addr : !fir.ref<index>) {
   fir.do_loop %iv = %lb to %ub step %step unordered {
@@ -35,7 +37,7 @@ func.func private @f2() -> i1
 // CHECK:       fir.store %[[VAL_12]] to %[[VAL_4]] : !fir.ref<index>
 // CHECK:       br ^bb5
 // CHECK:     ^bb5:
-// CHECK:       %[[VAL_13:.*]] = arith.addi %[[VAL_8]], %[[VAL_2]] : index
+// CHECK:       %[[VAL_13:.*]] = arith.addi %[[VAL_8]], %[[VAL_2]] overflow<nsw> : index
 // CHECK:       %[[VAL_14:.*]] = arith.constant 1 : index
 // CHECK:       %[[VAL_15:.*]] = arith.subi %[[VAL_9]], %[[VAL_14]] : index
 // CHECK:       br ^bb1(%[[VAL_13]], %[[VAL_15]] : index, index)
@@ -44,34 +46,6 @@ func.func private @f2() -> i1
 // CHECK:     }
 // CHECK:     func private @f2() -> i1
 
-// NSW:     func @x(%[[VAL_0:.*]]: index, %[[VAL_1:.*]]: index, %[[VAL_2:.*]]: index, %[[VAL_3:.*]]: i1, %[[VAL_4:.*]]: !fir.ref<index>) {
-// NSW:       %[[VAL_5:.*]] = arith.subi %[[VAL_1]], %[[VAL_0]] : index
-// NSW:       %[[VAL_6:.*]] = arith.addi %[[VAL_5]], %[[VAL_2]] : index
-// NSW:       %[[VAL_7:.*]] = arith.divsi %[[VAL_6]], %[[VAL_2]] : index
-// NSW:       br ^bb1(%[[VAL_0]], %[[VAL_7]] : index, index)
-// NSW:     ^bb1(%[[VAL_8:.*]]: index, %[[VAL_9:.*]]: index):
-// NSW:       %[[VAL_10:.*]] = arith.constant 0 : index
-// NSW:       %[[VAL_11:.*]] = arith.cmpi sgt, %[[VAL_9]], %[[VAL_10]] : index
-// NSW:       cond_br %[[VAL_11]], ^bb2, ^bb6
-// NSW:     ^bb2:
-// NSW:       cond_br %[[VAL_3]], ^bb3, ^bb4
-// NSW:     ^bb3:
-// NSW:       fir.store %[[VAL_8]] to %[[VAL_4]] : !fir.ref<index>
-// NSW:       br ^bb5
-// NSW:     ^bb4:
-// NSW:       %[[VAL_12:.*]] = arith.constant 0 : index
-// NSW:       fir.store %[[VAL_12]] to %[[VAL_4]] : !fir.ref<index>
-// NSW:       br ^bb5
-// NSW:     ^bb5:
-// NSW:       %[[VAL_13:.*]] = arith.addi %[[VAL_8]], %[[VAL_2]] overflow<nsw> : index
-// NSW:       %[[VAL_14:.*]] = arith.constant 1 : index
-// NSW:       %[[VAL_15:.*]] = arith.subi %[[VAL_9]], %[[VAL_14]] : index
-// NSW:       br ^bb1(%[[VAL_13]], %[[VAL_15]] : index, index)
-// NSW:     ^bb6:
-// NSW:       return
-// NSW:     }
-// NSW:     func private @f2() -> i1
-
 // -----
 
 func.func @x2(%lo : index, %up : index, %ok : i1) {
@@ -101,36 +75,13 @@ func.func private @f3(i16)
 // CHECK:     cond_br %[[VAL_14]], ^bb2, ^bb3
 // CHECK:   ^bb2:
 // CHECK:     %[[VAL_15:.*]] = fir.call @f2() : () -> i1
-// CHECK:     %[[VAL_16:.*]] = arith.addi %[[VAL_4]], %[[VAL_3]] : index
+// CHECK:     %[[VAL_16:.*]] = arith.addi %[[VAL_4]], %[[VAL_3]] overflow<nsw> : index
 // CHECK:     br ^bb1(%[[VAL_16]], %[[VAL_15]] : index, i1)
 // CHECK:   ^bb3:
 // CHECK:     return
 // CHECK:   }
 // CHECK:   func private @f3(i16)
 
-// NSW:   func @x2(%[[VAL_0:.*]]: index, %[[VAL_1:.*]]: index, %[[VAL_2:.*]]: i1) {
-// NSW:     %[[VAL_3:.*]] = arith.constant 1 : index
-// NSW:     br ^bb1(%[[VAL_0]], %[[VAL_2]] : index, i1)
-// NSW:   ^bb1(%[[VAL_4:.*]]: index, %[[VAL_5:.*]]: i1):
-// NSW:     %[[VAL_6:.*]] = arith.constant 0 : index
-// NSW:     %[[VAL_7:.*]] = arith.cmpi slt, %[[VAL_6]], %[[VAL_3]] : index
-// NSW:     %[[VAL_8:.*]] = arith.cmpi sle, %[[VAL_4]], %[[VAL_1]] : index
-// NSW:     %[[VAL_9:.*]] = arith.cmpi slt, %[[VAL_3]], %[[VAL_6]] : index
-// NSW:     %[[VAL_10:.*]] = arith.cmpi sle, %[[VAL_1]], %[[VAL_4]] : index
-// NSW:     %[[VAL_11:.*]] = arith.andi %[[VAL_7]], %[[VAL_8]] : i1
-// NSW:     %[[VAL_12:.*]] = arith.andi %[[VAL_9]], %[[VAL_10]] : i1
-// NSW:     %[[VAL_13:.*]] = arith.ori %[[VAL_11]], %[[VAL_12]] : i1
-// NSW:     %[[VAL_14:.*]] = arith.andi %[[VAL_5]], %[[VAL_13]] : i1
-// NSW:     cond_br %[[VAL_14]], ^bb2, ^bb3
-// NSW:   ^bb2:
-// NSW:     %[[VAL_15:.*]] = fir.call @f2() : () -> i1
-// NSW:     %[[VAL_16:.*]] = arith.addi %[[VAL_4]], %[[VAL_3]] overflow<nsw> : index
-// NSW:     br ^bb1(%[[VAL_16]], %[[VAL_15]] : index, i1)
-// NSW:   ^bb3:
-// NSW:     return
-// NSW:   }
-// NSW:   func private @f3(i16)
-
 // -----
 
 // do_loop with an extra loop-carried value
@@ -159,7 +110,7 @@ func.func @x3(%lo : index, %up : index) -> i1 {
 // CHECK:           cond_br %[[VAL_11]], ^bb2, ^bb3
 // CHECK:         ^bb2:
 // CHECK:           %[[VAL_12:.*]] = fir.call @f2() : () -> i1
-// CHECK:           %[[VAL_13:.*]] = arith.addi %[[VAL_7]], %[[VAL_2]] : index
+// CHECK:           %[[VAL_13:.*]] = arith.addi %[[VAL_7]], %[[VAL_2]] overflow<nsw> : index
 // CHECK:           %[[VAL_14:.*]] = arith.constant 1 : index
 // CHECK:           %[[VAL_15:.*]] = arith.subi %[[VAL_9]], %[[VAL_14]] : index
 // CHECK:           br ^bb1(%[[VAL_13]], %[[VAL_12]], %[[VAL_15]] : index, i1, index)
@@ -167,29 +118,6 @@ func.func @x3(%lo : index, %up : index) -> i1 {
 // CHECK:           return %[[VAL_8]] : i1
 // CHECK:         }
 
-// NSW-LABEL:   func @x3(
-// NSW-SAME:             %[[VAL_0:.*]]: index,
-// NSW-SAME:             %[[VAL_1:.*]]: index) -> i1 {
-// NSW:           %[[VAL_2:.*]] = arith.constant 1 : index
-// NSW:           %[[VAL_3:.*]] = arith.constant true
-// NSW:           %[[VAL_4:.*]] = arith.subi %[[VAL_1]], %[[VAL_0]] : index
-// NSW:           %[[VAL_5:.*]] = arith.addi %[[VAL_4]], %[[VAL_2]] : index
-// NSW:           %[[VAL_6:.*]] = arith.divsi %[[VAL_5]], %[[VAL_2]] : index
-// NSW:           br ^bb1(%[[VAL_0]], %[[VAL_3]], %[[VAL_6]] : index, i1, index)
-// NSW:         ^bb1(%[[VAL_7:.*]]: index, %[[VAL_8:.*]]: i1, %[[VAL_9:.*]]: index):
-// NSW:           %[[VAL_10:.*]] = arith.constant 0 : index
-// NSW:           %[[VAL_11:.*]] = arith.cmpi sgt, %[[VAL_9]], %[[VAL_10]] : index
-// NSW:           cond_br %[[VAL_11]], ^bb2, ^bb3
-// NSW:         ^bb2:
-// NSW:           %[[VAL_12:.*]] = fir.call @f2() : () -> i1
-// NSW:           %[[VAL_13:.*]] = arith.addi %[[VAL_7]], %[[VAL_2]] overflow<nsw> : index
-// NSW:           %[[VAL_14:.*]] = arith.constant 1 : index
-// NSW:           %[[VAL_15:.*]] = arith.subi %[[VAL_9]], %[[VAL_14]] : index
-// NSW:           br ^bb1(%[[VAL_13]], %[[VAL_12]], %[[VAL_15]] : index, i1, index)
-// NSW:         ^bb3:
-// NSW:           return %[[VAL_8]] : i1
-// NSW:         }
-
 // -----
 
 // iterate_while with an extra loop-carried value
@@ -227,7 +155,7 @@ func.func private @f4(i32) -> i1
 // CHECK:           cond_br %[[VAL_16]], ^bb2, ^bb3
 // CHECK:         ^bb2:
 // CHECK:           %[[VAL_17:.*]] = fir.call @f2() : () -> i1
-// CHECK:           %[[VAL_18:.*]] = arith.addi %[[VAL_5]], %[[VAL_2]] : index
+// CHECK:           %[[VAL_18:.*]] = arith.addi %[[VAL_5]], %[[VAL_2]] overflow<nsw> : index
 // CHECK:           br ^bb1(%[[VAL_18]], %[[VAL_6]], %[[VAL_17]] : index, i1, i1)
 // CHECK:         ^bb3:
 // CHECK:           %[[VAL_19:.*]] = arith.andi %[[VAL_6]], %[[VAL_7]] : i1
@@ -235,34 +163,6 @@ func.func private @f4(i32) -> i1
 // CHECK:         }
 // CHECK:         func private @f4(i32) -> i1
 
-// NSW-LABEL:   func @y3(
-// NSW-SAME:             %[[VAL_0:.*]]: index,
-// NSW-SAME:             %[[VAL_1:.*]]: index) -> i1 {
-// NSW:           %[[VAL_2:.*]] = arith.constant 1 : index
-// NSW:           %[[VAL_3:.*]] = arith.constant true
-// NSW:           %[[VAL_4:.*]] = fir.call @f2() : () -> i1
-// NSW:           br ^bb1(%[[VAL_0]], %[[VAL_3]], %[[VAL_4]] : index, i1, i1)
-// NSW:         ^bb1(%[[VAL_5:.*]]: index, %[[VAL_6:.*]]: i...
[truncated]

@llvmbot
Copy link
Member

llvmbot commented Oct 21, 2024

@llvm/pr-subscribers-clang-driver

Author: Yusuke MINATO (yus3710-fj)

Changes

nsw is added to do-variable increment when -fno-wrapv is enabled as GFortran seems to do.
Therefore, the option introduced by #91579 isn't necessary any more.

Note that the feature of -flang-experimental-integer-overflow is now enabled by default.


Patch is 101.96 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/110063.diff

40 Files Affected:

  • (modified) clang/include/clang/Driver/Options.td (-4)
  • (modified) clang/lib/Driver/ToolChains/Flang.cpp (-1)
  • (modified) flang/include/flang/Lower/LoweringOptions.def (-5)
  • (modified) flang/include/flang/Optimizer/Transforms/Passes.h (+2-2)
  • (modified) flang/include/flang/Optimizer/Transforms/Passes.td (+1-1)
  • (modified) flang/include/flang/Tools/CrossToolHelpers.h (+1-1)
  • (modified) flang/lib/Frontend/CompilerInvocation.cpp (-6)
  • (modified) flang/lib/Frontend/FrontendActions.cpp (+2-2)
  • (modified) flang/lib/Lower/Bridge.cpp (+1-1)
  • (modified) flang/lib/Lower/IO.cpp (+1-1)
  • (modified) flang/lib/Optimizer/Passes/Pipelines.cpp (+3-3)
  • (modified) flang/lib/Optimizer/Transforms/ControlFlowConverter.cpp (+4-4)
  • (modified) flang/test/Driver/frontend-forwarding.f90 (-2)
  • (modified) flang/test/Fir/convert-to-llvm-openmp-and-fir.fir (+5-5)
  • (modified) flang/test/Fir/loop01.fir (+11-219)
  • (modified) flang/test/Fir/loop02.fir (+2-2)
  • (modified) flang/test/Lower/HLFIR/goto-do-body.f90 (+2-2)
  • (modified) flang/test/Lower/OpenMP/parallel-private-clause-fixes.f90 (+2-2)
  • (modified) flang/test/Lower/OpenMP/wsloop-reduction-allocatable-array-minmax.f90 (+2-2)
  • (modified) flang/test/Lower/OpenMP/wsloop-variable.f90 (+2-2)
  • (modified) flang/test/Lower/array-character.f90 (+1-1)
  • (modified) flang/test/Lower/array-derived-assignments.f90 (+1-1)
  • (modified) flang/test/Lower/array-derived.f90 (+1-1)
  • (modified) flang/test/Lower/array-elemental-calls-char-byval.f90 (+1-1)
  • (modified) flang/test/Lower/array-elemental-calls-char.f90 (+1-1)
  • (modified) flang/test/Lower/array-expression-assumed-size.f90 (+1-1)
  • (modified) flang/test/Lower/array-expression-slice-1.f90 (+1-1)
  • (modified) flang/test/Lower/array-substring.f90 (+5-41)
  • (modified) flang/test/Lower/array-temp.f90 (+1-1)
  • (modified) flang/test/Lower/components.f90 (+1-1)
  • (modified) flang/test/Lower/do_loop.f90 (+17-56)
  • (modified) flang/test/Lower/do_loop_unstructured.f90 (+11-192)
  • (modified) flang/test/Lower/goto-do-body.f90 (+2-2)
  • (modified) flang/test/Lower/host-associated.f90 (+1-1)
  • (modified) flang/test/Lower/infinite_loop.f90 (+5-36)
  • (modified) flang/test/Lower/io-implied-do-fixes.f90 (+6-43)
  • (modified) flang/test/Lower/loops2.f90 (+1-1)
  • (modified) flang/test/Lower/mixed_loops.f90 (+3-3)
  • (modified) flang/test/Lower/vector-subscript-io.f90 (+17-17)
  • (modified) flang/tools/bbc/bbc.cpp (+1-8)
diff --git a/clang/include/clang/Driver/Options.td b/clang/include/clang/Driver/Options.td
index 152c43d7908ff8..6ac1b3beac8a07 100644
--- a/clang/include/clang/Driver/Options.td
+++ b/clang/include/clang/Driver/Options.td
@@ -6771,10 +6771,6 @@ def flang_deprecated_no_hlfir : Flag<["-"], "flang-deprecated-no-hlfir">,
   Flags<[HelpHidden]>, Visibility<[FlangOption, FC1Option]>,
   HelpText<"Do not use HLFIR lowering (deprecated)">;
 
-def flang_experimental_integer_overflow : Flag<["-"], "flang-experimental-integer-overflow">,
-  Flags<[HelpHidden]>, Visibility<[FlangOption, FC1Option]>,
-  HelpText<"Add nsw flag to internal operations such as do-variable increment (experimental)">;
-
 //===----------------------------------------------------------------------===//
 // FLangOption + CoreOption + NoXarchOption
 //===----------------------------------------------------------------------===//
diff --git a/clang/lib/Driver/ToolChains/Flang.cpp b/clang/lib/Driver/ToolChains/Flang.cpp
index a9d2b7a4dc48f9..366cadc2e54775 100644
--- a/clang/lib/Driver/ToolChains/Flang.cpp
+++ b/clang/lib/Driver/ToolChains/Flang.cpp
@@ -148,7 +148,6 @@ void Flang::addCodegenOptions(const ArgList &Args,
 
   Args.addAllArgs(CmdArgs, {options::OPT_flang_experimental_hlfir,
                             options::OPT_flang_deprecated_no_hlfir,
-                            options::OPT_flang_experimental_integer_overflow,
                             options::OPT_fno_ppc_native_vec_elem_order,
                             options::OPT_fppc_native_vec_elem_order});
 }
diff --git a/flang/include/flang/Lower/LoweringOptions.def b/flang/include/flang/Lower/LoweringOptions.def
index 231de533fbd30a..0b22e54b648e94 100644
--- a/flang/include/flang/Lower/LoweringOptions.def
+++ b/flang/include/flang/Lower/LoweringOptions.def
@@ -38,10 +38,5 @@ ENUM_LOWERINGOPT(Underscoring, unsigned, 1, 1)
 /// (i.e. wraps around as two's complement). Off by default.
 ENUM_LOWERINGOPT(IntegerWrapAround, unsigned, 1, 0)
 
-/// If true, add nsw flags to loop variable increments.
-/// Off by default.
-/// TODO: integrate this option with the above
-ENUM_LOWERINGOPT(NSWOnLoopVarInc, unsigned, 1, 0)
-
 #undef LOWERINGOPT
 #undef ENUM_LOWERINGOPT
diff --git a/flang/include/flang/Optimizer/Transforms/Passes.h b/flang/include/flang/Optimizer/Transforms/Passes.h
index 3b2af3a3398108..e4dcb0060c778a 100644
--- a/flang/include/flang/Optimizer/Transforms/Passes.h
+++ b/flang/include/flang/Optimizer/Transforms/Passes.h
@@ -65,7 +65,7 @@ namespace fir {
 std::unique_ptr<mlir::Pass> createAffineDemotionPass();
 std::unique_ptr<mlir::Pass>
 createArrayValueCopyPass(fir::ArrayValueCopyOptions options = {});
-std::unique_ptr<mlir::Pass> createCFGConversionPassWithNSW();
+std::unique_ptr<mlir::Pass> createCFGConversionPassWithoutNSW();
 std::unique_ptr<mlir::Pass> createMemDataFlowOptPass();
 std::unique_ptr<mlir::Pass> createPromoteToAffinePass();
 std::unique_ptr<mlir::Pass>
@@ -82,7 +82,7 @@ createVScaleAttrPass(std::pair<unsigned, unsigned> vscaleAttr);
 
 void populateCfgConversionRewrites(mlir::RewritePatternSet &patterns,
                                    bool forceLoopToExecuteOnce = false,
-                                   bool setNSW = false);
+                                   bool setNSW = true);
 
 // declarative passes
 #define GEN_PASS_REGISTRATION
diff --git a/flang/include/flang/Optimizer/Transforms/Passes.td b/flang/include/flang/Optimizer/Transforms/Passes.td
index af6bd41cbb71da..09eed385c2c931 100644
--- a/flang/include/flang/Optimizer/Transforms/Passes.td
+++ b/flang/include/flang/Optimizer/Transforms/Passes.td
@@ -153,7 +153,7 @@ def CFGConversion : Pass<"cfg-conversion"> {
            /*default=*/"false",
            "force the body of a loop to execute at least once">,
     Option<"setNSW", "set-nsw", "bool",
-           /*default=*/"false",
+           /*default=*/"true",
            "set nsw on loop variable increment">
   ];
 }
diff --git a/flang/include/flang/Tools/CrossToolHelpers.h b/flang/include/flang/Tools/CrossToolHelpers.h
index df4b21ada058fe..1626970600eff2 100644
--- a/flang/include/flang/Tools/CrossToolHelpers.h
+++ b/flang/include/flang/Tools/CrossToolHelpers.h
@@ -122,7 +122,7 @@ struct MLIRToLLVMPassPipelineConfig : public FlangEPCallBacks {
   bool NoSignedZerosFPMath =
       false; ///< Set no-signed-zeros-fp-math attribute for functions.
   bool UnsafeFPMath = false; ///< Set unsafe-fp-math attribute for functions.
-  bool NSWOnLoopVarInc = false; ///< Add nsw flag to loop variable increments.
+  bool NSWOnLoopVarInc = true; ///< Add nsw flag to loop variable increments.
 };
 
 struct OffloadModuleOpts {
diff --git a/flang/lib/Frontend/CompilerInvocation.cpp b/flang/lib/Frontend/CompilerInvocation.cpp
index 94d3d115417877..5da5236af2b0e1 100644
--- a/flang/lib/Frontend/CompilerInvocation.cpp
+++ b/flang/lib/Frontend/CompilerInvocation.cpp
@@ -1350,12 +1350,6 @@ bool CompilerInvocation::createFromArgs(
     invoc.loweringOpts.setNoPPCNativeVecElemOrder(true);
   }
 
-  // -flang-experimental-integer-overflow
-  if (args.hasArg(
-          clang::driver::options::OPT_flang_experimental_integer_overflow)) {
-    invoc.loweringOpts.setNSWOnLoopVarInc(true);
-  }
-
   // Preserve all the remark options requested, i.e. -Rpass, -Rpass-missed or
   // -Rpass-analysis. This will be used later when processing and outputting the
   // remarks generated by LLVM in ExecuteCompilerInvocation.cpp.
diff --git a/flang/lib/Frontend/FrontendActions.cpp b/flang/lib/Frontend/FrontendActions.cpp
index f2e460fc53a67f..3ea242315484fd 100644
--- a/flang/lib/Frontend/FrontendActions.cpp
+++ b/flang/lib/Frontend/FrontendActions.cpp
@@ -828,8 +828,8 @@ void CodeGenAction::generateLLVMIR() {
     config.VScaleMax = vsr->second;
   }
 
-  if (ci.getInvocation().getLoweringOpts().getNSWOnLoopVarInc())
-    config.NSWOnLoopVarInc = true;
+  if (ci.getInvocation().getLoweringOpts().getIntegerWrapAround())
+    config.NSWOnLoopVarInc = false;
 
   // Create the pass pipeline
   fir::createMLIRToLLVMPassPipeline(pm, config, getCurrentFile());
diff --git a/flang/lib/Lower/Bridge.cpp b/flang/lib/Lower/Bridge.cpp
index 877fe122265dd0..a3bd1ace11da21 100644
--- a/flang/lib/Lower/Bridge.cpp
+++ b/flang/lib/Lower/Bridge.cpp
@@ -2271,7 +2271,7 @@ class FirConverter : public Fortran::lower::AbstractConverter {
     assert(!incrementLoopNestInfo.empty() && "empty loop nest");
     mlir::Location loc = toLocation();
     mlir::arith::IntegerOverflowFlags flags{};
-    if (getLoweringOptions().getNSWOnLoopVarInc())
+    if (!getLoweringOptions().getIntegerWrapAround())
       flags = bitEnumSet(flags, mlir::arith::IntegerOverflowFlags::nsw);
     auto iofAttr = mlir::arith::IntegerOverflowFlagsAttr::get(
         builder->getContext(), flags);
diff --git a/flang/lib/Lower/IO.cpp b/flang/lib/Lower/IO.cpp
index 1894b0cfd1bec2..b534c81a605a90 100644
--- a/flang/lib/Lower/IO.cpp
+++ b/flang/lib/Lower/IO.cpp
@@ -929,7 +929,7 @@ static void genIoLoop(Fortran::lower::AbstractConverter &converter,
   fir::FirOpBuilder &builder = converter.getFirOpBuilder();
   mlir::Location loc = converter.getCurrentLocation();
   mlir::arith::IntegerOverflowFlags flags{};
-  if (converter.getLoweringOptions().getNSWOnLoopVarInc())
+  if (!converter.getLoweringOptions().getIntegerWrapAround())
     flags = bitEnumSet(flags, mlir::arith::IntegerOverflowFlags::nsw);
   auto iofAttr =
       mlir::arith::IntegerOverflowFlagsAttr::get(builder.getContext(), flags);
diff --git a/flang/lib/Optimizer/Passes/Pipelines.cpp b/flang/lib/Optimizer/Passes/Pipelines.cpp
index 3fa5c54403bd8c..fc13d77469f17b 100644
--- a/flang/lib/Optimizer/Passes/Pipelines.cpp
+++ b/flang/lib/Optimizer/Passes/Pipelines.cpp
@@ -35,11 +35,11 @@ void addCanonicalizerPassWithoutRegionSimplification(mlir::OpPassManager &pm) {
 void addCfgConversionPass(mlir::PassManager &pm,
                           const MLIRToLLVMPassPipelineConfig &config) {
   if (config.NSWOnLoopVarInc)
-    addNestedPassToAllTopLevelOperationsConditionally(
-        pm, disableCfgConversion, fir::createCFGConversionPassWithNSW);
-  else
     addNestedPassToAllTopLevelOperationsConditionally(pm, disableCfgConversion,
                                                       fir::createCFGConversion);
+  else
+    addNestedPassToAllTopLevelOperationsConditionally(
+        pm, disableCfgConversion, fir::createCFGConversionPassWithoutNSW);
 }
 
 void addAVC(mlir::PassManager &pm, const llvm::OptimizationLevel &optLevel) {
diff --git a/flang/lib/Optimizer/Transforms/ControlFlowConverter.cpp b/flang/lib/Optimizer/Transforms/ControlFlowConverter.cpp
index 3b79d6d311b71c..411bf7f364a602 100644
--- a/flang/lib/Optimizer/Transforms/ControlFlowConverter.cpp
+++ b/flang/lib/Optimizer/Transforms/ControlFlowConverter.cpp
@@ -332,8 +332,6 @@ class CfgConversion : public fir::impl::CFGConversionBase<CfgConversion> {
 public:
   using CFGConversionBase<CfgConversion>::CFGConversionBase;
 
-  CfgConversion(bool setNSW) { this->setNSW = setNSW; }
-
   void runOnOperation() override {
     auto *context = &this->getContext();
     mlir::RewritePatternSet patterns(context);
@@ -366,6 +364,8 @@ void fir::populateCfgConversionRewrites(mlir::RewritePatternSet &patterns,
       patterns.getContext(), forceLoopToExecuteOnce, setNSW);
 }
 
-std::unique_ptr<mlir::Pass> fir::createCFGConversionPassWithNSW() {
-  return std::make_unique<CfgConversion>(true);
+std::unique_ptr<mlir::Pass> fir::createCFGConversionPassWithoutNSW() {
+  fir::CFGConversionOptions options;
+  options.setNSW = false;
+  return fir::createCFGConversion(options);
 }
diff --git a/flang/test/Driver/frontend-forwarding.f90 b/flang/test/Driver/frontend-forwarding.f90
index ff2d6609521464..55a74ccf40467b 100644
--- a/flang/test/Driver/frontend-forwarding.f90
+++ b/flang/test/Driver/frontend-forwarding.f90
@@ -20,7 +20,6 @@
 ! RUN:     -fversion-loops-for-stride \
 ! RUN:     -flang-experimental-hlfir \
 ! RUN:     -flang-deprecated-no-hlfir \
-! RUN:     -flang-experimental-integer-overflow \
 ! RUN:     -fno-ppc-native-vector-element-order \
 ! RUN:     -fppc-native-vector-element-order \
 ! RUN:     -mllvm -print-before-all \
@@ -52,7 +51,6 @@
 ! CHECK: "-fversion-loops-for-stride"
 ! CHECK: "-flang-experimental-hlfir"
 ! CHECK: "-flang-deprecated-no-hlfir"
-! CHECK: "-flang-experimental-integer-overflow"
 ! CHECK: "-fno-ppc-native-vector-element-order"
 ! CHECK: "-fppc-native-vector-element-order"
 ! CHECK: "-Rpass"
diff --git a/flang/test/Fir/convert-to-llvm-openmp-and-fir.fir b/flang/test/Fir/convert-to-llvm-openmp-and-fir.fir
index 335877e7c9a872..00f8e6e6cc9a6b 100644
--- a/flang/test/Fir/convert-to-llvm-openmp-and-fir.fir
+++ b/flang/test/Fir/convert-to-llvm-openmp-and-fir.fir
@@ -358,10 +358,10 @@ func.func @_QPopenmp_target_data_region() {
       %9 = arith.subi %8, %c1_i64 : i64
       %10 = fir.coordinate_of %0, %9 : (!fir.ref<!fir.array<1024xi32>>, i64) -> !fir.ref<i32>
       fir.store %6 to %10 : !fir.ref<i32>
-      %11 = arith.addi %arg0, %c1 : index
+      %11 = arith.addi %arg0, %c1 overflow<nsw> : index
       %12 = fir.convert %c1 : (index) -> i32
       %13 = fir.load %1 : !fir.ref<i32>
-      %14 = arith.addi %13, %12 : i32
+      %14 = arith.addi %13, %12 overflow<nsw> : i32
       fir.result %11, %14 : index, i32
     }
     fir.store %5#1 to %1 : !fir.ref<i32>
@@ -404,11 +404,11 @@ func.func @_QPopenmp_target_data_region() {
 // CHECK:             %[[VAL_21:.*]] = llvm.sub %[[VAL_19]], %[[VAL_20]]  : i64
 // CHECK:             %[[VAL_22:.*]] = llvm.getelementptr %[[VAL_1]][0, %[[VAL_21]]] : (!llvm.ptr, i64) -> !llvm.ptr
 // CHECK:             llvm.store %[[VAL_17]], %[[VAL_22]] : i32, !llvm.ptr
-// CHECK:             %[[VAL_23:.*]] = llvm.add %[[VAL_12]], %[[VAL_8]]  : i64
+// CHECK:             %[[VAL_23:.*]] = llvm.add %[[VAL_12]], %[[VAL_8]] overflow<nsw> : i64
 // CHECK:             %[[VAL_24:.*]] = llvm.trunc %[[VAL_8]] : i64 to i32
 // CHECK:             %[[VAL_25:.*]] = llvm.load %[[VAL_3]] : !llvm.ptr -> i32
-// CHECK:             %[[VAL_26:.*]] = llvm.add %[[VAL_25]], %[[VAL_24]]  : i32
-// CHECK:             %[[VAL_27:.*]] = llvm.add %[[VAL_12]], %[[VAL_8]]  : i64
+// CHECK:             %[[VAL_26:.*]] = llvm.add %[[VAL_25]], %[[VAL_24]] overflow<nsw> : i32
+// CHECK:             %[[VAL_27:.*]] = llvm.add %[[VAL_12]], %[[VAL_8]] overflow<nsw> : i64
 // CHECK:             %[[VAL_28:.*]] = llvm.mlir.constant(1 : index) : i64
 // CHECK:             %[[VAL_29:.*]] = llvm.sub %[[VAL_14]], %[[VAL_28]]  : i64
 // CHECK:             llvm.br ^bb1(%[[VAL_27]], %[[VAL_26]], %[[VAL_29]] : i64, i32, i64)
diff --git a/flang/test/Fir/loop01.fir b/flang/test/Fir/loop01.fir
index c1cbb522c378c0..30d10b9bbdb979 100644
--- a/flang/test/Fir/loop01.fir
+++ b/flang/test/Fir/loop01.fir
@@ -1,5 +1,7 @@
 // RUN: fir-opt --split-input-file --cfg-conversion %s | FileCheck %s
-// RUN: fir-opt --split-input-file --cfg-conversion="set-nsw=true" %s | FileCheck %s --check-prefix=NSW
+// RUN: fir-opt --split-input-file --cfg-conversion="set-nsw=false" %s | FileCheck %s --check-prefix=NO-NSW
+
+// NO-NSW-NOT: overflow<nsw>
 
 func.func @x(%lb : index, %ub : index, %step : index, %b : i1, %addr : !fir.ref<index>) {
   fir.do_loop %iv = %lb to %ub step %step unordered {
@@ -35,7 +37,7 @@ func.func private @f2() -> i1
 // CHECK:       fir.store %[[VAL_12]] to %[[VAL_4]] : !fir.ref<index>
 // CHECK:       br ^bb5
 // CHECK:     ^bb5:
-// CHECK:       %[[VAL_13:.*]] = arith.addi %[[VAL_8]], %[[VAL_2]] : index
+// CHECK:       %[[VAL_13:.*]] = arith.addi %[[VAL_8]], %[[VAL_2]] overflow<nsw> : index
 // CHECK:       %[[VAL_14:.*]] = arith.constant 1 : index
 // CHECK:       %[[VAL_15:.*]] = arith.subi %[[VAL_9]], %[[VAL_14]] : index
 // CHECK:       br ^bb1(%[[VAL_13]], %[[VAL_15]] : index, index)
@@ -44,34 +46,6 @@ func.func private @f2() -> i1
 // CHECK:     }
 // CHECK:     func private @f2() -> i1
 
-// NSW:     func @x(%[[VAL_0:.*]]: index, %[[VAL_1:.*]]: index, %[[VAL_2:.*]]: index, %[[VAL_3:.*]]: i1, %[[VAL_4:.*]]: !fir.ref<index>) {
-// NSW:       %[[VAL_5:.*]] = arith.subi %[[VAL_1]], %[[VAL_0]] : index
-// NSW:       %[[VAL_6:.*]] = arith.addi %[[VAL_5]], %[[VAL_2]] : index
-// NSW:       %[[VAL_7:.*]] = arith.divsi %[[VAL_6]], %[[VAL_2]] : index
-// NSW:       br ^bb1(%[[VAL_0]], %[[VAL_7]] : index, index)
-// NSW:     ^bb1(%[[VAL_8:.*]]: index, %[[VAL_9:.*]]: index):
-// NSW:       %[[VAL_10:.*]] = arith.constant 0 : index
-// NSW:       %[[VAL_11:.*]] = arith.cmpi sgt, %[[VAL_9]], %[[VAL_10]] : index
-// NSW:       cond_br %[[VAL_11]], ^bb2, ^bb6
-// NSW:     ^bb2:
-// NSW:       cond_br %[[VAL_3]], ^bb3, ^bb4
-// NSW:     ^bb3:
-// NSW:       fir.store %[[VAL_8]] to %[[VAL_4]] : !fir.ref<index>
-// NSW:       br ^bb5
-// NSW:     ^bb4:
-// NSW:       %[[VAL_12:.*]] = arith.constant 0 : index
-// NSW:       fir.store %[[VAL_12]] to %[[VAL_4]] : !fir.ref<index>
-// NSW:       br ^bb5
-// NSW:     ^bb5:
-// NSW:       %[[VAL_13:.*]] = arith.addi %[[VAL_8]], %[[VAL_2]] overflow<nsw> : index
-// NSW:       %[[VAL_14:.*]] = arith.constant 1 : index
-// NSW:       %[[VAL_15:.*]] = arith.subi %[[VAL_9]], %[[VAL_14]] : index
-// NSW:       br ^bb1(%[[VAL_13]], %[[VAL_15]] : index, index)
-// NSW:     ^bb6:
-// NSW:       return
-// NSW:     }
-// NSW:     func private @f2() -> i1
-
 // -----
 
 func.func @x2(%lo : index, %up : index, %ok : i1) {
@@ -101,36 +75,13 @@ func.func private @f3(i16)
 // CHECK:     cond_br %[[VAL_14]], ^bb2, ^bb3
 // CHECK:   ^bb2:
 // CHECK:     %[[VAL_15:.*]] = fir.call @f2() : () -> i1
-// CHECK:     %[[VAL_16:.*]] = arith.addi %[[VAL_4]], %[[VAL_3]] : index
+// CHECK:     %[[VAL_16:.*]] = arith.addi %[[VAL_4]], %[[VAL_3]] overflow<nsw> : index
 // CHECK:     br ^bb1(%[[VAL_16]], %[[VAL_15]] : index, i1)
 // CHECK:   ^bb3:
 // CHECK:     return
 // CHECK:   }
 // CHECK:   func private @f3(i16)
 
-// NSW:   func @x2(%[[VAL_0:.*]]: index, %[[VAL_1:.*]]: index, %[[VAL_2:.*]]: i1) {
-// NSW:     %[[VAL_3:.*]] = arith.constant 1 : index
-// NSW:     br ^bb1(%[[VAL_0]], %[[VAL_2]] : index, i1)
-// NSW:   ^bb1(%[[VAL_4:.*]]: index, %[[VAL_5:.*]]: i1):
-// NSW:     %[[VAL_6:.*]] = arith.constant 0 : index
-// NSW:     %[[VAL_7:.*]] = arith.cmpi slt, %[[VAL_6]], %[[VAL_3]] : index
-// NSW:     %[[VAL_8:.*]] = arith.cmpi sle, %[[VAL_4]], %[[VAL_1]] : index
-// NSW:     %[[VAL_9:.*]] = arith.cmpi slt, %[[VAL_3]], %[[VAL_6]] : index
-// NSW:     %[[VAL_10:.*]] = arith.cmpi sle, %[[VAL_1]], %[[VAL_4]] : index
-// NSW:     %[[VAL_11:.*]] = arith.andi %[[VAL_7]], %[[VAL_8]] : i1
-// NSW:     %[[VAL_12:.*]] = arith.andi %[[VAL_9]], %[[VAL_10]] : i1
-// NSW:     %[[VAL_13:.*]] = arith.ori %[[VAL_11]], %[[VAL_12]] : i1
-// NSW:     %[[VAL_14:.*]] = arith.andi %[[VAL_5]], %[[VAL_13]] : i1
-// NSW:     cond_br %[[VAL_14]], ^bb2, ^bb3
-// NSW:   ^bb2:
-// NSW:     %[[VAL_15:.*]] = fir.call @f2() : () -> i1
-// NSW:     %[[VAL_16:.*]] = arith.addi %[[VAL_4]], %[[VAL_3]] overflow<nsw> : index
-// NSW:     br ^bb1(%[[VAL_16]], %[[VAL_15]] : index, i1)
-// NSW:   ^bb3:
-// NSW:     return
-// NSW:   }
-// NSW:   func private @f3(i16)
-
 // -----
 
 // do_loop with an extra loop-carried value
@@ -159,7 +110,7 @@ func.func @x3(%lo : index, %up : index) -> i1 {
 // CHECK:           cond_br %[[VAL_11]], ^bb2, ^bb3
 // CHECK:         ^bb2:
 // CHECK:           %[[VAL_12:.*]] = fir.call @f2() : () -> i1
-// CHECK:           %[[VAL_13:.*]] = arith.addi %[[VAL_7]], %[[VAL_2]] : index
+// CHECK:           %[[VAL_13:.*]] = arith.addi %[[VAL_7]], %[[VAL_2]] overflow<nsw> : index
 // CHECK:           %[[VAL_14:.*]] = arith.constant 1 : index
 // CHECK:           %[[VAL_15:.*]] = arith.subi %[[VAL_9]], %[[VAL_14]] : index
 // CHECK:           br ^bb1(%[[VAL_13]], %[[VAL_12]], %[[VAL_15]] : index, i1, index)
@@ -167,29 +118,6 @@ func.func @x3(%lo : index, %up : index) -> i1 {
 // CHECK:           return %[[VAL_8]] : i1
 // CHECK:         }
 
-// NSW-LABEL:   func @x3(
-// NSW-SAME:             %[[VAL_0:.*]]: index,
-// NSW-SAME:             %[[VAL_1:.*]]: index) -> i1 {
-// NSW:           %[[VAL_2:.*]] = arith.constant 1 : index
-// NSW:           %[[VAL_3:.*]] = arith.constant true
-// NSW:           %[[VAL_4:.*]] = arith.subi %[[VAL_1]], %[[VAL_0]] : index
-// NSW:           %[[VAL_5:.*]] = arith.addi %[[VAL_4]], %[[VAL_2]] : index
-// NSW:           %[[VAL_6:.*]] = arith.divsi %[[VAL_5]], %[[VAL_2]] : index
-// NSW:           br ^bb1(%[[VAL_0]], %[[VAL_3]], %[[VAL_6]] : index, i1, index)
-// NSW:         ^bb1(%[[VAL_7:.*]]: index, %[[VAL_8:.*]]: i1, %[[VAL_9:.*]]: index):
-// NSW:           %[[VAL_10:.*]] = arith.constant 0 : index
-// NSW:           %[[VAL_11:.*]] = arith.cmpi sgt, %[[VAL_9]], %[[VAL_10]] : index
-// NSW:           cond_br %[[VAL_11]], ^bb2, ^bb3
-// NSW:         ^bb2:
-// NSW:           %[[VAL_12:.*]] = fir.call @f2() : () -> i1
-// NSW:           %[[VAL_13:.*]] = arith.addi %[[VAL_7]], %[[VAL_2]] overflow<nsw> : index
-// NSW:           %[[VAL_14:.*]] = arith.constant 1 : index
-// NSW:           %[[VAL_15:.*]] = arith.subi %[[VAL_9]], %[[VAL_14]] : index
-// NSW:           br ^bb1(%[[VAL_13]], %[[VAL_12]], %[[VAL_15]] : index, i1, index)
-// NSW:         ^bb3:
-// NSW:           return %[[VAL_8]] : i1
-// NSW:         }
-
 // -----
 
 // iterate_while with an extra loop-carried value
@@ -227,7 +155,7 @@ func.func private @f4(i32) -> i1
 // CHECK:           cond_br %[[VAL_16]], ^bb2, ^bb3
 // CHECK:         ^bb2:
 // CHECK:           %[[VAL_17:.*]] = fir.call @f2() : () -> i1
-// CHECK:           %[[VAL_18:.*]] = arith.addi %[[VAL_5]], %[[VAL_2]] : index
+// CHECK:           %[[VAL_18:.*]] = arith.addi %[[VAL_5]], %[[VAL_2]] overflow<nsw> : index
 // CHECK:           br ^bb1(%[[VAL_18]], %[[VAL_6]], %[[VAL_17]] : index, i1, i1)
 // CHECK:         ^bb3:
 // CHECK:           %[[VAL_19:.*]] = arith.andi %[[VAL_6]], %[[VAL_7]] : i1
@@ -235,34 +163,6 @@ func.func private @f4(i32) -> i1
 // CHECK:         }
 // CHECK:         func private @f4(i32) -> i1
 
-// NSW-LABEL:   func @y3(
-// NSW-SAME:             %[[VAL_0:.*]]: index,
-// NSW-SAME:             %[[VAL_1:.*]]: index) -> i1 {
-// NSW:           %[[VAL_2:.*]] = arith.constant 1 : index
-// NSW:           %[[VAL_3:.*]] = arith.constant true
-// NSW:           %[[VAL_4:.*]] = fir.call @f2() : () -> i1
-// NSW:           br ^bb1(%[[VAL_0]], %[[VAL_3]], %[[VAL_4]] : index, i1, i1)
-// NSW:         ^bb1(%[[VAL_5:.*]]: index, %[[VAL_6:.*]]: i...
[truncated]

Copy link
Contributor

@tblah tblah left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! LGTM

@yus3710-fj yus3710-fj merged commit 96bb375 into llvm:main Oct 25, 2024
16 checks passed
@yus3710-fj yus3710-fj deleted the remove-experimental-integer-overflow branch October 25, 2024 06:20
@frobtech frobtech mentioned this pull request Oct 25, 2024
@tblah
Copy link
Contributor

tblah commented Oct 28, 2024

@yus3710-fj after this patch we get a 7-10% slowdown for 503.bwaves_r from SPEC2017. I built with -Ofast -mcpu=native -fuse-ld=lld -flto

As I understood it, the previous patch was supposed to enable your transformation by default, and this one was only supposed to clean up old flags. Did I understand wrong?

Anyway please could you revert this patch and/or fix the regression.

@yus3710-fj
Copy link
Contributor Author

Thank you for the information, @tblah. I'll revert this.

As I understood it, the previous patch was supposed to enable your transformation by default, and this one was only supposed to clean up old flags. Did I understand wrong?

This patch also enables some transformation by default. They would affect the performance. I apologize that I didn't check performance regression.

tblah pushed a commit that referenced this pull request Oct 28, 2024
…flow into -fno-wrapv" (#113901)

Reverts #110063 due to the performance regression on
503.bwaves_r in SPEC2017.
@yus3710-fj
Copy link
Contributor Author

@yus3710-fj after this patch we get a 7-10% slowdown for 503.bwaves_r from SPEC2017. I built with -Ofast -mcpu=native -fuse-ld=lld -flto

@tblah I'm attempting to reproduce the performance issue on my Grace machine, but have not yet been successful. Could you please provide the measurement conditions?

@tblah
Copy link
Contributor

tblah commented Oct 29, 2024

@yus3710-fj after this patch we get a 7-10% slowdown for 503.bwaves_r from SPEC2017. I built with -Ofast -mcpu=native -fuse-ld=lld -flto

@tblah I'm attempting to reproduce the performance issue on my Grace machine, but have not yet been successful. Could you please provide the measurement conditions?

Thanks for having a look. I tried on a Grace machine and found a smaller regression of ~2.7%. For this measurement I compared bd6ab32 and the commit immediately before. The flags were as I described above. I am comparing the "base rate" numbers reported by spec.

The previous mesaurement (with the larger regression) was on an aws graviton 3 machine.

@yus3710-fj
Copy link
Contributor Author

I confirmed the performance regression on both Graviton3 and Grace when the value of copies is set to 1.

The result of the measurement on Graviton3:

max min avg
before this patch 52.56 52.44 52.518
after this patch 48.90 (-7.48%) 48.79 (-7.48%) 48.86 (-7.49%)

The difference in optimization messages (-Rpass=.*):

  • zext is used instead of sext after this patch.
  • LICM is promoted by this patch.

I'm investigating why the performance gets worse even though the optimization is promoted.

NoumanAmir657 pushed a commit to NoumanAmir657/llvm-project that referenced this pull request Nov 4, 2024
…o -fno-wrapv (llvm#110063)

nsw is now added to do-variable increment when -fno-wrapv is enabled as
GFortran seems to do.
That means the option introduced by llvm#91579 isn't necessary any more.

Note that the feature of -flang-experimental-integer-overflow is enabled
by default.
NoumanAmir657 pushed a commit to NoumanAmir657/llvm-project that referenced this pull request Nov 4, 2024
…flow into -fno-wrapv" (llvm#113901)

Reverts llvm#110063 due to the performance regression on
503.bwaves_r in SPEC2017.
@tblah
Copy link
Contributor

tblah commented Dec 5, 2024

Hi @yus3710-fj, thanks for investigating the cause of the regression.

There was a recent patch which improved bwaves performance: #111853. It seems that since this patch, I no longer see a regression with the integer overflow enabled so I would be happy to re-land your patch. Please feel free to open a new PR.

I think #117318 is still a real issue, just we happen not to run out of registers anymore now that there is better constant propagation.

I tested this on an AWS graviton 3 (neoverse-v1) machine.

@yus3710-fj
Copy link
Contributor Author

Thank you for the informantion, @tblah. I also confirmed that the regression doesn't happen after fbd89bc.
I'm a little concerned that the regression still occurs when LTO is diabled. However, I'd like to reland this patch because this isn't the root cause of the issue.

@yus3710-fj yus3710-fj restored the remove-experimental-integer-overflow branch December 6, 2024 01:20
yus3710-fj added a commit that referenced this pull request Dec 10, 2024
…flow into -fno-wrapv" (#118933)

This relands #110063.
The performance issue on 503.bwaves_r is found not to be related to the
patch, and is resolved by fbd89bc when LTO is enabled.
@vzakhari
Copy link
Contributor

Hello! Did anyone notice a performance regression after this PR on exchange2? I see it on both EPYC 9684X and grace. The slowdown is more than 1.5x.

I verified that performance restores when adding -fwrapv with the latest flang compiler. The compilation options are -Ofast -march=native. I will look through LLVM dumps, but please let me know if it is a known problem.

@tblah
Copy link
Contributor

tblah commented Dec 18, 2024

Hello! Did anyone notice a performance regression after this PR on exchange2? I see it on both EPYC 9684X and grace. The slowdown is more than 1.5x.

I verified that performance restores when adding -fwrapv with the latest flang compiler. The compilation options are -Ofast -march=native. I will look through LLVM dumps, but please let me know if it is a known problem.

My team doesn't closely monitor V2 performance. Another team looked into it. I'm told that initially there was a slowdown because function specialization was not performed on the last iteration of digits. Then that fixed after #118219 so we are happy on our end.

However if you aren't using LTO then you won't be seeing the function specialization anyway. I'm told that the unspecialized version of digits was worse after this patch because of extra spills in the hot part of the code. To me that sounds similar to the bwaves regression the first time this patch landed. More info here: #117318

@vzakhari
Copy link
Contributor

Thank you for the pointers and the information, Tom! It looks like exchange2 is pretty much the same problem as with bwaves. I posted a note into #117318

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
clang:driver 'clang' and 'clang++' user-facing binaries. Not 'clang-cl' clang Clang issues not falling into any other category flang:driver flang:fir-hlfir flang:openmp flang Flang issues not falling into any other category
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants