
Section mismatch warnings around numa_nodes_parsed #1302

Closed
nathanchance opened this issue Feb 12, 2021 · 30 comments
Assignees
Labels
[ARCH] x86_64 This bug impacts ARCH=x86_64 [BUG] llvm A bug that should be fixed in upstream LLVM Clean build Issue needs to be fixed to get a warning-free all*config build [FIXED][LLVM] 14 This bug was fixed in LLVM 14.x [PATCH] Bitrot Patch is outdated and needs to be refreshed or revisited

Comments

@nathanchance

$ make -skj"$(nproc)" LLVM=1 O=build/x86_64 distclean allmodconfig vmlinux
...
WARNING: modpost: vmlinux.o(.text+0x14428e): Section mismatch in reference from the function test_bit() to the variable .init.data:numa_nodes_parsed
The function test_bit() references
the variable __initdata numa_nodes_parsed.
This is often because test_bit lacks a __initdata
annotation or the annotation of numa_nodes_parsed is wrong.

WARNING: modpost: vmlinux.o(.text+0x14436f): Section mismatch in reference from the function __first_node() to the variable .init.data:numa_nodes_parsed
The function __first_node() references
the variable __initdata numa_nodes_parsed.
This is often because __first_node lacks a __initdata
annotation or the annotation of numa_nodes_parsed is wrong.

WARNING: modpost: vmlinux.o(.text+0x1443b7): Section mismatch in reference from the function __next_node() to the variable .init.data:numa_nodes_parsed
The function __next_node() references
the variable __initdata numa_nodes_parsed.
This is often because __next_node lacks a __initdata
annotation or the annotation of numa_nodes_parsed is wrong.
...

These are all small functions marked inline, so they should be getting inlined, but the new pass manager probably changed that.

@nathanchance nathanchance added [BUG] Untriaged Something isn't working [ARCH] x86_64 This bug impacts ARCH=x86_64 labels Feb 12, 2021
@arndb

arndb commented Feb 14, 2021

As in #1301, the problem here is not that llvm decides against inlining the function, but that it creates a specialized version of the function that hardcodes the reference to a global variable in an incompatible section.

@nickdesaulniers

Synthesized functions should retain the section attribute of the function they were synthesized from; we can fix that in LLVM. A smaller test case would be helpful.

@arndb

arndb commented Feb 21, 2021

Reduced test case: https://godbolt.org/z/KMaYvW

@nickdesaulniers

nickdesaulniers commented Feb 21, 2021

Sorry, I don't fully follow from the reduced test case. The issue was that test_bit was not inlined into g (which is in .init.text), so the reference to a and b in test_bit is bad? Oh, the version of test_bit emitted has the reference to numa_nodes_parsed in it. That's new. Ah, and the specialized version is not in the same section of the caller it would have been in had it been inlined.

The first thing to do then is find which pass creates these specialized functions. It may be a middle-end pass; emitting LLVM IR via clang -emit-llvm -S ... -o foo.ll, then running opt -O2 -print-after-all foo.ll and observing which pass moved the reference to numa_nodes_parsed into test_bit, will tell us which pass we need to teach about inheriting section function attributes.

@nickdesaulniers

Emitting the single specialized function seems a little odd on first glance. (Why not just inline the thing?) Adding

extern d y;
void x(void) {
    test_bit(y.bits);
}

makes test_bit get inlined into both callers. Something curious seems to happen only when there's a single caller.

@nickdesaulniers

nickdesaulniers commented Feb 23, 2021

Protip (from @aeubanks) :

-flegacy-pass-manager to use the LPM
-fexperimental-new-pass-manager for explicitly using the NPM
(-fno-experimental-new-pass-manager is the same as -flegacy-pass-manager)

It does indeed look like this is NPM-related (--enable-new-pm/--enable-new-pm=0 for opt).
It looks like "IPSCCPPass"/ipsccp (aka "interprocedural sparse conditional constant propagation and merging") is the culprit. I wonder if it should be running after inlining?

@nickdesaulniers

It looks like IPSCCPPass was never run with the Legacy Pass Manager (LPM).

$ opt -O2 foo.ll -o - -S --enable-new-pm -print-after-all -print-module-scope 2>&1 | grep "IR Dump"

*** IR Dump After VerifierPass ***
*** IR Dump After Annotation2MetadataPass ***
*** IR Dump After ForceFunctionAttrsPass ***
*** IR Dump After InferFunctionAttrsPass ***
*** IR Dump After SimplifyCFGPass *** (function: g)
*** IR Dump After SROA *** (function: g)
*** IR Dump After EarlyCSEPass *** (function: g)
*** IR Dump After LowerExpectIntrinsicPass *** (function: g)
*** IR Dump After SimplifyCFGPass *** (function: test_bit)
*** IR Dump After SROA *** (function: test_bit)
*** IR Dump After EarlyCSEPass *** (function: test_bit)
*** IR Dump After LowerExpectIntrinsicPass *** (function: test_bit)
*** IR Dump After IPSCCPPass ***
...
$ opt -O2 foo.ll -o - -S -print-after-all -print-module-scope 2>&1 | grep "IR Dump"
...
*** IR Dump After Function Integration/Inlining ***
<test_bit is now inlined>

@nickdesaulniers

> It looks like IPSCCPPass was never run with the Legacy Pass Manager (LPM).

Never mind, it's spelled differently (ugh):

*** IR Dump After Interprocedural Sparse Conditional Constant Propagation ***

so the LPM does the same transform; the NPM just didn't do the inlining afterward.

Next is to query the inliner:
opt -passes=ipsccp,inline foo.ll -o - -S -debug-only=inline
(brb, I need a debug build of LLVM)

@nickdesaulniers

nickdesaulniers commented Feb 23, 2021

$ opt -passes=inline foo.ll -o - -S -debug-only=inline
Inlining calls in: g
    NOT Inlining (cost=55, threshold=45), Call:   call void @test_bit(i32* getelementptr inbounds (%struct.d, %struct.d* @numa_nodes_parsed, i32 0, i32 0, i64 0))
...
$ opt -O2 foo.ll -o - -S -debug-only=inline                              
Inlining calls in: g
Inlining calls in: g
    NOT Inlining (cost=45, threshold=45), Call:   call fastcc void @test_bit()
...
$ opt -O2 foo.ll -o - -S -debug-only=inline --enable-new-pm=0
Inliner visiting SCC: g: 1 call sites.
    Inlining (cost=-14955, threshold=225), Call:   call fastcc void @test_bit()
    -> Deleting dead function: test_bit
...

I'll have to think about this more, though I still think perhaps IPSCCP should not be sinking parameters with section attributes, or sinking them into callees with different section attributes, since when inlining heuristics change we get these kinds of issues. Perhaps I need to ponder "what should happen if the callee weren't so small?" Or perhaps test_bit should always be marked __always_inline so that it can be called from both __init code and regular code in .text?

(-debug-only=inline-cost is interesting to play with)

@nickdesaulniers

Working on a patch in https://reviews.llvm.org/D97971.

@nickdesaulniers nickdesaulniers added [BUG] llvm A bug that should be fixed in upstream LLVM [PATCH] Submitted A patch has been submitted for review and removed [BUG] Untriaged Something isn't working labels Mar 5, 2021
@nickdesaulniers nickdesaulniers self-assigned this Mar 5, 2021
@nickdesaulniers

I'm incredibly distracted right now with assembler-related issues in Android, but I was able to eliminate all of these with:

diff --git a/include/asm-generic/bitops/instrumented-non-atomic.h b/include/asm-generic/bitops/instrumented-non-atomic.h
index 37363d570b9b..aef7c766f335 100644
--- a/include/asm-generic/bitops/instrumented-non-atomic.h
+++ b/include/asm-generic/bitops/instrumented-non-atomic.h
@@ -129,7 +129,7 @@ static inline bool __test_and_change_bit(long nr, volatile unsigned long *addr)
  * @nr: bit number to test
  * @addr: Address to start counting from
  */
-static inline bool test_bit(long nr, const volatile unsigned long *addr)
+static __always_inline bool test_bit(long nr, const volatile unsigned long *addr)
 {
        instrument_atomic_read(addr + BIT_WORD(nr), sizeof(long));
        return arch_test_bit(nr, addr);
diff --git a/include/linux/nodemask.h b/include/linux/nodemask.h
index ac398e143c9a..a34be5098bbb 100644
--- a/include/linux/nodemask.h
+++ b/include/linux/nodemask.h
@@ -235,7 +235,7 @@ static inline int __nodes_full(const nodemask_t *srcp, unsigned int nbits)
 }
 
 #define nodes_weight(nodemask) __nodes_weight(&(nodemask), MAX_NUMNODES)
-static inline int __nodes_weight(const nodemask_t *srcp, unsigned int nbits)
+static __always_inline int __nodes_weight(const nodemask_t *srcp, unsigned int nbits)
 {
        return bitmap_weight(srcp->bits, nbits);
 }
@@ -260,13 +260,13 @@ static inline void __nodes_shift_left(nodemask_t *dstp,
           > MAX_NUMNODES, then the silly min_ts could be dropped. */
 
 #define first_node(src) __first_node(&(src))
-static inline int __first_node(const nodemask_t *srcp)
+static __always_inline int __first_node(const nodemask_t *srcp)
 {
        return min_t(int, MAX_NUMNODES, find_first_bit(srcp->bits, MAX_NUMNODES));
 }
 
 #define next_node(n, src) __next_node((n), &(src))
-static inline int __next_node(int n, const nodemask_t *srcp)
+static __always_inline int __next_node(int n, const nodemask_t *srcp)
 {
        return min_t(int,MAX_NUMNODES,find_next_bit(srcp->bits, MAX_NUMNODES, n+1));
 }

That the definition in include/asm-generic/bitops/instrumented-non-atomic.h triggered this, while other definitions of test_bit did not, seems like a smoking gun for one particular config. (We should find out what that is, then see if we still get the warning for a different definition of test_bit with allyesconfig minus that config.) bitmap_weight is itself __always_inline, so one-line callers of it probably should be, too.

include/linux/nodemask.h has a curious comment:

118 /*                                                                                                                                                                          
119  * The inline keyword gives the compiler room to decide to inline, or                                                                                                       
120  * not inline a function as it sees best.  However, as these functions                                                                                                      
121  * are called in both __init and non-__init functions, if they are not                                                                                                      
122  * inlined we will end up with a section mis-match error (of the type of                                                                                                    
123  * freeable items not being freed).  So we must use __always_inline here                                                                                                    
124  * to fix the problem.  If other functions in the future also end up in                                                                                                     
125  * this situation they will also need to be annotated as __always_inline                                                                                                    
126  */                                                                                                          

So it feels kind of suspicious to be kneecapping LLVM's constant propagation over a complex edge case, just to satisfy rules highly specific to the Linux kernel, when the kernel itself has a comment saying "use __always_inline if these are called from both __init and non-__init functions" in the same header that defines the problematic functions.

@arndb

arndb commented Mar 25, 2021

I am a bit skeptical about whether that comment still applies, though. It was added in 2013, and compilers were quite a bit different back then. I would not be surprised if whatever bug happened back then is completely unreproducible with gcc-4.9 and higher.

The hack I have locally to build cleanly just drops the __initdata annotation on numa_nodes_parsed ;-)

Another possible hack would be to annotate amd_numa_init() with __attribute__((flatten)).

@nickdesaulniers

That comment is also perfectly in line with Boris' feedback on get_smp_config. So I think it makes a good rule of thumb for the kernel: "if you have static inline functions called from __init and non-__init functions, you probably want __always_inline rather than inline."

If marking all callees recursively __always_inline is too painful, __attribute__((flatten)) can help, but boy it sure is an awfully big hammer. I suspect we don't need to completely flatten all callees into amd_numa_init.

I am a bit skeptical about whether that comment still applies though. This was added in 2013, and the compilers were quite a bit different back then. I would not be surprised if whatever bug happened back than was completely unreproducible with gcc-4.9 and higher.

Are you referring to a different comment in the same source file, or the one I posted? Because the comment I linked to looks totally reasonable to me, and I don't see how that would have changed in compilers since 2013.

@nickdesaulniers

nickdesaulniers commented Mar 25, 2021

> That the definition in include/asm-generic/bitops/instrumented-non-atomic.h triggered this, while other definitions of test_bit did not, seems like a smoking gun for one particular config.
> We should find out what that is, then see if we still get the warning for a different definition of test_bit with allyesconfig minus that config.

allyesconfig minus CONFIG_UBSAN_OBJECT_SIZE fixes most of them (except for test_bit), so Arnd's findings related to CONFIG_UBSAN_OBJECT_SIZE=y are definitely related to these. Perhaps focusing on addressing that in LLVM is better than modifying the constant propagation replacement to resolve this set of warnings.

I do think that, when sending kernel maintainers patches, it would be clearer to specify precisely which configs trigger this (rather than using vaguer language like "clang sometimes decides not to inline"). Consider:

For test_bit, there are 38 local copies made for x86 allyesconfig minus CONFIG_UBSAN_OBJECT_SIZE; they look huge due to setup and calls to __sanitizer_cov_trace_pc, __kasan_check_read, __sanitizer_cov_trace_const_cmp1, and __ubsan_handle_load_invalid_value. So the response to "how could test_bit not be inlined, it's so tiny!?" is "yeah, but CONFIG_KCOV=y (__sanitizer_cov_trace_pc, __sanitizer_cov_trace_const_cmp1), CONFIG_KASAN=y (__kasan_check_read), and CONFIG_UBSAN_SANITIZE_ALL=y (__ubsan_handle_load_invalid_value) have conspired to mess that up." The code coverage runtime calls look the most expensive, but it's actually all three that need to be disabled for test_bit to be inlined into the relevant caller.

AMD_NUMA+KASAN alone (on top of defconfig) are enough to trigger the warnings from test_bit. (It's always the sanitizers).

I also wonder why these warnings get printed twice? Does modpost have a bug where it's accidentally doing double the work?

@nickdesaulniers

nickdesaulniers commented Mar 25, 2021

A quick hack to LLVM to run LowerConstantIntrinsicsPass before inlining (really, at an arbitrary point, but still before inlining):

EDIT this diff is now highly outdated...

diff --git a/llvm/lib/Passes/PassBuilder.cpp b/llvm/lib/Passes/PassBuilder.cpp
index 52f7026c40eb..a41bf6419cd4 100644
--- a/llvm/lib/Passes/PassBuilder.cpp
+++ b/llvm/lib/Passes/PassBuilder.cpp
@@ -666,6 +666,7 @@ PassBuilder::buildFunctionSimplificationPipeline(OptimizationLevel Level,
   // Form SSA out of local memory accesses after breaking apart aggregates into
   // scalars.
   FPM.addPass(SROA());
+  FPM.addPass(LowerConstantIntrinsicsPass());
 
   // Catch trivial redundancies
   FPM.addPass(EarlyCSEPass(true /* Enable mem-ssa. */));

Fixes all of the observed warnings in Arnd's example, and in the kernel except test_bit (sanitizers), but also produces tons of linkage errors related to failed compile-time asserts (BUILD_BUG_ON). I'm pretty sure LowerConstantIntrinsicsPass needs to run post-inlining so that __builtin_constant_p can work at all like it does in GCC; perhaps, if the whole pass cannot be rerun, we could create a new pass that handles @llvm.objectsize and runs before inlining (splitting the pass in two and moving one part before inlining). (But that seems like a lot of work when four __always_inline annotations would also solve the warning.)

@nickdesaulniers

I happened to be reading through the source of modpost (scripts/mod/modpost.c); this is exactly Pattern 5 that is whitelisted, though it relies on GCC renaming the optimized callee.

@nickdesaulniers

commit 4a3893d ("modpost: don't emit section mismatch warnings for compiler optimizations")

is what added modpost pattern 5; I'd like to see if I can dig up the lore thread around its discussion.

@nickdesaulniers

I think we should just create a compiler specific allow list in modpost for this.

@nickdesaulniers

diff --git a/scripts/mod/modpost.c b/scripts/mod/modpost.c
index 3e623ccc020b..c160f800f5e3 100644
--- a/scripts/mod/modpost.c
+++ b/scripts/mod/modpost.c
@@ -853,6 +853,27 @@ static int match(const char *sym, const char * const pat[])
        return 0;
 }
 
+struct secref_exception {
+       const char * const fromsym, * const tosym;
+};
+
+static const struct secref_exception secref_allowlist[] = {
+       { .fromsym = "__first_node", .tosym = "numa_nodes_parsed" },
+       { .fromsym = "__next_node", .tosym = "numa_nodes_parsed" },
+       { .fromsym = "__nodes_weight", .tosym = "numa_nodes_parsed" },
+       { .fromsym = "early_get_smp_config", .tosym = "x86_init" },
+       { .fromsym = "test_bit", .tosym = "numa_nodes_parsed" },
+};
+
+static int match_allowlist(const char *fromsym, const char *tosym) {
+       int i = 0, e = sizeof(secref_allowlist) / sizeof(secref_allowlist[0]);
+       for (; i != e; ++i)
+               if (!strcmp(secref_allowlist[i].fromsym, fromsym) &&
+                   !strcmp(secref_allowlist[i].tosym, tosym))
+                       return 1;
+       return 0;
+}
+
 /* sections that we do not want to do full section mismatch check on */
 static const char *const section_white_list[] =
 {
@@ -1192,6 +1213,8 @@ static const struct sectioncheck *section_mismatch(
  *   tosec   = init section
  *   fromsec = text section
  *   refsymname = *.constprop.*
+ *   LLVM will do similar constant propagation, but it will not rename the
+ *   transformed callee.
  *
  * Pattern 6:
  *   Hide section mismatch warnings for ELF local symbols.  The goal
@@ -1235,9 +1258,10 @@ static int secref_whitelist(const struct sectioncheck *mismatch,
 
        /* Check for pattern 5 */
        if (match(fromsec, text_sections) &&
-           match(tosec, init_sections) &&
-           match(fromsym, optim_symbols))
-               return 0;
+           match(tosec, init_sections))
+               if (match(fromsym, optim_symbols) ||
+                   match_allowlist(fromsym, tosym))
+                       return 0;
 
        /* Check for pattern 6 */
        if (strstarts(fromsym, ".L"))

@nickdesaulniers

nickdesaulniers commented Jul 30, 2021

@apinski-cavium in #gcc irc helped me pinpoint some relevant GCC commits
eb598ce634300e2e6633769ffda6205406169281
036546e58ac96338a9167fb1b239670fdca99ab3
9187e02deb92bfe982c845a682468fce39c8bf9b
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=44324
https://gcc.gnu.org/pipermail/gcc-patches/2010-May/284756.html

@kees

kees commented Sep 16, 2021

Is the plan to send the modpost patch or to have LLVM perform similar clone renaming?

@kees kees added the CONFIG_WERROR Has in an error with CONFIG_WERROR (all{mod,yes}config) (or emits a non-compiler warning) label Sep 17, 2021
@kees kees added Clean build Issue needs to be fixed to get a warning-free all*config build and removed CONFIG_WERROR Has in an error with CONFIG_WERROR (all{mod,yes}config) (or emits a non-compiler warning) labels Sep 22, 2021
@nickdesaulniers

Probably modpost. It's silly IMO to ask LLVM to change to some arbitrary convention that the kernel's modpost happens to rely on.

I can hack up a change quickly to LLVM, but this results in 73+ existing test failures in LLVM. I can try to minimize the checking in modpost to object files built by clang.

@nickdesaulniers nickdesaulniers added the [PATCH] Rejected The submitted patch was rejected label Sep 30, 2021
@nickdesaulniers

nickdesaulniers commented Oct 6, 2021

Jotting down my research notes from today. I ran a ~4hr creduce on the defconfig issue: https://gist.github.com/nickdesaulniers/ea307c16f747938dbfad34cab6f04a72. What's interesting about this case is that the BCP representation in IR, %1 = call i1 @llvm.is.constant.i64(i64 %0), will actually never be true, but we don't figure that out until after inlining; meanwhile, during inlining, we decided the true case was too big, so we don't inline. Inline cost analysis basically does aggressive constant propagation itself in order to recommend whether inlining would be beneficial, and it seems to be mishandling a special case.

I have a fix in llvm/lib/Analysis/InlineCost.cpp, but this bug might actually be deeper in llvm/lib/Analysis/ConstantFolding.cpp.

Basically, in the above case, we know that i64 %0 is not Constant, but instead of folding call i1 @llvm.is.constant.i64(i64 %0) to i1 0, we bail (evaluating the full size of both branches of the conditional, rather than aggressively folding the whole thing)! (If i64 %0 were Constant, then we would have folded call i1 @llvm.is.constant.i64(i64 %0) to i1 1, removing the false case and further simplifying the true case.)

Stated another way, as soon as we see a non-llvm::Constant parameter to a function call, we pessimize the CallAnalyzer's ability to simplify a call site. BUT! @llvm.is.constant.i64 is actually a special case where we can replace the call to that intrinsic with 0. CallAnalyzer::simplifyCallSite or even ConstantFoldCall should be able to fold a call to @llvm.is.constant.i64 when its argument isn't dependent on a parameter of the Function parent of the CallInst.

That fixes the issue observed for the defconfig, but not the allconfig. So there may be more than one bug here; perhaps object size is related for allyesconfig or has a similar issue. Or my WIP local patch may have a bug; though I don't regress any existing LLVM tests. I will clean up the patch and post it tomorrow.

@nickdesaulniers

nickdesaulniers commented Oct 6, 2021

https://reviews.llvm.org/D111272 is the first part, which fixes the unusually high cost seen in the defconfigs.

I'm now investigating the issues from the all*configs; I don't see any unusually high "costs" preventing inlining. Instead, I'm seeing negative costs in defconfigs vs. just-over-threshold costs in allmodconfig. Let me try to see what can cause the costs to become negative, since that's a little curious. That might indicate whether something is missing in the all*configs.

Absurdly to the point of comedy, the mere act of questioning "why was this inlined or not" (-Rpass-missed=inline) changes the cost model. WTF!

nickdesaulniers added a commit to llvm/llvm-project that referenced this issue Oct 8, 2021
llvm.is.constant* intrinsics are evaluated to 0 or 1 integral values.

A common use case for llvm.is.constant comes from the higher level
__builtin_constant_p. A common usage pattern of __builtin_constant_p in
the Linux kernel is:

    void foo (int bar) {
      if (__builtin_constant_p(bar)) {
        // lots of code that will fold away to a constant.
      } else {
        // a little bit of code, usually a libcall.
      }
    }

A minor issue in InlineCost calculations is when `bar` is _not_ Constant
and still will not be after inlining, we don't discount the true branch
and the inline cost of `foo` ends up being the cost of both branches
together, rather than just the false branch.

This leads to code like the above where inlining will not help prove bar
Constant, but it still would be beneficial to inline foo, because the
"true" branch is irrelevant from a cost perspective.

For example, IPSCCP can sink a passed constant argument to foo:

    const int x = 42;
    void bar (void) { foo(x); }

This improves our inlining decisions, and fixes a few head scratching
cases where the disassembly shows a relatively small `foo` not inlined
into a lone caller.

We could further improve this modeling by tracking whether the argument
to llvm.is.constant* is a parameter of the function, and if inlining
would allow that parameter to become Constant. This idea is noted in a
FIXME comment.

Link: ClangBuiltLinux/linux#1302

Reviewed By: kazu

Differential Revision: https://reviews.llvm.org/D111272
@nickdesaulniers

Before this week's changes, we had warnings from:

  1. test_bit (allmodconfig)
  2. __nodes_weight (allmodconfig, defconfig)
  3. early_get_smp_config (allmodconfig)
  4. __first_node (allmodconfig)
  5. __next_node (allmodconfig)

Part 1 has landed. This eliminates the issues with defconfig, but does nothing for allmodconfig.

I have some ideas for making the analysis even more precise, it just involves a lot more work at compile time. ("Will inlining change what __builtin_constant_p evaluates to?" rather than "__builtin_constant_p is either 1 or 0, but not both/a runtime call").

The second issue I see is that calls to @llvm.objectsize. (UBSAN) aren't being modeled/simplified either: https://reviews.llvm.org/D111456 (i.e., no amount of inlining would change that call; it's not a runtime call).

That eliminates 2, 3, and 4 from allmodconfig.

This leaves us with two cases from allmodconfig: test_bit and __next_node. It looks like they each have one caller, so they should be getting a severe discount toward inline cost, but they're not; it seems to have something to do with block frequency and the inliner thinking the BBs the calls exist in are cold. I suspect that all the handling for KASAN is messing up the block frequency calculations (e.g., the KASAN runtime handlers having the same expected frequency even though they should probably never run except in exceptional cases).

@bwendling

Slight ping, as I'm running into this situation right now. Is there a status update on the various patches?

github-actions bot pushed a commit to tstellar/llvm-project that referenced this issue Jan 7, 2022
tstellar pushed a commit to llvm/llvm-project that referenced this issue Jan 12, 2022
@nickdesaulniers

note that -fsanitize=object-size was removed in
commit 69d0db0 ("ubsan: remove CONFIG_UBSAN_OBJECT_SIZE")
so after that https://reviews.llvm.org/D111456 might be less pressing.

s194604 pushed a commit to s194604/patmos-llvm-project that referenced this issue Apr 10, 2022
bryanpkc pushed a commit to flang-compiler/classic-flang-llvm-project that referenced this issue Apr 20, 2022
@nickdesaulniers

https://reviews.llvm.org/rG9697f93587f46300814f1c6c68af347441d6e05d is what ultimately landed in LLVM; it allowed us to get more aggressive about inlining callees containing calls to __builtin_constant_p.

https://reviews.llvm.org/D111456 (just pinged Kazu) may still be interesting if this comes up again, but I am closing this out now. We no longer have any instances of section mismatch warnings for x86_64 allmodconfig due to inlining decisions.

@nickdesaulniers nickdesaulniers added [FIXED][LLVM] 14 This bug was fixed in LLVM 14.x [PATCH] Bitrot Patch is outdated and needs to be refreshed or revisited and removed [PATCH] Submitted A patch has been submitted for review [PATCH] Rejected The submitted patch was rejected labels Sep 10, 2022
mem-frob pushed a commit to draperlaboratory/hope-llvm-project that referenced this issue Oct 7, 2022
nickdesaulniers added a commit to llvm/llvm-project that referenced this issue Jan 24, 2023
Very similar to https://reviews.llvm.org/D111272. We very often can
evaluate calls to llvm.objectsize.* regardless of inlining. Don't count
calls to llvm.objectsize.* against the InlineCost when we can evaluate
the call to a constant.

Link: ClangBuiltLinux/linux#1302

Reviewed By: manojgupta

Differential Revision: https://reviews.llvm.org/D111456
CarlosAlbertoEnciso pushed a commit to SNSystems/llvm-debuginfo-analyzer that referenced this issue Jan 25, 2023
veselypeta pushed a commit to veselypeta/cherillvm that referenced this issue Jun 17, 2024
arichardson pushed a commit to arichardson/llvm-project that referenced this issue Nov 17, 2024