issue with volatile? #1566
I'll see if I can reduce the set of Kconfig changes on top of |
$ make -skj"$(nproc)" LLVM=1 defconfig
$ scripts/config -d BRANCH_PROFILE_NONE -e PROFILE_ANNOTATED_BRANCHES -d GENERIC_CPU -e MPSC
$ make -skj"$(nproc)" LLVM=1 olddefconfig arch/x86/kernel/traps.o
arch/x86/kernel/traps.o: warning: objtool: handle_xfd_event()+0xb0: unreachable instruction
$ git diff --no-index ../defconfig .config
diff --git a/../defconfig b/.config
index 7763c13a14f8..d650c3511a9e 100644
--- a/../defconfig
+++ b/.config
@@ -331,12 +331,13 @@ CONFIG_X86_SUPPORTS_MEMORY_FAILURE=y
CONFIG_SCHED_OMIT_FRAME_POINTER=y
# CONFIG_HYPERVISOR_GUEST is not set
# CONFIG_MK8 is not set
-# CONFIG_MPSC is not set
+CONFIG_MPSC=y
# CONFIG_MCORE2 is not set
# CONFIG_MATOM is not set
-CONFIG_GENERIC_CPU=y
-CONFIG_X86_INTERNODE_CACHE_SHIFT=6
-CONFIG_X86_L1_CACHE_SHIFT=6
+# CONFIG_GENERIC_CPU is not set
+CONFIG_X86_INTERNODE_CACHE_SHIFT=7
+CONFIG_X86_L1_CACHE_SHIFT=7
+CONFIG_X86_P6_NOP=y
CONFIG_X86_TSC=y
CONFIG_X86_CMPXCHG64=y
CONFIG_X86_CMOV=y
@@ -4728,9 +4729,11 @@ CONFIG_FTRACE=y
# CONFIG_MMIOTRACE is not set
# CONFIG_FTRACE_SYSCALLS is not set
# CONFIG_TRACER_SNAPSHOT is not set
-CONFIG_BRANCH_PROFILE_NONE=y
-# CONFIG_PROFILE_ANNOTATED_BRANCHES is not set
+CONFIG_TRACE_BRANCH_PROFILING=y
+# CONFIG_BRANCH_PROFILE_NONE is not set
+CONFIG_PROFILE_ANNOTATED_BRANCHES=y
# CONFIG_PROFILE_ALL_BRANCHES is not set
+# CONFIG_BRANCH_TRACER is not set
CONFIG_BLK_DEV_IO_TRACE=y
CONFIG_KPROBE_EVENTS=y
CONFIG_UPROBE_EVENTS=y |
When run through
|
The disassembly is literally the same between
$ clang -O2 -c traps.i -w
$ llvm-objdump -drj .discard.reachable traps.o
traps.o: file format elf64-x86-64
Disassembly of section .discard.reachable:
0000000000000000 <.discard.reachable>:
0: 00 00 addb %al, (%rax)
0000000000000000: R_X86_64_PC32 .text+0x2
2: 00 00 addb %al, (%rax)
$ clang -O2 -c traps.i -w -march=nocona
$ llvm-objdump -drj .discard.reachable traps.o
traps.o: file format elf64-x86-64
Disassembly of section .discard.reachable:
0000000000000000 <.discard.reachable>:
0: 00 00 addb %al, (%rax)
0000000000000000: R_X86_64_PC32 .text+0x4
2: 00 00 addb %al, (%rax)
$ llvm-objdump -dr traps.o
traps.o: file format elf64-x86-64
Disassembly of section .text:
0000000000000000 <handle_xfd_event>:
0: 0f 0b ud2
2: 31 c0 xorl %eax, %eax
4: e9 00 00 00 00 jmp 0x9 <handle_xfd_event+0x9>
0000000000000005: R_X86_64_PLT32 arch_local_irq_disable-0x4
Note: gcc generates the former, not the latter.
Dumping the LLVM IR from either, then running that through
So having a schedule, we seem to be moving the setting of return value above the second asm statement.
So I think there was a mixup between
diff --git a/include/linux/compiler.h b/include/linux/compiler.h
index 429dcebe2b99..af287f92550f 100644
--- a/include/linux/compiler.h
+++ b/include/linux/compiler.h
@@ -118,18 +118,18 @@ void ftrace_likely_update(struct ftrace_likely_data *f, int val,
#define __stringify_label(n) #n
#define __annotate_reachable(c) ({ \
- asm volatile(__stringify_label(c) ":\n\t" \
+ asm (__stringify_label(c) ":\n\t" \
".pushsection .discard.reachable\n\t" \
".long " __stringify_label(c) "b - .\n\t" \
- ".popsection\n\t" : : "i" (c)); \
+ ".popsection\n\t" ::: "memory"); \
})
#define annotate_reachable() __annotate_reachable(__COUNTER__)
#define __annotate_unreachable(c) ({ \
- asm volatile(__stringify_label(c) ":\n\t" \
+ asm (__stringify_label(c) ":\n\t" \
".pushsection .discard.unreachable\n\t" \
".long " __stringify_label(c) "b - .\n\t" \
- ".popsection\n\t" : : "i" (c)); \
+ ".popsection\n\t" ::: "memory"); \
})
#define annotate_unreachable() __annotate_unreachable(__COUNTER__)
Though reading the commit message of d0c2e69, it sounds like
commit 36cd505d1917eca7d56bb2382b306556dcb9e192 (HEAD -> master)
Author: Nick Desaulniers <ndesaulniers@google.com>
Date: Wed Jan 12 16:16:33 2022 -0800
compiler.h: prefer memory clobber & %= to volatile & __COUNTER__
commit dcce50e6cc4d ("compiler.h: Fix annotation macro misplacement with Clang")
mentions:
> 'volatile' is ignored for some reason and Clang feels free to move the
> reachable() annotation away from its intended location.
Indeed, volatile is not a compiler barrier. Particularly once `-march=`
flags are used under certain configs, LLVM's machine scheduler can be
observed moving instructions across the asm statement meant to point to
known reachable or unreachable code, as reported by 0day bot.
Prefer a memory clobber which is a compiler barrier that prevents these
re-orderings and remove the volatile qualifier.
Looking closer, __COUNTER__ seems to have been used to
prevent de-duplication of these asm statements. The GCC manual mentions:
> Under certain circumstances, GCC may duplicate (or remove duplicates
> of) your assembly code when optimizing. This can lead to unexpected
> duplicate symbol errors during compilation if your asm code defines
> symbols or labels. Using ‘%=’ (see AssemblerTemplate) may help resolve
> this problem.
>
> ‘%=’ Outputs a number that is unique to each instance of the asm
> statement in the entire compilation. This option is useful when
> creating local labels and referring to them multiple times in a single
> template that generates multiple assembler instructions.
commit 3d1e236022cc ("objtool: Prevent GCC from merging annotate_unreachable()")
mentions that:
> The inline asm ‘%=’ token could be used for that, but unfortunately
> older versions of GCC don't support it.
From testing all versions of GCC available on godbolt.org, GCC 4.1+
seems to support %=. Since the minimum supported version of GCC at the
moment is GCC 5.1, it sounds like this is no longer a concern.
Prefer the %= assembler template to having to stringify __COUNTER__.
This commit is effectively a revert of the following commits:
commit dcce50e6cc4d ("compiler.h: Fix annotation macro misplacement with Clang")
commit f1069a8756b9 ("compiler.h: Avoid using inline asm operand modifiers")
commit d0c2e691d1cb ("objtool: Add a comment for the unreachable annotation macros")
commit ec1e1b610917 ("objtool: Prevent GCC from merging annotate_unreachable(), take 2")
commit 3d1e236022cc ("objtool: Prevent GCC from merging annotate_unreachable()")
Link: https://github.com/ClangBuiltLinux/linux/issues/1566
Link: https://gcc.gnu.org/onlinedocs/gcc/Extended-Asm.html#Volatile
Link: https://gcc.gnu.org/onlinedocs/gcc/Extended-Asm.html#AssemblerTemplate
Link: https://lore.kernel.org/llvm/202112080834.XFYU8b5Q-lkp@intel.com/
Link: https://lore.kernel.org/llvm/202111300857.IyINAyJk-lkp@intel.com/
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Nick Desaulniers <ndesaulniers@google.com>
diff --git a/include/linux/compiler.h b/include/linux/compiler.h
index 429dcebe2b99..b87f841d0c10 100644
--- a/include/linux/compiler.h
+++ b/include/linux/compiler.h
@@ -108,30 +108,19 @@ void ftrace_likely_update(struct ftrace_likely_data *f, int val,
# define barrier_before_unreachable() do { } while (0)
#endif
-/* Unreachable code */
+/* These macros help objtool understand GCC code flow for unreachable code. */
#ifdef CONFIG_STACK_VALIDATION
-/*
- * These macros help objtool understand GCC code flow for unreachable code.
- * The __COUNTER__ based labels are a hack to make each instance of the macros
- * unique, to convince GCC not to merge duplicate inline asm statements.
- */
-#define __stringify_label(n) #n
-
-#define __annotate_reachable(c) ({ \
- asm volatile(__stringify_label(c) ":\n\t" \
- ".pushsection .discard.reachable\n\t" \
- ".long " __stringify_label(c) "b - .\n\t" \
- ".popsection\n\t" : : "i" (c)); \
-})
-#define annotate_reachable() __annotate_reachable(__COUNTER__)
-
-#define __annotate_unreachable(c) ({ \
- asm volatile(__stringify_label(c) ":\n\t" \
- ".pushsection .discard.unreachable\n\t" \
- ".long " __stringify_label(c) "b - .\n\t" \
- ".popsection\n\t" : : "i" (c)); \
-})
-#define annotate_unreachable() __annotate_unreachable(__COUNTER__)
+#define annotate_reachable() \
+ asm (".Lreachable%=:\n\t" \
+ ".pushsection .discard.reachable\n\t" \
+ ".long .Lreachable%= - .\n\t" \
+ ".popsection\n\t" ::: "memory");
+
+#define annotate_unreachable() \
+ asm (".Lunreachable%=:\n\t" \
+ ".pushsection .discard.unreachable\n\t" \
+ ".long .Lunreachable%= - .\n\t" \
+ ".popsection\n\t" ::: "memory");
#define ASM_UNREACHABLE \
"999:\n\t" \
I will test this more before sending. I probably want to do this to |
Thinking about this more, this isn't the only place where the kernel relies on a label in inline asm referring to the C statements that follow the asm. Pretty sure the exception fixup stuff works this way, too. Hmm. EDIT: at least that uses asm goto to actually inform the compiler about possible control flow. Here we have a label in inline asm hoping to refer to the C code after it, but the compiler CAN reorder instructions across asm statements as long as the expressed constraints are satisfied. |
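The pattern this comment worries about can be sketched in user space. This is a minimal illustration, not the kernel macro: `annotate_site()`, `bump()`, `demo_sum()` and the `demo_sites` section name are hypothetical, and it assumes GCC or Clang targeting ELF. It drops a `%=`-unique local label where the asm lands and records the label's offset in a side section. Nothing in the constraints ties that label to the C statement that follows it; the "memory" clobber only constrains memory accesses.

```c
#include <assert.h>

/*
 * Hypothetical user-space stand-in for annotate_reachable(): drop a local
 * label where the asm lands and record "label - ." in a side section.
 * %= expands to a number unique to each emitted copy of the asm, so the
 * label stays unique even if the compiler inlines or duplicates it.
 * "demo_sites" is an illustrative section name (assumes an ELF target).
 */
#define annotate_site()                                  \
	__asm__ (".Lsite%=:\n\t"                         \
		 ".pushsection demo_sites,\"a\"\n\t"     \
		 ".long .Lsite%= - .\n\t"                \
		 ".popsection\n\t" ::: "memory")

/* May be inlined at each call site; every copy gets its own label. */
static inline int bump(int x)
{
	annotate_site();
	return x + 1;
}

/* Two expansions in one translation unit assemble without label clashes. */
int demo_sum(void)
{
	int total = bump(1) + bump(2);

	assert(total == 5);	/* (1 + 1) + (2 + 1) */
	return total;
}
```

Running `objdump -dr` on the resulting object shows where each recorded offset points, which is how the mismatch in the disassembly earlier in this thread was spotted.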
filed https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104236 for |
There is no GCC bug here, rather Linux inline-asm is again broken even by the documentation which has not changed in years. |
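For contrast with the label-only annotations discussed above, `asm goto` is the documented way to tell the compiler that control may leave an asm statement for a C label. The sketch below is an illustration under assumptions: `is_nonzero()` and `is_nonzero_demo()` are hypothetical names, the per-architecture instructions are one possible choice, and a compiler with `asm goto` support (GCC, or a recent Clang) is assumed.

```c
#include <assert.h>

/*
 * Returns 1 when x is nonzero, 0 otherwise. The branch is taken inside the
 * asm, and the label list after the fourth colon tells the compiler that
 * control may continue at "nonzero". Targets without a case below use a
 * plain C fallback.
 */
int is_nonzero(int x)
{
#if defined(__x86_64__) || defined(__i386__)
	__asm__ goto("testl %0, %0\n\t"
		     "jnz %l[nonzero]"
		     : /* no outputs */
		     : "r"(x)
		     : "cc"
		     : nonzero);
#elif defined(__aarch64__)
	__asm__ goto("cbnz %w0, %l[nonzero]"
		     : /* no outputs */
		     : "r"(x)
		     : /* no clobbers */
		     : nonzero);
#else
	if (x)
		goto nonzero;
#endif
	return 0;
nonzero:
	return 1;
}

/* Usage: both outcomes are reachable. */
void is_nonzero_demo(void)
{
	assert(is_nonzero(5) == 1);
	assert(is_nonzero(0) == 0);
}
```

Because the label is an operand, the compiler knows about the extra edge in the control flow graph. The annotation macros in this issue have no such operand, so the compiler has no record that the asm is meant to mark a particular instruction.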
testing v2: https://gist.github.com/nickdesaulniers/d4ad556d2eec986698ab6313e3101c83 (on top of a revert of dcce50e) |
I'm not ruling out the presence of compiler bugs, but there is no bug in
I sent a patch to diagnose when I started a thread on adding some kernel documentation; it's not been shown that there's a compiler barrier that can prevent all such instruction reordering. I'm not sure yet personally whether @nathanchance made the comment on IRC:
So I think I should still pursue the documentation patches, and there's perhaps more cleanup or refactoring work to be done, but I'm going to close this out. Happy to continue the discussion here though. |
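To make the "expressed constraints" point concrete, here is a small sketch of what two common constraint forms do and do not promise. It is not a fix proposed in this thread: `pin_value()`, `shared_counter` and `demo_ordering()` are hypothetical names, and GCC or Clang is assumed. A "memory" clobber orders memory accesses around the asm, while a register-only computation has no memory dependency and is only tied to an asm if it appears as an operand.

```c
#include <assert.h>

/* Compiler barrier: memory accesses may not be moved or cached across it. */
#define barrier() __asm__ __volatile__("" ::: "memory")

/*
 * Hypothetical helper: the "+r" constraint makes the (empty) asm appear to
 * read and rewrite val, so the computation of val must finish before the
 * asm and every later use of the result comes after it.
 */
#define pin_value(val) __asm__ __volatile__("" : "+r"(val))

int shared_counter;	/* lives in memory, so barrier() orders accesses to it */

int demo_ordering(int input)
{
	int ret = input * 2;	/* register-only: no memory dependency */

	shared_counter = input;
	barrier();		/* the store above stays before this point */
	pin_value(ret);		/* ties ret to an asm via an explicit operand */

	assert(ret == input * 2);	/* the empty asm leaves the value intact */
	return ret + shared_counter;
}
```

Whether either form is enough for the objtool annotations is the question this thread leaves open; the sketch only shows which dependencies the compiler is actually told about.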
Boris cc'ed us on an x86 change (now in mainline as dcce50e) from @jpoimboe. The commit message sounds ominous about `volatile` qualified `asm`.
It mentions CONFIG_TRACE_BRANCH_PROFILING. If I revert dcce50e though, defconfig+CONFIG_TRACE_BRANCH_PROFILING=y isn't enough to repro for me in ToT clang (clang-14) or clang-11.
Looking at the archives for handle_xfd_event, it looks like there are two 0day bot reports. Both are randconfig builds. If I take the first config, then revert dcce50e, I can reproduce the flood of objtool warnings.
I didn't make much progress today, but I'm curious: