9934a5b2ed5aa6e6bbb2e55c3cd98839722c226e breaks x86_64 ThinLTO #1440

Closed
samitolvanen opened this issue Aug 18, 2021 · 16 comments
Labels
[ARCH] x86_64 This bug impacts ARCH=x86_64 [BUG] llvm (main) A bug in an unreleased version of LLVM (this label is appropriate for regressions) [FEATURE] LTO Related to building the kernel with LLVM Link Time Optimization [FIXED][LLVM] 14 This bug was fixed in LLVM 14.x

Comments

@samitolvanen
Member

samitolvanen commented Aug 18, 2021

Starting with LLVM commit 9934a5b2ed5aa6e6bbb2e55c3cd98839722c226e ("[CVP] processSwitch: Remove default case when switch cover all possible values."), ToT x86_64 defconfig + ThinLTO fails to boot:

[    0.574465] ------------[ cut here ]------------
[    0.575073] WARNING: CPU: 0 PID: 1 at arch/x86/events/intel/core.c:4318 intel_pmu_cpu_starting+0x32e/0x3c0
[    0.575411] Modules linked in:
[    0.575824] CPU: 0 PID: 1 Comm: swapper/0 Tainted: G        W         5.14.0-rc6+ #1
[    0.576410] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.12.0-59-gc9ba5276e321-prebuilt.qemu.org 04/01/2014
[    0.577411] RIP: 0010:intel_pmu_cpu_starting+0x32e/0x3c0
[    0.578121] Code: f7 dd 89 d8 48 8b 4c 24 08 f0 48 0f ab 84 29 38 01 00 00 48 8b 3c 24 49 89 bf 40 33 01 00 89 de e8 07 91 ff ff e9 fa fc ff ff <0f> 0b 49 c7 87 40 33 01 00 00 00 00 00 e9 e5 fe ff ff 41 8b b5 1c
[    0.578410] RSP: 0000:ffffb49b80017bb0 EFLAGS: 00010246
[    0.579108] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000
[    0.579410] RDX: 0000000000000068 RSI: 0000000000000000 RDI: 0000000000000000
[    0.580369] RBP: ffff9cd2dd200000 R08: 0000000000000000 R09: 0000000000000001
[    0.580411] R10: ffffffffb1e8a720 R11: 0000000000000000 R12: 0000000000000000
[    0.581361] R13: 0000000000000000 R14: 0000000000000000 R15: ffff9cd2dd200000
[    0.581411] FS:  0000000000000000(0000) GS:ffff9cd2dd200000(0000) knlGS:0000000000000000
[    0.582410] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[    0.583165] CR2: ffff9cd2ce201000 CR3: 000000000d222001 CR4: 0000000000770ff0
[    0.583411] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[    0.584344] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[    0.584410] PKRU: 55555554
[    0.584775] Call Trace:
[    0.585105]  ? x86_pmu_dead_cpu+0x20/0x20
[    0.585411]  ? x86_pmu_starting_cpu+0x11/0x20
[    0.585988]  ? cpuhp_invoke_callback+0x125/0x300
[    0.586411]  ? lock_release+0x27/0x2a0
[    0.586911]  ? cpuhp_issue_call+0x18b/0x1c0
[    0.587411]  ? x86_pmu_dead_cpu+0x20/0x20
[    0.587952]  ? __cpuhp_setup_state_cpuslocked+0x1ec/0x2d0
[    0.588411]  ? x86_pmu_starting_cpu+0x20/0x20
[    0.588989]  ? x86_pmu_dead_cpu+0x20/0x20
[    0.589410]  ? x86_pmu_starting_cpu+0x20/0x20
[    0.589988]  ? __cpuhp_setup_state+0x36/0x50
[    0.590411]  ? map_vsyscall+0x61/0x61
[    0.590899]  ? init_hw_perf_events+0x43a/0x61a
[    0.591411]  ? map_vsyscall+0x61/0x61
[    0.591902]  ? do_one_initcall+0xc3/0x1e0
[    0.592410]  ? trace_rcu_dyntick+0x28/0xb0
[    0.592958]  ? rcu_nmi_exit+0x91/0xb0
[    0.593411]  ? rcu_read_lock_sched_held+0x18/0x90
[    0.594036]  ? _raw_spin_unlock_irqrestore+0x3f/0x90
[    0.594411]  ? trace_kfree+0x28/0xb0
[    0.594893]  ? kfree+0x2c/0x190
[    0.595324]  ? rcu_read_lock_sched_held+0x18/0x90
[    0.595411]  ? rcu_read_lock_sched_held+0x18/0x90
[    0.596035]  ? lock_acquire+0x36/0x1b0
[    0.596411]  ? trace_lock_release+0x28/0xb0
[    0.596969]  ? lock_release+0x27/0x2a0
[    0.597411]  ? _raw_write_unlock+0x1a/0x30
[    0.597956]  ? proc_register+0x112/0x1b0
[    0.598411]  ? do_pre_smp_initcalls+0x36/0x49
[    0.598993]  ? kernel_init+0x11/0x1a0
[    0.599410]  ? kernel_init_freeable+0x152/0x1d7
[    0.600015]  ? rest_init+0x1e0/0x1e0
[    0.600411]  ? kernel_init+0x11/0x1a0
[    0.600903]  ? ret_from_fork+0x22/0x30
[    0.601406] irq event stamp: 0
[    0.601820] hardirqs last  enabled at (0): [<0000000000000000>] 0x0
[    0.602410] hardirqs last disabled at (0): [<ffffffffb1e840e8>] copy_process+0x6e8/0x1310
[    0.603410] softirqs last  enabled at (0): [<ffffffffb1e840f3>] copy_process+0x6f3/0x1310
[    0.604410] softirqs last disabled at (0): [<0000000000000000>] 0x0
[    0.605236] ---[ end trace 8cc507e855b2c7e9 ]---
[    0.605472] rcu: Hierarchical SRCU implementation.
[    0.606365] ------------[ cut here ]------------
[    0.606414] WARNING: CPU: 0 PID: 1 at kernel/jump_label.c:845 jump_label_test+0xcb/0xe3
[    0.607411] Modules linked in:
[    0.607820] CPU: 0 PID: 1 Comm: swapper/0 Tainted: G        W         5.14.0-rc6+ #1
[    0.608410] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.12.0-59-gc9ba5276e321-prebuilt.qemu.org 04/01/2014
[    0.609413] RIP: 0010:jump_label_test+0xcb/0xe3
[    0.610023] Code: 09 eb 2f fe f6 c3 01 bb 00 00 00 00 0f 85 50 ff ff ff eb 2c 0f 0b e9 54 ff ff ff 0f 0b e9 5a ff ff ff 0f 0b eb 9d 0f 0b eb a2 <0f> 0b e9 50 ff ff ff 0f 0b e9 56 ff ff ff 0f 0b eb 96 0f 0b eb 94
[    0.610411] RSP: 0000:ffffb49b80017d58 EFLAGS: 00010246
[    0.611110] RAX: 0000000000000001 RBX: ffffffffb3c68701 RCX: ffffffffb3968300
[    0.611410] RDX: 0000000000000001 RSI: 0000000000000000 RDI: ffffffffb3b89a8a
[    0.612358] RBP: ffffb49b80017ed8 R08: 0000000000000000 R09: ffff9cd2c1198000
[    0.612410] R10: ffffffffb1eb6f2a R11: ffffffffb3b89a8a R12: 0000000000000000
[    0.613347] R13: ffffffffb382b7f8 R14: ffffffffb3c68764 R15: ffffffffb3b89a8a
[    0.613411] FS:  0000000000000000(0000) GS:ffff9cd2dd200000(0000) knlGS:0000000000000000
[    0.614410] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[    0.615171] CR2: ffff9cd2ce201000 CR3: 000000000d222001 CR4: 0000000000770ff0
[    0.615412] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[    0.616345] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[    0.616410] PKRU: 55555554
[    0.616775] Call Trace:
[    0.617106]  ? __initstub__kmod_jump_label__238_867_jump_label_testearly+0x5/0x8
[    0.617410]  ? do_one_initcall+0xc3/0x1e0
[    0.617941]  ? trace_rcu_dyntick+0x28/0xb0
[    0.618411]  ? rcu_read_lock_sched_held+0x18/0x90
[    0.619033]  ? _raw_spin_unlock_irqrestore+0x3f/0x90
[    0.619410]  ? trace_kfree+0x28/0xb0
[    0.619887]  ? kfree+0x2c/0x190
[    0.620305]  ? rcu_read_lock_sched_held+0x18/0x90
[    0.620411]  ? rcu_read_lock_sched_held+0x18/0x90
[    0.621031]  ? lock_acquire+0x36/0x1b0
[    0.621410]  ? trace_lock_release+0x28/0xb0
[    0.621970]  ? lock_release+0x27/0x2a0
[    0.622411]  ? _raw_write_unlock+0x1a/0x30
[    0.622953]  ? proc_register+0x112/0x1b0
[    0.623411]  ? do_pre_smp_initcalls+0x36/0x49
[    0.623994]  ? kernel_init+0x11/0x1a0
[    0.624410]  ? kernel_init_freeable+0x152/0x1d7
[    0.625007]  ? rest_init+0x1e0/0x1e0
[    0.625411]  ? kernel_init+0x11/0x1a0
[    0.625897]  ? ret_from_fork+0x22/0x30
[    0.626391] irq event stamp: 0
[    0.626410] hardirqs last  enabled at (0): [<0000000000000000>] 0x0
[    0.627238] hardirqs last disabled at (0): [<ffffffffb1e840e8>] copy_process+0x6e8/0x1310
[    0.627410] softirqs last  enabled at (0): [<ffffffffb1e840f3>] copy_process+0x6f3/0x1310
[    0.628410] softirqs last disabled at (0): [<0000000000000000>] 0x0
[    0.629233] ---[ end trace 8cc507e855b2c7ea ]---
[    0.629417] ------------[ cut here ]------------
[    0.630023] WARNING: CPU: 0 PID: 1 at kernel/jump_label.c:848 jump_label_test+0xd2/0xe3
[    0.630410] Modules linked in:
[    0.630823] CPU: 0 PID: 1 Comm: swapper/0 Tainted: G        W         5.14.0-rc6+ #1
[    0.631410] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.12.0-59-gc9ba5276e321-prebuilt.qemu.org 04/01/2014
[    0.632410] RIP: 0010:jump_label_test+0xd2/0xe3
[    0.633018] Code: bb 00 00 00 00 0f 85 50 ff ff ff eb 2c 0f 0b e9 54 ff ff ff 0f 0b e9 5a ff ff ff 0f 0b eb 9d 0f 0b eb a2 0f 0b e9 50 ff ff ff <0f> 0b e9 56 ff ff ff 0f 0b eb 96 0f 0b eb 94 5b c3 e8 03 00 00 00
[    0.633410] RSP: 0000:ffffb49b80017d58 EFLAGS: 00010246
[    0.634112] RAX: 0000000000000001 RBX: ffffffffb3c68701 RCX: ffffffffb3968300
[    0.634410] RDX: 0000000000000001 RSI: 0000000000000000 RDI: ffffffffb3b89a8a
[    0.635357] RBP: ffffb49b80017ed8 R08: 0000000000000000 R09: ffff9cd2c1198000
[    0.635410] R10: ffffffffb1eb6f2a R11: ffffffffb3b89a8a R12: 0000000000000000
[    0.636356] R13: ffffffffb382b7f8 R14: ffffffffb3c68764 R15: ffffffffb3b89a8a
[    0.636410] FS:  0000000000000000(0000) GS:ffff9cd2dd200000(0000) knlGS:0000000000000000
[    0.637410] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[    0.638175] CR2: ffff9cd2ce201000 CR3: 000000000d222001 CR4: 0000000000770ff0
[    0.638411] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[    0.639398] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[    0.639410] PKRU: 55555554
[    0.639773] Call Trace:
[    0.640100]  ? __initstub__kmod_jump_label__238_867_jump_label_testearly+0x5/0x8
[    0.640410]  ? do_one_initcall+0xc3/0x1e0
[    0.640940]  ? trace_rcu_dyntick+0x28/0xb0
[    0.641411]  ? rcu_read_lock_sched_held+0x18/0x90
[    0.642029]  ? _raw_spin_unlock_irqrestore+0x3f/0x90
[    0.642410]  ? trace_kfree+0x28/0xb0
[    0.642884]  ? kfree+0x2c/0x190
[    0.643303]  ? rcu_read_lock_sched_held+0x18/0x90
[    0.643411]  ? rcu_read_lock_sched_held+0x18/0x90
[    0.644031]  ? lock_acquire+0x36/0x1b0
[    0.644411]  ? trace_lock_release+0x28/0xb0
[    0.644998]  ? lock_release+0x27/0x2a0
[    0.645411]  ? _raw_write_unlock+0x1a/0x30
[    0.645952]  ? proc_register+0x112/0x1b0
[    0.646411]  ? do_pre_smp_initcalls+0x36/0x49
[    0.646986]  ? kernel_init+0x11/0x1a0
[    0.647410]  ? kernel_init_freeable+0x152/0x1d7
[    0.648051]  ? rest_init+0x1e0/0x1e0
[    0.648411]  ? kernel_init+0x11/0x1a0
[    0.648899]  ? ret_from_fork+0x22/0x30
[    0.649397] irq event stamp: 0
[    0.649410] hardirqs last  enabled at (0): [<0000000000000000>] 0x0
[    0.650231] hardirqs last disabled at (0): [<ffffffffb1e840e8>] copy_process+0x6e8/0x1310
[    0.650410] softirqs last  enabled at (0): [<ffffffffb1e840f3>] copy_process+0x6f3/0x1310
[    0.651411] softirqs last disabled at (0): [<0000000000000000>] 0x0
[    0.652238] ---[ end trace 8cc507e855b2c7eb ]---
[    0.652417] jump_label: Fatal kernel bug, unexpected op at jump_label_test+0x1d/0xe3 [ffffffffb3b89aaf] (e9 a9 00 00 00 != 0f 1f 44 00 00)) size:5 type:1
[    0.653416] ------------[ cut here ]------------
[    0.654326] kernel BUG at arch/x86/kernel/jump_label.c:73!
[    0.654420] invalid opcode: 0000 [#1] SMP PTI
[    0.655409] CPU: 0 PID: 1 Comm: swapper/0 Tainted: G        W         5.14.0-rc6+ #1
[    0.655409] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.12.0-59-gc9ba5276e321-prebuilt.qemu.org 04/01/2014
[    0.655409] RIP: 0010:__jump_label_patch+0x1ab/0x1c0
[    0.655409] Code: 5d 41 5e 41 5f 5d c3 48 c7 c7 0d 27 48 b3 4c 89 fe 4c 89 fa 4c 89 f9 49 89 e8 45 89 e1 31 c0 41 56 e8 8c fb 0c 00 48 83 c4 08 <0f> 0b e8 7e b1 e1 00 0f 0b 0f 0b 0f 0b 00 00 cc cc 00 00 cc cc 48
[    0.655409] RSP: 0000:ffffb49b80017c48 EFLAGS: 00010286
[    0.655409] RAX: 000000000000008d RBX: ffffffffb3c899c8 RCX: d29ef1c90cda3d00
[    0.655409] RDX: 4000000000000000 RSI: 0000000000000000 RDI: ffffffffb1f087da
[    0.655409] RBP: ffffffffb3575f1a R08: 0000000000000000 R09: ffff9cd2dd0a0000
[    0.655409] R10: 000000000000bffd R11: ffffb49b80017ae0 R12: 0000000000000005
[    0.655409] R13: ffffffffb3575f1a R14: 0000000000000001 R15: ffffffffb3b89aaf
[    0.655409] FS:  0000000000000000(0000) GS:ffff9cd2dd200000(0000) knlGS:0000000000000000
[    0.655409] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[    0.655409] CR2: ffff9cd2ce201000 CR3: 000000000d222001 CR4: 0000000000770ff0
[    0.655409] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[    0.655409] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[    0.655409] PKRU: 55555554
[    0.655409] Call Trace:
[    0.655409]  ? jump_label_test+0x1d/0xe3
[    0.655409]  ? jump_label_test+0x2c/0xe3
[    0.655409]  ? jump_label_test+0x22/0xe3
[    0.655409]  ? arch_jump_label_transform_queue+0x28/0x60
[    0.655409]  ? __jump_label_update+0x9c/0x160
[    0.655409]  ? __initstub__kmod_jump_label__224_774_jump_label_init_moduleearly+0x13/0x13
[    0.655409]  ? static_key_disable_cpuslocked+0x3f/0x80
[    0.655409]  ? jump_label_test+0x40/0xe3
[    0.655409]  ? __initstub__kmod_jump_label__238_867_jump_label_testearly+0x5/0x8
[    0.655409]  ? do_one_initcall+0xc3/0x1e0
[    0.655409]  ? trace_rcu_dyntick+0x28/0xb0
[    0.655409]  ? rcu_read_lock_sched_held+0x18/0x90
[    0.655409]  ? _raw_spin_unlock_irqrestore+0x3f/0x90
[    0.655409]  ? trace_kfree+0x28/0xb0
[    0.655409]  ? kfree+0x2c/0x190
[    0.655409]  ? rcu_read_lock_sched_held+0x18/0x90
[    0.655409]  ? rcu_read_lock_sched_held+0x18/0x90
[    0.655409]  ? lock_acquire+0x36/0x1b0
[    0.655409]  ? trace_lock_release+0x28/0xb0
[    0.655409]  ? lock_release+0x27/0x2a0
[    0.655409]  ? _raw_write_unlock+0x1a/0x30
[    0.655409]  ? proc_register+0x112/0x1b0
[    0.655409]  ? do_pre_smp_initcalls+0x36/0x49
[    0.655409]  ? kernel_init+0x11/0x1a0
[    0.655409]  ? kernel_init_freeable+0x152/0x1d7
[    0.655409]  ? rest_init+0x1e0/0x1e0
[    0.655409]  ? kernel_init+0x11/0x1a0
[    0.655409]  ? ret_from_fork+0x22/0x30
[    0.655409] Modules linked in:
[    0.655412] ---[ end trace 8cc507e855b2c7ec ]---
[    0.656173] RIP: 0010:__jump_label_patch+0x1ab/0x1c0
[    0.656411] Code: 5d 41 5e 41 5f 5d c3 48 c7 c7 0d 27 48 b3 4c 89 fe 4c 89 fa 4c 89 f9 49 89 e8 45 89 e1 31 c0 41 56 e8 8c fb 0c 00 48 83 c4 08 <0f> 0b e8 7e b1 e1 00 0f 0b 0f 0b 0f 0b 00 00 cc cc 00 00 cc cc 48
[    0.657411] RSP: 0000:ffffb49b80017c48 EFLAGS: 00010286
[    0.658271] RAX: 000000000000008d RBX: ffffffffb3c899c8 RCX: d29ef1c90cda3d00
[    0.658410] RDX: 4000000000000000 RSI: 0000000000000000 RDI: ffffffffb1f087da
[    0.659357] RBP: ffffffffb3575f1a R08: 0000000000000000 R09: ffff9cd2dd0a0000
[    0.659410] R10: 000000000000bffd R11: ffffb49b80017ae0 R12: 0000000000000005
[    0.660356] R13: ffffffffb3575f1a R14: 0000000000000001 R15: ffffffffb3b89aaf
[    0.660410] FS:  0000000000000000(0000) GS:ffff9cd2dd200000(0000) knlGS:0000000000000000
[    0.661410] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[    0.662172] CR2: ffff9cd2ce201000 CR3: 000000000d222001 CR4: 0000000000770ff0
[    0.662411] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[    0.663360] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[    0.663410] PKRU: 55555554
[    0.663777] Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000000b
[    0.664409] ---[ end Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000000b ]---

@nickdesaulniers @nathanchance

Edit: Updated to refer to the correct commit, which Nathan pointed out below, to avoid confusion.

@samitolvanen samitolvanen added [BUG] llvm (main) A bug in an unreleased version of LLVM (this label is appropriate for regressions) [FEATURE] LTO Related to building the kernel with LLVM Link Time Optimization [ARCH] x86_64 This bug impacts ARCH=x86_64 labels Aug 18, 2021
@nathanchance
Member

I am not sure that bisect is right. I can boot up five times in a row at 2379949aadcee8d4028dec0508f88bda290636bc. My bisection landed on llvm/llvm-project@9934a5b:

$ git bisect run ...
...
9934a5b2ed5aa6e6bbb2e55c3cd98839722c226e is the first bad commit
commit 9934a5b2ed5aa6e6bbb2e55c3cd98839722c226e
Author: Jun Ma <JunMa@linux.alibaba.com>
Date:   Thu Jul 15 18:37:40 2021 +0800

    [CVP] processSwitch: Remove default case when switch cover all possible values.

    Differential Revision: https://reviews.llvm.org/D106056

 llvm/include/llvm/Transforms/Utils/Local.h         |  5 +++
 .../Scalar/CorrelatedValuePropagation.cpp          | 27 ++++++++++++++-
 llvm/lib/Transforms/Utils/Local.cpp                | 24 ++++++++++++++
 llvm/lib/Transforms/Utils/SimplifyCFG.cpp          | 23 -------------
 .../Transforms/CorrelatedValuePropagation/basic.ll | 38 +++++++++++++++++++++-
 5 files changed, 92 insertions(+), 25 deletions(-)
bisect run success

$ git bisect log
# bad: [d9873711cb03ac7aedcaadcba42f82c66e962e6e] [GlobalISel] Add IRTranslator support for G_ISNAN
# good: [2379949aadcee8d4028dec0508f88bda290636bc] [X86] AVX512FP16 instructions enabling 3/6
git bisect start 'd9873711cb03ac7aedcaadcba42f82c66e962e6e' '2379949aadcee8d4028dec0508f88bda290636bc'
# bad: [0d0628b2d213a43f80e4967d83b905c6d2211651] [OpenCL] C++ for OpenCL version 2021 introduced to command line.
git bisect bad 0d0628b2d213a43f80e4967d83b905c6d2211651
# bad: [cc327bd5231126006b4177b8ce0946ce52e2f645] [NFC] Cleanup attribute methods in Function
git bisect bad cc327bd5231126006b4177b8ce0946ce52e2f645
# bad: [043926a3a0773cdcefbcfbfefc6ba15623301122] [sanitizer] Add hexagon support to asan
git bisect bad 043926a3a0773cdcefbcfbfefc6ba15623301122
# bad: [9934a5b2ed5aa6e6bbb2e55c3cd98839722c226e] [CVP] processSwitch: Remove default case when switch cover all possible values.
git bisect bad 9934a5b2ed5aa6e6bbb2e55c3cd98839722c226e
# good: [a7ebc4d145892fd22442832549cb12c4b6920dea] [DAGCombiner] Teach isKnownToBeAPowerOfTwo handle SPLAT_VECTOR
git bisect good a7ebc4d145892fd22442832549cb12c4b6920dea
# good: [3a063f5ad0147e8cad3c9a247b4327e7b32eb3da] [NFC][CVP] Add one switch testcase
git bisect good 3a063f5ad0147e8cad3c9a247b4327e7b32eb3da
# first bad commit: [9934a5b2ed5aa6e6bbb2e55c3cd98839722c226e] [CVP] processSwitch: Remove default case when switch cover all possible values.

I have reported this upstream: https://reviews.llvm.org/D106056

@nathanchance nathanchance changed the title 2379949aadcee8d4028dec0508f88bda290636bc breaks x86_64 ThinLTO 9934a5b2ed5aa6e6bbb2e55c3cd98839722c226e breaks x86_64 ThinLTO Aug 18, 2021
@samitolvanen
Member Author

samitolvanen commented Aug 18, 2021

Indeed, that teaches me not to bisect things before caffeine...

@junparser

hi @samitolvanen @nathanchance

After some simple digging, the log line [ 0.652417] jump_label: Fatal kernel bug, unexpected op at jump_label_test+0x1d/0xe3 [ffffffffb3b89aaf] (e9 a9 00 00 00 != 0f 1f 44 00 00)) size:5 type:1, produced by the memcmp(addr, expect, size) check, suggests we may be getting the wrong addr.
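To make the byte comparison in that log line concrete, here is a minimal Python sketch (not kernel code; `decode_jump` and its arguments are hypothetical names chosen for illustration). It assumes only the standard x86 encodings: `e9` is `jmp rel32`, `eb` is `jmp rel8`, and `0f 1f 44 00 00` is the 5-byte NOP.

```python
import struct

# 5-byte NOP that the log says was expected at the patch site.
NOP5 = bytes.fromhex("0f1f440000")


def decode_jump(code, site_offset):
    """Return the function-relative target of a jmp at site_offset, or None.

    code        -- raw bytes found at the patch site
    site_offset -- offset of the site inside the function (e.g. 0x1d)
    """
    if code[0] == 0xE9 and len(code) >= 5:   # jmp rel32: 1 opcode byte + 4-byte displacement
        (rel,) = struct.unpack_from("<i", code, 1)
        return site_offset + 5 + rel
    if code[0] == 0xEB and len(code) >= 2:   # jmp rel8: 1 opcode byte + 1-byte displacement
        (rel,) = struct.unpack_from("<b", code, 1)
        return site_offset + 2 + rel
    return None                              # not a direct jmp (e.g. a NOP)


found = bytes.fromhex("e9a9000000")          # bytes reported at jump_label_test+0x1d
print(found == NOP5)                         # False
print(hex(decode_jump(found, 0x1d)))         # 0xcb
```

Under these assumptions the site already holds a jmp to jump_label_test+0xcb, which is the address of the kernel/jump_label.c:845 warning in the log above, rather than the NOP that the patching code expected to find.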

BTW, https://reviews.llvm.org/D106056 just tries to remove the unreachable default case. Does the boot also fail without ThinLTO?

I'll try to reproduce this.

@junparser

@samitolvanen @nathanchance Confirmed. LLVM trunk removes a default case in cfg80211_edmg_chandef_valid, which is a correct transformation. However, a check in tools/objtool/check.c then emits "vmlinux.o: warning: objtool: cfg80211_edmg_chandef_valid()+0x169: can't find jump dest instruction at .text.cfg80211_edmg_chandef_valid+0x17b", and the boot fails.

@junparser

BTW, I also found something similar to this issue: https://lore.kernel.org/lkml/20190612025227.lxumqqtqao6iqms3@treble/T/

@junparser

I do believe this undefined behavior is caused by libtool; the boot passed when I removed the default switch case in cfg80211_edmg_chandef_valid.

@junparser

It seems that the value of chandef->edmg.bw_config is outside the range [4, 15] at runtime in https://github.com/torvalds/linux/blob/master/net/wireless/chan.c#L86. However, with ThinLTO, we have proved that the value range is [4, 15] and turned the default case into unreachable(), which causes undefined behavior.

@junparser

junparser commented Aug 19, 2021

diff --git a/net/wireless/chan.c b/net/wireless/chan.c
index 869c43d4414c..53e8ed341381 100644
--- a/net/wireless/chan.c
+++ b/net/wireless/chan.c
@@ -108,9 +108,6 @@ static bool cfg80211_edmg_chandef_valid(const struct cfg80211_chan_def *chandef)
                if (max_contiguous < 4)
                        return false;
                break;
-
-       default:
-               return false;
        }

        /* check bw_config against aggregated (non contiguous) edmg channels */
@@ -134,8 +131,6 @@ static bool cfg80211_edmg_chandef_valid(const struct cfg80211_chan_def *chandef)
                if (num_of_enabled < 4 || max_contiguous < 2)
                        return false;
                break;
-       default:
-               return false;
        }

        return true;

@junparser

https://reviews.llvm.org/D106056 has been reverted, FYI

@nickdesaulniers
Member

Jotting down notes; I looked into this yesterday evening and this afternoon. I got distracted by #1446, but it looks like there's a patch out now for that, and I can just use GNU objdump for now. Nothing conclusive yet, but still some notes:

  1. https://lore.kernel.org/lkml/20210824210507.GC17784@worktop.programming.kicks-ass.net/ talks about jump tables with empty entries, but from @nathanchance's reduced test case there are no jump tables involved. We lower the switch to a series of comparisons and jumps. Once https://reviews.llvm.org/D106056 is reapplied, we no longer jump to reasonable locations. This gist shows the before and after disassembly.
  2. @LebedevRI mentioned on IRC that this may be a phase ordering issue. Testing that hypothesis, I figured that perhaps unreachableblockelim wasn't being run after correlated-propagation for LTO; but with https://reviews.llvm.org/D106056 reapplied, we still get curious results for the non-LTO case. See 03_D106056-no-lto.txt from the above gist, in which we no longer jump past the end of the function, but rather into the middle of the bitstream.
  3. We never eliminate the switch in IR, so the "unreachabledefault" block remains all the way through to ISEL. ISEL turns the switch with 4 possible destinations (including the unreachable default) into 3 conditional jumps and 1 unconditional jump, with the unreachable block being the unconditional jump's target. One thing that surprises me here (but maybe it's just a red herring) is that unreachable-mbb-elimination is fed code like:
bb.11.if.then:
; predecessors: %bb.10
  successors: %bb.4(0x80000000), %bb.6(0x00000000); %bb.4(100.00%), %bb.6(0.00%)

  %7:gr32 = MOV32ri 12288
  BT32rr killed %7:gr32, %0:gr32, implicit-def $eflags
  JCC_1 %bb.4, 2, implicit $eflags
  JMP_1 %bb.6
...
bb.6.if.then.unreachabledefault:
; predecessors: %bb.11
...

but doesn't touch it. (Some MBB's get renamed or moved, so the result is nearly the same):

bb.6.if.then:
; predecessors: %bb.5
  successors: %bb.8(0x80000000), %bb.10(0x00000000); %bb.8(100.00%), %bb.10(0.00%)

  %7:gr32 = MOV32ri 12288
  BT32rr killed %7:gr32, %0:gr32, implicit-def $eflags
  JCC_1 %bb.8, 2, implicit $eflags
  JMP_1 %bb.10
...
bb.10.if.then.unreachabledefault:
; predecessors: %bb.6

So if we have a successor that we will never visit (0.00%) because it's unreachable, shouldn't unreachable-mbb-elimination replace the conditional jump and unconditional jump pair with a single unconditional jump to the previously conditional jump's destination? (Or is that somehow bad/illegal; am I misunderstanding something about unreachable?) I.e., I would have expected unreachable-mbb-elimination to perhaps have produced:

bb.6.if.then:
; predecessors: %bb.5
  successors: %bb.8(0x80000000); %bb.8(100.00%)

  %7:gr32 = MOV32ri 12288
  BT32rr killed %7:gr32, %0:gr32, implicit-def $eflags
  JMP_1 %bb.8
...

(I guess technically that would make the BT32rr dead, too). Thoughts, @topperc @RKSimon @LebedevRI?
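The rewrite being asked about can be modelled as a toy Python sketch. This is not LLVM code and says nothing about how any LLVM pass is implemented; `fold_terminators` and the tuple representation of terminators are hypothetical, chosen only to mirror the MIR dumps above.

```python
def fold_terminators(terminators, unreachable):
    """Fold `JCC_1 A; JMP_1 B` into `JMP_1 A` when block B is unreachable.

    terminators -- list of (opcode, target) tuples ending one basic block
    unreachable -- set of block names known to be unreachable
    """
    if (len(terminators) == 2
            and terminators[0][0] == "JCC_1"
            and terminators[1][0] == "JMP_1"
            and terminators[1][1] in unreachable):
        # The fallthrough jump can never be taken in a valid execution,
        # so the conditional jump is effectively always taken.
        return [("JMP_1", terminators[0][1])]
    return terminators


before = [("JCC_1", "bb.8"), ("JMP_1", "bb.10")]
print(fold_terminators(before, {"bb.10"}))  # [('JMP_1', 'bb.8')]
```

In this model the terminators of bb.6.if.then collapse to a single jump to bb.8, matching the hand-written expectation above; whether such a fold is legal in the real pass pipeline is exactly the open question in this comment.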

@LebedevRI

Can you give me a standalone reproducer that only involves opt -O3?

@nickdesaulniers
Member

nickdesaulniers commented Sep 1, 2021

(I followed up with Roman on IRC)

This looks like a follow up to https://llvm.org/pr43129; in fact, I spent all day looking at code that was added in https://reviews.llvm.org/D68131. I did come up with this test case, reduced from Nathan's reproducer:

@foo = global i32 0

define i32 @baz(i32 %0) {
  switch i32 %0, label %if.then.unreachabledefault [
    i32 4, label %sw.epilog8
    i32 5, label %sw.epilog8
    i32 8, label %sw.bb2
    i32 9, label %sw.bb2
    i32 12, label %sw.bb4
    i32 13, label %sw.bb4
  ]

sw.bb2:
  br label %return

sw.bb4:
  br label %return

sw.epilog8:
  br label %return

if.then.unreachabledefault:
  unreachable

return:
  %retval.0 = phi i32 [ 1, %sw.epilog8 ], [ 0, %sw.bb2 ], [ 0, %sw.bb4 ]
  ret i32 %retval.0
}
$ llc -O2 foo.ll -o -
```asm
# ...
baz:
        .cfi_startproc
        xorl    %eax, %eax
        movl    $13056, %ecx
        btl     %edi, %ecx
        jae     .LBB0_1
        retq
.LBB0_1:
        movl    $48, %eax
        btl     %edi, %eax
        jae     .LBB0_4 # oops!
        movl    $1, %eax
        retq
.LBB0_4:
.Lfunc_end0:
# ...
```
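The two btl tests in that output can be modelled with a minimal Python sketch. The function names are hypothetical, and the model assumes only the documented x86 semantics: bt with a register operand copies the selected bit (index taken modulo 32) into CF, and jae jumps when CF is clear.

```python
def set_bits(mask):
    """List the bit positions that are set in a 32-bit mask."""
    return [i for i in range(32) if (mask >> i) & 1]


def bt(mask, index):
    """Model `btl %index, %mask`: return the value CF would get."""
    return (mask >> (index % 32)) & 1


def baz_lowered(edi):
    """Follow the emitted control flow for a non-negative input in %edi."""
    if bt(13056, edi):      # CF set: `jae .LBB0_1` not taken, retq with %eax = 0
        return 0
    if bt(48, edi):         # CF set: `jae .LBB0_4` not taken, retq with %eax = 1
        return 1
    return ".LBB0_4"        # label at the very end of the function, no code follows


print(set_bits(13056))      # [8, 9, 12, 13]
print(set_bits(48))         # [4, 5]
print(baz_lowered(6))       # .LBB0_4
```

The masks cover exactly the switch cases in the IR (8, 9, 12, 13 return 0; 4, 5 return 1). The default destination is unreachable in the IR, so a valid execution never reaches .LBB0_4; the point of the report is that the emitted jae still targets a label past the last instruction.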

cc @zmodem

@nickdesaulniers
Member

nickdesaulniers commented Sep 1, 2021

Please help test https://reviews.llvm.org/D109103 and https://reviews.llvm.org/D109106 with https://reviews.llvm.org/D106056 re-applied locally.

@nickdesaulniers nickdesaulniers self-assigned this Sep 1, 2021
@nickdesaulniers nickdesaulniers added the [PATCH] Submitted A patch has been submitted for review label Sep 1, 2021
@nathanchance
Member

I tested diff 370104 and diff 370343 of https://reviews.llvm.org/D109103 with https://reviews.llvm.org/D106056 reapplied and the kernel booted in QEMU and I did not see any objtool warnings.

nickdesaulniers added a commit to llvm/llvm-project that referenced this issue Sep 8, 2021
Upload a test that shows ISEL taking a SwitchInst that has an
unreachable BB for a default target being lowered to an unconditional
jump off the end of a function.

Link: https://bugs.llvm.org/show_bug.cgi?id=50080
Link: ClangBuiltLinux/linux#679
Link: ClangBuiltLinux/linux#1440

Reviewed By: craig.topper, hans

Differential Revision: https://reviews.llvm.org/D109106
@nickdesaulniers nickdesaulniers added [FIXED][LLVM] 14 This bug was fixed in LLVM 14.x and removed [PATCH] Submitted A patch has been submitted for review labels Sep 8, 2021
@yshui
Member

yshui commented Oct 6, 2022

I am seeing this problem again.

[    0.989168][    T1] jump_label: Fatal kernel bug, unexpected op at swap_writepage+0x1c/0x60 [(____ptrval____)] (eb 1c 48 c7 c2 != 66 90 0f 1f 00)) size:2 type:1
[    0.990045][    T1] ------------[ cut here ]------------
[    0.990386][    T1] kernel BUG at arch/x86/kernel/jump_label.c:73!
[    0.990752][    T1] invalid opcode: 0000 [#1] PREEMPT SMP NOPTI
[    0.991107][    T1] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 6.0.0-zen-local+ #15
[    0.991544][    T1] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS d55cb5a 04/01/2014
[    0.992065][    T1] RIP: 0010:__jump_label_patch+0x18d/0x1a0
[    0.992584][    T1] Code: 5e 41 5f 5d e9 24 c2 f6 ff 48 c7 c7 f7 55 71 ac 4c 89 fe 4c 89 fa 4c 89 f9 49 89 d8 45 89 e1 41 56 e8 9c 59 0f 00 48 83 c4 08 <0f> 0b 0f 0b 0f 0b 0f 0b 00 00 cc cc 00 00 cc cc 00 00 cc 48 c7 c7
[    0.993702][    T1] RSP: 0018:ffff9afa011a3c18 EFLAGS: 00010286
[    0.994052][    T1] RAX: 000000000000008c RBX: ffffffffac783ae1 RCX: ffffffffacc65d80
[    0.994509][    T1] RDX: 0000000000000000 RSI: 0000000000000002 RDI: c00000010001141b
[    0.994982][    T1] RBP: ffffffffad133224 R08: 0000000000000000 R09: ffffffffacc7e010
[    0.995438][    T1] R10: 000000010001141b R11: 000000000000041b R12: 0000000000000002
[    0.995893][    T1] R13: ffffffffac783ae1 R14: 0000000000000001 R15: ffffffffaa5cec5c
[    0.996361][    T1] FS:  0000000000000000(0000) GS:ffff9afa04600000(0000) knlGS:0000000000000000
[    0.996865][    T1] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[    0.997241][    T1] CR2: ffff9afa07a01000 CR3: 0000000007210000 CR4: 0000000000350ff0
[    0.997691][    T1] Call Trace:
[    0.997878][    T1]  <TASK>
[    0.998050][    T1]  ? swap_writepage+0x1c/0x60
[    0.998314][    T1]  ? swap_writepage+0x2b/0x60
[    0.998578][    T1]  ? swap_writepage+0x1e/0x60
[    0.998842][    T1]  ? arch_jump_label_transform_queue+0x26/0x60
[    0.999195][    T1]  ? __jump_label_update+0x96/0x150
[    0.999500][    T1]  ? static_key_slow_inc_cpuslocked+0x4c/0x80
[    0.999885][    T1]  ? frontswap_register_ops+0x2c/0x40
[    1.000193][    T1]  ? init_zswap+0x19b/0x233
[    1.000464][    T1]  ? init_frontswap+0x9b/0x9b
[    1.000745][    T1]  ? do_one_initcall+0x120/0x2b0
[    1.001037][    T1]  ? do_initcall_level+0x7a/0xd8
[    1.001318][    T1]  ? do_initcalls+0x44/0x6b
[    1.001572][    T1]  ? kernel_init_freeable+0xd8/0x122
[    1.001882][    T1]  ? rest_init+0xc0/0xc0
[    1.002130][    T1]  ? kernel_init+0x11/0x1a0
[    1.002383][    T1]  ? ret_from_fork+0x22/0x30
[    1.002644][    T1]  </TASK>

I think it's caused by the particular static key in frontswap_enabled

Kernel 6.0 with LLVM/Clang 15.0.2

@yshui
Member

yshui commented Oct 6, 2022

Hmm, I can reproduce this problem as far back as clang 13. I don't have an older clang available from my system's package manager.

For now it's possible to work around this by disabling CONFIG_ZSWAP.

mem-frob pushed a commit to draperlaboratory/hope-llvm-project that referenced this issue Oct 7, 2022
mem-frob pushed a commit to draperlaboratory/hope-llvm-project that referenced this issue Oct 7, 2022
…n is unreachable

Otherwise we end up with an extra conditional jump, followed by an
unconditional jump off the end of a function. ie.

  bb.0:
    BT32rr ..
    JCC_1 %bb.4 ...
  bb.1:
    BT32rr ..
    JCC_1 %bb.2 ...
    JMP_1 %bb.3
  bb.2:
    ...
  bb.3.unreachable:
  bb.4:
    ...

  Should be equivalent to:
  bb.0:
    BT32rr ..
    JCC_1 %bb.4 ...
    JMP_1 %bb.2
  bb.1:
  bb.2:
    ...
  bb.3.unreachable:
  bb.4:
    ...

This can occur since at the higher level IR (Instruction) SwitchInsts
are required to have BBs for default destinations, even when it can be
deduced that such BBs are unreachable.

For most programs, this isn't an issue, just wasted instructions since the
unreachable has been statically proven.

The x86_64 Linux kernel when built with CONFIG_LTO_CLANG_THIN=y fails to
boot though once D106056 is re-applied.  D106056 makes it more likely
that correlation-propagation (CVP) can deduce that the default case of
SwitchInsts are unreachable. The x86_64 kernel uses a binary post
processor called objtool, which emits this warning:

vmlinux.o: warning: objtool: cfg80211_edmg_chandef_valid()+0x169: can't
find jump dest instruction at .text.cfg80211_edmg_chandef_valid+0x17b

I haven't debugged precisely why this causes a failure at boot time, but
fixing this very obvious jump off the end of the function fixes the
warning and boot problem.

Link: https://bugs.llvm.org/show_bug.cgi?id=50080
Fixes: ClangBuiltLinux/linux#679
Fixes: ClangBuiltLinux/linux#1440

Reviewed By: hans

Differential Revision: https://reviews.llvm.org/D109103