node exporter cpu crash on boot on 64cpu flavor host #9940

Open
demetthyl opened this issue Feb 12, 2025 · 0 comments

Bug Report

Describe the bug
When fluent-bit is started on a 64-CPU Ubuntu VM with the node_exporter_metrics input, it crashes immediately with SIGSEGV.

To Reproduce

  • Start a 64-CPU Ubuntu VM

  • Install fluent-bit and run it with the node exporter CPU metrics

  • Steps to reproduce the problem:

1/ apt-get install fluent-bit
2/ sudo /opt/fluent-bit/bin/fluent-bit -i node_exporter_metrics -pmetrics=cpu -p path.sysfs=/sys -Z -vv -o stdout

Output:

[2025/02/12 13:34:17] [debug] [input:node_exporter_metrics:node_exporter_metrics.0] CPU is missing core_throttle_count: /sys/devices/system/cpu/cpu39
[2025/02/12 13:34:17] [debug] [input:node_exporter_metrics:node_exporter_metrics.0] CPU is missing package_throttle_count: /sys/devices/system/cpu/cpu39
[2025/02/12 13:34:17] [debug] [input:node_exporter_metrics:node_exporter_metrics.0] CPU is missing core_throttle_count: /sys/devices/system/cpu/cpu4
[2025/02/12 13:34:17] [debug] [input:node_exporter_metrics:node_exporter_metrics.0] CPU is missing package_throttle_count: /sys/devices/system/cpu/cpu4
[2025/02/12 13:34:17] [debug] [input:node_exporter_metrics:node_exporter_metrics.0] CPU is missing core_throttle_count: /sys/devices/system/cpu/cpu40
[2025/02/12 13:34:17] [debug] [input:node_exporter_metrics:node_exporter_metrics.0] CPU is missing package_throttle_count: /sys/devices/system/cpu/cpu40
[2025/02/12 13:34:17] [debug] [input:node_exporter_metrics:node_exporter_metrics.0] CPU is missing core_throttle_count: /sys/devices/system/cpu/cpu41
[2025/02/12 13:34:17] [debug] [input:node_exporter_metrics:node_exporter_metrics.0] CPU is missing package_throttle_count: /sys/devices/system/cpu/cpu41
[2025/02/12 13:34:17] [debug] [input:node_exporter_metrics:node_exporter_metrics.0] CPU is missing core_throttle_count: /sys/devices/system/cpu/cpu42
[2025/02/12 13:34:17] [engine] caught signal (SIGSEGV)
[2025/02/12 13:34:17] [debug] [input:node_exporter_metrics:node_exporter_metrics.0] CPU is missing package_throttle_count: /sys/devices/system/cpu/cpu42
[2025/02/12 13:34:17] [debug] [input:node_exporter_metrics:node_exporter_metrics.0] CPU is missing core_throttle_count: /sys/devices/system/cpu/cpu43
[2025/02/12 13:34:17] [debug] [input:node_exporter_metrics:node_exporter_metrics.0] CPU is missing package_throttle_count: /sys/devices/system/cpu/cpu43
#0 0x61dce0d07451 in ne_cpu_update() at plugins/in_node_exporter_metrics/ne_cpu_linux.c:393
#1 0x61dce0d05ef7 in activate_collector() at plugins/in_node_exporter_metrics/ne.c:160
#2 0x61dce0d05ef7 in in_ne_init() at plugins/in_node_exporter_metrics/ne.c:223
#3 0x61dce0c8be35 in input_thread() at src/flb_input_thread.c:365
#4 0x61dce0cad8a1 in step_callback() at src/flb_worker.c:43
#5 0x734b7ca9caa3 in ???() at ???:0
#6 0x734b7cb29c3b in ???() at ???:0
#7 0xffffffffffffffff in ???() at ???:0
Aborted

Notable point: if I point path.sysfs at a nonexistent path (/sys2 for example), I get the /proc data on stdout plus an error about /sys2:

[2025/02/12 13:49:04] [error] [input:node_exporter_metrics:node_exporter_metrics.0] read error, check permissions: /sys2/devices/system/cpu/cpu[0-9]*

But fluent-bit stays up and running.

=> Something goes wrong when reading /sys with 64 CPUs: is it the volume of data?

For information, I've installed the standalone Prometheus node exporter on the same host to compare, and it works like a charm.

=> Am I hitting a buffer limit or something similar when fluent-bit gathers CPU info from /sys for 64 CPUs?
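
To help narrow this down, here is a minimal Python sketch that surveys the same cpuN directories the debug lines mention. It is only a diagnostic aid, not anything taken from the fluent-bit code. The function names (`survey_cpus`, `summarize`) are made up for illustration. It assumes the standard Linux sysfs files `topology/core_id`, `topology/physical_package_id`, `thermal_throttle/core_throttle_count` and `thermal_throttle/package_throttle_count`. Since lscpu reports 64 sockets with one core each, the highest package id is probably the interesting number, but whether that is related to the crash is only a guess on my side.

```python
import re
from pathlib import Path

# Matches per-CPU directories such as cpu0 or cpu63, but not cpufreq or cpuidle.
CPU_DIR = re.compile(r"^cpu(\d+)$")


def read_int(path):
    """Return the integer stored in a sysfs file, or None if missing or unreadable."""
    try:
        return int(path.read_text().strip())
    except (OSError, ValueError):
        return None


def survey_cpus(sysfs_root="/sys"):
    """Collect topology ids and throttle-counter presence for each cpuN directory."""
    base = Path(sysfs_root) / "devices" / "system" / "cpu"
    if not base.is_dir():
        return []
    rows = []
    for entry in base.iterdir():
        match = CPU_DIR.match(entry.name)
        if not match or not entry.is_dir():
            continue
        throttle = entry / "thermal_throttle"
        rows.append({
            "cpu": int(match.group(1)),
            "core_id": read_int(entry / "topology" / "core_id"),
            "package_id": read_int(entry / "topology" / "physical_package_id"),
            "has_core_throttle": (throttle / "core_throttle_count").exists(),
            "has_package_throttle": (throttle / "package_throttle_count").exists(),
        })
    return sorted(rows, key=lambda row: row["cpu"])


def summarize(rows):
    """Reduce the per-CPU rows to the few numbers worth attaching to a bug report."""
    package_ids = [r["package_id"] for r in rows if r["package_id"] is not None]
    core_ids = [r["core_id"] for r in rows if r["core_id"] is not None]
    return {
        "cpus": len(rows),
        "max_package_id": max(package_ids, default=None),
        "max_core_id": max(core_ids, default=None),
        "missing_core_throttle": sum(not r["has_core_throttle"] for r in rows),
        "missing_package_throttle": sum(not r["has_package_throttle"] for r in rows),
    }
```

Calling `print(summarize(survey_cpus("/sys")))` on the affected VM and on a smaller flavor that does not crash should show which of these values differ between the two hosts.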

Your Environment

  • Version used: 3.2.6
  • Server type and version: OpenStack VM
  • Operating System and version:

uname -a
Linux r1-mongodb-f2-0 6.8.0-39-generic #39-Ubuntu SMP PREEMPT_DYNAMIC Fri Jul 5 21:49:14 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux

ubuntu@r1-mongodb-f2-0:~$ lscpu
Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Address sizes: 40 bits physical, 57 bits virtual
Byte Order: Little Endian
CPU(s): 64
On-line CPU(s) list: 0-63
Vendor ID: GenuineIntel
Model name: Intel Xeon Processor (Icelake)
CPU family: 6
Model: 134
Thread(s) per core: 1
Core(s) per socket: 1
Socket(s): 64
Stepping: 0
BogoMIPS: 5200.00
Flags: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ss syscall nx pdpe1gb rdtscp lm constant_tsc rep_good nopl xtopology cpuid tsc_known_freq pni pclmulqdq vmx ssse3 fma cx16 pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand hypervisor lahf_lm abm 3dnowprefetch cpuid_fault ssbd ibrs ibpb stibp ibrs_enhanced tpr_shadow flexpriority ept vpid ept_ad fsgsbase tsc_adjust bmi1 avx2 smep bmi2 erms invpcid avx512f avx512dq rdseed adx smap avx512ifma clflushopt clwb avx512cd sha_ni avx512bw avx512vl xsaveopt xsavec xgetbv1 xsaves avx_vnni avx512_bf16 wbnoinvd arat vnmi avx512vbmi umip pku ospke waitpkg avx512_vbmi2 gfni vaes vpclmulqdq avx512_vnni avx512_bitalg avx512_vpopcntdq la57 rdpid bus_lock_detect cldemote movdiri movdir64b fsrm md_clear serialize tsxldtrk avx512_fp16 arch_capabilities
Virtualization features:
Virtualization: VT-x
Hypervisor vendor: KVM
Virtualization type: full
Caches (sum of all):
L1d: 2 MiB (64 instances)
L1i: 2 MiB (64 instances)
L2: 256 MiB (64 instances)
L3: 1 GiB (64 instances)

  • Filters and plugins:
    node_exporter_metrics input plugin, as bundled with fluent-bit 3.2.6