Help me troubleshoot random Scale freezes - kernel: general protection fault

Currently running:

  • Supermicro X11SSM-F
  • E3-1240 v6 CPU (load tested for 24 hours)
  • 2x16GB DDR4 ECC UDIMM memory (new, also load tested for 24 hours)
  • ElectricEel-24.10.2.1

I moved my pool from a Supermicro X10 setup (where everything ran just fine on the same Scale version, ElectricEel-24.10.2.1) to this Supermicro X11 setup. The spinning rust is the same, and I already had an Intel Optane M10 as SLOG. I expanded the SLOG with a second Optane M10, so they now run as a mirror.
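For reference, this is roughly how I check that the SLOG really runs as a mirror now (pool name assumed to be "tank" here, substitute your own):

  # the "logs" section of the output should show a mirror of the two Optane devices
  zpool status -v tank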

I had two random freezes: one on May 2nd, one today. When the NAS (and everything related to it) becomes unavailable, the console also hangs when I log in via IPMI, and I have to power reset the machine. There are no errors in the IPMI logs (with the previous, faulty memory I was nicely getting unrecoverable error messages in the IPMI logs; this time, nothing).
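In case anyone wants to double-check the same thing from the OS instead of the web UI, this is roughly what I use, assuming ipmitool is installed on the box:

  # dump the BMC system event log; the ECC errors from the old DIMMs used to show up here
  ipmitool sel elist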

Looking at the logs, both occurrences look the same, just at different timestamps.

(Journalctl outputs at the bottom of this post)
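They were pulled roughly like this; -b -1 selects the previous boot (the one that ended in the freeze) and -k limits output to kernel messages:

  journalctl -k -b -1 --no-pager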

The first and fourth lines of both traces mention:

  • kernel: general protection fault
  • kernel: RIP: 0010:intel_idle_ibrs+0x81/0xf0

From what I know, “intel_idle_ibrs” has something to do with CPU C-states. These BIOS settings were active during both freezes:

  • CPU C-States - enabled
  • Enhanced C-States - enabled
  • C-State Auto Demotion - C1 and C3
  • C-State Un-Demotion - C1 and C3
  • Package C-State Demotion - Disabled
  • Package C-State Un-Demotion - Disabled
  • C-State Pre-Wake - Enabled
  • (4x) Package C-State Limit - Auto

Just to be sure I’ve turned off CPU C-states for now. What do you guys think? If it were memory I would expect entries in the IPMI logs. Maybe there’s still something wrong with my CPU, even though a 24-hour load test was successful before?
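Independent of the BIOS switches, this is roughly how I verify and limit C-states from the OS side, assuming the intel_idle driver is in use (which the trace suggests):

  # which cpuidle driver is loaded; on this CPU it should report intel_idle
  cat /sys/devices/system/cpu/cpuidle/current_driver
  # list the C-states the kernel exposes on one core
  grep . /sys/devices/system/cpu/cpu0/cpuidle/state*/name
  # runtime test: disable state2 and deeper on every core (1 = disabled), without touching the BIOS
  for s in /sys/devices/system/cpu/cpu*/cpuidle/state[2-9]/disable; do echo 1 > "$s"; done
  # or pin it at boot with a kernel parameter, e.g. intel_idle.max_cstate=1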

First freeze:

May 02 14:49:25 nas1 kernel: general protection fault, maybe for address 0xffff9e91d7ab2c00: 0000 [#1] PREEMPT SMP PTI
May 02 14:49:25 nas1 kernel: CPU: 2 PID: 0 Comm: swapper/2 Tainted: P           OE      6.6.44-production+truenas #1
May 02 14:49:25 nas1 kernel: Hardware name: TAROX /X11SSM-F, BIOS 3.4 10/04/2024
May 02 14:49:25 nas1 kernel: RIP: 0010:intel_idle_ibrs+0x81/0xf0
May 02 14:49:25 nas1 kernel: Code: 48 89 d1 65 48 8b 04 25 00 2c 03 00 0f 01 c8 48 8b 00 a8 08 75 14 66 90 0f 00 2d b6 03 50 00 b9 01 00 00 00 4c 89 c0 0f 01 c9 <65> 48 8b 04 25 00 2c 03 00 f0 80 60 02 df f0 83 >
May 02 14:49:25 nas1 kernel: RSP: 0018:ffffb31dc00f7e78 EFLAGS: 00010046
May 02 14:49:25 nas1 kernel: RAX: 0000000000000040 RBX: 0000000000000006 RCX: 0000000000000001
May 02 14:49:25 nas1 kernel: RDX: 0000000000000000 RSI: 0000000000000001 RDI: 0000000000000001
May 02 14:49:25 nas1 kernel: RBP: ffffffffa079b800 R08: 0000000000000040 R09: 0000000000000036
May 02 14:49:25 nas1 kernel: R10: 0000000000000008 R11: ffff9e91d7ab1fe4 R12: ffffffffa079b800
May 02 14:49:25 nas1 kernel: R13: ffffffffa079ba88 R14: 0000000000000006 R15: 0000000000000000
May 02 14:49:25 nas1 kernel: FS:  0000000000000000(0000) GS:ffff9e91d7a80000(0000) knlGS:0000000000000000
May 02 14:49:25 nas1 kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
May 02 14:49:25 nas1 kernel: CR2: 00007f4ad0b0f000 CR3: 00000002f1220004 CR4: 00000000003706e0
May 02 14:49:25 nas1 kernel: DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
May 02 14:49:25 nas1 kernel: DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
May 02 14:49:25 nas1 kernel: Call Trace:
May 02 14:49:25 nas1 kernel:  <TASK>
May 02 14:49:25 nas1 kernel:  ? die_addr+0x36/0x90
May 02 14:49:25 nas1 kernel:  ? exc_general_protection+0x1c5/0x430
May 02 14:49:25 nas1 kernel:  ? irqentry_enter+0x3d/0x50
May 02 14:49:25 nas1 kernel:  ? asm_exc_general_protection+0x26/0x30
May 02 14:49:25 nas1 kernel:  ? intel_idle_ibrs+0x81/0xf0
May 02 14:49:25 nas1 kernel:  ? intel_idle_ibrs+0x15/0xf0
May 02 14:49:25 nas1 kernel:  cpuidle_enter_state+0x81/0x440
May 02 14:49:25 nas1 kernel:  cpuidle_enter+0x2d/0x40
May 02 14:49:25 nas1 kernel:  do_idle+0x20d/0x270
May 02 14:49:25 nas1 kernel:  cpu_startup_entry+0x2a/0x30
May 02 14:49:25 nas1 kernel:  start_secondary+0x11e/0x140
May 02 14:49:25 nas1 kernel:  secondary_startup_64_no_verify+0x18f/0x19b
May 02 14:49:25 nas1 kernel:  </TASK>
May 02 14:49:25 nas1 kernel: Modules linked in: rpcsec_gss_krb5(E) scst_vdisk(OE) isert_scst(OE) iscsi_scst(OE) scst(OE) rdma_cm(E) iw_cm(E) ib_cm(E) dlm(E) libcrc32c(E) crc32c_generic(E) nvme_fabrics(E) overlay>
May 02 14:49:25 nas1 kernel:  efi_pstore(E) configfs(E) ip_tables(E) x_tables(E) autofs4(E) zfs(POE) spl(OE) efivarfs(E) hid_generic(E) usbhid(E) hid(E) uas(E) usb_storage(E) sd_mod(E) ses(E) enclosure(E) scsi_t>
May 02 14:49:25 nas1 kernel: ---[ end trace 0000000000000000 ]---
May 02 14:49:25 nas1 kernel: ------------[ cut here ]------------
May 02 14:49:25 nas1 kernel: NETDEV WATCHDOG: eno1 (igb): transmit queue 1 timed out 6580 ms
May 02 14:49:25 nas1 kernel: WARNING: CPU: 0 PID: 0 at net/sched/sch_generic.c:525 dev_watchdog+0x235/0x240
May 02 14:49:25 nas1 kernel: Modules linked in: rpcsec_gss_krb5(E) scst_vdisk(OE) isert_scst(OE) iscsi_scst(OE) scst(OE) rdma_cm(E) iw_cm(E) ib_cm(E) dlm(E) libcrc32c(E) crc32c_generic(E) nvme_fabrics(E) overlay>
May 02 14:49:25 nas1 kernel:  efi_pstore(E) configfs(E) ip_tables(E) x_tables(E) autofs4(E) zfs(POE) spl(OE) efivarfs(E) hid_generic(E) usbhid(E) hid(E) uas(E) usb_storage(E) sd_mod(E) ses(E) enclosure(E) scsi_t>
May 02 14:49:25 nas1 kernel: CPU: 0 PID: 0 Comm: swapper/0 Tainted: P      D    OE      6.6.44-production+truenas #1
May 02 14:49:25 nas1 kernel: Hardware name: TAROX /X11SSM-F, BIOS 3.4 10/04/2024
May 02 14:49:25 nas1 kernel: RIP: 0010:dev_watchdog+0x235/0x240
May 02 14:49:25 nas1 kernel: Code: ff ff ff 48 89 df c6 05 b6 26 40 01 01 e8 f3 2c fa ff 45 89 f8 44 89 f1 48 89 de 48 89 c2 48 c7 c7 60 f9 f1 9f e8 cb 16 6b ff <0f> 0b e9 2a ff ff ff 0f 1f 40 00 90 90 90 90 90 >
May 02 14:49:25 nas1 kernel: RSP: 0018:ffffb31dc0003e70 EFLAGS: 00010286
May 02 14:49:25 nas1 kernel: RAX: 0000000000000000 RBX: ffff9e8ac1a20000 RCX: 0000000000000027
May 02 14:49:25 nas1 kernel: RDX: ffff9e91d7a213c8 RSI: 0000000000000001 RDI: ffff9e91d7a213c0
May 02 14:49:25 nas1 kernel: RBP: ffff9e8ac1a20488 R08: 0000000000000000 R09: ffffb31dc0003cf8
May 02 14:49:25 nas1 kernel: R10: 0000000000000003 R11: ffffffffa06d1e48 R12: ffff9e8ac220b140
May 02 14:49:25 nas1 kernel: R13: ffff9e8ac1a203dc R14: 0000000000000001 R15: 00000000000019b4
May 02 14:49:25 nas1 kernel: FS:  0000000000000000(0000) GS:ffff9e91d7a00000(0000) knlGS:0000000000000000
May 02 14:49:25 nas1 kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
May 02 14:49:25 nas1 kernel: CR2: 00007f32ad061a1c CR3: 00000002f1220006 CR4: 00000000003706f0
May 02 14:49:25 nas1 kernel: DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
May 02 14:49:25 nas1 kernel: DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
May 02 14:49:25 nas1 kernel: Call Trace:
May 02 14:49:25 nas1 kernel:  <IRQ>
May 02 14:49:25 nas1 kernel:  ? dev_watchdog+0x235/0x240
May 02 14:49:25 nas1 kernel:  ? __warn+0x81/0x130
May 02 14:49:25 nas1 kernel:  ? dev_watchdog+0x235/0x240
May 02 14:49:25 nas1 kernel:  ? report_bug+0x171/0x1a0
May 02 14:49:25 nas1 kernel:  ? console_unlock+0x78/0x120
May 02 14:49:25 nas1 kernel:  ? handle_bug+0x41/0x70
May 02 14:49:25 nas1 kernel:  ? exc_invalid_op+0x17/0x70
May 02 14:49:25 nas1 kernel:  ? asm_exc_invalid_op+0x1a/0x20
May 02 14:49:25 nas1 kernel:  ? dev_watchdog+0x235/0x240
May 02 14:49:25 nas1 kernel:  ? __pfx_dev_watchdog+0x10/0x10
May 02 14:49:25 nas1 kernel:  call_timer_fn+0x24/0x130
May 02 14:49:25 nas1 kernel:  ? __pfx_dev_watchdog+0x10/0x10
May 02 14:49:25 nas1 kernel:  __run_timers+0x222/0x2c0
May 02 14:49:25 nas1 kernel:  run_timer_softirq+0x1d/0x40
May 02 14:49:25 nas1 kernel:  handle_softirqs+0xd7/0x2c0
May 02 14:49:25 nas1 kernel:  __irq_exit_rcu+0x98/0xc0
May 02 14:49:25 nas1 kernel:  sysvec_apic_timer_interrupt+0x72/0x90
May 02 14:49:25 nas1 kernel:  </IRQ>
May 02 14:49:25 nas1 kernel:  <TASK>
May 02 14:49:25 nas1 kernel:  asm_sysvec_apic_timer_interrupt+0x1a/0x20
May 02 14:49:25 nas1 kernel: RIP: 0010:cpuidle_enter_state+0xcc/0x440
May 02 14:49:25 nas1 kernel: Code: 8a 7d 53 ff e8 35 f4 ff ff 8b 53 04 49 89 c5 0f 1f 44 00 00 31 ff e8 f3 b9 52 ff 45 84 ff 0f 85 57 02 00 00 fb 0f 1f 44 00 00 <45> 85 f6 0f 88 85 01 00 00 49 63 d6 48 8d 04 52 >
May 02 14:49:25 nas1 kernel: RSP: 0018:ffffffffa0603e38 EFLAGS: 00000246
May 02 14:49:25 nas1 kernel: RAX: ffff9e91d7a33440 RBX: ffff9e91d7a3cec8 RCX: 000000000000001f
May 02 14:49:25 nas1 kernel: RDX: 0000000000000000 RSI: 0000000022a1d59e RDI: 0000000000000000
May 02 14:49:25 nas1 kernel: RBP: 0000000000000006 R08: 0000000000000002 R09: 0000000000000ed1
May 02 14:49:25 nas1 kernel: R10: 0000000000000010 R11: ffff9e91d7a31fe4 R12: ffffffffa079b800
May 02 14:49:25 nas1 kernel: R13: 0000824a868d183d R14: 0000000000000006 R15: 0000000000000000
May 02 14:49:25 nas1 kernel:  ? cpuidle_enter_state+0xbd/0x440
May 02 14:49:25 nas1 kernel:  cpuidle_enter+0x2d/0x40
May 02 14:49:25 nas1 kernel:  do_idle+0x20d/0x270
May 02 14:49:25 nas1 kernel:  cpu_startup_entry+0x2a/0x30
May 02 14:49:25 nas1 kernel:  rest_init+0xd0/0xd0
May 02 14:49:25 nas1 kernel:  arch_call_rest_init+0xe/0x30
May 02 14:49:25 nas1 kernel:  start_kernel+0x4ea/0x790
May 02 14:49:25 nas1 kernel:  x86_64_start_reservations+0x18/0x30
May 02 14:49:25 nas1 kernel:  x86_64_start_kernel+0x96/0xa0
May 02 14:49:25 nas1 kernel:  secondary_startup_64_no_verify+0x18f/0x19b
May 02 14:49:25 nas1 kernel:  </TASK>
May 02 14:49:25 nas1 kernel: ---[ end trace 0000000000000000 ]---
May 02 14:49:25 nas1 kernel: igb 0000:05:00.0 eno1: Reset adapter
May 02 14:49:50 nas1 kernel: rcu: INFO: rcu_preempt detected stalls on CPUs/tasks:
May 02 14:49:50 nas1 kernel: rcu:         2-...0: (4 GPs behind) idle=1e64/0/0x1 softirq=5637388/5637388 fqs=2541
May 02 14:49:50 nas1 kernel: rcu:         (detected by 0, t=5252 jiffies, g=17814017, q=5910 ncpus=8)
May 02 14:49:50 nas1 kernel: Sending NMI from CPU 0 to CPUs 2:
May 02 14:49:50 nas1 kernel: rcu: rcu_preempt kthread starved for 2482 jiffies! g17814017 f0x0 RCU_GP_DOING_FQS(6) ->state=0x0 ->cpu=6
May 02 14:49:50 nas1 kernel: rcu:         Unless rcu_preempt kthread gets sufficient CPU time, OOM is now expected behavior.
May 02 14:49:50 nas1 kernel: rcu: RCU grace-period kthread stack dump:
May 02 14:49:50 nas1 kernel: task:rcu_preempt     state:I stack:0     pid:17    ppid:2      flags:0x00004000
May 02 14:49:50 nas1 kernel: Call Trace:
May 02 14:49:50 nas1 kernel:  <TASK>
May 02 14:49:50 nas1 kernel:  __schedule+0x349/0x950
May 02 14:49:50 nas1 kernel:  ? __pfx_rcu_gp_kthread+0x10/0x10
May 02 14:49:50 nas1 kernel:  schedule+0x5b/0xa0
May 02 14:49:50 nas1 kernel:  schedule_timeout+0x98/0x160
May 02 14:49:50 nas1 kernel:  ? __pfx_process_timeout+0x10/0x10
May 02 14:49:50 nas1 kernel:  rcu_gp_fqs_loop+0x141/0x550
May 02 14:49:50 nas1 kernel:  rcu_gp_kthread+0xd8/0x190
May 02 14:49:50 nas1 kernel:  kthread+0xe5/0x120
May 02 14:49:50 nas1 kernel:  ? __pfx_kthread+0x10/0x10
May 02 14:49:50 nas1 kernel:  ret_from_fork+0x31/0x50
May 02 14:49:50 nas1 kernel:  ? __pfx_kthread+0x10/0x10
May 02 14:49:50 nas1 kernel:  ret_from_fork_asm+0x1b/0x30
May 02 14:49:50 nas1 kernel:  </TASK>
May 02 14:49:50 nas1 kernel: rcu: Stack dump where RCU GP kthread last ran:
May 02 14:49:50 nas1 kernel: Sending NMI from CPU 0 to CPUs 6:
May 02 14:49:50 nas1 kernel: NMI backtrace for cpu 6 skipped: idling at intel_idle_ibrs+0x81/0xf0
May 02 14:49:50 nas1 kernel: rcu: INFO: rcu_preempt detected expedited stalls on CPUs/tasks: { 2-...D } 6069 jiffies s: 1493 root: 0x4/.
May 02 14:49:50 nas1 kernel: rcu: blocking rcu_node structures (internal RCU debug):
May 02 14:49:50 nas1 kernel: Sending NMI from CPU 0 to CPUs 2:
May 02 14:51:03 nas1 kernel: rcu: INFO: rcu_preempt detected stalls on CPUs/tasks:
May 02 14:51:03 nas1 kernel: rcu:         2-...0: (4 GPs behind) idle=1e64/0/0x1 softirq=5637388/5637388 fqs=10312
May 02 14:51:03 nas1 kernel: rcu:         (detected by 0, t=23491 jiffies, g=17814017, q=22358 ncpus=8)
May 02 14:51:03 nas1 kernel: Sending NMI from CPU 0 to CPUs 2:
May 02 14:51:03 nas1 kernel: rcu: rcu_preempt kthread starved for 2482 jiffies! g17814017 f0x0 RCU_GP_DOING_FQS(6) ->state=0x0 ->cpu=5
May 02 14:51:03 nas1 kernel: rcu:         Unless rcu_preempt kthread gets sufficient CPU time, OOM is now expected behavior.
May 02 14:51:03 nas1 kernel: rcu: RCU grace-period kthread stack dump:
May 02 14:51:03 nas1 kernel: task:rcu_preempt     state:I stack:0     pid:17    ppid:2      flags:0x00004000
May 02 14:51:03 nas1 kernel: Call Trace:
May 02 14:51:03 nas1 kernel:  <TASK>
May 02 14:51:03 nas1 kernel:  __schedule+0x349/0x950
May 02 14:51:03 nas1 kernel:  ? __pfx_rcu_gp_kthread+0x10/0x10
May 02 14:51:03 nas1 kernel:  schedule+0x5b/0xa0
May 02 14:51:03 nas1 kernel:  schedule_timeout+0x98/0x160
May 02 14:51:03 nas1 kernel:  ? __pfx_process_timeout+0x10/0x10
May 02 14:51:03 nas1 kernel:  rcu_gp_fqs_loop+0x141/0x550
May 02 14:51:03 nas1 kernel:  rcu_gp_kthread+0xd8/0x190
May 02 14:51:03 nas1 kernel:  kthread+0xe5/0x120
May 02 14:51:03 nas1 kernel:  ? __pfx_kthread+0x10/0x10
May 02 14:51:03 nas1 kernel:  ret_from_fork+0x31/0x50
May 02 14:51:03 nas1 kernel:  ? __pfx_kthread+0x10/0x10
May 02 14:51:03 nas1 kernel:  ret_from_fork_asm+0x1b/0x30
May 02 14:51:03 nas1 kernel:  </TASK>
May 02 14:51:03 nas1 kernel: rcu: Stack dump where RCU GP kthread last ran:
May 02 14:51:03 nas1 kernel: Sending NMI from CPU 0 to CPUs 5:
May 02 14:51:03 nas1 kernel: NMI backtrace for cpu 5 skipped: idling at intel_idle_ibrs+0x81/0xf0
May 02 14:51:03 nas1 kernel: rcu: INFO: rcu_preempt detected expedited stalls on CPUs/tasks: { 2-...D } 24429 jiffies s: 1493 root: 0x4/.
May 02 14:51:03 nas1 kernel: rcu: blocking rcu_node structures (internal RCU debug):
May 02 14:51:03 nas1 kernel: Sending NMI from CPU 0 to CPUs 2:
-- Boot 7ca1d064e6434fafb7d4515c45931ec9 --
May 02 15:42:20 nas1 kernel: Linux version 6.6.44-production+truenas (root@tnsbuilds01.tn.ixsystems.net) (gcc (Debian 12.2.0-14) 12.2.0, GNU ld (GNU Binutils for Debian) 2.40) #1 SMP PREEMPT_DYNAMIC Mon Mar 31 1>

Second freeze:

May 07 04:17:38 nas1 kernel: general protection fault, maybe for address 0xffff906f97ab2c00: 0000 [#1] PREEMPT SMP PTI
May 07 04:17:38 nas1 kernel: CPU: 2 PID: 0 Comm: swapper/2 Tainted: P           OE      6.6.44-production+truenas #1
May 07 04:17:38 nas1 kernel: Hardware name: TAROX /X11SSM-F, BIOS 3.4 10/04/2024
May 07 04:17:38 nas1 kernel: RIP: 0010:intel_idle_ibrs+0x81/0xf0
May 07 04:17:38 nas1 kernel: Code: 48 89 d1 65 48 8b 04 25 00 2c 03 00 0f 01 c8 48 8b 00 a8 08 75 14 66 90 0f 00 2d b6 03 50 00 b9 01 00 00 00 4c 89 c0 0f 01 c9 <65> 48 8b 04 25 00 2c 03 00 f0 80 60 02 df f0 83 >
May 07 04:17:38 nas1 kernel: RSP: 0018:ffffbb35400f7e78 EFLAGS: 00010046
May 07 04:17:38 nas1 kernel: RAX: 0000000000000040 RBX: 0000000000000006 RCX: 0000000000000001
May 07 04:17:38 nas1 kernel: RDX: 0000000000000000 RSI: 0000000000000001 RDI: 0000000000000001
May 07 04:17:38 nas1 kernel: RBP: ffffffffb2d9b800 R08: 0000000000000040 R09: 0000000000000078
May 07 04:17:38 nas1 kernel: R10: 0000000000000018 R11: ffff906f97ab1fe4 R12: ffffffffb2d9b800
May 07 04:17:38 nas1 kernel: R13: ffffffffb2d9ba88 R14: 0000000000000006 R15: 0000000000000000
May 07 04:17:38 nas1 kernel: FS:  0000000000000000(0000) GS:ffff906f97a80000(0000) knlGS:0000000000000000
May 07 04:17:38 nas1 kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
May 07 04:17:38 nas1 kernel: CR2: 00007f93b869c000 CR3: 0000000706a20003 CR4: 00000000003706e0
May 07 04:17:38 nas1 kernel: DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
May 07 04:17:38 nas1 kernel: DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
May 07 04:17:38 nas1 kernel: Call Trace:
May 07 04:17:38 nas1 kernel:  <TASK>
May 07 04:17:38 nas1 kernel:  ? die_addr+0x36/0x90
May 07 04:17:38 nas1 kernel:  ? exc_general_protection+0x1c5/0x430
May 07 04:17:38 nas1 kernel:  ? irqentry_enter+0x3d/0x50
May 07 04:17:38 nas1 kernel:  ? asm_exc_general_protection+0x26/0x30
May 07 04:17:38 nas1 kernel:  ? intel_idle_ibrs+0x81/0xf0
May 07 04:17:38 nas1 kernel:  ? intel_idle_ibrs+0x15/0xf0
May 07 04:17:38 nas1 kernel:  cpuidle_enter_state+0x81/0x440
May 07 04:17:38 nas1 kernel:  cpuidle_enter+0x2d/0x40
May 07 04:17:38 nas1 kernel:  do_idle+0x20d/0x270
May 07 04:17:38 nas1 kernel:  cpu_startup_entry+0x2a/0x30
May 07 04:17:38 nas1 kernel:  start_secondary+0x11e/0x140
May 07 04:17:38 nas1 kernel:  secondary_startup_64_no_verify+0x18f/0x19b
May 07 04:17:38 nas1 kernel:  </TASK>
May 07 04:17:38 nas1 kernel: Modules linked in: tls(E) rpcsec_gss_krb5(E) scst_vdisk(OE) isert_scst(OE) iscsi_scst(OE) scst(OE) rdma_cm(E) iw_cm(E) ib_cm(E) dlm(E) libcrc32c(E) crc32c_generic(E) nvme_fabrics(E) >
May 07 04:17:38 nas1 kernel:  efi_pstore(E) configfs(E) ip_tables(E) x_tables(E) autofs4(E) zfs(POE) spl(OE) efivarfs(E) hid_generic(E) usbhid(E) hid(E) uas(E) usb_storage(E) sd_mod(E) ses(E) enclosure(E) scsi_t>
May 07 04:17:38 nas1 kernel: ---[ end trace 0000000000000000 ]---
May 07 04:17:38 nas1 kernel: ------------[ cut here ]------------
May 07 04:17:38 nas1 kernel: NETDEV WATCHDOG: eno1 (igb): transmit queue 1 timed out 5228 ms
May 07 04:17:38 nas1 kernel: WARNING: CPU: 0 PID: 0 at net/sched/sch_generic.c:525 dev_watchdog+0x235/0x240
May 07 04:17:38 nas1 kernel: Modules linked in: tls(E) rpcsec_gss_krb5(E) scst_vdisk(OE) isert_scst(OE) iscsi_scst(OE) scst(OE) rdma_cm(E) iw_cm(E) ib_cm(E) dlm(E) libcrc32c(E) crc32c_generic(E) nvme_fabrics(E) >
May 07 04:17:38 nas1 kernel:  efi_pstore(E) configfs(E) ip_tables(E) x_tables(E) autofs4(E) zfs(POE) spl(OE) efivarfs(E) hid_generic(E) usbhid(E) hid(E) uas(E) usb_storage(E) sd_mod(E) ses(E) enclosure(E) scsi_t>
May 07 04:17:38 nas1 kernel: CPU: 0 PID: 0 Comm: swapper/0 Tainted: P      D    OE      6.6.44-production+truenas #1
May 07 04:17:38 nas1 kernel: Hardware name: TAROX /X11SSM-F, BIOS 3.4 10/04/2024
May 07 04:17:38 nas1 kernel: RIP: 0010:dev_watchdog+0x235/0x240
May 07 04:17:38 nas1 kernel: Code: ff ff ff 48 89 df c6 05 b6 26 40 01 01 e8 f3 2c fa ff 45 89 f8 44 89 f1 48 89 de 48 89 c2 48 c7 c7 60 f9 51 b2 e8 cb 16 6b ff <0f> 0b e9 2a ff ff ff 0f 1f 40 00 90 90 90 90 90 >
May 07 04:17:38 nas1 kernel: RSP: 0018:ffffbb3540003e70 EFLAGS: 00010286
May 07 04:17:38 nas1 kernel: RAX: 0000000000000000 RBX: ffff90685306c000 RCX: 0000000000000027
May 07 04:17:38 nas1 kernel: RDX: ffff906f97a213c8 RSI: 0000000000000001 RDI: ffff906f97a213c0
May 07 04:17:38 nas1 kernel: RBP: ffff90685306c488 R08: 0000000000000000 R09: ffffbb3540003cf8
May 07 04:17:38 nas1 kernel: R10: 0000000000000003 R11: ffffffffb2cd1e48 R12: ffff9068c1f18140
May 07 04:17:38 nas1 kernel: R13: ffff90685306c3dc R14: 0000000000000001 R15: 000000000000146c
May 07 04:17:38 nas1 kernel: FS:  0000000000000000(0000) GS:ffff906f97a00000(0000) knlGS:0000000000000000
May 07 04:17:38 nas1 kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
May 07 04:17:38 nas1 kernel: CR2: 00007fffbad55ba8 CR3: 0000000706a20001 CR4: 00000000003706f0
May 07 04:17:38 nas1 kernel: DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
May 07 04:17:38 nas1 kernel: DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
May 07 04:17:38 nas1 kernel: Call Trace:
May 07 04:17:38 nas1 kernel:  <IRQ>
May 07 04:17:38 nas1 kernel:  ? dev_watchdog+0x235/0x240
May 07 04:17:38 nas1 kernel:  ? __warn+0x81/0x130
May 07 04:17:38 nas1 kernel:  ? dev_watchdog+0x235/0x240
May 07 04:17:38 nas1 kernel:  ? report_bug+0x171/0x1a0
May 07 04:17:38 nas1 kernel:  ? console_unlock+0x78/0x120
May 07 04:17:38 nas1 kernel:  ? handle_bug+0x41/0x70
May 07 04:17:38 nas1 kernel:  ? exc_invalid_op+0x17/0x70
May 07 04:17:38 nas1 kernel:  ? asm_exc_invalid_op+0x1a/0x20
May 07 04:17:38 nas1 kernel:  ? dev_watchdog+0x235/0x240
May 07 04:17:38 nas1 kernel:  ? __pfx_dev_watchdog+0x10/0x10
May 07 04:17:38 nas1 kernel:  call_timer_fn+0x24/0x130
May 07 04:17:38 nas1 kernel:  ? __pfx_dev_watchdog+0x10/0x10
May 07 04:17:38 nas1 kernel:  __run_timers+0x222/0x2c0
May 07 04:17:38 nas1 kernel:  run_timer_softirq+0x1d/0x40
May 07 04:17:38 nas1 kernel:  handle_softirqs+0xd7/0x2c0
May 07 04:17:38 nas1 kernel:  __irq_exit_rcu+0x98/0xc0
May 07 04:17:38 nas1 kernel:  sysvec_apic_timer_interrupt+0x72/0x90
May 07 04:17:38 nas1 kernel:  </IRQ>
May 07 04:17:38 nas1 kernel:  <TASK>
May 07 04:17:38 nas1 kernel:  asm_sysvec_apic_timer_interrupt+0x1a/0x20
May 07 04:17:38 nas1 kernel: RIP: 0010:cpuidle_enter_state+0xcc/0x440
May 07 04:17:38 nas1 kernel: Code: 8a 7d 53 ff e8 35 f4 ff ff 8b 53 04 49 89 c5 0f 1f 44 00 00 31 ff e8 f3 b9 52 ff 45 84 ff 0f 85 57 02 00 00 fb 0f 1f 44 00 00 <45> 85 f6 0f 88 85 01 00 00 49 63 d6 48 8d 04 52 >
May 07 04:17:38 nas1 kernel: RSP: 0018:ffffffffb2c03e38 EFLAGS: 00000246
May 07 04:17:38 nas1 kernel: RAX: ffff906f97a33440 RBX: ffff906f97a3cec8 RCX: 000000000000001f
May 07 04:17:38 nas1 kernel: RDX: 0000000000000000 RSI: 0000000022a1d501 RDI: 0000000000000000
May 07 04:17:38 nas1 kernel: RBP: 0000000000000006 R08: 0000000000000002 R09: 0000000000000ed8
May 07 04:17:38 nas1 kernel: R10: 0000000000000008 R11: ffff906f97a31fe4 R12: ffffffffb2d9b800
May 07 04:17:38 nas1 kernel: R13: 000163933bbbf7e8 R14: 0000000000000006 R15: 0000000000000000
May 07 04:17:38 nas1 kernel:  ? cpuidle_enter_state+0xbd/0x440
May 07 04:17:38 nas1 kernel:  cpuidle_enter+0x2d/0x40
May 07 04:17:38 nas1 kernel:  do_idle+0x20d/0x270
May 07 04:17:38 nas1 kernel:  cpu_startup_entry+0x2a/0x30
May 07 04:17:38 nas1 kernel:  rest_init+0xd0/0xd0
May 07 04:17:38 nas1 kernel:  arch_call_rest_init+0xe/0x30
May 07 04:17:38 nas1 kernel:  start_kernel+0x4ea/0x790
May 07 04:17:38 nas1 kernel:  x86_64_start_reservations+0x18/0x30
May 07 04:17:38 nas1 kernel:  x86_64_start_kernel+0x96/0xa0
May 07 04:17:38 nas1 kernel:  secondary_startup_64_no_verify+0x18f/0x19b
May 07 04:17:38 nas1 kernel:  </TASK>
May 07 04:17:38 nas1 kernel: ---[ end trace 0000000000000000 ]---
May 07 04:17:38 nas1 kernel: igb 0000:05:00.0 eno1: Reset adapter
May 07 04:17:42 nas1 kernel: igb 0000:05:00.0 eno1: igb: eno1 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: RX
May 07 04:18:02 nas1 kernel: rcu: INFO: rcu_preempt detected stalls on CPUs/tasks:
May 07 04:18:02 nas1 kernel: rcu:         2-...0: (1 GPs behind) idle=0034/0/0x1 softirq=29807707/29807708 fqs=2594
May 07 04:18:02 nas1 kernel: rcu:         (detected by 4, t=5252 jiffies, g=97757405, q=6685 ncpus=8)
May 07 04:18:02 nas1 kernel: Sending NMI from CPU 4 to CPUs 2:
May 07 04:18:02 nas1 kernel: rcu: rcu_preempt kthread starved for 2479 jiffies! g97757405 f0x0 RCU_GP_DOING_FQS(6) ->state=0x0 ->cpu=5
May 07 04:18:02 nas1 kernel: rcu: INFO: rcu_preempt detected expedited stalls on CPUs/tasks: {
May 07 04:18:02 nas1 kernel: rcu:         Unless rcu_preempt kthread gets sufficient CPU time, OOM is now expected behavior.
May 07 04:18:02 nas1 kernel: rcu: RCU grace-period kthread stack dump:
May 07 04:18:02 nas1 kernel: task:rcu_preempt     state:I stack:0     pid:17    ppid:2      flags:0x00004000
May 07 04:18:02 nas1 kernel: Call Trace:
May 07 04:18:02 nas1 kernel:  2-...D
May 07 04:18:02 nas1 kernel:  <TASK>
May 07 04:18:02 nas1 kernel:  } 5996 jiffies s: 1701 root: 0x4/.
May 07 04:18:02 nas1 kernel:  __schedule+0x349/0x950
May 07 04:18:02 nas1 kernel: rcu: blocking rcu_node structures (internal RCU debug):
May 07 04:18:02 nas1 kernel:  ? __pfx_rcu_gp_kthread+0x10/0x10
May 07 04:18:02 nas1 kernel: 
May 07 04:18:02 nas1 kernel: Sending NMI from CPU 0 to CPUs 2:
May 07 04:18:02 nas1 kernel:  schedule+0x5b/0xa0
May 07 04:18:02 nas1 kernel:  schedule_timeout+0x98/0x160
May 07 04:18:02 nas1 kernel:  ? __pfx_process_timeout+0x10/0x10
May 07 04:18:02 nas1 kernel:  rcu_gp_fqs_loop+0x141/0x550
May 07 04:18:02 nas1 kernel:  rcu_gp_kthread+0xd8/0x190
May 07 04:18:02 nas1 kernel:  kthread+0xe5/0x120
May 07 04:18:02 nas1 kernel:  ? __pfx_kthread+0x10/0x10
May 07 04:18:02 nas1 kernel:  ret_from_fork+0x31/0x50
May 07 04:18:02 nas1 kernel:  ? __pfx_kthread+0x10/0x10
May 07 04:18:02 nas1 kernel:  ret_from_fork_asm+0x1b/0x30
May 07 04:18:02 nas1 kernel:  </TASK>
May 07 04:18:02 nas1 kernel: rcu: Stack dump where RCU GP kthread last ran:
May 07 04:19:15 nas1 kernel: rcu: INFO: rcu_preempt detected stalls on CPUs/tasks:
May 07 04:19:15 nas1 kernel: rcu:         2-...0: (1 GPs behind) idle=0034/0/0x1 softirq=29807707/29807708 fqs=10163
May 07 04:19:15 nas1 kernel: rcu:         (detected by 4, t=23491 jiffies, g=97757405, q=24570 ncpus=8)
May 07 04:19:15 nas1 kernel: Sending NMI from CPU 4 to CPUs 2:
May 07 04:19:15 nas1 kernel: rcu: rcu_preempt kthread timer wakeup didn't happen for 2483 jiffies! g97757405 f0x0 RCU_GP_WAIT_FQS(5) ->state=0x402
May 07 04:19:15 nas1 kernel: rcu:         Possible timer handling issue on cpu=4 timer-softirq=10392482
May 07 04:19:15 nas1 kernel: rcu: rcu_preempt kthread starved for 2484 jiffies! g97757405 f0x0 RCU_GP_WAIT_FQS(5) ->state=0x402 ->cpu=4
May 07 04:19:15 nas1 kernel: rcu:         Unless rcu_preempt kthread gets sufficient CPU time, OOM is now expected behavior.
May 07 04:19:15 nas1 kernel: rcu: RCU grace-period kthread stack dump:
May 07 04:19:15 nas1 kernel: task:rcu_preempt     state:I stack:0     pid:17    ppid:2      flags:0x00004000
May 07 04:19:15 nas1 kernel: Call Trace:
May 07 04:19:15 nas1 kernel:  <TASK>
May 07 04:19:15 nas1 kernel:  __schedule+0x349/0x950
May 07 04:19:15 nas1 kernel:  ? __pfx_rcu_gp_kthread+0x10/0x10
May 07 04:19:15 nas1 kernel:  schedule+0x5b/0xa0
May 07 04:19:15 nas1 kernel:  schedule_timeout+0x98/0x160
May 07 04:19:15 nas1 kernel:  ? __pfx_process_timeout+0x10/0x10
May 07 04:19:15 nas1 kernel:  rcu_gp_fqs_loop+0x141/0x550
May 07 04:19:15 nas1 kernel:  rcu_gp_kthread+0xd8/0x190
May 07 04:19:15 nas1 kernel:  kthread+0xe5/0x120
May 07 04:19:15 nas1 kernel:  ? __pfx_kthread+0x10/0x10
May 07 04:19:15 nas1 kernel:  ret_from_fork+0x31/0x50
May 07 04:19:15 nas1 kernel:  ? __pfx_kthread+0x10/0x10
May 07 04:19:15 nas1 kernel:  ret_from_fork_asm+0x1b/0x30
May 07 04:19:15 nas1 kernel:  </TASK>
May 07 04:19:15 nas1 kernel: rcu: Stack dump where RCU GP kthread last ran:
May 07 04:19:15 nas1 kernel: CPU: 4 PID: 0 Comm: swapper/4 Tainted: P      D W  OE      6.6.44-production+truenas #1
May 07 04:19:15 nas1 kernel: Hardware name: TAROX /X11SSM-F, BIOS 3.4 10/04/2024
May 07 04:19:15 nas1 kernel: RIP: 0010:cpuidle_enter_state+0xcc/0x440
May 07 04:19:15 nas1 kernel: Code: 8a 7d 53 ff e8 35 f4 ff ff 8b 53 04 49 89 c5 0f 1f 44 00 00 31 ff e8 f3 b9 52 ff 45 84 ff 0f 85 57 02 00 00 fb 0f 1f 44 00 00 <45> 85 f6 0f 88 85 01 00 00 49 63 d6 48 8d 04 52 >
May 07 04:19:15 nas1 kernel: RSP: 0018:ffffbb3540107e90 EFLAGS: 00000246
May 07 04:19:15 nas1 kernel: RAX: ffff906f97b33440 RBX: ffff906f97b3cec8 RCX: 000000000000001f
May 07 04:19:15 nas1 kernel: RDX: 0000000000000004 RSI: 0000000022a1d501 RDI: 0000000000000000
May 07 04:19:15 nas1 kernel: RBP: 0000000000000006 R08: 0000000000000002 R09: 0000000000000ed2
May 07 04:19:15 nas1 kernel: R10: 0000000000000014 R11: ffff906f97b31fe4 R12: ffffffffb2d9b800
May 07 04:19:15 nas1 kernel: R13: 000163a7827d2088 R14: 0000000000000006 R15: 0000000000000000
May 07 04:19:15 nas1 kernel: FS:  0000000000000000(0000) GS:ffff906f97b00000(0000) knlGS:0000000000000000
May 07 04:19:15 nas1 kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
May 07 04:19:15 nas1 kernel: CR2: 00007fad05773d10 CR3: 0000000706a20006 CR4: 00000000003706e0
May 07 04:19:15 nas1 kernel: DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
May 07 04:19:15 nas1 kernel: DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
May 07 04:19:15 nas1 kernel: Call Trace:
May 07 04:19:15 nas1 kernel:  <IRQ>
May 07 04:19:15 nas1 kernel:  ? rcu_check_gp_kthread_starvation+0x120/0x1a0
May 07 04:19:15 nas1 kernel:  ? rcu_sched_clock_irq+0xda1/0x1110
May 07 04:19:15 nas1 kernel:  ? update_process_times+0x74/0xb0
May 07 04:19:15 nas1 kernel:  ? tick_sched_handle+0x21/0x60
May 07 04:19:15 nas1 kernel:  ? tick_sched_timer+0x6f/0x90
May 07 04:19:15 nas1 kernel:  ? __pfx_tick_sched_timer+0x10/0x10
May 07 04:19:15 nas1 kernel:  ? __hrtimer_run_queues+0x10f/0x2b0
May 07 04:19:15 nas1 kernel:  ? hrtimer_interrupt+0xf8/0x230
May 07 04:19:15 nas1 kernel:  ? __sysvec_apic_timer_interrupt+0x4d/0x140
May 07 04:19:15 nas1 kernel:  ? sysvec_apic_timer_interrupt+0x6d/0x90
May 07 04:19:15 nas1 kernel:  </IRQ>
May 07 04:19:15 nas1 kernel:  <TASK>
May 07 04:19:15 nas1 kernel:  ? asm_sysvec_apic_timer_interrupt+0x1a/0x20
May 07 04:19:15 nas1 kernel:  ? cpuidle_enter_state+0xcc/0x440
May 07 04:19:15 nas1 kernel:  cpuidle_enter+0x2d/0x40
May 07 04:19:15 nas1 kernel:  do_idle+0x20d/0x270
May 07 04:19:15 nas1 kernel:  cpu_startup_entry+0x2a/0x30
May 07 04:19:15 nas1 kernel:  start_secondary+0x11e/0x140
May 07 04:19:15 nas1 kernel:  secondary_startup_64_no_verify+0x18f/0x19b
May 07 04:19:15 nas1 kernel:  </TASK>
May 07 04:19:16 nas1 kernel: rcu: INFO: rcu_preempt detected expedited stalls on CPUs/tasks: { 2-...D } 24493 jiffies s: 1701 root: 0x4/.
May 07 04:19:16 nas1 kernel: rcu: blocking rcu_node structures (internal RCU debug):
May 07 04:19:16 nas1 kernel: Sending NMI from CPU 0 to CPUs 2:
May 07 04:20:28 nas1 kernel: rcu: INFO: rcu_preempt detected stalls on CPUs/tasks:
May 07 04:20:28 nas1 kernel: rcu:         2-...0: (1 GPs behind) idle=0034/0/0x1 softirq=29807707/29807708 fqs=17999
May 07 04:20:28 nas1 kernel: rcu:         (detected by 4, t=41732 jiffies, g=97757405, q=42493 ncpus=8)
May 07 04:20:28 nas1 kernel: Sending NMI from CPU 4 to CPUs 2:
May 07 04:20:28 nas1 kernel: rcu: rcu_preempt kthread starved for 2481 jiffies! g97757405 f0x0 RCU_GP_DOING_FQS(6) ->state=0x0 ->cpu=6
May 07 04:20:28 nas1 kernel: rcu:         Unless rcu_preempt kthread gets sufficient CPU time, OOM is now expected behavior.
May 07 04:20:28 nas1 kernel: rcu: RCU grace-period kthread stack dump:
May 07 04:20:28 nas1 kernel: task:rcu_preempt     state:I stack:0     pid:17    ppid:2      flags:0x00004000
May 07 04:20:28 nas1 kernel: Call Trace:
May 07 04:20:28 nas1 kernel:  <TASK>
May 07 04:20:28 nas1 kernel:  __schedule+0x349/0x950
May 07 04:20:28 nas1 kernel:  ? __pfx_rcu_gp_kthread+0x10/0x10
May 07 04:20:28 nas1 kernel:  schedule+0x5b/0xa0
May 07 04:20:28 nas1 kernel:  schedule_timeout+0x98/0x160
May 07 04:20:28 nas1 kernel:  ? __pfx_process_timeout+0x10/0x10
May 07 04:20:28 nas1 kernel:  rcu_gp_fqs_loop+0x141/0x550
May 07 04:20:28 nas1 kernel:  rcu_gp_kthread+0xd8/0x190
May 07 04:20:28 nas1 kernel:  kthread+0xe5/0x120
May 07 04:20:28 nas1 kernel:  ? __pfx_kthread+0x10/0x10
May 07 04:20:28 nas1 kernel:  ret_from_fork+0x31/0x50
May 07 04:20:28 nas1 kernel:  ? __pfx_kthread+0x10/0x10
May 07 04:20:28 nas1 kernel:  ret_from_fork_asm+0x1b/0x30
May 07 04:20:28 nas1 kernel:  </TASK>
May 07 04:20:28 nas1 kernel: rcu: Stack dump where RCU GP kthread last ran:
May 07 04:20:28 nas1 kernel: Sending NMI from CPU 4 to CPUs 6:
May 07 04:20:28 nas1 kernel: NMI backtrace for cpu 6 skipped: idling at intel_idle_ibrs+0x81/0xf0
May 07 04:20:30 nas1 kernel: rcu: INFO: rcu_preempt detected expedited stalls on CPUs/tasks: { 2-...D } 42925 jiffies s: 1701 root: 0x4/.
May 07 04:20:30 nas1 kernel: rcu: blocking rcu_node structures (internal RCU debug):
May 07 04:20:30 nas1 kernel: Sending NMI from CPU 0 to CPUs 2:
May 07 04:21:41 nas1 kernel: rcu: INFO: rcu_preempt detected stalls on CPUs/tasks:
May 07 04:21:41 nas1 kernel: rcu:         2-...0: (1 GPs behind) idle=0034/0/0x1 softirq=29807707/29807708 fqs=25823
May 07 04:21:41 nas1 kernel: rcu:         (detected by 4, t=59971 jiffies, g=97757405, q=60812 ncpus=8)
May 07 04:21:41 nas1 kernel: Sending NMI from CPU 4 to CPUs 2:
May 07 04:21:41 nas1 kernel: rcu: rcu_preempt kthread timer wakeup didn't happen for 2484 jiffies! g97757405 f0x0 RCU_GP_WAIT_FQS(5) ->state=0x402
May 07 04:21:41 nas1 kernel: rcu:         Possible timer handling issue on cpu=4 timer-softirq=10393023
May 07 04:21:41 nas1 kernel: rcu: rcu_preempt kthread starved for 2485 jiffies! g97757405 f0x0 RCU_GP_WAIT_FQS(5) ->state=0x402 ->cpu=4
May 07 04:21:41 nas1 kernel: rcu:         Unless rcu_preempt kthread gets sufficient CPU time, OOM is now expected behavior.
May 07 04:21:41 nas1 kernel: rcu: RCU grace-period kthread stack dump:
May 07 04:21:41 nas1 kernel: task:rcu_preempt     state:I stack:0     pid:17    ppid:2      flags:0x00004000
May 07 04:21:41 nas1 kernel: Call Trace:
May 07 04:21:41 nas1 kernel:  <TASK>
May 07 04:21:41 nas1 kernel:  __schedule+0x349/0x950
May 07 04:21:41 nas1 kernel:  ? __pfx_rcu_gp_kthread+0x10/0x10
May 07 04:21:41 nas1 kernel:  schedule+0x5b/0xa0
May 07 04:21:41 nas1 kernel:  schedule_timeout+0x98/0x160
May 07 04:21:41 nas1 kernel:  ? __pfx_process_timeout+0x10/0x10
May 07 04:21:41 nas1 kernel:  rcu_gp_fqs_loop+0x141/0x550
May 07 04:21:41 nas1 kernel:  rcu_gp_kthread+0xd8/0x190
May 07 04:21:41 nas1 kernel:  kthread+0xe5/0x120
May 07 04:21:41 nas1 kernel:  ? __pfx_kthread+0x10/0x10
May 07 04:21:41 nas1 kernel:  ret_from_fork+0x31/0x50
May 07 04:21:41 nas1 kernel:  ? __pfx_kthread+0x10/0x10
May 07 04:21:41 nas1 kernel:  ret_from_fork_asm+0x1b/0x30
May 07 04:21:41 nas1 kernel:  </TASK>
May 07 04:21:41 nas1 kernel: rcu: Stack dump where RCU GP kthread last ran:
May 07 04:21:41 nas1 kernel: CPU: 4 PID: 0 Comm: swapper/4 Tainted: P      D W  OE      6.6.44-production+truenas #1
May 07 04:21:41 nas1 kernel: Hardware name: TAROX /X11SSM-F, BIOS 3.4 10/04/2024
May 07 04:21:41 nas1 kernel: RIP: 0010:cpuidle_enter_state+0xcc/0x440
May 07 04:21:41 nas1 kernel: Code: 8a 7d 53 ff e8 35 f4 ff ff 8b 53 04 49 89 c5 0f 1f 44 00 00 31 ff e8 f3 b9 52 ff 45 84 ff 0f 85 57 02 00 00 fb 0f 1f 44 00 00 <45> 85 f6 0f 88 85 01 00 00 49 63 d6 48 8d 04 52 >
May 07 04:21:41 nas1 kernel: RSP: 0018:ffffbb3540107e90 EFLAGS: 00000246
May 07 04:21:41 nas1 kernel: RAX: ffff906f97b33440 RBX: ffff906f97b3cec8 RCX: 000000000000001f
May 07 04:21:41 nas1 kernel: RDX: 0000000000000004 RSI: 0000000022a1d501 RDI: 0000000000000000
May 07 04:21:41 nas1 kernel: RBP: 0000000000000006 R08: 0000000000000002 R09: 0000000000000ed2
May 07 04:21:41 nas1 kernel: R10: 0000000000000018 R11: ffff906f97b31fe4 R12: ffffffffb2d9b800
May 07 04:21:41 nas1 kernel: R13: 000163c97c3d475d R14: 0000000000000006 R15: 0000000000000000
May 07 04:21:41 nas1 kernel: FS:  0000000000000000(0000) GS:ffff906f97b00000(0000) knlGS:0000000000000000
May 07 04:21:41 nas1 kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
May 07 04:21:41 nas1 kernel: CR2: 00007f7546b01e18 CR3: 0000000706a20004 CR4: 00000000003706e0
May 07 04:21:41 nas1 kernel: DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
May 07 04:21:41 nas1 kernel: DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
May 07 04:21:41 nas1 kernel: Call Trace:
May 07 04:21:41 nas1 kernel:  <IRQ>
May 07 04:21:41 nas1 kernel:  ? rcu_check_gp_kthread_starvation+0x120/0x1a0
May 07 04:21:41 nas1 kernel:  ? rcu_sched_clock_irq+0xda1/0x1110
May 07 04:21:41 nas1 kernel:  ? update_process_times+0x74/0xb0
May 07 04:21:41 nas1 kernel:  ? tick_sched_handle+0x21/0x60
May 07 04:21:41 nas1 kernel:  ? tick_sched_timer+0x6f/0x90
May 07 04:21:41 nas1 kernel:  ? __pfx_tick_sched_timer+0x10/0x10
May 07 04:21:41 nas1 kernel:  ? __hrtimer_run_queues+0x10f/0x2b0
May 07 04:21:41 nas1 kernel:  ? hrtimer_interrupt+0xf8/0x230
May 07 04:21:41 nas1 kernel:  ? __sysvec_apic_timer_interrupt+0x4d/0x140
May 07 04:21:41 nas1 kernel:  ? sysvec_apic_timer_interrupt+0x6d/0x90
May 07 04:21:41 nas1 kernel:  </IRQ>
May 07 04:21:41 nas1 kernel:  <TASK>
May 07 04:21:41 nas1 kernel:  ? asm_sysvec_apic_timer_interrupt+0x1a/0x20
May 07 04:21:41 nas1 kernel:  ? cpuidle_enter_state+0xcc/0x440
May 07 04:21:41 nas1 kernel:  cpuidle_enter+0x2d/0x40
May 07 04:21:41 nas1 kernel:  do_idle+0x20d/0x270
May 07 04:21:41 nas1 kernel:  cpu_startup_entry+0x2a/0x30
May 07 04:21:41 nas1 kernel:  start_secondary+0x11e/0x140
May 07 04:21:41 nas1 kernel:  secondary_startup_64_no_verify+0x18f/0x19b
May 07 04:21:41 nas1 kernel:  </TASK>
May 07 04:21:44 nas1 kernel: rcu: INFO: rcu_preempt detected expedited stalls on CPUs/tasks: { 2-...D } 61357 jiffies s: 1701 root: 0x4/.
May 07 04:21:44 nas1 kernel: rcu: blocking rcu_node structures (internal RCU debug):
May 07 04:21:44 nas1 kernel: Sending NMI from CPU 0 to CPUs 2:
May 07 04:22:54 nas1 kernel: rcu: INFO: rcu_preempt detected stalls on CPUs/tasks:
May 07 04:22:54 nas1 kernel: rcu:         2-...0: (1 GPs behind) idle=0034/0/0x1 softirq=29807707/29807708 fqs=33657
May 07 04:22:54 nas1 kernel: rcu:         (detected by 4, t=78212 jiffies, g=97757405, q=79178 ncpus=8)
May 07 04:22:54 nas1 kernel: Sending NMI from CPU 4 to CPUs 2:
May 07 04:22:54 nas1 kernel: rcu: rcu_preempt kthread starved for 2480 jiffies! g97757405 f0x0 RCU_GP_DOING_FQS(6) ->state=0x0 ->cpu=0
May 07 04:22:54 nas1 kernel: rcu:         Unless rcu_preempt kthread gets sufficient CPU time, OOM is now expected behavior.
May 07 04:22:54 nas1 kernel: rcu: RCU grace-period kthread stack dump:
May 07 04:22:54 nas1 kernel: task:rcu_preempt     state:I stack:0     pid:17    ppid:2      flags:0x00004000
May 07 04:22:54 nas1 kernel: Call Trace:
May 07 04:22:54 nas1 kernel:  <TASK>
May 07 04:22:54 nas1 kernel:  __schedule+0x349/0x950
May 07 04:22:54 nas1 kernel:  ? __pfx_rcu_gp_kthread+0x10/0x10
May 07 04:22:54 nas1 kernel:  schedule+0x5b/0xa0
May 07 04:22:54 nas1 kernel:  schedule_timeout+0x98/0x160
May 07 04:22:54 nas1 kernel:  ? __pfx_process_timeout+0x10/0x10
May 07 04:22:54 nas1 kernel:  rcu_gp_fqs_loop+0x141/0x550
May 07 04:22:54 nas1 kernel:  rcu_gp_kthread+0xd8/0x190
May 07 04:22:54 nas1 kernel:  kthread+0xe5/0x120
May 07 04:22:54 nas1 kernel:  ? __pfx_kthread+0x10/0x10
May 07 04:22:54 nas1 kernel:  ret_from_fork+0x31/0x50
May 07 04:22:54 nas1 kernel:  ? __pfx_kthread+0x10/0x10
May 07 04:22:54 nas1 kernel:  ret_from_fork_asm+0x1b/0x30
May 07 04:22:54 nas1 kernel:  </TASK>
May 07 04:22:54 nas1 kernel: rcu: Stack dump where RCU GP kthread last ran:
May 07 04:22:54 nas1 kernel: Sending NMI from CPU 4 to CPUs 0:
May 07 04:22:54 nas1 kernel: NMI backtrace for cpu 0 skipped: idling at intel_idle_ibrs+0x81/0xf0
May 07 04:22:58 nas1 kernel: rcu: INFO: rcu_preempt detected expedited stalls on CPUs/tasks: { 2-...D } 79789 jiffies s: 1701 root: 0x4/.
May 07 04:22:58 nas1 kernel: rcu: blocking rcu_node structures (internal RCU debug):
May 07 04:22:58 nas1 kernel: Sending NMI from CPU 0 to CPUs 2:
May 07 04:24:07 nas1 kernel: rcu: INFO: rcu_preempt detected stalls on CPUs/tasks:
May 07 04:24:07 nas1 kernel: rcu:         2-...0: (1 GPs behind) idle=0034/0/0x1 softirq=29807707/29807708 fqs=41508
May 07 04:24:07 nas1 kernel: rcu:         (detected by 4, t=96451 jiffies, g=97757405, q=97473 ncpus=8)
May 07 04:24:07 nas1 kernel: Sending NMI from CPU 4 to CPUs 2:
May 07 04:24:07 nas1 kernel: rcu: rcu_preempt kthread starved for 2482 jiffies! g97757405 f0x0 RCU_GP_DOING_FQS(6) ->state=0x0 ->cpu=3
May 07 04:24:07 nas1 kernel: rcu:         Unless rcu_preempt kthread gets sufficient CPU time, OOM is now expected behavior.
May 07 04:24:07 nas1 kernel: rcu: RCU grace-period kthread stack dump:
May 07 04:24:07 nas1 kernel: task:rcu_preempt     state:I stack:0     pid:17    ppid:2      flags:0x00004000
May 07 04:24:07 nas1 kernel: Call Trace:
May 07 04:24:07 nas1 kernel:  <TASK>
May 07 04:24:07 nas1 kernel:  __schedule+0x349/0x950
May 07 04:24:07 nas1 kernel:  ? __pfx_rcu_gp_kthread+0x10/0x10
May 07 04:24:07 nas1 kernel:  schedule+0x5b/0xa0
May 07 04:24:07 nas1 kernel:  schedule_timeout+0x98/0x160
May 07 04:24:07 nas1 kernel:  ? __pfx_process_timeout+0x10/0x10
May 07 04:24:07 nas1 kernel:  rcu_gp_fqs_loop+0x141/0x550
May 07 04:24:07 nas1 kernel:  rcu_gp_kthread+0xd8/0x190
May 07 04:24:07 nas1 kernel:  kthread+0xe5/0x120
May 07 04:24:07 nas1 kernel:  ? __pfx_kthread+0x10/0x10
May 07 04:24:07 nas1 kernel:  ret_from_fork+0x31/0x50
May 07 04:24:07 nas1 kernel:  ? __pfx_kthread+0x10/0x10
May 07 04:24:07 nas1 kernel:  ret_from_fork_asm+0x1b/0x30
May 07 04:24:07 nas1 kernel:  </TASK>
May 07 04:24:07 nas1 kernel: rcu: Stack dump where RCU GP kthread last ran:
May 07 04:24:07 nas1 kernel: Sending NMI from CPU 4 to CPUs 3:
May 07 04:24:07 nas1 kernel: NMI backtrace for cpu 3 skipped: idling at intel_idle_ibrs+0x81/0xf0
May 07 04:24:11 nas1 kernel: rcu: INFO: rcu_preempt detected expedited stalls on CPUs/tasks: { 2-...D } 98221 jiffies s: 1701 root: 0x4/.
May 07 04:24:11 nas1 kernel: rcu: blocking rcu_node structures (internal RCU debug):
May 07 04:24:11 nas1 kernel: Sending NMI from CPU 0 to CPUs 2:
May 07 04:25:20 nas1 kernel: rcu: INFO: rcu_preempt detected stalls on CPUs/tasks:
May 07 04:25:20 nas1 kernel: rcu:         2-...0: (1 GPs behind) idle=0034/0/0x1 softirq=29807707/29807708 fqs=49183
May 07 04:25:20 nas1 kernel: rcu:         (detected by 4, t=114690 jiffies, g=97757405, q=115752 ncpus=8)
May 07 04:25:20 nas1 kernel: Sending NMI from CPU 4 to CPUs 2:
May 07 04:25:20 nas1 kernel: rcu: rcu_preempt kthread starved for 2481 jiffies! g97757405 f0x0 RCU_GP_DOING_FQS(6) ->state=0x0 ->cpu=6
May 07 04:25:20 nas1 kernel: rcu:         Unless rcu_preempt kthread gets sufficient CPU time, OOM is now expected behavior.
May 07 04:25:20 nas1 kernel: rcu: RCU grace-period kthread stack dump:
May 07 04:25:20 nas1 kernel: task:rcu_preempt     state:I stack:0     pid:17    ppid:2      flags:0x00004000
May 07 04:25:20 nas1 kernel: Call Trace:
May 07 04:25:20 nas1 kernel:  <TASK>
May 07 04:25:20 nas1 kernel:  __schedule+0x349/0x950
May 07 04:25:20 nas1 kernel:  ? __pfx_rcu_gp_kthread+0x10/0x10
May 07 04:25:20 nas1 kernel:  schedule+0x5b/0xa0
May 07 04:25:20 nas1 kernel:  schedule_timeout+0x98/0x160
May 07 04:25:20 nas1 kernel:  ? __pfx_process_timeout+0x10/0x10
May 07 04:25:20 nas1 kernel:  rcu_gp_fqs_loop+0x141/0x550
May 07 04:25:20 nas1 kernel:  rcu_gp_kthread+0xd8/0x190
May 07 04:25:20 nas1 kernel:  kthread+0xe5/0x120
May 07 04:25:20 nas1 kernel:  ? __pfx_kthread+0x10/0x10
May 07 04:25:20 nas1 kernel:  ret_from_fork+0x31/0x50
May 07 04:25:20 nas1 kernel:  ? __pfx_kthread+0x10/0x10
May 07 04:25:20 nas1 kernel:  ret_from_fork_asm+0x1b/0x30
May 07 04:25:20 nas1 kernel:  </TASK>
May 07 04:25:20 nas1 kernel: rcu: Stack dump where RCU GP kthread last ran:
May 07 04:25:20 nas1 kernel: Sending NMI from CPU 4 to CPUs 6:
May 07 04:25:20 nas1 kernel: NMI backtrace for cpu 6 skipped: idling at intel_idle+0x62/0xb0
May 07 04:25:25 nas1 kernel: rcu: INFO: rcu_preempt detected expedited stalls on CPUs/tasks: { 2-...D } 116653 jiffies s: 1701 root: 0x4/.
May 07 04:25:25 nas1 kernel: rcu: blocking rcu_node structures (internal RCU debug):
May 07 04:25:25 nas1 kernel: Sending NMI from CPU 0 to CPUs 2:
May 07 04:26:33 nas1 kernel: rcu: INFO: rcu_preempt detected stalls on CPUs/tasks:
May 07 04:26:33 nas1 kernel: rcu:         2-...0: (1 GPs behind) idle=0034/0/0x1 softirq=29807707/29807708 fqs=56852
May 07 04:26:33 nas1 kernel: rcu:         (detected by 4, t=132929 jiffies, g=97757405, q=134025 ncpus=8)
May 07 04:26:33 nas1 kernel: Sending NMI from CPU 4 to CPUs 2:
May 07 04:26:33 nas1 kernel: rcu: rcu_preempt kthread starved for 2482 jiffies! g97757405 f0x0 RCU_GP_DOING_FQS(6) ->state=0x0 ->cpu=6
May 07 04:26:33 nas1 kernel: rcu:         Unless rcu_preempt kthread gets sufficient CPU time, OOM is now expected behavior.
May 07 04:26:33 nas1 kernel: rcu: RCU grace-period kthread stack dump:
May 07 04:26:33 nas1 kernel: task:rcu_preempt     state:I stack:0     pid:17    ppid:2      flags:0x00004000
May 07 04:26:33 nas1 kernel: Call Trace:
May 07 04:26:33 nas1 kernel:  <TASK>
May 07 04:26:33 nas1 kernel:  __schedule+0x349/0x950
May 07 04:26:33 nas1 kernel:  ? __pfx_rcu_gp_kthread+0x10/0x10
May 07 04:26:33 nas1 kernel:  schedule+0x5b/0xa0
May 07 04:26:33 nas1 kernel:  schedule_timeout+0x98/0x160
May 07 04:26:33 nas1 kernel:  ? __pfx_process_timeout+0x10/0x10
May 07 04:26:33 nas1 kernel:  rcu_gp_fqs_loop+0x141/0x550
May 07 04:26:33 nas1 kernel:  rcu_gp_kthread+0xd8/0x190
May 07 04:26:33 nas1 kernel:  kthread+0xe5/0x120
May 07 04:26:33 nas1 kernel:  ? __pfx_kthread+0x10/0x10
May 07 04:26:33 nas1 kernel:  ret_from_fork+0x31/0x50
May 07 04:26:33 nas1 kernel:  ? __pfx_kthread+0x10/0x10
May 07 04:26:33 nas1 kernel:  ret_from_fork_asm+0x1b/0x30
May 07 04:26:33 nas1 kernel:  </TASK>
May 07 04:26:33 nas1 kernel: rcu: Stack dump where RCU GP kthread last ran:
May 07 04:26:33 nas1 kernel: Sending NMI from CPU 4 to CPUs 6:
May 07 04:26:33 nas1 kernel: NMI backtrace for cpu 6 skipped: idling at intel_idle_ibrs+0x81/0xf0
May 07 04:26:39 nas1 kernel: rcu: INFO: rcu_preempt detected expedited stalls on CPUs/tasks: { 2-...D } 135085 jiffies s: 1701 root: 0x4/.
May 07 04:26:39 nas1 kernel: rcu: blocking rcu_node structures (internal RCU debug):
May 07 04:26:39 nas1 kernel: Sending NMI from CPU 0 to CPUs 2:
May 07 04:27:46 nas1 kernel: rcu: INFO: rcu_preempt detected stalls on CPUs/tasks:
May 07 04:27:46 nas1 kernel: rcu:         2-...0: (1 GPs behind) idle=0034/0/0x1 softirq=29807707/29807708 fqs=64527
May 07 04:27:46 nas1 kernel: rcu:         (detected by 4, t=151168 jiffies, g=97757405, q=152301 ncpus=8)
May 07 04:27:46 nas1 kernel: Sending NMI from CPU 4 to CPUs 2:
May 07 04:27:46 nas1 kernel: rcu: rcu_preempt kthread starved for 2482 jiffies! g97757405 f0x0 RCU_GP_DOING_FQS(6) ->state=0x0 ->cpu=6
May 07 04:27:46 nas1 kernel: rcu:         Unless rcu_preempt kthread gets sufficient CPU time, OOM is now expected behavior.
May 07 04:27:46 nas1 kernel: rcu: RCU grace-period kthread stack dump:
May 07 04:27:46 nas1 kernel: task:rcu_preempt     state:I stack:0     pid:17    ppid:2      flags:0x00004000
May 07 04:27:46 nas1 kernel: Call Trace:
May 07 04:27:46 nas1 kernel:  <TASK>
May 07 04:27:46 nas1 kernel:  __schedule+0x349/0x950
May 07 04:27:46 nas1 kernel:  ? __pfx_rcu_gp_kthread+0x10/0x10
May 07 04:27:46 nas1 kernel:  schedule+0x5b/0xa0
May 07 04:27:46 nas1 kernel:  schedule_timeout+0x98/0x160
May 07 04:27:46 nas1 kernel:  ? __pfx_process_timeout+0x10/0x10
May 07 04:27:46 nas1 kernel:  rcu_gp_fqs_loop+0x141/0x550
May 07 04:27:46 nas1 kernel:  rcu_gp_kthread+0xd8/0x190
May 07 04:27:46 nas1 kernel:  kthread+0xe5/0x120
May 07 04:27:46 nas1 kernel:  ? __pfx_kthread+0x10/0x10
May 07 04:27:46 nas1 kernel:  ret_from_fork+0x31/0x50
May 07 04:27:46 nas1 kernel:  ? __pfx_kthread+0x10/0x10
May 07 04:27:46 nas1 kernel:  ret_from_fork_asm+0x1b/0x30
May 07 04:27:46 nas1 kernel:  </TASK>
May 07 04:27:46 nas1 kernel: rcu: Stack dump where RCU GP kthread last ran:
May 07 04:27:46 nas1 kernel: Sending NMI from CPU 4 to CPUs 6:
May 07 04:27:46 nas1 kernel: NMI backtrace for cpu 6 skipped: idling at intel_idle_ibrs+0x81/0xf0
May 07 04:27:53 nas1 kernel: rcu: INFO: rcu_preempt detected expedited stalls on CPUs/tasks: { 2-...D } 153517 jiffies s: 1701 root: 0x4/.
May 07 04:27:53 nas1 kernel: rcu: blocking rcu_node structures (internal RCU debug):
May 07 04:27:53 nas1 kernel: Sending NMI from CPU 0 to CPUs 2:
May 07 04:28:59 nas1 kernel: rcu: INFO: rcu_preempt detected stalls on CPUs/tasks:
May 07 04:28:59 nas1 kernel: rcu:         2-...0: (1 GPs behind) idle=0034/0/0x1 softirq=29807707/29807708 fqs=72393
May 07 04:28:59 nas1 kernel: rcu:         (detected by 4, t=169407 jiffies, g=97757405, q=170633 ncpus=8)
May 07 04:28:59 nas1 kernel: Sending NMI from CPU 4 to CPUs 2:
May 07 04:28:59 nas1 kernel: rcu: rcu_preempt kthread starved for 2482 jiffies! g97757405 f0x0 RCU_GP_DOING_FQS(6) ->state=0x0 ->cpu=6
May 07 04:28:59 nas1 kernel: rcu:         Unless rcu_preempt kthread gets sufficient CPU time, OOM is now expected behavior.
May 07 04:28:59 nas1 kernel: rcu: RCU grace-period kthread stack dump:
May 07 04:28:59 nas1 kernel: task:rcu_preempt     state:I stack:0     pid:17    ppid:2      flags:0x00004000
May 07 04:28:59 nas1 kernel: Call Trace:
May 07 04:28:59 nas1 kernel:  <TASK>
May 07 04:28:59 nas1 kernel:  __schedule+0x349/0x950
May 07 04:28:59 nas1 kernel:  ? __pfx_rcu_gp_kthread+0x10/0x10
May 07 04:28:59 nas1 kernel:  schedule+0x5b/0xa0
May 07 04:28:59 nas1 kernel:  schedule_timeout+0x98/0x160
May 07 04:28:59 nas1 kernel:  ? __pfx_process_timeout+0x10/0x10
May 07 04:28:59 nas1 kernel:  rcu_gp_fqs_loop+0x141/0x550
May 07 04:28:59 nas1 kernel:  rcu_gp_kthread+0xd8/0x190
May 07 04:28:59 nas1 kernel:  kthread+0xe5/0x120
May 07 04:28:59 nas1 kernel:  ? __pfx_kthread+0x10/0x10
May 07 04:28:59 nas1 kernel:  ret_from_fork+0x31/0x50
May 07 04:28:59 nas1 kernel:  ? __pfx_kthread+0x10/0x10
May 07 04:28:59 nas1 kernel:  ret_from_fork_asm+0x1b/0x30
May 07 04:28:59 nas1 kernel:  </TASK>
May 07 04:28:59 nas1 kernel: rcu: Stack dump where RCU GP kthread last ran:
May 07 04:28:59 nas1 kernel: Sending NMI from CPU 4 to CPUs 6:
May 07 04:28:59 nas1 kernel: NMI backtrace for cpu 6 skipped: idling at intel_idle_ibrs+0x81/0xf0
May 07 04:29:06 nas1 kernel: rcu: INFO: rcu_preempt detected expedited stalls on CPUs/tasks: { 2-...D } 171949 jiffies s: 1701 root: 0x4/.
May 07 04:29:06 nas1 kernel: rcu: blocking rcu_node structures (internal RCU debug):
May 07 04:29:06 nas1 kernel: Sending NMI from CPU 0 to CPUs 2:
May 07 04:30:12 nas1 kernel: rcu: INFO: rcu_preempt detected stalls on CPUs/tasks:
May 07 04:30:12 nas1 kernel: rcu:         2-...0: (1 GPs behind) idle=0034/0/0x1 softirq=29807707/29807708 fqs=80219
May 07 04:30:12 nas1 kernel: rcu:         (detected by 4, t=187646 jiffies, g=97757405, q=188932 ncpus=8)
May 07 04:30:12 nas1 kernel: Sending NMI from CPU 4 to CPUs 2:
May 07 04:30:12 nas1 kernel: rcu: rcu_preempt kthread starved for 2481 jiffies! g97757405 f0x0 RCU_GP_DOING_FQS(6) ->state=0x0 ->cpu=6
May 07 04:30:12 nas1 kernel: rcu:         Unless rcu_preempt kthread gets sufficient CPU time, OOM is now expected behavior.
May 07 04:30:12 nas1 kernel: rcu: RCU grace-period kthread stack dump:
May 07 04:30:12 nas1 kernel: task:rcu_preempt     state:I stack:0     pid:17    ppid:2      flags:0x00004000
May 07 04:30:12 nas1 kernel: Call Trace:
May 07 04:30:12 nas1 kernel:  <TASK>
May 07 04:30:12 nas1 kernel:  __schedule+0x349/0x950
May 07 04:30:12 nas1 kernel:  ? __pfx_rcu_gp_kthread+0x10/0x10
May 07 04:30:12 nas1 kernel:  schedule+0x5b/0xa0
May 07 04:30:12 nas1 kernel:  schedule_timeout+0x98/0x160
May 07 04:30:12 nas1 kernel:  ? __pfx_process_timeout+0x10/0x10
May 07 04:30:12 nas1 kernel:  rcu_gp_fqs_loop+0x141/0x550
May 07 04:30:12 nas1 kernel:  rcu_gp_kthread+0xd8/0x190
May 07 04:30:12 nas1 kernel:  kthread+0xe5/0x120
May 07 04:30:12 nas1 kernel:  ? __pfx_kthread+0x10/0x10
May 07 04:30:12 nas1 kernel:  ret_from_fork+0x31/0x50
May 07 04:30:12 nas1 kernel:  ? __pfx_kthread+0x10/0x10
May 07 04:30:12 nas1 kernel:  ret_from_fork_asm+0x1b/0x30
May 07 04:30:12 nas1 kernel:  </TASK>
May 07 04:30:12 nas1 kernel: rcu: Stack dump where RCU GP kthread last ran:
May 07 04:30:12 nas1 kernel: Sending NMI from CPU 4 to CPUs 6:
May 07 04:30:12 nas1 kernel: NMI backtrace for cpu 6 skipped: idling at intel_idle_ibrs+0x81/0xf0
May 07 04:30:20 nas1 kernel: rcu: INFO: rcu_preempt detected expedited stalls on CPUs/tasks: { 2-...D } 190381 jiffies s: 1701 root: 0x4/.
May 07 04:30:20 nas1 kernel: rcu: blocking rcu_node structures (internal RCU debug):
May 07 04:30:20 nas1 kernel: Sending NMI from CPU 0 to CPUs 2:
-- Boot c63f3c1cd0184f979ab8fc0139719ba2 --
May 07 10:55:33 nas1 kernel: Linux version 6.6.44-production+truenas (root@tnsbuilds01.tn.ixsystems.net) (gcc (Debian 12.2.0-14) 12.2.0, GNU ld (GNU Binutils for Debian) 2.40) #1 SMP PREEMPT_DYNAMIC Mon Mar 31 1>

Stab in the dark: disable ALL the power daemon control stuff on that box. All the C-state stuff is related to sleep states. It could be throttling the CPU down when it’s idle enough and then choking on waking it back up. Let it run full blast, turn off anything, anywhere, that mentions power states (in the GUI, in the BIOS, anywhere), and see if it survives a few days.
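If you want to sanity-check from the OS whether the deep idle states are actually being entered, something like this should do it, assuming the cpupower utility is present (it may not ship with Scale out of the box):

  # show the available C-states and how often each has been entered
  cpupower idle-info
  # as a test, disable every idle state with a latency above 0 (i.e. everything deeper than POLL) until reboot
  cpupower idle-set -D 0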

Disclaimer: I’m not the hardware ninja and there’s no guarantee this will help at all, just a suggestion.

For now it hasn’t crashed on me yet, but I had another issue: the NFS server died (same issue as reported here: TrueNAS NFS random crash - #47 by Dhiru), so I had to reboot anyway. I’ve been having a lot of NFS issues since I moved to Scale… but that’s for another topic. Let’s see if I can keep it stable now (if I don’t have to reboot for other issues).
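For the NFS part, this is roughly what I check before giving up and rebooting (unit name assumed from plain Debian, which Scale is based on):

  # is the kernel NFS server still running, and what did it last log?
  systemctl status nfs-server
  journalctl -u nfs-server -b --no-pager | tail -n 50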