I recently replaced my motherboard in an attempt to solve an HBA I/O error issue (I’ll add context at the end). Since then, an error message has been showing up on the local shell:
“irq 16: nobody cared (try booting with the “irqpoll” option)”
plus the resulting traces.
CPU: 12 PID: 0 Comm: swapper/12 Tainted: P OE 6.6.44-production+truenas #1
Hardware name: Micro-Star International Co., Ltd. MS-7D06/MPG Z590 GAMING CARBON WIFI (MS-7D06), BIOS 1.60 01/14/2022
Call Trace:
<IRQ>
dump_stack_lvl+0x47/0x60
__report_bad_irq+0x2b/0xc0
note_interrupt+0x2c2/0x300
handle_irq_event+0x6f/0x80
handle_fasteoi_irq+0x7c/0x210
__common_interrupt+0x3c/0xa0
common_interrupt+0x81/0xa0
</IRQ>
<TASK>
asm_common_interrupt+0x26/0x40
RIP: 0010:cpuidle_enter_state+0xcc/0x440
Code: 8a 7d 53 ff e8 35 f4 ff ff eb 53 04 49 89 c5 0f 1f 44 09 06 31 ff e8 f3 b9 52 ff 45 84 ff 0f 85 57 02 00 00 fb 0f 1f 44 00 00 <45>
RSP: 0018:ffffa0e3c019be90 EFLAGS: 00000246
RAX: ffff901eff733440 RBX: ffff901eff73d0c0 RCX: 000000000000001f
RDX: 000000000000000c RSI: 00000000238e38e3 RDI: 0000000000000000
RBP: 0000000000000001 R08: 0000000000000000 R09: 000000000000000b
R10: 0000000000000008 R11: ffff901eff731fe4 R12: ffffffff8579b800
R13: 0000003274e241a2 R14: 0000000000000001 R15: 0000000000000000
There is typically more output after this, but the next TrueNAS shell prompt comes up at this point.
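To see which drivers are registered on IRQ 16, /proc/interrupts can be checked. Here is a minimal Python sketch of that, assuming the usual layout of the file: a header row of CPU columns, then one row per IRQ with per-CPU counts, the interrupt controller, the trigger type, and a comma-separated list of handler names. The function names `handlers_for_irq` and `read_irq_handlers` are made up for illustration, and the device names in the comment are hypothetical examples, not output from this system.

```python
from typing import List


def handlers_for_irq(interrupts_text: str, irq: int) -> List[str]:
    """Return the handler names listed for one IRQ in /proc/interrupts text.

    Assumes the common layout: header row "CPU0 CPU1 ...", then rows like
    " 16:  0  100000  IO-APIC  16-fasteoi  i801_smbus, ahci[0000:03:00.0]"
    (device names here are hypothetical examples).
    """
    lines = interrupts_text.splitlines()
    if not lines:
        return []
    n_cpus = len(lines[0].split())  # one header column per CPU
    for line in lines[1:]:
        label, sep, rest = line.partition(":")
        if not sep or label.strip() != str(irq):
            continue
        fields = rest.split()
        tail = fields[n_cpus:]  # skip the per-CPU counters
        # tail[0] is the interrupt controller, tail[1] the hwirq/trigger type;
        # everything after that is the comma-separated handler list.
        names = " ".join(tail[2:])
        return [name.strip() for name in names.split(",") if name.strip()]
    return []


def read_irq_handlers(irq: int, path: str = "/proc/interrupts") -> List[str]:
    """Read the live file (Linux only) and return the handlers for one IRQ."""
    with open(path) as f:
        return handlers_for_irq(f.read(), irq)
```

Calling `read_irq_handlers(16)` on the server should list whatever is attached to that interrupt line, which may help tie the message to a specific card or onboard controller.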
Here are some details of my system:
i7-11700K + Z590 + 64GB RAM
boot drive connected to a PCIe x1 SATA port multiplier (not used for anything else)
4x16TB pool (mirror/striped) connected directly to the motherboard
4x4TB pool (mirror/striped), half the drives connected to the motherboard, half to an LSI HBA
2 hotswap bays with “warm” spares in them connected to the LSI HBA
1 2TB NVMe SSD for apps
1 10G NIC
What I’ve tried:
updating BIOS
reinstalling TrueNAS
changing cards/pcie slots around
plugging the boot drive into the motherboard
What to do next:
hopefully you all can help me figure out how to make it go away
do nothing? Besides the notification, it doesn’t seem to affect the server’s functioning.
put my old motherboard back in.
The historical HBA issue:
In the past I’ve had occasional read/write errors on drives connected to my HBA. The drives themselves are always OK. It’s probably a heat issue, since I’m using a consumer desktop chassis, although I do have a side panel fan blowing air towards the HBA as well as a small fan strapped directly to its heatsink running at full tilt. My old motherboard had 4 SATA ports, which ran the 4x16TB pool. I wanted to reduce the load on the HBA, and the new motherboard lets me plug more drives directly into the motherboard.