System:
- Supermicro H13SSL-NT with AMD EPYC 9124 (16-core 3GHz)
- 256GB DDR5 ECC JEDEC RAM
- Chelsio T6225-CR 2x100Gb NIC
- LR-Link LRNV9F24 PCIe4.0x16 Retimer adapter
- Boot: Seagate 512GB M.2 FireCuda 530
- 2 x Broadcom LSI 9500-16i HBAs (fw ver 37.00.00.00)
- AIC BBPHHD40013B backplane (LSI 35X28)
- 60 x Seagate Exos X20, 20TB HDDs (6 × 10 raidz2)
- 2 x Micron 7450 U.3 3TB SSDs for metadata (special vdev, mirror) (latest fw)
- 1 x 16TB Solidigm D5-P5336 PCIe SSD (it’s a scratch disk for users)
- 4 x AcBel R1CA2801A PSU
Behavior:
Since upgrading from CORE to SCALE [25.10.1], the T62100 NIC has stopped working, so the NAS is mostly unusable (onboard NICs notwithstanding).
- lspci sees it normally
- ifconfig and ip link don’t see it at all
- dmesg and lspci -vv reports keep giving these errors:
cxgb4 0000:01:00.0: can't ioremap BAR 0: [??? 0x00000000 flags 0x0]
cxgb4 0000:01:00.0: cannot map device registers
So the hardware is seen, but the driver fails to initialize.
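For anyone who wants to check the same thing without digging through lspci -vv, here is a minimal Python sketch that reads the sysfs resource file for the device. It assumes the address 0000:01:00.0 from the dmesg lines above and the usual "start end flags" layout of that file; the helper names are my own. An all-zero line only means that slot is unused or has no address assigned, so treat the result as a hint.

```python
from pathlib import Path

# Assumed device address, taken from the dmesg lines above.
DEVICE = "0000:01:00.0"


def parse_resource(text):
    """Parse a sysfs PCI 'resource' file into (start, end, flags) tuples."""
    regions = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 3:
            continue  # skip anything that is not "start end flags"
        start, end, flags = (int(f, 16) for f in fields)
        regions.append((start, end, flags))
    return regions


def unassigned(regions, indexes=(0, 2, 4)):
    """Return the indexes (BAR 0, 2, 4 by default) whose line is all zeros."""
    return [i for i in indexes if i < len(regions) and regions[i] == (0, 0, 0)]


if __name__ == "__main__":
    path = Path("/sys/bus/pci/devices") / DEVICE / "resource"
    if path.exists():
        print("All-zero BARs:", unassigned(parse_resource(path.read_text())))
    else:
        print(f"{path} not found")
```

Only BAR 0, 2 and 4 are checked by default because those are the 64-bit BARs mentioned in the kernel messages; the odd-numbered lines hold no separate address for 64-bit BARs.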
Tried:
- Moving it to a different slot.
- These BIOS settings:
  - Above 4G Decoding → Enabled
  - Re-Size BAR → Disabled
  - SR-IOV / BME DMA mitigation → Disabled
  - IOMMU → reverts to Auto (due to dependency on APIC?)
  - PCIe AER → tried disabled & enabled
  No change.
- Added the kernel parameters pci=realloc=on and pci=nocrs.
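To confirm the parameters are really active on the running kernel, something like the following rough Python sketch can compare them against /proc/cmdline. The function name is made up for illustration, and it only matches the parameters written as two separate tokens (it would not recognize a combined form such as pci=realloc=on,nocrs).

```python
from pathlib import Path

# The two parameters mentioned above.
WANTED = ("pci=realloc=on", "pci=nocrs")


def missing_params(cmdline, wanted=WANTED):
    """Return the wanted parameters that are absent from a kernel command line."""
    present = set(cmdline.split())
    return [p for p in wanted if p not in present]


if __name__ == "__main__":
    proc = Path("/proc/cmdline")
    if proc.exists():
        print("Missing parameters:", missing_params(proc.read_text()))
    else:
        print("/proc/cmdline not found (not a Linux system?)")
```

An empty list means both parameters made it onto the booted kernel's command line.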
After that, I was able to get
pci [..] BAR 0 [mem … 64bit]: assigned
and the same for BAR 2, BAR 4, etc., but now I’m stuck with
cxgb4: probe with driver cxgb4 failed with error -5
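For reference, the -5 in that message is a negated errno value, and errno 5 is EIO, a generic I/O error, so by itself it doesn't say which step of the probe failed. A small Python sketch for decoding such codes (the function name is just illustrative):

```python
import errno
import os


def describe_kernel_error(code):
    """Map a negative kernel return code such as -5 to its errno name and text."""
    number = abs(code)
    name = errno.errorcode.get(number, "UNKNOWN")
    return name, os.strerror(number)


if __name__ == "__main__":
    name, text = describe_kernel_error(-5)
    print(f"-5 -> {name}: {text}")
```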
I’m out of ideas that don’t involve replacing the NIC. This one should work. I picked Chelsio because it is supported and recommended by iX.
Should I open a bug at this point? I can provide lspci and dmesg output.