I recently moved the six HDDs from my old CORE system to a new motherboard, CPU, HBA card, PSU, etc. This ‘new’ computer had previously been running Win10 for years with (as far as I can tell) no problems.
Now, it appears to crash maybe once every 2 days, with no rhyme or reason.
After a crash, it’s inaccessible via the web GUI, of course. On the machine itself, the internal fans and LEDs are still on, but there is no output to the monitor, I can’t SSH in, and the LEDs on the attached USB keyboard are no longer lit. So far I’ve bypassed the onboard Realtek NIC, removed the graphics card and switched to onboard graphics, and run memtest overnight, which came back clean.
When I force a reboot, TrueNAS recognizes that it’s been the victim of an unscheduled reboot, but doesn’t give me any more info than that.
I did notice that after the latest restart, my pool was degraded because it was unable to read one of the HDDs. I shut the machine down, unplugged and re-seated all the SATA data and power cables, then rebooted, and that fixed it.
Where would I even start to diagnose this? I can’t reproduce it at all; it seems random.
TrueNAS CORE 13.0-U6.1
AMD Ryzen 5 1600
Gigabyte B450M DS3H
80GB DDR4 RAM
Dell H200 6Gbps SAS HBA (= LSI 9211-8i), IT mode
10Gtek 10/100/1000Mbps Gigabit NIC (Intel 82576)
6 HDDs in RAIDZ1
1 NVMe boot drive
1 flash drive as cloned boot pool