TrueNAS SCALE crashing every day

I’ve been experiencing daily crashes on my TrueNAS SCALE system since shortly after I installed it a few weeks ago. It’s running on a UGreen DXP4800 Plus (Intel Pentium Gold 8505; 32GB DDR5; 100GB NVMe boot disk; 2x 4TB NVMe SSD; 4x 16TB Seagate X18 Exos).

I installed 25.04.1, then upgraded to 25.04.2, and then 25.04.2.1. All three versions experienced crashes.

Initially I was using 2x 32GB DDR5. I suspected faulty RAM, so I ran memtest and it failed quickly with many errors. (I returned the RAM; the seller claims he ran memtest on it in a couple of his machines afterwards and it passed, so it’s possible it was not seated properly in my NAS.) Anyway, I swapped in 2x 16GB DDR5 and ran memtest for a couple of hours with no errors, but the crashes kept happening.

I used a Finnix live USB to run long SMART tests on the boot SSD and the 2x 4TB SSDs. All passed. I also ran long SMART tests on all the 16TB X18 Exos drives earlier, when I first put them into TrueNAS, and they passed too.

The crashes happen a couple of hours after I restart TrueNAS; there is usually at least one within 24 hours. When the crashes were due to the faulty RAM, the system would just freeze completely with no additional output on the screen (the last image was the TrueNAS login/options page showing the system IP and the numbered options to choose from) and nothing strange in /var/log/messages.

Since the RAM was replaced, I can see some output on the KVM screen when a crash happens, although only the last part of it. Can anybody help me decipher what might be causing the crashes? Each screenshot below is from a different crash.

Additional information:

  • I checked the BIOS. C-states are disabled.

I’m very new to TrueNAS so I’d appreciate pointers on the relevant logs to look at and produce here. Thanks all.
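For anyone wanting a starting point on the logs: a minimal sketch, assuming the box logs to /var/log/messages and that each boot starts with a "kernel: Linux version" line (as in the excerpt posted later in this thread). It collects the last few lines written before each boot, which is usually where any hint about a crash ends up. The function name and the marker constant are made up for illustration.

```python
from collections import deque

# Assumption: the first kernel line of every boot contains this text,
# as it does in the /var/log/messages excerpt in this thread.
BOOT_MARKER = "kernel: Linux version"


def lines_before_each_boot(lines, context=5):
    """Yield (boot_line, previous_lines) for every boot marker found.

    previous_lines holds up to `context` log lines written just before
    that boot, i.e. the last things logged before the crash or reboot.
    """
    recent = deque(maxlen=context)
    for raw in lines:
        line = raw.rstrip("\n")
        if BOOT_MARKER in line:
            yield line, list(recent)
            recent.clear()  # start collecting afresh for the next boot
        else:
            recent.append(line)


def report(path="/var/log/messages", context=5):
    """Print the lines that preceded each boot in the given log file."""
    with open(path, errors="replace") as fh:
        for boot, before in lines_before_each_boot(fh, context):
            print("=== boot at", boot[:15])
            for entry in before:
                print("   ", entry)
```

Calling report() on the NAS (as a user that can read the log) prints one block per boot. If a block is empty or shows only routine messages, the machine most likely froze before anything could be written, which matches what I saw with the bad RAM.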

How is the system connected to the LAN, via the 10Gb Atlantic NIC? In your place, the first thing I would try is to disable it in the BIOS and use the 2.5Gb Intel one. The second screenshot seems to me to show a driver failure.

The general way to debug cases like this is to run the system as minimally as possible, check stability, then add one variable at a time and check stability again until the problem comes back. Yep, it requires a lot of patience. Hopefully someone more expert can give you better tips.

Did you reinstall after swapping out the RAM?

It kind of looks like faulty RAM still, or an unstable CPU.

Yes, it’s connected on the 10Gb port at the moment. It has a bridge set up.
I’ll try to use the 2.5Gb port to see if it’s more stable. Thanks.

I did not re-install. I’m more confident now that it’s not the RAM as when it was the RAM the system just completely froze up without even a chance to display any additional messages on screen.

I’m certainly hoping it’s not the CPU, that would be a pain. But I might have to wipe TrueNAS and run a different OS to test if the 10Gb port is not the cause.

Do reinstall the OS.

Anything you did while running faulty RAM is suspect; it could be corrupt in places that are hard to figure out. Unfortunately it could also have corrupted your pools in such a way that they trigger crashes, but first handle the easy things, like the OS.

Update: I reinstalled TrueNAS on the 100GB boot SSD, went to bed, and woke up to find it had crashed again. I’m gonna try the 2.5Gb port now.

Edit to add screenshot of latest crash:

I went down the rabbit hole of Aquantia “atlantic” issues with TrueNAS last night and thought that might be it, but I’m gutted. I disabled the 10Gb NIC in the BIOS and used the 2.5Gb NIC, but it still crashes.

I had a similar issue with similar hardware. Try a different OS next and run it for a few days to see what happens. In my case it appears to be a damaged motherboard; I’m waiting on a replacement to confirm.

It still crashes, but the error seems different every time (though always about a corrupted pointer).
This is kind of worrying. If a fresh install didn’t help and the RAM is OK, what @JohnnyM advised sounds like a useful thing to check before continuing.

There’s something in /var/log/messages at the time when it crashed last night. Software or CPU related?

Aug 13 02:38:38 nassie kernel: show_signal_msg: 11 callbacks suppressed
Aug 13 02:38:38 nassie kernel: asyncio_loop[1031]: segfault at 1c411af ip 000000000046274f sp 00007ffc89653cc0 error 6 in python3.11[6274f,41f000+2b6000] likely on CPU 3 (core 9, socket 0)
Aug 13 02:38:38 nassie kernel: Code: 2e 0f 1f 84 00 00 00 00 00 66 90 41 55 41 54 55 48 89 f5 53 48 89 fb 48 83 ec 08 48 8b 77 08 48 8b 46 60 48 81 c0 0f 84 2f 01 <00> 00 48 8b 90 e0 00 00 00 48 85 d2 74 3a 48 89 ee ff d2 48 3d f0
Aug 13 02:38:38 nassie kernel: truenas_audit_h[2790]: segfault at 1c411af ip 000000000046274f sp 00007fffefcf3110 error 6 in python3.11[6274f,41f000+2b6000] likely on CPU 3 (core 9, socket 0)
Aug 13 02:38:38 nassie kernel: Code: 2e 0f 1f 84 00 00 00 00 00 66 90 41 55 41 54 55 48 89 f5 53 48 89 fb 48 83 ec 08 48 8b 77 08 48 8b 46 60 48 81 c0 0f 84 2f 01 <00> 00 48 8b 90 e0 00 00 00 48 85 d2 74 3a 48 89 ee ff d2 48 3d f0
Aug 13 02:39:12 nassie kernel: Linux version 6.12.15-production+truenas (root@tnsbuilds02) (gcc (Debian 12.2.0-14) 12.2.0, GNU ld (GNU Binutils for Debian) 2.40) #1 SMP PREEMPT_DYNAMIC Wed Aug  6 17:37:09 UTC 2025
Aug 13 02:39:12 nassie kernel: Command line: BOOT_IMAGE=/ROOT/25.04.2.1@/boot/vmlinuz-6.12.15-production+truenas root=ZFS=boot-pool/ROOT/25.04.2.1 ro libata.allow_tpm=1 amd_iommu=on iommu=pt kvm_amd.npt=1 kvm_amd.avic=1 intel_iommu=on zfsforce=1 nvme_core.multipath=N usb-storage.quirks=2109:0715:u
Aug 13 02:39:12 nassie kernel: x86/split lock detection: #AC: crashing the kernel on kernel split_locks and warning on user-space split_locks
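To make the two segfault lines above easier to compare, here is a rough sketch that pulls the fields out of a kernel "segfault at" message and decodes the error number using the standard x86 page-fault error code bits (bit 0: page was present, bit 1: write access, bit 2: user mode). The regex is written only against the format seen in this log and the function names are made up for illustration, so treat it as a helper for reading the log, not a diagnosis.

```python
import re

# Matches the "segfault at" line format seen in the log above.
SEGFAULT_RE = re.compile(
    r"(?P<comm>\S+)\[(?P<pid>\d+)\]: segfault at (?P<addr>[0-9a-f]+) "
    r"ip (?P<ip>[0-9a-f]+) sp (?P<sp>[0-9a-f]+) error (?P<err>[0-9a-f]+)"
    r"(?: in (?P<image>[^\[\s]+)\[)?"
)


def decode_error(code):
    """Decode the x86 page-fault error code bits into named flags."""
    return {
        "page_present": bool(code & 0x1),   # 0 means the page was not mapped
        "write": bool(code & 0x2),          # 0 means the access was a read
        "user_mode": bool(code & 0x4),      # 0 means the fault was in kernel mode
        "reserved_bit": bool(code & 0x8),
        "instruction_fetch": bool(code & 0x10),
    }


def parse_segfault(line):
    """Return the parsed fields of a segfault log line, or None if no match."""
    m = SEGFAULT_RE.search(line)
    if m is None:
        return None
    err = int(m.group("err"), 16)  # the kernel prints the error code in hex
    return {
        "comm": m.group("comm"),
        "pid": int(m.group("pid")),
        "addr": int(m.group("addr"), 16),
        "ip": int(m.group("ip"), 16),
        "sp": int(m.group("sp"), 16),
        "image": m.group("image"),
        "error": err,
        "flags": decode_error(err),
    }
```

Run against the two lines above, this shows both processes (asyncio_loop and truenas_audit_h) faulting at the same instruction pointer in python3.11 on the same address 1c411af, with error 6 meaning a user-mode write to a page that was not mapped.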

I will wipe the 100GB boot disk and put on a different OS soon to test, just haven’t found the time to do it yet.

Update: Still happening even when using Proxmox with TrueNAS virtualized. Proxmox itself was rebooting or crashing. It’s almost certainly a hardware problem.

Thanks for all the help and pointers given.

Now the fun part is trying to get some support/RMA from UGreen.