5 day reboots?

Since upgrading from Electric Eel to Fangtooth my system becomes unresponsive and I have to hard reboot it roughly every 5 days.

I updated through all versions to the current RC and it is still happening. I can revert back to EE as I haven’t upgraded any pools, but it would be nicer to figure out the cause.

To that end are there any logs I should be looking at? Prior to this update my system ran for months on end without issue.

You should start by giving complete details on your hardware, OS, pools, apps, VMs, etc. Provide as much detail as you can and steps you have tried to take to diagnose. We can only go off what is posted here.

Joes Rules to Asking for Help

Yep, sorry, I was waiting for the system to come back up before I took a screenshot.
No apps, no containers, just storage and 2 VMs. I don’t know which logs to check, hence I’m asking; I’ve never had a problem before.

In lieu of actual detailed hardware information I will fall back on the more generic advice:

  • Bad RAM → Run memtest86 overnight and see how the hardware fares.
  • A PSU that’s going bad → Do you have a different PSU to test with? (if modular, do not reuse the cables from one PSU with another!)
  • Boot drive that’s starting to fail → Make sure you save a backup of your TrueNAS config so you can do a reinstall on a different boot device, if need be.

Thanks, I’ll just revert then; it seems too much of a coincidence that an upgrade killed some hardware. I figured there would be logs worth checking, but if not, such is life.
I used to have all that info in my sig on the forum; I will see if I can add it here.

Your pool is 80% full. That is why it is in the red and can easily cause issues. Increase the storage capacity or remove some data from the pool.

It’s a good point; I have two 8 TB drives in transit to swap with two 4 TB ones. This has been happening for a few months though, and I believe it began at around 75%.

Do you have any background processes (like SMART tests on NVMe) that could coincide with the hangs?
Have you tried powering off the VMs to see if the problem persists?
Which mainboard are you using specifically? Just to understand whether a Realtek NIC is involved.
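To see what was running just before a hang, one option is to read the tail of the previous boot's journal after the hard reboot. The sketch below is only a rough helper: it assumes a systemd-based system where `journalctl` is available and the journal is kept across reboots, which may not hold on every install. The function names are made up for illustration.

```python
import subprocess


def previous_boot_journal_cmd(lines: int = 200, priority: str = "warning") -> list[str]:
    """Build a journalctl command for the last messages of the previous boot."""
    return [
        "journalctl",
        "--boot=-1",               # the boot before the current one
        f"--priority={priority}",  # this severity and anything more severe
        f"--lines={lines}",        # only the most recent entries
        "--no-pager",
    ]


def tail_previous_boot(lines: int = 200, priority: str = "warning") -> str:
    """Run the command and return its output, or a note if journalctl is missing."""
    try:
        result = subprocess.run(
            previous_boot_journal_cmd(lines, priority),
            capture_output=True,
            text=True,
            check=False,
        )
    except FileNotFoundError:
        return "journalctl not found on this system"
    # With no persistent journal, journalctl reports the problem on stderr.
    return result.stdout or result.stderr


if __name__ == "__main__":
    print(tail_previous_boot())
```

If the last entries before the hang line up with a scheduled job such as a SMART test, scrub, or snapshot task, that is a lead worth following.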

When a pool is at or above 80% full, overhead increases and performance drops. The system becomes stressed, and the pool fragments to the point where it gets very sluggish and, depending on the hardware, eventually non-responsive, which in turn makes the whole system non-responsive. If a reboot restores responsiveness for a short period, then you are, IMO, already in this phase with the hardware you are using and need to do something about it now, not later. This has been explained many times in the forums, and I believe it is even in the docs. I think your issues will subside once you get usage back down to around 60% or less.
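For anyone who wants to keep an eye on that 80% mark, here is a minimal sketch. It assumes you feed it text in the shape of `zpool list -Hp -o name,size,allocated` (tab-separated, sizes in bytes). The function name and the sample numbers are made up for illustration, and the pool-level figure may not match exactly what the UI shows.

```python
def pools_over_threshold(zpool_output: str, threshold_pct: float = 80.0):
    """Return [(pool_name, used_pct)] for pools at or above threshold_pct."""
    flagged = []
    for line in zpool_output.splitlines():
        if not line.strip():
            continue  # skip blank lines
        name, size, allocated = line.split("\t")[:3]
        used_pct = 100.0 * int(allocated) / int(size)
        if used_pct >= threshold_pct:
            flagged.append((name, round(used_pct, 1)))
    return flagged


if __name__ == "__main__":
    # Made-up sample: one pool at 80% and one at 10%.
    sample = (
        "tank\t8000000000000\t6400000000000\n"
        "boot-pool\t250000000000\t25000000000\n"
    )
    print(pools_over_threshold(sample))
```

Passing `threshold_pct=60.0` gives an earlier warning, in line with the 60% target mentioned above.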

The motherboard is an ASRock H370 Pro, so it has an Intel gigabit NIC. Hopefully it’s just a space issue; I will know in a few days.

Aha, this is a network problem!
As I’m away from home this week, when the NAS disappeared after only 3 days I decided to reboot my router and voilà, the NAS was back again.
I think a port is failing on my ageing Asus ax88u. This would never happen if we were allowed to have wifi! :laughing:
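For anyone chasing a similar "NAS vanished" symptom, a rough way to tell a hung box from a broken network path is to probe both the NAS and the router from another machine on the LAN. This is only a sketch: the addresses and ports in the example are placeholders, not my real setup, and `diagnose` is a made-up helper that just words the obvious conclusions.

```python
import socket


def tcp_reachable(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:  # refused, timed out, unreachable, etc.
        return False


def diagnose(nas_ok: bool, router_ok: bool) -> str:
    """Turn the two probe results into a rough hint about where to look."""
    if nas_ok:
        return "NAS reachable"
    if router_ok:
        return "router answers but NAS does not: check the NAS, its cable, or its port"
    return "nothing answers: check the probing machine's own link or the router"


if __name__ == "__main__":
    # Placeholder addresses and ports; substitute your own.
    nas_ok = tcp_reachable("192.168.1.10", 443)
    router_ok = tcp_reachable("192.168.1.1", 80)
    print(diagnose(nas_ok, router_ok))
```

A failed probe alone does not prove the NAS has hung; if its local console still responds, the fault is somewhere on the network path, as it was here.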

wifi?

yes, you know, the thing that’s actually more reliable than LAN these days. :sweat_smile: