Came home from vacation and wanted to double-check the health of my system. The UI is unavailable. I can ping the system on the network but cannot SSH in (it asks for a password, then just hangs).
No idea what went wrong so I plugged a monitor in.
This is what I can see
If this is not enough to determine the issue, is there a place I can gather more stats?
Seems like this could potentially break my RAID if the system suddenly panics or needs to be physically shut down.
It looks like the system may have run out of RAM. Can you provide hardware details and what version of TrueNAS this is?
The output of less /var/log/messages, which will be very long, would be the next step once you’ve rebooted the system. Feel free to DM me the output, or a debug file.
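Since that log will be very long, here is a rough sketch of one way to narrow it down before reading or sharing it. It only filters for kernel messages that commonly accompany memory exhaustion or stalled tasks. The pattern list and the function names (`find_suspect_lines`, `scan_file`) are my own assumptions for illustration, not anything TrueNAS ships, and the list is certainly not exhaustive.

```python
import re

# Kernel message fragments that often show up around memory exhaustion
# or hung tasks. This list is an assumption and is not exhaustive.
PATTERNS = re.compile(
    r"out of memory|oom-killer|oom_reaper|blocked for more than \d+ seconds",
    re.IGNORECASE,
)


def find_suspect_lines(lines):
    """Return (line_number, text) pairs for lines matching PATTERNS."""
    hits = []
    for number, line in enumerate(lines, start=1):
        if PATTERNS.search(line):
            hits.append((number, line.rstrip("\n")))
    return hits


def scan_file(path="/var/log/messages"):
    """Scan a log file on disk; undecodable bytes are replaced, not fatal."""
    with open(path, errors="replace") as handle:
        return find_suspect_lines(handle)
```

After the reboot, something like `for n, text in scan_file(): print(n, text)` would print only the matching lines with their line numbers. An empty result does not rule out a memory problem, since the system may have locked up before anything was written to disk.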
The system was just set up for testing. No traffic or any applications/shares/etc. It was just sitting there for a burn in. Then I went on vacation. So it wouldn’t have run out of memory from usage, maybe a memory leak?
24.04.0 had issues with aggressive swap usage, which could lead to systems freezing. Because of this, swap has been disabled in 24.04.1 and 24.04.2. Try a restart, update the system to 24.04.2, let it run for a while, and see if the problem recurs.
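If you want to keep an eye on swap while the system runs, a minimal sketch along these lines reads the standard Linux `/proc/meminfo` file and reports how much swap is in use. The helper names (`parse_meminfo`, `swap_used_kb`, `read_meminfo`) are hypothetical ones made up for this example.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo text into {field: number} (kB for most fields)."""
    info = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue  # skip anything that is not "Name: value"
        parts = rest.split()
        if parts and parts[0].isdigit():
            info[name.strip()] = int(parts[0])
    return info


def swap_used_kb(info):
    """Swap in use, in kB; 0 when swap is disabled (SwapTotal is 0)."""
    return info.get("SwapTotal", 0) - info.get("SwapFree", 0)


def read_meminfo(path="/proc/meminfo"):
    """Read and parse the live meminfo file on a Linux system."""
    with open(path) as handle:
        return parse_meminfo(handle.read())
```

Running `swap_used_kb(read_meminfo())` periodically should show whether swap is being touched at all. If swap really is disabled after the update, `SwapTotal` should read 0 and so should the result.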
Thanks, LarsR. Do you think that was my issue? As I mentioned, I had a very basic setup with no actual usage on the system, and it was just sitting there for 60 days when this happened.
Does the log file indicate this type of issue? Why would it be using swap if there was 0 space used and 0 users accessing it?
Can’t say for sure, but I was going off of Nick’s comment about running out of memory. Given the posts I remember about the .0 version, with aggressive swap slowing systems down and even some reports of systems crashing, there should be no harm in upgrading to .2.
In either case, updating from 24.04.0 to 24.04.2 should be done considering the known issues with the former.
High swap usage on 24.04.0 was leading to a massive amount of IOWAIT, which would cause system lockups similar to what you are seeing.
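For anyone who wants to watch for this, here is a hedged sketch that estimates the iowait share of CPU time from two samples of the aggregate `cpu` line in `/proc/stat` (fields in order: user, nice, system, idle, iowait, irq, softirq, steal, guest, guest_nice). The function names are hypothetical and the 5 second default interval is an arbitrary choice.

```python
import time


def parse_cpu_line(stat_text):
    """Return the aggregate 'cpu' counters from /proc/stat text as ints."""
    for line in stat_text.splitlines():
        fields = line.split()
        if fields and fields[0] == "cpu":
            return [int(value) for value in fields[1:]]
    raise ValueError("no aggregate 'cpu' line found")


def iowait_percent(before, after):
    """Percentage of CPU time spent in iowait between two samples."""
    deltas = [b - a for a, b in zip(before, after)]
    # Sum only the first eight fields: guest time is already counted
    # inside user and nice, so adding it again would double count.
    total = sum(deltas[:8])
    if total <= 0:
        return 0.0
    return 100.0 * deltas[4] / total  # index 4 is iowait


def sample_iowait(interval=5.0, path="/proc/stat"):
    """Take two samples `interval` seconds apart and return iowait percent."""
    with open(path) as handle:
        before = parse_cpu_line(handle.read())
    time.sleep(interval)
    with open(path) as handle:
        after = parse_cpu_line(handle.read())
    return iowait_percent(before, after)
```

A persistently high value from `sample_iowait()` on an otherwise idle box would be consistent with the swap behaviour described above, though it cannot prove that swap is the cause on its own.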
Re-installing won’t be necessary, since the .1 and .2 updates were only minor bug fixes to address the swap issue and no major features were added or removed.
Pass 6 and no errors. Should I chalk it up to a software issue? Is there no way to look in the debug file and confirm that’s what the issue is? I’d hate to build this box out and start putting my data on it, only to have it lead to issues.