I tried to access the web GUI of my NAS (ElectricEel-24.10.2.1), but I got the message:
“Connecting to TrueNAS … Make sure the TrueNAS system is powered on and connected to the network.”
Some services running on the device were still working fine.
When I connected a monitor to the device, I got the following screen (see attached image).
Even after plugging in a keyboard, I wasn’t able to type anything.
After rebooting the device, everything started working again.
Any ideas on what could have caused this or what checks I should perform?
To be honest (aka. Transparent), you didn’t provide much data other than an error message. There is not enough context.
Are you running Docker containers or VMs? What is your system configuration (RAM, storage, etc.)?
My guess is that your system had a partial crash. Why it happened is a complete guess: system instability, not enough RAM, improper configuration, and the list goes on.
The good part is that rebooting recovered the machine.
My advice: monitor how long the system remains up and running. Write down what fails and how long the system was operational before it failed. Was anything going on at the time of the failure, a scrub for example? Have you run a recent RAM test and CPU stability test? Do you have good airflow (no bad fans)?
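One way to keep that record automatically is a small heartbeat logger. This is only a minimal sketch, assuming a Linux host where `/proc/uptime` and `/proc/loadavg` are readable; the function names, the log file name, and the 60-second interval are made up for illustration and are not part of TrueNAS.

```python
import os
import time
from datetime import datetime, timezone
from pathlib import Path


def read_uptime_seconds(proc_uptime="/proc/uptime"):
    """Return seconds since boot (first field of the Linux /proc/uptime file)."""
    return float(Path(proc_uptime).read_text().split()[0])


def read_load_average(proc_loadavg="/proc/loadavg"):
    """Return the 1-, 5- and 15-minute load averages as floats."""
    fields = Path(proc_loadavg).read_text().split()
    return float(fields[0]), float(fields[1]), float(fields[2])


def format_heartbeat(now, uptime_s, load):
    """Build one log line: timestamp, uptime in hours, load averages."""
    return (
        f"{now.isoformat(timespec='seconds')} "
        f"uptime_h={uptime_s / 3600:.2f} "
        f"load1={load[0]:.2f} load5={load[1]:.2f} load15={load[2]:.2f}"
    )


def run_forever(log_path="heartbeat.log", interval_s=60):
    """Append a heartbeat line every interval_s seconds until interrupted."""
    while True:
        line = format_heartbeat(
            datetime.now(timezone.utc), read_uptime_seconds(), read_load_average()
        )
        with open(log_path, "a") as fh:
            fh.write(line + "\n")
            fh.flush()
            os.fsync(fh.fileno())  # push to disk so the last line survives a hang
        time.sleep(interval_s)
```

Calling `run_forever()` with a path on persistent storage leaves a trail you can read after the next incident: the last timestamp shows roughly when the system stopped writing, and a drop in `uptime_h` marks a reboot. It will not explain a failure by itself, but it narrows down when to look in the logs.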
Before upgrading to ElectricEel-24.10.2.1 (from 24.10.2), I had over 40 days of continuous uptime.
The setup hasn’t changed (however, it is a test machine):
* Same services running in Docker
* Regular disk scrubs (1 SSD + 1 NVMe in mirror, 1 SSD in USB enclosure used as boot)
* No disk errors reported
* 16 GB RAM
RAM and CPU testing
I’m considering running a RAM test and a CPU stability test. Any recommended tools or best practices for that?
Airflow
There might be an issue with airflow, but again, I’m open to suggestions on what to monitor or how to rule that out properly. The SSD temperature averages 30°C, and the CPU is at around 40°C.
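For tracking temperatures over time rather than spot-checking them, here is a minimal sketch, assuming the kernel exposes sensors as thermal zones under `/sys/class/thermal` (this depends on the hardware and drivers, and some boards expose none). The helper names and the 80°C limit are arbitrary choices for illustration. Drive temperatures are normally read through SMART instead and are not covered here.

```python
from pathlib import Path


def millidegrees_to_celsius(raw):
    """Convert a sysfs temperature string (millidegrees Celsius) to degrees Celsius."""
    return int(raw.strip()) / 1000.0


def read_thermal_zones(base="/sys/class/thermal"):
    """Return {zone name and type: temperature in Celsius} for every readable zone."""
    readings = {}
    for zone in sorted(Path(base).glob("thermal_zone*")):
        try:
            label = (zone / "type").read_text().strip()
            temp = millidegrees_to_celsius((zone / "temp").read_text())
        except (OSError, ValueError):
            continue  # skip zones that cannot be read or parsed
        readings[f"{zone.name}:{label}"] = temp
    return readings


def over_threshold(readings, limit_c=80.0):
    """Return only the readings above limit_c (an arbitrary example limit)."""
    return {name: temp for name, temp in readings.items() if temp > limit_c}
```

Logging the output of `read_thermal_zones()` every minute or so would show whether temperatures climb before a hang, which a single reading taken after a reboot cannot reveal.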
Logs & Diagnostics
Are there specific logs or tools I should look into to understand what caused the failure?
The strange part is that all services were still responsive before the incident, so I’m not sure what triggered it.
Any help or pointers would be appreciated!