Hello,
I hadn’t touched this server since I ran the last update, and today I noticed that it was offline.
When I boot the machine, it runs through the startup messages, then immediately triggers the shutdown actions and turns off.
I’ve managed to record the display output of the IPMI KVM interface here: somup . com/cTiIcAN1sR
The shutdown starts around 1:56 in the video; I’m not sure what is triggering the shutdown actions.
About the server: it runs an EPYC CPU, 256GB of RAM, and about 160TB of storage (Storinator 35).
I am not entirely sure when this started or what is causing it and I can’t find anything relevant in the display recording.
One thing worth noting: the server is connected to a UPS monitoring service. I’m not sure if it is relevant, but I’ve seen some posts about that, and on reviewing the recording I did notice a message about the UPS in the startup messages.
Other notes:
- I tried loading the following versions:
  - 24.10.2
  - 24.10.1
  - 24.04.2.5
  - 23.10.2
They all have the same problem.
Any help would be greatly appreciated.
Thanks
Try unplugging the server from the UPS and plugging it directly into your power source. It could be a bad UPS. That is just a guess. I didn’t notice anything in the log when I glanced at it.
You may also want to check for IPMI or motherboard BIOS updates.
Do you have any custom settings on startup, or is this system all defaults? No tunables, custom startup scripts, etc.?
Do you know the time stamp of the UPS message? It may help others to see it.
The config is pretty standard, I don’t have any custom startup settings. It should all be default.
The UPS monitoring connection is done through a separate service; I have a Synology device that is connected to the UPS and serves the UPS status to the TrueNAS server over the network.
I’m not sure how to get the timestamp of the UPS message. Both my UPS units are up and running.
Do you have a current backup of your configuration file?
You may want to try disconnecting the TrueNAS server from the network so it can’t talk to the Synology, and plug the TrueNAS server directly into power, skipping the UPS for now.
You can also try to disconnect all storage except the boot drive and try booting.
Another thought is to try booting a live Linux USB. If it won’t boot a live Linux, it may be hardware. We need to try to narrow things down, since you don’t have much to go on.
Quick update: I ended up booting into single-user mode from GRUB by adding systemd.unit=rescue.target to the boot command.
Once in single-user mode, I went and disabled the UPS-related services:
journalctl -u nut-monitor
journalctl -u nut-server
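For anyone following along: `journalctl -u` only prints a unit's log, so it is useful for seeing what the UPS services reported, but it does not disable anything. Disabling would normally be done with `systemctl`. Below is a minimal sketch, assuming the unit names are the same `nut-monitor` and `nut-server` shown above. The `disable_ups_units` function and the `RUN` variable are hypothetical names used only for this illustration, and by default the commands are printed rather than executed.

```shell
# Assumption: unit names are taken from the journalctl commands in this post.
UPS_UNITS="nut-monitor nut-server"

# Hypothetical helper. RUN defaults to "echo", so the commands are only
# printed. Set RUN="" (empty) to actually run them as root.
disable_ups_units() {
    for unit in $UPS_UNITS; do
        # "disable --now" stops the unit and prevents it from starting at boot.
        ${RUN-echo} systemctl disable --now "$unit"
    done
}

disable_ups_units
```

Running it as-is just lists the two `systemctl` commands, which makes it easy to check them before executing anything from the rescue shell.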
I’m not sure why the UPS monitoring was stuck in a shutdown mode. I experienced a power outage some time ago and I suspect that was the cause, but I’m not entirely sure why it would remain stuck in a shutdown state.
I was able to get TrueNAS back up after disabling the UPS services.
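One possible way to check whether the UPS server is still reporting a critical state before re-enabling the services is to look at its `ups.status` value. This is only a sketch of that idea, not a confirmed diagnosis of what happened here. NUT's monitor normally starts a shutdown when the status contains `FSD` (forced shutdown), or both `OB` (on battery) and `LB` (low battery). The function name `status_is_critical`, the UPS name `ups`, and the host `synology.local` are placeholders I made up for the example.

```shell
# Hypothetical helper: returns 0 (true) if a NUT ups.status string would
# normally cause the monitoring client to begin a shutdown.
status_is_critical() {
    # Pad with spaces so each flag can be matched as a whole word.
    case " $1 " in
        *" FSD "*) return 0 ;;   # forced shutdown flag set by the UPS server
    esac
    case " $1 " in
        *" OB "*)
            case " $1 " in
                *" LB "*) return 0 ;;   # on battery AND low battery
            esac
            ;;
    esac
    return 1
}

# Example use (assumption: UPS name "ups" and host "synology.local" are
# placeholders, replace them with your own values):
# status=$(upsc ups@synology.local ups.status)
# if status_is_critical "$status"; then
#     echo "UPS still reports a critical state: $status"
# fi
```

If the reported status is something like `OL` (online), the UPS side looks healthy and the stuck state is more likely on the client.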