Varying Unexpected Shutdowns

I currently use TrueNAS SCALE 23.10.2. Why it isn’t 24 is also part of the issue.

The setup is a Gigabyte B550M-DS3H with a Ryzen 5 3500X CPU, 16GB of 3200MHz (XMP) RAM, and a GeForce GTX 1660 Super. TrueNAS is installed on an NVMe SSD and Windows on a separate NVMe SSD.

To recap the timeline: I first started with Proxmox, with TrueNAS running inside it. I wasn’t really satisfied with that because there was too much overhead in tuning everything to work together. I never had issues with shutdowns at that point, even after installing Windows on another SSD, but I decided I wanted to run just TrueNAS, since it has all the VM features I may need when I set up HASS. So I did. I installed TrueNAS 24.xx.xx as a replacement for Proxmox on the same NVMe drive, and that is when the shenanigans started.
There are times when it’s idle and the whole system just partially turns off, by which I mean TrueNAS becomes inaccessible but the fans go haywire and spin fast. I could only ever use it for a few minutes before it did that. (To be clear, I don’t really know how to view the logs because I’m fairly new to this.) When I boot into Windows, though, it stays up even if I leave it idling or downloading a file for the whole night. So I thought it might be a faulty install and tried TrueNAS 23.10.xx, then installed Windows again, because the EFI is installed on TrueNAS’s partition table too, and that destroys the boot manager for Windows. THIS TIME, it stays up for a fairly long time, an hour or more even, and then it shuts off. My guess is that it turns off when I go idle and don’t touch the server at all for a few minutes.

As for the remedies I’ve tried, here is a semi-TL;DR for faster diagnosis: I reset the BIOS, turned virtualization off and on, turned off XMP, tried the server both plugged into an AVR and not, and did clean reinstalls of TrueNAS versions 24.xx.xx and 23.xx.xx.

This sounds like it may be a kernel panic.
Do you see anything on the console / display when this happens?

It all happens fast and unexpectedly, so I can’t really see anything on the console before it goes out. Does that get saved to a log of some sort, or should I check on it manually (prolly with a video)?

I’d expect it to stay on console in that case.

I would get a debug from the TrueNAS WebUI and create a bug report in Jira. If you haven’t modified anything in the OS, they will be able to give you better support than I can.

If you are on Dragonfish you can do this via the WebUI (with a descriptive subject and message, of course).

If not, you can just log into / sign up for Jira and submit a ticket there: iXsystems TrueNAS Jira.

Will they reach out to me there? Thanks for the help, essinghigh. I’ll try what I can for now while waiting for their response.

You’ll need to sign up on their Jira portal so that you get emails when the bug ticket gets an update.

Will it be safe for now to keep my data stored in the datasets I already have?

I don’t want to say explicitly that it will be safe, but it should be okay. As long as you aren’t writing to the disks when a shutdown occurs you should be able to continue reading from them as normal.

While waiting for iXsystems to respond to a bug ticket, you could take a look through /var/log. /var/log/messages might have some interesting pointers.

The systemd journal might also catch any possible failures in middleware.
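
For the journal, something along these lines might be a starting point. It is only a sketch: it assumes journalctl is available in the TrueNAS shell and that the journal is kept on disk across reboots. If the journal only lives in memory, nothing from before the shutdown will show up. The function name prev_boot_errors is made up for this example.

```shell
#!/bin/sh
# Sketch: show error-level messages from the boot that ended in the shutdown.
# Assumption: the journal is persistent, so the previous boot is still there.

prev_boot_errors() {
    # -b -1       previous boot
    # -p err      priority "err" and worse
    # -n 200      last 200 matching lines
    # --no-pager  print straight to the terminal
    journalctl -b -1 -p err -n 200 --no-pager
}

if command -v journalctl >/dev/null 2>&1; then
    # List the boots the journal still knows about, then query the previous one.
    journalctl --list-boots --no-pager || echo "could not list boots"
    prev_boot_errors || echo "no journal entries for the previous boot"
else
    echo "journalctl not found on this system"
fi
```

Adding -k to the journalctl call limits the output to kernel messages, which is where a panic would normally be reported. Running it as root (or with sudo) may be needed to see system entries.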

How do I access that from the shell? With nano, e.g. nano /var/log/messages?
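
nano would open it, but for just reading a log a read-only viewer such as less /var/log/messages is usually the safer choice, and tail -n 100 /var/log/messages shows only the end of the file. To narrow things down, a rough sketch like the one below can search the file for lines that often accompany a crash or hang. The keyword list is only a guess at what might be relevant, not an exhaustive set, and scan_log is a made-up helper name.

```shell
#!/bin/sh
# Sketch: search a log file for lines that often accompany a crash or hang.
# Assumption: the keyword list below is a starting guess, adjust as needed.

LOG_FILE="${1:-/var/log/messages}"

scan_log() {
    # -i ignore case, -n prefix each match with its line number,
    # -E extended regular expression
    grep -inE 'panic|oops|machine check|hardware error|watchdog|thermal|call trace' "$1"
}

if [ -r "$LOG_FILE" ]; then
    # Only the last 50 matches, which are the ones closest to the shutdown.
    scan_log "$LOG_FILE" | tail -n 50
else
    echo "cannot read $LOG_FILE (try running as root)"
fi
```

If nothing matches, that does not rule out a hardware problem. A hard power cut or hang can stop the system before anything is written to disk.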