TrueNAS stopped booting recently for no reason, blinking cursor only

Hello TrueNAS community! I got my custom NAS server for free from an associate of mine in mid-2020 and have been running it ever since. It’s based on a Lenovo ThinkCentre M91p tower workstation, with 2x80GB Silicon Power SSDs for boot (RAID 1, 80GB total) and 4x2TB Seagate HDDs for data (RAID 0+1, 4TB total).

It’s been working fine all this time, but this morning I woke up and turned on my NAS (I have to shut it down every night because I live with roommates and a strict power bill, so it’s only on until bedtime). It gave me two beeps (the POST beep and a no-keyboard beep; the latter is irrelevant), which is normal, but 2 minutes later it played the same two beeps again, which concerned me.

I plugged in a monitor and saw a message along the lines of “This is a TrueNAS data disk, please boot from the install drive”, which was odd. So I went into the BIOS and set the boot order so that, within the RAID array, the two SSDs are on top and the data drives on the bottom, with the RAID arrays above the SATA slots but below the USB slots (later on I put the whole RAID array first). After rebooting, it was stuck on a blinking cursor for 10 minutes before giving me the error message “Error 1962: No operating system found”.

I’m worried that the boot SSDs are dead in some form, considering it was working fine yesterday with no updates or anything else that affected the system (the last update was a couple of months ago). People say it’s an issue with Lenovo motherboards having trouble booting with CSM/Legacy instead of UEFI, but I tried setting it to that and it just resulted in the DisplayPort connector not working (until I plugged the monitor in via VGA and changed it back to CSM/Legacy). Plus, like I said, it stopped working out of the blue.

I’m lost and out of options. I am sure the cables for the drives are fully connected and the RAM and RAID controller card are fully seated. If anyone knows what’s happening, please let me know. I feel miserable because I didn’t back up the config for my install, so I need to find a way to make it work again and back up the config before it stops working again.

It sounds like your SSDs have failed. What version of TrueNAS are you running? And when you talk about drive configuration, does RAID1 mean RAIDZ1 with ZFS, and does RAID0+1 mean mirrored vdevs? Are you using anything BIOS-based for RAID?

If both your SSDs are smoked, that shouldn’t be a big deal, since you can replace them, reinstall TrueNAS (I would install the same version), and then import your configuration and pool.

I don’t recall what version I was using since I didn’t screenshot or save any information regarding my system. Like I said, I didn’t save a config of the install. I don’t think the data drives are encrypted, but I don’t remember, so I feel pretty dreadful right now. In terms of the data pool array, it’s RAID 0+1 since 2 of each are split. Again, I think the data drives are fine. Also, the RAID arrays for both the SSDs and the data pool drives are connected to an LSI RAID controller card.

Also big and potentially dreadful update, my motherboard stopped booting from the LSI RAID card. It was working fine when I turned it on to see if it would boot from the SSDs (it still doesn’t) and I was making a Ventoy drive with GParted on it to see if I could access the drives and check the status. However, when I started my NAS, it booted straight into the Lenovo logo instead of the LSI RAID boot sequence. Again, I’m sure everything is connected properly so I feel scared that the controller card might be cooked, but I’m unsure.

So, if you are using hardware RAID for all of this, you MAY be able to recover data, but it won’t be through any ZFS magic. It is probably best to determine whether the RAID card is shot; if so, a replacement card might be the quickest way to recovery (I have not used hardware RAID in years, so I am not sure).

Super quick update, I cleaned my server of any dust and tried to unplug and plug back in some of the cables on the RAID controller card alongside wiggling it gently to see if it fits. I then started it back up and the RAID card booted successfully before the Lenovo logo. Either it was seated improperly, the dust might’ve done something to it, or worse it’s dying as we speak. Again, I’m very unsure.

I’m also in GParted Live and can’t tell if the drives are dead or not. GParted is reading them just fine. The boot SSDs have a 40GB partition in them as “freenas-boot” with the zfs file system.

Also I’m a dumb idiot because the SSDs are actually 64GB each, not 80GB. Whoops.

GParted claims they’re fine, and I ran smartctl to check the SMART status; based on the output I think they might be fine (Reallocated_Sector_Ct is 100, raw 0; Power_On_Hours is 1000, raw 23520; all attributes have the type “Pre-fail”), but again, I’m unsure. If they are all fine, I think that means the OS crapped itself file-wise rather than hardware-wise and I may need to reinstall it.
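
For anyone reading SMART numbers like the ones above, here is a rough sketch of how the columns from `smartctl -A` can be read. The sample rows are made up around the two raw values I quoted (the flag and threshold columns are assumptions, and I used 100 for the normalized values), and `parse_attributes` and `looks_failed` are just hypothetical helper names. The idea is that an attribute counts as failed when its normalized value drops to or below its threshold, and “Pre-fail” is only the attribute’s category, not a sign that the drive is failing.

```python
# Illustrative rows in `smartctl -A` column order:
# ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
# Only the two raw values come from the post; every other column is assumed.
SAMPLE = """\
  5 Reallocated_Sector_Ct   0x0033   100   100   010    Pre-fail  Always       -       0
  9 Power_On_Hours          0x0032   100   100   000    Pre-fail  Always       -       23520
"""


def parse_attributes(text):
    """Return {attribute_name: columns} for each attribute row in text."""
    attrs = {}
    for line in text.splitlines():
        parts = line.split()
        # Attribute rows start with a numeric ID and have at least 10 columns.
        if len(parts) < 10 or not parts[0].isdigit():
            continue
        attrs[parts[1]] = {
            "value": int(parts[3]),
            "worst": int(parts[4]),
            "thresh": int(parts[5]),
            "type": parts[6],
            "when_failed": parts[8],
            "raw": parts[9],
        }
    return attrs


def looks_failed(attr):
    """True if the attribute is failing now or has been flagged as failed."""
    return attr["value"] <= attr["thresh"] or attr["when_failed"] != "-"


attrs = parse_attributes(SAMPLE)
for name, attr in attrs.items():
    print(name, "raw:", attr["raw"], "failed:", looks_failed(attr))

# Raw Power_On_Hours is usually plain hours, so convert it to days.
print(int(attrs["Power_On_Hours"]["raw"]) / 24)  # 980.0
```

A raw Reallocated_Sector_Ct of 0 means no sectors have been remapped, and 23520 power-on hours works out to 980 days of running time.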

I’ve made TrueNAS installation media with the latest stable version of TrueNAS CORE on it, and it asked me whether I wanted to upgrade the drives since it detected an existing install on them, which suggests the files and everything are fine.

However, after upgrading, it tried to boot and it just ended up doing the same blinking thing for 5 minutes before going to the Intel MAC Address screen thing (which leads to the 1962: Operating system not found error).

I really don’t know what’s happening at this point, because if the SSDs are fine and the data pools are fine, then it has to be a motherboard issue or something else. It can’t be the RAID controller card, because the TrueNAS installer and GParted can both see the drives clearly and read/write to them just fine. I’ve already reset the CMOS and changed the battery during all of this (before using GParted and so forth), so I dunno what’s happening, especially since it was working just the day before. I’m at a loss.

Bumping because based on my previous post and a day of thinking, it’s either the bootloader on TrueNAS has gone kaput, or a setting on my motherboard changed out of nowhere. I’m going to say the former, something related to TrueNAS’ bootloader being broken, and yesterday I did “update” to no results.

When I did the update, I selected “Install in a new boot environment”, and it said it successfully updated (again, there’s no way the SSDs are cooked if GParted Live and the SMART tests report they’re “fine”, plus the TrueNAS installer can clearly read the config files and write to the drives as well), but it went back to the blinking cursor in the upper left corner before falling through to the last boot method (i.e. PXE, then the OS not found error).

The other option, which I didn’t go with, is “Format the boot device”. Even though the update screen says my config will stay the same, that option wipes the SSDs, which makes me worried it won’t preserve the config files. I don’t know if this is the way to go, but if it actually wipes my config I want to find a way to preserve it first. I know I sound like an absolute noob at this, but in actuality I’m panicking and extremely worried since I don’t want to lose stuff like Plex jail metadata among other important settings. Again, I’m aware the data on my drive pools is safe; it’s the main boot drives I’m concerned about.
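
For anyone in the same spot, here is a minimal sketch of making a timestamped copy of the config database before formatting anything. It assumes the boot pool has already been imported read-only from some live environment with ZFS support, and that the config lives at `data/freenas-v1.db` under the mount point (the usual location on TrueNAS CORE as far as I know, but check your own system). The `/mnt/bootpool` and `/mnt/usb` paths and the `backup_config` helper are hypothetical.

```python
import shutil
import time
from pathlib import Path


def backup_config(db_path, dest_dir):
    """Copy the config database into dest_dir under a timestamped name."""
    src = Path(db_path)
    if not src.is_file():
        raise FileNotFoundError(f"config database not found: {src}")
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    stamp = time.strftime("%Y%m%d-%H%M%S")
    target = dest / f"{src.stem}-{stamp}{src.suffix}"
    shutil.copy2(src, target)  # copy2 also preserves modification times
    # Refuse to call it a backup unless the sizes match.
    if target.stat().st_size != src.stat().st_size:
        raise OSError(f"size mismatch after copying to {target}")
    return target


# Hypothetical usage; both paths are assumptions, not taken from this system:
# backup_config("/mnt/bootpool/data/freenas-v1.db", "/mnt/usb/truenas-backup")
```

The copy goes to a separate device on purpose, so it survives whatever the installer does to the boot SSDs.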

IT’S FIXED!

I did the setup again using the TrueNAS CORE installer, but this time I selected “Format the boot device” and I changed the boot mode in the BIOS from “Auto” to “Legacy” and everything is working! The config is the same and everything is working, so I’m very happy.

I’m going to guesstimate that maybe TrueNAS has a bug that occasionally messes up the bootloader (like I said, all the drives are fine), and reinstalling said bootloader fixed it. Regardless, I’m happy everything is working and I’m hoping something like that doesn’t happen again.