So I have a failed boot-pool, on a system that has been pretty rock solid for maybe 10+ years. The boot pool was actually running on a USB drive. The system was upgraded from FreeNAS 11 to TrueNAS CORE about 3 years ago, and was running at CORE 13.1 RELEASE p7 at the time of failure. I still can’t believe that even now, with the boot-pool suspended and the system pretty much unconfigurable, the remaining pools and NFS shares are still accessible. However, I need to get the system up and running properly, and here is my dilemma:
I have no configuration backup. This is minor really; my pools were not encrypted, so I should be able to import them without problems.
Because it was running 13.1 RELEASE p7, I am not sure which release to build the new boot-pool with. Available for download are 13.0-u6.8 and 13.3-u1.2.
The Solace console that was presented when I connected a crash cart to the box became unresponsive after I attempted to enter the shell from the prompt, which leaves me with no graceful way to shut down the system. (The IPMI interface is available, but sending an ACPI shutdown command probably won’t work if there is no OS to interpret the command.)
Any suggestions or guidance would be appreciated. I understand running TrueNAS on a USB stick is definitely no longer recommended (it was set up this way before I “inherited” the system). I plan to mirror the boot pool once I get it back up and running. Just looking for some solid advice.
Mirroring is not necessary. Just use a proper SSD, SATA or NVMe, as your new boot drive.
FreeBSD 13.1 is CORE 13.0. 13.3 is… 13.3—and CORE 13.3 is EoL.
I don’t understand whether TrueNAS is still running (“remaining pools and NFS shares are still accessible” would suggest so). If so, save the configuration.
IPMI needs no OS (the BMC firmware is its own OS). And if TrueNAS is not running to receive an ACPI command, there’s nothing to shut down.
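For what it’s worth, here is a minimal sketch of how the BMC could be asked for a power action over the network. It assumes `ipmitool` is installed on another machine and that the BMC accepts IPMI-over-LAN. The helper names `ipmi_power_command` and `run_ipmi_power`, the address `192.0.2.10`, and the user `admin` are placeholders for illustration, not anything taken from your setup.

```python
import subprocess

# Chassis power actions used in this sketch. "soft" asks the OS for an ACPI
# shutdown (so it needs a responsive OS); "off" cuts power regardless of the OS.
ALLOWED_ACTIONS = ("status", "soft", "off", "cycle")


def ipmi_power_command(host, user, action="status"):
    """Build the ipmitool argument list for a chassis power action."""
    if action not in ALLOWED_ACTIONS:
        raise ValueError(f"unsupported power action: {action!r}")
    # -I lanplus: IPMI v2.0 over LAN, -H: BMC address, -U: BMC user name.
    return ["ipmitool", "-I", "lanplus", "-H", host, "-U", user,
            "chassis", "power", action]


def run_ipmi_power(host, user, action="status"):
    """Run the command; with no password option given, ipmitool prompts for one."""
    return subprocess.run(ipmi_power_command(host, user, action), check=False)


if __name__ == "__main__":
    # Dry run: only print the command that would be executed.
    print(" ".join(ipmi_power_command("192.0.2.10", "admin", "status")))
```

If `soft` gets no reaction because the OS is hung, `off` is the remote equivalent of holding down the power button.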
First, it is unclear from your info, but I’m assuming the shell is not available from the GUI or SSH, and the GUI is not available either. That means that even if the system is “up”, it’s broken. With that as the starting point, you’re faced with a fresh install without a backup. I’d start with 13.0 and install to a new boot device. Once that succeeds, you can do some web searches on recovering the config, or just start over and import the pools. Also, I would not expect a forced power cycle (i.e., pulling the power cord) to be a problem, as long as you first ensure no data writes to the box are in progress. Finally, if you don’t have a current data backup and the system is available via SMB, consider trying to copy off important data. Good luck.
John
Yes, you are correct: the web GUI, shell, SSH, and any other form of interacting with TrueNAS/FreeBSD are not available. Initially, when I was unable to access the web GUI, I connected a crash cart to the box and was greeted with messages indicating I/O issues, and then saw the message that the boot-pool had been suspended. It was sitting at a Solace prompt with 11 options, one of which was 9) Shell. I attempted to enter the shell and the terminal just does nothing. It accepts input but no commands work, no help, nothing. One of the options was reboot, but I can’t even return to the prompt menu. There were 4 other pools running on this device, with several NFS shares configured. Those shares are still somewhat accessible. If client machines had already established connections with a share, they seem to be able to read from and write to it. However, attempting to mount the shares on new devices returns RPC errors. It took a while to copy most of the important data to an alternate location, and I am ready to fail over. So, at this point I am not too worried about data. Rebuilding data would not be a fun activity, but I am confident the other 4 pools should import easily into a fresh install. I was primarily worried because, as mentioned before, the version running at the time of the crash was CORE 13.1 RELEASE p7, which I guess is an interim release. I just wanted to make sure I rebuild with the correct available release of CORE. Once I have things up and operational, I will plan my eventual migration to SCALE.