Issues with booting: zpool set cachefile=/data/zfs/zpool.cache

Hi there, first time posting, so I apologize if I have misunderstood any forum decorum.

I am a graduate student who has inherited a FreeBSD TrueNAS 12.2 release 6 system. Over the summer, we had a brownout that resulted in a failed motherboard, but all the drives were fine. We replaced the board and it was running fine until yesterday, when I noticed a corrupted db file (can’t remember which one exactly). After some brief googling, I believed it would be resolved by the next scheduled scrub (which was supposed to be Sunday). Today, the NAS restarted and then would not complete the boot process. When we plugged in a GPU and monitor, we saw that it had frozen at “zpool set cachefile=/data/zfs/zpool.cache shuttle” (shuttle is our pool name).

From an extensive search, it looks like this is a boot corruption issue (or an otherwise corrupted file) and the best thing to do is to reinstall TrueNAS, but I have both questions and concerns.

  1. Since I inherited this system, I have no idea if we have a configuration file and where it would be stored. If it is on the pool (and the pool is theoretically fine), would I be able to access it following a re-install? If it is not, what is the worst case scenario for our data? Does the configuration file just contain settings in the GUI that we could figure out over time?
  2. We do not have a data backup, so I cannot overemphasize how critical it is that I protect the data on these drives (we are planning to back up the data IMMEDIATELY after resolving this).
  3. I am not very familiar with NAS lingo (more of a hardware gal myself). I would immensely appreciate it if you could dumb the steps down as much as possible.

Here are the specs for the device:

Specs:
Mobo: ASUS TUF Gaming x570-Plus
CPU: AMD Ryzen 9 3950x
RAM: 64 GB G.skill Ripjaws DDR4 3200 MHz
Storage setup: RAIDZ1 (4x4TB WD Red+ HDD, 2TB NVMe SSD for cache)
Boot drive: 128GB SATA SSD

It typically runs headless but has a GeForce 1030 in it right now for diagnostics.

All feedback and advice would be appreciated.

The boot drive is only a boot drive. It is not where you would store data other than configuration data, like user accounts, network setup, etc.

You should be making backups of the configuration file, which is pretty easy to do via the GUI.

Your solution is to obtain a new (or good used) SSD and leave the original one alone, just in case the config file is recoverable if you really need it. Now install TrueNAS 12 onto the new drive. Be careful that you choose the correct drive; if you select a data drive, you can easily wipe it out. To be safe, you can disconnect the data drives while you are installing. Once the OS is installed, power down and reconnect your data drives. You may need to set up your network configuration and some user accounts as well. But to just grab a copy of the data, the root user will do.

Next, import the pool(s) and set up the SMB shares, then copy off the data.

Once you have the data copied, you can configure the system to suit your needs.
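If you want a cautious first look from the shell before importing through the GUI, the import step could be sketched roughly as below. This is only a sketch under assumptions: the pool name `shuttle` comes from the thread, the `/mnt` altroot and the helper function names are my own, and a read-only import is a precaution rather than a requirement. A pool imported this way is not registered with the web interface, so export it again before doing the normal GUI import.

```shell
#!/bin/sh
# Sketch only. "shuttle" is the pool name from the thread; the /mnt altroot
# and these helper names are assumptions for illustration.
POOL="${POOL:-shuttle}"

# List pools found on the attached disks that are not yet imported.
list_importable_pools() {
    zpool import
}

# Import the pool read-only under /mnt so nothing on it can be modified.
import_pool_readonly() {
    zpool import -o readonly=on -R /mnt "$1"
}

# Show pool health, including any files with known errors.
show_pool_status() {
    zpool status -v "$1"
}

# Release the pool again so it can be imported through the GUI.
export_pool() {
    zpool export "$1"
}
```

A typical order would be `list_importable_pools`, then `import_pool_readonly "$POOL"`, then `show_pool_status "$POOL"`, and finally `export_pool "$POOL"` once you are satisfied the pool looks healthy.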

You might be able to recover the configuration file. If you had your System Dataset on the pool, then you are apt to recover it. Search for the file named freenas-v1.db and, if you had encryption, pwenc_secret (without encryption, the second file may not be present). Place those into a directory and upload the configuration file via the GUI.
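The search described above could look something like this from a shell. It is a minimal sketch: the function name and the example search root are placeholders of mine, and where the system dataset actually ends up mounted on your system is something you would need to check.

```shell
#!/bin/sh
# Sketch only: look under a directory tree for the two file names mentioned
# above. The function name and the search root are hypothetical.
find_config_files() {
    # $1 = directory to search, e.g. a mounted pool or the old boot drive
    find "$1" -type f \( -name 'freenas-v1.db' -o -name 'pwenc_secret' \) 2>/dev/null
}
```

For example, `find_config_files /mnt/shuttle` would print the full path of every match under that directory, assuming the pool is mounted there. Copy anything it finds to a safe place before changing anything else.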

If the system dataset was located on the boot drive, see if you can access that drive and locate those files.

If you are able to recover the configuration files, you can of course apply those once you have installed TrueNAS.

Do a Google search for freenas freenas-v1.db and you should find some old threads to read.

Thank you so much! We successfully managed to recreate the boot-pool on a new SSD and now we have backups of everything (data, config files, encryption keys), so I can now be a bit more bold. I was still getting a corrupt system dataset file (df_complex-free.rrd) on the pool, so I moved the system dataset to the boot-pool so it would rebuild. The corrupt file still appears on the pool, but it might be in the snapshots, so I am working to remove that (scrubs take a while).

I am now getting a corrupt file on the boot-pool and I’m not sure how to manage that: boot-pool/ROOT/default@2024-11-04-17:59:21:/usr/local/www/webui/assets/i18n/my.json

I assume I can’t just delete this (not sure where to even find it), and I don’t know if I can run a scrub on the boot-pool. Should I just re-reinstall the OS?

Thanks!