Kernel panic not syncing zfs

I have discovered that my system failed overnight. This is the setup:

TrueNAS Fangtooth 25.04.0
Intel Xeon E5 2670 V3
Machinist PR9 X99 Motherboard
128GB DDR4 2133MHz 4x32GB
Samsung 840 120GB SSD (Mirror 2 x 120GB) - boot pool
Seagate IronWolf 4TB NAS Hard Drive 3.5" SATA III 6Gb/s 5900RPM 64MB Cache (4 x 4TB) RAID-10 - main pool

I don’t know where my last backup file is so I would have to rebuild from scratch if there is no way to recover this.

Can someone please advise if there is anything I can do on the advanced option on the boot menu?

Thank you

Hi,

So, presuming your system won’t boot, have you tried switching to the second SSD in your boot mirror? You would most likely need to change this in your BIOS.

Seems unlikely that both your boot SSDs would fail at exactly the same time.

Hi,
So I have unplugged each one in turn and tried to boot, and the same thing happens with both.

Can you tell if the panic happens when trying to import the data pool?

Is it possible to take a photo of the screen when it panics?

Thank you both for your patience helping me.

This might sound silly, but is there more above or below the portion you captured?

This is a recurring OpenZFS issue I’ve seen a few times, though I’m not 100% certain of the exact cause. It seems to be more common on non-ECC systems during very large transfers.

If you’re able to boot with enough disks disconnected to cause the pool to fail to import, reconnect them after boot and check if you can import the pool readonly via zpool import tank -o readonly=on

I’d hope you have backups in place; if not, you should copy off any important data while the pool is imported readonly. Keep in mind the GUI won’t be able to see or interact with the pool, so you will need to pull data via something like SFTP.
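A minimal sketch of the read-only import and copy-off steps described above. The pool name tank comes from the post; the source directory and the destination user@backuphost:/backup/ are hypothetical placeholders, and rsync over SSH is used here as one alternative to SFTP, assuming SSH access works. The helper only prints the commands so they can be reviewed and then run by hand as root.

```shell
# Sketch only: prints the commands for review; nothing here touches the pool.
print_readonly_rescue() {
    pool="$1"   # pool to import, e.g. tank (from the post)
    src="$2"    # directory to copy; take it from the mountpoint column of 'zfs list'
    dest="$3"   # hypothetical rsync-over-SSH target, e.g. user@backuphost:/backup/

    # Import without allowing any writes to the pool.
    printf '%s\n' "zpool import -o readonly=on $pool"
    # Show where each dataset ended up mounted, and whether it mounted at all.
    printf '%s\n' "zfs list -r -o name,mountpoint,mounted $pool"
    # Copy the data off while the pool is still read-only.
    printf '%s\n' "rsync -a $src $dest"
}

# Example:
#   print_readonly_rescue tank /mnt/tank/important user@backuphost:/backup/
```

Checking the mountpoint column first avoids guessing where the datasets landed, since the location depends on whether an alternate root was given at import time.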

If that’s successful, your data is most likely fine. You can then zpool export tank, enable the zfs_recover tunable (echo 1 | sudo tee /sys/module/zfs/parameters/zfs_recover) and try importing via the GUI.

Assuming that import succeeds, leave it for a day or two and then see if it imports as normal after a reboot.
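As a rough sketch of the tunable step, the helpers below write the value and then read it back so you know it actually took effect before retrying the import. The sysfs path is the one quoted in the post; the function names are made up for illustration. The write goes through sudo tee because a plain shell redirect is performed by the calling (unprivileged) shell, not by sudo.

```shell
# Sketch only. PARAM is the sysfs path quoted in the post.
PARAM="${PARAM:-/sys/module/zfs/parameters/zfs_recover}"

# Write a value to the tunable with root privileges.
set_zfs_recover() {
    printf '%s\n' "$1" | sudo tee "$PARAM" >/dev/null
}

# Succeeds only if the tunable currently reads back as 1.
zfs_recover_enabled() {
    [ "$(cat "$PARAM" 2>/dev/null)" = "1" ]
}

# Example (with "tank" as the pool name, as in the post):
#   sudo zpool export tank
#   set_zfs_recover 1 && zfs_recover_enabled && echo "zfs_recover is on"
```

A value written this way does not survive a reboot, which fits the advice to check later whether the pool imports normally again.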

Edit: This post details the same but is better laid out than mine, worth referencing that too :slight_smile:

Not a silly question :slight_smile:
I’ll try and take a few more photos as that is the top of the screen

Hi, so this is my boot pool and I have two disks in a mirror. If I attach either of them independently then I still get the error.

Unfortunately I cannot find a backup of my config.

Of course, if I were to rebuild, I guess I should be able to import my data pool with some difficulty (and help!).

The system was idle as it happened overnight. I have ECC RAM.

I’m referring to your data pool. Have you attempted booting with the data pool disks disconnected?

TrueNAS will attempt to import all pools as part of the init/startup process, so the ZFS panic may well have been thrown during the initial pool import.

Edit: Not to say it couldn’t be your boot-pool, but best to confirm. If it boots no problem with the disks disconnected, then you know for sure where the problem is.

Yes they were the first thing I disconnected.

Wouldn’t it be better to run zpool import -R /mnt tank -o readonly=on to get it mounted in the proper spot?

Edit: Well, maybe it’s not the data pool at all, I missed the latest developments.

I’d rather not provoke middleware into freaking out about a pool it can’t interact with, especially when it’s just a test to see if it can import.

Okay, amazing, if the data disks are intact that saves a lot of hassle as you don’t have any critical data at risk, just a matter of pulling out the configuration now :slight_smile:

Let’s give this a go then seeing as you don’t have config:
On the boot page, press “e” on your primary boot option:

Then arrow down until you see this “linux” line:

Add zfs.zfs_recover=1 to the end of the line, then press Ctrl+X to boot. With the data disks disconnected, that should hopefully get you into the system so you can download a config backup. I’d recommend a reinstall afterwards. Technically you could leave it for a bit as I mentioned earlier, but considering how easy it is to reinstall once you have a configuration backup, better safe than sorry!
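If the system does come up, a quick check like the one below can confirm that the edited boot entry was really the one used. This is only a sketch: cmdline_has is a made-up helper name, and the optional second argument exists purely so the check can be pointed at a file other than /proc/cmdline.

```shell
# Succeeds if the given parameter appears as a whole word on the kernel
# command line. Defaults to /proc/cmdline when no file is given.
cmdline_has() {
    tr ' ' '\n' < "${2:-/proc/cmdline}" | grep -qxF -- "$1"
}

# Example:
#   cmdline_has zfs.zfs_recover=1 && echo "booted with zfs_recover"
#   cat /sys/module/zfs/parameters/zfs_recover
```

An edit made with "e" in the boot menu applies to that one boot only, so the parameter should be gone again after the next normal restart.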

Hi Henry,
thank you. So if I do this command with the added parameter then it should boot with network and I can logon and backup the config?
It will be tomorrow morning now before I can nick one of the kids’ monitors :slight_smile:

That’s the goal, yes. As with all things I can’t give you a 100% guarantee, but it should be your best shot at getting into a bootable state to grab a copy of the config.

In future, I’d recommend either taking a config backup after any major configuration changes are made, or deploying a script that can backup config automatically to your storage pool (see: configuration-backup).
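For illustration, an automatic backup could look roughly like the sketch below. It is not the script referenced above. It assumes, and this is not stated in the thread, that the configuration database is /data/freenas-v1.db and the secret seed is /data/pwenc_secret; /mnt/tank/config-backups is a hypothetical dataset on the storage pool. Copying the live database file may catch it mid-write, so treat this as a fallback rather than a replacement for the config download in the GUI.

```shell
# Sketch only: archive the config database and secret seed to the data pool.
backup_config() {
    src_dir="${1:-/data}"                       # assumed location of the config files
    dest_dir="${2:-/mnt/tank/config-backups}"   # hypothetical target dataset
    stamp="$(date +%Y%m%d-%H%M%S)"
    out="$dest_dir/truenas-config-$stamp.tar"

    mkdir -p "$dest_dir" || return 1
    # Keep both files together so the seed travels with the database.
    tar -cf "$out" -C "$src_dir" freenas-v1.db pwenc_secret || return 1
    # Print the archive path so a caller or cron mail can log it.
    printf '%s\n' "$out"
}

# Example (e.g. from a daily cron job, run as root):
#   backup_config /data /mnt/tank/config-backups
```

Storing the archive on the data pool means it survives a boot-pool failure like the one in this thread.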

Unfortunately that resulted in a kernel panic at the same point.

I guess my only option now is to rebuild. Fortunately I did not encrypt my data pool so it should be relatively easy to import (if slow!).

The only thing I’ve struggled with in the past was the security certificate stuff and Cloudflare.

That’s unlucky, first time I’ve heard of this happening on a boot pool. Best of luck to you with the rebuild.

I tried with the mirror together and then each disk on its own in turn, and no luck.
:frowning: