Can't Import Pool After Random System Hang

I have a TrueNAS Scale system with 2 storage pools: one 6-wide raidz2 of NAS-grade HDDs, one 1-wide mirror of SSDs. One day I noticed my NAS wasn’t running and checked on it to find it was hanging during the boot process. I didn’t get a screenshot of the console, but it was stuck on something related to middleware. I was able to boot into the OS by reinstalling TrueNAS and restoring my latest config. The only issue is that one of my pools (the SSD pool) will not import. Any time I try importing it (through the UI or the shell), the system reboots without importing the pool. As far as I can tell, the pool is intact and the disks are fine. I’ve run S.M.A.R.T. tests, and the outputs of

zpool status -v

and

zpool import

look like this (ignore the permanent errors detected in Bongo, I’m not worried about them, and the pool in question is Gonzales):

root@truenas[~]# zpool status -v         
  pool: Bongo
 state: ONLINE
status: One or more devices has experienced an error resulting in data
        corruption.  Applications may be affected.
action: Restore the file in question if possible.  Otherwise restore the
        entire pool from backup.
   see: https://openzfs.github.io/openzfs-docs/msg/ZFS-8000-8A
  scan: scrub paused since Thu Jun 25 12:38:10 2026
        scrub started on Tue May  5 10:09:50 2026
        0B / 8.65T scanned, 0B / 8.65T issued
        0B repaired, 0.00% done
expand: expanded raidz2-0 copied 10.2T in 5 days 07:48:08, on Sun Nov 30 22:10:10 2025
config:

        NAME                                      STATE     READ WRITE CKSUM
        Bongo                                     ONLINE       0     0     0
          raidz2-0                                ONLINE       0     0     0
            96f54d75-0f8d-4ea5-babb-af555207aab2  ONLINE       0     0     0
            3b6a6a8d-4807-4b7a-a4a8-951ea9435be7  ONLINE       0     0     0
            5a5e5491-0f2f-4df9-944f-7e45ca6f452b  ONLINE       0     0     0
            1fd91b11-add4-4b2d-889f-4ebec172b37f  ONLINE       0     0     0
            4de28ce3-af69-40d0-b492-417bf527ff5a  ONLINE       0     0     0
            ee6fb309-e2e2-4097-aa16-55cdc9996627  ONLINE       0     0     0

errors: Permanent errors have been detected in the following files:

        /mnt/Bongo/plex_media/TV Shows/Death in Paradise/Season 15/Death in Paradise - S15E08 - Episode 8 WEBDL-2160p.mkv
        Bongo/plex_media:<0x281>

  pool: boot-pool
 state: ONLINE
  scan: scrub repaired 0B in 00:00:12 with 0 errors on Sat Jun 20 03:45:13 2026
config:

        NAME         STATE     READ WRITE CKSUM
        boot-pool    ONLINE       0     0     0
          nvme0n1p3  ONLINE       0     0     0

errors: No known data errors
root@truenas[~]# zpool import
  pool: Gonzales
    id: 16031624355188637350
 state: ONLINE
status: Some supported features are not enabled on the pool.
        (Note that they may be intentionally disabled if the
        'compatibility' property is set.)
action: The pool can be imported using its name or numeric identifier, though
        some features will not be available without an explicit 'zpool upgrade'.
config:

        Gonzales                                  ONLINE
          mirror-0                                ONLINE
            d45c835a-59bb-4a04-a0c9-d46e8aa78dcc  ONLINE
            13ea1cfb-229b-4aac-9d04-914ec984cfeb  ONLINE

I’ve tried disconnecting each of the mirrored disks individually to see if this is caused by hardware problems to no avail. I’ve managed to import the pool as read-only, but am still unable to mount the datasets and can’t really do anything useful with it in this state.

I’m hoping there is a way I can import the pool as it is my app pool and I REALLY don’t want to set up all my apps again from scratch. I would also be ok with just recovering the ix-apps data off of the pool, then rebuilding the pool and restoring the data. Regardless, I’d appreciate any help.

I can also provide more logs, details, etc. if needed; I’m just not sure what would be useful at the moment.

Having a permanent error on a RAID-Z2 vDev / pool is indicative of either disk I/O problems or memory errors. These problems can then also affect the Mirror pool.

Please supply details on how the disks (both Mirror and RAID-Z2) are wired to the system board. If you are using an HBA, include its details too.

If you have not done so already, a memory test is in order.

Thanks for the reply. All disks are attached via an LSI SAS 9300 16i HBA and I’ve run memtest86 for 48 hours without any errors. I suspect the permanent error is caused by one of my services, as I have had a couple in the past as well as some checksum errors, but since the services were stopped, there have been no new errors.

Make sure this gets enough cooling. A common problem is that these Enterprise HBAs are used in small office / home setups with inadequate cooling.

Great, one thing that we can check off. However, without ECC memory, it still could have been a random bit flip that caused the problem.

This does not seem possible. Unless those services accessed the raw disks, ZFS would not purposefully allow damage to be written. On the other hand, if these services perform lots of writes, AND there is an underlying problem, (like a RAM bit flip or HBA is too hot), then there is more chance of pool damage.

In essence, unless an App, Service or program writes to the raw disks, any permanent ZFS vDev error won’t be caused by those App(s), Service(s) or program(s). Indirectly, yes… directly no.

To be clear, you have had a serious hardware issue. It is quite hard to get Metadata corruption because by default there are 2 copies. (The Bongo/plex_media:<0x281> error is for Metadata.) This is on top of any redundancy, which for RAID-Z2 means both parity AND both copies failed. Or a bit flipped in RAM after the Metadata’s checksum was calculated but before it was written.


So, back to the Mirror pool Gonzales. You can try several things. First, see what ZFS write transaction group was written to both Mirror disks.

sudo zdb list /dev/DISK_PART | grep txg | head -1

Replace DISK with the disk name, like “sde”, and PART with the partition number, so it ends up like “sde2”. The 2 disks should have the same ZFS TXG number. If they differ by less than 10, we can work around that.

After that, we can explore some options to import the Gonzales pool.


I’m not worried about this; I’ve mounted a fan to the heatsink with a 3D-printed bracket.

I definitely want to investigate this further but for now I just want to get my apps running again :sweat_smile:.

Running that command, I get the following error:

zdb: can't open 'list': No such file or directory

ZFS_DBGMSG(zdb) START:
metaslab.c:1789:spa_set_allocator(): spa allocator: dynamic
metaslab.c:1789:spa_set_allocator(): spa allocator: dynamic
ZFS_DBGMSG(zdb) END

Maybe the command syntax is wrong? Anyway, I tried simply zdb instead, with this output:

zdb                                     
Gonzales:
    version: 5000
    name: 'Gonzales'
    state: 0
    txg: 7484288
    pool_guid: 16031624355188637350
    errata: 0
    hostid: 605098918
    hostname: 'truenas'
    com.delphix:has_per_vdev_zaps
    vdev_children: 1
    vdev_tree:
        type: 'root'
        id: 0
        guid: 16031624355188637350
        create_txg: 4
        com.klarasystems:vdev_zap_root: 129
        children[0]:
            type: 'mirror'
            id: 0
            guid: 10286563185723603901
            metaslab_array: 256
            metaslab_shift: 31
            ashift: 12
            asize: 250053394432
            is_log: 0
            create_txg: 4
            com.delphix:vdev_zap_top: 130
            children[0]:
                type: 'disk'
                id: 0
                guid: 15677726995547323178
                path: '/dev/disk/by-partuuid/d45c835a-59bb-4a04-a0c9-d46e8aa78dcc'
                whole_disk: 0
                DTL: 34182
                create_txg: 4
                com.delphix:vdev_zap_leaf: 131
            children[1]:
                type: 'disk'
                id: 1
                guid: 13019253100026596609
                path: '/dev/disk/by-partuuid/13ea1cfb-229b-4aac-9d04-914ec984cfeb'
                whole_disk: 0
                DTL: 34181
                create_txg: 4
                com.delphix:vdev_zap_leaf: 132
    features_for_read:
        com.delphix:hole_birth
        com.delphix:embedded_data
        com.klarasystems:vdev_zaps_v2
boot-pool:
    version: 5000
    name: 'boot-pool'
    state: 0
    txg: 986069
    pool_guid: 1531617368800631957
    errata: 0
    compatibility: 'grub2'
    hostname: '(none)'
    com.delphix:has_per_vdev_zaps
    vdev_children: 1
    vdev_tree:
        type: 'root'
        id: 0
        guid: 1531617368800631957
        create_txg: 4
        children[0]:
            type: 'disk'
            id: 0
            guid: 15850896543870982874
            path: '/dev/nvme0n1p3'
            whole_disk: 0
            metaslab_array: 65
            metaslab_shift: 30
            ashift: 12
            asize: 127490850816
            is_log: 0
            DTL: 89707
            create_txg: 4
            com.delphix:vdev_zap_leaf: 129
            com.delphix:vdev_zap_top: 130
    features_for_read:
        com.delphix:hole_birth
        com.delphix:embedded_data

Sorry. Yes, that should have been as below:
sudo zdb -l /dev/DISK_PART | grep txg | head -1
The output you supplied does not have complete information about the 2 children. I think this will do it (because the “path:” was listed):

sudo zdb -l /dev/disk/by-partuuid/d45c835a-59bb-4a04-a0c9-d46e8aa78dcc
sudo zdb -l /dev/disk/by-partuuid/13ea1cfb-229b-4aac-9d04-914ec984cfeb

We are looking for the line with “txg:” all by itself. Plus, it occurs to me to check the labels at the end of both targeted listings. There should be 4 numbers listed for the labels on each disk.
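If it helps, here is a rough sketch of how those two checks could be pulled out of the zdb -l output instead of reading it by eye. The function names (label_txg, label_list, compare_txg) are made up for illustration and are not part of ZFS. It assumes the output has a line starting with “txg:” and a line starting with “labels =”.

```shell
# Print the pool-level txg from `zdb -l` output read on stdin.
# Only a line whose first field is exactly "txg:" matches,
# so the "create_txg:" lines are skipped.
label_txg() {
    awk '$1 == "txg:" { print $2; exit }'
}

# Print the label numbers from the "labels = ..." line on stdin,
# with leading and trailing whitespace removed.
label_list() {
    awk '$1 == "labels" {
        sub(/^[ \t]*labels = /, "")
        sub(/[ \t]+$/, "")
        print
        exit
    }'
}

# Report whether two txg values are identical.
compare_txg() {
    if [ "$1" = "$2" ]; then
        echo "match: $1"
    else
        echo "differ: $1 vs $2"
    fi
}

# Example use on the live system (run as root; paths taken from this thread):
#   a=$(zdb -l /dev/disk/by-partuuid/d45c835a-59bb-4a04-a0c9-d46e8aa78dcc | label_txg)
#   b=$(zdb -l /dev/disk/by-partuuid/13ea1cfb-229b-4aac-9d04-914ec984cfeb | label_txg)
#   compare_txg "$a" "$b"
```

These functions only read text, so nothing here touches the pool. A healthy disk should report “0 1 2 3” from label_list.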

Here’s the output:

root@truenas[~]# sudo zdb -l /dev/disk/by-partuuid/d45c835a-59bb-4a04-a0c9-d46e8aa78dcc
------------------------------------
LABEL 0 
------------------------------------
    version: 5000
    name: 'Gonzales'
    state: 0
    txg: 7484288
    pool_guid: 16031624355188637350
    errata: 0
    hostid: 605098918
    hostname: 'truenas'
    top_guid: 10286563185723603901
    guid: 15677726995547323178
    vdev_children: 1
    vdev_tree:
        type: 'mirror'
        id: 0
        guid: 10286563185723603901
        metaslab_array: 256
        metaslab_shift: 31
        ashift: 12
        asize: 250053394432
        is_log: 0
        create_txg: 4
        children[0]:
            type: 'disk'
            id: 0
            guid: 15677726995547323178
            path: '/dev/disk/by-partuuid/d45c835a-59bb-4a04-a0c9-d46e8aa78dcc'
            whole_disk: 0
            DTL: 34182
            create_txg: 4
        children[1]:
            type: 'disk'
            id: 1
            guid: 13019253100026596609
            path: '/dev/disk/by-partuuid/13ea1cfb-229b-4aac-9d04-914ec984cfeb'
            whole_disk: 0
            DTL: 34181
            create_txg: 4
    features_for_read:
        com.delphix:hole_birth
        com.delphix:embedded_data
        com.klarasystems:vdev_zaps_v2
    labels = 0 1 2 3 
root@truenas[~]# zdb -l /dev/disk/by-partuuid/13ea1cfb-229b-4aac-9d04-914ec984cfeb
------------------------------------
LABEL 0 
------------------------------------
    version: 5000
    name: 'Gonzales'
    state: 0
    txg: 7484288
    pool_guid: 16031624355188637350
    errata: 0
    hostid: 605098918
    hostname: 'truenas'
    top_guid: 10286563185723603901
    guid: 13019253100026596609
    vdev_children: 1
    vdev_tree:
        type: 'mirror'
        id: 0
        guid: 10286563185723603901
        metaslab_array: 256
        metaslab_shift: 31
        ashift: 12
        asize: 250053394432
        is_log: 0
        create_txg: 4
        children[0]:
            type: 'disk'
            id: 0
            guid: 15677726995547323178
            path: '/dev/disk/by-partuuid/d45c835a-59bb-4a04-a0c9-d46e8aa78dcc'
            whole_disk: 0
            DTL: 34182
            create_txg: 4
        children[1]:
            type: 'disk'
            id: 1
            guid: 13019253100026596609
            path: '/dev/disk/by-partuuid/13ea1cfb-229b-4aac-9d04-914ec984cfeb'
            whole_disk: 0
            DTL: 34181
            create_txg: 4
    features_for_read:
        com.delphix:hole_birth
        com.delphix:embedded_data
        com.klarasystems:vdev_zaps_v2
    labels = 0 1 2 3

Looks like the “txg” numbers and labels are all the same.

Yes, they are the same, which is good.

What does zpool import Gonzales say?

Since you say that did not work, you can try zpool import -fFn Gonzales. This won’t actually import the pool, because of the “n” option. We use “n” because the “F” option will wind back some of the write transactions (aka lose the most recently written data). We want to see if it will help, because the most recently written data may contain the corruption.

If that does not seem to work, there are more extreme measures that can be taken.

Both of those commands hang and cause the system to reboot.