Help recovering zfs pool after VM passthrough on proxmox

I passed through two disks to a VM running on proxmox and ended up corrupting the pool on them.

If anyone else wants to virtualise TrueNAS using proxmox, run these commands on the proxmox hypervisor first to prevent concurrent access to the devices you are passing through:

systemctl disable --now zfs-import-scan.service
systemctl disable --now zfs-import-cache.service
reboot
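A quick sanity check after the reboot (a sketch; the service names are the ones above) to confirm the hypervisor will no longer touch the pool:

```shell
# Both import services should report "disabled" / "inactive"
systemctl is-enabled zfs-import-scan.service zfs-import-cache.service
systemctl is-active zfs-import-scan.service zfs-import-cache.service

# The data pool should NOT appear here - only hypervisor-local pools, if any
zpool list
```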

Can anyone help recover the pool?

This is the error I get:

# zpool import datatank
cannot import 'datatank': one or more devices is currently unavailable

Here are the devices that were in the pool (in reality I used the by-id path, not sdX, but this is shorter to show):

# lsblk -o name,size,fstype,label,model,serial,mountpoint|grep 12.7
sdc                12.7T                                  WDC WD140EFFX-68VBXN0 Z2KL9X12        
└─sdc2             12.7T zfs_member        datatank                                              
sdd                12.7T                                  WDC WD140EFFX-68VBXN0 Z2KNLX12        
└─sdd2             12.7T zfs_member        datatank                                              

Here are the headers:

# zdb -l /dev/sdc2
------------------------------------
LABEL 0 
------------------------------------
    version: 5000
    name: 'datatank'
    state: 0
    txg: 15493413
    pool_guid: 16911827047402167892
    errata: 0
    hostid: 383395800
    hostname: 'datatank'
    top_guid: 11077806832579153414
    guid: 13180809383598811762
    vdev_children: 3
    vdev_tree:
        type: 'mirror'
        id: 0
        guid: 11077806832579153414
        whole_disk: 0
        metaslab_array: 65
        metaslab_shift: 34
        ashift: 12
        asize: 13998367178752
        is_log: 0
        create_txg: 4
        children[0]:
            type: 'disk'
            id: 0
            guid: 8569546290774946668
            path: '/dev/sdb'
            devid: 'scsi-0QEMU_QEMU_HARDDISK_drive-scsi2'
            phys_path: 'pci-0000:01:03.0-scsi-0:0:0:2'
            whole_disk: 1
            DTL: 21406
            create_txg: 4
        children[1]:
            type: 'disk'
            id: 1
            guid: 13180809383598811762
            path: '/dev/sda'
            devid: 'scsi-0QEMU_QEMU_HARDDISK_drive-scsi1'
            phys_path: 'pci-0000:01:02.0-scsi-0:0:0:1'
            vdev_enc_sysfs_path: '/sys/class/enclosure/6:0:0:0/Slot 05'
            whole_disk: 1
            DTL: 9036
            create_txg: 4
    features_for_read:
        com.delphix:hole_birth
        com.delphix:embedded_data
        com.delphix:device_removal
        com.klarasystems:vdev_zaps_v2
    labels = 0 1 2 3 

# zdb -l /dev/sdd2
------------------------------------
LABEL 0 
------------------------------------
    version: 5000
    name: 'datatank'
    state: 0
    txg: 15493413
    pool_guid: 16911827047402167892
    errata: 0
    hostid: 383395800
    hostname: 'datatank'
    top_guid: 11077806832579153414
    guid: 8569546290774946668
    vdev_children: 3
    vdev_tree:
        type: 'mirror'
        id: 0
        guid: 11077806832579153414
        whole_disk: 0
        metaslab_array: 65
        metaslab_shift: 34
        ashift: 12
        asize: 13998367178752
        is_log: 0
        create_txg: 4
        children[0]:
            type: 'disk'
            id: 0
            guid: 8569546290774946668
            path: '/dev/sdb'
            devid: 'scsi-0QEMU_QEMU_HARDDISK_drive-scsi2'
            phys_path: 'pci-0000:01:03.0-scsi-0:0:0:2'
            whole_disk: 1
            DTL: 21406
            create_txg: 4
        children[1]:
            type: 'disk'
            id: 1
            guid: 13180809383598811762
            path: '/dev/sda'
            devid: 'scsi-0QEMU_QEMU_HARDDISK_drive-scsi1'
            phys_path: 'pci-0000:01:02.0-scsi-0:0:0:1'
            vdev_enc_sysfs_path: '/sys/class/enclosure/6:0:0:0/Slot 05'
            whole_disk: 1
            DTL: 9036
            create_txg: 4
    features_for_read:
        com.delphix:hole_birth
        com.delphix:embedded_data
        com.delphix:device_removal
        com.klarasystems:vdev_zaps_v2
    labels = 0 1 2 3 

What does sudo zpool status -v and sudo zpool import return?

(I have switched to running these from inside the TrueNAS VM now, so different device names will show)

# zpool status -v
  pool: boot-pool
 state: ONLINE
config:
        NAME        STATE     READ WRITE CKSUM
        boot-pool   ONLINE       0     0     0
          sdc3      ONLINE       0     0     0

errors: No known data errors

# zpool import
  pool: datatank
    id: 16911827047402167892
 state: ONLINE
action: The pool can be imported using its name or numeric identifier.
config:

        datatank      ONLINE
          mirror-0    ONLINE
            sdb2      ONLINE
            sda2      ONLINE
          indirect-1  ONLINE
          indirect-2  ONLINE

# zpool import datatank
cannot import 'datatank': pool was previously in use from another system.
Last accessed by proxmox (hostid=fc6c0867) at Tue Oct  7 22:31:24 2025
The pool can be imported, use 'zpool import -f' to import the pool.

# zpool import datatank -f
cannot import 'datatank': one or more devices is currently unavailable

That, and pass through the whole drive controller and blacklist it on the hypervisor.

Time to call Batman @HoneyBadger


Oof. Yeah, that’s an unpleasant situation.

Normally when I see a double-mounted pool it throws back "insufficient replicas/corrupted data" rather than "one or more devices is currently unavailable". I do see some indirect devices there as well, so did you ever do a vdev removal (i.e. adding more disks including special vdevs, switching between a virtual and a passthrough disk, etc.)? IIRC there was a brief window where "block clone exists + vdev removal happens" could cause problems.

Can I get the output of the last hundred or so lines of /proc/spl/kstat/zfs/dbgmsg immediately after a failed import?
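Something like this should capture it in one go (a sketch; the path is the standard OpenZFS-on-Linux debug buffer):

```shell
# Trigger the failing import, then immediately grab the last ~100 debug lines
zpool import datatank; tail -n 100 /proc/spl/kstat/zfs/dbgmsg
```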


indirect devices: I did have another mirror of vdevs in the pool once, could it be that?

dbgmsg:

1760716386   ffff8f3db6d2c8c0 vdev.c:183:vdev_dbgmsg(): disk vdev '/dev/sda2': probe done, cant_read=0 cant_write
1760716386   ffff8f3db6d2c8c0 vdev.c:183:vdev_dbgmsg(): disk vdev '/dev/sdb2': probe done, cant_read=0 cant_write
1760716386   ffff8f3eaa22e100 spa_misc.c:2376:spa_import_progress_set_notes_impl(): 'datatank' Loading checkpoint t
1760716386   ffff8f3eaa22e100 spa_misc.c:2376:spa_import_progress_set_notes_impl(): 'datatank' Loading indirect vdev metada
1760716387   ffff8f3eaa22e100 spa_misc.c:2376:spa_import_progress_set_notes_impl(): 'datatank' Checking feature fla
1760716387   ffff8f3eaa22e100 spa_misc.c:2376:spa_import_progress_set_notes_impl(): 'datatank' Loading special MOS directori
1760716387   ffff8f3eaa22e100 spa_misc.c:2376:spa_import_progress_set_notes_impl(): 'datatank' Loading propertie
1760716387   ffff8f3eaa22e100 spa_misc.c:2376:spa_import_progress_set_notes_impl(): 'datatank' Loading AUX vdevs
1760716387   ffff8f3eaa22e100 spa_misc.c:2376:spa_import_progress_set_notes_impl(): 'datatank' Loading vdev metada
1760716389   ffff8f3eaa22e100 spa_misc.c:429:spa_load_note(): spa_load(datatank, config trusted): Read 663 log space maps (663 total blocks - blksz = 131072 bytes) in 2244 
1760716389   ffff8f3eaa22e100 spa_misc.c:2376:spa_import_progress_set_notes_impl(): 'datatank' Loading dedup tabl
1760716389   ffff8f3eaa22e100 spa_misc.c:2376:spa_import_progress_set_notes_impl(): 'datatank' Loading BRT
1760716389   ffff8f3eaa22e100 spa_misc.c:2376:spa_import_progress_set_notes_impl(): 'datatank' Verifying Log Devic
1760716390   ffff8f3eaa22e100 spa_misc.c:415:spa_load_failed(): spa_load(datatank, config trusted): FAILED: spa_check_logs fail
1760716390   ffff8f3eaa22e100 spa_misc.c:429:spa_load_note(): spa_load(datatank, config trusted): UNLOADING

Yep, that would cause indirect devices. Hopefully this isn’t it.

spa_check_logs might be a better place to fail though - that usually means the last uncommitted transaction group was corrupted (probably via the double-mount) and we might be able to rewind a little bit.

Try the import with -fF - both letters included, and case matters here: f forces the import of a pool last mounted by another host, while F attempts a rewind to an earlier transaction group.
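If you want to see what rewind targets exist before trying, you can list the uberblocks stored in the labels of one mirror member (a sketch; the sdc2 path is taken from the earlier lsblk output):

```shell
# -l dumps the labels, -u dumps the uberblocks; each uberblock's txg is a
# candidate point that a rewind (-F) could fall back to
zdb -lu /dev/sdc2 | grep -E 'txg|timestamp'
```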

# zpool import datatank -fF
cannot import 'datatank': insufficient replicas
        Destroy and re-create the pool from
        a backup source.

I remember getting this error while testing zfs and pool settings inside a VM. If I recall, I had created several virtual test disks in VMware and put them into varying configurations within the virtualized TrueNAS. I got the error because I had modified one of the disk files on the host machine, outside the VM. Nothing I did would fix it. I even had backups of those test disks (should some issue occur), but shutting down and copying the backup disk back did not fix it. I eventually had to start over and create new disks, new pools, etc.

@HoneyBadger should I try importing half the mirror?

Edit: in case it’s a problem with the metadata being out of sync
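(For reference, a sketch of how a one-sided import could be attempted - the /tmp/halfpool staging directory and the sdc2 partition are assumptions, and read-only is used so nothing gets written while experimenting:)

```shell
# Stage a directory containing only ONE side of the mirror, so the import
# scan never sees the other disk and treats it as absent (degraded import)
mkdir /tmp/halfpool
ln -s /dev/sdc2 /tmp/halfpool/

# -d restricts the device scan to that directory; readonly avoids any writes
zpool import -d /tmp/halfpool -o readonly=on -f datatank
```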

I’m tempted to say that you might need more aggressive rewinding here, at the risk of losing some recent data.

Importing with -fFX may do it, but if it doesn’t you may even need to run

Warning - Dangerous Tunables Inside
echo 0 | sudo tee /sys/module/zfs/parameters/spa_load_verify_data
echo 0 | sudo tee /sys/module/zfs/parameters/spa_load_verify_metadata

in order to disable the default behavior of “immediately bail out on corrupted metadata” and tell it to keep forging ahead regardless.
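Putting those pieces together, a cautious version of the attempt might look like this (the read-only import is my addition, not part of the advice above, so nothing is written while the rewind is being tested):

```shell
# Disable the verification passes that abort the import on corrupted
# data/metadata, so the rewind can keep going
echo 0 | sudo tee /sys/module/zfs/parameters/spa_load_verify_data
echo 0 | sudo tee /sys/module/zfs/parameters/spa_load_verify_metadata

# -f force, -F rewind, -X extreme rewind; read-only until the pool is verified
sudo zpool import -o readonly=on -fFX datatank
```

If that succeeds, check the datasets before exporting and re-importing read-write.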


It’s causing the VM to reboot. dmesg shows nothing. I will try at the hypervisor level.