Can’t access my pool because all disks have been exported?

root@truenas[/home/admin]# zpool list -v
NAME                                       SIZE  ALLOC   FREE  CKPOINT  EXPANDSZ   FRAG    CAP  DEDUP    HEALTH  ALTROOT
boot-pool                                   95G  27.4G  67.6G        -         -    16%    28%  1.00x    ONLINE  -
  sda3                                    95.3G  27.4G  67.6G        -         -    16%  28.9%      -    ONLINE
ssd                                        222G   132G  89.9G        -         -    12%    59%  1.00x    ONLINE  /mnt
  mirror-0                                 222G   132G  89.9G        -         -    12%  59.5%      -    ONLINE
    334cbbbb-e0d5-40c6-8462-5f82b0033ad7   224G      -      -        -         -      -      -      -    ONLINE
    d4c7f156-eff2-410c-b410-46c884c268f8   224G      -      -        -         -      -      -      -    ONLINE
root@truenas[/home/admin]#

Okay cool. Nevermind :slight_smile:

If what @HoneyBadger posted above doesn’t get you any further…

I would be interested in seeing if you can boot the NAS with one disk at a time in that pool to see if it imports. Since it’s a 3 way mirror, you should be able to physically unplug the power of the other 2 each time. This will help rule out a hardware issue.
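
With only one disk attached, a plain import scan should already show whether that copy of the pool is visible, before you actually try to import it. A rough sketch, read-only and making no changes:

zpool import        # lists importable pools found on the attached devices without importing them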

Trying the disks individually didn't work and gave the same error. One of the devices didn't give any output for the zdb command.

output.txt (26.8 KB)

Whoops; that’s on me. I forgot to specify that you should target the partition and not the device, so:

zdb -l /dev/sdb1
zdb -l /dev/sdc1
zdb -l /dev/sdf1
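
If the partition numbering on your disks is different, something like lsblk should confirm which partition actually holds the ZFS data (a sketch; the columns are standard lsblk fields):

lsblk -o NAME,SIZE,FSTYPE,PARTUUID /dev/sdb /dev/sdc /dev/sdf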

Thanks. This is the output.

zpool import 5087050144587203797 -R /mnt -N -f -o cachefile=/data/zfs/zpool.cache -d /dev/disk/by-partuuid/a9e51b2f-2df8-48ae-9c3e-6303e9256c28 -d /dev/disk/by-partuuid/d08bcfdf-d4fc-4094-8978-7f29368f737f -d /dev/disk/by-partuuid/a3573b00-d380-4587-806a-8a1a50690002


admin@truenas[~]$ sudo su
[sudo] password for admin: 
root@truenas[/home/admin]# zpool status
  pool: boot-pool
 state: ONLINE
status: Some supported and requested features are not enabled on the pool.
        The pool can still be used, but some features are unavailable.
action: Enable all features using 'zpool upgrade'. Once this is done,
        the pool may no longer be accessible by software that does not support
        the features. See zpool-features(7) for details.
  scan: scrub repaired 0B in 00:01:55 with 0 errors on Fri Nov 15 03:46:56 2024
config:

        NAME        STATE     READ WRITE CKSUM
        boot-pool   ONLINE       0     0     0
          sda3      ONLINE       0     0     0

errors: No known data errors

  pool: ssd
 state: ONLINE
status: Some supported and requested features are not enabled on the pool.
        The pool can still be used, but some features are unavailable.
action: Enable all features using 'zpool upgrade'. Once this is done,
        the pool may no longer be accessible by software that does not support
        the features. See zpool-features(7) for details.
  scan: scrub repaired 0B in 00:09:35 with 0 errors on Sun Oct 13 00:09:37 2024
config:

        NAME                                      STATE     READ WRITE CKSUM
        ssd                                       ONLINE       0     0     0
          mirror-0                                ONLINE       0     0     0
            334cbbbb-e0d5-40c6-8462-5f82b0033ad7  ONLINE       0     0     0
            d4c7f156-eff2-410c-b410-46c884c268f8  ONLINE       0     0     0

errors: No known data errors
root@truenas[/home/admin]# sudo blkid 
/dev/sdd1: LABEL="ssd" UUID="9570678906628884325" UUID_SUB="8374691078963971794" BLOCK_SIZE="4096" TYPE="zfs_member" PARTUUID="334cbbbb-e0d5-40c6-8462-5f82b0033ad7"
/dev/sdb1: LABEL="tank" UUID="5087050144587203797" UUID_SUB="2644609643477440627" BLOCK_SIZE="4096" TYPE="zfs_member" PARTUUID="a3573b00-d380-4587-806a-8a1a50690002"
/dev/sde1: LABEL="ssd" UUID="9570678906628884325" UUID_SUB="336954752255915850" BLOCK_SIZE="4096" TYPE="zfs_member" PARTUUID="d4c7f156-eff2-410c-b410-46c884c268f8"
/dev/sdc1: LABEL="tank" UUID="5087050144587203797" UUID_SUB="9388162094058127461" BLOCK_SIZE="4096" TYPE="zfs_member" PARTUUID="d08bcfdf-d4fc-4094-8978-7f29368f737f"
/dev/sda2: LABEL_FATBOOT="EFI" LABEL="EFI" UUID="44F1-C597" BLOCK_SIZE="512" TYPE="vfat" PARTUUID="ee924cc1-87b4-4de0-a8bd-0fec0097dc63"
/dev/sda3: LABEL="boot-pool" UUID="1696277370963779199" UUID_SUB="14857367843732674036" BLOCK_SIZE="4096" TYPE="zfs_member" PARTUUID="408a753f-3d74-4d2a-be6a-77cf0c711051"
/dev/sdf1: LABEL="tank" UUID="5087050144587203797" UUID_SUB="17972730002245592786" BLOCK_SIZE="4096" TYPE="zfs_member" PARTUUID="a9e51b2f-2df8-48ae-9c3e-6303e9256c28"
/dev/sda4: PARTUUID="cea312e0-c0f5-4a8d-bdf9-8d84ef58ca88"
/dev/sda1: PARTUUID="f831f3ef-03d6-48a2-bb72-a2fb6791da77"

Edit: The command HoneyBadger gave for the individual disks and blkid show the same IDs. I thought they were different.

Here is the output:
output.txt (4.8 KB)

/dev/sdb1 is lagging behind the other two drives.

/dev/sdb1: txg: 3582921
/dev/sdc1: txg: 3583088
/dev/sdf1: txg: 3583088
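
For reference, those values come from the label dumps and can be pulled out in one pass with something along these lines (a quick sketch):

for p in /dev/sdb1 /dev/sdc1 /dev/sdf1; do
    printf '%s: ' "$p"
    zdb -l "$p" | grep -m1 '[[:space:]]txg:'    # first (top-level) txg entry in the label
done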

Will the pool import with sdb removed?

That was going to be my first pointer and suggestion.

Try removing the /dev/sdb disk (check and confirm by serial number if you have to) and then see if the pool will import.
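
To map the sdX name to a physical drive before pulling it, something like this should confirm the serial (a sketch):

smartctl -i /dev/sdb | grep -i 'serial'    # print the drive's reported serial number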

It’s also not just a little bit behind, it’s 167 transactions out of date according to the txg stamps.

Not to derail this thread, but how does that even happen?

Wouldn't a member device fail and be automatically offlined from the mirror vdev before it comes to that point (of differing TXGs)? I just don't understand how ZFS could ever allow a member drive to drift from the others without any degradation of the pool's status.

OP, can you also please share the output of lspci -v?

According to the event log, /dev/sdb1 also had a checksum error during a read. That was yesterday.

Corruption in an uberblock and reverting back to a previous valid one when queried by zdb perhaps.

@truenasrooks now that we have the short versions, let’s go for the long wall of text with the zdb -ul /dev/sdb1 and other devices - this will give a lot more data as it prints the uberblock timestamps.
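
If the full dump is too much to eyeball, the interesting per-uberblock fields can be filtered out with something like this (a sketch; the full output is still worth attaching):

zdb -ul /dev/sdb1 | grep -E 'Uberblock|txg|timestamp'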

lspci-v-result.txt (25.5 KB)
output.txt (40.3 KB)

I have attached the output for both the zdb -ul and lspci commands.

So far I have not removed the disk that is lagging behind. I will do that now.

sdb is the HDD with serial number AAH0NPLH. Should I remove that and try importing?

So far I have not reverted to a previous version of TrueNAS SCALE from the current Electric Eel. Let me know if I should try that.

EDIT: All three disks are still running the long SMART test and are about 90% done. I'm thinking of waiting for it to complete before attempting to remove one and import.
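
(Side note: the remaining progress of a long SMART test can be checked with something like this; a sketch:)

smartctl -a /dev/sdb | grep -A1 'Self-test execution'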

Maybe hold off until people here have had time to review the last post first.

To me it looks like sdf (and only sdf) has the most recent txg.
But I don't actually know how to read that output.

Those long SMART tests are complete, with no errors on the hard disks. I'm happy the HDDs are good, but really sad that I can't see my data yet :slight_smile:

Hope we can fix it. Let me know if I should try any of these:

  1. Revert to the previous version
  2. Install an HBA card and connect these disks
  3. Remove one or two disks and import with the pool ID

EDIT 1: I removed sdb alone and tried the import from the GUI and from the command line; it failed. I removed sdb and sdc and did the same, and it failed as well.

This is the command I used, removing the sdb and sdc entries accordingly:

zpool import 5087050144587203797 -R /mnt -N -f -o cachefile=/data/zfs/zpool.cache -d /dev/disk/by-partuuid/a9e51b2f-2df8-48ae-9c3e-6303e9256c28 -d /dev/disk/by-partuuid/d08bcfdf-d4fc-4094-8978-7f29368f737f -d /dev/disk/by-partuuid/a3573b00-d380-4587-806a-8a1a50690002
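
With sdb removed, the command ended up with only the two remaining partuuids from the blkid output above, roughly like this (a sketch):

zpool import 5087050144587203797 -R /mnt -N -f -o cachefile=/data/zfs/zpool.cache -d /dev/disk/by-partuuid/a9e51b2f-2df8-48ae-9c3e-6303e9256c28 -d /dev/disk/by-partuuid/d08bcfdf-d4fc-4094-8978-7f29368f737f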

EDIT 2: I will wait for confirmation, but if I'm going back to the previous version, all I have to do is click Activate next to that version on the Boot Environments page in the GUI, right?

Can someone advise on what I should do based on the last output and offer any help?

There are a lot of these threads going on right now. You just have to be patient and wait.
I am guessing the experienced users are going thread by thread, so as not to mix up users and issues
:stopwatch: