Zpool is trying to resilver the wrong drive after attach!

Hi

I tried to move my pool to another pool. Because I didn’t have enough drives, I decided to make a RAID 0 first and then attach two drives to the stripes.

I did it in the UI; it should be similar to this:

zpool create batcave /dev/sda1 /dev/sdg1

Which worked perfectly fine!

After that I attached drives to /dev/sda and /dev/sdg:

zpool attach batcave /dev/sda1 /dev/sdf1
zpool attach batcave /dev/sdg1 /dev/sdd1

Which creates mirror-0 and mirror-1, so a RAID 10 (a stripe of mirrors).
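
While those resilvers ran I just kept an eye on them with the usual commands (I think zpool wait is available on SCALE too, but don’t quote me on that):

sudo zpool status -v batcave          # shows both mirrors and the resilver progress
sudo zpool wait -t resilver batcave   # blocks until the resilver activity finishes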

NOW THE ISSUE!
After the resilver kicked in, mirror-0 started to resilver BOTH drives!!!
Even after I put /dev/sdf1 offline, it’s still resilvering /dev/sda, which was the original drive from the RAID 0!!!

admin@truenas[~]$ sudo zpool status -LP batcave 
  pool: batcave
 state: DEGRADED
status: One or more devices is currently being resilvered.  The pool will
        continue to function, possibly in a degraded state.
action: Wait for the resilver to complete.
  scan: resilver in progress since Mon Jun  3 05:37:00 2024
        7.81T / 7.83T scanned at 350M/s, 3.48T / 3.92T issued at 156M/s
        3.49T resilvered, 88.87% done, 00:48:50 to go
config:

        NAME           STATE     READ WRITE CKSUM
        batcave        DEGRADED     0     0     0
          mirror-0     DEGRADED   174     0     0
            /dev/sda1  ONLINE       0     0    18  (resilvering)
            /dev/sdf1  FAULTED      0     0     0  external device fault
          mirror-1     ONLINE       0     0     0
            /dev/sdg1  ONLINE       0     0     0
            /dev/sdd1  ONLINE       0     0     0  (resilvering)

errors: 178874 data errors, use '-v' for a list
admin@truenas[~]$ 

I can’t stop the resilver, even though I removed the other drive!
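
What I expected to work (just a guess on my part, I haven’t confirmed it helps here) was to detach the newly attached drive again, which as far as I understand turns mirror-0 back into a single-disk vdev:

sudo zpool detach batcave /dev/sdf1   # drop the new mirror member from mirror-0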

admin@truenas[~]$ sudo zpool iostat -vLP batcave
                 capacity     operations     bandwidth 
pool           alloc   free   read  write   read  write
-------------  -----  -----  -----  -----  -----  -----
batcave        7.83T  10.4T    206    241   156M   156M
  mirror-0     3.92T  5.17T     11      5   340K  84.8K
    /dev/sda1      -      -     11      5   340K  84.8K
    /dev/sdf1      -      -      0      0      0      0
  mirror-1     3.91T  5.19T    194    235   156M   156M
    /dev/sdg1      -      -    194      5   156M    84K
    /dev/sdd1      -      -      0    229    518   156M
-------------  -----  -----  -----  -----  -----  -----
admin@truenas[~]$ 

As you can see, mirror-1 behaves correctly and reads from /dev/sdg1 and writes to /dev/sdd1, while mirror-0 is doing weird stuff.

Can someone explain WTH is going on, why a “stripe” is trying to resilver itself, and whether I can fix it somehow?

Every drive has a different UUID, and this “method” worked in the past without any issue!

PS: Why does TrueNAS use partitions and not the whole disk?

I’m not following the logic (or terminology) here…

Did you use the CLI for an operation which could, and therefore should, have been performed from the GUI? And with device letters instead of GPTID? :scream:
How were the partitions created? (Normally the middleware handles all of this, creating disposable extras for swap and/or padding, which could accommodate replacement drives of a slightly different size…)

At which point did the data come in?

I had a RAID 5 with 3 drives and wanted to move to a RAID 10 after I bought an extra drive.

I removed 1 drive from the RAID 5, created a RAID 0 in the UI.
I sent all datasets to the RAID 0 (batcave).

As far as I know it’s not possible to turn a RAID 0 into a RAID 10 in the UI, so I did this:

GPTID was created on the other drives with:

sudo sgdisk --backup=foobar /dev/sda
sudo sgdisk --load-backup=foobar /dev/sdx
sudo sgdisk -G /dev/sdx

The -G option randomizes the disk GUID and the partitions’ unique GUIDs.
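
A quick way I’d use to double-check that the GUIDs really came out unique after cloning the table (just standard tools, nothing TrueNAS-specific):

lsblk -o NAME,SIZE,PARTUUID /dev/sda /dev/sdf   # partition GUIDs should all differ
sudo sgdisk -i 1 /dev/sdf                       # prints that partition's unique GUID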

After that, I attached the other two disks as shown before, but mirror-0 freaked out for some reason.

admin@truenas[~]$ sudo zpool status -LP batcave
  pool: batcave
 state: DEGRADED
status: One or more devices is currently being resilvered.  The pool will
        continue to function, possibly in a degraded state.
action: Wait for the resilver to complete.
  scan: resilver in progress since Mon Jun  3 17:51:50 2024
        6.34T / 7.83T scanned at 3.11G/s, 731G / 7.83T issued at 359M/s
        350G resilvered, 9.12% done, 05:46:20 to go
config:

        NAME           STATE     READ WRITE CKSUM
        batcave        DEGRADED     0     0     0
          mirror-0     DEGRADED   348     0     0
            /dev/sda1  ONLINE       0     0    42
            /dev/sdf1  REMOVED      0     0     0
          mirror-1     ONLINE       0     0     0
            /dev/sdg1  ONLINE       0     0     0
            /dev/sdd1  ONLINE       0     0     0  (resilvering)

errors: 178875 data errors, use '-v' for a list

admin@truenas[~]$ sudo fdisk -l /dev/sdg
Disk /dev/sdg: 9.1 TiB, 10000831348736 bytes, 19532873728 sectors
Disk model: WDC WD101EFBX-68
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes
Disklabel type: gpt
Disk identifier: E58F28A0-B1B7-4CA7-9B62-0591FB129C63

Device     Start         End     Sectors  Size Type
/dev/sdg1   4096 19532871680 19532867585  9.1T Solaris /usr & Apple ZFS

admin@truenas[~]$ sudo fdisk -l /dev/sdd
Disk /dev/sdd: 10.91 TiB, 12000138625024 bytes, 23437770752 sectors
Disk model: TOSHIBA HDWG21C 
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes
Disklabel type: gpt
Disk identifier: 723FB754-3385-468A-9404-9D4EAFA38FFD

Device     Start         End     Sectors  Size Type
/dev/sdd1   4096 23437768704 23437764609 10.9T Solaris /usr & Apple ZFS

admin@truenas[~]$ sudo fdisk -l /dev/sda
Disk /dev/sda: 9.1 TiB, 10000831348736 bytes, 19532873728 sectors
Disk model: WDC WD101EFBX-68
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes
Disklabel type: gpt
Disk identifier: 2A33937E-E028-4C0B-B247-C3AA73693342

Device     Start         End     Sectors  Size Type
/dev/sda1   4096 19532871680 19532867585  9.1T Solaris /usr & Apple ZFS


admin@truenas[~]$ sudo fdisk -l /dev/sdf
Disk /dev/sdf: 9.1 TiB, 10000831348736 bytes, 19532873728 sectors
Disk model: WDC WD101EFBX-68
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes
Disklabel type: gpt
Disk identifier: 8E77E5F5-E459-487C-A1E9-FEF63192704A

Device     Start         End     Sectors  Size Type
/dev/sdf1   4096 19532871680 19532867585  9.1T Solaris /usr & Apple ZFS
admin@truenas[~]$ 

One drive that I attached is bigger, but that’s in mirror-1, which didn’t go crazy.

In proper ZFS-speak, that’s going from “raidz1” to a “stripe of mirrors”.

Select a drive and click on “Extend” from the 3-dot menu to make it a mirror. Repeat for the other vdev. With mirrors, one can add or remove drives as well (single drive vdev = 1-way mirror).

Failing that, it would have been best to add the new drives by gptid in the CLI. But that is likely not the issue here.
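
On SCALE the gptid equivalent lives under /dev/disk/by-partuuid, as far as I know, so roughly like this (the <…> values are placeholders you would read off the listing):

ls -l /dev/disk/by-partuuid/   # map partition GUIDs to sdX names
sudo zpool attach batcave /dev/disk/by-partuuid/<existing-partuuid> /dev/disk/by-partuuid/<new-partuuid>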

It seems that sda, or its cable, or an overheating controller, issued errors while there was no redundancy. Check its SMART report.
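
Reading the report would be along these lines:

sudo smartctl -a /dev/sda   # attributes, error log and self-test history
sudo smartctl -x /dev/sda   # extended report, including SATA phy event counters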

Unfortunately, the procedure implies that you no longer have a backup of the data… And “178875 data errors” does not look good. :scream:
My suggestion would be to let ZFS resilver, but experts might have a better idea.
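
Once the resilver settles, something along these lines shows what was actually lost and resets the counters (plain zpool commands, nothing exotic):

sudo zpool status -v batcave   # lists the files with unrecoverable errors
sudo zpool clear batcave       # reset the error counters once you have dealt with them
sudo zpool scrub batcave       # then verify whatever is left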

Didn’t know it was possible in the UI… I’ll try it someday :slight_smile:

According to zpool status -v batcave it’s only some files and videos that are broken, around 1 TB… I have copies of those files, but I would like to at least recover the 3 TB of movies that aren’t broken.

Maybe I’ll copy everything to another drive and destroy the pool in the end.
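
If I go that route it would probably look something like this (rough sketch, “otherpool” is just a placeholder for wherever the copy ends up):

sudo zfs snapshot -r batcave@evac                                            # recursive snapshot of everything
sudo zfs send -R batcave@evac | sudo zfs receive -u otherpool/batcave-copy   # replicate it elsewhere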

Thanks tho :slight_smile:

Yeah, before you create another pool I’d really check the hardware; something is wrong here. You are getting errors on both drives, so you need to check your HBA firmware and perhaps do a burn-in test of all 4 drives as a minimum.
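
A simple burn-in per drive could be something like this (badblocks in write mode is destructive, so only on drives that hold no data; /dev/sdX stands for each disk in turn):

sudo smartctl -t long /dev/sdX         # long SMART self-test first
sudo badblocks -b 4096 -wsv /dev/sdX   # destructive write+verify pass over the whole disk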

The hardware was fine, SMART didn’t report any errors, the controller has no issues with the drives, etc.

After the resilver I replaced the broken files and I don’t get any errors anymore, but I can’t migrate my apps from the NVMe back to the RAID 10.

It just gets stuck -.-’

You are possibly misunderstanding what SMART is; it is not a definitive “drive/hardware is fine” kind of thing. ZFS is much better at detecting errors, and it did. Read and especially checksum errors are not good, no matter what SMART says.
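
Also worth checking the kernel log and the ZFS event history, since cabling or controller problems show up there even when SMART looks clean (generic checks, adjust the device names):

sudo dmesg -T | grep -iE 'ata|sd[a-g]'   # link resets, timeouts, I/O errors
sudo zpool events -v batcave             # events ZFS itself logged for the pool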