Electric Eel and pool VDEV expansion and reported dataset disk space

Hello TrueNAS folks,

I created a pool with a RAIDZ1 vdev of 3x 20 TB HDDs on the latest stable TrueNAS version.

Recently I bought four more 20 TB disks and wanted to grow that RAIDZ1 vdev.
For this I upgraded the rig to the latest Electric Eel nightly, with success, except that it no longer detected my old HBA. I’ll get back to this later.

So I extended the vdev one disk at a time. It took a week to work through all four disks, and everything seems correct.
I even ran zfs-inplace-rebalancing, also with success. So far so good :slight_smile:
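For reference, each expansion step boils down to attaching one new disk to the existing RAIDZ vdev, whether done from the UI or the CLI. A sketch with a placeholder device path:

# Attach one new disk to the existing RAIDZ1 vdev (placeholder partuuid)
zpool attach archives raidz1-0 /dev/disk/by-partuuid/<new-disk-uuid>

# Shows the expansion progress for the vdev while it runs
zpool status archives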

But, I noticed something strange.

zpool list returns the correct size (used and free) for the pool:

archives    127T  55.6T  71.7T        -         -     0%    43%  1.00x    ONLINE  /mnt

I have only one dataset named “data” in this pool, but it seems it’s reporting the wrong size for both used and free:

# zfs list archives   
NAME       USED  AVAIL  REFER  MOUNTPOINT
archives  37.1T  47.6T   128K  /mnt/archives
archives/data  37.1T  47.6T  37.1T  /mnt/archives/data

It seems to be reporting the size it had before the four disks were added.
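For reference, the two views can be cross-checked with something like the following (zpool list shows raw space including parity, while the -o space breakdown splits each dataset's USED between snapshots, children and refreservation):

zpool list -v archives
zfs list -o space -r archives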

Any ideas what could be wrong?

Thanks a lot!

Side note about the HBA issue:
For my old HBA to be detected again, I had to add “pci=realloc=off” to the GRUB kernel parameters.
Without it, the mpt3sas kernel module complains:

kernel: mpt3sas 0000:03:00.0: can't disable ASPM; OS doesn't have ASPM control
kernel: mpt3sas 0000:03:00.0: BAR 1: can't reserve [mem 0x805c0000-0x805c3fff 64bit]
kernel: mpt2sas_cm0: pci_request_selected_regions: failed
kernel: mpt2sas_cm0: failure at drivers/scsi/mpt3sas/mpt3sas_scsih.c:12348/_scsih_probe()!

Device is:
07:00.0 Serial Attached SCSI controller: Broadcom / LSI SAS2004 PCI-Express Fusion-MPT SAS-2 [Spitfire] (rev 03)
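For anyone hitting the same issue, this is roughly how the parameter was added. It is a generic Debian-style GRUB edit; on TrueNAS a manual edit of /etc/default/grub may not survive upgrades, so treat this as an illustration of where the parameter ends up rather than the supported way to set it:

# /etc/default/grub -- append pci=realloc=off to the existing options, e.g.:
#   GRUB_CMDLINE_LINUX_DEFAULT="<existing options> pci=realloc=off"
update-grub
reboot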

IIRC you need to rebalance the VDEV for the expansion to actually take place, but take this statement with caution.

I suggest giving RaidZ Expansion on ElectricEel Nightlies a read.

Hello @Davvo,

Thank you for your reply.

Actually, I already rebalanced the files that were present in the dataset before the expansion.

It lowered the ALLOC / CAP numbers of the pool.

NAME                                       SIZE  ALLOC   FREE  CKPOINT  EXPANDSZ   FRAG    CAP  DEDUP    HEALTH  ALTROOT
archives                                   127T  56.2T  71.1T        -         -     0%    44%  1.00x    ONLINE  /mnt
  raidz1-0                                 127T  56.2T  71.1T        -         -     0%  44.1%      -    ONLINE
    1bf0de16-446f-4ba6-b8f2-9b14b1b9c199  18.2T      -      -        -         -      -      -      -    ONLINE
    458e15f2-8c3b-400f-ad2d-5ed27d15998f  18.2T      -      -        -         -      -      -      -    ONLINE
    179217e6-3efc-4251-823c-34e67682c015  18.2T      -      -        -         -      -      -      -    ONLINE
    12b6d633-da81-47e8-90c6-26e2396e1770  18.2T      -      -        -         -      -      -      -    ONLINE
    c90735c0-7dee-4695-b133-fa9d5a4a9c5c  18.2T      -      -        -         -      -      -      -    ONLINE
    bf4a3f48-1f6c-4417-9853-70ed152f499b  18.2T      -      -        -         -      -      -      -    ONLINE
    2c49f5f6-8001-47e6-8897-84b96cd81fb2  18.2T      -      -        -         -      -      -      -    ONLINE

But the AVAIL/USED of the dataset is still not reflecting the FREE/ALLOC of the pool:

NAME       USED  AVAIL  REFER  MOUNTPOINT
archives  37.5T  47.2T   128K  /mnt/archives

Output of zfs get raidz_expansion /mnt/archives?

Hmm, zfs get raidz_expansion /mnt/archives just returns the help for zfs get, but:

# zpool get all | grep raidz
archives   feature@raidz_expansion        active                                         local
boot-pool  feature@raidz_expansion        disabled                                       local

however:

zfs get all | grep raidz
zfs get all | grep exp  

Both return nothing.

Should it also return something on the datasets, or only for the pool?
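For completeness, querying the feature directly on the pool also shows it as active (same information as the grep above):

# zpool get feature@raidz_expansion archives
NAME      PROPERTY                 VALUE   SOURCE
archives  feature@raidz_expansion  active  local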

I know this feature is really new but I thought it was worth a try :slight_smile:


Also, I wanted to add that I moved a lot of files onto /mnt/archives/data and checked the disk activity with iostat. The writes seem to occur on all 7 disks of the VDEV: the 3 initially present and the 4 I added during the expansion journey.

Device             tps    kB_read/s    kB_wrtn/s    kB_dscd/s    kB_read    kB_wrtn    kB_dscd
sda             398.00         0.00     91784.00         0.00          0      91784          0
sde             526.00         0.00     91416.00         0.00          0      91416          0
sdk             434.00         0.00     91340.00         0.00          0      91340          0
sdl             556.00         0.00     91840.00         0.00          0      91840          0
sdm             450.00         0.00     90612.00         0.00          0      90612          0
sdr             451.00         0.00     91136.00         0.00          0      91136          0
sds             419.00         0.00     90920.00         0.00          0      90920          0
zd0               0.00         0.00         0.00         0.00          0          0          0
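For reference, the same per-disk view is also available directly from ZFS, which ties the activity to the vdev members:

# Per-vdev / per-disk read and write bandwidth, refreshed every 5 seconds
zpool iostat -v archives 5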

I wish you had asked here first before attempting to extend your pool.

If you had asked first, here is what my recommendation would have been.

Firstly, for a pool of 7x 20TB disks, i.e. 140TB of raw disk, you should really be contemplating RAIDZ2 rather than just RAIDZ1.

Secondly, ElectricEel is not yet ready for production use, especially its new features, and even more so its new ZFS features (as opposed to new TN SCALE features built on existing ZFS functionality). Because the ZFS RAIDZ expansion code is so new and relatively untested in real life, you were risking (and perhaps still are risking) your ZFS pool(s) becoming corrupted by using it.

If you had asked, I would have recommended waiting before using the new functionality and instead either:

  1. (Recommended) Creating a new RAIDZ2 pool using the 4x 20TB disks and replicating the data across, giving you a pool the same size as the existing one; or

  2. (Recommended more) Creating a new RAIDZ2 pool using the 4x 20TB disks plus one fake file-based pseudo-disk (and then taking the fake drive offline), giving you what is temporarily, in effect, a RAIDZ1 pool; then replicating the data across, after which you could destroy the original pool and replace the missing pseudo-disk with a real physical disk, taking you to a RAIDZ2 pool 50% bigger than the existing one (see the sketch after this list); or

  3. (Not recommended) Taking the redundant drive offline from the existing pool, leaving it temporarily non-redundant, and adding it to one of the above two options.

Then wait until EE is properly released and has had some time to bed in (i.e. has had at least a first support release), and only then add the new drives one by one.
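A minimal sketch of the pseudo-disk trick from option 2, assuming illustrative device names and a file path you would adapt to your system (the sparse file consumes no real space and is taken offline immediately, so nothing is ever written to it):

# Create a sparse placeholder "disk" roughly the size of the real ones
truncate -s 18T /root/fake-disk.img

# Build the RAIDZ2 pool from the 4 new disks plus the placeholder
# (-f may be needed if the member sizes differ)
zpool create -f newpool raidz2 sdw sdx sdy sdz /root/fake-disk.img

# Take the placeholder offline so the pool runs degraded (effectively RAIDZ1)
zpool offline newpool /root/fake-disk.img

# ...replicate the data across and verify it, destroy the original pool, then
# put a real disk into the placeholder's slot to restore full RAIDZ2 redundancy
zpool replace newpool /root/fake-disk.img sdv
rm /root/fake-disk.img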

But since you have already done the expansion, it would seem that your options now are:

  1. Leave it and hope that the pool is stable. I would run a scrub to confirm that it is (see the commands below).

  2. Back up all data (if not already backed up), destroy the pool, and create a new 7x RAIDZ2 pool, and then copy the data back again.
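For option 1, the scrub check is just:

zpool scrub archives
zpool status archives     # look for "scrub repaired 0B ... with 0 errors" once it completes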

IIRC zpool get only returns pool-level properties (so nothing for datasets), but I am away from my PC and my memory is a bit rusty (as you have seen).