24.10 RC2 RAIDZ expansion caused miscalculated available storage

I have a server with 6x 18 TB (16.37 TiB) drives in RAIDZ2 and, due to hardware limitations, needed to start with 4 drives and expand later.

Using 24.10 RC2, I extended my vdev twice, and during the second expansion I rebooted the system (the ZFS documentation says it is safe to reboot during expansion, and I needed to reboot). After both expansions completed, the usable capacity value was not updated to properly reflect the new capacity.


I did try the ZFS rebalancing script, which reduced the used capacity value from 10 TiB to around 8 TiB, matching my estimates, but the usable capacity value was unchanged.

I have so far tried rebooting the system and exporting/importing the pool, to little effect. Is there any shell command I can run to force TrueNAS to reassess how much space there actually is in the pool?

Not to “reassess”, but using the command-line will bypass the GUI, which can rule out the GUI giving you inaccurate information.

Pool status and composition:

zpool status gen-pop

Pool space information:

zpool list -o name,size,cap,alloc,free,frag,expandsize gen-pop

Dataset(s) and volume(s) space information:

zfs list -r -t filesystem,volume -o space,volsize gen-pop


OK, it seems like the UI is accurately reflecting what ZFS thinks; the only issue is that ZFS is wrong.

With a SIZE of 98.2T, the pool shows the total raw capacity of all 6 disks, but it is reporting 39.2T of available space, which is roughly the space of 3 disks, when I should have the space of 4. Is there any way to fix this?
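Back-of-the-envelope, here’s what I’d expect (approximate, using 16.37 TiB per drive and ignoring ZFS overheads):

```
# 6-wide RAIDZ2 -> 4 data drives: 4 x 16.37 TiB ≈ 65.5 TiB usable
# 5-wide RAIDZ2 -> 3 data drives: 3 x 16.37 TiB ≈ 49.1 TiB usable
# Reported:      ~39.2 TiB AVAIL + ~8 TiB USED ≈ 47 TiB, much closer to the 5-wide figure
```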

Is it possible to paste the output as text within triple backticks?

Screenshots don’t work well (and strain the eyes) when it comes to text output. They also tend to get cropped.

  pool: gen-pop
 state: ONLINE
  scan: scrub repaired 0B in 04:06:20 with 0 errors on Thu Oct 10 15:09:33 2024
expand: expanded raidz2-0 copied 23.0T in 1 days 06:17:31, on Tue Oct  8 20:23:56 2024
config:

        NAME                                      STATE     READ WRITE CKSUM
        gen-pop                                   ONLINE       0     0     0
          raidz2-0                                ONLINE       0     0     0
            fe1aa18b-6209-462f-b1c1-91e3b10baaf7  ONLINE       0     0     0
            1c31eefd-a3dd-4746-ba90-b606a4bd688a  ONLINE       0     0     0
            75bfa7d8-e547-41cc-8f31-ddeb6d569e4b  ONLINE       0     0     0
            350c1891-9021-4d56-950a-ec739a87c470  ONLINE       0     0     0
            117d28cc-5620-4bc7-b26d-b1dc51336dad  ONLINE       0     0     0
            4526a44d-4ed3-47a9-a237-7539b8ad6626  ONLINE       0     0     0

errors: No known data errors
admin@truenas[~]$ sudo zpool list -o name,size,cap,alloc,free,frag,expandsize gen-pop
NAME      SIZE    CAP  ALLOC   FREE   FRAG  EXPANDSZ
gen-pop  98.2T    17%  17.1T  81.1T     0%         -
admin@truenas[~]$ sudo zfs list -r -t filesystem,volume -o space,volsize gen-pop
NAME                                                      AVAIL   USED  USEDSNAP  USEDDS  USEDREFRESERV  USEDCHILD  VOLSIZE
gen-pop                                                   39.2T  8.30T        0B    198K             0B      8.30T        -
gen-pop/.system                                           39.2T  1.29G        0B    834M             0B       484M        -
gen-pop/.system/configs-ae32c386e13840b2bf9c0083275e7941  39.2T  7.42M        0B   7.42M             0B         0B        -
gen-pop/.system/cores                                     1024M   140K        0B    140K             0B         0B        -
gen-pop/.system/netdata-ae32c386e13840b2bf9c0083275e7941  39.2T   476M        0B    476M             0B         0B        -
gen-pop/.system/nfs                                       39.2T   163K        0B    163K             0B         0B        -
gen-pop/.system/samba4                                    39.2T   395K        0B    395K             0B         0B        -
gen-pop/disk-backup                                       39.2T  3.58T        0B   3.58T             0B         0B        -
gen-pop/ix-apps                                           39.2T   853M        0B    198K             0B       852M        -
gen-pop/ix-apps/app_configs                               39.2T  1.09M        0B   1.09M             0B         0B        -
gen-pop/ix-apps/app_mounts                                39.2T   140K        0B    140K             0B         0B        -
gen-pop/ix-apps/docker                                    39.2T   800M        0B    800M             0B         0B        -
gen-pop/ix-apps/truenas_catalog                           39.2T  51.4M        0B   51.4M             0B         0B        -
gen-pop/plex-bucket                                       39.2T  4.38T        0B   4.38T             0B         0B        -
gen-pop/plex-config                                       39.2T  9.39G        0B   9.39G             0B         0B        -
gen-pop/plex-transcode                                    39.2T   198K        0B    198K             0B         0B        -
gen-pop/userspace                                         39.2T   214G        0B    214G             0B         0B        -
gen-pop/vm-disks                                          39.2T   112G        0B    140K             0B       112G        -
gen-pop/vm-disks/Sneedbox-9ip2rx                          39.2T   112G        0B    112G             0B         0B     500G
gen-pop/youtube-dl                                        39.2T   304M        0B    304M             0B         0B        -

I hope this helps!

Something “happened”.

It’s as if only one (of the two) newly added drives expanded the pool’s capacity. (Hence the ~48 TiB instead of the expected ~64 TiB.)

After all, a RAIDZ2 comprising five 18-TiB drives yields about 48 TiB of usable capacity.

Then there’s this:

expand: expanded raidz2-0 copied 23.0T in 1 days 06:17:31, on Tue Oct  8 20:23:56 2024

So did it start to auto-expand your pool (“expand RAIDZ2”), but then get interrupted (by the reboot), and then after the reboot it “finished” the expansion?

Considering that RAIDZ expansion is fairly new, I wonder if perhaps you’re not meant to reboot in the middle of this process?

According to raidz expansion feature by don-brady · Pull Request #15022 · openzfs/zfs · GitHub:

“The pool remains accessible during expansion. Following a reboot or export/import, the expansion resumes where it left off.”

Of course, that’s kind of irrelevant, as I’m evidently in a weird state. Perhaps there’s a command that normally should be run at the end of the process, which my reboot interrupted?
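For what it’s worth, these are the shell checks I know of (the feature-flag name is my reading of the OpenZFS docs, so treat it as an assumption):

```
sudo zpool status gen-pop | grep expand           # the "expand:" line claims the reflow copied 23.0T and finished
sudo zpool get feature@raidz_expansion gen-pop    # state of the RAIDZ-expansion feature flag on the pool
```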

Okay, so it seems there’s an expected loss of capacity with RAIDZ expansion, based on how much data already exists (and thus still uses the old data:parity ratio). But you only had about 1 drive’s worth of data stored on your entire pool.

This feature (for OpenZFS 2.3.x) is too new for me.

I’d invite @HoneyBadger @kris @yorick or @Captain_Morgan to confirm if the loss of an entire 18-TiB drive’s worth of capacity is really expected from this…

If that’s really the expectation, then RAIDZ expansion is kind of disappointing, especially to those who want to expand their pools… because their pool is getting full.

I have doubts that the expected loss of capacity is what’s going on here; I only have 8 TiB of data. Even in my starting config of 4x 18 TB RAIDZ2, where I was effectively running with a 1:1 parity ratio, the lost capacity shouldn’t get anywhere near an 18 TB drive in size.
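Rough math on the worst case (approximate):

```
# 4-wide RAIDZ2 = 2 data + 2 parity: ~8 TiB of data occupies ~16 TiB of raw space
# 6-wide RAIDZ2 = 4 data + 2 parity: the same data needs only ~12 TiB raw once rewritten
# Worst-case "stranded" overhead is on the order of 4 TiB, nowhere near a 16.37 TiB drive
```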

I also ran the zfs-inplace-rebalance script, which reduced the used space by a few TiB, as seen in the difference between my first and second screenshots (in the OP), so I don’t think the parity-ratio mismatch overhead is even counted in the “usable capacity”.
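(For anyone unfamiliar with it: the script just rewrites every file in place so the data picks up the current vdev width. A much-simplified sketch of the idea, not the actual script, and the path is only an example; also, don’t do this on datasets with snapshots, since the rewritten copies consume extra space:)

```
# Grossly simplified illustration of in-place rebalancing (NOT the real script).
# Rewriting a file makes ZFS re-stripe it across the current 6-wide layout.
find /mnt/gen-pop/disk-backup -type f -print0 | while IFS= read -r -d '' f; do
    cp -a "$f" "$f.rebalance.tmp"   # fresh copy written with the new data:parity ratio
    mv "$f.rebalance.tmp" "$f"      # replace the original
done
```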


That’s why I find it odd that it’s acting “as if” you only expanded from 4-wide to 5-wide (of 18-TiB drives).

To lose out on a whole drive’s worth of capacity seems too much.

That was my initial assumption.

But the GUI (and the zfs command) is showing you the usable capacity of a 5-wide RAIDZ2 vdev (not 6-wide) of 18-TiB drives. (With parity and rebalancing taken into account, you’d lose more than just the parity itself, but I agree… surely not nearly an entire drive’s worth of space.)

EDIT: There might indeed be a zpool command to force it to properly expand the RAIDZ vdev, but like I said, the feature is too new, and I don’t feel comfortable “winging it”.
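If someone wanted to experiment anyway (entirely at their own risk), the only generic “claim new space” command I can think of is zpool online -e, which asks ZFS to use any newly available capacity on a device. Whether it does anything for a post-expansion RAIDZ vdev, I honestly don’t know:

```
# Speculative -- normally used after a disk/LUN grows; would be run once per member device.
sudo zpool online -e gen-pop fe1aa18b-6209-462f-b1c1-91e3b10baaf7
```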

Hopefully the other users I pinged can chime in and unravel this mystery.

Not sure what is going on.
@Hittsy
Did the 1st expansion go as expected? What usable capacity was displayed?

The second expansion seems to be the problem.
Was there evidence that it completed after reboot?

Have you rebooted again to see if ZFS recalculates the capacity?

@Captain_Morgan

Yes, the first expansion went as expected… I don’t remember the usable capacity that was displayed, however; I just fired off the 2nd expansion and then cancelled the automatic scrub (intending to run it after the 2nd one).

After the reboot, the expansion entry in the “Jobs” drop-down list disappeared, but disk activity continued for a few hours more. I considered the 2nd expansion ‘complete’ when disk activity returned to normal (which it did, after about the same amount of time as the first).

I have rebooted and tried exporting/importing the pool to resolve this; neither seems to affect the overall storage space.
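(For reference, the CLI equivalent of the export/import is just the following; TrueNAS normally wants this done through the GUI so that services using the pool are stopped first:)

```
sudo zpool export gen-pop
sudo zpool import gen-pop
```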

This is possibly the issue: the first expansion should have completed before starting another one.

To clarify, I waited for the first expansion’s entry in the job log to disappear before I started the 2nd expansion. Unbeknownst to me, finishing the first expansion kicked off a scrub. I then started the 2nd expansion, and stopped the scrub when I noticed it was causing a severe performance hit to the expansion.

Since the scrub should only be done after the expansion (as it would otherwise be useless), I have doubts it affected anything other than risking the integrity of my data.
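(For reference, scrub control from the shell, in case I need to redo it once this is sorted out:)

```
sudo zpool scrub gen-pop       # start a scrub
sudo zpool scrub -s gen-pop    # stop a running scrub
sudo zpool status gen-pop      # shows scrub and expansion progress
```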

I’d be more than happy to look over logs and find out otherwise, or provide them if needed.

I guess that filing a bug report in Jira cannot hurt.

https://ixsystems.atlassian.net/browse/NAS-131728

I was hoping to avoid this, but it’s done… Here’s hoping this helps.


So the first “expand” triggered a scrub upon finishing? But during the scrub, you started the second one?

Out of curiosity, does the GUI allow expanding RAIDZ with two extra drives simultaneously?

In a basic test, it would allow me to “enqueue” the second expand (select a second disk and hit the “Expand/Confirm” button), but it sat waiting for both the expand and scrub processes to complete before proceeding.


So it looks like this is a space-reporting display challenge with zfs list.

I did this in microcosm with a 3wZ1 → 5wZ1 and much smaller drives and data: 16G disks and about 2.8G of ISOs (a couple of 24.10 builds).
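(Roughly along these lines, using sparse file-backed vdevs; the paths are illustrative rather than my exact test setup:)

```
truncate -s 16G /tmp/vdev{1..5}                   # five sparse 16G "disks"
sudo zpool create expanding raidz1 /tmp/vdev1 /tmp/vdev2 /tmp/vdev3
sudo zfs create expanding/expandme
# ...copy ~2.8G of ISOs into expanding/expandme...
sudo zpool attach expanding raidz1-0 /tmp/vdev4   # expand 3-wide -> 4-wide, wait for it to finish
sudo zpool attach expanding raidz1-0 /tmp/vdev5   # expand 4-wide -> 5-wide
sudo zpool status expanding                       # the "expand:" line shows progress/completion
```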

With the files written to the 3wZ1 first, after the expand, it’s showing the AVAIL/USED as below:

NAME                AVAIL   USED  USEDSNAP  USEDDS  USEDREFRESERV  USEDCHILD  VOLSIZE
expanding           48.5G  2.78G        0B    128K             0B      2.78G        -
expanding/expandme  48.5G  2.78G        0B   2.78G             0B         0B        -

Delete the files, and I get back up to 51.3G AVAIL:

NAME                AVAIL   USED  USEDSNAP  USEDDS  USEDREFRESERV  USEDCHILD  VOLSIZE
expanding           51.3G  1.25M        0B    128K             0B      1.12M        -
expanding/expandme  51.3G   128K        0B    128K             0B         0B        -

But if I rewrite the exact same files again, the better space efficiency of 5wZ1 kicks in:

NAME                AVAIL   USED  USEDSNAP  USEDDS  USEDREFRESERV  USEDCHILD  VOLSIZE
expanding           49.1G  2.19G        0B    128K             0B      2.19G        -
expanding/expandme  49.1G  2.19G        0B   2.19G             0B         0B        -

So you may need to rewrite your files in place in order to get proper “accounting” from zfs list commands that query the filesystem vs. the pool structure.
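Put differently, the two views diverge: the pool-level numbers already reflect the full width, while the filesystem-level AVAIL estimate (which is what the GUI appears to report) still uses the old data:parity ratio until the data is rewritten:

```
sudo zpool list -o name,size,alloc,free gen-pop   # pool/raw view: sees all six disks
sudo zfs list -o name,avail,used gen-pop          # filesystem view: the AVAIL estimate in question
```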


But a whopping ~18-TiB difference?

It just seems like way too much, even in light of the less efficient parity ratio before in-place rewrites.

Looking at his numbers, you’d assume you’re looking at a 5-wide RAIDZ2 of 18-TiB drives.