RaidZ Expansion on ElectricEel Nightlies

Hello!

Support for RaidZ vdev expansion is now available in ElectricEel Nightlies.
For those who do not know, it is a new feature that allows you to extend an existing raidz vdev with another disk, e.g. going from a 5-wide raidz to a 6-wide raidz.


The Extend button is available when clicking on the top-level raidz vdev on the pool status page.

[screenshot: the Extend flow on the pool status page]
As you can see, the caveat of this operation is that the parity ratio is maintained.
That essentially means the usable capacity will be lower compared to creating a fresh vdev with that same number of disks.
To fix that and get to full capacity, one would essentially have to rewrite all the data so that it uses the new parity ratio.
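
To make the caveat concrete, here is a rough worked example (the numbers are mine, and the model ignores padding, metadata, and pool reservations):

```python
# Simplified model (assumed numbers): a 5-wide RAIDZ1 of 10 TB disks,
# expanded to 6-wide. Padding, metadata, and reservations are ignored.
disk_tb, parity = 10, 1

# A freshly created 6-wide RAIDZ1 would offer:
fresh_6wide = 6 * disk_tb * (6 - parity) / 6      # 50.0 TB usable

# Worst case: the 5-wide vdev was completely full before expanding.
old_data = 5 * disk_tb * (5 - parity) / 5         # 40.0 TB kept at the old 4:1 data:parity ratio
new_raw = 1 * disk_tb                             # raw space added by the new disk
expanded = old_data + new_raw * (6 - parity) / 6  # ~48.3 TB usable

print(f"fresh 6-wide RAIDZ1:  {fresh_6wide:.1f} TB usable")
print(f"expanded 5->6 RAIDZ1: {expanded:.1f} TB usable")
```

In this model, rewriting the old 40 TB after the expansion brings the pool back up to the full 50 TB.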

As always, we do not recommend putting any production data on Nightlies, but we welcome the community of early testers to help us verify that this functionality works as intended and to squash any remaining bugs, whether in the UI/API or at the ZFS level.


I was under the impression that only “pre-existing data on the expanded vdev” would use the “pre-expanded parity ratio”, is that not correct?

That’s correct.

I’d suggest inserting the words “pre-existing data on” between “The” and “expanded vdev” in the caveat 🙂

I read the original caveat as saying the vdev would use the pre-expanded ratio for future data too.

Is an expanded pool compatible with Dragonfish at this stage?

You should not be able to import an ElectricEel pool into Dragonfish (its pool feature flags, including raidz expansion, are newer than what Dragonfish’s OpenZFS supports).
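
If you want to check whether a pool already has the flag in play, OpenZFS exposes it as a pool feature. A quick sketch (the pool name `tank` is made up; you could equally run the `zpool get` command directly in a shell):

```python
import subprocess

# Query the raidz expansion pool feature; states are disabled/enabled/active.
# "active" means an expansion has happened, and an OpenZFS version without
# the feature can no longer import the pool.
subprocess.run(["zpool", "get", "feature@raidz_expansion", "tank"], check=True)
```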


Two questions I think it will be helpful to have answers to once the initial release goes live:

  1. Do we know how to calculate the expected new capacity when one disk is added, akin to the ZFS capacity calculator?
  2. Is there likely to be a script/process to force a rewrite of all the data, so that the pool benefits from the full usable space?

Personally, this is what I’m waiting for in order to figure out how I’m going to expand from a 6-wide vdev up to a 12-wide vdev for archival purposes (doing it in steps versus doing it all at once).

The answer to 2. should be the existing rebalancing script, assuming there is >50% free space.
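
For reference, the core idea behind such a rebalancing script is just “copy every file, then swap the copy in”, so that ZFS reallocates the blocks at the new data:parity ratio. A minimal sketch of that idea (this is not the actual community script, which has many more safety checks; note that snapshots will keep the old blocks alive, and it must not be run against files that are in use):

```python
import os
import shutil
import sys

def rebalance_tree(root: str) -> None:
    """Rewrite every file under root so ZFS reallocates its blocks
    at the current (post-expansion) data:parity ratio."""
    for dirpath, _dirs, filenames in os.walk(root):
        for name in filenames:
            if name.endswith(".rebalance.tmp"):
                continue                    # leftover from an aborted run
            path = os.path.join(dirpath, name)
            tmp = path + ".rebalance.tmp"
            shutil.copy2(path, tmp)         # copy data plus permissions/timestamps
            os.replace(tmp, path)           # atomically swap in the new copy

if __name__ == "__main__":
    rebalance_tree(sys.argv[1])             # e.g. /mnt/tank/archive
```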

If you understand how raidz expansion is implemented (it preserves the stripe width of existing data), it should be clear that the best way is the traditional one: backup, destroy, restore.
Multiple single-drive expansions on a filled pool can result in a significant loss of capacity:
https://louwrentius.com/zfs-raidz-expansion-is-awesome-but-has-a-small-caveat.html
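
The worst case from that article is easy to model. A back-of-the-envelope sketch (my assumptions: 10 TB disks, RAIDZ2, the pool is filled completely before each single-disk expansion, and padding/metadata overhead is ignored):

```python
DISK_TB = 10
PARITY = 2  # RAIDZ2

def usable_after_stepped_expansion(start_width: int, end_width: int) -> float:
    """Usable TB if the pool is filled up before each single-disk expansion."""
    width, stored, raw_used = start_width, 0.0, 0.0
    while width < end_width:
        free_raw = width * DISK_TB - raw_used
        stored += free_raw * (width - PARITY) / width  # written at the current ratio
        raw_used = width * DISK_TB                     # pool is now full
        width += 1                                     # attach one more disk
    free_raw = width * DISK_TB - raw_used              # space freed by the last disk
    return stored + free_raw * (width - PARITY) / width

stepped = usable_after_stepped_expansion(6, 12)
fresh = 12 * DISK_TB * (12 - PARITY) / 12
print(f"6->12, filled before each step: {stepped:.1f} TB usable")  # ~86.9 TB
print(f"fresh 12-wide RAIDZ2:           {fresh:.1f} TB usable")    # 100.0 TB
```

Rewriting the data afterwards (or expanding well before the pool fills up) closes most of that gap.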


…but rewriting the data should address this, right?

Yes.

Also, replicating a dataset should resolve it too… I think.
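
For anyone tempted to try that route: `zfs send | zfs recv` writes brand-new blocks, so the received copy lands at the new data:parity ratio. A rough sketch (pool/dataset names are made up; you still need enough free space for a second copy, and you would destroy the original and rename the copy afterwards):

```python
import subprocess

# Hypothetical names: pool "tank", dataset "tank/archive".
subprocess.run(["zfs", "snapshot", "tank/archive@rebalance"], check=True)

# Pipe zfs send into zfs recv; the copy is written at the new parity ratio.
send = subprocess.Popen(["zfs", "send", "tank/archive@rebalance"],
                        stdout=subprocess.PIPE)
subprocess.run(["zfs", "recv", "tank/archive_rebalanced"],
               stdin=send.stdout, check=True)
send.stdout.close()
send.wait()
```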

Yeah, but at that point, if that’s going to be done for every disk added… Is this process really that useful? Sounds like a ton of micromanagement that’s going to eat up time and open the door to all sorts of fun issues (“oops, I hadn’t copied that over yet” comes to mind).

But why would it? Adding disks, AFAIK, needs to be done one at a time, but then the rebalancing operation would only be needed once after the last disk was added. Right?

At the end of the day, you still get a disk’s worth of space added for each disk you add.


Thanks for adding a bit more to the discussion. I think it highlights the benefit of ensuring there is an in-depth guide. (Edit: to be candid, it should include a link to the standard rebalance script.)

One of my concerns is that expanding from, e.g., an 80%-full 6-wide RaidZ2 to an 8-wide stripe isn’t going to leave enough space for the expected backup/destroy/restore of archival data at the reduced capacity, hence the capacity-calculation query. But a stepped approach is likely to get stuck needing lots of additional (and, for a homelabber, likely unnecessary) capacity if not planned in advance.

I’ll be trying to keep usage below 50% of pre-expansion space for now.

Yes, but that assumes the pool has enough free space to rebalance.
Filling the pool up to 70-80%, then adding a disk (rinse and repeat), which is what a cash-strapped user would like to do, is not going to work nicely.

If you can front-load the cost of multiple drives in one go, while the pool is not yet filled to the limit, it is just easier to do it “the old way” rather than go through multiple rounds of single-drive expansion and rebalancing.


…but if you’ve added six drives to the vdev, that’s likely to be the case, no?