Theoretical RAID-Zx add parity - The Good, Bad & Ugly

This is not a TrueNAS issue, just a general discussion of the problems of adding a parity level to RAID-Zx. This does assume that RAID-Zx stripes are contiguous. I think they are, (just hedging my bets in case they don’t have to be…).

Some of this was taken from my post here:


The Good:

Someone who started out with a few disks in RAID-Z1, but now has too many for an easy backup, re-create and restore, could then add a parity level.

And perhaps even allow taking a single disk and adding parity, to end up with a 2 disk RAID-Z1. The user could later add data disks or another parity level. Thus, eventually allowing Mirror vDev to RAID-Zx vDev pool migration, live, with no down time.

Eventually the “block pointer update” code could possibly allow RAID-Zx vDev removal, with the data moving to equal-width or wider RAID-Zx vDevs.

This may even “fix” Device Removal for Mirrors and Stripe disks, where a virtual vDev is created for the removed vDev. That approach is both clumsy and requires in-memory pointers. Having a “fixed” version of Device Removal for Mirrors and Stripe disks does require “block pointer update”…


The Bad:

This is really, really, need I say it again? Really hard.

Adding parity will involve block pointer updates. Not the de-fragment or fragment that some people want. Just a plain block pointer update for a new data block location. So, VERY HARD, (see the toy sketch after this list), because of:

  • hard links
  • block clones
  • snapshots
  • dataset clones
  • de-duplication
  • bookmarks?
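
Here is a toy sketch of why a single relocation fans out, using invented structures, (this is nothing like ZFS’s real on-disk format): several independent block pointer trees, plus things like a de-duplication table, can all reference the same physical address, and there is no reverse map from an address back to everything that points at it.

```python
from dataclasses import dataclass

@dataclass
class BlockPointer:
    vdev: int
    offset: int               # where the data block currently lives

@dataclass
class BPTree:                 # stands in for a dataset, snapshot, clone, ...
    name: str
    pointers: list            # flat list of BlockPointer, for simplicity

def relocate(trees, old_addr, new_addr):
    """Naive block pointer rewrite: scan *every* tree for the old address."""
    touched = 0
    for tree in trees:
        for bp in tree.pointers:
            if (bp.vdev, bp.offset) == old_addr:
                bp.vdev, bp.offset = new_addr   # real snapshots are read-only,
                touched += 1                    # which is part of the problem
    return touched

live  = BPTree("dataset",  [BlockPointer(0, 0x1000)])
snap  = BPTree("snapshot", [BlockPointer(0, 0x1000)])   # same block, again
clone = BPTree("clone",    [BlockPointer(0, 0x1000)])   # and again
print(relocate([live, snap, clone], (0, 0x1000), (0, 0x9000)))   # -> 3
```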

Now why does that RAID-Zx stripe need a new location?

Simple: a full width RAID-Zx stripe without a spare block immediately after it can’t support an additional parity. Remember, adding a column to RAID-Zx does not add free space at the column level, but at the end of the vDev. Totally different from regular RAID-5/6.
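
To make the space problem concrete, some back-of-the-envelope arithmetic, (toy numbers and a one-sector-per-column simplification, nothing from the actual on-disk layout):

```python
# Toy arithmetic, assuming contiguous stripes and one sector per column.

def sectors_needed(data_cols: int, parity: int) -> int:
    """Sectors a contiguous RAID-Zx stripe occupies: data plus parity columns."""
    return data_cols + parity

for data_cols in (3, 5, 9):
    z1 = sectors_needed(data_cols, 1)     # existing RAID-Z1 stripe
    z2 = sectors_needed(data_cols, 2)     # same data at RAID-Z2
    print(f"{data_cols} data columns: {z1} -> {z2} sectors, "
          f"{z2 - z1} spare sector needed immediately after the stripe")
```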

Further, during the disk add routine, exactly the same as RAID-Zx expansion, any new writes would need to be restricted to the old maximum width, or perhaps written with the new parity level. This is needed because we must not let any old level of parity be written at the new maximum width. That would prevent additional parity from being added to that RAID-Zx stripe, (without adding yet another disk!).
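
Said as a rule, with made-up names, (write_allowed is not a real OpenZFS function, just a way to pin the restriction down):

```python
# Invented rule, invented names -- nothing here is real OpenZFS code.  While
# the parity add is in flight, a new write may use the new parity level, or
# stay within the old maximum width, but never old parity across the new,
# wider width: such a stripe could not be grown later without another disk.

def write_allowed(width: int, parity: int,
                  old_max_width: int, old_parity: int, new_parity: int) -> bool:
    if parity == new_parity:
        return True          # already written at the target parity level
    if parity == old_parity and width <= old_max_width:
        return True          # old parity, but not touching the new column
    return False             # old parity spread across the new width: forbidden

assert write_allowed(width=7, parity=2, old_max_width=6, old_parity=1, new_parity=2)
assert write_allowed(width=6, parity=1, old_max_width=6, old_parity=1, new_parity=2)
assert not write_allowed(width=7, parity=1, old_max_width=6, old_parity=1, new_parity=2)
```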

Next, tunables are needed, (a hypothetical sketch follows this list):

  • Number of read ahead blocks / RAID-Zx stripes
  • Maximum amount of memory to use
  • Ability to exceed the maximum amount of memory to use, when a single RAID-Zx stripe needs it
  • Pause & resume, to allow the heavy work to occur at desired times
  • Auto-pause on scrub or not, (but ALWAYS pause on Re-Silvers!)
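
Purely for illustration, those could surface as something like the following, (every name below is made up, none of these tunables exist in OpenZFS):

```python
# Hypothetical tunables for the "add parity" pass -- all names invented.

from dataclasses import dataclass

@dataclass
class AddParityTunables:
    readahead_stripes: int = 64         # RAID-Zx stripes / blocks to read ahead
    max_memory_bytes: int = 1 << 30     # cap on working memory (1 GiB here)
    allow_memory_overrun: bool = True   # exceed the cap when one huge stripe needs it
    paused: bool = False                # manual pause / resume of the heavy work
    autopause_on_scrub: bool = False    # optionally yield to scrubs...
    autopause_on_resilver: bool = True  # ...but ALWAYS yield to re-silvers
```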

Basically this is done in 3 phases:

  1. Add the additional disk, but limit new writes to the new parity level, or to the prior maximum width.
  2. Perform the post-disk-add scrub.
  3. Move the data to a location which has space for the additional parity.

The last phase requires scanning through the RAID-Zx stripes to look for old-parity-level stripes. But, because the wider stripes need more space, it should probably be done starting with the widest and working down.
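
Tying the phases together, here is a very hand-wavy sketch of just that phase 3 scan order, (Stripe and add_parity_order are invented for this post, there is no such code):

```python
# Hand-wavy outline of phase 3 only: find the stripes still at the old parity
# level and queue them widest-first, since the widest ones need the largest
# contiguous landing spot.

from typing import NamedTuple

class Stripe(NamedTuple):
    offset: int      # where the stripe currently lives on the vDev
    width: int       # number of columns it spans
    parity: int      # its current parity level

def add_parity_order(stripes, old_parity):
    old = [s for s in stripes if s.parity == old_parity]
    return sorted(old, key=lambda s: s.width, reverse=True)   # widest first

# Toy data: one stripe already converted (parity=2), two still at RAID-Z1.
stripes = [Stripe(0x0000, 4, 1), Stripe(0x4000, 10, 1), Stripe(0x8000, 6, 2)]
for s in add_parity_order(stripes, old_parity=1):
    print(f"relocate stripe at {s.offset:#x}, width {s.width}, "
          f"needs {s.width + 1} contiguous free sectors at the new parity")
```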


The Ugly:

The biggest stumbling block for this feature is the “block pointer update”. As soon as word gets out that you are working on such, people will jump all over you and assume / ask / demand / threaten, all wanting the Holy Grail of de-fragmenting or fragmenting.

That is in a whole other universe of HARD. Right up there with changing checksum, compression, encryption & de-duplication in place.

There are going to be people that just will not accept that general purpose de-fragmenting or fragmenting is much harder. I can foresee that anyone working on “block pointer update”, either for RAID-Zx vDev removal or Add Parity, would need someone to filter communications.

Otherwise the programmer could go a bit crazy, trying to get people to READ the scope of the project. (Which does not include de-fragment or fragment…)

I did say this was ugly, not bad or hard.