RaidZ Expansion on ElectricEel Nightlies

You would need to do a fresh install of the BETA and restore config. The update validation is alphabetical, so it treats a move from MASTER to BETA as a downgrade and prevents it.

2 Likes

So if one were to rename it to NASTER, would that be considered an upgrade then?

1 Like

I’m not formally proposing this, since I don’t (yet?) run SCALE, but integrating the in-place rebalancing script would be a really useful feature.

Expanding the pool and then prompting “do you want to rebalance the pool?” would make this much smoother.

It would even be useful when expanding the pool “the old-fashioned way”, i.e. adding another vdev. I’m only starting out on my TrueNAS journey and this script seems like something essential to run after a pool expansion, yet if I weren’t the type of person to go to the official forum to learn things (and most people aren’t), I would never have realized it existed.

I wouldn’t say so. Helpful, I guess, but not essential; it just isn’t that important in most cases to “rebalance” the data on your pool.

The reason it’s being mentioned in connection with RAIDZ expansion is that, when a pool is expanded, any data that was already on the pool is at the old data/parity ratio, which means you’re missing potential capacity. Data written to the pool after expansion is at the new ratio. Thus, rebalancing, which means rewriting all the data, recaptures that missing potential capacity.
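For what it’s worth, the core of the in-place approach is simply “copy the file, then replace the original with the copy”, so the new blocks get written at the current data/parity ratio. A rough sketch of that idea, not the actual script; the path is a placeholder, and it assumes no snapshots, no hard links, and no files in use:

find /mnt/tank/media -type f -print0 | while IFS= read -r -d '' f; do
    cp -a "$f" "$f.rebalance" && mv -f "$f.rebalance" "$f"   # rewrite each file so its blocks are re-allocated
done
# caveat: on OpenZFS 2.2+ with block cloning, cp may clone rather than physically rewrite the blocks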

2 Likes

Rebalancing also reduces free space fragmentation, and can be used to enforce a dataset’s blocksize change. It has plenty of nice uses, but there are a few caveats (i.e. deduplication, snapshots, etc.).
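On the blocksize point, as a quick illustration (dataset name is just an example): a new recordsize only applies to blocks written after the change, so a rebalancing rewrite is what actually migrates existing files to it.

zfs set recordsize=1M tank/media   # affects newly written blocks only; rewrite the files to migrate old data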

I will also point out that integrating it in the WebUI will likely need more work (i.e. checking that there is enough free space to rebalance). I do believe that TN needs such a feature (I would argue that the expansion itself should do this work, but whatever), and integrating the script is a good way to avoid starting from zero.
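As a rough idea of the kind of pre-flight check involved (pool name is a placeholder): a copy-based rebalance temporarily needs room for a second copy of whatever file is being rewritten, so at minimum you’d want to look at the pool’s headroom first.

zpool list -o name,size,allocated,free,fragmentation,capacity tank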

If iX deems this feature not worthy of implementation, so be it: it’s still incredibly easy to run the script. Plus, I do not yet use SCALE.

The script that does a manual rebalance could have a flaw, in that if the file is in use, your copy could end up missing data.

Even if snapshots are used, because a snapshot is a point in time: if the file is still being written, none of the data after that point gets copied.

Disclaimer: I only glanced at the “rebalance script” forum thread years ago, so it may handle files in use correctly (e.g. detect files being changed after the copy started and throw out the new, incomplete copy).

That’s why it requires work.

Also, you need to delete all your snapshots before rebalancing… assuming you don’t want to double your disk usage.
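For example, something along these lines, with the dataset name as a placeholder (the echo makes it a dry run; remove it to actually destroy):

zfs list -H -t snapshot -o name -r tank/media | xargs -r -n1 echo zfs destroy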

Yeah. I’m not even sure if block cloning would void it.

One other thing about the rebalance script.

A way to reduce the window of exposure to user access is with hard links:

ln SRC_FILE TEMP         # keep a second name pointing at the original inode
ln -f NEW_FILE SRC_FILE  # replace the SRC_FILE directory entry with the rebalanced copy

If TEMP was not modified (access does not matter, as the files are identical), then you can simply delete TEMP. If, on the other hand, TEMP was modified, then a user opened SRC_FILE after the first hard link but before the second one.
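Roughly, as a sketch of that idea (file names are placeholders; I used mv for the swap since rename is atomic, and the rollback at the end is just one way to handle the modified case):

before=$(stat -c %Y "$SRC")    # record the original's mtime before making the rebalanced copy
cp -a "$SRC" "$NEW"            # write the rebalanced copy
ln "$SRC" "$TEMP"              # keep a second name pointing at the original inode
mv -f "$NEW" "$SRC"            # atomically swap the directory entry over to the new copy
if [ "$(stat -c %Y "$TEMP")" -eq "$before" ]; then
    rm "$TEMP"                 # nothing wrote to the original in the window
else
    mv -f "$TEMP" "$SRC"       # something did, so put the original back
fi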

Don’t quote me, but I’ve used something like this to change out a critical OS file, live, without impact.

This step could even be put into a binary to reduce the exposure window even further.

I believe the script does not touch hard links.
An easy way would be to stop all shares and Apps/VMs while working on the pool, which I assume is done anyway when expanding it.

ZFS never stops anything. Any and all zfs changes are always done live without downtime :wink:

This is why you can’t have nice things… like raidz expansion… without a lot of careful design/implementation work :wink:

Or thread-safe rebalancing.

2 Likes

I see. Since it appears that RAIDZ expansion is getting integrated into TN, I assumed it was being done in a safe way. I also assumed that, in order to expand the vdev, no files should be accessed during the process.

Saw this in the EE blog post

RAIDZ Expansion allows RAIDZ (Parity) vdevs to be expanded by one drive at a time, ideal for small footprint systems looking to expand incrementally. This feature also permits 2-drive RAIDZ1 systems, ideal for home users who want to start small on a budget.

Did not know about 2-wide Z1. Is that a thing now? Good if it is.

What about 3-wide Z2? (Would be nice to have a degraded 3-wide ;))

It is, but it looks like the UI is still enforcing the 3-drive minimum at the moment.

truenas_admin@ElectricEel[~]$ sudo zpool status TwoWideZ1
  pool: TwoWideZ1
 state: ONLINE
config:

        NAME        STATE     READ WRITE CKSUM
        TwoWideZ1   ONLINE       0     0     0
          raidz1-0  ONLINE       0     0     0
            sdb     ONLINE       0     0     0
            sdc     ONLINE       0     0     0

errors: No known data errors
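For reference, creating it takes a trip to the shell, something along these lines (disk names as in the output above):

sudo zpool create TwoWideZ1 raidz1 sdb sdc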

Not sure if a 3wZ2 will be allowed, but it’s in the same vein of logic as a 2wZ1 - create it now with more overhead (way more, in the Z2 arrangement) and then grow drive-by-drive with the new expansion.

I don’t think creating intentionally degraded arrays is going to be permitted; you’ll probably still be in the HERE BE DRAGONS camp on that.

2 Likes

…from the UI, anyway. I’m sure you aren’t locking it out at the shell.

It depends on which shell you’re talking about. Our backend APIs / CLI prevent creating a 2-wide RAIDZ1 vdev. Once you have a conventional (non-CLI) shell, the gloves are off and you can basically do whatever you want to the system (regardless of how ill-advised), as forum regulars are probably more than aware.

Generally speaking, we have additional validation in UI / APIs to enforce some semblance of best-practices, or at a minimum ward off some worst-practices. :slight_smile:

Wow (and not the Wow! signal from 1977), 2-drive RAID-Z1… That takes what would otherwise necessitate a 2-disk mirror vdev to a whole new level. (Necessitate because the user has just 2 disks right now…)

Gee, does that new feature support a RAID-Z3 with 1 data and 3 parity?
If so, can you create a RAID-Z3 with a single disk?
Using 3 degraded disks?

Or similarly, a 3 disk RAID-Z2, with 2 degraded disks?


Now all we need is the ability to add parity: take a 2-disk RAID-Z1 to 3 disks, then later add another disk for more parity to make a 4-disk RAID-Z2. (The new parity would probably be striped, just like with the RAID-Zx expansion feature.)

Gee, if we are going to be greedy with adding parity, we need column remove and parity remove from RAID-Zx.

And how about Block Pointer Re-write while we are at it?

4 Likes

If you are going to those lengths already… you may as well just rewrite the entire thing from scratch with a license that is compatible with Linux…

Oh wait, they already did that with BTRFS… silly me. I guess we need yet another rewrite… Let’s call it SRSLYBTRFS.

1 Like