I recently expanded my 5-disk raidz2 pool (14.55 TiB disks) to 6 disks and wanted to try the new zfs rewrite command to rebalance the data and pick up the new data-to-parity ratio. That had a few surprises in store…
I originally had around 59% used space on the pool (out of around 42 TiB), but wanted to expand it to make room for a large upcoming import. After the expansion the GUI showed 51.54 TiB of usable space. I know the GUI has some issues reporting capacity after an expansion, and zpool list confirmed that all was well.
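For anyone following along, the expansion and the sanity check were basically just this (the pool and device names here are placeholders, not my actual ones):

```sh
# Attach a sixth disk to the existing raidz2 vdev (raidz expansion).
# "tank" and /dev/sdf are placeholders for your own pool and disk.
zpool attach tank raidz2-0 /dev/sdf

# Follow the expansion, then confirm the new capacity once it finishes.
zpool status tank
zpool list tank
```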
I started the zfs rewrite and it took around 1.5 days to complete, as expected. What I didn't expect was that the used space almost doubled during the rewrite!
Used space had reached 93.2% by the time the rewrite finished, and it was not fun watching the climb and getting all the capacity alerts (zfs rewrite doesn't have a progress meter). I knew the GUI was wrong, but my head still spins when I see the red numbers.
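Since there is no progress meter, the closest thing I had was polling per-vdev allocation while the rewrite ran. Roughly like this (the dataset path is a placeholder, and -r is the recursion flag as I read the zfs-rewrite man page; double-check the flags on your release):

```sh
# Start the rewrite on the whole dataset; -r recurses into directories.
zfs rewrite -r /mnt/tank/mydata &

# Poll per-vdev allocation every 10 minutes until the rewrite exits;
# zfs rewrite itself reports nothing while it runs.
while kill -0 $! 2>/dev/null; do
    zpool list -v tank
    sleep 600
done
```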
I had to delete all the old snapshots to recover the space, but the GUI still holds on to the wrong number (known issue).
I know you are working on adding rebalancing to the GUI, and that would be a welcome addition. I was not aware that I needed (in my case) 2x space available for the rewrite. Maybe deleting snapshots before the rewrite (and pausing snapshot tasks during it?) is a good strategy. I was only at 59% used space when I started my expansion, and I suspect many people start with much more; will they be able to rewrite at all?
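For anyone sizing this up beforehand: you can check how much space your snapshots are pinning, and dry-run the destroy, before committing (dataset and snapshot names below are made up; keep -n until the list looks right):

```sh
# How much space is held by snapshots vs. live data, per dataset.
zfs list -r -o name,used,usedbysnapshots,usedbydataset tank

# List the snapshots that would have to go.
zfs list -t snapshot -r tank

# Destroy a range of snapshots (oldest%newest). -n is a dry run and -v
# prints what would be destroyed; drop -n only once you are sure.
zfs destroy -nv tank/mydata@auto-2024-01-01%auto-2025-06-01
```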
Me too. But on second thought, it makes sense (if there are snapshots involved). AIUI, snapshots point to the original set of blocks. A rewrite doesn't move those blocks so much as it writes new ones spread across the vdevs, and as long as a snapshot still references the old blocks, ZFS cannot free them, so you temporarily carry both copies. There is a limit to how smart ZFS can be here.
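You can watch exactly that happen in the space accounting. A minimal sketch, assuming a dataset named tank/mydata: snapshot it, rewrite it, and the snapshot column grows toward the full size of the rewritten data:

```sh
zfs snapshot tank/mydata@before-rewrite
zfs rewrite -r /mnt/tank/mydata

# USEDSNAP climbs toward the dataset's full size, because the snapshot
# still references every pre-rewrite block while the rewrite lays down
# brand-new copies of the data.
zfs list -o space tank/mydata
```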
As I understand it now, you effectively have to give up all your snapshots as well to get the space back. That makes this solution far from perfect (at least for me).
ZFS rewrite does not do anything to snapshots or cloned datasets, and possibly not to block clones either. All of those would need to be removed before a rewrite if more than 50% of the pool is in use, since the old and new copies have to coexist until the references are gone.
While ZFS rewrite seems like an acceptable feature, it is not magic. It is much faster than rewriting via a script or external program, but it has exactly the same limitation around multiple pointers to the same data blocks (snapshots, clones, etc.).
The main benefit of ZFS rewrite vs. the rebalance script is file integrity. With the script, it is possible for a user and the script to interact with a file at the same time, potentially causing data loss.
That's because the script basically copies a file under a new name, verifies the copy succeeded, deletes the old file, then renames the copy back to the old name. It's a simple way to force a rebalance or to start moving content onto sVDEVs.
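In shell terms, the core of that loop is roughly the following. This is a simplification for illustration, not the actual script: no locking, no edge-case handling, and it must only run on files nothing else is touching.

```sh
#!/bin/sh
# Minimal sketch of the copy/verify/swap approach a rebalance script uses.
for f in /mnt/tank/mydata/*; do
    [ -f "$f" ] || continue
    cp -a "$f" "$f.rebalance.tmp"            # copy -> allocates new blocks
    if cmp -s "$f" "$f.rebalance.tmp"; then  # verify the copy byte-for-byte
        rm "$f"                              # delete the original
        mv "$f.rebalance.tmp" "$f"           # rename the copy into place
    else
        rm "$f.rebalance.tmp"                # bad copy: keep the original
        echo "verify failed: $f" >&2
    fi
done
```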
Before using the rebalance script, I suggest turning off periodic snapshot tasks and deleting all existing snapshots. It should be no different with ZFS rewrite, except that ZFS rewrite is allegedly much more sophisticated at preventing data loss or corruption.
I have zero experience with ZFS rewrite, but as with any new feature in TrueNAS, there will be bugs. I am surprised one as big as this made it into production, however. If you can control access to the NAS, then for now it might be better to use the rebalance script, as it does not suffer from the fill bug.