Hi TrueNAS Community, long-time lurker (since FreeNAS 9), first-time poster.
I have a SCALE array of 7xRAIDZ2 VDEVs | 5 wide each, made of 10TB drives with a Special VDEV for metadata only on 3-way mirrored NVMe. In the past, for capacity expansion on simple pools, I’ve always built a 2nd pool with larger disks and done a zfs send | zfs recv to migrate, then repointed all my shares and services. But because I want to keep my metadata on NVMe, and I’m maxed out on NVMe slots, that wasn’t a fit this time. I read on the forum that TrueNAS SCALE should be able to do simultaneous replace operations through the GUI, so having a few empty drive slots I planned on doing this in 5 rounds of 7 drives each (one in each VDEV).
Imagine my surprise when at the end of the resilvering it showed only 1 unused drive and started the second resilvering. Further research shows others have experienced the same …seems to operate in series not parallel…
Was this a capability misunderstanding on my part? Or is there a different process I need to follow to keep from having to do all 34 additional resilvers?
I do recall it working in parallel in the past, but it’s certainly serial these days.
I suspect the behaviour may have changed when “Sequential Re-silvering” was added to ZFS.
And yes…
And the good news is that apparently there’s a solution:
You’re hitting the resilver_defer feature, which attempts to prevent restarting if another disk faults in the middle of a resilver, but mostly surprises people who try to resilver multiple things at once. See #14505 for a discussion.
So, the quick summary is that this is the resilver_defer feature at work. Adding a new device to a resilver that is already running would mean restarting it from the beginning, even if it has been running for days, so any resilver requested in the meantime is deferred until the current one finishes.
But this means that in the situation where you start a bunch of replacements and you want them all to run simultaneously, they won’t.
You can force it to do them all immediately with zpool resilver.
So, if you instead force a resilver, it will start resilvering again… with ALL the pending resilvers being done in parallel.
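For anyone who wants to try this from the shell, here is a minimal sketch of the sequence described above: queue the replacements, then issue one zpool resilver so the deferred ones are picked up together. The pool name "tank", the device names, and the helper functions are placeholders I made up for illustration. It defaults to a dry run that only prints the commands. Keep in mind that zpool resilver restarts a resilver that is already in progress from the beginning, so it is best issued soon after queuing the replacements.

```shell
#!/bin/sh
# Sketch only: POOL and the device names below are hypothetical placeholders.
# With DRY_RUN=1 (the default here) commands are printed instead of executed.
POOL="${POOL:-tank}"
DRY_RUN="${DRY_RUN:-1}"

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Queue one replacement per "old:new" pair, then restart the resilver so
# that every deferred replacement is folded into a single pass.
replace_and_resilver() {
    for pair in "$@"; do
        old="${pair%%:*}"   # text before the first ':'
        new="${pair#*:}"    # text after the first ':'
        run zpool replace "$POOL" "$old" "$new"
    done
    run zpool resilver "$POOL"
}

replace_and_resilver old-disk-1:new-disk-1 old-disk-2:new-disk-2
```

Set DRY_RUN=0 only after checking the printed commands against your own pool layout.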
A neat trick. And yes, it would probably be better if OpenZFS had implicit magic to make this decision.
BTW, this probably means you can replace 2 drives at a time per vdev… or perhaps even more?
The last drive-replacement-for-expansion I did on my main NAS did 6 replacements simultaneously; this was in January of this year under SCALE 23.10. If memory serves, I started one replacement a while after the rest, and it waited until the rest finished, but five disks were resilvering simultaneously.
Awesome, thanks for the helpful replies. Indeed, it looks like the 1st replace operation triggered the 1st resilver, but looking at zpool iostat -vy 1 1 I’m seeing writes hitting all the remaining replacements at once.
This is good to know. Like you’ve pointed out, there’s no magic solution that will work for everybody, but at least we have a mechanism to trigger a consolidated resilver.
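Besides watching write activity with zpool iostat, a rough way to confirm what is running in parallel is to count the per-device annotations in zpool status. The sketch below assumes the "(resilvering)" and "(awaiting resilver)" markers that OpenZFS prints next to a device. The function name and the pool name "tank" are made up for illustration, so check the markers against your own output first.

```shell
# Sketch: count devices by resilver state in `zpool status` text read from stdin.
# Assumes the "(resilvering)" and "(awaiting resilver)" annotations; verify
# that your OpenZFS version prints them the same way.
count_resilver_states() {
    awk '
        /\(resilvering\)/       { active++ }
        /\(awaiting resilver\)/ { deferred++ }
        END { printf "resilvering=%d deferred=%d\n", active, deferred }
    '
}

# Usage (pool name "tank" is a placeholder):
#   zpool status tank | count_resilver_states
```

A non-zero deferred count would suggest some replacements are still waiting for the current pass to finish.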
Ooh good idea, I need to shift my mental model to accommodate this. Coming from the legacy world of RAID with having to swap out disks to force online capacity expansion - ZFS is so great to be able to do this without impacting redundancy.
Thank you all for your help on this. Hopefully others benefit from this knowledge as well.