However, that page only addresses the procedure for replacing a failing disk (admittedly the more common case) and not the procedure for replacing a working disk.
The page mentions that in certain circumstances it might be okay to skip taking the disk to be replaced offline (and removing it) before using the “Replace” command, but it doesn’t clarify what those circumstances are.
It would be great to have that clarification and, ideally, the proper steps to replace a working disk (perhaps taking it offline and removing it from the system before running the “Replace” command isn’t needed in this case?). If that is already documented elsewhere, a link to that place would be very welcome.
Impact
Updating the documentation in this respect would give users clarity on the procedure. If it is indeed the case that a working disk that is to be replaced does not need to be taken offline and removed first, this would ensure (additional) redundancy during the replacement/resilvering process and could potentially help avoid a failing pool.
User Story
After reading and parsing this page very carefully, I am still confused as to how it applies to my circumstances (replacing a working drive to upgrade storage space). Having that clarification would be reassuring.
So, if you want to replace a working disk, and can have the replacement online while the original disk is online, just select the disk to replace, and then select the replacement.
When the replacement is finished, the original disk is detached from the pool and can then be physically removed.
If you can’t have both online, then offline the original disk, physically remove it, then replace it, as per the failed disk instructions.
Do not offline the disk and then replace it without physically removing it first, or you risk having TrueNAS erase your original disk, depending on whether a “bug” has been fixed or not.
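As a rough illustration of the two paths described above, here is a small Python sketch that only builds the ordered list of steps and does not run anything. It assumes the TrueNAS UI actions correspond roughly to the plain `zpool replace`, `zpool offline` and `zpool status` commands. The function name `replacement_plan`, the pool name `tank` and the device names `sda` / `sdb` are made up for the example, and on a real TrueNAS system the web UI is the supported way to do this.

```python
from typing import List


def replacement_plan(pool: str, old_disk: str, new_disk: str,
                     both_online: bool) -> List[str]:
    """Return the ordered steps for swapping old_disk for new_disk in pool.

    Hypothetical helper: it only describes the steps, it never executes them.
    """
    if both_online:
        # Both disks attached: replace directly, the old disk keeps
        # contributing to redundancy until the resilver is done.
        return [
            f"zpool replace {pool} {old_disk} {new_disk}",
            f"zpool status {pool}",
            f"(manual) physically remove {old_disk} once the resilver has finished",
        ]
    # Only one slot available: offline and pull the old disk first,
    # as in the failed disk instructions.
    return [
        f"zpool offline {pool} {old_disk}",
        f"(manual) physically remove {old_disk} and install {new_disk}",
        f"zpool replace {pool} {old_disk} {new_disk}",
        f"zpool status {pool}",
    ]


if __name__ == "__main__":
    for step in replacement_plan("tank", "sda", "sdb", both_online=True):
        print(step)
```

Note that in the second path the physical removal comes before the replace step, which is the ordering the warning above is about.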
Thank you so much for your quick response and the explanation. This clarifies the procedure for me. The only reason I’m not marking this as the solution at the moment is that it would be good if that info found its way into the TrueNAS online documentation, so that others may also benefit from it.
A follow-up question, if I may, out of curiosity: When a disk gets replaced with the original disk still online in the pool, ZFS could in theory just copy the content of the old disk to the new disk, which would mean less wear on the system as a whole. Does ZFS do that, or does it still resilver the new disk as if the original disk weren’t there anymore?
If you choose to replace a working disk (or a disk with bad blocks) while it is still online, using another available disk, we sort of call this “replace in place”. And yes, ZFS supports this.
To sum up, ZFS will “mirror” the old disk to the new disk and, when the resilver is complete, remove the old disk (aka break that temporary mirror). And yes, it actually shows up similar to a mirror in the output of zpool status.
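If you want to check for that temporary mirror from a script, a minimal sketch could look like the following. It assumes the temporary vdev appears in the zpool status device list under a name beginning with `replacing-` (for example `replacing-0`). The function name `replacing_vdevs` is made up for this example, and the exact layout of the status text may differ between ZFS versions.

```python
from typing import List


def replacing_vdevs(status_text: str) -> List[str]:
    """Return the names of temporary 'replacing-N' vdevs found in status text.

    Hypothetical helper: it looks only at the first whitespace-separated
    field of each line, which is where the device name is assumed to be.
    """
    names = []
    for line in status_text.splitlines():
        fields = line.split()
        if fields and fields[0].startswith("replacing-"):
            names.append(fields[0])
    return names
```

The text could come from something like `subprocess.run(["zpool", "status", "tank"], capture_output=True, text=True).stdout`, with `tank` standing in for your pool name. An empty result would suggest that no replacement is currently in progress.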
Doing this “replace in place” has 2 benefits:
1. Uses the source disk for all good blocks, resorting to any available redundancy for bad blocks.
2. Maintains as much redundancy as it can during the replacement. This is especially important for 2 way Mirror vDevs or RAID-Z1 vDevs.
Of course, “replace in place” does not work on a completely dead / faulted disk. It only works for good disks or disks with a few bad blocks (and no more spares available).