Best strategy to upgrade a big JBOD?

Hi,

I am replacing my old 44x 18 TB JBOD with a new 78x 22 TB JBOD.

To avoid downtime:

  • I have connected the new JBOD to the TrueNAS server.
  • Now I plan to replace all disks in the first JBOD with newer disks in the second JBOD.
  • When all the disks in the first JBOD have been replaced, I can disconnect the first JBOD.

I would like to know what the best strategy is to replace the disks:

  1. Replace only one disk at a time for the whole pool?
  2. Replace one disk per vdev at a time?
  3. Replace several disks per vdev at a time?

My zpool layout is: 7 x 6-wide RAID-Z2 vdevs

I know that option #1 is the safest, but it would take months to upgrade the JBOD.
So I wonder if the other options are viable.

The server is

  • Supermicro 2U server
  • CPU: 2 x Intel Xeon Gold 6226R, 16 cores @ 2.90 GHz
  • RAM: 512 GB DDR4 ECC
  • HBA: Broadcom 9300-8e
  • OS: TrueNAS Scale Dragonfish-24.04.2

Best Regards,

If you have both the old 44x 18T JBOD and the 78x 22T JBOD, I would suggest you simply make a second, new pool on the new JBOD, and use a ZFS replication job to send your files, data, and existing snapshots over to the new pool.
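Under the hood, a replication job boils down to a recursive snapshot plus a `zfs send | zfs recv` pipeline. Here is a dry-run sketch that only prints the commands involved; the pool and snapshot names (`oldpool`, `newpool`, `migrate1`) are placeholders, not from the original post:

```shell
#!/bin/sh
# Print (not execute) the snapshot + send/recv pipeline a replication job
# performs. Pool and snapshot names are hypothetical examples.
replication_cmd() {
  src=$1 dst=$2 snap=$3
  echo "zfs snapshot -r ${src}@${snap}"
  echo "zfs send -R ${src}@${snap} | zfs recv -F ${dst}"
}

replication_cmd oldpool newpool migrate1
```

A second, incremental `zfs send -i` pass after quiescing writes keeps the final cutover window short.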


The fastest thing to do is likely what our resident “don’t give a sh*t” suggests, with the additional (potential) advantage that you can change your pool layout as (or if) desired. But you can’t do that without at least some disruption of services. OTOH, that method leaves all your data intact on the old JBOD until you decide to decommission it.

Not quite as fast, but I’d probably vote for option #2: replace one disk per vdev at a time.

It’s what I did the last time I did an in-place upgrade on my pool: replaced 6x 4 TB disks with 6x 16 TB units, simultaneously. I expect it would have worked just as well if I were upgrading two or more vdevs at a time, but that wasn’t my situation.


Thank you for the advice. I will try solution #2 (replace one disk per vdev) because I wish to avoid any downtime.

@blanchet - One neat feature of OpenZFS is that you can replace disks in place. This is what you are doing with the extra JBOD & disks.

How this works is that ZFS creates a temporary mirror between the source and destination disks. If the source suffers a block problem, the vdev’s redundancy is used to populate the destination disk. If the destination disk dies during the process, no problem: your source disk is still present. When the destination disk is fully synced, the source disk is automatically detached, leaving the RAID-Z2 vdev as normal.
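The one-disk-per-vdev approach comes down to running one `zpool replace` per vdev. A dry-run sketch that only prints the commands, so the mapping can be reviewed first; the pool name (`tank`) and device names (`da0` etc.) are hypothetical and must be substituted with your own disk IDs:

```shell
#!/bin/sh
# Dry-run sketch: print one "zpool replace" command per vdev without
# touching the pool. Pool and device names are placeholders.
plan_replacements() {
  pool=$1; shift
  for pair in "$@"; do            # each argument is old:new
    old=${pair%%:*}
    new=${pair#*:}
    echo "zpool replace $pool $old $new"
  done
}

# One old:new pair per vdev; review the output, then run the printed
# commands once you are happy with the mapping.
plan_replacements tank da0:da44 da6:da50 da12:da56
```

In practice, stable `/dev/disk/by-id/` names are safer than `daN` device nodes, which can be renumbered across reboots.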

What this means is that it is reasonably safe to perform 100% of the disk replacements at the same time. Of course, performance will suffer, and managing that many simultaneous replacements is a concern too.

So if performance is not too bad, you can safely start another disk per vdev. For example, if an existing set of replacement disks is almost complete but you want to go home, start the next set. By morning the earlier set should be done, and you will not have wasted any time overnight.
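Deciding when the next batch can start is just a matter of checking `zpool status` for an active resilver. A small sketch of that check; the status text here is canned for illustration, and the exact wording is an assumption about OpenZFS output (in real use you would pipe in `zpool status <pool>`):

```shell
#!/bin/sh
# Sketch: report whether an active resilver appears in status text read
# from stdin. The sample status line below is canned, not live output.
resilver_active() {
  grep -q 'resilver in progress'
}

if echo "  scan: resilver in progress since Mon" | resilver_active; then
  echo "still resilvering - wait before starting the next batch"
else
  echo "resilver finished - safe to start the next batch"
fi
```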

That said, changing one disk per RAID-Z2 vdev is both safe and a smaller performance hit.

After doing some tests, I confirm that I can simultaneously replace one disk per vdev without any real extra overhead.

I have found this article from Chris Siebenmann that explains the technical reason.
https://utcc.utoronto.ca/~cks/space/blog/solaris/ZFSMultidiskResilversFree

So I will upgrade my JBOD this way.