Z2 with fewer disks or Z3 with more?

Let’s say we have 24 disks: we can do a 6-disk Z3 × 4 vs. a 4-disk Z2 × 6. Both should give roughly the same total capacity. Is there any reason, besides easier expansion, to lean toward one option?

As I understand it, a pool will fail if any one member vdev fails. Redundancy is only within the vdev, not across the pool.

So, more vdevs increases the statistical likelihood that your pool will fail.

Z2 allows up to two disks in the vdev to fail, but sacrifices two disks to parity, whereas Z3 allows up to three to fail, but three are used for parity.

So, your 4x Z3 layout will lose 12 disks (half your storage) to parity, and the 6x Z2 layout will be exactly the same: 12 disks, or half your capacity, lost.

Personally, I’d go with 3x Z2, with 8 disks in each vdev. That way you lower the number of vdevs, and the number of parity disks (6 in total, vs 12 in your designs) while still retaining plenty of redundancy (2 disks per vdev).
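To put rough numbers on the options discussed so far, here’s a quick sketch (it ignores ZFS metadata, slop space, and padding, so real usable space will be a bit lower):

```python
# Data/parity split for the layouts discussed (24 disks total).
# Upper-bound sketch: ignores ZFS metadata, slop space, and padding.
layouts = {
    "4 x 6-wide RAIDZ3": (4, 6, 3),   # (vdevs, width, parity)
    "6 x 4-wide RAIDZ2": (6, 4, 2),
    "3 x 8-wide RAIDZ2": (3, 8, 2),
}

for name, (vdevs, width, parity) in layouts.items():
    data = vdevs * (width - parity)
    print(f"{name}: {data} data disks, {vdevs * parity} parity disks")

# 4 x 6-wide RAIDZ3: 12 data disks, 12 parity disks
# 6 x 4-wide RAIDZ2: 12 data disks, 12 parity disks
# 3 x 8-wide RAIDZ2: 18 data disks, 6 parity disks
```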

Edit:

Of course, the question you really need to answer is what’s more important: capacity/redundancy (the two are linked) or performance. The above could be faster, but you also may want to think about how long it takes to resilver when you replace a disk (which is why you shouldn’t put all the disks in one big Z3 vdev).


That’s a lot of redundancy you’re designing here. Can I ask what the use case and environment is? A 6-wide Z3 has the same capacity overhead as mirrors.

Is this going to be placed somewhere where responding to a drive failure will take a significant amount of time (e.g., greater than 48 hours)?


A wider Z3 is safer, but it also comes with the caveat that all RAIDZ flavors do: potential allocation inefficiency.

If a small record gets striped up and doesn’t fully cover all the disks in a vdev, that space is effectively lost to padding. This can start to add up when you have small recordsizes on a vdev that is potentially a dozen disks or more; sticking to big files and a 1M recordsize minimizes the effect. Replicated zvols are notorious for this; you can imagine what a <16K volblocksize does.

For your proposed vdev sizes, though, I can’t say you’d see much allocation inefficiency. And I’d honestly say go for the RAIDZ3, as it offers more self-healing potential and as a result should be more resilient (and considering the ratios, it seems resiliency is your goal).
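To illustrate the small-block point, here’s a minimal sketch of the RAIDZ allocation rule as I understand it (data is laid out in rows of up to width − parity sectors, each row gets its parity sectors, and the total is rounded up to a multiple of parity + 1); the exact figures depend on ashift and compression:

```python
import math

def raidz_allocated_sectors(data_sectors, width, parity):
    """Approximate sectors allocated for one block on a RAIDZ vdev."""
    rows = math.ceil(data_sectors / (width - parity))    # rows of data sectors
    total = data_sectors + rows * parity                  # parity added per row
    return total + (-total % (parity + 1))                # pad to multiple of parity+1

# A 16K block at ashift=12 (4K sectors) is 4 data sectors.
for label, width, parity in [("6-wide Z3", 6, 3), ("4-wide Z2", 4, 2), ("11-wide Z3", 11, 3)]:
    kib = raidz_allocated_sectors(4, width, parity) * 4
    print(f"{label}: ~{kib} KiB allocated per 16 KiB block")
# 6-wide Z3: ~48 KiB, 4-wide Z2: ~36 KiB, 11-wide Z3: ~32 KiB
```

With big files and a 1M recordsize the same padding becomes a rounding error, which is why large records fare so much better.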

What many people don’t bring up as a thing to consider is the age/usage of the drives themselves.
My pool is Z3 because I’m using heavily used second-hand enterprise SAS drives. These have seen roughly 7+ years of constant spinning, so I deem the failure rate to be elevated. I wouldn’t want a heavy operation (resilvering) to kill more and more drives.

Two cents deposited 🙂


Well, I am planning to use really cheap used drives in this setup, so failure is quite likely. The drives should be at most 12 TB for refurbished deals; resilver should be around 10 h? (Haven’t tested that yet.)

Thinking again, I have 35 functional bays, so instead of 5 × 7 I’m actually considering 3 × (8+3) + 2 cold spares. Well, I plan to use this machine to back up critical data from my main TrueNAS 🙂
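As a rough sanity check on that resilver estimate (assuming ~200 MB/s average per-drive throughput and that a sequential resilver only has to copy allocated data; both are guesses, not measurements):

```python
drive_tb = 12      # assumed drive size in TB
avg_mb_s = 200     # assumed average resilver throughput (guess)

for fill in (0.5, 0.8, 1.0):   # how full the vdev is
    hours = drive_tb * 1e12 * fill / (avg_mb_s * 1e6) / 3600
    print(f"{int(fill * 100)}% full: ~{hours:.0f} h")
# 50% full: ~8 h, 80% full: ~13 h, 100% full: ~17 h
```

So ~10 h looks plausible for a vdev that isn’t close to full; a full 12 TB drive at that speed is closer to a day, and a busy pool or a very wide RAIDZ3 can stretch it further.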

Another reason for 4x Z2 vs. 3x Z3 could be IOPS. More vdevs usually result in a more performant pool, all other things being equal. Four vdevs should put you into 1 GB/s read/write territory for large file transfers on a 10GbE network.
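As a very rough rule of thumb (assuming generic 7200 RPM figures of ~200 MB/s sequential and ~150 random IOPS per disk, and taking the 24-disk versions of those layouts), streaming throughput scales roughly with the number of data disks while random IOPS scale roughly with the number of vdevs:

```python
# Back-of-envelope scaling; per-disk numbers are assumptions, not measurements.
per_disk_mb_s, per_disk_iops = 200, 150

for name, vdevs, width, parity in [("4 x 6-wide Z2", 4, 6, 2), ("3 x 8-wide Z3", 3, 8, 3)]:
    seq = vdevs * (width - parity) * per_disk_mb_s   # streaming ~ total data disks
    iops = vdevs * per_disk_iops                     # random ~ one disk's IOPS per vdev
    print(f"{name}: ~{seq / 1000:.1f} GB/s streaming, ~{iops} random IOPS")
# 4 x 6-wide Z2: ~3.2 GB/s streaming, ~600 random IOPS
# 3 x 8-wide Z3: ~3.0 GB/s streaming, ~450 random IOPS
```

Either layout can saturate 10GbE (~1.2 GB/s) for large sequential transfers; the vdev count mostly matters for random or many-client workloads.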

I run used drives here with 45k+ hours on them in a Z3 vdev and have plenty of qualified spares awaiting more failures. With that many drives, I suggest getting refurbished helium-filled drives: they consume less power and produce less heat, and the HGST He10s I have been getting from goharddrive.com are still working great.

Unless the rig is remote, I would not bother with hot spares. Instead, qualify spares with badblocks and SMART tests, then set them aside for eventual use.

Don’t think about this from just a capacity POV. You also need to think about it from a backup and recovery perspective. If you follow the 3-2-1 rule, you shouldn’t be relying on just RAIDZ in any configuration for data assurance.

Even if you have the ability to keep a second copy on a different device, or the equipment and bandwidth to make backups to alternative media, you still want to ensure that failures are as localized as possible. A RAIDZ3 or dRAID configuration might make sense if you have a ton of large disks and need to keep them all part of the same vdev, but having smaller vdevs that can be backed up separately or fail independently is an inherently more resilient strategy.

You also need to consider your data-transfer bandwidth. 24 disks in a single non-Thunderbolt enclosure are likely to exceed the read/write speeds you can achieve over a single 6 Gbps (SATA-III) backplane link, so smaller vdevs are likely to be more performant if you aren’t using them all concurrently. This is definitely a “your mileage may vary” factor, but one worth considering.
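For a sense of scale, assuming ~200 MB/s sequential per drive (a generic figure, not a measurement): 24 drives can stream roughly 24 × 200 MB/s ≈ 4.8 GB/s in aggregate, while a single 6 Gbps link delivers only about 0.55 GB/s after encoding overhead, and even a 4-lane 6 Gbps SAS uplink tops out around 2.2 GB/s.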

I’m using high-capacity remanufactured drives, and consider an 8-wide RAIDZ2 plus a two-disk mirrored read cache with regular LTO-9 backups more than adequate, short of a catastrophic enclosure or backplane failure. The odds of more than two spinning drives failing at the same time are not zero, but seem low enough to be an acceptable risk. Even backing up the whole 72 TiB takes only 4-5 LTO-9 tapes, but segmenting the backups by dataset can yield faster backups and offers more options for determining what you consider “valuable.”
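For reference, LTO-9 native capacity is 18 TB, or roughly 16.4 TiB per tape, so 72 TiB ÷ 16.4 TiB ≈ 4.4 tapes uncompressed, which matches the 4-5 tape figure.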

I don’t know enough about the trade-offs with dRAID to offer an opinion, but even with low-quality drives I would think that in the general case you’d be better off with more four-to-six wide RAIDZ2 vdevs unless you really need to keep all that data within a single dataset and can’t spread it across logical volumes.

More redundancy can help avoid losses due to occasional drive failures, but no form of RAID is foolproof. My advice is to build your arrays around your data partitioning needs and backup plans, rather than overcompensating for drive quality. If your drives are fast enough, then RAIDZ3 is certainly an option if the cost of more parity won’t slow you down too much, but it’s not how I myself would do it even with a 36-bay enclosure.
