This is a rather odd question about zfs levels and vdevs

This is a random thought while eating lunch…

It is standard for someone to say that with a raidz2 setup, a pool can lose two drives without data loss. That is true.

Pools themselves are made of one or more vdevs, which contain the drives and the drive configuration, such as mirror, z1, z2, or z3.

So when someone talks about having a pool with a raidz2 setup that can lose 2 drives without data loss, that is true to a point. I ask because if a pool consists of two vdevs, and each vdev is configured as raidz2, then the pool itself could lose up to 4 drives without data loss, provided the loss was distributed so that each vdev lost no more than 2 drives. Correct?
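The arithmetic in the question can be sketched as a toy model (a hypothetical helper for illustration, not anything from the ZFS codebase): a pool survives as long as no single vdev loses more drives than its parity level covers.

```python
# Toy model of ZFS pool survival: the pool is lost as soon as any one
# vdev loses more drives than its parity level can cover.
def pool_survives(parity_per_vdev, failures_per_vdev):
    """parity_per_vdev: parity level of each vdev (2 for raidz2).
    failures_per_vdev: number of failed drives in each vdev."""
    return all(f <= p for p, f in zip(parity_per_vdev, failures_per_vdev))

# Two raidz2 vdevs: four failures are survivable if split 2 + 2...
print(pool_survives([2, 2], [2, 2]))   # True
# ...but three failures concentrated in one vdev destroy the whole pool.
print(pool_survives([2, 2], [3, 0]))   # False
```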

Yes.

2 Likes

Correct, but redundancy really is at the vdev level: if you lose a whole vdev, you lose the pool.
So a pool of multiple raidz2 vdevs can be entirely lost by the failure of as few as three drives, no matter how many vdevs there are. More vdevs provide more space and more performance, but not more resiliency.

True.

Is this where hot spares come into play? Temporarily mitigating a drive failure in the pool?

That’s not specifically related to the question, but hot spares are a way to limit the time during which a pool is degraded.
As you can see from some recent threads, raidz2 is supposed to be reasonably resilient… but if one doesn’t react quickly to a first failure, it only takes overheating drives or a port multiplier getting in the way to end up in a bad place.

2 Likes

Indeed.

All the recent posts got me debating if I should turn my cold spare into a hot one…

Maybe.

2 Likes

Unless you don’t have easy access to the server, I wouldn’t. A hot spare is wearing out at the same rate as a disk that’s part of the pool. If you think a “hot spare” is appropriate, consider RAIDZ3 instead of RAIDZ2.
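A rough way to see this point (toy numbers, not a ZFS API): against simultaneous failures, a hot spare adds nothing until its resilver completes, while RAIDZ3 gives you a third parity drive immediately.

```python
# Simultaneous-failure tolerance of a single vdev, before any resilver
# has run: a hot spare only helps *after* it has resilvered in.
def tolerates(parity, simultaneous_failures):
    return simultaneous_failures <= parity

# raidz2 plus a hot spare: three drives dying at once is still fatal,
# because the spare has not resilvered yet.
print(tolerates(2, 3))  # False
# raidz3 with the same total drive count: three simultaneous failures survive.
print(tolerates(3, 3))  # True
```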

4 Likes

Exactly.

1 Like

The recent posts on lost pools are what made me think of the question.

…and that is what BACKUPS (in plural) are for…

2 Likes

This.

@winnielinnie confirmed paying by the word to post

5 Likes

I’m really surprised they don’t have systems with “hot spares” that are actually kept powered down unless they’re needed due to a disk failure.

1 Like

In some ways, that would be an Enterprise-like feature. I am not saying consumer hardware can’t have such a thing, or shouldn’t have such a thing. But I can imagine such a feature would be less wanted in the consumer hardware space if it cost more.

Some HDDs have the ability to stay in reset, based on the old 3.3V SATA/SAS power line. That 3.3V line has been repurposed for a bit of high availability: a disk enclosure can power up drives in sequence to avoid a high current load at power-up, or even reset a drive that appears to be hung. In theory, SAS enclosure services could then allow program access to such a feature.

Another way to look at it is to have several cold-spare HDDs and cycle them through your server, like rotating the tires on your car. (At least back in the bad old days, when your vehicle’s spare tire was the same size and type as the 4 running tires…) Such an HDD rotation might help people to first verify backups before disk changes, and then to practice disk changes before the excrement hits the rotating impellers.

I guess my question is: isn’t this feature common in enterprise hardware? I can understand why it would not be in consumer hardware, but it seems like a no-brainer in enterprise hardware. Surely enterprises aren’t running racks of expensive hard drives for thousands of hours, doing nothing and just waiting for other drives to fail.

I’m unsure about SCALE/CE because I’ve not upgraded yet, but Core has the ability to set an individual disk to spin down. Could this be used to spin down a hot spare?

I imagine a hot spare won’t be accessed unless it’s being used to replace a failed HDD, at which point it would spin up. Or would TrueNAS access it occasionally for other tasks and cause it to spin up again?

As @dan has correctly mentioned, a cold or hot spare does not really make much sense.
If you have already spent your money on that HDD, you should use a RaidZ3 setup, because then in case of any failure your system ALREADY has three parity drives.
With a spare, the system has to resilver the pool, and that is a long-lasting process (it can easily take more than a day for a 16+ TB drive) with high load on the individual HDDs.
That is a really sensitive period, because if you used Z2 in the beginning and the first drive is dead, you are one (and a half) HDD failures away from data loss. If you had set it up as Z3 in the first place, you would be 2 (and a half) failures away.
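The margin during that resilver window can be put in numbers (toy arithmetic for illustration, not ZFS output): what counts is how much parity remains while the rebuild runs.

```python
# Parity remaining while a failed drive resilvers:
# the vdev's parity level minus the drives already lost.
def margin_during_resilver(parity, failed):
    return parity - failed

print(margin_during_resilver(2, 1))  # 1 -- raidz2 with one dead drive
print(margin_during_resilver(3, 1))  # 2 -- raidz3 keeps double protection
```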