LTT DOES NOT fork TrueNAS

If I read the failure curve right, it says a 20-wide raidz3 is better than 2x10 raidz2, and far better than 3x7 raidz2.

Just for failure curve. Not performance.

That is certainly an interesting analysis of probability and perhaps a good basis for considering this subject, but mathematically incorrect for several reasons.

  1. I am not sure what a p of 0.03% actually means anyway. Is that the probability of a drive failing on any specific day? Because if so, that would seem to imply that after 3334 days there is a 100% certainty that the drive has failed - and that is NOT how drive failures work. Assuming that a drive was not e.g. damaged in transit or by excessive temperatures, the probability of a 2-month-old drive failing on a specific day is probably much lower (but still non-zero) than that of a drive that is 9 years old.

  2. We don’t really want to consider the probability of x disks failing simultaneously given a probability p of a single drive failing. We are not interested in what happens when no drive has failed - only in what happens when at least one drive has already failed. And we know that if one drive has failed, a non-redundant pool is toast. So we should really only be looking at situations where at least one drive of a redundant pool has failed, and then look at the probability of additional drives failing at the same time.

  3. If you purchase several identical drives at the same time, they are perhaps more likely than not to come from the same manufacturing line (with potentially the same tiny manufacturing flaws), using components from the same batches. IMO drives from the same batch, installed at the same time in the same environment, are more likely to fail at around the same lifetime than a bunch of disks from (say) different manufacturers or different batches or with different power-up times. Similarly, drives shipped together are likely to have suffered similar bumps and bruises, and drives in the same enclosure are likely to have had similar exposure to extreme temperatures.

    In other words, in a typical NAS you are potentially likely to have failures grouped together in time.

  4. It is IMO likely that the probability of a drive failing in any particular hour bears some relationship to the stress it is under - for example, drives doing a lot of work get hotter than those that are spinning but idle. So the probability p almost certainly goes up when resilvering is occurring - and this is another reason for only considering situations where at least one drive has failed and looking at the probability of further drive failures at the same time - possibly precipitated by the heavy workload of resilvering.

If we are able to determine p as the probability of a drive in a redundant vDev failing in (say) a 1-hour period during resilvering of the vDev, that would then enable us to make a much better estimate (using the calculations referred to) of the probability of losing the pool during resilvering, based on the vDev width, the redundancy level and the length of time to resilver (i.e. likely proportional to the size of the drives).
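To make that concrete, here is a rough back-of-the-envelope sketch of the calculation. It assumes the surviving drives fail independently with a constant per-hour probability during the resilver window (which point 3 argues is optimistic), and the names p_hour, width, parity and resilver_hours are mine, not from any existing calculator:

```python
# Rough sketch: probability of losing a redundant vDev while it resilvers one
# already-failed drive, under the (optimistic) assumption that the remaining
# drives fail independently with the same constant per-hour probability.
from math import comb

def vdev_loss_probability(width: int, parity: int, p_hour: float, resilver_hours: float) -> float:
    """P(vDev is lost) given one drive has already failed and is resilvering.

    The vDev tolerates `parity` failures in total, so with one drive already
    gone it dies if `parity` or more of the remaining drives fail before the
    resilver completes.
    """
    survivors = width - 1
    # Probability that a single surviving drive fails at some point in the window.
    q = 1 - (1 - p_hour) ** resilver_hours
    # Binomial tail: P(at least `parity` of the survivors fail).
    return sum(comb(survivors, k) * q**k * (1 - q) ** (survivors - k)
               for k in range(parity, survivors + 1))

# Example with an arbitrarily chosen p_hour and a 24-hour resilver:
p_hour = 1e-5
print(vdev_loss_probability(20, 3, p_hour, 24))   # single 20-wide raidz3 vDev
print(vdev_loss_probability(10, 2, p_hour, 24))   # one of two 10-wide raidz2 vDevs
```

It still ignores points 3 and 4 entirely (correlated batches and the extra stress of the resilver itself), so treat it as a lower bound at best.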

P.S. If point 4 is true, then we should perhaps also be looking at mitigating actions - for example, turning all the cooling fans on full during resilvering might reduce the risks, or perhaps somehow turning down the resilvering intensity if temperatures rise.

1 Like

That’s not how probabilities work. Given a probability of 0.03% of failing on a given day, the probability of the drive failing before 3334 days (about 9 years) is about 63%.
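For anyone who wants to check the arithmetic, treating each day as an independent 0.03% coin flip:

```python
# Independent daily failures at 0.03%: chance the drive has failed by day 3334.
p_day = 0.0003
print(1 - (1 - p_day) ** 3334)   # ≈ 0.632, i.e. about 63% (roughly 1 - 1/e)
```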

2 Likes

@alexey Yes - you are right. Too many decades have passed since I last did probability theory.

There are 2 things with dRAID that people need to keep in mind (you probably know them…):

  1. Integrated hot spares are a key feature. Not having any reduces the usefulness of dRAID.
  2. Unlike RAID-Zx, dRAID does a full-stripe allocation for all writes, even ones smaller than the stripe. Thus dRAID is not as well suited to small files as RAID-Zx (see the rough illustration after this list).
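As a rough illustration of point 2, here is a sketch of how a small block’s on-disk footprint can differ. The 8-data + 2-parity dRAID layout, the 4 KiB sector size and the simplified RAID-Z2 accounting are all assumptions on my part, not measurements:

```python
# Illustrative only: footprint of one block on an assumed draid2 layout with
# 8 data columns vs. RAID-Z2, with 4 KiB sectors (ashift=12).  dRAID allocates
# fixed full-width stripes, so a small block is padded out to the full data
# width; RAID-Zx only adds parity sectors per row (small padding rules ignored).
import math

SECTOR = 4096      # ashift=12
DATA_COLS = 8      # assumed dRAID data columns
PARITY = 2

def draid_alloc(block_bytes: int) -> int:
    stripes = math.ceil(block_bytes / (DATA_COLS * SECTOR))
    return stripes * (DATA_COLS + PARITY) * SECTOR

def raidz2_alloc(block_bytes: int) -> int:
    data_sectors = math.ceil(block_bytes / SECTOR)
    rows = math.ceil(data_sectors / DATA_COLS)
    return (data_sectors + PARITY * rows) * SECTOR

for size in (4096, 16384, 131072):
    print(f"{size:>7} B block -> dRAID {draid_alloc(size)} B, RAID-Z2 {raidz2_alloc(size)} B")
```

With those assumptions a 4 KiB block costs about 40 KiB on the dRAID layout versus about 12 KiB on RAID-Z2, while large blocks come out roughly the same.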

In some practical respects, like user friendliness, dRAID is worse than RAID-Zx, in my opinion. Perhaps not suitable for a home, easy-to-manage NAS (like HexOS). I mean that a user who could add or remove hot spares with other vDev types finds they are “stuck” with integrated hot spares in dRAID.

And as has been pointed out before, more disks per dRAID vDev is the norm when compared to RAID-Zx vDevs. (And there is no project at present to add columns to a dRAID vDev.)

4 Likes

Also, the probability of a drive failing changes with time: from a statistical point of view, it grows as the drive gets older.
Anything short of a research paper is just dabbling for fun and educational purposes… my resource surely is.

A 20-wide RAIDZ array is going to be very bad for resilvering if a disk does fail. At least it’s only 2TB drives, but still, that’s a lot of work to resilver. Probably still safe in my view, but not very good performance-wise. How can anyone claim that it would be more fault tolerant than, say, two 10-wide vdevs? The only thing you do gain is more disk space from a single vdev. If you think about it, though, 2x10 in raidz2 means you could, if lucky, lose 4 drives (2 per vdev) and still not lose data. With the single 20-wide raidz3 vdev, lose 4 and you are guaranteed to lose the pool.
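As a quick combinatorial check on that comparison (assuming exactly k drives out of the 20 fail at once and the failed drives are chosen uniformly at random, which ignores the batch and resilver-stress effects discussed above):

```python
# Hypothetical check: chance the pool survives exactly k simultaneous random
# failures, for a single 20-wide raidz3 vs. two 10-wide raidz2 vdevs.
from math import comb

def survive_raidz3_20(k: int) -> float:
    # A 20-wide raidz3 survives any 3 failures and no more.
    return 1.0 if k <= 3 else 0.0

def survive_2x10_raidz2(k: int) -> float:
    # The pool survives only if neither 10-wide vdev loses more than 2 drives.
    ok = sum(comb(10, a) * comb(10, k - a)
             for a in range(0, min(k, 2) + 1)
             if 0 <= k - a <= 2)
    return ok / comb(20, k)

for k in range(1, 7):
    print(k, survive_raidz3_20(k), round(survive_2x10_raidz2(k), 3))
```

Under those assumptions raidz3 always survives 3 or fewer failures while 2x10 raidz2 sometimes does not (about 79% for 3 random failures), but 2x10 raidz2 has roughly a 42% chance of surviving 4 failures where raidz3 has none. Which layout looks “better” depends on how correlated the failures really are.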

It’s called statistics. I suggest you read said post again.
But we all agree that a 20-wide VDEV is bad in so many ways.

Baka!

We all know what statistics are; they can say virtually anything. Statistics were used to make the terrible, faulty claim that RAID 5 was dead a decade or more ago. Those statistics were faulty then and still are. Real life has clearly shown that such arrays are more resilient than the “math” showed. But yes, it’s a bad idea to run a 20-wide. And a lot of those analyses also forgot about scrubs and the like and assumed so much more. And as we know, a URE does not kill a resilver, unlike old-style RAID. At least with enterprise drives.

Edit: I will add that the specific feature that helps with this is TLER and the various other names it goes by.

2 Likes

That’s unfair to statistics, but agreed. Maybe that’s why I love it so much.

For the reasons you describe, I am pretty sure, though I cannot prove it scientifically, that:

A 6-wide RAIDZ2 built from a single HDD batch carries a greater risk than a pool of 3 vdevs, each a 2-way mirror, where each mirror consists of two different drives.

In theory you might think that RAIDZ2 is safer here, because you can lose any two drives. In reality I would argue that, especially in a homelab or even a business that does not swap HDDs before they fail, after 5 years of running these drives the bathtub curve kicks in and the mirrors are safer.
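For the “in theory” part, a quick check under the naive assumption that exactly two of the six drives fail at the same time and the pair is chosen uniformly at random:

```python
# Naive check, ignoring the batch / bathtub correlation argument above.
from math import comb

# 6-wide RAIDZ2: any two simultaneous failures are survivable.
p_raidz2_survives = 1.0

# 3 x 2-way mirrors: survives only if the two failures hit different mirrors.
pairs_total = comb(6, 2)      # 15 possible failure pairs
pairs_fatal = 3               # both drives of the same mirror
p_mirrors_survive = 1 - pairs_fatal / pairs_total   # 12/15 = 0.8

print(p_raidz2_survives, p_mirrors_survive)
```

The whole argument above is that the failures are not uniform: a single-batch RAIDZ2 makes two near-simultaneous failures more likely in the first place, while mixed-drive mirrors make the fatal same-mirror pair less likely than the naive 3 in 15.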

We recently had this discussion here: RAID Reliability Calculators blindspot - #12 by Sara

Statistics is still a science though, and it gives us a clear answer.

I also cannot prove it statistically :wink:

Attempting to bring this back on Topic…

HexOS is going to need to deal with this when end users invariably try to build servers with mismatched disks and inefficient layouts. That setup wizard is going to have to offer guidance on an “optimal” layout - and optimal will be very subjective when it comes to total size, performance and reliability. There’s no one-size-fits-all solution, but I’d wager there’s some reasonable “defaults” you can go with for a home use-case.

That’s precisely why a wizard is such a challenge. There is a lot of wetware with a lot of experience on this forum and even the august Demi-gods disagree at times re: what the best solution is to a given problem.

Now try distilling that into a few lines of code, especially if said code is supposed to address a wide variety of use cases - Docker containers, apps, VMs, databases, and even static files, to name a few.

I’d suggest an approach where the wizard starts by asking the user what they want to accomplish, followed by looking through the hardware and suggesting a couple of approaches. A little education goes a long way, but I imagine one of the design principles is to keep it simple.

A non-performant system isn’t acceptable either, however. So I’d suggest a wizard that simply says “no” when it comes to using SMR drives, HDDs as SLOGs, etc. Dealing with a wide variety of boards, drives, and so on might be quite an interesting coding challenge.

I’m glad it’s not my problem.

2 Likes

Exactly - thousands of input variants for a single workflow to create an optimum pool from a random set of hardware available. And that’s just a single workflow for a known situation (a set of empty disks) - imagine how much more complex it will be for an unknown issue and a random set of hardware.

1 Like

I genuinely don’t think it’s as complex as some people think.

Anything to do with the underlying workings of ZFS - like SLOGs, L2ARC, etc. - can be abstracted away. Most home users don’t need to know about these; it’s an implementation detail, and in the context of a home user it almost doesn’t matter. A home user’s use-case is much more definable: it’s typically storage, storage, storage - very read-heavy and not super write-heavy. (It’s one of the reasons why unRAID works quite well in the home space: its non-RAID design is very well suited to home use, even though you don’t really get write performance better than a single drive.)

SMR is probably the main thing to watch out for, as it’ll work fine in a non-RAID setup but fail horribly on ZFS - but that’s a single check you can do against any drive connected to the system. Give the user a stern warning about it, let them decide if they want to take the risk (or just block it).

Really the only things that matter are the numbers and sizes of the drives and how you arrange them. HexOS could simply demand that disks must be the same size to be used together in a pool and default to a single-VDEV raidz1 arrangement - it would be “fine” for “most” users and that’s all it needs to be for a first release.

You could even stick to the single-VDEV-per-pool approach and let the user pick the number of parity drives; all the user has to care about is “How many drive failures before I lose all my data?”. That would be enough. The user doesn’t need to know what a VDEV is, what parity is, or anything like that; all an end user needs to know is that they will get x space from y drives and can suffer z drive failures (something like the sketch below).
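A minimal sketch of that summary, assuming equal-size drives and a single raidz vdev (the function name and wording are hypothetical, not anything HexOS has announced):

```python
# Hypothetical "x space from y drives, survives z failures" summary for a
# single raidz vdev of equal-size drives.  Raw capacity only: ignores ZFS
# metadata, padding and the usual advice to keep some free space.
def pool_summary(drive_count: int, drive_tb: float, parity: int) -> str:
    if not 1 <= parity <= 3:
        raise ValueError("raidz supports 1 to 3 parity drives")
    if drive_count <= parity:
        raise ValueError("need more drives than parity")
    usable_tb = (drive_count - parity) * drive_tb
    return (f"{drive_count} x {drive_tb:g} TB drives -> about {usable_tb:g} TB usable, "
            f"survives {parity} drive failure(s)")

print(pool_summary(6, 4, 2))   # 6 x 4 TB in raidz2 -> about 16 TB usable
```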

4 Likes

I’m signing up. …Maybe they’ll pivot and start using FreeBSD.

4 Likes

Yes - you could limit HexOS to only the simplest single use case - which will meet a lot of people’s initial needs, until, that is, they need more storage and want to add a 2nd vDev and find that HexOS was a bad choice. Or until they have some sort of storage issue and find that ZFS really isn’t as simple as HexOS tried to make it (by hiding the complexity for a single use case but providing no help for literally any other use case at all).

What does this have to do with anything in this thread? The choice of underlying O/S probably makes very little difference to the functionality HexOS will deliver or whether they will be successful or not.