Assessing the Potential for Data Loss

Hi Davvo,

Thanks for this resource. I wrote some simple functions to calculate these values to help inform my decisions when setting up a new NAS. However, when validating them against your results, I came across an issue. In your table of failure probabilities for a 2-way mirror, my results agree with yours (rounded) for P(0) and P(1), but I believe P(2) should be 0.009 (WolframAlpha agrees). I think you lost a factor of 2 somewhere :wink:.

I also have a question about the results after table 2. You state:

… and a 0.05% of simultaneously losing three disks. Because we are using RAIDZ2 (2 parity drives) we encounter data loss only when three disks organize to go on strike together: the data loss probability of our VDEV is 0.05%

That is, the probability of losing a 6-drive RAIDZ2 vdev is taken to be P(3), but should this not be P(3) + P(4) + ... + P(n), or alternatively 1 - (P(0) + P(1) + P(2))? (My kingdom for the Discourse maths plugin.)

If so, the final calculation for VDEV data loss would be:
P(loss) = 1 - Σ_{k=0}^{Z} C(n, k) · p^k · (1 - p)^(n - k)
where n = total number of drives, p is their individual failure probability, C(n, k) is the binomial coefficient, and Z is the VDEV’s drive failure tolerance (i.e. 1 for RAIDZ1, 2 for RAIDZ2).

I appreciate that the probabilities for more than three failures are small, so the total is ~P(3); I just found it jarring that this was not mentioned.
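For anyone who wants to check the numbers, here is a minimal Python sketch of the sum described above, P(Z+1) + ... + P(n). It assumes drives fail independently with the same probability p (a plain binomial model). The function names and the p = 0.03 value in the example are illustrative only and are not taken from the resource.

```python
from math import comb


def p_failures(n: int, p: float, k: int) -> float:
    """Probability that exactly k of n drives fail (binomial model)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def p_vdev_loss(n: int, p: float, z: int) -> float:
    """Probability that more than z of n drives fail, i.e. the vdev is lost.

    Sums the tail P(z+1) + ... + P(n) directly rather than computing
    1 - (P(0) + ... + P(z)), which loses precision when the result is tiny.
    """
    return sum(p_failures(n, p, k) for k in range(z + 1, n + 1))


if __name__ == "__main__":
    # Hypothetical inputs: 6-drive RAIDZ2, 3% per-drive failure probability.
    n, p, z = 6, 0.03, 2
    print(f"P(3)    = {p_failures(n, p, 3):.6%}")
    print(f"P(loss) = {p_vdev_loss(n, p, z):.6%}")
```

Comparing the two printed values shows how much the P(4) and higher terms add on top of P(3) for whatever p you plug in.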

Yep, I remember there being a few errors here, but with the numbers being so small they had very little impact on the overall results (the point of this resource was to help generate awareness, hence things are generally rounded to an understandable value).

I was planning to correct this, but first I was busy and then came the move to the new forum… and losing my password for the old one did not help :grimacing:.

Thank you for pointing out the errors. Do note that I am assuming your calculations are correct… It has been a while since I dabbled in math/statistics.

Also, you can use R2-C2 as a reference.