This statement makes me think that you’re assuming activity = writes only, which is obviously not the case.
How exactly are you replacing the disks simultaneously? In a RAIDZ2, I guess you could replace 2 at once, but then you’d be left with no parity. I’m pretty sure the majority of people replace just 1 disk at a time.
Oh yeah sure you can. 90% of people don’t do that. Majority of people barely have enough money (and complain a lot) to even buy ECC gear, let alone buy a full disk shelf just to upgrade.
I actually hold the same opinion here, and I have never had multiple disks fail in quick succession. But I do know a lot of people that fret over disk failures. I mean, you can see that fear in how rarely anyone ever recommends RAIDZ1 around here.
I’ve also definitely noticed pool performance drop quite significantly while resilvering. My use case is a bit different from most people’s, as I actually use my pool for block storage and I need every drop of IOPS. Finally, I’d also rather have the flexibility to upgrade my pool by simply buying 2 disks at a time rather than x-wide disks at a time.
Yes, but I need details of “misconceptions” for items, not discussions of various tunables that various people use (unless it changed noticeably or was a misconception).
Plus, my goal was a short paragraph per item, not 20 lines…
@Arwen I would maybe add a few other common misconceptions I’ve personally seen pretty frequently.
People tend to equate regular RAID levels with ZFS:
RAIDZ1 → RAID 5
Mirror → RAID 10
Another one I see maybe less frequently is that some people mistake RAIDZ expansion for being able to change the type of your vdev, i.e. upgrade RAIDZ1 to RAIDZ2.
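To illustrate the distinction, here’s a minimal sketch assuming a hypothetical pool named `tank` and OpenZFS 2.3+, where RAIDZ expansion is done via `zpool attach`:

```shell
# RAIDZ expansion can only WIDEN an existing raidz vdev by one disk:
#   raidz1-0 (4-wide) -> raidz1-0 (5-wide), still raidz1
zpool attach tank raidz1-0 /dev/sdx

# There is no command that converts the parity level in place;
# something like this does NOT exist:
#   zpool upgrade-vdev tank raidz1-0 raidz2   # <-- not a real command
# Going raidz1 -> raidz2 means building a new vdev/pool and doing a
# zfs send | zfs receive of the data.
```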
Another one that’s also semi-common, and I think this isn’t explained well even in the documents: redundancy exists at the zpool level rather than the VDEV level.
to be fair… it was not said what type of activity but I’ve never seen regular use tank a scrub/resilver into taking days/weeks.
just above this quote I made the assumption that, since you were talking about each disk being read 7 times, you were talking about upgrading the entire VDEV with bigger disks via replacement, which is one of the only times that would normally happen. the rest of what I said is based on that.
which is why I called it an online replacement, online replacements offer additional safety by not degrading the pool during replacement.
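A sketch of what I mean, assuming a hypothetical pool `tank` and placeholder device names: with the old disk still attached and healthy, `zpool replace` resilvers onto the new disk while the vdev keeps its full redundancy.

```shell
# Old disk still present and healthy; the pool stays fully redundant
# for the entire resilver (no degraded window):
zpool replace tank /dev/disk/by-id/old-disk /dev/disk/by-id/new-disk

# Contrast: pulling the old disk FIRST and then replacing leaves the
# vdev degraded (one less level of redundancy) until the resilver ends.
zpool status tank   # watch resilver progress
```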
RAIDZ1 really only makes sense for smaller pools… let’s say 5-wide at a max, and as long as you are OK with the compromise… if you are building bigger pools, it makes more sense to have a VDEV twice as wide with twice the redundancy, since it increases the self-healing potential with the same storage efficiency. (bad cables and controllers are a significant concern)
consider adjusting this memory value if your scrubs/resilvers are slow… the default for sequential scanning is 5% of RAM, which (probably) isn’t enough. this literally halves the scrub time on my 72-wide pool.
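If the tunable in question is `zfs_scan_mem_lim_fact` (a guess on my part, based on the 5% default), note it’s expressed as a divisor of RAM rather than a percentage; on Linux it can be changed like this:

```shell
# zfs_scan_mem_lim_fact is "1/N of RAM" for sequential-scan metadata;
# the default of 20 means 1/20 = 5% of RAM. Lowering the divisor
# RAISES the memory limit:
echo 5 > /sys/module/zfs/parameters/zfs_scan_mem_lim_fact   # 1/5 = 20% of RAM

# Make it persistent across reboots (Linux modprobe config):
echo "options zfs zfs_scan_mem_lim_fact=5" >> /etc/modprobe.d/zfs.conf
```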
also, these are absolutely valid reasons to use mirrors.
Depends. Two drives failing in the same vdev could bring down my pool. But 3 drives failing, one from each vdev, would not bring down the pool.
Not even three, but just two drives would bring down the 2-way mirror example I made.
But yeah, that is exactly my point
I say that the WD and the Seagate from the same vdev failing at the same time is less likely to happen than 3 WDs failing in a 6-wide RAIDZ2.
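A quick way to sanity-check the combinatorics side of this by brute force (a sketch only; it treats all drives as identical and ignores the differing per-model failure rates, which is the whole point of mixing WD and Seagate in a vdev):

```python
from itertools import combinations

disks = range(6)
# Pool A: three 2-way mirrors -> vdevs {0,1}, {2,3}, {4,5}
mirrors = [{0, 1}, {2, 3}, {4, 5}]
# Pool B: one 6-wide RAIDZ2 -> any 3 simultaneous failures are fatal

def mirror_pool_dead(failed):
    # the pool dies if both disks of any one mirror vdev have failed
    return any(vdev <= failed for vdev in mirrors)

# Two simultaneous failures:
fatal2 = sum(mirror_pool_dead(set(c)) for c in combinations(disks, 2))
print(fatal2, "of 15")   # 3 of 15 two-disk combos kill the mirror pool,
                         # while ZERO kill the RAIDZ2.

# Three simultaneous failures:
fatal3 = sum(mirror_pool_dead(set(c)) for c in combinations(disks, 3))
print(fatal3, "of 20")   # 12 of 20 three-disk combos kill the mirror pool,
                         # while ALL 20 kill the 6-wide RAIDZ2.
```

Which combination dominates in practice then comes down to the per-drive and correlated failure probabilities, not just the counts.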
Agreed
Agreed again.
But for the more realistic example of 6 SATA ports and 2 way mirrors, RAIDZ2 has
worse reliability
way worse performance
only 33% more storage (absolute best case without any padding overhead, so you better not use zvols)
5y down the line you need to replace 6 drives all at once to get more storage, while with mirrors you can replace two drives with larger ones and already get more storage
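Rough arithmetic behind the 33% figure, assuming six equal disks of a hypothetical size D and ignoring padding/allocation overhead:

```python
DISKS = 6
D = 10  # hypothetical per-disk size in TB

mirror_usable = (DISKS // 2) * D   # three 2-way mirrors -> 3 * D usable
raidz2_usable = (DISKS - 2) * D    # 6-wide raidz2      -> 4 * D usable

print(mirror_usable, raidz2_usable)   # 30 40
extra = raidz2_usable / mirror_usable - 1
print(f"{extra:.0%}")                 # 33% more usable space, best case
```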
I think a reliability section can be added “mirrors are not always more reliable” but I’m not sure how pervasive that belief is. @jro has an excellent resource to calculate this https://jro.io/r2c2/
I take issue with this as a general recommendation for a whole host of reasons. Primarily because setting this tunable basically means you no longer have an Adaptive Replacement Cache, and instead you have a Most Frequently Used cache.