Drive layout suggestion

Hi all!
Coming from my other post (Designing new system), but I decided to spin up another one just for a “shorter” answer.

I have to lay out 8-10 10 TB drives for my server. This will be - at the moment - purely a backup repository with no direct access from the rest of the network, so no concurrent users and such.

I have 10 bays to play with (I have 12 LFFs, but need to leave some for future use), so here’s the question:

I have ordered nine 10 TB drives (Ultrastar HC330) and my plan is to use them in a RAIDZ2. I need at least 60 TB, and after a ton of reading I’m exploring other options for increased performance. As far as I can see, I can:

  • Set up a striped RAIDZ1, aka RAID50, with 8 drives. That’s 60 TB after parity and roughly double the speed. However, RAIDZ1 with 10 TB drives makes me a little uncomfortable.
  • Buy another drive and set up a striped RAIDZ2, aka RAID60, with 10 drives; again, 60 TB after parity. I feel this is a bit overkill in terms of redundancy.
  • Keep it as is and make a 9-drive-wide RAIDZ2 for 70 TB after parity.
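For a quick sanity check of the three options, here is a back-of-the-envelope calculation (plain Python, nothing ZFS-specific; it ignores ZFS metadata, padding, and TiB-vs-TB conversion, so treat the numbers as rough):

```python
# Rough post-parity capacity of a pool built from identical RAIDZ vdevs.
# This is just drive arithmetic, NOT an exact ZFS figure.

DRIVE_TB = 10

def post_parity_tb(vdevs: int, drives_per_vdev: int, parity_per_vdev: int) -> int:
    """Capacity left after subtracting parity drives from each vdev."""
    return vdevs * (drives_per_vdev - parity_per_vdev) * DRIVE_TB

layouts = {
    "2 x 4-wide RAIDZ1 (RAID50-ish)": post_parity_tb(2, 4, 1),
    "2 x 5-wide RAIDZ2 (RAID60-ish)": post_parity_tb(2, 5, 2),
    "1 x 9-wide RAIDZ2":              post_parity_tb(1, 9, 2),
}

for name, tb in layouts.items():
    print(f"{name}: {tb} TB after parity")
# The first two land at 60 TB, the single 9-wide RAIDZ2 at 70 TB.
```

All three options lose the same raw space to parity in the first two cases (2 drives' worth), which is why the 9-wide RAIDZ2 comes out ahead on capacity.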

The rest of the variables are “set”: 7 million office-like files, 10 Gbit connection.

Again this is for a backup repository, so I want to make it safe and redundant. The system will be replicated to another offsite system, but that’s for the next topic.

So any comments on this are welcome.

Thank you!

Why do you think you need “improved” performance (over and above RAIDZ2)?

What do you think the performance problems will be if you do use a single RAIDZ2 vDev?

(I ask because there is a whole lot of misinformation about the need for multiple vDevs or mirroring, related to the difference between IOPS and throughput, that people often read and believe applies to their sequential file I/O, but which needs debunking.)

For your use case - effectively a single backup user and almost entirely streaming sequential writes (and no 4KB random access and no synchronous writes) - IMO you need write throughput and not IOPS.


I’m at the moment where I can decide. My question is “can I choose better?”

As the system is new, I don’t have “previous” data to compare, so I don’t have “issues” with the speed.

And I thank you very much for debunking my understanding, this is my first project with TrueNAS at business-level, so any input is well received.

All right, I get it. However (and this comes from my other post), I believe I’m more in the random r/w pattern, as I will be syncing (robocopy/rsync) a large file server with this one. But again, I might have misunderstood some of the dozens of posts I’ve read lately.

This is very important: how much “usable” space do you need?
Here is how I think about it when asked this type of question. It is not very scientific; it is just my experience here on the forums.

How long do you think the 60TB will last before you outgrow it? If the answer is 5+ years, then you are okay. In a corporate environment you never know, and you always want the ability to add more storage. For the sake of argument, assume 60TB is what you never plan to exceed.

If you want 60TB of “usable” storage using 10TB drives, think about RESILVER times: they are long, and because when things go wrong they go very wrong, you could end up with a second drive failure mid-resilver. How long is your data at risk while you resilver a replacement drive into the system? I would not do less than RAIDZ2. If you were not going to have an offsite backup, I might suggest RAIDZ3, depending on how important it is to recover that data. But let’s assume RAIDZ2 is good for now.

A RAIDZ2 using nine (9) 10TB drives will give you a raw amount of 90TB; simple math here. But usable space is not the same as raw space. To keep it simple (dumbed down), RAIDZ2 uses the equivalent of 2 drives for parity, leaving you 70TB of storage, and then you take away 20% (14TB) to maintain a healthy pool. This leaves you with a maximum usable capacity of 56TB. This does not take any compression into account.

Again, this was the dumbed-down version of the math; it is more complicated than that if you are looking for exact figures (use a RAIDZ calculator). And while you can go above 80% capacity, if you hit that point you should be looking to add more capacity to the system.
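That dumbed-down math can be written out explicitly (the 80% figure is the usual rule-of-thumb fill limit for a healthy pool, not an exact ZFS number, and compression is ignored):

```python
# raw -> post-parity -> "healthy pool" capacity, rule-of-thumb only.

def usable_tb(drives: int, drive_tb: float, parity: int,
              fill_limit: float = 0.80) -> tuple:
    raw = drives * drive_tb                      # 9 * 10 = 90 TB raw
    post_parity = (drives - parity) * drive_tb   # 70 TB after 2 parity drives
    healthy = post_parity * fill_limit           # 56 TB at the 80% mark
    return raw, post_parity, healthy

raw, post_parity, healthy = usable_tb(9, 10, 2)
print(f"raw={raw} TB, after parity={post_parity} TB, healthy max={healthy} TB")
```

A RAIDZ calculator will shave a bit more off for metadata and padding, but this gets you within a few TB of the real figure.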

I’m in agreement with @Protopia about the data access speeds. IOPS for a backup is not very relevant. If this were an active server that many people were using actively, that would be a different story and the requirements would change.

If you have not already done so, write down your list of requirements; this is the minimum you would need to do the job you want. But as a backup server, IOPS is negligible.

If you need high IOPS, then a different layout would be required, and the entire system should be designed around high IOPS. It is not just the drive layout.

Someone else may be able to speak to using rsync but from what I have seen written, it sounds like it is slow. If this is true, then IOPS is not a real factor here.

I hope this helps.


Mirrors are not necessarily better. For sequential writes, RAIDZ is at least as good for throughput and much better for storage efficiency.

No - you won’t be because rsync writes are sequential writes of entire files, so absolutely NOT random update writes like you would get with database files or virtual disks.

BUT…

If you are doing a sync from another server which is using a ZFS file system, then you should definitely do ZFS replication (send/receive), because that is much, much more efficient than rsync / robocopy / syncback. Indeed, IMO this is sufficiently better to justify rebuilding the source servers to use ZFS file systems just so you can do replication for incremental backups.


So raidz2 or raidz3: That’s your only decision now.
You may even use raidz expansion to widen later, though 10 is about as wide as many would go with raidz2. 11- or 12-wide raidz3 looks OK-ish.


Wow, nice, so much more information to digest :sweat_smile:

Well yeah, I’m designing this system for the next 3-5 years, actually. Based on my math I could easily get 5+ years of storage out of it, but by then I would be looking to replace the server itself, so I’m not very worried about that. I have a few bays left (6, to be precise), so I could mount another volume with larger drives in the future IF needed, but as you said, corporate is a bit unpredictable in that respect.

Going back to 2010, current storage sits at about 28 TB, with an average growth of 1.6 TB/year over the last 9 years.
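For what it’s worth, plugging those figures into a naive linear projection (assuming growth stays around 1.6 TB/year, which corporate data rarely honors, and using the ~56 TB usable estimate from earlier in the thread):

```python
# Naive linear projection: how long until the pool hits its usable limit?

current_tb = 28.0          # storage in use today
growth_tb_per_year = 1.6   # average over the last 9 years
usable_target_tb = 56.0    # 9-wide RAIDZ2 after parity and the 80% rule

years_until_full = (usable_target_tb - current_tb) / growth_tb_per_year
print(f"~{years_until_full:.1f} years of headroom at current growth")
```

That comfortably clears the 5+ year bar, even if growth accelerates quite a bit.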

Yup, resilver is what gives me chills. That’s why I said I wasn’t too comfortable with a striped RAIDZ1 (aka RAID50). Sorry, I’m much more used to the “RAID” terminology, so that’s how I make sure we are on the same page.

Sounds about right then. 60 TB is a jump from what’s already available, but this system is a first shot; if it is outgrown too quickly, then I’ll pass it to the upper levels: delete or pay more.

Got it. HDDs were never meant for high IOPS; that’s why I’m speccing for storage. But as I said in my OP, maybe I can do “better”, no shame in asking.

Well, the current storage design dictates half of the equation. As posted in the other thread, having a Windows Server in the mix that has to robocopy is far from ideal, but that’s something I cannot redesign atm.

And yes, really helpful.

Understood, but as stated before, I’m not at the point of rebuilding the source servers as TN (also noted by @Johnny_Fartpants). While I do have plenty of Windows Server experience, I lack the expertise needed to fully jump into the ZFS/TN pool with production storage, even with the potential benefits.

As noted in my other post, call it a “soft transition”.

All right, that’s a nice summary of the last posts!

Thank you all!

Before you dive in completely, you should fully understand the upgrade path for increasing your storage capacity, and its pitfalls. With those 6 extra bays you could add more drives to your RAIDZ2 and “expand” it; that would give you roughly 10TB of extra capacity per added drive. Please understand that the capacity values I am stating are very rough and will of course be a little lower, but not significantly lower.

Since this will be a backup system, I personally do not see an issue with making a 12-wide RAIDZ2, but if it were used as a normal server, there would be too much risk there. We will stay on topic: backup server only. You could also add another complete vdev using 6 drives of whatever capacity you choose. It is good to have some growing options.

You could use RAIDZ3 for one extra disk of safety against drive failure. You would then need to add one more 10TB drive to make up the capacity lost by going to RAIDZ3.

That is very good, many people are short sighted and do not grasp the big picture.

It is a good thing you already work with servers so you know the requirements you should have. I suspect that you would not “need” to replace the server for a while but many companies have a computer replacement plan. When I was working it was 3 to 5 years a new laptop for everyone. Over 110,000 employees as of today, and I retired less than a year ago. That was several divisions and each had its own IT Department. I do not know how often servers were replaced but I would not be surprised if it was on a similar schedule.

Very right, it is important to ask questions.

Also, as a backup system, you do not need a high-end CPU or massive amounts of RAM. You do not need any special vdevs, cache, logs, whatever. K.I.S.S. is the rule to follow here. Of course, if it is a leftover old server, whatever it comes with will likely be fine.

Some light reading material for you:

And look for the White Papers for ZFS Pool Layout and Pool Capacity.

I hope I have given you some things to think about and it all helps you out.


Ha, K.I.S.S., love it. This is me getting my feet wet with TN in a new job; I want to get it right.

Thank you for your help.