Pool suggestions

Hello,

I’m looking for suggestions on a drive layout.

I have 56 × 9.1 TB drives, 1 × 2 TB NVMe, and 2 × 500 GB SSDs (OS). I’m trying to find the sweet spot of performance and capacity (like everyone). 80% of my data is media and 10% is VMs, then some miscellaneous documents and stuff. Sitting at about 150 TB consumed as of now. I would also like to have a spare in the pool.

Not to derail, but did you (safely) succeed in removing enough drives from the 54-wide stripe vdev to at least construct a mirror vdev?

I don’t believe member disks in a stripe vdev can be “upgraded” to a RAIDZx vdev.

I’m still in the process of removing enough disks. I’m getting close to the minimum I need to start migrating, but I’m not sure what layout to use. I’m hesitant to go with a bunch of mirrors, as I’m hoping not to end up with that “low” a capacity.
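In case it’s useful to others reading later, top-level device evacuation from a stripe pool is driven by `zpool remove`; a minimal sketch, assuming a pool named `oldpool` and a hypothetical disk name (on TrueNAS you’d normally do this from the UI):

```sh
# Evacuate one top-level disk from a stripe pool (works because there are
# no raidz vdevs in the pool); its data is copied to the remaining disks.
zpool remove oldpool sdaa

# Watch the evacuation progress until the disk is fully removed.
zpool status oldpool
```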

Capacity-wise, 4 × 14-wide RAIDZ3 vdevs should give you 70% or so usable capacity with a decent amount of redundancy. RAIDZ2 would be closer to 80% (but with an array that large I’d be less comfortable using it :slight_smile: ).
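A quick sanity check on those percentages (parity fraction only; real usable space is a bit lower once ZFS metadata and allocation padding are counted, which is why 14-wide RAIDZ3 lands nearer 70% in practice):

```sh
# Data drives / total drives per vdev, as a rough percentage of raw capacity.
echo "14-wide RAIDZ3: $(( (14 - 3) * 100 / 14 ))% of raw"   # ≈ 78%
echo "14-wide RAIDZ2: $(( (14 - 2) * 100 / 14 ))% of raw"   # ≈ 85%
```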

Network data: c. 135 TB of media files; VMs: c. 15 TB.

A few questions about your requirements, to help decide what to do with the SSDs/NVMe:

  1. How do you access the data over the network? SMB or NFS? (Are your network writes async or sync? Will an SLOG help? If media files are rarely written, probably not. See the sketch after this list for a quick way to check.)
  2. How much memory will your server have after you have used some for VMs? (Will an L2ARC help? For rarely accessed large media files, I am guessing not, but I may be wrong.)
  3. 15 TB of VM data is too much to put on SSD. If these VMs were running natively, how would you be using SSDs?
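For question 1, here is a rough way to check from the shell; the dataset and pool names are placeholders for whatever you actually have:

```sh
# Show the sync policy in effect for a dataset or zvol. SMB writes are
# usually async by default; iSCSI zvols typically generate sync writes.
zfs get sync tank/media

# Watch per-vdev activity while a client writes, to judge whether a
# dedicated log device would actually absorb anything.
zpool iostat -v tank 5
```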

If it were me I would buy a cheap small NVMe for the boot drive and swap out the large one to use somewhere else in the future.

My gut reaction is that it is not worth using the SSDs for L2ARC, for an SLOG for the network data, or for metadata storage. But I am not an expert.

I might consider using them as an SLOG for VMs if I knew for certain that it would really improve performance and if performance was important.
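For what it’s worth, if you did try it, attaching a mirrored SLOG is a single command; the pool and device names below are hypothetical:

```sh
# Add the two SSDs as a mirrored log (SLOG) vdev to the VM pool.
# Only sync writes benefit; async writes bypass the log entirely.
zpool add vmpool log mirror /dev/sdy /dev/sdz
```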

Alternatively I might use the 500GB SSDs as a mirror pair to hold apps and e.g. Plex/Jellyfin metadata for streaming.
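That option is also a one-liner, again with hypothetical device names:

```sh
# A small mirrored SSD pool for apps and streaming metadata.
zpool create apps mirror /dev/sdy /dev/sdz
```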

For the network data, you will probably want RAIDZ2/3, perhaps in vdevs which are 7 or 8 wide (because these are divisors of 56). Given the number of drives and the likelihood of a drive failing, you probably want to keep back a few drives to use as global spares.
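The rough capacity math for those widths, with 9.1 TB drives and parity fraction only (spares would come out of these counts):

```sh
# 8 × 7-wide RAIDZ2 (56 drives): 8 vdevs × 5 data drives × 9.1 TB
echo "$(( 8 * 5 * 91 / 10 )) TB"   # 364 TB
# 7 × 8-wide RAIDZ2 (56 drives): 7 vdevs × 6 data drives × 9.1 TB
echo "$(( 7 * 6 * 91 / 10 )) TB"   # 382 TB
```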

I don’t have any experience with VMs on SCALE, so not sure whether you should use a small RAIDZ or mirrors for the VM data (my gut reaction would be to use mirror pairs), and also not sure whether an SSD SLOG mirror would help with VM writes.

With spares in mind, perhaps 6 × 9-wide RAIDZ2 vdevs would be ideal, keeping two drives as hot spares to replace failures as and when needed (RAIDZ3 is of course safer, but if capacity is important, it becomes a balancing act).
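For illustration, hot spares are just another vdev type at creation time (or later via `zpool add`); a scaled-down sketch with hypothetical device names, showing two of the six 9-wide vdevs:

```sh
# Two of the six 9-wide RAIDZ2 vdevs plus the two hot spares; the other
# four vdevs follow the same pattern. Device names are hypothetical.
zpool create media \
  raidz2 sda sdb sdc sdd sde sdf sdg sdh sdi \
  raidz2 sdj sdk sdl sdm sdn sdo sdp sdq sdr \
  spare  sds sdt
```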

I typically keep VMs on mirrored NVMe drives for better read IOPS (which, from what I’ve read previously, is generally recommended for block storage).

Will the system be used for VM storage or to actually host VMs?

If storage, iSCSI or NFS?

Doesn’t really matter. If you’re hosting VM storage on rust, your only real choice is mirrors.

And you’ll need a good SLOG too.

Hosting VMs = mirrors, under 50% use
For 15 TB, that would be eight (ten) drives in four (five) 2-way mirrors. Make that twelve (fifteen) drives for safer 3-way mirrors.

That leaves 41–48 drives for media, in raidz2 or raidz3, at under 80% use. With 9 TB drives:
5 × 8-wide raidz2 = 270 TB usable, and some spares
4 × 11-wide raidz3 or 4 × 10-wide raidz2 = 288 TB usable

It all fits.

Just adding a bit of info.

The system has 512 GB of memory.

My data is accessed via iSCSI, with a couple of SMB/NFS shares.

At the moment I’m limited on SSDs and the NVMe. The two in the system are already used for the OS. The NVMe is currently used as the cache and it’s 3200 GB. All the HDDs are SAS. The chassis is a Cisco UCS S3260, so I’m limited on regular drives and PCIe adapters.

That is some SERIOUS hardware!!


Jeez, just looked up the chassis. Makes my little lab look like a glorified USB stick in comparison!
I’ll keep my spirits high by remembering it’s not about the size of your storage… it’s about what you do with it.


It was free to me, and it has all the storage I should need for a while. It did require me to run dedicated power and invest in 10 Gb networking.

Your uses call for two pools with two different geometries. The mirror pool can evolve at will, but the raidz# bulk storage pool requires an initial, critical decision about vdev geometry.
Then you can create at least the first vdev, move data onto it, remove drives from the old pool, add another raidz vdev, rinse and repeat. (And possibly run a rebalance script at the end.)
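Sketched with hypothetical pool and dataset names, one pass of that loop looks something like:

```sh
# 1. Create the new pool with its first raidz2 vdev from freed-up drives.
zpool create newpool raidz2 sda sdb sdc sdd sde sdf sdg sdh sdi sdj

# 2. Move a chunk of data over; snapshot + send/receive preserves properties.
zfs snapshot oldpool/media@migrate1
zfs send oldpool/media@migrate1 | zfs receive newpool/media

# 3. Evacuate a now-empty drive from the old stripe pool...
zpool remove oldpool sdk

# 4. ...and once enough drives are free, grow the new pool with another vdev.
zpool add newpool raidz2 sdk sdl sdm sdn sdo sdp sdq sdr sds sdt

# Repeat steps 2-4 until everything has moved, then destroy the old pool.
```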


So, after reading most responses: a small mirror pool for the VM data, then RAIDZ2 for the rest?

For the network data / media files, several RAIDZ2 vdevs combined into a single pool, because you don’t want too wide a vdev; otherwise resilvering will take forever.


So would 8 wide be good? Or less than that?

I was looking at a RAIDZ calculator. Using 4 drives for my VM pool leaves 52 drives for the media pool, with 2 reserved as spares. Would 5 × 10-wide work? Or is that too wide? That would give me 318 TiB of capacity.

2 × 2-wide mirrors for the VM pool
5 × 10-wide RAIDZ2 for storage + 2 spares

Sounds like a decent solution to me.
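On TrueNAS you would build this through the UI, but the equivalent zpool commands, with hypothetical /dev/disk/by-id names, look roughly like:

```sh
# VM pool: two 2-way mirrors (4 drives).
zpool create vmpool \
  mirror /dev/disk/by-id/vm-disk1 /dev/disk/by-id/vm-disk2 \
  mirror /dev/disk/by-id/vm-disk3 /dev/disk/by-id/vm-disk4

# Media pool: five 10-wide RAIDZ2 vdevs (50 drives) plus two hot spares.
vdevs=""
for v in 1 2 3 4 5; do
  vdevs="$vdevs raidz2"
  for d in 1 2 3 4 5 6 7 8 9 10; do
    vdevs="$vdevs /dev/disk/by-id/media-v${v}-d${d}"
  done
done
zpool create media $vdevs spare /dev/disk/by-id/spare1 /dev/disk/by-id/spare2
```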


The more vdevs you have in the VM pool, the better the performance will be.

When you have multiple VMs, they share the total IOPS of their host pool.

Running multiple VMs on just 2 mirrors is a way to get single-hard-drive-like performance from each VM.

The recommendation is to keep a VM pool below 50% full (i.e., block storage).

Thus, for 15 TB of VMs you should have a 30 TB pool, which means at least 4 mirrors of 9.1 TB drives.
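The arithmetic, for reference (the +1 is a rough integer ceiling; it holds here because 30 is not an exact multiple of 9.1):

```sh
# 15 TB of VM data at a 50% fill ceiling needs at least 30 TB usable.
# Each 2-way mirror of 9.1 TB drives contributes 9.1 TB usable.
echo "$(( 30 * 10 / 91 + 1 )) mirrors minimum"   # ceil(30 / 9.1) = 4
```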

Unless perf doesn’t matter :wink:

(The 50% figure is to ameliorate fragmentation, because ZFS is a copy-on-write filesystem.)


Mandatory reading for iSCSI:
