Criticize My iSCSI Setup!

Hello! I wanted to post my iSCSI setup and results to see if anyone can point out anything I can improve on and whether these results are expected. For a general overview, I am trying to implement TrueNAS Scale in my environment as shared storage for my virtual machines. I have 3 hosts, each with ~15 VMs doing various things… we can delve more into their actual functions if anyone is curious. I am not a data engineer or an expert in this field. Data engineering is maybe 10% of my scope of work, so please don’t crucify me too badly…

A couple of notes…

  • TrueNAS Scale 25.10.1

  • Not going to harp on hardware too much because I don’t really have the option to acquire new hardware unless we are talking a really cheap upgrade…

  • x12 8 TB HDDs in a raidz2 and one 1 TB SSD as a cache vdev

  • x2 25 Gbps Mellanox NICs for TrueNAS & ESXi

  • 256GB of ECC RAM

  • x2 Intel(R) Xeon(R) Gold 6136 CPU @ 3.00GHz

  • Using VMFS 6 on vCenter / ESXi / vSphere

  • Using MPiO Round Robin

I am going to attach pictures of my configuration as well as a CrystalDiskMark result on a virtual machine on one of my hosts using iSCSI as the datastore source :slight_smile:

Check your arc_summary to see if you are even getting any benefit from L2ARC with your 256 GB of RAM. You might be better off looking at SLOG for sync writes, but there are device requirements. Block-type storage is recommended at 50% usage or under.
This should be the L2ARC and SLOG section for 25.10

BLOCK STORAGE
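To check whether that 1 TB cache vdev is actually earning its keep, look at the L2ARC hit ratio in `arc_summary`. The snippet below uses a hypothetical sample excerpt just to show what to grep for — the real command on the TrueNAS shell is simply `arc_summary | grep -iA3 'l2'`:

```shell
# HYPOTHETICAL sample lines standing in for real arc_summary output,
# used here only to demonstrate the parsing.
sample='L2ARC size (adaptive):                                       512.0 GiB
Hit ratio:                                     2.1 %       25.3k
Miss ratio:                                   97.9 %        1.2M'

# Pull out the hit-ratio percentage; a value near zero means ARC (RAM) is
# already absorbing most reads and the L2ARC device is mostly idle.
hit=$(printf '%s\n' "$sample" | awk '/Hit ratio/ {print $3}')
echo "L2ARC hit ratio: ${hit}%"
```

With 256 GB of RAM, a very low hit ratio is a sign the cache vdev could be repurposed.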

2 Likes

Awesome, I will investigate this more. Thank you for the resources.

What about my multipathing setup? Did I set that up correctly? As you can see, I bonded two 25 Gbps interfaces together for each bond and then used those bonds as paths for round robin. I’ve done some pretty extensive back and forth with different configurations, and I seem to get the best performance with this networking setup. I also have Jumbo Frames enabled across the board.
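For the jumbo frames claim, it’s worth verifying the full path actually passes 9000-byte frames, since one switch port left at MTU 1500 silently fragments or drops traffic. The arithmetic for the magic ping size (the portal address below is a placeholder):

```shell
# 20-byte IP header + 8-byte ICMP header = 28 bytes of overhead, so the largest
# ping payload that fits a 9000-byte frame without fragmentation is:
mtu=9000
payload=$((mtu - 28))
echo "$payload"

# Verify end to end with the don't-fragment bit set, e.g.:
#   Linux guest:  ping -M do -s 8972 <iscsi-portal-ip>
#   ESXi host:    vmkping -d -s 8972 <iscsi-portal-ip>
# (<iscsi-portal-ip> is a placeholder for your portal address.)
```

If the 8972-byte ping fails while a small one succeeds, some hop in the path is not jumbo-clean.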

I am not familiar with the networking aspect. I don’t know if that is called ‘multipath’ or something else, like ‘multichannel’. Others should comment. I just gave a quick reply based on the resources I knew.

:scream:

“Multipath” is a SAS feature—no longer supported by SCALE/CE.
“Multichannel” is for SMB.
“Link aggregation” is the term you’re looking for. But 2*25G is not a 50G link.

I suspect that Crystal DiskMark results are massively helped by RAM caching.
Actual results from 15 clients competing for IOPS on a single vdev are unlikely to match these expectations.

1 Like

Could you elaborate further instead of just an emoji…? Is that bad or good? Is 12 disks not enough for iSCSI?

I understand that I am using link aggregation by bonding two interfaces but is multipathing not just…using multiple paths? I am aware bonding two interfaces together does not produce double throughput / speed. I am just trying to maximize my performance with what I have.

If I move the entirety of one host’s VMs to my iSCSI datastore during regular business hours, I see about a third of my performance lost if I run CrystalDiskMark again. I thought these were decent numbers, but given your reaction to my post I am starting to think I should just drop iSCSI / TrueNAS altogether if I don’t have the appropriate hardware.

Edit: Here is what I changed to switch to round robin in vCenter. It is part of the reason I was under the impression I was using MPIO.

image
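For anyone following along, the same Round Robin change can be made from the ESXi shell. This is a sketch with a placeholder device ID, not output from this system:

```shell
# Placeholder device ID -- list the real ones with: esxcli storage nmp device list
DEV="naa.xxxxxxxxxxxxxxxx"

# Set the path selection policy to Round Robin for that device
esxcli storage nmp device set --device "$DEV" --psp VMW_PSP_RR

# Optionally switch paths after every I/O instead of the default 1000 I/Os,
# which often spreads load more evenly across the links
esxcli storage nmp psp roundrobin deviceconfig set \
  --device "$DEV" --type iops --iops 1
```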

The path to success for block storage article talks about performance. You have about 76% space used on your current pool. We can’t recommend multiple mirror vdevs because you would have to rework a lot of drives, and there is still the 50%-used-or-under recommendation for block storage.

If you read those articles, I think you will look just like that emoji. I don’t think there are any real recommendations we could make for your system except read the articles and plan for a new system. There is also a pool layout whitepaper that talks about IOPS, read, write, etc of the layouts.
iX Systems pool layout whitepaper

That’s 76% reserved for a thick-provisioned zvol… that is currently almost empty.
So the recommendation would still be to switch to 6 striped mirrors but use no more than half of the 48 TB space.

Or use SMB/NFS file sharing rather than block storage. But there would still be the question of IOPS.
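For what it’s worth, the suggested layout in plain zpool terms (pool and disk names are made up; on TrueNAS SCALE you would build this through the UI rather than the shell):

```shell
# Six striped 2-way mirrors from twelve 8 TB disks: ~48 TB usable,
# roughly 6x the write IOPS of a single 12-wide raidz2 vdev.
# Disk names below are placeholders.
zpool create tank \
  mirror sda sdb \
  mirror sdc sdd \
  mirror sde sdf \
  mirror sdg sdh \
  mirror sdi sdj \
  mirror sdk sdl
```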

1 Like

I’ve set up my pool with 6 mirrors like you suggested, and I am pleased with the results. A few things though…

There has to be better benchmark software that gets more real-world results / data, right? Any recommendations? Sure, I could just load up the iSCSI targets with VMs and monitor performance, but that requires waiting…

Lastly, I read the whitepaper that was linked previously in this thread, and to be totally honest I don’t understand a lot of it. It shows benchmarks of different setups and explains the function of those setups, but I can’t seem to get a “why” from anyone or any source. Like… in human-readable terms. Like I said, I am not a data engineer and this is maybe 10% of my scope of work, so I am not, and don’t plan on becoming, a TrueNAS expert. Specifically, why is 6 mirrors better than a large 12-disk raidz2? Sure, I get more redundancy, but is that really it? I have other plans for redundancy in place. Like I said, I read the whitepaper, and I still don’t understand the emoji response. A million connections could be made between that reaction, the whitepaper, and my post, none of which are useful.

See the examples-by-workload section, approximately page 9, in that Pool Layout whitepaper. The iSCSI example they use shows needing IOPS. They give three different workloads and sum up the pool choices.

1 Like

Ah, I am starting to pick up what you’re putting down. My CrystalDiskMark results aren’t really steering your responses, right? Since you’re thinking my RAM may be carrying my IOPS in my results? Thus @etorix’s recommendation to switch to a more IOPS-efficient pool configuration. Halving my storage space kinda sucks, but to be honest I probably won’t put a dent in 20 TB for a longgggg time. Most of my VMs have 128 GB allocated anyway. Thank you for the straightforward guidance.

Yep. You have, for once, a very comfortable amount of RAM to handle your iSCSI share. For a write test, ZFS could cache up to two transaction groups before it has to flush to disks—requiring a pause to digest if need be. 2 * 5 s * 25 Gb/s = 250 Gb, i.e. about 25 GB (at 10 bits per byte with headers), so this 1 GiB Crystal test looks like it is not stressful at all.
Having 15 clients competing for the 300 IOPS of a single raidz2 vdev, when each operation actually requires multiple drive accesses (for the data, for all the metadata of the filesystem in the zvol, and for all the ZFS metadata on top of it), would be a lot more stressful, but to simulate that you’d have to play with appropriate options for fio.
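Along those lines, here is a rough fio sketch of that kind of load. The job name, 70/30 read/write mix, and sizes are illustrative assumptions, not a calibrated model of your VMs:

```shell
# Write an fio job file approximating 15 clients doing small sync random I/O.
cat > vmload.fio <<'EOF'
[global]
ioengine=psync
sync=1
rw=randrw
rwmixread=70
bs=4k
size=1g
runtime=60
time_based
group_reporting

[vm-clients]
numjobs=15
EOF

# Sanity-check the job file; then run it on the pool under test with: fio vmload.fio
grep -c 'numjobs=15' vmload.fio
```

The `sync=1` flag matters: it forces the synchronous writes that a SLOG (or the lack of one) would make visible, which cached sequential benchmarks never show.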

Testing ZFS performance easily goes to 8 and above on this scale

2 Likes