TrueNAS Scale - LOVE IT, still getting same performance as ZFS on Ubuntu

Good day to everyone! I am running into a bottleneck, mainly on writes, and can’t seem to isolate it, so I’m looking for help and guidance. I have read many, and I mean many, forum threads and spent the last few weeks and tens of hours investigating this, and I’m losing my hair fast. Here’s my setup:

- ZFS + Ubuntu builds, 128K record size:

  • Build #1: Cisco UCS C220-M4, 2x Intel Xeon 2.5GHz 12-core, 32GB RAM 2133MHz, 2x 120GB SATA in RAID-1 for Ubuntu (SLOG and L2ARC on separate partitions), 1 vdev of 6x 960GB Samsung SSDs
  • Build #2 and #3: Cisco UCS C240-M4, 2x Intel Xeon 2.4GHz 14-core, 256GB RAM 2400MHz, 2x 480GB SATA in RAID-1 for Ubuntu (SLOG and L2ARC on separate partitions), 2 vdevs of 11x 1.8TB 10K Seagate Enterprise drives

- TrueNAS build, 128K record size:

  • Cisco UCS C240-M4, 2x Intel Xeon 2.4GHz 6-core, 128GB RAM 2400MHz, 2x 120GB SATA in RAID-1 for TrueNAS (managed by TrueNAS, not the RAID controller), 2 vdevs of 11x 1.8TB 10K Seagate Enterprise drives, 2x 960GB Samsung SSDs for SLOG (mirrored) and L2ARC

- Network setup for several hundred VMs:

Cisco B-200 servers running ESXi 7.0 U3n → Cisco 6248 FIs → Nexus 5548 → 2x 10G to ZFS/TrueNAS storage

  • vCenter to manage all ESXi hosts
  • iSCSI to storage with multipathing enabled, round robin set to switch paths after 1 I/O instead of the default 1,000; each iSCSI path has its own dedicated VLAN and subnet (rough commands below)
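For reference, the round-robin change was made with something along these lines from an ESXi shell; the device identifier is just a placeholder for the actual iSCSI LUN:

```
# List iSCSI devices and their current path selection policy / round robin settings
esxcli storage nmp device list

# Switch paths after every single I/O instead of the default 1,000
# (replace naa.xxxx with the real device identifier)
esxcli storage nmp psp roundrobin deviceconfig set --type=iops --iops=1 --device=naa.xxxx
```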

- Reads are super fast with all the ARC and L2ARC, so my issue… across the 4 builds listed above I am hitting almost the same “real-world” write throughput limit in my virtualized environment, maxing out at ~55MB/s of write bandwidth. I confirm this by subtracting the average write bandwidth reported by “zpool iostat -v” for the entire pool from 55MB/s; any new file transfer to the storage in question or any file unzipping/uncompressing gets the remaining throughput. For example, 55MB/s minus a current average of 40MB/s leaves 15MB/s for the file transfer/unzip; a lower current average allows faster file activity, while a higher average results in slower file activity. I did a check with “zpool iostat -pr” and can see my sync and async writes are ONLY in the 4K - 1M block range, nothing smaller or larger. I have tested this against Windows and Linux virtual machines with no difference. What is really confusing is that the all-flash storage in build #1, with 1 vdev of 6x 960GB SSDs, is getting the same write performance as the other 3 builds.
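For anyone following along, these are roughly the commands I have been using to watch the pool; the pool name “tank” is just a placeholder for the actual pool:

```
# Pool-wide read/write bandwidth and IOPS per vdev/device, refreshed every 5 seconds
zpool iostat -v tank 5

# Request-size histograms showing sync vs. async read/write block sizes
zpool iostat -r tank 5
```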

  • I have removed the SLOG from the SSDs and allowed the ZIL to live on the spinning disks, no change
  • I have moved the storage to the 6248 FIs, bypassing the Nexus 5548s, no change
  • I have removed an ESXi host from vCenter and seen a 30% increase in writes, to 71MB/s, but I think it’s related to vCenter adding delays from cluster resource sharing or even DRS, so in vCenter I did try to crank the storage I/O limit from 100k to 500k and there was no change there either
  • ATTO testing with a queue depth of 256 confirms multipathed reads and writes can easily saturate the 2x 10Gbps links at 64K and larger block sizes; anything smaller gets a mix of a few hundred MB/s but still tons of IO/s
  • I even built a single vdev with 4x 1.8TB Seagate drives; I would have imagined this would have really hurt write performance - but no change
  • I have tried different record sizes of 32K and 64K; the extent block size should be 4K if I’m not mistaken with ESXi 6.7 and later and its 4K minimums, but I did try 512B - 4K and again… no change (rough commands after this list)
  • I did try different metadata sizes from 32K to 1M, but no change
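For the record-size experiments above, the changes were along these lines; the pool/dataset names are placeholders, and note that a new record size only applies to newly written blocks:

```
# Check the current record size and sync setting on the dataset backing the extents
zfs get recordsize,sync tank/vmstore

# Try a different record size (only affects data written after the change)
zfs set recordsize=64K tank/vmstore
```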

Would love to know what troubleshooting steps I could take to find the source of this issue, or if there’s a best practice for this sort of setup where I may have overlooked a simple setting in ESXi or ZFS/TrueNAS. If you made it this far - THANK YOU, and I look forward to chatting!

Follow-up to this post. I did some more reading and saw posts sharing that a record size of 1M gives better throughput in a virtualized environment, so I gave it a whirl and really didn’t see any difference - but I kept the 1M record size as it just felt like the right thing to do long term. After that I got the energy to spin up 3 different virtual machines running Windows 10. With all 3 virtual machines powered on and running, I was able to uncompress the same file on 2 of them while also running an ATTO benchmark test at the same time. Network traffic did take up a majority of the 2x 10Gbps links for a period of time; however, what is interesting is that each of the 2 virtual machines uncompressed the file at around 55MB/s while the ATTO test was still going in the background against a 4GB file with block sizes 4K - 2M.

So it appears this 55MB/s ceiling I have been chasing for quite some time comes from a per-virtual-machine limit rather than the storage, and it looks like I have ruled out anything on the TrueNAS side - I wanted to share that with the community in case anyone else runs into the same issue and questions the root cause. Just to share, I did disable DRS on the storage cluster and it had little to no impact on throughput, and I confirmed the storage disk policy was the default with no other rogue policy applied.

I took my testing rampage a step further and ran the ATTO benchmark test on 2 of the virtual machines at the same time while copying approximately 15+GB of random files to the TrueNAS storage. In theory the storage (cache/write throughput/etc.) should have been saturated with those 2 tests running at once and shouldn’t have been able to handle any more - well, to my surprise I was able to get about 100-250MB/s transfer speeds while all these bits of data were being processed. I am thoroughly impressed by how TrueNAS SCALE handled the workload I was throwing at it. CPU was up to 50% for a while and temperature went from a cool 40 degrees Celsius to a slightly warm 50 degrees, and it still hasn’t dropped back to idle temperatures even after a few hours, so I know I put a real load on there.

  • I am happy to report that my issue was pretty much a non-issue, but my investigation into that 55MB/s uncompress ceiling will continue a little longer as I’m still fleshing out some VMware features. Have a great day to all those that read this, and I hope this helps anyone else trying to get some answers - it’s not TrueNAS :wink:

It’s important to first test storage bandwidth with something like fio… it takes any single-threaded bottlenecks from file copy, or perhaps compression/decompression, out of the picture.

If fio has a problem, then there really might be a storage problem. If not, it’s likely to be something else.
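Something along these lines, run directly on the host against a dataset on the pool, would give a baseline to compare with the ~55MB/s seen inside the VMs; the path and sizes here are just example values:

```
# Multi-job sequential write test against a dataset on the pool
fio --name=seqwrite --directory=/mnt/tank/fio-test --rw=write --bs=128k \
    --size=4g --numjobs=4 --iodepth=32 --ioengine=libaio --group_reporting
```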