TrueNAS SCALE Performance Concern – SSD Mirrors + NFS + Proxmox (20Gbps Bonded)

Hello everyone,

I’m new to TrueNAS and would appreciate some insight from the community. I’ve recently set up a TrueNAS SCALE (version 24.10.0) installation and integrated it with my Proxmox cluster over NFS 4.1. Performance seems lower than expected, and I’d love to hear your thoughts.

Setup Details:

Pools:
Pool 1: 7 × MIRROR (2-wide) — Usable: 24.2 TB
Pool 2: 4 × MIRROR (2-wide) — Usable: 27.8 TB
Drives: SAMSUNG_MZ7LH3T8HMLT-00003 enterprise SSDs

Networking:

TrueNAS box: LACP bond using 2 × 10Gbps SFP+, MTU 9000
Proxmox nodes: Each has a similar bond (2 × 10Gbps SFP+), MTU 9000
Protocol: NFS 4.1 used to share both pools to Proxmox

Performance Testing:
Test run on the TrueNAS box (local fio):

fio --name=test --rw=randrw --rwmixread=50 --bs=4k --iodepth=64 --numjobs=4 \
    --time_based --runtime=60 --direct=1 --ioengine=libaio \
    --group_reporting --size=1G

Result:
Read: 71.2k IOPS, ~278 MiB/s
Write: 71.2k IOPS, ~278 MiB/s
Latency: ~1.7–1.8 ms

Test run inside a Proxmox VM (disk on TrueNAS NFS share):
Result:
Read: 15k IOPS, ~88 MiB/s
Write: 18k IOPS, ~72 MiB/s
Latency: ~2.1–4.8 ms
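For an apples-to-apples comparison, it helps to repeat the exact same fio job inside the guest, against a file on the NFS-backed virtual disk. A sketch, assuming the guest runs Linux with fio installed; the filename is a placeholder path:

```shell
# Run inside the Proxmox guest. Parameters mirror the local TrueNAS test;
# /root/fio-test is a placeholder -- point it at a file on the virtual disk
# that lives on the TrueNAS NFS share.
fio --name=vm-test --rw=randrw --rwmixread=50 --bs=4k --iodepth=64 --numjobs=4 \
    --time_based --runtime=60 --direct=1 --ioengine=libaio \
    --group_reporting --size=1G --filename=/root/fio-test
```

Keeping the job identical isolates the virtualization/NFS layers as the only variables between the two results.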

Considering I’m using enterprise SSDs in multiple mirror vdevs and bonded 20Gbps links, I was expecting much higher IOPS or throughput.

Is this the expected performance?
Is my NFS config possibly limiting throughput?
Would switching to iSCSI improve performance for VM workloads?
Any sysctl tweaks or ZFS tunables I should look at?

Any suggestions or ideas to help identify or improve the bottleneck would be greatly appreciated!

Thanks in advance

Some things to check/investigate:

  • Have you checked what hashing algorithm you’re using for your LACP bonds? There are a variety available for different use cases, prioritising redundancy, single host throughput or multiple host throughput.
    Not all hashing algorithms for a given switch vendor will be available on TrueNAS, and vice versa, so consult your switch documentation.

  • Is your Proxmox-hosted VM capable of pulling more than the stated figures from any other host on your network?

  • Can you pull data directly from the TrueNAS to the Proxmox host any faster, rather than a guest?

  • Try running network throughput tests between Proxmox host/guest and TrueNAS using iperf. This will eliminate the storage and focus on network throughput.
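A minimal sketch of that network test with iperf3 (the TrueNAS IP below is a placeholder; substitute your own):

```shell
# On the TrueNAS box: start an iperf3 server.
iperf3 -s

# On the Proxmox host, and again inside a guest: test towards TrueNAS.
# -P 4 runs 4 parallel streams, which also exercises the LACP hash
# across both bonded links; -R reverses direction (TrueNAS -> client).
iperf3 -c 192.168.1.10 -P 4 -t 30
iperf3 -c 192.168.1.10 -P 4 -t 30 -R
```

If both directions saturate the link from the host but not from the guest, the bottleneck is in the guest's virtual NIC rather than the storage path.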

Hi WiteWulf, thanks for the reply.
I checked my hash policy: it's L2+L3, and it is supported.
I migrated the guest VM to all nodes and the results are the same. Running the fio test directly from a node gives slightly better results, but still far from what I expected.
I ran iperf from the nodes to the TrueNAS and I get the full 10G speed (in/out).
My normal traffic to the TrueNAS is about 1–1.5 Gbps (aggregate from all nodes).

Update: from the node I can get almost 200k IOPS to the TrueNAS, so the problem is only inside the VMs.
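If the host can hit 200k IOPS but guests cannot, the virtual disk configuration is the usual suspect. A sketch of the Proxmox-side settings worth checking; the VM ID (100) and storage/volume names are placeholders, so copy the real volume spec from your own `qm config` output:

```shell
# Show the current disk and controller config for the guest.
qm config 100 | grep -E 'scsihw|scsi0|virtio'

# Use the virtio-scsi-single controller so each disk gets a dedicated I/O thread.
qm set 100 --scsihw virtio-scsi-single

# Re-attach the disk with an iothread, native AIO, and no host-page caching.
# "truenas-nfs:100/vm-100-disk-0.qcow2" is a placeholder volume spec.
qm set 100 --scsi0 truenas-nfs:100/vm-100-disk-0.qcow2,iothread=1,aio=native,cache=none
```

These are assumptions to verify rather than a definitive fix, but iothread and the SCSI controller type commonly account for large in-guest IOPS gaps on fast shared storage.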