Where might the bottleneck be?

I’m seeing what looks like a performance bottleneck. I’m curious if others with more experience have any suggestions on where to look.

Background:
Hardware: 12-core Xeon with 196GB of RAM running Proxmox VE
TrueNAS Scale 24.10.1 running as a VM under Proxmox with 96GB of RAM
Array 1: 6x18TB Exos RAIDZ1
Array 2: 6x16TB Exos RAIDZ1

When copying large files from one array to another I get about 600MB/s. However, when copying:

  1. From TrueNAS ARRAY1 or ARRAY2 to Windows desktop (10Gbps) via SMB - a solid 2.8-2.9Gbps (low utilization on local SSD)
  2. From TrueNAS ARRAY1 or ARRAY2 to another Linux VM on the same box as TrueNAS via NFS - a solid 2.8-2.9Gbps
  3. From TrueNAS ARRAY1 AND ARRAY2 to desktop (10Gbps connected) via SMB - still 2.8-2.9Gbps total throughput (low utilization on local SSD)

The 2.8-2.9Gbps is consistent. It is adequate for my needs, but I’m curious where the bottleneck is.

BTW, iperf between the desktop and a VM on the server is 9.8Gbps. iperf between VMs on the server is ~20Gbps.
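
For reference, this is roughly the shape of test being described (a generic iperf3 sketch; the address below is a placeholder, not from this setup):

  # On the receiving side (e.g. the TrueNAS VM or the desktop), start a listener:
  iperf3 -s
  # From the sending side, run a 30-second test against it (placeholder address):
  iperf3 -c 192.168.1.50 -t 30
  # Repeat with the direction reversed:
  iperf3 -c 192.168.1.50 -t 30 -R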

You need to identify your hardware in more detail than just which CPU you are using. The devil is in the details. What NIC are you using?

While I’m not a networking expert, providing more details will likely attract others to answer your concerns.

I’d be interested to see if those numbers changed when moving data from the clients directly to the servers as opposed to server to server via client.

Ignore me, I just realised this is virtualised.

Server CPU = Xeon E5-2650 v4 @ 2.2Ghz (12-core)
Desktop CPU = AMD Ryzen 9 3900X (12-core)

Server NIC = Intel X520-2. Desktop NIC = Intel X520-1.

Thanks. The really odd thing is that it is so consistently pegged at that speed, and that kicking off a second robocopy from the second array doesn’t change the throughput at all. That implies a network bottleneck, yet iperf doesn’t show that…

  1. For 6-wide 18TB RAIDZ the general recommendation seems to be that you should really be doing RAIDZ2.

  2. Run iperf to confirm what your network speed end-to-end is and separate it out from disk performance issues.

  3. You don’t give much detail as to why you virtualised TrueNAS under Proxmox rather than running TrueNAS bare metal. But if TrueNAS containers & VMs will give you what you need, you might be better off switching.

  4. I am assuming that you are passing through the 12x Exos drives AND passing through your HBA(s) and blacklisting the HBAs. If not, please describe what you are doing. (See the sketch after this list for one way to confirm the passthrough.)

  5. I would agree that 5x Exos HDDs should allow you to read an aggregate of more than 350MB/s and that there is likely to be a bottleneck somewhere.
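
As a rough sketch of how the passthrough can be confirmed (generic commands, assuming an LSI-type SAS HBA handed to the VM via VFIO, not details taken from this setup):

  # On the Proxmox host: the HBA should report "Kernel driver in use: vfio-pci"
  lspci -nnk | grep -A 3 -i sas
  # Inside the TrueNAS VM: the HBA and all 12 Exos drives should appear natively
  lspci | grep -i sas
  lsblk -o NAME,MODEL,SIZE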


It could just be Windows tuning, at least in part.

#1 - I agree on RAIDZ2, but that shouldn’t be a performance issue; it’s about the risk of a second drive failing during a rebuild after a single drive failure. The data is backed up, so I’m not concerned.

#2 - iperf consistently shows 9.8Gbps over 10Gbps links between all concerned.

#3 - I prefer Proxmox for virtualization.

#4 - yes. LSI HBA is passed-through to the TrueNAS VM.

#5 - agreed, hence the post. It is odd that I hit a wall at 2.8-2.9Gbps - consistently.

@NickF1227 I read the article and tried the settings within. Thanks, but no change. Also important to note that I hit the same bottleneck doing a transfer over NFS to a Linux VM on the same Proxmox host - exact same speed. It’s so odd… It led me to believe it was a bottleneck of the array. But when I copy array to array I easily get 600MB/s+.

How are you copying the files? How are you measuring? What are the files, are they the same files used in your other tests, or are they different?

How are you measuring? Like the little status box thing when you copy a file in Explorer?

I’m not sure what you mean by AND here vs OR above. Can you clarify?

What are your client mount options?
Are these on the same host as the TrueNAS, or are they going out to the switch and into another Proxmox system? In either case, it would be helpful to see a physical and logical diagram of the connected systems, switches, etc.
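
If it helps, the negotiated client-side NFS options can be read straight off the mount (a generic sketch; the VM's mount point isn't stated in the thread):

  # On the Linux VM: show rsize/wsize, NFS version, protocol, etc. for each NFS mount
  nfsstat -m
  # Or simply:
  mount | grep nfs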

You don’t mention iperf between your desktop and the TrueNAS. iperf is available on the TrueNAS if you didn’t already know. Can you test this?

If you have the time… this will give us a more accurate idea of what LOCAL pool performance looks like, which will show beyond a reasonable doubt whether or not the bottleneck is in the network/environment/protocol.

When copying from one array to the other I was using rsync directly on TrueNAS. When copying from TrueNAS to the desktop I was using robocopy. To clarify what I meant by copying from ARRAY1 AND ARRAY2: I kicked off a robocopy from ARRAY1, which peaked at 2.8Gbps (according to Task Manager), then kicked off a second robocopy from ARRAY2 while the first was still running, and the total throughput did not change at all - still pegged at 2.8Gbps.

When I tested from a VM on the same host as TrueNAS I did an rsync over NFS and still only got 2.8Gbps.
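
For reference, that test looks roughly like this (hostname, export path and file name are placeholders, not the actual ones used):

  # On the Linux VM: mount the TrueNAS export, then pull a large file across NFS
  mount -t nfs truenas:/mnt/array1/data /mnt/nfs
  rsync -a --progress /mnt/nfs/bigfile.bin /home/user/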

When I do an iperf from my desktop to the TrueNAS VM I consistently get 9.8Gbps. When I do an iperf from a Linux VM on the same host I get ~20Gbps…

Thanks.

I think you should start by testing local performance on the pools as a next step.

Just ran fio:

ARRAY1:
WRITE: bw=561MiB/s (589MB/s), 561MiB/s-561MiB/s (589MB/s-589MB/s), io=33.1GiB (35.5GB), run=60344-60344msec
READ: bw=3528MiB/s (3699MB/s), 3528MiB/s-3528MiB/s (3699MB/s-3699MB/s), io=88.0GiB (94.5GB), run=25558-25558msec

ARRAY2:
WRITE: bw=669MiB/s (702MB/s), 669MiB/s-669MiB/s (702MB/s-702MB/s), io=39.3GiB (42.1GB), run=60072-60072msec
READ: bw=3352MiB/s (3515MB/s), 3352MiB/s-3352MiB/s (3515MB/s-3515MB/s), io=87.9GiB (94.4GB), run=26846-26846msec

FIO is infinitely configurable and you don’t show exactly what you ran. It would also be helpful to get a zfs get all for the dataset you ran it in.

I suspect the sample size of data run by FIO was small enough to be cached in RAM (ARC Cache) and that your pools are not quite as fast as that benchmark would make it seem. This is why I posted my benchmark specifically, because it’s known to mitigate that effect some.
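
One quick way to gauge how much the ARC could have absorbed (a generic sketch, not from the thread) is to compare the fio data size against the ARC's current and maximum size:

  # Current ARC size as reported by arc_summary
  arc_summary | grep -i "arc size"
  # Or read the current (size) and maximum (c_max) values directly from the kstats
  grep -E "^(size|c_max) " /proc/spl/kstat/zfs/arcstats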

Excellent point. My dataset size was too small, so caching was definitely playing a role. Re-ran:

fio --ramp_time=5 --gtod_reduce=1 --numjobs=1 --bs=1M --size=500G --runtime=300s --readwrite=read --name=testfile
READ: bw=875MiB/s (918MB/s), 875MiB/s-875MiB/s (918MB/s-918MB/s), io=256GiB (275GB), run=300053-300053msec

These numbers make more sense for the number of drives I have. Still doesn’t explain the robocopy bottleneck…

If you only want to measure disk, you can turn off ARC caching for data and / or metadata temporarily when you run the test and turn it on again immediately afterwards.
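
A minimal sketch of that, assuming a test dataset named tank/test (not a name from this thread):

  # Cache only metadata (or "none" to bypass the ARC for this dataset entirely)
  zfs set primarycache=metadata tank/test
  # ...run the fio test against the dataset...
  # Restore the default afterwards
  zfs set primarycache=all tank/test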


zfs get all output attached: output.txt (11.6 KB)