Disappointing iSCSI and NVMe-oF performance :( How to optimise disk/pool performance?

For private use I have a fairly powerful NAS: 96 GB of memory, a 16-core AM4 processor, and a ConnectX-4 NIC running at 10G.

However, for some reason I do not understand, I cannot reach 10G speeds up and down to the NAS when using iSCSI or NVMe-oF over TCP (RDMA is not supported in the community edition), even on my ‘superfast’ NVMe pool.

With both protocols I can write at about 5G towards the NAS and read at about 9G towards my PC. Not terrible, of course, but why oh why not 10G in both directions?!
(Testing with SMB, which is less complicated and needs fewer writes, does give nearly 10G up and down.)
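For context, a quick back-of-the-envelope calculation (my own rough numbers, not from the posts below) shows what “10G” actually translates to; the ~5% framing overhead is an estimate that varies with MTU and protocol:

```shell
# Convert the 10 Gbit/s line rate into MB/s to set expectations.
line_rate_mbit=10000
raw_mbps=$((line_rate_mbit / 8))           # 1250 MB/s theoretical maximum
# Assume roughly 5% lost to Ethernet/IP/TCP framing (rough estimate).
practical_mbps=$((raw_mbps * 95 / 100))    # ~1187 MB/s realistic ceiling
echo "Theoretical: ${raw_mbps} MB/s, practical ceiling: ~${practical_mbps} MB/s"
```

By that math, the ~9G reads are already close to the wire; the ~5G writes are the real anomaly.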

I did find two interesting blog posts in this forum:

So I decided to run the performance test described there.

Run a 60-second test of 1 MB writes, answering: “how fast could I copy a large, multi-gigabyte video to the system?” If you plan to use it for smaller files, adjust the bs value.
cd to a dataset: /mnt//
sudo fio --ramp_time=5 --gtod_reduce=1 --numjobs=1 --bs=1M --size=100G --runtime=60s --readwrite=write --name=testfile
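One thing worth noting about this command (my own observation, not from the blog posts): it runs a single job at queue depth 1, which rarely saturates an NVMe device. A hedged variant that adds parallelism and forces a final flush might look like this; the iodepth/numjobs values are just starting points to experiment with:

```shell
# Same 1M sequential write test, but with 4 parallel jobs and a deeper
# queue, plus an fsync at the end so buffered writes are actually counted.
sudo fio --ramp_time=5 --gtod_reduce=1 --numjobs=4 --iodepth=16 \
  --ioengine=libaio --bs=1M --size=25G --runtime=60s --readwrite=write \
  --end_fsync=1 --group_reporting --name=testfile
```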

Test results

Here the result of the test on three different dataset types

• SATA SSD
WRITE: bw=448MiB/s (470MB/s), 448MiB/s-448MiB/s (470MB/s-470MB/s), io=26.3GiB (28.3GB), run=60130-60130msec

• PCIe 4.0 NVMe SSD
WRITE: bw=773MiB/s (811MB/s), 773MiB/s-773MiB/s (811MB/s-811MB/s), io=45.4GiB (48.8GB), run=60162-60162msec

• RAIDZ1 pool (4 drives + 2x NVMe special VDEVs)
WRITE: bw=413MiB/s (433MB/s), 413MiB/s-413MiB/s (433MB/s-433MB/s), io=24.2GiB (26.0GB), run=60013-60013msec

What really surprised me is how small the performance differences are! The NVMe drive should outperform the SATA SSD and the RAIDZ1 pool by a very big margin, which is not the case.

I can imagine that the RAM cache (ARC) is skewing the test :frowning: If so, is there a better test?

Bottom line

The bottom line is, of course, that I expected better performance from the NVMe-based pool in combination with the powerful CPU.

So I wonder what the bottleneck is and how to get better performance.

My second question is how to specify --nr-io-queues, and whether that is a good idea.
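For reference, this is how --nr-io-queues is passed on the initiator side with nvme-cli. The address and NQN below are placeholders, and whether more queues actually help depends on how many CPU cores end up servicing the TCP connections:

```shell
# Connect an NVMe-oF/TCP target with an explicit number of I/O queues.
# 192.168.1.10 and the NQN are example values; replace with your own.
sudo nvme connect -t tcp -a 192.168.1.10 -s 4420 \
  -n nqn.2011-06.com.example:nvme:pool1 \
  --nr-io-queues=8   # one queue per core doing I/O is a common starting point
```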

When you do fio tests there is a flag (--direct=1) you can use to bypass the cache; otherwise your test dataset has to be larger than your 96 GB of RAM to make sure you're not just hitting ARC.
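As a sketch, the original command with direct I/O added would look like this. Note that whether O_DIRECT actually bypasses ARC depends on your OpenZFS version; older releases fall back to buffered behaviour:

```shell
# Same 1M sequential write test, but requesting O_DIRECT so the page
# cache / ARC is (ideally) bypassed; --size can then be much smaller.
sudo fio --ramp_time=5 --gtod_reduce=1 --numjobs=1 --bs=1M --size=16G \
  --runtime=60s --readwrite=write --direct=1 --end_fsync=1 --name=testfile
```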