iperf3 performance slow in one direction

When I test network performance with iperf3, running it as the server on my desktop and connecting to it from TrueNAS, I get decent speed, but in the other direction the results are abysmal. Why would this be? Could it have something to do with the MSS being reported as 0?

  TCP MSS: 1460 (default)

[ 5] local 192.168.1.45 port 50801 connected to 192.168.1.30 port 5201
Starting Test: protocol: TCP, 1 streams, 131072 byte blocks, omitting 0 seconds, 10 second test, tos 0
[ ID] Interval Transfer Bitrate Retr Cwnd
[ 5] 0.00-1.00 sec 899 MBytes 7.54 Gbits/sec 45 145 KBytes
[ 5] 1.00-2.00 sec 923 MBytes 7.74 Gbits/sec 55 135 KBytes
[ 5] 2.00-3.00 sec 928 MBytes 7.78 Gbits/sec 93 138 KBytes
[ 5] 3.00-4.00 sec 908 MBytes 7.62 Gbits/sec 29 136 KBytes
[ 5] 4.00-5.00 sec 951 MBytes 7.97 Gbits/sec 53 194 KBytes
[ 5] 5.00-6.00 sec 910 MBytes 7.64 Gbits/sec 19 118 KBytes
[ 5] 6.00-7.00 sec 902 MBytes 7.56 Gbits/sec 31 140 KBytes
[ 5] 7.00-8.00 sec 967 MBytes 8.11 Gbits/sec 15 208 KBytes
[ 5] 8.00-9.00 sec 1017 MBytes 8.53 Gbits/sec 20 147 KBytes
[ 5] 9.00-10.00 sec 932 MBytes 7.81 Gbits/sec 30 208 KBytes


Test Complete. Summary Results:
[ ID] Interval Transfer Bitrate Retr
[ 5] 0.00-10.00 sec 9.12 GBytes 7.83 Gbits/sec 390 sender
[ 5] 0.00-10.00 sec 9.12 GBytes 7.83 Gbits/sec receiver
CPU Utilization: local/sender 82.4% (30.0%u/52.4%s), remote/receiver 6.3% (1.9%u/4.4%s)
snd_tcp_congestion newreno

  TCP MSS: 0 (default)

[ 4] local 192.168.1.30 port 63524 connected to 192.168.1.45 port 5201
Starting Test: protocol: TCP, 1 streams, 131072 byte blocks, omitting 0 seconds, 10 second test
[ ID] Interval Transfer Bandwidth
[ 4] 0.00-1.00 sec 23.8 MBytes 199 Mbits/sec
[ 4] 1.00-2.00 sec 27.8 MBytes 232 Mbits/sec
[ 4] 2.00-3.01 sec 34.6 MBytes 289 Mbits/sec
[ 4] 3.01-4.01 sec 36.1 MBytes 302 Mbits/sec
[ 4] 4.01-5.01 sec 37.6 MBytes 316 Mbits/sec
[ 4] 5.01-6.01 sec 41.9 MBytes 353 Mbits/sec
[ 4] 6.01-7.01 sec 41.2 MBytes 345 Mbits/sec
[ 4] 7.01-8.00 sec 46.8 MBytes 396 Mbits/sec
[ 4] 8.00-9.00 sec 37.9 MBytes 317 Mbits/sec
[ 4] 9.00-10.00 sec 29.1 MBytes 245 Mbits/sec


Test Complete. Summary Results:
[ ID] Interval Transfer Bandwidth
[ 4] 0.00-10.00 sec 357 MBytes 299 Mbits/sec sender
[ 4] 0.00-10.00 sec 357 MBytes 299 Mbits/sec receiver
CPU Utilization: local/sender 4.1% (1.7%u/2.4%s), remote/receiver 4.5% (1.1%u/3.3%s)
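
To put a number on the gap between the two runs above, here is a minimal Python sketch that pulls the rate out of an iperf3 summary line and compares the two sender lines. The helper name summary_mbits and the regular expression are my own illustration, not part of iperf3, and they assume the plain-text summary format shown in this post.

```python
import re

# Matches the rate and role at the end of a summary line such as:
# "[ 5] 0.00-10.00 sec 9.12 GBytes 7.83 Gbits/sec 390 sender"
SUMMARY_RE = re.compile(
    r"([\d.]+)\s+([KMG])bits/sec\b.*\b(sender|receiver)\s*$"
)

# Multipliers to convert the reported unit to Mbits/sec.
UNIT_TO_MBITS = {"K": 1e-3, "M": 1.0, "G": 1e3}


def summary_mbits(line):
    """Return (role, rate in Mbits/sec) for a summary line, or None."""
    m = SUMMARY_RE.search(line)
    if m is None:
        return None
    value, unit, role = m.groups()
    return role, float(value) * UNIT_TO_MBITS[unit]


# Sender summary lines copied from the two runs above.
fast_run = "[ 5] 0.00-10.00 sec 9.12 GBytes 7.83 Gbits/sec 390 sender"
slow_run = "[ 4] 0.00-10.00 sec 357 MBytes 299 Mbits/sec sender"

_, fast = summary_mbits(fast_run)
_, slow = summary_mbits(slow_run)
ratio = fast / slow
print(f"fast: {fast:.0f} Mbits/sec, slow: {slow:.0f} Mbits/sec, ratio: {ratio:.1f}x")
```

So the slow direction is running at roughly 1/26 of the fast one, which is well below even a 1 Gb link.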

The desktop runs Windows?

Yes

Do you have networking problems between the two or is this just the test issue?

On the Windows machine, open the network adapter properties and turn off Receive Segment Coalescing. See if that helps.

Who makes the network interfaces?

It was already disabled. Both machines have a Mellanox ConnectX-3.

It is something with that one Windows machine with the Mellanox ConnectX-3 in it. From another Windows machine with a 1 Gb onboard NIC I get this:
TCP MSS: 1460 (default)
[ 5] local 192.168.1.232 port 64074 connected to 192.168.1.45 port 5201
Starting Test: protocol: TCP, 1 streams, 131072 byte blocks, omitting 0 seconds, 10 second test, tos 0
[ ID] Interval Transfer Bitrate
[ 5] 0.00-1.00 sec 103 MBytes 861 Mbits/sec
[ 5] 1.00-2.01 sec 115 MBytes 956 Mbits/sec
[ 5] 2.01-3.02 sec 113 MBytes 944 Mbits/sec
[ 5] 3.02-4.01 sec 112 MBytes 949 Mbits/sec
[ 5] 4.01-5.01 sec 113 MBytes 949 Mbits/sec
[ 5] 5.01-6.01 sec 113 MBytes 950 Mbits/sec
[ 5] 6.01-7.01 sec 113 MBytes 949 Mbits/sec
[ 5] 7.01-8.00 sec 113 MBytes 950 Mbits/sec
[ 5] 8.00-9.01 sec 114 MBytes 949 Mbits/sec
[ 5] 9.01-10.00 sec 112 MBytes 949 Mbits/sec


Test Complete. Summary Results:
[ ID] Interval Transfer Bitrate
[ 5] 0.00-10.00 sec 1.10 GBytes 941 Mbits/sec sender
[ 5] 0.00-10.02 sec 1.10 GBytes 939 Mbits/sec receiver
CPU Utilization: local/sender 0.0% (0.0%u/0.0%s), remote/receiver 1.0% (0.5%u/0.6%s)

No experience with the CX-3. I used a CX-4 to replace a QLogic Fastlinq and had miserable performance in just one direction, and only with small-block QD1 I/O. I found a tuning knob (the one I mentioned above) which worked around the problem, but the real fix was running netsh int ip reset and netcfg -d to nuke and pave the entire Windows networking stack.

There must have been residue from the previous NIC or perhaps a knob I’d twisted and forgotten about.
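
If you want to script that reset, here is a rough Python sketch wrapping the two commands quoted above. The function name reset_network_stack and the dry_run switch are just my illustration. It only prints the commands unless you pass dry_run=False, and actually running them needs an elevated prompt on Windows. Expect to reboot afterwards and to redo any custom adapter settings such as static IPs.

```python
import subprocess

# The two commands from the post above, split into argument lists.
RESET_COMMANDS = [
    ["netsh", "int", "ip", "reset"],
    ["netcfg", "-d"],
]


def reset_network_stack(dry_run=True):
    """Print each reset command; run it only when dry_run is False.

    Returns the command strings in the order they were handled.
    """
    handled = []
    for cmd in RESET_COMMANDS:
        text = " ".join(cmd)
        print(text)
        if not dry_run:
            # check=True raises CalledProcessError if a command fails.
            subprocess.run(cmd, check=True)
        handled.append(text)
    return handled


# Default is a dry run: nothing is changed, the commands are only listed.
planned = reset_network_stack()
```

Keeping the dry run as the default is deliberate, since netcfg -d wipes the network configuration rather than tweaking one setting.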