Slow speed on 10Gb link

Hi, I have configured an Intel 10G 2P X520 on my TrueNAS server.
On each port, I have two different hypervisors: one is VMware and the other is Proxmox. All my MTUs are set to 9000. Everything is linked via fiber and a 10Gb switch.
When I test the speed with iperf3, all my nodes give me a speed of 3.50 Gbits/sec with the TrueNAS server. The tests between hosts all give me 9.90 Gbits/sec.
At this point, I have no doubt the issue is only with the TrueNAS server and not the hosts.
Any ideas of a setting I may have missed? Or anything else?
Thanks

Which results are you expecting? 9.9 Gbits/s?
Please elaborate on your setup, as well as your testing.

P.S.: it’s SLOG, not ZLOG as per your signature.

Yes, I expect to get closer to 9.
For my setup, as I explained, I have 4 client servers, 2 VMware and 2 Proxmox. They are all connected via fiber and my SX3008F switch. All machines/ports are set with an MTU of 9000.
I did iperf3 tests between all of them, VMware to VMware and Proxmox to Proxmox, and I get results of around 9.xx Gbits/sec everywhere except when I run iperf3 with the TrueNAS SCALE EEL server (as client or server).
My TrueNAS server has an Intel X520 card. The others are a mix of Intel, Emulex and QLogic, but as they all reach around 9 Gbits/sec in my tests, I don't think it's a hardware compatibility issue. And the adapter known to be the most compatible, the Intel, is in my SCALE server.
Do you need more information or details?
Thanks

Are those instances virtualized? If so, are they on the same machine?
Where does TN fit into this: i.e., is it a completely separate machine, or is it the host machine?
Which iperf3 command are you using?

The 5 machines are physical. And as I already mentioned, all the VMware/Proxmox hosts are connected to the TN via a fiber 10G switch.
As for the iperf3 command, it was standard, without special parameters:
iperf3 -s -B 192.168.200.253 -p 5201 (for server)
iperf3 -c 192.168.200.33 (for client)
And I adapt the IPs for the different subnets.
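If needed, I can redo the test with more options; something like this is what I would try (the stream count and duration are just examples, and the IPs are the ones from my setup above):

# on the TrueNAS box
iperf3 -s
# on one of the hypervisor hosts: 4 parallel streams for 30 s, then again with -R for the reverse direction
iperf3 -c 192.168.200.253 -P 4 -t 30
iperf3 -c 192.168.200.253 -P 4 -t 30 -R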

The X520 is a very old card that only supports PCIe Gen 2. It really needs all 8 lanes for decent performance. Your board has two Gen 3 x16-sized slots, but they only have 4 or 1 lanes wired. The lack of PCIe lanes is the main problem when not using proper server gear, which has 40 or even 128 lanes.

Either get a better mainboard (no repurposed gaming rig), or get an X540/X550 NIC. The X550 in particular is a Gen 3 card and can give decent performance with only 4 lanes (at least on one port).

PCIe 2.0 is 500 MB/s per lane, so x4 is 2000 MB/s, which converts to roughly 16 Gbps: unless my math is very wrong, his getting around 3.5 Gbps is not a simple lane issue.
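Spelled out (raw figures, ignoring PCIe and TCP overhead; PCIe 2.0 is 5 GT/s per lane with 8b/10b encoding, i.e. about 4 Gbit/s usable per lane):

# rough shell arithmetic, assuming a Gen 2 x4 link
echo "PCIe 2.0 x4: $((4 * 4)) Gbit/s"   # ~16 Gbit/s, well above the ~3.5 Gbit/s he measures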

@Janus006 did you troubleshoot any further than the iperf3 test? Swapping the cables, etc…

What if the X520 is in the x1 slot?

@Janus006 Can you verify the state of your X520 with lspci?
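Something like this should show both what the card supports (LnkCap) and what it actually negotiated (LnkSta); the 01:00.0 address is just an example, yours may differ (check lspci | grep -i ethernet first):

sudo lspci -vv -s 01:00.0 | grep -E 'LnkCap:|LnkSta:'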

I sure hope he’s using the x4 slot, but you might be on point. :sweat_smile:

@fchk
Hi, my motherboard is a PRIME B660M-A AC D4, and it has 4 x16 slots; the slowest ones are PCIe 3.0. The card is currently inserted in a 3.0 slot.
I'm not sure exactly what information you want when you talk about the state, but here is the output of lspci -vvv for the X520 ports:

I'm really not an expert with this command, but this part may be interesting?

LnkCap:Port #0, Speed 5GT/s, Width x8, ASPM L0s, Exit Latency L0s unlimited
ClockPM- Surprise- LLActRep- BwNot- ASPMOptComp-
LnkCtl:ASPM Disabled; RCB 64 bytes, Disabled- CommClk+
ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
**LnkSta:Speed 5GT/s, Width x1 (downgraded)**     


-------------------------------------------------------------
0000:01:00.1 Ethernet controller: Intel Corporation Ethernet 10G 2P X520 Adapter (rev 01)
Subsystem: Intel Corporation 10GbE 2P X520 Adapter
Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
Latency: 0, Cache Line Size: 64 bytes
Interrupt: pin B routed to IRQ 17
IOMMU group: 17
Region 0: Memory at 84580000 (64-bit, non-prefetchable) [size=512K]
Region 2: I/O ports at 6000 [size=32]
Region 4: Memory at 84700000 (64-bit, non-prefetchable) [size=16K]
Expansion ROM at 84500000 [disabled] [size=512K]
Capabilities: [40] Power Management version 3
Flags: PMEClk- DSI+ D1- D2- AuxCurrent=0mA PME(D0+,D1-,D2-,D3hot+,D3cold-)
Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=1 PME-
Capabilities: [50] MSI: Enable- Count=1/1 Maskable+ 64bit+
Address: 0000000000000000  Data: 0000
Masking: 00000000  Pending: 00000000
Capabilities: [70] MSI-X: Enable+ Count=64 Masked-
Vector table: BAR=4 offset=00000000
PBA: BAR=4 offset=00002000
Capabilities: [a0] Express (v2) Endpoint, MSI 00
DevCap:MaxPayload 512 bytes, PhantFunc 0, Latency L0s <512ns, L1 <64us
ExtTag- AttnBtn- AttnInd- PwrInd- RBE+ FLReset+ SlotPowerLimit 0W
DevCtl:CorrErr+ NonFatalErr+ FatalErr+ UnsupReq+
RlxdOrd+ ExtTag- PhantFunc- AuxPwr- NoSnoop+ FLReset-
MaxPayload 256 bytes, MaxReadReq 512 bytes
DevSta:CorrErr+ NonFatalErr- FatalErr- UnsupReq+ AuxPwr- TransPend-
LnkCap:Port #0, Speed 5GT/s, Width x8, ASPM L0s, Exit Latency L0s unlimited
ClockPM- Surprise- LLActRep- BwNot- ASPMOptComp-
LnkCtl:ASPM Disabled; RCB 64 bytes, Disabled- CommClk+
ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
LnkSta:Speed 5GT/s, Width x1 (downgraded)
TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
DevCap2: Completion Timeout: Range ABCD, TimeoutDis+ NROPrPrP- LTR-
 10BitTagComp- 10BitTagReq- OBFF Not Supported, ExtFmt- EETLPPrefix-
 EmergencyPowerReduction Not Supported, EmergencyPowerReductionInit-
 FRS- TPHComp- ExtTPHComp-
 AtomicOpsCap: 32bit- 64bit- 128bitCAS-
DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis- LTR- 10BitTagReq- OBFF Disabled,
 AtomicOpsCtl: ReqEn-
LnkSta2: Current De-emphasis Level: -6dB, EqualizationComplete- EqualizationPhase1-
 EqualizationPhase2- EqualizationPhase3- LinkEqualizationRequest-
 Retimer- 2Retimers- CrosslinkRes: unsupported
Capabilities: [e0] Vital Product Data
Product Name: X520 10GbE Controller
Read-only fields:
[PN] Part number: G73131
[MN] Manufacture ID: 1028
[V0] Vendor specific: FFV15.0.27
[V1] Vendor specific: DSV1028VPDR.VER1.0
[V3] Vendor specific: DTINIC
[V4] Vendor specific: DCM10010081D521010081D5
[V5] Vendor specific: NPY2
[V6] Vendor specific: PMT12345678
[V7] Vendor specific: NMVIntel Corp
[RV] Reserved: checksum good, 0 byte(s) reserved
End
Capabilities: [100 v1] Advanced Error Reporting
UESta:DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
UEMsk:DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
UESvrt:DLP+ SDES- TLP- FCP+ CmpltTO- CmpltAbrt- UnxCmplt- RxOF+ MalfTLP+ ECRC- UnsupReq- ACSViol-
CESta:RxErr- BadTLP- BadDLLP- Rollover- Timeout- AdvNonFatalErr+
CEMsk:RxErr- BadTLP- BadDLLP- Rollover- Timeout- AdvNonFatalErr+
AERCap:First Error Pointer: 00, ECRCGenCap+ ECRCGenEn- ECRCChkCap+ ECRCChkEn-
MultHdrRecCap- MultHdrRecEn- TLPPfxPres- HdrLogCap-
HeaderLog: 00000000 00000000 00000000 00000000
Capabilities: [140 v1] Device Serial Number a0-36-9f-ff-ff-4b-01-a8
Capabilities: [150 v1] Alternative Routing-ID Interpretation (ARI)
ARICap:MFVC- ACS-, Next Function: 0
ARICtl:MFVC- ACS-, Function Group: 0
Capabilities: [160 v1] Single Root I/O Virtualization (SR-IOV)
IOVCap:Migration- 10BitTagReq- Interrupt Message Number: 000
IOVCtl:Enable- Migration- Interrupt- MSE- ARIHierarchy- 10BitTagReq-
IOVSta:Migration-
Initial VFs: 64, Total VFs: 64, Number of VFs: 0, Function Dependency Link: 01
VF offset: 128, stride: 2, Device ID: 10ed
Supported Page Size: 00000553, System Page Size: 00000001
Region 0: Memory at 00000060e0100000 (64-bit, prefetchable)
Region 3: Memory at 00000060e0000000 (64-bit, prefetchable)
VF Migration: offset: 00000000, BIR: 0
Kernel driver in use: ixgbe
Kernel modules: ixgbe
---------------------------------------------------

Physically, they are x16 slots, which means all types of cards fit.

Electrically, they do not have the full number of PCIe lanes: you have x16 for the graphics card slot, one x4, and two x1.

Many boards have x1- and x4-sized slots, so you can see that immediately; server boards often have open-ended slots and leave the area behind them unused.
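If the manual is ambiguous, the board itself can report how the slots are wired; a quick check that should work on most Linux systems (needs root, and the field labels vary a bit by BIOS):

sudo dmidecode -t slot | grep -E 'Designation|Type|Current Usage'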

Moving the X520 to slot 2 should solve your issue.

Thank you very much for pointing me to this specification. I didn't notice this when I bought the card. (a bit disappointed)
As you pointed out, with the card negotiating only PCIe 2.0 x1, the theoretical speed of the link is about 500 MB/s, roughly 4 Gbit/s, which is almost exactly what I get.

Currently, my X520 is in slot 2:
Slot 1 is empty
Slot 2 is used by my X520
Slot 3 is empty
Slot 4 is used by my HBA

I think I will have to make a choice: speed for the disks or for the network. Either way, I will have a bottleneck somewhere, and it will affect data access. As my disks are almost all HDDs, I think the speed should go to the X520. (Am I right? Any better idea?)

As I'm currently on another project, this will give me time to think and plan.

It’s been a long time since I’ve had such support from the community in general. VERY appreciated

Any reason you're not using the full 16-lane slot for something? And which HBA? Using an x1 slot for a PCIe Gen 2 HBA isn't going to do you many favors for speed.

And welcome to modern consumer motherboards. Almost all the CPU/chipset lanes are designated for NVMe storage.

Because previously, this slot was used by my P600 video card. When I removed it, I didn't read the spec carefully, and I didn't realize that the slots were not all x16.
I have now switched my 10G card to slot 1 and my HBA to slot 2. I get a nice 10Gb speed on iperf3.

Thanks all for your help, even if it meant telling me I was too "stupid" to read the spec correctly. I will take more care next time before buying.

That's great! Most people only ever add a GPU to their home computers, so they never learn much about the actual lanes on different slots, let alone that old server parts are often wired at x8 and an older PCIe generation.