Transfer speeds cap at ~20 Mb/s on NVMe

Hey everyone, my problem seems to be common, but none of the solutions presented in the other topics applied (or at least I couldn’t find anything that solved my problem).

I have 2 NVMe SSDs:

  • One with 512 GB
  • And another with 250 GB

My problem is that both of these drives are having the same issue. They get an initial transfer spike of 1.0 Gbps, but after a while they throttle down to ~20 Mbps or lower. The SSD sometimes tries to climb back to ~300 Mbps, but it drops down again.

I’m transferring a single big file of 50 GB as a test (But I also tried to transfer multiple 50 MB files and the result was the same).

I tried using them as STRIPE, individual STRIPE Pools, and in MIRROR, just for debugging, and the result was always the same for both of them.

I have another pool of 4x480 GB SATA SSDs in RAIDZ1, and I’m able to get speeds of a constant ~450 MBps with this one, with no problem (And around ~700 MBps using all disks in STRIPE).

I also tried running iperf3 between TrueNAS and my computer, and my network is not the problem, since I’m also able to get a 1.0 Gbps spike in the beginning of the transfer, as shown above.

Also, is it normal for iperf3 to only reach ~7.8 Gbps on a 10 Gbps network?

In addition, I also tried the following:

  • Enabling jumbo frames (larger MTU) on every machine (including my network switch).
  • I tried transferring using Wi-Fi instead of a cable.
  • Changing the Sync type and the Record Size

If anyone knows how to solve this problem, you will have my eternal gratitude.

What are the exact models of those two NVMe SSDs?

  • KINGSTON SNVS250G: This is the Kingston NV1 250 GB model.
  • NE-512 2280: This is the PNY CS2241 512 GB model.

iperf3 should be hitting higher on a 10 Gbps network, as long as you have nothing else going on at the time. The network needs to be quiet.
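If you want to sanity-check that, here is a minimal iperf3 sketch (the address is a placeholder for your NAS IP, and the server side runs on TrueNAS first):

# On the NAS:
iperf3 -s

# From the client (replace 192.168.1.10 with the NAS address):
iperf3 -c 192.168.1.10 -t 30          # single TCP stream
iperf3 -c 192.168.1.10 -t 30 -P 4     # four parallel streams
iperf3 -c 192.168.1.10 -t 30 -R       # reverse direction (NAS -> client)

If four parallel streams get noticeably closer to 10 Gbps than a single one, the limit is more likely per-stream CPU/TCP overhead than the link itself.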

You left out all the detail of your system and network. TrueNAS Core or TrueNAS Scale, what version?

You need to be very detailed, as we can’t guess your setup or exactly how you are testing. MB and Mb (megabytes vs. megabits) make a huge difference.

I apologize for the oversight; I’m quite new to TrueNAS. When I said MB, I actually meant Mb.

My system is as follows:

  • OS Version: 25.04.1
  • Product: Default string
  • Model: Intel(R) Xeon(R) CPU E5-2680 v4 @ 2.40GHz
  • Memory: 31 GiB
  • System Serial: Default string
  • NIC: X540-t2

Am I missing any info?

I don’t know these models offhand, but I’d check online to see if they are the infamous fast-cache/slow-flash style: a small, fast flash cache on the front end that is periodically flushed into much slower flash in the rear. Sort of like an SMR drive, except solid state.

Such cached drives can do really well in benchmarks but stink under sustained workflows. Some OEMs like WD were also caught modifying / cheapening their flash lineup during COVID and hoping no one would notice.

The fact that they start fast and slow down is what points to this sort of problem. If the drives are in a mirror, it’ll be the slower drive holding back the faster one.
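If you want to see that cache cliff directly on the NAS, here is a rough fio sketch, assuming a scratch dataset mounted at /mnt/testpool/scratch (a placeholder path) and a write large enough to outrun the SLC cache:

# Long sequential write with per-second bandwidth logging
fio --name=slc_cliff --directory=/mnt/testpool/scratch --size=64G --bs=1M \
    --rw=write --ioengine=libaio --direct=1 --numjobs=1 --iodepth=4 \
    --write_bw_log=slc_cliff --log_avg_msec=1000

# Inspect slc_cliff_bw.1.log afterwards; a sharp, lasting drop partway
# through the run is the SLC cache running dry.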

If it’s not the drives, I’d check your network setup, the cables in particular. I’ve had some pretty bad luck with MikroTik switches with SFP ports talking to SFP+-equipped TrueNAS machines fitted with a 1GbE transceiver. Auto-negotiation simply wouldn’t happen.

So connect your laptop straight to the NAS to rule out all other parts of the chain, then test your way out from there, ideally with random data. Straight zeros or ones are too easily compressed, which skews benchmarks. Good luck!
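For the random data part, a simple way to build an incompressible test file on a Linux client (the path and size are just examples):

# ~10 GiB of random data; /dev/urandom output will not compress
dd if=/dev/urandom of=random_10G.bin bs=1M count=10240 status=progress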

I’m starting to think that my network could have some bottleneck. Only after writing this topic did I notice the relation between the 7.8 Gbps in iperf and the ~700 Mbps transfer speed in the other pool.

Regarding one drive slowing down the other, I also tested them both separately, and they both had the same results.

What’s bugging me is the 20 Mbps (why so slow??). Even with a 1 Gbps network or a subpar NVMe, wouldn’t it be at least a bit faster than this?

How are you attaching all the drives? What motherboard? Expand @Constantin’s ‘My system’ to get an idea of the details we are looking for. The network setup is important, as is how you are performing the tests. SMB or NFS? Have you checked the drives on a Windows computer using the manufacturer’s software?

The board I’m using is a MACHINIST X99 MR9A V1.0 ATX. I’m using the SMB protocol.

I remember trying these NVMe drives on my computer a long time ago, and I think I had no problems with them, but I could be mistaken.

I’ve been delaying having to open the server to remove them, as it’s on a hard-to-access shelf.

I think it’s the drives, given that another pool in the same server doesn’t have these issues.

If it was the server or network, it would manifest on every transfer attempt. But it only happens with those drives. Seriously, check into their construction. See if they are the infamous dual cache type.


That is the worst scenario for me.

I will try to reverse things: remove these drives from the server, install them in my PC, and try the same file transfer to check whether they behave similarly.

Those drives are quite low-end/low-spec. Both are DRAM-less and borrow system RAM (a host memory buffer), and both use slow QLC chips.

  • Kingston NV1 250GB is rated for only 60 TBW (terabytes written), which is pretty bad - more like a bad joke. The 1TB version has a 243 GB SLC cache; not sure about the 250GB version (62 GB, given the QLC-to-SLC bit ratio?). Write speed after exhausting the SLC cache is <100 MB/s.
  • PNY CS2241 512GB is rated for 160 TBW (better, but still bad). Write speed after exhausting the SLC cache is 40-46 MB/s.

Now, these SSDs are slow, but they shouldn’t be 20 Mbps (2.5 MB/s) slow. Or even 20 MB/s. But perhaps there are many factors in play, such as usage of the SLC cache, the current state of the SSDs and TRIM support, communication with host RAM, PCI-E lane allocation, etc. The problem is, you are troubleshooting a performance issue on already subpar hardware. A single 1TB NVMe TLC SSD with good performance and proven reliability (such as the Kingston KC3000, among many others) would give you more space, much better performance, and much more durability.
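On the TRIM point, it is easy to check from the TrueNAS shell whether the pool is being trimmed at all (the pool name below is a placeholder):

# Is automatic TRIM enabled on the pool?
zpool get autotrim SlowSSDPool

# Enable it and/or kick off a one-time manual TRIM
zpool set autotrim=on SlowSSDPool
zpool trim SlowSSDPool

# -t adds per-device TRIM status to the usual pool status output
zpool status -t SlowSSDPool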

The next thing here is the motherboard. It being an unknown brand, there is no real, useful documentation of how the PCI-Express lanes are allocated. Also, when I check their site here, one of the M.2 slots only supports SATA…? If that is accurate, how is the second drive connected?


I have updates on this problem. Sorry it took so long; I had a busy week in the studio, and my main computer just died, so I’m having a rough week.

But I was able to take both of these SSDs out of the server and test them in an M.2 to USB-C adapter, and I was able to reach 300 to 400 Mb/s continuously on a Windows machine (with the exact same file I was testing). I’m now completely lost, with no clue what it could be.

The last thing that comes to mind is a lack of PCI-E lanes. I’m using a PCI-E to 6x SATA adapter in the top x1 slot and an X540-T2 NIC in the bottom x16 slot, and I’m using all 6 SATA ports on my server motherboard (Machinist X99 MR9A).

Any ideas?

I would review the motherboard PCIe bus block diagram to see if there is something funky going on with the interfaces you’re having trouble with.

I would also look into the BIOS settings and see if there is something there that is causing your interfaces to slow down - sleep modes, power-saving states, and the like. Max it all out and see if that makes a difference.
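As a rough check from the OS side (TrueNAS SCALE is Linux-based, so these should exist, but treat this as a sketch), you can see whether PCIe link power management is in play:

# Current ASPM policy, if the kernel exposes it (e.g. default / performance / powersave)
cat /sys/module/pcie_aspm/parameters/policy

# Per-device ASPM capability and negotiated link state
sudo lspci -vv | grep -iE 'aspm|lnksta:'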

The problem with that board is that it is an unknown Chinese copy or repurposed board without any support. You will not find any diagrams. They can’t even provide relevant information on their web page. They list the chipset as B85, when it is clearly an X99 board. They list one of the M.2 slots as SATA, but then two NVMe drives would not work - unless the original poster is using a PCI-E adapter card. It is a nightmare to debug problems on hardware like this because you cannot trust anything.

I don’t think it has anything to do with PCI-E lanes. X99-era chipsets had PCI-E 3.0, which means even an x1 slot is capable of roughly 1 GB/s (8 GT/s with 128b/130b encoding ≈ 985 MB/s per lane), much faster than the speeds he is getting. One step lower would mean zero PCI-E lanes assigned = a non-working disk.

I re-read the topic from the beginning and I don’t think you have tested the speed of those SSDs directly on TrueNAS…? These would be my next troubleshooting steps:

  1. In the TrueNAS GUI, open System > Shell and execute the lspci command. It will give you a short list of devices. Confirm that your slow SSDs are in there. Then execute lspci -vv. It will produce a lot more information. Find your SSDs and confirm their PCI-E link bandwidth on the LnkSta line. For example, in my case it is Speed 8GT/s, Width x4 (8 GT/s is PCI-E 3.0). If possible, copy-paste the output for those two drives here.

  2. Test what the SSDs are capable of when written to from localhost, i.e., we need to separate system troubleshooting from network troubleshooting. I would create an empty dataset on the pool residing on one or both SSDs and then run some fio write tests:

# Create an empty dataset with compression disabled
zfs create -o compression=off SlowSSDPool/SpeedTest

# Note the mountpoint (e.g., /mnt/SlowSSDPool/SpeedTest)
zfs get mountpoint SlowSSDPool/SpeedTest

# Proceed with sequential write test (you can also test with --iodepth=16)
fio --name=zfs_seq_write_test --directory=/mnt/SlowSSDPool/SpeedTest --size=4G --bs=1M --rw=write --ioengine=libaio --direct=1 --numjobs=1 --iodepth=1 --time_based --runtime=60 --end_fsync=1 --filename=fio_testfile --zero_buffers=0

# Cleanup
zfs unmount SlowSSDPool/SpeedTest
zfs destroy SlowSSDPool/SpeedTest

Here are the results I got:

02:00.0 Non-Volatile memory controller: Kingston Technology Company, Inc. NV1 NVMe SSD [SM2263XT] (DRAM-less) (rev 03) (prog-if 02 [NVM Express])
Subsystem: Kingston Technology Company, Inc. NV1 NVMe SSD [SM2263XT] (DRAM-less)
Physical Slot: 0
Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR+ FastB2B- DisINTx+
Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- SERR- <PERR- INTx-
Latency: 0, Cache Line Size: 32 bytes
Interrupt: pin A routed to IRQ 32
NUMA node: 0
IOMMU group: 43
Region 0: Memory at fbe00000 (64-bit, non-prefetchable) [size=16K]
Capabilities:
Kernel driver in use: nvme
Kernel modules: nvme


admin@truenas[~]$ sudo fio --name=zfs_seq_write_test --directory=/mnt/SlowSSDPool/SpeedTest --size=4G --bs=1M --rw=write --ioengine=libaio --direct=1 --numjobs=1 --iodepth=1 --time_based --runtime=60 --end_fsync=1 --filename=fio_testfile --zero_buffers=0
zfs_seq_write_test: (g=0): rw=write, bs=(R) 1024KiB-1024KiB, (W) 1024KiB-1024KiB, (T) 1024KiB-1024KiB, ioengine=libaio, iodepth=1
fio-3.33
Starting 1 process
zfs_seq_write_test: Laying out IO file (1 file / 4096MiB)
Jobs: 1 (f=1): [F(1)][100.0%][eta 00m:00s]
zfs_seq_write_test: (groupid=0, jobs=1): err= 0: pid=32636: Fri Aug 8 17:05:14 2025
write: IOPS=102, BW=103MiB/s (108MB/s)(10.8GiB/107366msec); 0 zone resets
slat (usec): min=181, max=10447, avg=5427.25, stdev=3480.15
clat (nsec): min=1895, max=40670, avg=6620.93, stdev=2772.64
lat (usec): min=183, max=10478, avg=5433.87, stdev=3482.01
clat percentiles (nsec):
| 1.00th=[ 1992], 5.00th=[ 2096], 10.00th=[ 2224], 20.00th=[ 3504],
| 30.00th=[ 7264], 40.00th=[ 7392], 50.00th=[ 7520], 60.00th=[ 7648],
| 70.00th=[ 7776], 80.00th=[ 7904], 90.00th=[ 8096], 95.00th=[ 8256],
| 99.00th=[10304], 99.50th=[20864], 99.90th=[31360], 99.95th=[33536],
| 99.99th=[39680]
bw ( KiB/s): min=106496, max=3837952, per=100.00%, avg=188674.15, stdev=369595.99, samples=119
iops : min= 104, max= 3748, avg=184.25, stdev=360.93, samples=119
lat (usec) : 2=1.22%, 4=19.93%, 10=77.79%, 20=0.51%, 50=0.55%
cpu : usr=0.44%, sys=4.92%, ctx=14727, majf=0, minf=11
IO depths : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
issued rwts: total=0,11034,0,0 short=0,0,0,0 dropped=0,0,0,0
latency : target=0, window=0, percentile=100.00%, depth=1

Run status group 0 (all jobs):
WRITE: bw=103MiB/s (108MB/s), 103MiB/s-103MiB/s (108MB/s-108MB/s), io=10.8GiB (11.6GB), run=107366-107366msec

When copy-pasting this stuff, select the whole block of text and format it as Preformatted text (the “</>” icon). It will keep the text formatting intact = better readability.

Was the output for that SSD produced with the lspci -vv command (i.e., double v)? If so, the disk apparently reports much less information. For example, this is my output:

lspci output
04:00.0 Non-Volatile memory controller: ADATA Technology Co., Ltd. XPG SX8200 Pro PCIe Gen3x4 M.2 2280 Solid State Drive (rev 03) (prog-if 02 [NVM Express])
        Subsystem: ADATA Technology Co., Ltd. XPG SX8200 Pro PCIe Gen3x4 M.2 2280 Solid State Drive
        Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
        Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
        Latency: 0, Cache Line Size: 64 bytes
        Interrupt: pin A routed to IRQ 30
        Region 0: Memory at fcf00000 (64-bit, non-prefetchable) [size=16K]
        Capabilities: [40] Power Management version 3
                Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)
                Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=0 PME-
        Capabilities: [50] MSI: Enable- Count=1/8 Maskable+ 64bit+
                Address: 0000000000000000  Data: 0000
                Masking: 00000000  Pending: 00000000
        Capabilities: [70] Express (v2) Endpoint, MSI 00
                DevCap: MaxPayload 256 bytes, PhantFunc 0, Latency L0s unlimited, L1 unlimited
                        ExtTag- AttnBtn- AttnInd- PwrInd- RBE+ FLReset+ SlotPowerLimit 75W
                DevCtl: CorrErr+ NonFatalErr+ FatalErr+ UnsupReq+
                        RlxdOrd+ ExtTag- PhantFunc- AuxPwr- NoSnoop- FLReset-
                        MaxPayload 256 bytes, MaxReadReq 512 bytes
                DevSta: CorrErr+ NonFatalErr- FatalErr- UnsupReq+ AuxPwr+ TransPend-
                LnkCap: Port #0, Speed 8GT/s, Width x4, ASPM L1, Exit Latency L1 <8us
                        ClockPM+ Surprise- LLActRep- BwNot- ASPMOptComp+
                LnkCtl: ASPM Disabled; RCB 64 bytes, Disabled- CommClk+
                        ExtSynch- ClockPM+ AutWidDis- BWInt- AutBWInt-
                LnkSta: Speed 8GT/s, Width x4
                        TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
                DevCap2: Completion Timeout: Range ABCD, TimeoutDis+ NROPrPrP- LTR+
                         10BitTagComp- 10BitTagReq- OBFF Not Supported, ExtFmt- EETLPPrefix-
                         EmergencyPowerReduction Not Supported, EmergencyPowerReductionInit-
                         FRS- TPHComp- ExtTPHComp-
                         AtomicOpsCap: 32bit- 64bit- 128bitCAS-
                DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis- LTR- 10BitTagReq- OBFF Disabled,
                         AtomicOpsCtl: ReqEn-
                LnkCap2: Supported Link Speeds: 2.5-8GT/s, Crosslink- Retimer- 2Retimers- DRS-
                LnkCtl2: Target Link Speed: 8GT/s, EnterCompliance- SpeedDis-
                         Transmit Margin: Normal Operating Range, EnterModifiedCompliance- ComplianceSOS-
                         Compliance Preset/De-emphasis: -6dB de-emphasis, 0dB preshoot
                LnkSta2: Current De-emphasis Level: -3.5dB, EqualizationComplete+ EqualizationPhase1+
                         EqualizationPhase2+ EqualizationPhase3+ LinkEqualizationRequest-
                         Retimer- 2Retimers- CrosslinkRes: unsupported
        Capabilities: [b0] MSI-X: Enable+ Count=16 Masked-
                Vector table: BAR=0 offset=00002000
                PBA: BAR=0 offset=00002100
        Capabilities: [100 v2] Advanced Error Reporting
                UESta:  DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
                UEMsk:  DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
                UESvrt: DLP+ SDES+ TLP- FCP+ CmpltTO- CmpltAbrt- UnxCmplt- RxOF+ MalfTLP+ ECRC- UnsupReq- ACSViol-
                CESta:  RxErr- BadTLP- BadDLLP- Rollover- Timeout+ AdvNonFatalErr-
                CEMsk:  RxErr- BadTLP- BadDLLP- Rollover- Timeout- AdvNonFatalErr+
                AERCap: First Error Pointer: 00, ECRCGenCap+ ECRCGenEn- ECRCChkCap+ ECRCChkEn-
                        MultHdrRecCap- MultHdrRecEn- TLPPfxPres- HdrLogCap-
                HeaderLog: 00000000 00000000 00000000 00000000
        Capabilities: [158 v1] Secondary PCI Express
                LnkCtl3: LnkEquIntrruptEn- PerformEqu-
                LaneErrStat: 0
        Capabilities: [178 v1] Latency Tolerance Reporting
                Max snoop latency: 0ns
                Max no snoop latency: 0ns
        Capabilities: [180 v1] L1 PM Substates
                L1SubCap: PCI-PM_L1.2+ PCI-PM_L1.1+ ASPM_L1.2+ ASPM_L1.1+ L1_PM_Substates+
                          PortCommonModeRestoreTime=10us PortTPowerOnTime=10us
                L1SubCtl1: PCI-PM_L1.2- PCI-PM_L1.1- ASPM_L1.2- ASPM_L1.1-
                           T_CommonMode=0us LTR1.2_Threshold=0ns
                L1SubCtl2: T_PwrOn=10us
        Kernel driver in use: nvme
        Kernel modules: nvme

I was hoping for information like the following (a filtered one-liner is sketched right after this list):

  • LnkCap and LnkCap2 (PCI-E link speed capability)
  • LnkSta and LnkSta2 (actual negotiated PCI-E link speed and stability)
  • UESta (Uncorrectable Errors status; should all be with a minus sign)
  • CESta (Correctable Errors status; also should be negative, I have one Timeout error present)
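If the full -vv dump is unwieldy, a filtered one-liner like this should be enough (02:00.0 is the bus address from your earlier paste; adjust it for the second drive):

sudo lspci -s 02:00.0 -vv | grep -E 'LnkCap|LnkSta|UESta|CESta'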

Aside from this, we can analyze the fio results. The 103 MiB/s speed is expected of this disk after it has exhausted the SLC cache (as measured in this review). The strange thing is that you see it from the beginning, as if the SLC cache were not being used at all…? I am not sure how that is possible, as the cache should be fully transparent to TrueNAS, with only the SSD controller aware of it. Also, because this SSD is DRAM-less, it uses system memory (RAM), and we can check whether it really does. Run the command nvme id-ctrl /dev/nvme0 (change the device path if necessary) and look for the parameters hmpre and hmmin (preferred and minimum required host memory buffer size). On my SSD, which has its own DRAM, both are 0 and that is fine. Yours should not be 0 at all.
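For convenience, the same check filtered down to just those two fields (same device path as above; as far as I know, nvme-cli is included in TrueNAS SCALE):

sudo nvme id-ctrl /dev/nvme0 | grep -E 'hmpre|hmmin'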

Now that we’ve got all that out of the way, here is the real deal: your slow SSD speed might be normal after all :thinking:.
I wanted to compare your speeds and latencies to my SSD, and guess what: unexpectedly, I got a 162 MiB/s result and latencies all over the place, same as you. I mean, my “slow” RAID-Z2 pool consisting of 6 SATA HDDs can give me 650 MB/s sustained. I started digging and found out this can be caused by “sync writes”. These are slow, acknowledged writes to the disk (good for data integrity, bad for speed). This is a very good topic with many benchmarks - with sync writes on/off for comparison. Check also the Similar threads at the bottom of that topic. So, these speeds look normal for consumer-grade NVMe drives in sync-write mode.
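If you want to confirm this on your side, here is a quick sketch (the dataset name is a placeholder; sync=disabled is for testing only, since it trades the safety of in-flight writes for speed):

# Check the current sync setting
zfs get sync SlowSSDPool/SpeedTest

# Disable sync writes temporarily, re-run the fio test from earlier, then restore
zfs set sync=disabled SlowSSDPool/SpeedTest
zfs set sync=standard SlowSSDPool/SpeedTest

A big difference between the two runs would point at sync writes being the bottleneck.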

Edit: All this still doesn’t explain the slow SMB transfer. I just tested copying a large file over SMB to my “slow” SSD (the one with the 162 MiB/s fio result), and the file was copied at 283 MB/s (fully saturating the 2.5 Gbit Ethernet connection between my desktop PC and the NAS). So, yeah… we are still back at the beginning :neutral_face:.