SSD pool extremely slow

I have 2 Nvme SSD pools on my NAS, and both are getting well below expected performance levels. How do I optimize them?

System Specs:
Ryzen 5950X, 128GB RAM

Pool 1: 2x Crucial P5 Plus, mirror. Expected to get 14GB/s read and 5GB/s write; actual performance is about 900MB/s for both read and write. VMs on this pool also take extremely long to boot, sometimes >1min for my Windows 10 VM.

Pool 2: 3x Kioxia Exceria G2, RAIDZ1. Expected to get about 4GB/s read and 4GB/s write; actual is about 2.5GB/s for both read and write.

Compression is LZ4 for both pools.

What should I change to get better performance out of these pools?

Edit: Performance was tested using hdparm running directly on the TrueNAS host without any virtualization. The numbers are sequential access speeds.

SSDs are attached in the following manner:
One P5 Plus is directly attached to the CPU via a PCIe 4.0 x4 M.2 slot.
The other 4 SSDs are attached through the same PLX88024 PCIe 4.0 switch. Each SSD has a x4 link to the switch, and the switch has a direct PCIe 4.0 x8 connection to the CPU.

Motherboard is an ASUS B550 ProArt board.
RAM is running at 3200MT/s, the rated speed of all the modules.

You will need to specify exactly how you are testing your drives, along with your hardware and how the NVMe drives are physically connected to the computer.

There is a lot that goes into performance testing.

Have you read this:


Updated original post with how the numbers were obtained and how the SSDs are connected.

What are the exact commands you are using?

You should probably be using fio as your benchmark tool. It works well with ZFS pools.

From the CLI run fio --ramp_time=5 --gtod_reduce=1 --numjobs=1 --bs=1M --size=100G --runtime=60s --readwrite=write --name=testfile

This will run for 60 seconds, lay out a large file called testfile, and generate a nice little report.
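To get the read number, run the same command again with --readwrite=read from inside a dataset on the pool under test; it will reuse the testfile laid out by the write pass. A minimal sketch (the dataset path is just a placeholder, substitute your own):

cd /mnt/yourPool/yourDataset
fio --ramp_time=5 --gtod_reduce=1 --numjobs=1 --bs=1M --size=100G --runtime=60s --readwrite=read --name=testfile

Remember to delete testfile afterwards, since it takes up to 100G of space on the pool.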

These are my results from my NVMe system.

root@truenas:/mnt/farm/scripts/DS_Logs2# fio --ramp_time=5 --gtod_reduce=1 --numjobs=1 --bs=1M --size=100G --runtime=60s --readwrite=write --name=testfile
testfile: (g=0): rw=write, bs=(R) 1024KiB-1024KiB, (W) 1024KiB-1024KiB, (T) 1024KiB-1024KiB, ioengine=psync, iodepth=1
fio-3.33
Starting 1 process
testfile: Laying out IO file (1 file / 102400MiB)
Jobs: 1 (f=1): [W(1)][100.0%][w=1085MiB/s][w=1085 IOPS][eta 00m:00s]
testfile: (groupid=0, jobs=1): err= 0: pid=420167: Mon Sep 22 13:46:37 2025
  write: IOPS=1058, BW=1059MiB/s (1110MB/s)(62.1GiB/60032msec); 0 zone resets
   bw (  MiB/s): min=  973, max= 1122, per=100.00%, avg=1059.11, stdev=30.62, samples=120
   iops        : min=  973, max= 1122, avg=1059.07, stdev=30.64, samples=120
  cpu          : usr=0.38%, sys=11.89%, ctx=150339, majf=0, minf=38
  IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwts: total=0,63563,0,0 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=1

Run status group 0 (all jobs):
  WRITE: bw=1059MiB/s (1110MB/s), 1059MiB/s-1059MiB/s (1110MB/s-1110MB/s), io=62.1GiB (66.7GB), run=60032-60032msec

Results for the Kioxia 4TB pool:
Write:

admin@truenas[/mnt/Apps2]$ sudo fio --ramp_time=5 --gtod_reduce=1 --numjobs=1 --bs=1M --size=100G --runtime=60s --readwrite=write --name=testfile 
testfile: (g=0): rw=write, bs=(R) 1024KiB-1024KiB, (W) 1024KiB-1024KiB, (T) 1024KiB-1024KiB, ioengine=psync, iodepth=1
fio-3.33
Starting 1 process
Jobs: 1 (f=1): [W(1)][87.8%][w=2353MiB/s][w=2353 IOPS][eta 00m:06s]
testfile: (groupid=0, jobs=1): err= 0: pid=2043838: Tue Sep 23 06:18:46 2025
  write: IOPS=2350, BW=2350MiB/s (2464MB/s)(85.7GiB/37351msec); 0 zone resets
   bw (  MiB/s): min= 1912, max= 3688, per=99.96%, avg=2349.09, stdev=289.97, samples=74
   iops        : min= 1912, max= 3688, avg=2348.99, stdev=289.97, samples=74
  cpu          : usr=1.77%, sys=39.95%, ctx=95720, majf=0, minf=38
  IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwts: total=0,87776,0,0 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=1

Run status group 0 (all jobs):
  WRITE: bw=2350MiB/s (2464MB/s), 2350MiB/s-2350MiB/s (2464MB/s-2464MB/s), io=85.7GiB (92.0GB), run=37351-37351msec

Read:

admin@truenas[/mnt/Apps2]$ sudo fio --ramp_time=5 --gtod_reduce=1 --numjobs=1 --bs=1M --size=100G --runtime=60s --readwrite=read --name=testfile
testfile: (g=0): rw=read, bs=(R) 1024KiB-1024KiB, (W) 1024KiB-1024KiB, (T) 1024KiB-1024KiB, ioengine=psync, iodepth=1
fio-3.33
Starting 1 process
Jobs: 1 (f=1): [R(1)][96.7%][r=1886MiB/s][r=1886 IOPS][eta 00m:02s]
testfile: (groupid=0, jobs=1): err= 0: pid=2044764: Tue Sep 23 06:19:52 2025
  read: IOPS=1859, BW=1860MiB/s (1950MB/s)(95.0GiB/52280msec)
   bw (  MiB/s): min= 1496, max= 2096, per=100.00%, avg=1859.86, stdev=133.07, samples=104
   iops        : min= 1496, max= 2096, avg=1859.81, stdev=133.05, samples=104
  cpu          : usr=0.18%, sys=89.80%, ctx=32491, majf=0, minf=37
  IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwts: total=97235,0,0,0 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=1

Run status group 0 (all jobs):
   READ: bw=1860MiB/s (1950MB/s), 1860MiB/s-1860MiB/s (1950MB/s-1950MB/s), io=95.0GiB (102GB), run=52280-52280msec

2TB Crucial Pool

Write:

admin@truenas[/mnt/App]$ sudo fio --ramp_time=5 --gtod_reduce=1 --numjobs=1 --bs=1M --size=100G --runtime=60s --readwrite=write --name=testfile
testfile: (g=0): rw=write, bs=(R) 1024KiB-1024KiB, (W) 1024KiB-1024KiB, (T) 1024KiB-1024KiB, ioengine=psync, iodepth=1
fio-3.33
Starting 1 process
testfile: Laying out IO file (1 file / 102400MiB)
Jobs: 1 (f=1): [W(1)][100.0%][w=172MiB/s][w=172 IOPS][eta 00m:00s] 
testfile: (groupid=0, jobs=1): err= 0: pid=2048662: Tue Sep 23 06:23:31 2025
  write: IOPS=695, BW=695MiB/s (729MB/s)(40.8GiB/60129msec); 0 zone resets
   bw (  KiB/s): min=124928, max=1708032, per=100.00%, avg=713325.61, stdev=582429.86, samples=120
   iops        : min=  122, max= 1668, avg=696.57, stdev=568.78, samples=120
  cpu          : usr=0.52%, sys=9.70%, ctx=43043, majf=0, minf=41
  IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwts: total=0,41817,0,0 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=1

Run status group 0 (all jobs):
  WRITE: bw=695MiB/s (729MB/s), 695MiB/s-695MiB/s (729MB/s-729MB/s), io=40.8GiB (43.8GB), run=60129-60129msec

Read:

admin@truenas[/mnt/App]$ sudo fio --ramp_time=5 --gtod_reduce=1 --numjobs=1 --bs=1M --size=100G --runtime=60s --readwrite=read --name=testfile
testfile: (g=0): rw=read, bs=(R) 1024KiB-1024KiB, (W) 1024KiB-1024KiB, (T) 1024KiB-1024KiB, ioengine=psync, iodepth=1
fio-3.33
Starting 1 process
Jobs: 1 (f=1): [R(1)][93.5%][r=3386MiB/s][r=3386 IOPS][eta 00m:03s]
testfile: (groupid=0, jobs=1): err= 0: pid=2050283: Tue Sep 23 06:25:13 2025
  read: IOPS=2539, BW=2539MiB/s (2663MB/s)(93.7GiB/37803msec)
   bw (  MiB/s): min= 1342, max= 3414, per=99.77%, avg=2533.47, stdev=633.20, samples=75
   iops        : min= 1342, max= 3414, avg=2533.44, stdev=633.19, samples=75
  cpu          : usr=0.20%, sys=99.15%, ctx=625, majf=0, minf=37
  IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwts: total=95994,0,0,0 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=1

Run status group 0 (all jobs):
   READ: bw=2539MiB/s (2663MB/s), 2539MiB/s-2539MiB/s (2663MB/s-2663MB/s), io=93.7GiB (101GB), run=37803-37803msec

Where are you getting this estimated speed from? This looks more like the manufacturer specs for a single device, not the spec for a ZFS pool. 14GB/s for a mirror read looks like the theoretical maximum of both drives added together (7GB/s + 7GB/s), which is not how it actually works. We all wish it did, but it doesn’t.

As for the Crucial P5 drives, the maximum write speed is about 1800MB/s once the onboard cache is full. That is a huge distinction. Caches are great for small bursts of data, but 200GB (making an assumption here) fills up quickly, and after that the NAND still writes at 1800MB/s.

And while I may be wrong on some of the specifications for your drives, this is still the way things work in the real world.

Have you read this white paper from the company? It helps. Also read the thread, as there is some good information there as well.

And one more thing to note with respect to write speeds: TrueNAS has TRIM turned off by default as far as I’m aware. You could manually run a TRIM and then run the write test again; the values may improve.

This was posted in another thread recently by one of the regular forum users (I forget which one). You can use a cron job to run
zpool trim yourPool as root on a weekly basis.
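
A minimal sketch, assuming a pool named yourPool as in the command above:

zpool trim yourPool              # start a manual TRIM of the whole pool
zpool status -t yourPool         # -t shows per-device TRIM state and progress
zpool set autotrim=on yourPool   # optional: enable continuous automatic TRIM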

Those performance numbers look fairly reasonable for actual real-world throughput when considering the consumer NAND on the Crucials, the topology, and the PLX chip adding a bit of latency.

Question - have you checked thermals on the drives?
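
For example, from the TrueNAS shell (the device names below are assumptions; check nvme list first to see what your drives enumerate as):

nvme list                                        # identify the NVMe device nodes
nvme smart-log /dev/nvme0 | grep -i temperature  # controller-reported temperature
smartctl -a /dev/nvme1 | grep -i temperature     # same data via smartmontools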
