TrueNAS Performance Benchmarking/Troubleshooting

I’ve seen numerous posts that discuss specific troubleshooting tips for various performance issues, but I’m curious if there is an all-encompassing process for benchmarking performance or troubleshooting performance issues.

I’m currently having some pretty significant performance issues on my TrueNAS box and while I have dabbled around in storage for a long time, it’s not where my expertise is. Would love to see a “guide” to troubleshooting or benchmarking TrueNAS if one exists.

There’s some high-level guidance in this blog.

I came across the TN-Bench tool and thought I would give it a shot as an initial benchmark/overview, hoping it would highlight any underlying issues. I haven’t dug into the results in detail, but from a quick peek I don’t see any glaring problems. I’m curious whether anyone sees anything I’m overlooking. Note: da2 and da10 are bad disks and not part of any pool; however, as far as I can see, nothing in this testing indicates that da10 has any issues.

root@sannas[~]# git clone https://github.com/nickf1227/TN-Bench.git && cd TN-Bench && python3 truenas-bench.py
Cloning into 'TN-Bench'...
remote: Enumerating objects: 143, done.
remote: Counting objects: 100% (7/7), done.
remote: Compressing objects: 100% (4/4), done.
remote: Total 143 (delta 3), reused 3 (delta 3), pack-reused 136 (from 1)
Receiving objects: 100% (143/143), 75.04 KiB | 5.77 MiB/s, done.
Resolving deltas: 100% (61/61), done.

###################################
#                                 #
#          TN-Bench v1.05         #
#          MONOLITHIC.            #
#                                 #
###################################
TN-Bench is an OpenSource Software Script that uses standard tools to Benchmark your System and collect various statistical information via the TrueNAS API.

TN-Bench will make a Dataset in each of your pools for the purposes of this testing that will consume 20 GiB of space for every thread in your system during its run.

After which time we will prompt you to delete the dataset which was created.
###################################

WARNING: This test will make your system EXTREMELY slow during its run. It is recommended to run this test when no other workloads are running.

NOTE: ZFS ARC will also be used and will impact your results. This may be undesirable in some circumstances, and the zfs_arc_max can be set to 1 (which means 1 byte) to prevent ARC from caching.

NOTE: Setting it back to 0 will restore the default behavior, but the system will need to be restarted!
###################################

Would you like to continue? (yes/no): yes

### System Information ###
Field                 | Value                                   
----------------------+-----------------------------------------
Version               | TrueNAS-13.0-U6.7                       
Load Average (1m)     | 0.7080078125                            
Load Average (5m)     | 0.8408203125                            
Load Average (15m)    | 0.7978515625                            
Model                 | Intel(R) Xeon(R) CPU E5-2620 0 @ 2.00GHz
Cores                 | 24                                      
Physical Cores        | N/A                                     
System Product        | Icebreaker 4824                         
Physical Memory (GiB) | 31.92                                   

### Pool Information ###
Field      | Value         
-----------+---------------
Name       | Lightning     
Path       | /mnt/Lightning
Status     | ONLINE        
VDEV Count | 1             
Disk Count | 5             

VDEV Name  | Type           | Disk Count
-----------+----------------+---------------
N/A         | RAIDZ2         | 5

### Disk Information ###
###################################

NOTE: The TrueNAS API will return N/A for the Pool for the boot device(s) as well as any disk is not a member of a pool.
###################################
Field      | Value                   
-----------+-------------------------
Name       | da6                     
Model      | ATA WDC WD800JD-98LS    
Serial     | WD-WMAM9E011508         
ZFS GUID   | None                    
Pool       | N/A                     
Size (GiB) | 74.53                   
-----------+-------------------------
-----------+-------------------------
Name       | da5                     
Model      | ATA WDC WD10EALS-002    
Serial     | WD-WCATR7544152         
ZFS GUID   | None                    
Pool       | N/A                     
Size (GiB) | 931.51                  
-----------+-------------------------
-----------+-------------------------
Name       | da4                     
Model      | ATA WDC WD40EFRX-68W    
Serial     | WD-WCC4E0VLX67A         
ZFS GUID   | 13760272320454652383    
Pool       | N/A                     
Size (GiB) | 3726.02                 
-----------+-------------------------
-----------+-------------------------
Name       | da3                     
Model      | ATA WDC WD1001FALS-0    
Serial     | WD-WMATV1979921         
ZFS GUID   | None                    
Pool       | N/A                     
Size (GiB) | 931.51                  
-----------+-------------------------
-----------+-------------------------
Name       | da2                     
Model      | ATA WDC WD40EFRX-68W    
Serial     | WD-WCC4E0VLXYA5         
ZFS GUID   | 8569413254585567322     
Pool       | N/A                     
Size (GiB) | 3726.02                 
-----------+-------------------------
-----------+-------------------------
Name       | da1                     
Model      | ATA WDC WD40EFRX-68W    
Serial     | WD-WCC4E3ED97SN         
ZFS GUID   | 11304261845268617089    
Pool       | N/A                     
Size (GiB) | 3726.02                 
-----------+-------------------------
-----------+-------------------------
Name       | da0                     
Model      | ATA WDC WD40EFRX-68W    
Serial     | WD-WCC4E0VLXFRA         
ZFS GUID   | 5884347931348330863     
Pool       | N/A                     
Size (GiB) | 3726.02                 
-----------+-------------------------
-----------+-------------------------
Name       | ada1                    
Model      | KINGSTON SA400S37120G   
Serial     | 50026B7380108550        
ZFS GUID   | None                    
Pool       | N/A                     
Size (GiB) | 111.79                  
-----------+-------------------------
-----------+-------------------------
Name       | ada0                    
Model      | KINGSTON SA400S37120G   
Serial     | 50026B7380108629        
ZFS GUID   | None                    
Pool       | N/A                     
Size (GiB) | 111.79                  
-----------+-------------------------
-----------+-------------------------
Name       | da7                     
Model      | HITACHI HUH72808CLAR8000
Serial     | VJGRSJ0X                
ZFS GUID   | 9582034011963173512     
Pool       | Lightning               
Size (GiB) | 7452.04                 
-----------+-------------------------
-----------+-------------------------
Name       | da8                     
Model      | HITACHI HUH72808CLAR8000
Serial     | VJGGEDDX                
ZFS GUID   | 13750401593833288325    
Pool       | Lightning               
Size (GiB) | 7452.04                 
-----------+-------------------------
-----------+-------------------------
Name       | da9                     
Model      | HITACHI HUH72808CLAR8000
Serial     | VJGREP7X                
ZFS GUID   | 1977240638271540285     
Pool       | Lightning               
Size (GiB) | 7452.04                 
-----------+-------------------------
-----------+-------------------------
Name       | da10                    
Model      | HITACHI HUH72808CLAR8000
Serial     | VJGRBAXX                
ZFS GUID   | 4942859381719713454     
Pool       | N/A                     
Size (GiB) | 7452.04                 
-----------+-------------------------
-----------+-------------------------
Name       | da11                    
Model      | HITACHI HUH72808CLAR8000
Serial     | VJGRJ2TX                
ZFS GUID   | 4053293304217049582     
Pool       | Lightning               
Size (GiB) | 7452.04                 
-----------+-------------------------
-----------+-------------------------
Name       | da13                    
Model      | SanDisk Cruzer Fit      
Serial     | 4C530001190827103470    
ZFS GUID   | None                    
Pool       | N/A                     
Size (GiB) | 14.91                   
-----------+-------------------------
-----------+-------------------------
Name       | da12                    
Model      | ATA ST8000VN004-2M21    
Serial     | WSD8TLQK                
ZFS GUID   | 3998958855074354122     
Pool       | Lightning               
Size (GiB) | 7452.04                 
-----------+-------------------------
-----------+-------------------------
Name       | da14                    
Model      | Generic                 
Serial     |                         
ZFS GUID   | None                    
Pool       | N/A                     
Size (GiB) | 27.48                   
-----------+-------------------------
-----------+-------------------------

###################################
#                                 #
#       DD Benchmark Starting     #
#                                 #
###################################
Using 24 threads for the benchmark.


Creating test dataset for pool: Lightning

Running benchmarks for pool: Lightning
Running DD write benchmark with 1 threads...
Run 1 write speed: 173.18 MB/s
Run 2 write speed: 173.75 MB/s
Average write speed: 173.47 MB/s
Running DD read benchmark with 1 threads...
Run 1 read speed: 371.84 MB/s
Run 2 read speed: 376.76 MB/s
Average read speed: 374.30 MB/s
Running DD write benchmark with 6 threads...
Run 1 write speed: 328.95 MB/s
Run 2 write speed: 340.78 MB/s
Average write speed: 334.87 MB/s
Running DD read benchmark with 6 threads...
Run 1 read speed: 302.52 MB/s
Run 2 read speed: 318.70 MB/s
Average read speed: 310.61 MB/s
Running DD write benchmark with 12 threads...
Run 1 write speed: 342.20 MB/s
Run 2 write speed: 341.00 MB/s
Average write speed: 341.60 MB/s
Running DD read benchmark with 12 threads...
Run 1 read speed: 309.86 MB/s
Run 2 read speed: 314.98 MB/s
Average read speed: 312.42 MB/s
Running DD write benchmark with 24 threads...
Run 1 write speed: 331.89 MB/s
Run 2 write speed: 310.89 MB/s
Average write speed: 321.39 MB/s
Running DD read benchmark with 24 threads...
Run 1 read speed: 299.52 MB/s
Run 2 read speed: 305.09 MB/s
Average read speed: 302.30 MB/s

###################################
#         DD Benchmark Results for Pool: Lightning    #
###################################
#    Threads: 1    #
#    1M Seq Write Run 1: 173.18 MB/s     #
#    1M Seq Write Run 2: 173.75 MB/s     #
#    1M Seq Write Avg: 173.47 MB/s #
#    1M Seq Read Run 1: 371.84 MB/s      #
#    1M Seq Read Run 2: 376.76 MB/s      #
#    1M Seq Read Avg: 374.30 MB/s  #
###################################
#    Threads: 6    #
#    1M Seq Write Run 1: 328.95 MB/s     #
#    1M Seq Write Run 2: 340.78 MB/s     #
#    1M Seq Write Avg: 334.87 MB/s #
#    1M Seq Read Run 1: 302.52 MB/s      #
#    1M Seq Read Run 2: 318.70 MB/s      #
#    1M Seq Read Avg: 310.61 MB/s  #
###################################
#    Threads: 12    #
#    1M Seq Write Run 1: 342.20 MB/s     #
#    1M Seq Write Run 2: 341.00 MB/s     #
#    1M Seq Write Avg: 341.60 MB/s #
#    1M Seq Read Run 1: 309.86 MB/s      #
#    1M Seq Read Run 2: 314.98 MB/s      #
#    1M Seq Read Avg: 312.42 MB/s  #
###################################
#    Threads: 24    #
#    1M Seq Write Run 1: 331.89 MB/s     #
#    1M Seq Write Run 2: 310.89 MB/s     #
#    1M Seq Write Avg: 321.39 MB/s #
#    1M Seq Read Run 1: 299.52 MB/s      #
#    1M Seq Read Run 2: 305.09 MB/s      #
#    1M Seq Read Avg: 302.30 MB/s  #
###################################
Cleaning up test files...
Running disk read benchmark...
###################################
This benchmark tests the 4K sequential read performance of each disk in the system using dd. It is run 2 times for each disk and averaged.
In order to work around ARC caching in systems with it still enabled, This benchmark reads data in the amount of total system RAM or the total size of the disk, whichever is smaller.
###################################
Testing disk: da6
Testing disk: da6
Testing disk: da5
Testing disk: da5

Testing disk: da4
Testing disk: da4
Testing disk: da3
Testing disk: da3
Testing disk: da2
dd: /dev/da2: Input/output error
Testing disk: da2
dd: /dev/da2: Input/output error
Testing disk: da1
Testing disk: da1
Testing disk: da0
Testing disk: da0
Testing disk: ada1
Testing disk: ada1
Testing disk: ada0
Testing disk: ada0
Testing disk: da7
Testing disk: da7
Testing disk: da8
Testing disk: da8
Testing disk: da9
Testing disk: da9
Testing disk: da10
Testing disk: da10
Testing disk: da11
Testing disk: da11
Testing disk: da13
Testing disk: da13
Testing disk: da12
Testing disk: da12
Testing disk: da14
Testing disk: da14

###################################
#         Disk Read Benchmark Results   #
###################################
#    Disk: da6    #
#    Run 1: 31.08 MB/s     #
#    Run 2: 31.18 MB/s     #
#    Average: 31.13 MB/s     #
#    Disk: da5    #
#    Run 1: 48.34 MB/s     #
#    Run 2: 48.29 MB/s     #
#    Average: 48.31 MB/s     #
#    Disk: da4    #
#    Run 1: 62.57 MB/s     #
#    Run 2: 64.98 MB/s     #
#    Average: 63.78 MB/s     #
#    Disk: da3    #
#    Run 1: 44.87 MB/s     #
#    Run 2: 46.22 MB/s     #
#    Average: 45.55 MB/s     #
#    Disk: da2    #
#    Run 1: 174.64 MB/s     #
#    Run 2: 183.72 MB/s     #
#    Average: 179.18 MB/s     #
#    Disk: da1    #
#    Run 1: 62.89 MB/s     #
#    Run 2: 64.85 MB/s     #
#    Average: 63.87 MB/s     #
#    Disk: da0    #
#    Run 1: 64.22 MB/s     #
#    Run 2: 64.93 MB/s     #
#    Average: 64.58 MB/s     #
#    Disk: ada1    #
#    Run 1: 8.91 MB/s     #
#    Run 2: 8.87 MB/s     #
#    Average: 8.89 MB/s     #
#    Disk: ada0    #
#    Run 1: 13.89 MB/s     #
#    Run 2: 14.04 MB/s     #
#    Average: 13.97 MB/s     #
#    Disk: da7    #
#    Run 1: 21.09 MB/s     #
#    Run 2: 36.18 MB/s     #
#    Average: 28.63 MB/s     #
#    Disk: da8    #
#    Run 1: 45.48 MB/s     #
#    Run 2: 25.58 MB/s     #
#    Average: 35.53 MB/s     #
#    Disk: da9    #
#    Run 1: 29.93 MB/s     #
#    Run 2: 31.74 MB/s     #
#    Average: 30.84 MB/s     #
#    Disk: da10    #
#    Run 1: 84.17 MB/s     #
#    Run 2: 82.82 MB/s     #
#    Average: 83.49 MB/s     #
#    Disk: da11    #
#    Run 1: 20.82 MB/s     #
#    Run 2: 34.08 MB/s     #
#    Average: 27.45 MB/s     #
#    Disk: da13    #
#    Run 1: 9.36 MB/s     #
#    Run 2: 9.40 MB/s     #
#    Average: 9.38 MB/s     #
#    Disk: da12    #
#    Run 1: 64.64 MB/s     #
#    Run 2: 65.05 MB/s     #
#    Average: 64.84 MB/s     #
#    Disk: da14    #
#    Run 1: 7.81 MB/s     #
#    Run 2: 7.81 MB/s     #
#    Average: 7.81 MB/s     #
###################################

Total benchmark time: 881.78 minutes

I believe my issues are likely related to SMB sharing, but I'll get to that shortly. Thanks for all the inputs so far, I'll dig into the info provided.

Can you be more specific in terms of the performance issues you’re seeing?

I also note (from the later posts) that you appear to be running TrueNAS CORE and not SCALE. Just noting this, as some things are different between the underlying FreeBSD and Linux.

I have servers running CORE and others running SCALE. Eventually the CORE servers will be upgraded to SCALE.

I can try. It’s very unscientific at this point, but my main interaction with TrueNAS is from macOS through SMB shares. I’m experiencing unusually slow performance reading and writing files. The writes are mostly PDF scans (typically a couple of MB each) saved to the SMB share, but sometimes larger files. Also, opening folders on the shares often takes quite a while to show the folder contents.

Most of my clients access via Wi-Fi, but I do have a few wired clients, and the problem seems to span them all. The issues could be network-related, but I haven’t made any significant changes to the network configuration in quite a while, so I’m not confident that’s the cause; I wouldn’t rule it out 100% either, though.

There was another topic on this: SMB and MacOS Finder speed, again

My experience is that the macOS Finder will walk the tree gathering metadata when it first opens (I suspect it actually does this in the background once the share is mounted, but I have yet to test that). If the number of items is large, the time to gather the metadata is noticeable.

How many items (files + directories) are there at the root level of your shares?

Do you see slow copy times for large items copied via CLI (terminal)? via Finder?
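
On the item-count question, an easy way to get that number from Terminal on the Mac (the mount path below is a placeholder):

```
# Count the entries (files + directories) at the root of a mounted share:
ls -A /Volumes/ShareName | wc -l
```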

When I was running CORE, I was able to speed up the initial opening of the share by doing two things: I used AFP instead of SMB, and I configured the AFP share to put the CNID database on a very, very fast SSD (an Intel enterprise drive with over 50,000 IOPS for small, random reads and writes).

I do not (yet) have a tuning for SCALE and SMB to minimize the wait while the Finder gathers metadata.

But I had run performance tests of SMB vs. AFP from macOS to TrueNAS CORE and found they had comparable read and write performance for large (greater than 1 MiB) files.


Thanks for the reply. My file structure is quite different depending on the share, so that may well have something to do with the performance. Some of my shares have a very “flat” structure, and those are considerably slower. The directory most of my PDF scans are saved to currently has 430 files in it. I wouldn’t consider that excessive, but maybe it leans that way?

I’ll consider the rest of your thoughts when I have a bit more time. I meant to add to my previous post that yes, I am running CORE, but I have considered migrating to SCALE many times. I wasn’t 100% sure what impact that would have on my deployment, so I ended up keeping what I had for the moment.

If there are any blatant “gotchas” to migrating to SCALE (or some pros of doing so), I’d love to hear about that as well.

For the following, I am running SCALE 24.10.2 and SMB, with an M1 Mac on macOS 15.3.2 via Wi-Fi. The TrueNAS server is connected to the core switch via dual (bonded) 10 Gig connections and has dual Xeon(R) CPU E5-2623 v4 @ 2.60GHz and 192 GiB RAM. Disks are all connected via LSI SAS HBAs. The primary zpool is 6 x 2-way mirror vdevs of 2 TB HDDs.

My largest share has 379 directories at its top level, and it opens in about 1.5 seconds. The largest directory in there has 475 PDF scans and opens in about 1 second (though the OS had already had a second to scan that directory before I opened it). I have a total of 12 SMB shares.

OK, I did some quick tests. I transferred a 5 MB PDF file both ways: roughly 2 seconds each way, in both Finder and the CLI. Then I copied a 397 MB folder full of PDFs both ways (reading from TrueNAS and writing to TrueNAS). The read took 2:01 and the write took 2:26. My math tells me that is roughly a 25 Mb/s transfer rate. This was done on my Mac mini with an SSD, so I don’t think there should be any bottleneck on that side.
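
For reference, the arithmetic behind that estimate (times taken from above):

```
# 397 MB in 2:01 (read) and 2:26 (write), converted to megabits per second:
echo "scale=1; 397*8/121" | bc   # read  ~ 26.2 Mbit/s
echo "scale=1; 397*8/146" | bc   # write ~ 21.8 Mbit/s
```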

I also notice that deleting the copied folder from TrueNAS takes 15–20 seconds, while deleting the same folder on my Mac mini is almost instant.

Use iperf3 to measure network performance.
Use fio to measure dataset/disk performance.

Your 2:01 and 2:26 timings are completely useless.
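
For anyone following along, a minimal starting point for both tools might look like this (the hostname and dataset path are placeholders, not from this thread):

```
# Raw network path -- run the server on the TrueNAS box:
iperf3 -s
# ...and the client on the Mac (replace truenas.local with your server):
iperf3 -c truenas.local -t 10

# Dataset/disk only -- run fio directly on the TrueNAS box:
fio --name=seqwrite --directory=/mnt/Lightning/fio-test --rw=write --bs=1M \
    --size=4G --numjobs=4 --ioengine=psync --group_reporting
```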


I assume that was in the Finder. How long does it take via the CLI/Terminal (rm -r foo)?
I think the Finder has to enumerate the tree it is deleting before it actually deletes anything, and it may be deleting one item at a time. rm -r should delete everything in one pass.
The Finder sacrifices speed for ease of use and safety in many cases.
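
Something like this, run in Terminal on the Mac, would give a directly comparable number (the mount path is a placeholder):

```
# Time the same delete against the SMB mount instead of using the Finder:
time rm -r "/Volumes/ShareName/copied-folder"
```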

I wouldn’t call them useless; they measure user experience rather than a specific technical value. I agree there are better tools to measure strictly disk I/O and network performance (though even fio will measure both when run on a client of a network share).

He can print them and post them in the toilet at home.

They mean nothing in this thread, as no one else can relate to them.

I actually felt stupid for even doing this test, because stopwatching anything IT-related is mostly pointless. What I wasn’t sure of was how to isolate the network aspect of things from the protocol (SMB) aspect of things.
I’m not sure if either iperf3 or fio will delineate the two, but I will check into them. Maybe it’s just a process of elimination to isolate SMB issues?
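
Neither tool delineates the two by itself, but together they support exactly that process of elimination. As a sketch (the share mount path is a placeholder):

```
# iperf3 isolates the raw network path (no disks, no SMB).
# fio run locally on the TrueNAS box isolates the pool (no network, no SMB).
# Running the same fio job from the Mac against the mounted share exercises
# the whole stack -- if only this one is slow, SMB is the prime suspect:
fio --name=smbtest --directory=/Volumes/ShareName --rw=write --bs=1M \
    --size=1G --numjobs=1 --ioengine=psync --group_reporting
```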

Correct, this was done in Finder. I would have thought the same, but deleting locally stored files either doesn’t do the same enumeration, or does it so much faster that it isn’t even noticeable.

The Finder deals with metadata, which is handled very differently by local macOS filesystems than by SMB/TrueNAS/ZFS; the macOS filesystems are optimized for that metadata access. Under TrueNAS CORE and AFP, I was able to get performance similar to local by putting the AFP CNID (metadata) database on a separate, very fast SSD-based zpool. But SCALE has dropped AFP support, as the netatalk project it was based on is largely dead now that Apple has embraced SMB.

It takes two minutes to run those and understand the fundamentals of where you stand, instead of going down an esoteric process of elimination on something you don’t even understand yet.

Thanks for all your help… I’m not super familiar with either of the tools, especially fio, so it took more than 2 minutes, but whatever.

iperf3 on WiFi:

[  7] local 172.16.110.33 port 57955 connected to 172.16.110.15 port 5201
[ ID] Interval           Transfer     Bitrate
[  7]   0.00-1.00   sec   512 KBytes  4.19 Mbits/sec                  
[  7]   1.00-2.00   sec   640 KBytes  5.23 Mbits/sec                  
[  7]   2.00-3.00   sec   512 KBytes  4.18 Mbits/sec                  
[  7]   3.00-4.00   sec   256 KBytes  2.10 Mbits/sec                  
[  7]   4.00-5.00   sec   512 KBytes  4.20 Mbits/sec                  
[  7]   5.00-6.00   sec   128 KBytes  1.05 Mbits/sec                  
[  7]   6.00-7.00   sec   384 KBytes  3.15 Mbits/sec                  
[  7]   7.00-8.00   sec   384 KBytes  3.14 Mbits/sec                  
[  7]   8.00-9.00   sec   384 KBytes  3.15 Mbits/sec                  
[  7]   9.00-10.00  sec   640 KBytes  5.24 Mbits/sec                  
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bitrate
[  7]   0.00-10.00  sec  4.25 MBytes  3.56 Mbits/sec                  sender
[  7]   0.00-10.04  sec  4.20 MBytes  3.51 Mbits/sec                  receiver

iperf3 on wired:

[ ID] Interval           Transfer     Bitrate
[  7]   0.00-1.00   sec  11.4 MBytes  95.0 Mbits/sec                  
[  7]   1.00-2.00   sec  11.1 MBytes  93.6 Mbits/sec                  
[  7]   2.00-3.00   sec  11.2 MBytes  94.4 Mbits/sec                  
[  7]   3.00-4.00   sec  11.2 MBytes  94.4 Mbits/sec                  
[  7]   4.00-5.00   sec  11.2 MBytes  94.1 Mbits/sec                  
[  7]   5.00-6.00   sec  11.2 MBytes  94.4 Mbits/sec                  
[  7]   6.00-7.00   sec  11.1 MBytes  93.6 Mbits/sec                  
[  7]   7.00-8.00   sec  11.2 MBytes  94.1 Mbits/sec                  
[  7]   8.00-9.00   sec  11.2 MBytes  94.5 Mbits/sec                  
[  7]   9.00-10.00  sec  11.2 MBytes  94.3 Mbits/sec                  
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bitrate
[  7]   0.00-10.00  sec   112 MBytes  94.2 Mbits/sec                  sender
[  7]   0.00-10.01  sec   112 MBytes  94.1 Mbits/sec                  receiver

fio output:

fio --filename=test --direct=1 --rw=randrw --randrepeat=0 --rwmixread=100 --iodepth=128 --numjobs=12 --runtime=60 --group_reporting --name=4ktest --ioengine=psync --size=4G --bs=128k
4ktest: (g=0): rw=randrw, bs=(R) 128KiB-128KiB, (W) 128KiB-128KiB, (T) 128KiB-128KiB, ioengine=psync, iodepth=128
...
fio-3.28
Starting 12 processes
4ktest: Laying out IO file (1 file / 4096MiB)
Jobs: 10 (f=10): [r(4),_(1),r(1),_(1),r(5)][80.0%][r=14.3GiB/s][r=117k IOPS][eta 00m:02s]
4ktest: (groupid=0, jobs=12): err= 0: pid=22103: Fri Mar 21 13:02:47 2025
  read: IOPS=52.4k, BW=6555MiB/s (6874MB/s)(48.0GiB/7498msec)
    clat (usec): min=15, max=743152, avg=207.79, stdev=5393.02
     lat (usec): min=16, max=743152, avg=207.99, stdev=5393.02
    clat percentiles (usec):
     |  1.00th=[    24],  5.00th=[    29], 10.00th=[    35], 20.00th=[    55],
     | 30.00th=[    62], 40.00th=[    70], 50.00th=[    80], 60.00th=[    91],
     | 70.00th=[   102], 80.00th=[   115], 90.00th=[   137], 95.00th=[   163],
     | 99.00th=[  2802], 99.50th=[  3195], 99.90th=[  3785], 99.95th=[  4047],
     | 99.99th=[175113]
   bw (  MiB/s): min=   18, max=16164, per=99.26%, avg=6507.02, stdev=524.92, samples=156
   iops        : min=  138, max=129314, avg=52051.86, stdev=4199.47, samples=156
  lat (usec)   : 20=0.01%, 50=17.36%, 100=50.76%, 250=29.06%, 500=0.04%
  lat (usec)   : 750=0.03%, 1000=0.09%
  lat (msec)   : 2=0.61%, 4=1.99%, 10=0.03%, 20=0.01%, 50=0.01%
  lat (msec)   : 100=0.01%, 250=0.01%, 500=0.01%, 750=0.01%
  cpu          : usr=1.49%, sys=36.51%, ctx=11540, majf=0, minf=0
  IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwts: total=393216,0,0,0 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=128

Run status group 0 (all jobs):
   READ: bw=6555MiB/s (6874MB/s), 6555MiB/s-6555MiB/s (6874MB/s-6874MB/s), io=48.0GiB (51.5GB), run=7498-7498msec

I got a few things out of this. First, my Wi-Fi speeds seem much lower than I would have expected. I’ll have to dig into this more.
My wired results were also much lower than expected. The connection is gigabit-capable, but I found that it was auto-negotiating down to Fast Ethernet speeds and I’m getting collisions. I’m going to re-terminate the cabling on both ends to see if that resolves the issue.
I’m not sure if the fio output is favorable or not, or if I even used the right parameters to get a good test.
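
A couple of observations on that fio run: with --filename=test, all twelve jobs read the same 4 GiB file, which fits easily in this box’s 32 GiB of RAM, and O_DIRECT has historically been a no-op on ZFS under CORE, so the 6.5 GiB/s figure is almost certainly the ARC answering from memory rather than the disks. The job name 4ktest is also misleading, since --bs=128k was used. A variant that forces the disks to do the work might look like this (the dataset path is a placeholder):

```
# Drop --filename so each job gets its own file, and size the working set
# well past RAM (12 x 8 GiB = 96 GiB here) so the ARC cannot hold it all:
fio --name=diskread --directory=/mnt/Lightning/fio-test --rw=read --bs=1M \
    --size=8G --numjobs=12 --runtime=60 --ioengine=psync --group_reporting
```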

Self-answering questions are the best.

Can you pull per-port error counters from the switch? If I see any errors on a port, I circle back and take a very close look at the cables.
Collisions are used by modern switches for flow control; if the destination is slow, you might even see collisions on gig-E connections.
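
For checking from the hosts rather than the switch, something like this shows the negotiated speed/duplex and per-interface error counters (the interface name is a placeholder):

```
# On the TrueNAS CORE box (FreeBSD) -- replace em0 with your interface:
ifconfig em0 | grep media   # expect e.g. "1000baseT <full-duplex>"
netstat -i                  # watch the Ierrs/Oerrs (and Coll) columns
# The same netstat -i works on the Mac for the client side.
```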