Performance issues on MC12-LE0 5700X vs i5 8400T

Hi,
so I’ve been hunting weird discrepancies in delivered performance between my current TrueNAS Core box:
OEM Fujitsu Esprimo P758
i5 8400T
64 GB RAM
8x PM863a 1.92TB (4 2-way mirrors) - 5 on onboard SATA ports, 3 on ASM1164
Dualport BCM57810 10G NIC
1x P1600X SLOG

and my new build (TrueNAS Scale):
Gigabyte MC12-LE0 AM4 B550 board
Ryzen 7 5700X
128GB RAM
12x PM863a 1.92TB (6 2-way mirrors) - 6 on onboard SATA, 6 on 2x ASM1166
Dualport 82599ES 10G NIC
2x P1600X SLOG

For some reason, the new Ryzen build has always been slower than the 8400T build (all measured over iSCSI). VMs run locally on the new Ryzen build deliver good performance, though.

From a VM over iSCSI over 2x 10G Multipath IO:

Within a VM on the TrueNAS Scale (Ryzen) build:
[benchmark screenshot]

Any idea where I could start looking? What I’ve checked already:

  • lspci shows the correct link speed and width for all the SATA controllers, Optanes and the NIC (quick check sketched below)
  • changed cables and optical transceivers on both the TrueNAS box and the iSCSI-consuming box (which is an ESXi server)
  • no Ethernet interface errors
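
For completeness, this is roughly how I checked the links (the PCI address below is just an example; substitute whatever lspci lists on your box):

lspci -nn | grep -Ei 'sata|ethernet|non-volatile'   # find the PCI addresses of the SATA controllers, NIC and Optanes
lspci -s 01:00.0 -vv | grep -E 'LnkCap|LnkSta'      # compare negotiated vs. maximum link speed/width per device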

In my mind, the Ryzen box should be considerably faster than the 8400T box, at least for reading. Writing is being limited by the Optanes, since I have “sync=always” enabled.

Thanks!

For some reason I can't edit my post nor embed images.
Bottom line: the Ryzen build achieves 9 GB/s reads in a VM run locally on TrueNAS Scale, but only 1.5 GB/s over iSCSI, which is the same speed the 8400T build achieves over iSCSI. Also, setting sync=disabled doesn't do anything on the Ryzen build (always only around 800 MB/s writes), but on the i5 8400T build disabling sync writes increases write speed from ~800 MB/s to 1.6 GB/s.

Different systems, with different pool layouts. I’m not sure comparisons are meaningful.

But I’d like to know more about these ASM controllers and how everything gets its PCIe lanes.


(Overzealous?) Spam protection. Just read a few threads; there is also a tutorial you can do, I think.

Different configs, yes, but I'd consider the Ryzen one beefier in every single way, no? Hence my assumption that it should be at least as fast, if not faster.

PCIe lanes are provided by x4x4x4x4 bifurcation and an x16-to-quad-M.2 (x4 each) PCIe card (ASRock Rack).
According to lspci, everything is getting its lanes.

Oh, and the ASM1166 chips are sitting on two of those M.2-to-SATA modules.

Edit: I was finally able to edit in the screenshots of my benchmark results.

Ideas, anyone? I don't expect you fine people to solve this for me; I consider myself quite tech-savvy. Just throw some hints at me on where to look, maybe CLI tools or methods which could help hunt down which component(s) is/are the culprit here, because I'm kind of out of ideas.
The next step I planned was to benchmark every single drive on its own and then all together, and compare whether they deliver the same results alone and in parallel. After that I also want to try to saturate both network links at the same time, just to see if there is a bandwidth problem anywhere.
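
Rough sketch of what I have in mind (device names and IPs are placeholders; the read jobs are read-only so they shouldn't touch pool data):

# sequential read, one drive at a time (repeat for every disk), then the same job
# against all 12 disks at once, and compare the per-disk MB/s
fio --name=sda --filename=/dev/sda --readonly --rw=read --bs=1M --direct=1 \
    --ioengine=libaio --iodepth=16 --runtime=60 --time_based

# separately, try to saturate both 10G links at the same time
iperf3 -c 10.0.0.10 -P 4 -t 30   # fabric A (address is an example)
iperf3 -c 10.0.1.10 -P 4 -t 30   # fabric B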

Network issue or difference in how well the differently branded NICs work, at least with regards to iSCSI?
I say that because your in-VM tests show the performance is there locally.

Currently testing a different 82599ES from a different OEM; so far the same.
I might move the BCM57810 to the Ryzen box to rule out the NIC. Cables and SFP+ modules have been changed already, and the switches are the same for all boxes (2x CRS309 8-port SFP+ for a fabric A / fabric B topology).

Regarding the network part, it really is only the NICs which haven't been changed yet, although the initiator server (the one running the VM which connects to both TrueNAS boxes) also has an 82599ES card installed. Then again, the ESXi driver might iron out some oddities with that chip.
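
If anyone's curious, this is roughly how I'd check which driver and version ESXi has bound to that 82599ES on the initiator (the vmnic number is just a placeholder):

esxcli network nic list            # lists every vmnic with the driver bound to it
esxcli network nic get -n vmnic4   # driver name, version and firmware for that port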

Another weird fact: with TrueNAS CORE on the Ryzen box, reads and writes were even (way) slower than with TrueNAS SCALE, EXCEPT if I simultaneously ran a benchmark in the test VM running directly on the TrueNAS CORE box. That made the iSCSI benchmark (which had to run at the same time) skyrocket, as if the box needed to be pushed up to speed. Very weird. I have not configured the BIOS for ultra low power or anything.

Not really. Chasing bottlenecks is never easy.
Potential suspects would be the ASMedia SATA controllers on M.2 (not much love for these around here…), or bifurcation overhead (@NickF1227 documented such effects on all-NVMe pools, but that seems less likely with SATA).


Aren't there any tools to look deeper into this? No matter where I looked, nothing was limiting.
The ASM1164 in the other box seems to work fine, but that of course doesn't necessarily apply to the ASM1166. Personally, I haven't had a single issue with these in the past year or two of "production" homelab use.

I have yet to do the benchmarking of all the disks and the NIC as described earlier, but at least according to the reporting, the ASM1166-connected disks seem to perform exactly like the ones connected to the onboard chipset SATA.

Especially weird is the write performance with sync=disabled. The i5 8400T box performs TWICE as fast; there MUST be some weird issue...

The mystery continues...

You can use solnet-array-test to investigate read issues. It would reveal whether there’s a performance gap between different controllers.
Sync writes on the MC12 is where I vaguely suspect that bifurcation overhead might be a factor.
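
While that runs (or during an iSCSI benchmark), watching per-disk latency and throughput should make a slow controller or drive stand out; the pool name below is just an example:

zpool iostat -v tank 5   # per-vdev / per-disk throughput every 5 seconds
iostat -x 5              # per-device utilization and await, if sysstat/iostat is installed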


Thanks! The script has been running for two hours now and is still going. What I have found out so far is that 5 of the 12 drives are giving me slow and inconsistent read speeds. Example of a disk which behaves OK and one of the 5 which behave weirdly:


Oddly enough, 3 of them are on the ASM1166 M.2 cards, but the other two are on the onboard B550-backed SATA ports.
Looking at the block diagram of the motherboard, I can see that there is a 3.0 x4 link, which imho should be enough for 6 SATA3 drives, assuming that I’m not using the onboard LAN for anything and the onboard M.2 just for booting. I’m assuming that the chipset is acting as a PCIe switch, thus being able to assign all the bandwidth to the SATA drives, if needed.
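
Back-of-the-envelope: 6 SATA3 drives at ~550 MB/s each is ~3.3 GB/s, while a PCIe 3.0 x4 uplink is good for roughly 3.9 GB/s after encoding overhead, so on paper the uplink shouldn't be the bottleneck as long as nothing else hanging off the chipset is busy.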

Disk    Bytes Transferred Seconds %ofAvg
------- ----------------- ------- ------
sda         1920383410176    3411     98
sdb         1920383410176    3410     98
sdc         1920383410176    3409     98
sdd         1920383410176    5122    147 --SLOW--
sde         1920383410176    3409     98
sdf         1920383410176    4880    140 --SLOW--
sdg         1920383410176    3406     98
sdh         1920383410176    4855    139 --SLOW--
sdi         1920383410176    3411     98
sdj         1920383410176    3406     98
sdk         1920383410176    5246    150 --SLOW--
sdl         1920383410176    4860    139 --SLOW--
nvme2n1       58977157120      22      1 ++FAST++
nvme0n1       58977157120       6      0 ++FAST++

The NVMe disks are only used when I'm doing sync writes. Since this is about reads and async writes (sync=disabled), the NVMe disks IMO don't matter here.

The SMART details for sdd, sdf, sdh, sdk and sdl look clean to me. They all look like this (sdf), and they're all about the same age and wear level:

smartctl 7.4 2023-08-01 r5530 [x86_64-linux-6.6.44-production+truenas] (local build)
Copyright (C) 2002-23, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Model Family:     Samsung based SSDs
Device Model:     SAMSUNG MZ7LM1T9HCJM-00003
Serial Number:    deleted
LU WWN Device Id: deleted
Firmware Version: GXT3003Q
User Capacity:    1,920,383,410,176 bytes [1.92 TB]
Sector Size:      512 bytes logical/physical
Rotation Rate:    Solid State Device
TRIM Command:     Available, deterministic, zeroed
Device is:        In smartctl database 7.3/5528
ATA Version is:   ACS-2, ATA8-ACS T13/1699-D revision 4c
SATA Version is:  SATA 3.1, 6.0 Gb/s (current: 6.0 Gb/s)
Local Time is:    Fri Dec 20 00:23:52 2024 CET
SMART support is: Available - device has SMART capability.
SMART support is: Enabled
AAM feature is:   Unavailable
APM feature is:   Unavailable
Rd look-ahead is: Enabled
Write cache is:   Enabled
DSN feature is:   Unavailable
ATA Security is:  Disabled, frozen [SEC2]
Wt Cache Reorder: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x02)	Offline data collection activity
					was completed without error.
					Auto Offline Data Collection: Disabled.
Self-test execution status:      (   0)	The previous self-test routine completed
					without error or no self-test has ever 
					been run.
Total time to complete Offline 
data collection: 		( 6000) seconds.
Offline data collection
capabilities: 			 (0x53) SMART execute Offline immediate.
					Auto Offline data collection on/off support.
					Suspend Offline collection upon new
					command.
					No Offline surface scan supported.
					Self-test supported.
					No Conveyance Self-test supported.
					Selective Self-test supported.
SMART capabilities:            (0x0003)	Saves SMART data before entering
					power-saving mode.
					Supports SMART auto save timer.
Error logging capability:        (0x01)	Error logging supported.
					General Purpose Logging supported.
Short self-test routine 
recommended polling time: 	 (   2) minutes.
Extended self-test routine
recommended polling time: 	 ( 100) minutes.
SCT capabilities: 	       (0x003d)	SCT Status supported.
					SCT Error Recovery Control supported.
					SCT Feature Control supported.
					SCT Data Table supported.

SMART Attributes Data Structure revision number: 1
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAGS    VALUE WORST THRESH FAIL RAW_VALUE
  5 Reallocated_Sector_Ct   PO--CK   100   100   010    -    0
  9 Power_On_Hours          -O--CK   091   091   000    -    45111
 12 Power_Cycle_Count       -O--CK   099   099   000    -    33
177 Wear_Leveling_Count     PO--C-   097   097   005    -    220
179 Used_Rsvd_Blk_Cnt_Tot   PO--C-   100   100   010    -    0
180 Unused_Rsvd_Blk_Cnt_Tot PO--C-   100   100   010    -    9682
181 Program_Fail_Cnt_Total  -O--CK   100   100   010    -    0
182 Erase_Fail_Count_Total  -O--CK   100   100   010    -    0
183 Runtime_Bad_Block       PO--C-   100   100   010    -    0
184 End-to-End_Error        PO--CK   100   100   097    -    0
187 Uncorrectable_Error_Cnt -O--CK   100   100   000    -    0
190 Airflow_Temperature_Cel -O--CK   045   044   000    -    55
195 ECC_Error_Rate          -O-RC-   200   200   000    -    0
197 Current_Pending_Sector  -O--CK   100   100   000    -    0
199 CRC_Error_Count         -OSRCK   100   100   000    -    0
202 Exception_Mode_Status   PO--CK   100   100   010    -    0
235 POR_Recovery_Count      -O--C-   099   099   000    -    16
241 Total_LBAs_Written      -O--CK   099   099   000    -    352246306774
242 Total_LBAs_Read         -O--CK   099   099   000    -    2145768260254
243 SATA_Downshift_Ct       -O--CK   100   100   000    -    0
244 Thermal_Throttle_St     -O--CK   100   100   000    -    0
245 Timed_Workld_Media_Wear -O--CK   100   100   000    -    65535
246 Timed_Workld_RdWr_Ratio -O--CK   100   100   000    -    65535
247 Timed_Workld_Timer      -O--CK   100   100   000    -    65535
251 NAND_Writes             -O--CK   100   100   000    -    924262626312
                            ||||||_ K auto-keep
                            |||||__ C event count
                            ||||___ R error rate
                            |||____ S speed/performance
                            ||_____ O updated online
                            |______ P prefailure warning

General Purpose Log Directory Version 1
SMART           Log Directory Version 1 [multi-sector log support]
Address    Access  R/W   Size  Description
0x00       GPL,SL  R/O      1  Log Directory
0x01           SL  R/O      1  Summary SMART error log
0x02           SL  R/O      1  Comprehensive SMART error log
0x03       GPL     R/O      1  Ext. Comprehensive SMART error log
0x06           SL  R/O      1  SMART self-test log
0x07       GPL     R/O      1  Extended self-test log
0x09           SL  R/W      1  Selective self-test log
0x10       GPL     R/O      1  NCQ Command Error log
0x11       GPL     R/O      1  SATA Phy Event Counters log
0x13       GPL     R/O      1  SATA NCQ Send and Receive log
0x30       GPL,SL  R/O      9  IDENTIFY DEVICE data log
0x80-0x9f  GPL,SL  R/W     16  Host vendor specific log
0xe0       GPL,SL  R/W      1  SCT Command/Status
0xe1       GPL,SL  R/W      1  SCT Data Transfer

SMART Extended Comprehensive Error Log Version: 1 (1 sectors)
No Errors Logged

SMART Error Log Version: 1
No Errors Logged

SMART Extended Self-test Log Version: 1 (1 sectors)
No self-tests have been logged.  [To run self-tests, use: smartctl -t]

SMART Self-test log structure revision number 1
No self-tests have been logged.  [To run self-tests, use: smartctl -t]

SMART Selective self-test log data structure revision number 1
 SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
    1        0        0  Not_testing
    2        0        0  Not_testing
    3        0        0  Not_testing
    4        0        0  Not_testing
    5        0        0  Not_testing
  255        0    65535  Read_scanning was completed without error
Selective self-test flags (0x0):
  After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.

SCT Status Version:                  3
SCT Version (vendor specific):       256 (0x0100)
Device State:                        SCT command executing in background (5)
Current Temperature:                    55 Celsius
Power Cycle Min/Max Temperature:     26/56 Celsius
Lifetime    Min/Max Temperature:      0/70 Celsius
Under/Over Temperature Limit Count:  4294967295/4294967295

SCT Temperature History Version:     3 (Unknown, should be 2)
Temperature Sampling Period:         1 minute
Temperature Logging Interval:        10 minutes
Min/Max recommended Temperature:      0/70 Celsius
Min/Max Temperature Limit:            0/70 Celsius
Temperature History Size (Index):    128 (17)

Index    Estimated Time   Temperature Celsius
  18    2024-12-19 03:10     ?  -
 ...    ..(108 skipped).    ..  -
 127    2024-12-19 21:20     ?  -
   0    2024-12-19 21:30    26  *******
   1    2024-12-19 21:40    35  ****************
   2    2024-12-19 21:50    43  ************************
   3    2024-12-19 22:00    38  *******************
   4    2024-12-19 22:10    43  ************************
   5    2024-12-19 22:20    50  *******************************
   6    2024-12-19 22:30    53  **********************************
   7    2024-12-19 22:40    53  **********************************
   8    2024-12-19 22:50    52  *********************************
   9    2024-12-19 23:00    50  *******************************
  10    2024-12-19 23:10    48  *****************************
  11    2024-12-19 23:20    46  ***************************
  12    2024-12-19 23:30    45  **************************
  13    2024-12-19 23:40    50  *******************************
  14    2024-12-19 23:50    54  ***********************************
  15    2024-12-20 00:00    56  *************************************
  16    2024-12-20 00:10    56  *************************************
  17    2024-12-20 00:20    55  ************************************

SCT Error Recovery Control:
           Read: Disabled
          Write: Disabled

Device Statistics (GP/SMART Log 0x04) not supported

Pending Defects log (GP Log 0x0c) not supported

SATA Phy Event Counters (GP Log 0x11)
ID      Size     Value  Description
0x0001  2            0  Command failed due to ICRC error
0x0002  2            0  R_ERR response for data FIS
0x0003  2            0  R_ERR response for device-to-host data FIS
0x0004  2            0  R_ERR response for host-to-device data FIS
0x0005  2            0  R_ERR response for non-data FIS
0x0006  2            0  R_ERR response for device-to-host non-data FIS
0x0007  2            0  R_ERR response for host-to-device non-data FIS
0x0008  2            0  Device-to-host non-data FIS retries
0x0009  2            2  Transition from drive PhyRdy to drive PhyNRdy
0x000a  2            2  Device-to-host register FISes sent due to a COMRESET
0x000b  2            0  CRC errors within host-to-device FIS
0x000d  2            0  Non-CRC errors within host-to-device FIS
0x000f  2            0  R_ERR response for host-to-device data FIS, CRC
0x0010  2            0  R_ERR response for host-to-device data FIS, non-CRC
0x0012  2            0  R_ERR response for host-to-device non-data FIS, CRC
0x0013  2            0  R_ERR response for host-to-device non-data FIS, non-CRC

Now the question remains why these 5 disks are slower.
I'll probably move them around between ports and see whether the slowness moves with them or not.

The CPU is mostly iowait-ing, btw:


Same firmware version across drives?

Yes, I had the same suspicion, but they're the same.

[screenshot: drive firmware versions]

You said two of the six drives connected to the onboard SATA are slow?

Maybe I am overly suspicious, but any idea why the onboard SATA controllers are grouped that way?

Edit: I found this in an article about the B550

“and four SATA ports are among the other interfaces of the B550. Two of the PCIe lanes can also be configured for two additional SATA ports.”

In a diagram it also says "up to 8 SATA ports are possible via lane sharing".

That got me suspicious as well. Two PCIe 3.0 lanes should be plenty for 2x SATA III, though. Assuming that the 4x SATA III can make use of the x4 uplink of the B550 chipset, there shouldn't be an issue. Even if the 4x SATA III only get two of the lanes, that should still fit (a little tight, but still).

I've moved the drives around between SATA ports (swapped a drive with inconsistent performance with one with consistent performance) and the issue moved with the drive. That makes me think this might be some weird (pre-)failure mode of the drives. Odd, though, that SMART doesn't hint at anything.

I threw in 12x Intel S3610 480GB drives, which have roughly similar performance to the PM863 ones, and this is how all the drives looked (right part of the graph is S3610, left (more wonky) part is PM863):

Only one of the S3610 drives looked like this (rightmost part of the graph):

Write performance over iSCSI with "sync=disabled" (so the NVMe Optanes are out of the way) is still not impressive with the S3610 drives:
[benchmark screenshot]

I’m not sure what conclusion I now should draw.

  • The S3610 drives generally perform WAY more consistently
  • Still, two of the S3610s draw an almost perfectly straight line, which leaves me wondering why only those two
  • Writes aren't faster with the S3610 drives → so is this just a SCALE thing (architecture or whatever) and I'm simply not going to get faster writes with my hardware on SCALE, or is there a different (hardware) issue still present which has yet to be identified? (Quick test to separate the two sketched below.)
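
The test I have in mind: write locally on the pool with sync disabled and compare that number to the iSCSI result (pool/dataset names and sizes are placeholders):

zfs set sync=disabled tank/bench
fio --name=localwrite --directory=/mnt/tank/bench --rw=write --bs=1M --size=16G \
    --ioengine=libaio --iodepth=16 --numjobs=4 --group_reporting
zfs set sync=standard tank/bench   # set it back afterwards
# if the local number is much higher than over iSCSI, the bottleneck is the network/iSCSI path, not the pool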

My head goes :exploding_head:

Edit: moving two of the drives off of the 4-port B550 group to the ASM1166 modules (so there are now 4 drives connected to each of the two ASM1166 modules) actually increases write performance to a previously unseen degree:
[benchmark screenshot]


Wow, the crappy little SATA controllers are winning out over the chipset? That’s a new one. Funny thing is, IIRC, ASMedia makes the chipset for AMD these days.


Well, that is a bit of a blackbox, isn’t it? At least for me :smile:

There are USB + 4x SATA "natively" connected to the chipset.
Then, 2x SATA using 2 PCIe lanes.
Finally, 8 different devices using 1 PCIe lane each.

All of them share a Gen 3 x4 uplink to the CPU (8 GT/s per lane, roughly 4 GB/s in total). There has to be some overhead "arbitrating" this. Also, I'm not sure how the "native" SATA drives fare against the PCIe-connected ones.
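
Something like this on the TrueNAS box would at least show how the SATA controllers and everything else hang off the chipset versus the CPU lanes:

lspci -tv   # PCIe topology tree; everything behind the chipset bridge shares that x4 uplink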

But maybe it is the drives, as you suspect?

Apart from that, it could be many things, or a combination thereof.

And while CORE is not the future, I guess it is a well-optimized NAS OS with great performance, especially on older hardware.

I'll probably order two more of the wonky little ASM1166 modules, throw them into the remaining two M.2 slots of my x4x4x4x4 bifurcation card (instead of the Optanes, temporarily) and then test again. Just to rule out any chipset SATA shenanigans...

I’m not sure that doubling down on ASMedia is the best solution. The original post shows a small difference in performance which could just as easily be chalked up to margin of error and/or client side load.

As @ericloewe mentioned, the B550 chipset also uses an ASMedia controller, so native performance from on-board SATA vs. the M.2 adapter can probably be chalked up to overhead associated with the chipset itself having other peripherals connected (even if not in use).

It's been demonstrated by the test of time that even old PCI-E Gen 2 LSI SAS controllers will generally outperform far more modern SATA controllers while being generally more reliable. You may be better served by an M.2-to-PCI-E breakout card and a 9207-8i. You would be limited to PCI-E 3.0 x4 bandwidth of course, but that is no different from your current proposed solution.

Also, as @etorix mentioned, bifurcation is not free and some loss is expected. You are using a desktop platform for server workloads, so scaling will never be perfect no matter what methodology you choose if you don't have a full-fat x8 or x16 slot available for your HBA.

e.g. (I have not tested this particular device)

I’ve dealt with this specific card and vendor many, many times with good results.
