High 'iowait' when manipulating data (cp/rm) causing low usability of TrueNAS Scale

I recently migrated my TrueNAS from Core to Scale due to a failing boot disk and outdated jails/plugins; the transition itself went smoothly without any problems.
Taking advantage of the Scale environment, I expanded my storage pool with new disks. Furthermore, since my services were now running through the Apps functionality, I decided to reorganize the folders within my dataset.

This involved copying a significant amount of data (around 5 TB) and then deleting some of it, mostly duplicates. Initially, I intended to do this in one go using as few commands as possible, but I first tested the process with smaller subsets of approximately 150-200 GB.
It was at this point that I began experiencing severe server slowdowns, eventually leading to system freezes.

In between some reboots, I attempted the copy/delete operations both via a Windows machine accessing my SMB share and through an SSH terminal using Linux commands.

Both methods resulted in the same issue: cp/rm operations on my dataset caused high iowait on the system (as indicated by iostat or htop), making my TrueNAS server almost unusable until the iowait subsided.
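
For reference, this is roughly how I watch it while a copy is running (a minimal sketch; "tank" is a placeholder for my actual pool name):

    # Overall CPU breakdown, including the %iowait column, refreshed every 5 seconds
    iostat -c 5

    # Per-disk throughput, queue depth and latency (extended stats, MB/s)
    iostat -dxm 5

    # The same picture from the ZFS side, per vdev and per disk
    zpool iostat -v tank 5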

Considering that I previously performed similar manual data management tasks on my TrueNAS Core dataset without any issues, I suspect a potential incompatibility or configuration problem between TrueNAS Scale and my server setup.
However, I recognize that my expertise with TrueNAS is limited to that of an enthusiastic user, and my recent system modifications might not have always followed best practices.

Therefore, I would greatly appreciate assistance in identifying potential areas of investigation and resolution to restore my system’s stability.

My TrueNAS server
Hardware:

  • MB: Supermicro X9SCM-F
  • CPU: Intel(R) Xeon(R) CPU E3-1230 V2 @ 3.30GHz
  • RAM: 4x8GB DDR3 PC3-12800E ECC
  • NIC: Intel 82574L Gigabit
  • SATA controller: Intel (onboard), LSI 9240-8i (PCIe)

HDDs

  • 5x WD WD40EFRX (CMR) 4TB = RAIDZ1 (PCIe controller)
    ** sda, sdb, sdc, sdd, sdf
  • 1x WD WDS120G2G0A SSD 120GB = boot pool (onboard controller)
    ** sde

Software:
System = TrueNAS SCALE ElectricEel 24.10.2.2
Apps = Radarr, Sonarr, Sabnzbd, Plex, Jellyfin (App storage connected with host paths)
Services = SMB; No encryption

Timeline of changes to the system
#1 Boot device replacement from a WD Raptor HDD to a WD Green SSD

  • Disk swap + Clean install of TrueNAS Core 13.3 + Import of latest saved configuration

#2 Migration from TrueNAS Core 13.3 to TrueNAS Scale 24.04 Dragonfish

  • Migrated using the CORE UI Manual Update

#3 Creation & Configuration of the Apps on Scale 24.04 to replace my Jails/Plugins

  • Using the built-in Apps catalog and interacting with my existing dataset

#4 Update from TrueNAS Scale 24.04 Dragonfish to TrueNAS SCALE ElectricEel 24.10.2.2

  • Updated using the Scale UI Manual Update

#5 RAIDZ1 VDEV expansion

  • Addition of 2x WD40EFRX (CMR) 4TB, one at a time (the vdev went from 3 to 5 disks)

I have (well, had; 2 are remaining) 4 of the very same drives in another NAS. They are not great: I regularly have failing sectors on them. Up to now I have been able to repair 2 of them a couple of times, and 2 went to the trash… Performance also drops a lot towards the inner part of the platters: the outer tracks do about 150 MB/s, while the inner tracks drop to less than 80 MB/s in sequential reads.
Plus, if you added drives to your RAIDZ vdev, you also have the issue where the reported free space is wrong, etc… Better to break and rebuild it; that's what I'm doing at the moment.

Regarding the pool expansion: while the overall size was indeed initially reported lower than expected, over the course of the following days it came up to the size it should be (if I calculated correctly).
NB: My pool of 5x 4TB drives in RAIDZ1 is currently reporting 13.6 TiB capacity.
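
If it helps, this is how I sanity-checked the numbers (a sketch; "tank" again stands in for my pool name, and the arithmetic is only my own rough estimate):

    # Raw pool size (including parity) vs. usable space as seen by ZFS
    zpool list tank
    zfs list tank

    # Rough expectation for 5x 4 TB in RAIDZ1: 4 disks' worth of data space
    # 4 x 4 TB = 16 TB ≈ 14.5 TiB, minus ZFS metadata/slop overhead,
    # so a reported figure around 13-14 TiB looks plausible to me.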

As for the disks themselves, they have consistently reported, and continue to report, no errors on their periodic SMART tests.
When I was still on TrueNAS CORE last month, I didn’t keep track of their metrics, but they never gave me the impression of performance drops while I was “manipulating” my data.
Would you have any recommendation on how to verify whether your issues would apply here?
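
For context, this is more or less how I have been checking them so far (a sketch; device names as in the hardware list above):

    # Health summary plus SMART attributes (reallocated / pending sectors, etc.)
    smartctl -a /dev/sda

    # Start a long self-test; the result shows up later in the -a output
    smartctl -t long /dev/sda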

For the speed, you can use most disk utilities. I did a read surface test last week with HD Sentinel on one of these WD40EFRX drives; it started at 150 MB/s and then decreased linearly to 75 MB/s at the end.
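
On Linux you can get a rough equivalent with dd straight from the raw device (a sketch; it assumes /dev/sda is one of the WD40EFRX drives, it only reads from the disk, and the numbers will be skewed if the pool is busy at the time):

    # Outer tracks: read 1 GiB from the start of the disk, bypassing the page cache
    dd if=/dev/sda of=/dev/null bs=1M count=1024 iflag=direct

    # Inner tracks: skip to near the end of the 4 TB drive (~3.8 million MiB) first
    dd if=/dev/sda of=/dev/null bs=1M count=1024 skip=3700000 iflag=direct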