Hey @Noise
Reviewing some of this, including your system specs. I’m going to jump in with one important thing and then we’ll look at performance questions.
Header for Attention!
Important bit right off the bat - sync=standard is effectively equivalent to async (sync=disabled) on iSCSI for ESXi/VMware. Unless you manually set sync=always on this ZVOL, you potentially have data at risk in a sudden-shutdown scenario (e.g., hardware component failure) - so you’ll want to set sync=always.
Will this result in the requirement for a SLOG device? Possibly. You have CD8-R drives which should have PLP, but they are the -R (read intensive) models … but what you’re seeing should be within reason for sync=always.
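For reference, checking and changing the property from the TrueNAS shell looks roughly like this. The dataset name tank/vmware-zvol is a placeholder I made up, so substitute your actual pool/ZVOL path. The same setting should also be available as the Sync option when you edit the ZVOL in the web UI.

```shell
# Show the current sync setting (tank/vmware-zvol is a hypothetical name)
zfs get sync tank/vmware-zvol

# Require every write to reach stable storage before it is acknowledged
zfs set sync=always tank/vmware-zvol
```

The property change takes effect for new writes right away, so it is easy to flip, re-run your benchmark, and compare.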
With that out of the way, let’s get to the questions and thoughts.
Number one - can I ask why compression is disabled on the ZVOL? You’ve got more than enough CPU horsepower on the TrueNAS side to benefit from it, and the default LZ4 will early-abort on incompressible data, so it costs very little when the data doesn’t compress. You’d have to hit a 1.33x ratio to make this effective for space-saving on disk given your volblocksize and ashift, but it could still save memory in active ARC (since that’s compressed as well), meaning more cache hits.
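To show where a number like 1.33x comes from, here is a small sketch of the arithmetic. It assumes a 16K volblocksize and ashift=12 (4K sectors), which is one combination that produces 1.33x, and it assumes mirror or single-disk vdevs where space is allocated in whole sectors. RAIDZ padding changes the math. The function name min_useful_ratio is made up for illustration.

```shell
# Hypothetical helper: smallest compression ratio that frees at least one
# on-disk sector for a single block.
# Arguments: volblocksize in bytes, ashift (sector size = 2^ashift bytes).
min_useful_ratio() {
    awk -v vbs="$1" -v ashift="$2" 'BEGIN {
        sector  = 2 ^ ashift          # bytes per allocation unit
        sectors = vbs / sector        # sectors used by an uncompressed block
        if (sectors < 2) {            # a one-sector block cannot shrink further
            print "n/a"
            exit
        }
        # The block must fit in one fewer sector to save any space on disk
        printf "%.2f\n", sectors / (sectors - 1)
    }'
}

# Assumed values: 16K volblocksize, ashift=12
min_useful_ratio 16384 12
```

With those assumed values a block occupies four 4K sectors and has to shrink to three before any disk space is saved, which is where 4/3 comes from. Once compression is on, zfs get compressratio on the ZVOL shows what you are actually achieving.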
Generally when 4K write speed is low, what you’re fighting with is end-to-end latency. It might require a check in the BIOS unless your IPMI menu allows you to look, but investigate any “power savings” settings for PCIe link speeds, PCIe link ASPM (active state power management) and disable them. If your board is set to defaults it may be putting your Kioxia drives or the link to them into a lower-power, lower-bandwidth state - and then it’s got to “wake” them for the writes. This costs a few fractions of a second, but it’s nontrivial when you’re at bonded 25Gbps speeds. That said, you’re running sync=standard which should mean the tiny writes are hitting RAM, but it’s worth looking into this.
Network wise, you mentioned this is iSCSI MPIO - which is the correct way to do it - but did you set up the VMW_PSP_RR round-robin rule on the ESXi servers?
Since you’re on CORE, it should be:
esxcli storage nmp satp rule add -s "VMW_SATP_ALUA" -V "TrueNAS" -M "iSCSI Disk" -P "VMW_PSP_RR" -O "iops=1" -e "TrueNAS iSCSI Claim Rule"
This will tell your servers to flip links every I/O.
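A rough sketch of how to confirm the rule landed and see what each device is actually using. One thing to keep in mind: as far as I know, SATP claim rules are only evaluated when a device is claimed, so LUNs that were already presented before you added the rule may keep their old policy until a reboot or reclaim. The naa.xxxxxxxx identifier below is a placeholder for your real device ID.

```shell
# Confirm the claim rule exists
esxcli storage nmp satp rule list | grep -i truenas

# Show the path selection policy currently in use per device
esxcli storage nmp device list

# For an already-claimed device, set round-robin and the IOPS limit directly
# (naa.xxxxxxxx is a placeholder, not a real device ID)
esxcli storage nmp device set --device naa.xxxxxxxx --psp VMW_PSP_RR
esxcli storage nmp psp roundrobin deviceconfig set --device naa.xxxxxxxx --type iops --iops 1
```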
On the subject of networking - can you briefly describe the IP layout? The correct IP SAN topology for TrueNAS is two non-overlapping subnets, and Gandalf standing in the middle shouting YOU SHALL NOT PASS (traffic between them) - so for example:
VMware                           TrueNAS
192.168.1.101 --- Switch A --- 192.168.1.100
192.168.2.101 --- Switch B --- 192.168.2.100
---
You don’t want 192.168.1.101 to be able to see 192.168.2.100 in this layout. A vmkping -I from the first interface to the second target should fail with a no route to host message.
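As a sketch of that check from the ESXi shell, assuming vmk1 is the VMkernel port holding 192.168.1.101 (the vmk numbering here is a guess, so confirm yours first):

```shell
# List VMkernel interfaces and their IPv4 addresses to find the right vmk
esxcli network ip interface ipv4 get

# Same subnet: this should get replies
vmkping -I vmk1 192.168.1.100

# Other subnet: this should fail
vmkping -I vmk1 192.168.2.100
```

Then repeat from the second VMkernel port in the other direction.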