iSCSI Setup: More Options for iSCSI Extent Logical Block Size Larger than 4k (4096 bytes)

Problem/Justification
(What is the problem you are trying to solve with this feature/improvement or why should it be considered?)
I would like to be able to specify additional logical block sizes when setting up an iSCSI Extent. In particular, I’d like the option to use values larger than 4096.

I initially discussed this here: [SCALE 24.10.2.2] Is it Possible to Manually Set Custom iSCSI Extent Block Size?
@Stux suggested I add a feature request, as this does not seem to be possible now.

Impact
(How is this feature going to impact all TrueNAS users? What are the benefits and advantages? Are there disadvantages?)

I am not yet certain if there are disadvantages; I’m still learning about iSCSI and don’t have a great deal of experience with it yet.

However, I would like to be able to set extent sizes larger than 4k to test whether this represents a performance gain for iSCSI targets backed by zVols being used by virtual machines, per the discussion here: Understanding Relationship Between zVol Block Size and iSCSI Logical Block - TrueNAS - Practical ZFS

I originally started investigating how this worked when I wanted to set up an iSCSI share to use in a Windows 11 (NTFS) VM.

Quoting from Jim Salter’s reply to my bumbling there (emphasis added), where I was asking about extent logical block size for a hypothetical zVol with a volblocksize of 64k:

NTFS’s default cluster size of 4K is for volumes smaller than 16TB, which may or may not apply.

But more importantly, 4KiB clusters eat a LOT of IOPS. You rarely want this in a VM, for the same reason you rarely (read: damn near never) want 4KiB clusters / blocks / volblocks for a VM using 4KiB native sectors: very few workloads actually fragment data that heavily, not even databases.

Also–and again, much like ext4–data under ntfs is stored primarily in extents, not just clusters. Extents are a range of contiguous clusters which are read or written in a single IOP. These tend to average closer to 64KiB.

Even Microsoft SQL server typically defaults to 64KiB extents: Pages and Extents Architecture Guide - SQL Server | Microsoft Learn

This means that you generally want your block size, or volblocksize, to roughly match the typical extent size, not the cluster size. So, 64KiB.

This does mean that you’ll get a bit of read and write amplification on the occasional very small extent–or on EXTREMELY fragmented ntfs filesystems–which will in turn decrease performance in those cases, and which also tells you that you still shouldn’t run virtualized filesystems extremely full, even if the host storage has plenty of room–because if you do, the guest will be forced to allocate storage that would normally live in large extents as fragmented individual clusters!

You don’t always get the absolute best performance out of an exact match between guest level extent (or other IOP) size and host level blocksize. But that’s usually a very good starting point, and I typically wouldn’t even recommend bothering trying anything smaller than half the typical extent or IOP size, or larger than double.

Half the typical IOP size will prioritize latency at the expense of IOPS and throughput. Double the typical IOP size will prioritize throughput and IOPS efficiency at the expense of small-operation latency. Pick your poison, and, hopefully… employ a royal taster before committing to a great big gulp in production.
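
To make that latency-vs-amplification tradeoff concrete for myself, I put together a rough back-of-the-envelope sketch. This is my own illustration, not something from Jim’s reply; the 64 KiB “typical extent” and 4 KiB “small write” figures are just the example numbers from the quote, and the model ignores alignment, compression, and caching.

```python
# Rough model: how many volblocks does a typical 64 KiB NTFS extent span,
# and how much write amplification does an occasional 4 KiB write see?
# Ignores alignment, compression, and caching - illustration only.

import math

TYPICAL_EXTENT = 64 * 1024   # ~64 KiB average NTFS extent, per the quote above
SMALL_WRITE = 4 * 1024       # an occasional small, cluster-sized write

def blocks_spanned(io_size: int, volblocksize: int) -> int:
    """Number of volblocks a single aligned I/O of io_size touches."""
    return math.ceil(io_size / volblocksize)

def write_amplification(io_size: int, volblocksize: int) -> float:
    """Bytes rewritten per byte the guest actually wrote (read-modify-write)."""
    return blocks_spanned(io_size, volblocksize) * volblocksize / io_size

for vbs in (4096, 8192, 16384, 32768, 65536):
    print(f"volblocksize={vbs // 1024:>2}K: "
          f"64K extent spans {blocks_spanned(TYPICAL_EXTENT, vbs):>2} blocks, "
          f"4K write amplifies {write_amplification(SMALL_WRITE, vbs):>4.1f}x")
```

At 4K, every typical extent costs 16 IOPs; at 64K, the occasional small write rewrites 16x the data. The half-to-double range Jim describes is where both numbers stay reasonable.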

I have a mirror vdev-backed pool where my virtual disks live. zVols default to a 16k volblocksize on the mirror pool, so even with the default volblocksize I’d still be interested in experimenting with values above 4k for VM disks.
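
For the actual experiments, my plan is simply to create a few test zVols with different volblocksize values and back separate extents with them. Something along these lines (the parent dataset path and the 32G size are placeholders for my setup):

```python
# Create test zVols with different volblocksize values to back separate
# iSCSI extents for benchmarking. "tank/iscsi" and the 32G size are
# placeholders; volblocksize can only be set at creation time.

import subprocess

PARENT_DATASET = "tank/iscsi"                # placeholder parent dataset
CANDIDATE_VOLBLOCKSIZES = ["16K", "32K", "64K"]

for vbs in CANDIDATE_VOLBLOCKSIZES:
    zvol = f"{PARENT_DATASET}/bench-{vbs.lower()}"
    subprocess.run(
        ["zfs", "create",
         "-s",                               # sparse, so the test volumes don't reserve space
         "-V", "32G",                        # volume size
         "-o", f"volblocksize={vbs}",
         zvol],
        check=True,
    )
    print(f"created {zvol} with volblocksize={vbs}")
```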

User Story
(Please give a short description on how you envision some user taking advantage of this feature, what are the steps a user will follow to accomplish it)

Looking again at the Logical Block Size dropdown on the current iSCSI Extent setup screen:

I envision one of two UI tweaks to enable a user to engage in more customized performance tuning and testing:

  1. Adding additional options to the dropdown list for 8k, 16k, 32k, and 64k; or
  2. Adding a “custom” option to the dropdown list that enables a field where the user can specify a valid larger size: 8k, 16k, 32k, or 64k. For simplicity’s sake, the field would also accept the values already in the dropdown menu (a rough validation sketch follows this list).
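
For option 2, the validation could be as simple as accepting any power of two within the supported range. Here’s a rough sketch of the idea; this is not actual TrueNAS middleware code, and the names and limits are just my assumptions:

```python
# Hypothetical validation for a "custom" logical block size field.
# Not actual TrueNAS middleware code; names and limits are assumptions.

MIN_LOGICAL_BLOCK = 512
MAX_LOGICAL_BLOCK = 64 * 1024   # 64k, the largest value proposed above

def is_valid_logical_block_size(value: int) -> bool:
    """Accept powers of two from 512 bytes up to 64 KiB."""
    in_range = MIN_LOGICAL_BLOCK <= value <= MAX_LOGICAL_BLOCK
    power_of_two = (value & (value - 1)) == 0
    return in_range and power_of_two

# The existing dropdown values and the proposed larger ones all pass:
for size in (512, 4096, 8192, 16384, 32768, 65536):
    assert is_valid_logical_block_size(size)

assert not is_valid_logical_block_size(3000)   # arbitrary non-power-of-two rejected
```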

We think it doesn’t matter. It’s really the tuning of the file system that sits on iSCSI that has the impact.

All file systems are used to 4k block HDDs, but they write in much larger blocks…

In general, we don’t do anything in the UI to test something… so, we’d need more evidence that it has an impact first.

This dropdown drives the SCST blocksize value - and many (most?) guest operating systems won’t handle a device with a logical block size >4096.

Heck, I think VMware still won’t handle remote media (iSCSI/FC) presented as 4Kn.

I’m betting you’re going to get the performance tuning results you’re after from a combination of guest OS filesystem cluster size, ZFS recordsize, and vdev topologies - not things like logical block size or ashift :slight_smile:
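
If you want to sanity-check what the initiator actually ends up seeing once the LUN is attached, something like this from a Linux guest works - just a sketch, and /dev/sdb is a placeholder for whatever device the initiator maps the LUN to:

```python
# Check the logical/physical sector sizes the iSCSI LUN is presented with,
# as seen from a Linux guest. /dev/sdb is a placeholder device name.

import subprocess

DEVICE = "/dev/sdb"   # placeholder: the attached iSCSI LUN

result = subprocess.run(
    ["lsblk", "-o", "NAME,LOG-SEC,PHY-SEC", "--nodeps", DEVICE],
    capture_output=True, text=True, check=True,
)
print(result.stdout)

# The same values are exposed in sysfs:
#   /sys/block/sdb/queue/logical_block_size
#   /sys/block/sdb/queue/physical_block_size
```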

Thanks for this info.

With a mirror pool, the zVols I create as backing storage for iSCSI default to 16k volblocksize. Having a volblocksize 4x the OS’s logical block size seems like a bad idea.

What would you set the zVol volblocksize to when using a 4K logical block on the extent?

I’d use 16K on mirrors, possibly 32K on RAIDZ.

The default was actually raised from 8K to 16K back in 2021 because, at 8K, workloads were getting zero benefit from compression, metadata/L2ARC was costing too much memory, and anything with expensive per-record costs was getting bottlenecked.

Thanks! I’ll set something up in Windows and see how it goes in benchmarks. :slight_smile:
