Sustained Write Benchmarking on CORE

Good afternoon,
I wonder if the general speeds of CORE have changed significantly due to ongoing development at iXsystems.

Specifically, my recollection of ZFS write speeds was that one would expect a single-VDEV pool to write approximately as fast as a single drive, i.e. about 250MB/s for my He10s per the OEM spec sheet. Achieving 250MB/s on a sustained write was about right for my transfers over 10GbE in the past.

I recently decided to take better advantage of my sVDEV by adjusting recordsizes to 1M for image/video/archive datasets, rebalancing the pool, etc. The result is somewhat flummoxing: writes now go into the pool at a sustained 400MB/s (20GB test file), all over 10GbE fiber using a QNAP Thunderbolt 3 to 10GbE adapter.

I have yet to replicate the same transfers using my older Sonnet 10GbE adapter, but I suspect this has less to do with the adapter and more to do with the recordsizes, the sVDEV, and changes iXsystems has made under the hood. The pool is quite empty (20% full) and I’m using SMB. Snapshots are currently off.

Is it the combination of recordsizes and the sVDEV that caused this speed increase, or can pools now simply write significantly faster than I remember?

Caching in general got quite a substantial rework in ZFS itself recently; on iX’s side, IIRC there is some serious optimization work in the upcoming releases of both CORE and SCALE.


If the point of this thread is to celebrate iXsystems’ contributions to ZFS, you can start by checking out the long list of contributions from this saint of a man:
amotin (Alexander Motin) · GitHub



The 1M recordsize probably has the most significant effect for large, sequential writes. (And reads too.)

Think about it. Compared to the default 128K recordsize, there are 8 times fewer ZFS metadata operations for every file.

Every block needs a pointer, a generated checksum, and an attempted (or aborted) compression pass. (And, optionally, an encryption pass.)

A file that requires 1,000 of these operations at 1M recordsize would require 8,000 of these operations at 128K recordsize. (Same file, same size.)
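To put numbers on that, here’s a minimal sketch of the block-count arithmetic (the 20GB file size is taken from the post above; the calculation ignores compression and partial tail blocks, purely for illustration):

```python
# Back-of-the-envelope: per-block metadata operations at two recordsizes.
# Assumes a fully populated, incompressible sequential file -- illustrative only.

def block_ops(file_bytes: int, recordsize_bytes: int) -> int:
    """Number of blocks, and thus per-block metadata operations
    (block pointer, checksum, compression attempt), for one file."""
    return -(-file_bytes // recordsize_bytes)  # ceiling division

GiB = 1024 ** 3
file_size = 20 * GiB  # the 20GB test file from the post

for rs in (128 * 1024, 1024 * 1024):  # default 128K vs. 1M
    print(f"recordsize {rs // 1024:>4}K -> {block_ops(file_size, rs):,} blocks")

# recordsize  128K -> 163,840 blocks
# recordsize 1024K -> 20,480 blocks  (8x fewer per-block operations)
```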


Those are very good insights. Thank you!

Along similar lines, if all the metadata for lots of little bitty blocks goes straight to an SSD-based sVDEV, then the latency on all that overhead is significantly reduced vs. writing it to an HDD pool. SSDs don’t have heads to traverse to new sectors and all that. But I really did not expect a 50% boost over my standard large-file write performance.
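A grossly simplified back-of-the-envelope of that latency point (the access-time figures below are assumptions for illustration, not measurements, and real ZFS writes are batched into transaction groups rather than issued one at a time):

```python
# Assumed average random-access latencies -- NOT measurements.
HDD_ACCESS_S = 0.010    # ~10 ms seek + rotation for a 7200 rpm HDD (assumption)
SSD_ACCESS_S = 0.0001   # ~100 us for a SATA SSD (assumption)

metadata_ops = 20_480   # block count for a 20GiB file at 1M recordsize (see above)

print(f"HDD: {metadata_ops * HDD_ACCESS_S:.1f} s of cumulative access latency")
print(f"SSD: {metadata_ops * SSD_ACCESS_S:.1f} s of cumulative access latency")
# HDD: 204.8 s vs. SSD: 2.0 s -- wildly pessimistic for ZFS, which coalesces
# writes, but it shows why offloading small-block latency to the sVDEV helps.
```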

Thanks again, I appreciate it.
