16 MiB recordsize?!?!?!?! <--- clickbait punctuation marks for the algorithm

Not sure, myself.

We already have an example from @dxun showing a notable decrease in scrub performance with a 16M recordsize (assuming full 16M blocks) vs a 2M recordsize.

This hints that for sequential reads, a 16M recordsize can indeed backfire, doing more harm than good.

Consider that the following are processed single-threaded for every block that is read:

  • Generating a hash (to check its integrity)
  • Decryption (if the block is encrypted)
  • Decompression (if the block is compressed)

With smaller records, the same 16 MiB arrives as several blocks, so that work can be spread across 4, 8, or 16 threads; with a single 16M block, one thread does all of it. Doing it in parallel is theoretically superior to doing it in a single thread against the same 16 MiB of data (rough sketch below).
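
A rough back-of-the-envelope sketch of that argument, not ZFS code: Python's hashlib stands in for the per-block checksum (sha256 here, regardless of what the pool actually uses), and the comparison is one 16 MiB buffer on one thread vs. the same data as eight 2 MiB buffers on a thread pool. hashlib releases the GIL for large buffers, so the threads genuinely overlap.

```python
import hashlib
import os
import time
from concurrent.futures import ThreadPoolExecutor

data = os.urandom(16 * 1024 * 1024)            # stand-in for one 16 MiB record
chunks = [data[i:i + 2 * 1024 * 1024]          # the same data as eight 2 MiB records
          for i in range(0, len(data), 2 * 1024 * 1024)]

def checksum(buf: bytes) -> bytes:
    # Stand-in for the per-block hash/checksum step.
    return hashlib.sha256(buf).digest()

t0 = time.perf_counter()
checksum(data)                                 # one big block -> one thread does all the work
single = time.perf_counter() - t0

t0 = time.perf_counter()
with ThreadPoolExecutor(max_workers=8) as pool:
    list(pool.map(checksum, chunks))           # eight smaller blocks -> work spread across threads
parallel = time.perf_counter() - t0

print(f"one 16 MiB block : {single * 1000:.1f} ms")
print(f"8 x 2 MiB blocks : {parallel * 1000:.1f} ms")
```

On a box with spare cores the chunked run should finish noticeably faster, which is the same effect the scrub numbers above hint at.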


I think this is the best summary of why one might consider 16M recordsize.

You would also have to pair the above with a higher compression level, such as zstd-9 or above.
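
If you do go that route, it's just two dataset properties, something like the sketch below (dataset name is made up; whether recordsize=16M is accepted depends on your OpenZFS version and the zfs_max_recordsize module parameter):

```python
import subprocess

DATASET = "tank/media"  # hypothetical dataset name; substitute your own

# recordsize=16M pairs with a higher zstd level so each big block earns its keep.
# Both are standard OpenZFS properties set via `zfs set`.
for prop in ("recordsize=16M", "compression=zstd-9"):
    subprocess.run(["zfs", "set", prop, DATASET], check=True)
```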
