Problem/Justification
The ZStd library has received several updates since release 1.4.5, which is what ZFS currently uses within its codebase. Practically every subsequent release brings improvements to decompression speed, and some also improve compression speed.
Unless there are specific blocking issues, it would generally be nice if the library got an update.
Impact / User Story
More performance for everyone.
I’m just bringing it up here because TrueNAS has a general commercial interest in ZFS and does active development on it.
I wonder what he means by this?
Yeah, you’d think that a compressed file format would/should not change. It would certainly be hilarious if each and every RAR archive decompressed into garbled crap just because WinRAR fancied messing around a bit on a blue Monday.
I happen to be an expert in compression algorithms. Granted, my experience is primarily in lossy algorithms.
Anyway, the issue is that ZFS assumes the same block of data will compress the same way with the same algorithm.
The assumption is made by the L2ARC code at least, and possibly the dedup code.
Which on the face of it seems a valid assumption.
Most lossless compression algorithms (yes, codecs; @HoneyBadger was correct) assume that the decoder will produce 100% the same results as any other version of the decoder. (This is not always the case with lossy algorithms, but that is problematic, and all current standards, at least in the video space, ban that behavior.)
But they do not assume that the coder/compressor will. And in fact, SIMD optimizations and the like can affect the results depending on how they are implemented.
So, basically, newer versions of zstd can produce different results given the same input. This could be worked around.
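To make the asymmetry concrete, here is a minimal illustration using zlib from the Python standard library as a stand-in (the stdlib ships no zstd bindings): different encoder settings produce different compressed bytes, yet decompression is deterministic and always recovers the identical original data.

```python
import zlib

# Same input, two different encoder configurations: the compressed
# streams differ, but any decoder recovers the exact original bytes.
data = b"the same block of data, repeated " * 100

fast = zlib.compress(data, level=1)  # fast encoder settings
best = zlib.compress(data, level=9)  # slow, thorough encoder settings

assert fast != best                     # encoder output varies
assert zlib.decompress(fast) == data    # decoder output does not
assert zlib.decompress(best) == data
```

The same effect bites ZFS when the library itself changes between versions: the bytes on disk (or in L2ARC) no longer match what a fresh compression pass of the same block would produce.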
Part of it is because the zstd developers changed the defaults to “spend” some perf. Pity. And secondly, some of the perf gains come from non-deterministic output, for example from racing across threads.
So, it’s hard and the gains are smaller than you’d think.
There was an OpenZFS project to do early bailout of compression when possible. It involved using another, faster compression algorithm, LZ4, to check whether the block could be compressed at all; if not, skip ZSTD altogether. Hence “early bailout”, since LZ4 is reasonably fast.
I tried looking up the GitHub issue and any project associated with this compression early bailout, but did not find it easily.
ZSTD uses an “early abort” by doing a first pass with LZ4 (and then ZSTD-1, only if LZ4 claims the block is not compressible).
Long story short: rincebrain found these to be the best conditions to automatically enable ZSTD “early abort”:
Recordsize must be 128K or larger.
ZSTD compression level must be set to ZSTD-3 or higher.
If your dataset’s recordsize is 128K or higher and you set the compression to ZSTD-3 or higher, then you are automatically using ZSTD early abort.
LZ4 and ZSTD-1 are so fast as heuristics that you gain the benefits of compression at higher ZSTD levels, without much of a performance penalty for incompressible data.
Glad the early bailout is both integrated and working.
However, my comment also applies to updating the ZSTD code. It may need to have the early bailout re-implemented / re-integrated. Since OpenZFS does not use ZSTD as plain code (the way someone might use the gzip command), there could be complications.
Of course, most of what I’ve heard about the OpenZFS source code is that it is clean, well documented, and ever improving. So it is entirely possible that bumping the ZSTD version would not take much effort, at least on the early-bailout side. (I can’t speak to the compatibility issues…)
It may need to have the early bailout re-implemented / re-integrated
It does not; we just call the zstd library to compress blocks, and early bailout just tests compression first using LZ4, never even starting/opening the zstd lib if it’s not needed.
So it is entirely possible that to bump ZSTD version up would not take much effort
Having worked on the testing stack for ZSTD-on-ZFS I can say:
It’s very easy to update the library and extremely hard to do so in a fully backwards-compatible way (ARC/L2ARC/dedup).
at least for the early bailout side
There is no ZSTD-version-specific code/requirements for early bailout. Bailout is done using LZ4.
If the improvements keep stacking with each further release, I sure hope a solution will be found eventually. At some point, there will be plenty of performance left on the table, while storage shifts more and more from rust to solid-state drives.