Best settings for data archives?

Hello dear community,
I have a lot of old data (photos, videos, documents). I touch this stuff maybe two or three times a year. That’s why I want to store it more efficiently. What would be the best record size for archives? Would you prefer Gzip-9 or Zstd-19?

Thank you very much :)

Both Gzip-9 and Zstd-19 will grind your server to a halt. Try them on a small subset of your data before committing to either.
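
One way to run that kind of trial outside of ZFS is a rough script like the one below. It is only a sketch under stated assumptions: `trial_compress` is a hypothetical helper, Python's `zlib` at level 9 stands in for Gzip-9 (same deflate algorithm, but not the ZFS code path), and the 1 MiB chunking only loosely imitates per-record compression. The throughput figure is useful for comparing levels against each other, not for predicting what the pool will do.

```python
import time
import zlib


def trial_compress(paths, level=9, chunk_size=1024 * 1024):
    """Compress each file chunk by chunk and report ratio and throughput.

    Assumption: chunk_size mimics a 1 MiB record; zlib is a stand-in
    for the compressor ZFS would actually use.
    """
    raw_bytes = 0
    packed_bytes = 0
    start = time.perf_counter()
    for path in paths:
        with open(path, "rb") as handle:
            # Read one "record" at a time and compress it independently.
            while chunk := handle.read(chunk_size):
                raw_bytes += len(chunk)
                packed_bytes += len(zlib.compress(chunk, level))
    elapsed = time.perf_counter() - start
    return {
        "raw_bytes": raw_bytes,
        "packed_bytes": packed_bytes,
        "ratio": raw_bytes / packed_bytes if packed_bytes else 0.0,
        "mib_per_s": raw_bytes / (1024 * 1024) / elapsed if elapsed > 0 else 0.0,
    }
```

Point it at a few hundred representative files, once with `level=9` and once with `level=1`, and compare both the ratio and the time spent. If the ratio barely moves, the heavier level is probably not worth the CPU.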

For the record size: 1 MiB.

For compression: neither. Use LZ4.

Media files (photo, video, audio) are not very compressible, and usually are already compressed.
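
To illustrate why already-compressed data gains almost nothing from a second compressor, here is a small sketch. It makes some assumptions: the sample text is made up, `pass_sizes` is a hypothetical helper, and `zlib` is used only because it ships with Python, not because it is what a JPEG or H.264 encoder uses. The point is just that the output of one compression pass looks close to random to the next one.

```python
import random
import zlib


def pass_sizes(data, level=9):
    """Return (original, after one pass, after two passes) sizes in bytes."""
    once = zlib.compress(data, level)
    twice = zlib.compress(once, level)  # compress the already-compressed bytes
    return len(data), len(once), len(twice)


# Hypothetical sample: text built from a small vocabulary with a fixed seed.
rng = random.Random(0)
words = ["photo", "video", "archive", "record", "dataset", "backup", "pool"]
text = " ".join(rng.choice(words) for _ in range(50_000)).encode()

original, once, twice = pass_sizes(text)
print(f"original={original} once={once} twice={twice}")
```

The first pass should shrink the text considerably, while the second pass should leave the size roughly unchanged. Photos and videos are in the same position as the input to that second pass.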

You do, however, still want inline compression enabled. LZ4 is the best candidate:

  1. It eliminates the extraneous padding in the last record of each file.
  2. It eliminates the sometimes hidden padding (runs of zeros) within some files and metadata, such as EXIF, video headers, etc.
  3. LZ4’s early abort makes the write-once, read-many performance a non-issue. (You could argue the same for ZSTD with OpenZFS 2.2.x, although trying to engage ZSTD for an archival dataset of mostly pictures and videos is pointless.)
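
The padding in point 1 can be put into numbers with a rough sketch. This uses a simplified model and should be read as an assumption, not as a description of ZFS internals: a file larger than one record is treated as a whole number of full-size records with the tail zero-filled, and a file that fits in one record is treated as having no such padding. `tail_padding` is a hypothetical helper, and `zlib` again stands in for LZ4, which is not in Python's standard library.

```python
import zlib

MIB = 1024 * 1024


def tail_padding(file_size, recordsize=MIB):
    """Bytes of zero fill in the last record of a multi-record file.

    Simplified model (assumption): a file no larger than one record is
    stored in a single block sized to fit, so it has no such padding.
    """
    if file_size <= recordsize:
        return 0
    return (-file_size) % recordsize


# A 2.5 MiB file with 1 MiB records leaves half a record of zeros.
padding = tail_padding(5 * MIB // 2)
print(f"padding without compression: {padding} bytes")

# Any compressor collapses a run of zeros to almost nothing.
print(f"same zeros after compression: {len(zlib.compress(bytes(padding)))} bytes")
```

Under this model the waste per file averages about half a record, which is why it matters more as the record size grows, and why even a cheap compressor is enough to remove it.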

I enabled zstd as the default for my 1-VDEV pool, which also holds mostly WORM archival data. My pokey little D1537 at 1.7 GHz slurps it up at 400 MB/s, so the impact of zstd vs. lz4 doesn’t seem to be material on more modern CPUs. Thank you for your guidance re record sizes and compression, @winnielinnie!
