Recordsize and special_small_blocks sizing for predominantly photo/video editing?

Hi All,

Just a home user who has outgrown their existing NAS. I’m in the process of building out my new NAS based on TrueNAS 25.10.1, for which my main use cases will be:

  • Photo & Video Editing (Resolve Studio & GIMP / Affinity Photo)
  • Plex media storage
  • General user home drive file storage

I’ll be the only regular user requiring consistently good read & write performance. I do a lot of video editing as well as extended time-lapse sequencing that can involve thousands of camera RAW files. I’ll be using NVMe caching on my local Win11 PC (with 10GbE networking) for the video/photo editing software, with all files in my projects sourced from this new NAS. The basic config I’m planning is as follows:

  • Intel i5-11500H
  • 64GB DDR4
  • Mellanox ConnectX-3 10GbE SFP+
  • 2 x 512GB SSD (Mirror Boot)
  • 1TB NVMe (Apps VDEV for Plex)
  • 8 x 16TB Seagate X16 SATA (data VDEV: 2 x 4-disk RAIDZ1)
  • 2 x 1.46TB Enterprise SSD (mirrored sVDEV)

From the reading I’ve done, it would seem that the best balance of performance and capacity, while maintaining some level of redundancy for my use case and hardware, would be striped RAIDZ1 VDEVs with a mirrored special VDEV for metadata and small files. For file backups I have 2 x local NAS units and one remote, all with dual parity.

I’m hoping the above is on track and that my main decision now is to determine the best recordsize and special_small_blocks size. Based on the spread of sizes and counts of the files in my dataset below, what would be the best way to go about determining the best recordsize? Does the general advice that special_small_blocks be no larger than the recordsize still hold true? I’ve read that 128K ~ 1M is still the sweet spot for most applications. Given the files I’ll be editing range from roughly 12MB RAW files through to 100GB video files, should I just go for the upper recommended limit of 1MB for recordsize straight up?

1.0k:  36262 files,      27.2M total
2.0k:  16938 files,      50.0M total
4.0k:  10926 files,      62.2M total
8.0k:  80234 files,     938.9M total
16.0k:   4842 files,     109.3M total
32.0k:   9299 files,     393.3M total
64.0k:   3332 files,     309.6M total
128.0k:   6451 files,       1.1G total
256.0k:   6744 files,       2.5G total
512.0k:  40681 files,      27.1G total
1.0M:  26697 files,      38.4G total
2.0M:  64521 files,     189.4G total
4.0M:  75614 files,     437.4G total
8.0M:  89276 files,     956.1G total
16.0M: 105688 files,       2.2T total
32.0M:  38822 files,       1.4T total
64.0M:  32299 files,       3.1T total
128.0M:  25389 files,       3.9T total
256.0M:   6065 files,       2.0T total
512.0M:   2984 files,       2.0T total
1024.0M:   1954 files,       2.7T total
2.0G:   1838 files,       5.2T total
4.0G:   1080 files,       6.1T total
8.0G:    700 files,       7.5T total
16.0G:    146 files,       2.9T total
32.0G:      8 files,     326.8G total
64.0G:      7 files,     683.6G total
— Summary —
Total files: 688797 files,   41.8T total
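
To get a feel for how much of this would land on the sVDEV at different special_small_blocks cutoffs, here’s a rough Python sketch over the bucket totals above. It only sums whole buckets of file data and ignores metadata (which also lives on the sVDEV) and compression, so treat the numbers as a ballpark rather than a sizing guarantee:

# Rough estimate of file data that would land on the special vdev for a
# few candidate special_small_blocks cutoffs, using the bucket totals above.
# Metadata is not counted, so real sVDEV usage will be somewhat higher.
K, M = 1024, 1024**2

# (bucket upper bound in bytes, total bytes of files in that bucket)
buckets = [
    (1*K, 27.2*M), (2*K, 50.0*M), (4*K, 62.2*M), (8*K, 938.9*M),
    (16*K, 109.3*M), (32*K, 393.3*M), (64*K, 309.6*M),
    (128*K, 1.1*1024*M), (256*K, 2.5*1024*M), (512*K, 27.1*1024*M),
]

for cutoff in (64*K, 128*K, 256*K, 512*K):
    on_svdev = sum(total for size, total in buckets if size <= cutoff)
    print(f"special_small_blocks={cutoff//K}K -> ~{on_svdev / 1024**3:.1f} GiB of file data")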

It’s hard for me to give you an exact answer. I’m a beginner.

If you set special_small_blocks to a value equal to or larger than your recordsize, then effectively all data blocks will be written to the special vdev.

Blocks of size special_small_blocks and smaller are written to the special vdev.

Assume a dataset with a 1 MB recordsize and 512 KB for special_small_blocks. As I understand it, a file larger than the recordsize (say 2.2 MB) is written entirely in recordsize blocks, so all of it lands on the regular vdevs; only files small enough to fit in a single block at or below the 512 KB threshold go to the svdev.

Maybe my example will give you a clue. I’m still testing the special vdev. I set the recordsize to 1 MB, so blocks of 512 KB or smaller are written to the special vdev. Currently, my special vdev is using about 100 GB. It’s hard to compare this to your set of files.

1k: 101738
2k: 27103
4k: 25154
8k: 31199
16k: 32066
32k: 33395
64k: 44806
128k: 42996
256k: 37226
512k: 32635
1M: 31680
2M: 48602
4M: 46979
8M: 29415
16M: 67780
32M: 40237
64M: 11618
128M: 11149
256M: 8246
512M: 5637
1G: 2141
2G: 1061
4G: 1108
8G: 531
16G: 391
32G: 222
64G: 66
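
If it helps, here’s a tiny Python sketch of how I currently understand the placement rule, assuming recordsize=1M and special_small_blocks=512K, ignoring compression, and using made-up file sizes:

K, M = 1024, 1024**2
RECORDSIZE = 1 * M        # dataset recordsize
SMALL_BLOCKS = 512 * K    # special_small_blocks threshold

def placement(file_size):
    """Where a file's data blocks land (simplified: no compression)."""
    if file_size > RECORDSIZE:
        # Files larger than recordsize are stored as full recordsize
        # blocks, so none of their data qualifies for the special vdev.
        return "all blocks on the regular vdevs"
    if file_size <= SMALL_BLOCKS:
        # A single block at or below the threshold goes to the sVDEV.
        return "single block on the special vdev"
    return "single block on the regular vdevs"

for size in (16*K, 512*K, 700*K, 2*M + 200*K):
    print(f"{size / M:.2f} MiB -> {placement(size)}")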

Has the recommendation about having a power-of-2 number of data disks in a RAIDZ vdev gone away? I’m using a 4+2 x 6TB array (so RAIDZ2) for similar uses: my photos, a photo archive I manage, and some video, mostly historical restoration. Your setup of 4 x 16TB RAIDZ1 vdevs uses a 3-disk data width, midway between two powers of 2.

With disks even this big (never mind your 16TB) I’m afraid to go with less than RAIDZ2. Resilvering disks this big takes a LONG time (and is a heavy load), and if one disk dies and I replace it and start the resilvering process, I’m unprotected for the duration of the resilver.

(I’ve been using ZFS for my home NAS since 2006, back when I was working for Sun, running OpenSolaris. I migrated to FreeNAS, then TrueNAS, then SCALE, the last just months ago. So it’s likely I have a lot of old-fashioned ideas, and I can easily miss things changing despite making some effort.)

One of the things I’ve found is that trying to optimize configurations runs into inconvenient distributions of file sizes. For example, just this month I’ve added many hundreds of 20MB raw files of my photos – but each of them has a relatively tiny XMP file next to it, and when I render a batch of jpegs those go in a directory just under it. It’s all mixed together.

I didn’t mention the RAIDZ1 dangers because the OP mentioned having backups. The planned usage seems to be about getting the fastest read/write on conventional hard drives, and I think the OP is trying to balance space with performance in making the pool layout decision. 4x mirror pairs may offer better performance, but that also eats up half of the raw capacity.

I was thinking this was more of a temporary workspace than final storage of projects.

https://www.truenas.com/white-papers/#TrueNAS-PDF-zfs-storage-pool-layout/1/


A laptop CPU? Why?

Overkill for home use. Keep a copy of the configuration file; in the unlikely event of a boot drive failure, replace, reinstall, and the sole user will be back up and running in no time.
Rather, mirror the app pool.
And TRIPLE-mirror the special vdev.

8-wide raidz2 would be safer. I doubt that you need the extra IOPS.

About 15 years ago…

The hardware looks decent, with enough RAM (I presume this is one of those ITX boards with a soldered mobile CPU). The boot SSDs are unnecessarily big and might be better used as a mirrored Apps pool.

As for the pool layout: Z1 is fine if you have good backups, but two Z1 vdevs are pretty unlikely to make a huge difference in performance over a single Z2, especially with an sVDEV and a separate Apps pool. 300 IOPS isn’t much different from 150, and with metadata and thumbnails offloaded and most data over 1MB, a Z2 will do just fine.

A 2x mirror as sVDEV is in theory a risk, but I presume these are some high-quality 1.6TB write-endurance-optimised drives. STH & others have already shown that class of SSD to be around an order of magnitude more reliable than any HDD, and they resilver two orders of magnitude faster, so it should be fine if you keep an eye on pool health and don’t just ignore it for months. Maybe add a third SSD to the mirror once prices have come back down a bit; adding it later also staggers the total data written to that drive relative to the other ones.

I’d say go for an 8-wide Z2 with 1MB records (using anything smaller will lose more capacity to stripe padding). It’s not really worth it to go beyond 1MB records, except maybe for a tiny bit of extra compression ratio, so I’d say don’t. And I’ve never seen a more perfectly distributed filesize set for recordsize=1M with special_small_blocks=512K.
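
Rough numbers behind the stripe-padding point, as a back-of-the-envelope Python sketch for an 8-wide Z2 with 4K sectors (the RAIDZ allocation rule is simplified here, so treat the percentages as approximate; the ideal for 6 data + 2 parity is 75%):

import math

WIDTH, PARITY, SECTOR = 8, 2, 4096   # 8-wide RAIDZ2, ashift=12

def usable_fraction(recordsize):
    # Data sectors for one block, parity per stripe row, then the whole
    # allocation rounded up to a multiple of (parity + 1) sectors (padding).
    data = math.ceil(recordsize / SECTOR)
    rows = math.ceil(data / (WIDTH - PARITY))
    alloc = data + rows * PARITY
    alloc = math.ceil(alloc / (PARITY + 1)) * (PARITY + 1)
    return data / alloc

for rs in (128*1024, 256*1024, 512*1024, 1024*1024):
    print(f"recordsize={rs//1024:>4}K -> ~{usable_fraction(rs):.1%} of allocated space is data")

With 1M records the loss is essentially just the nominal 2-of-8 parity, while 128K records give up a few extra percent to padding on top of that.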

…which should then be carefully examined as to whether it provides enough SATA ports for NAS use.
On second thought, I wonder about the PCIe lanes as well.

Write-optimised drives may not be required, BUT enterprise SSDs and a “NAS-motherboard-with-a-consumer-laptop-CPU” are a strange, imbalanced mix.