Understanding zfs: What is the use case for nested datasets and why stripe mirrors?

thoresson · July 11, 2025, 1:50pm

Being new to zfs, there are at least two things that still confuse me:

1. What are the use cases for nested datasets? Is it only about organization, or are there technical aspects as well? One specific thing I’m considering currently is if datasets for desktop backups should be created as “root” datasets or nested as children under a backups dataset parent.
2. When reading the forum about vdev configurations, one frequent warning is about striped mirrors: “If one of the mirrors lose both drives, your whole vdev will crash.” So why create a pool with striped vdevs in the first place? Why not just create multiple single-mirror pools? One reason I can think of is that you don’t end up with unused space spread out over multiple pools, but are there other benefits with striped mirrors as well?

dan · July 11, 2025, 1:54pm

One of ZFS’ main benefits is pooled storage: put all your storage[1] into one pool, and then you can divide it up (using datasets or otherwise) as desired. Your suggestion eliminates that benefit. As for other benefits, the IOPS capacity of a pool is roughly proportional to the number of vdevs in the pool, so a pool of several striped mirrors can serve more IOPS than a single-mirror pool.

If the risk of pool failure is too high for your needs, you can always reduce that risk by using three-way mirrors. Or, if your use case doesn’t need mirrors at all, use parity RAID–RAIDZ2 or RAIDZ3 vdevs consisting of more disks.

[1] At least, all storage of a similar kind, e.g., all the large spinners.
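To put rough numbers on the risk trade-off between two-way and three-way mirrors, here is a minimal Python sketch. It is not from the thread, and it rests on loud simplifying assumptions: a given number of disks fail at the same time, every disk is equally likely to fail, and no resilver completes in between. The function name `pool_loss_probability` is made up for illustration.

```python
from math import comb


def pool_loss_probability(vdevs: int, width: int, failed: int) -> float:
    """Chance that `failed` simultaneous, uniformly random disk failures
    destroy a pool made of `vdevs` striped mirrors, each `width` disks wide.

    Assumption: the pool is lost as soon as any one mirror loses all disks.
    """
    total = vdevs * width
    if not 0 <= failed <= total:
        raise ValueError("failed must be between 0 and the number of disks")

    # Inclusion-exclusion: count the failure sets that wipe out at least
    # one complete mirror.
    fatal = 0
    for j in range(1, vdevs + 1):
        remaining = failed - j * width
        if remaining < 0:
            break
        sign = 1 if j % 2 else -1
        fatal += sign * comb(vdevs, j) * comb(total - j * width, remaining)
    return fatal / comb(total, failed)


if __name__ == "__main__":
    for vdevs, width, failed in [(2, 2, 2), (3, 2, 2), (2, 3, 2), (2, 3, 3)]:
        p = pool_loss_probability(vdevs, width, failed)
        print(f"{vdevs} x {width}-way mirror, {failed} failed disks: {p:.3f}")
```

Under these assumptions, two failed disks in a pool of two 2-way mirrors hit the same mirror one time in three, while a pool of three-way mirrors always survives two failures. Note that splitting the same disks into separate single-mirror pools does not change the chance that some mirror dies; it only limits the loss to that one pool's data, at the cost of the pooled space and IOPS described above.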

swc-phil · July 11, 2025, 2:17pm

I personally use nesting to apply the same settings to the child datasets. For example, the parent Media dataset can have a block size (the recordsize property) of 1M (or even 4M) instead of the default 128K, and light compression (lz4) instead of the heavier zstd-3 (my default).

Also, you can apply one snapshot schedule to a parent dataset and have it cover the children. That way, data that doesn’t need frequent (or any) snapshots can be excluded by keeping it in a separate dataset.

AIUI, you can also apply different replication settings to the parent datasets.
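The inheritance behaviour behind this can be sketched as a toy model in Python. This is only an illustration of the lookup rule (a child uses its own value if set, otherwise the nearest ancestor's, otherwise the default), not how ZFS is implemented. The pool name `tank` and the dataset names are hypothetical; the property values are the ones mentioned above.

```python
from __future__ import annotations

from dataclasses import dataclass, field

# Only the default named in the thread; other defaults are left out on purpose.
DEFAULTS = {"recordsize": "128K"}


@dataclass
class Dataset:
    name: str
    parent: Dataset | None = None
    local: dict[str, str] = field(default_factory=dict)

    def get(self, prop: str) -> tuple[str, str]:
        """Return (value, source), loosely mirroring the VALUE and SOURCE
        columns that `zfs get` prints."""
        node: Dataset | None = self
        while node is not None:
            if prop in node.local:
                if node is self:
                    return node.local[prop], "local"
                return node.local[prop], f"inherited from {node.name}"
            node = node.parent
        return DEFAULTS[prop], "default"


# Hypothetical layout: settings are made once on the parent.
tank = Dataset("tank", local={"compression": "zstd-3"})
media = Dataset("tank/media", tank, {"recordsize": "1M", "compression": "lz4"})
movies = Dataset("tank/media/movies", media)
backups = Dataset("tank/backups", tank)

if __name__ == "__main__":
    for ds in (media, movies, backups):
        print(ds.name, ds.get("recordsize"), ds.get("compression"))
```

Here `tank/media/movies` picks up both settings from `tank/media` without any configuration of its own, while `tank/backups` keeps the default recordsize and the pool-wide compression. On a real system the equivalent check would be `zfs get -r recordsize,compression` on the parent dataset.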

As for striped mirrors, Dan already answered this. To better understand the differences between pool geometries, you should read this guide.

thoresson · July 11, 2025, 2:24pm

Thanks! Not only did you answer my explicit question, you also hinted at one I have yet to dive into: Understanding blocksizes. Your examples and my intention to create a specific dataset for my photo library confirm that’s something I need to better understand as well!

Already did, but that was while I was still preparing for my first TrueNAS installation. Will revisit, thanks for the reminder that the paper exists!

awalkerix · July 11, 2025, 2:49pm

I prefer to avoid nesting datasets unless absolutely required. If files are within the same dataset they can be atomically renamed (moved). It also works better with file sharing clients in general.
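The rename point can be illustrated with a short Python sketch. It assumes each dataset is mounted as its own filesystem, so a plain rename across a dataset boundary fails with EXDEV and the move has to fall back to copy-then-delete, which is neither atomic nor instant. The helper names `same_filesystem` and `move_file` are made up for this example, and the fallback handles regular files only.

```python
import errno
import os
import shutil


def same_filesystem(path_a: str, path_b: str) -> bool:
    """True if both paths live on the same mounted filesystem
    (same device ID), which is the precondition for an atomic rename."""
    return os.stat(path_a).st_dev == os.stat(path_b).st_dev


def move_file(src: str, dst: str) -> str:
    """Move a regular file; report whether the move was an atomic rename
    or a copy-and-delete fallback."""
    try:
        os.rename(src, dst)  # atomic, but only within one filesystem
        return "renamed"
    except OSError as exc:
        if exc.errno != errno.EXDEV:
            raise
        # Crossing a filesystem boundary: the data has to be rewritten.
        shutil.copy2(src, dst)
        os.remove(src)
        return "copied"
```

Calling `move_file` between two directories of one dataset should report "renamed"; between a parent and a nested child dataset it would be expected to report "copied", and the time taken grows with the file size.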

winnielinnie · July 11, 2025, 2:52pm

This only makes sense in the context of the dataset’s usage, covering all files and folders within.

I don’t need to atomically rename or move anything from my streaming media dataset into my desktop backups dataset. There’s a reason why “streaming media” and “desktop backups” are different datasets, not folders.

awalkerix · July 11, 2025, 2:55pm

That would be a reasonable requirement for a dataset. I typically stick with one dataset per share and avoid nesting within shares.