Snapshots defy math and logic. "THEY DON'T MAKE SENSE!"

Why ZFS “snapshots” don’t make sense
A children’s book for dummies, by a dummy :nerd_face: :+1:


“I don’t really understand how snapshots work.”

“Why doesn’t destroying my snapshot free up more space?”

“How come these numbers don’t add up?”

Does this sound like you? Stop thinking about ZFS, pools, datasets, blocks, and snapshots.

You want to demystify snapshots? Think about trucks, boxes, and colored stickers.


What this illustrative walkthrough is not:

:x: It’s not a technical explanation of ZFS, snapshots, or any related technology
:x: It’s not meant to cover all uses and nuances of snapshots
:x: It’s not meant for experienced users or admins of ZFS or TrueNAS

What this illustrative walkthrough is:

:white_check_mark: A way to understand ZFS snapshots from a layperson’s view
:white_check_mark: To help demystify snapshots for users new to ZFS or TrueNAS


You own a truck. It only has so much room for boxes. Twelve boxes is its maximum capacity. You like to use your truck to store things for yourself and for others.

Right from the start, your truck is empty. Say “hello” to your new truck.

As Confucius said in ancient China, “A truck is most useful when it is empty.”

There are some rules about your truck, and how to add, remove, and tag boxes.

Here are the rules:

  • When you add a new box, it has a white sticker on it
  • If you want to remove a box, you must first rip off its white sticker
  • To tag colored stickers on your boxes, they must have existing white stickers
  • When you tag with a color, all boxes with white stickers must be color-tagged
  • You cannot use the same color to create another “color tag set”
  • When you rip off a colored tag, all stickers of this color must be ripped off
  • A box cannot be removed if it has at least one sticker on it
  • Unlike colored stickers, you can rip off the desired number of individual white stickers

Please read these rules again. They are very important. As a new truck owner, you are bound to these rules like the laws of nature.
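For readers who prefer code to stickers, the rules above can be sketched as a tiny Python model. This is only an illustration of the analogy, not how ZFS is implemented. The class name `Truck`, its method names, and the box names are all made up for this sketch, and it assumes a color becomes available again once its whole set has been ripped off.

```python
class Truck:
    """Toy model of the sticker rules (an analogy, not real ZFS internals)."""

    CAPACITY = 12

    def __init__(self):
        self.boxes = {}             # box name -> set of sticker colors on it
        self.colors_in_use = set()  # colored tag sets that currently exist

    def add_box(self, name):
        if len(self.boxes) >= self.CAPACITY:
            raise RuntimeError("truck is full")
        self.boxes[name] = {"white"}  # every new box arrives with a white sticker

    def rip_white(self, name):
        # White stickers can be ripped off one box at a time.
        self.boxes[name].discard("white")
        self._remove_unstickered()

    def tag(self, color):
        # A color can only be used for one tag set at a time.
        if color == "white" or color in self.colors_in_use:
            raise ValueError(f"color {color!r} is not available")
        self.colors_in_use.add(color)
        # Every box that currently has a white sticker gets the new color.
        for stickers in self.boxes.values():
            if "white" in stickers:
                stickers.add(color)

    def rip_color(self, color):
        # Colored stickers come off as a whole set, never individually.
        for stickers in self.boxes.values():
            stickers.discard(color)
        self.colors_in_use.discard(color)  # assumption: the color can be reused
        self._remove_unstickered()

    def _remove_unstickered(self):
        # A box leaves the truck only when it has no stickers left.
        self.boxes = {n: s for n, s in self.boxes.items() if s}


truck = Truck()
for name in "ABCDEFGH":
    truck.add_box(name)
truck.tag("red")
truck.rip_white("A")
truck.rip_white("B")
print(len(truck.boxes))  # A and B are still protected by red
truck.rip_color("red")
print(len(truck.boxes))  # now A and B are gone
```

Note that there is no "remove box" method at all: removal is purely a side effect of a box losing its last sticker.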


You decide to add some boxes inside your truck.

Notice how they all come pre-tagged with white stickers?


Let’s get rid of a couple of these boxes! Remember the rules? First rip off their white stickers!

If we slow things down, you can see that there is nothing to “protect” these boxes from removal!

Which means…


…they are removed!

Good for you! You just made space for two more boxes by ripping off those two particular white stickers. (Not that you really needed the extra room.)

This concludes the basics of: “If you want to remove boxes, you must first rip off their white stickers. If a box has no more remaining stickers on it? It gets removed!”


Let’s go back again to try something else instead. :clock1030:

Here we are again. You have four boxes with white stickers. Nothing was ever removed.


You add several more boxes inside your truck. All come pre-tagged with white stickers, as expected.

The truck isn’t full, but space is getting limited…


You like the way things are with your white sticker boxes. Just like this. Nothing more, nothing less.

You’re so satisfied with the current inventory of white sticker boxes that you decide to commemorate this occasion by applying a red sticker to all boxes that currently have a white sticker.

It’s at this point you realize that no matter what happens to any of your white stickers, you know exactly what this “beautiful set” of boxes was at this exact point in time, since they all have a red sticker on them.


To test this theory out, you rip off the white stickers from two boxes that aren’t so important.

Lo and behold. They are not removed from the truck! Why? Remember the rules? As long as a box has even just one sticker, it cannot be removed.

If you wanted to free up the space of two boxes, that’s too bad. They’re not going anywhere!

This concludes the basics of: “If a box has even a single sticker on it, whether white or colored, it cannot be removed from the truck.”


Let’s go back again to try something else instead. :clock1030:

Here we are again. You have eight boxes. Nothing was ever removed.


You add a few more boxes in your truck.

Notice how these newly added boxes have no red stickers on them?

Unfortunately, you cannot use the same color again if you want to preserve this lovely “set of boxes” that currently have white stickers.

But wait! You know of a brilliant way to commemorate this “lovely set”. You can tag all white sticker boxes with a new color!

How about…


…blue!

Yes, that’ll do.

Every existing white sticker box gets a nice blue sticker! Good thinking!

You don’t have to worry about how the “red sticker set” does not include those three new boxes. You have them protected with your “blue sticker set”.


Let’s go back again to try something else instead. :clock1030:

Remember how you got to this point? Eleven boxes in total, eight of which have the protection of the “red sticker set”.

Oh dear. Your truck is getting full. You better remove some boxes…


Simple! You’ll just rip off the white stickers from two boxes that really aren’t so important upon reviewing your inventory. That should free up some room to…

Hey, wait a minute! That didn’t remove the boxes? That didn’t free up some extra room?

It’s those stupid red stickers! You could remove the entire “red sticker set”, but you don’t want to lose the protection on all your white sticker boxes.

What to do? What to do? You want to preserve the current state of your white sticker boxes.

Oh, of course, there is a solution…


…blue sticker set to the rescue!

By tagging all existing white sticker boxes, you get to preserve this current set of white sticker boxes.

This means that if you want to free up some space later, you can simply remove the entire red sticker set, while you get to keep a nice blue sticker set for future protection, since you totally like this current white sticker set.


In all your excitement, you forgot that the box in the bottom left corner is actually not important.

You rip off the white sticker!

Of course, this box stays in the truck because it still has at least one sticker on it. A blue one. If only you hadn’t rushed things.


You don’t have time to think. A friend needs you to add a new box to your truck.

Now it’s at full capacity!

But your friend doesn’t stop there…


…he wants to protect his box! So he tags all white sticker boxes with a green sticker set!

What is his deal? How is he even your friend?


You’ve had it. You feel your blood boiling. You go into a mad rage!

You want to remove all boxes from the truck, move to another city, and start over with an empty truck.

You rip off all the white stickers. It’s time to say “bye bye” to all remaining boxes!

What’s this? The truck remains full? Not a single box was removed?

Oh, that’s right. Every box has at least one sticker on it.


What if you were to rip off the red sticker set? How much room would free up?

Only two?


Let’s go back again to try something else instead. :clock1030:


What if you were to rip off the blue sticker set? How much room would free up?

Only one?


Let’s go back again to try something else instead. :clock1030:


What if you were to rip off the green sticker set? How much room would free up?

Only one?


Let’s go back again to try something else instead. :clock1030:


What if you were to rip off the red and green sticker sets? How much room would free up?

Only three?

Well, I guess that makes sense?


Let’s go back again to try something else instead. :clock1030:


What if you were to rip off the red, blue, and green sticker sets? How much room would free up?

Let’s do the math.

Ripping off the red sticker set would supposedly free up two boxes.

Ripping off the blue sticker set would supposedly free up one box.

Ripping off the green sticker set would supposedly free up one box.

2 + 1 + 1 = 4

Therefore, ripping off the red, blue, and green sticker sets will obviously free up four boxes.

Let’s try it now!

What’s this?! IT FREED UP TWELVE BOXES?
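If you want to check the arithmetic yourself, here is a minimal Python sketch that replays the story and counts how many boxes each combination of sticker sets would free. The box names (`b1` to `b12`) and helper names are made up. One detail is inferred rather than stated in the story: the "bottom left" box is assumed to be one of the three newer boxes, because that is the only way ripping off blue alone frees exactly one box.

```python
from itertools import combinations


def replay_story():
    """Rebuild the final truck: box name -> set of sticker colors."""
    boxes = {}

    def add(names):
        for name in names:
            boxes[name] = {"white"}

    def tag(color):
        for stickers in boxes.values():
            if "white" in stickers:
                stickers.add(color)

    def rip_white(names):
        # In this story every box still has a colored sticker when its
        # white one is ripped off, so nothing ever leaves the truck here.
        for name in names:
            boxes[name].discard("white")

    add([f"b{i}" for i in range(1, 9)])  # eight boxes
    tag("red")
    add(["b9", "b10", "b11"])            # three more: eleven in total
    rip_white(["b1", "b2"])              # two unimportant boxes
    tag("blue")
    rip_white(["b9"])                    # assumed: the "bottom left" box
    add(["b12"])                         # the friend's box: truck is full
    tag("green")
    rip_white(list(boxes))               # the mad rage
    return boxes


def freed_by(boxes, colors):
    """Count boxes left with no sticker once these color sets are ripped off."""
    return sum(1 for stickers in boxes.values() if stickers <= set(colors))


boxes = replay_story()
colors = ["red", "blue", "green"]
for size in range(1, len(colors) + 1):
    for combo in combinations(colors, size):
        print(" + ".join(combo), "->", freed_by(boxes, combo))
```

Under those assumptions, the single-color numbers only count boxes whose one remaining sticker is that color: two red-only, one blue-only, one green-only. The other eight boxes carry two or three colored stickers each, so no single set "owns" them, yet they all leave once every set is gone: 4 + 8 = 12.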


The end.

If you found this helpful, please “like” this post.

If you ended up even more confused after this, please “like” this post.

If you think this post was pointless, please “like” this post.

If you want to offend me and make me cry, please “like” this post.


Definitely my reason to do it. Seriously, though, this breaks it down about as simply as it can be explained–I think it’ll help a lot of people.

Edit: would you mind if I put this into Fester’s Guide? With attribution, of course.


I am offended. (While crying, too!) :+1:

I had considered providing the relevant zfs commands with each illustration, as well as adding footnotes with “exceptions to the rule”, or overlaying it with the nuance of replications and how “the order of snapshots matters”[1], and explaining the differences between a pool, dataset, and vdev.

But I think that would have worked against new users, since it’s better for them to have their own “aha!” moment.

I wanted this to remain as “dumbed down” and simple as possible.


  1. I intentionally placed the white stickers in the top corner of each box, and kept a consistent left-to-right placement for the colored stickers, which sort of reiterates the order of “old to new”.

@dan, You may!

No attribution needed. I’ll leave that up to you. It doesn’t matter to me either way. :+1:


Lunacy…Sheer Lunacy…


Instead, how about a little addendum that lists the most important commands with an example? (Not graphical, but fully working commands…)


This ‘dumbed down’ version could certainly be referenced alongside other information such as you suggest, without tying the two together directly. Combining them has the potential for confusion, simply because some people choose to read every single thing, even when it works against them as newcomers. Best to let the new information settle.


@tigersharke and @Haldi, agreed and agreed, with a caveat:

Perhaps I can include a hidden “spoiler” at the very end, which injects some examples and explains “nuance”, without getting too cluttered.

This is what I found to be the case with other guides (not just software, but any topic).

In my humble opinion, sometimes it’s better to give only the minimum amount of “understanding”, and allow the new user to explore further on their own.

To use this “dumbed down” picture guide as an example: The new user will immediately realize that snapshots don’t “duplicate” or “copy” data. So right from the start, they’ve already demystified an important fundamental about ZFS snapshots.


EDIT: I’m thinking of adding another picture with a pair of “red tinted” eye glasses. So that when you view the truck wearing these special glasses, you will only see the red sticker boxes. (To demonstrate how a “snapshot” is in fact a complete filesystem.) Then maybe another example with “green tinted” glasses.

Clearly this truck is a flatbed — we didn’t have to move any boxes to remove the deep ones.

(Rumors that a helicopter was involved have not been independently confirmed.)


I submitted a PR to the OpenZFS GitHub to switch to trailer trucks, but they insisted that flatbeds are cross-platform compatible.


To be honest, I’m still a little confused: what exactly is the analogy? I suppose the truck = dataset, the boxes = your data, and the stickers = snapshots, right?

I’m trying to understand ZFS replication better: Apparently, if I’m taking a snapshot every 6 hours on NAS A (1am, 7am, 1pm, 7pm), then use ZFS replication on NAS B and pull just the 1am snapshot every day, I should still have a complete daily backup of all my data. However, I fail to understand the following. If every snapshot just saves the incremental change… how am I still getting all the data even though I’m only pulling the 1am snapshot and not the other snapshots from this day? Is it because once you take another snapshot (let’s say add a green label), you basically add that green label to all boxes? But how does NAS B then know what the changed data is so it doesn’t have to transfer all boxes with a green label?

Only the colored stickers. The white stickers, which come with every new box, represent the “live filesystem”.


Understanding ZFS replications goes beyond just snapshots on a dataset.

You’re not “still getting all the data”.

What happens is that your destination’s “live filesystem” will be at the state of the latest snapshot received. If you were to delete everything on the source’s live filesystem, then the 1AM snapshot happens, then you replicate that 1AM snapshot to the destination, it means the destination’s live filesystem will appear empty. (Any snapshots that exist on the destination before this latest 1AM snapshot will still contain the “deleted” data.)

In order to access older copies of these filesystems, you can either roll back (a one-way destructive operation) or browse to the hidden .zfs/snapshot directory, which presents the filesystems in browsable form and lets you view and safely recover files.


Snapshots don’t save incremental changes. All data that had existed in the live filesystem is forever preserved in this “moment-in-time” filesystem (a “snapshot”). If the snapshot is destroyed, any data blocks that are no longer referenced in the live filesystem (white sticker) or at least one other snapshot (color sticker) are freed and no longer take up space.

“Differences” or “incremental” are only relevant when comparing two snapshots.


As for replications, the TrueNAS UI does not make obvious what is happening.

In ZFS, when you issue an incremental replication, you specify exactly two snapshots.

zfs send -i mypool/mydata@A mypool/mydata@D

If the -i parameter is used, it means only the differences between these exact two snapshots are transferred to the destination, no matter what other intermediary snapshots exist between them. This will only send the difference between A → D. Blocks unique to snapshots B and C are ignored.

If you instead use the -I parameter, it uses a “passing the baton” method, where it will send the differences between A → B and then B → C and then C → D.
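To make the difference between the two flags concrete, here is a rough Python sketch that treats each snapshot as a set of block numbers. Everything in it is hypothetical (the block numbers, the snapshot contents, the function names), and it leaves out parts of a real stream, such as the information about blocks that were freed in between. It only shows which new blocks each approach would send.

```python
# Four snapshots of one dataset, oldest to newest.
# Each one is modeled as the set of block IDs it references.
snapshots = {
    "A": {1, 2, 3},
    "B": {1, 2, 3, 4},  # block 4 was written after A
    "C": {1, 2, 5},     # blocks 3 and 4 were deleted, block 5 was written
    "D": {1, 2, 5, 6},  # block 6 was written after C
}
order = ["A", "B", "C", "D"]


def send_single_increment(snaps, base, target):
    """Like `-i base target`: one stream, straight from base to target."""
    return [(base, target, snaps[target] - snaps[base])]


def send_all_intermediates(snaps, order, base, target):
    """Like `-I base target`: one stream per consecutive pair of snapshots."""
    start, end = order.index(base), order.index(target)
    return [
        (order[k], order[k + 1], snaps[order[k + 1]] - snaps[order[k]])
        for k in range(start, end)
    ]


print(send_single_increment(snapshots, "A", "D"))
print(send_all_intermediates(snapshots, order, "A", "D"))
```

In this toy example, block 4 only ever existed in snapshot B. The single A → D increment never sends it, while the "passing the baton" variant does, because the destination ends up with B and C as well as D.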


You don’t add a green sticker to all boxes. You only add a green sticker to boxes that still have their white sticker (the “live filesystem”). Remember, the “live filesystem” is always the latest transaction group (TXG), no matter how recently you took a snapshot. Even if green stickers are applied to all white sticker boxes, that has no effect on an incremental send that only involves the difference between the red → blue stickers.

The white stickers (“live filesystem”) are never involved in a ZFS replication. Replications only involve snapshots.


You might think: “Doesn’t the destination dataset have a live filesystem? How is that possible if it only receives snapshots?”

This is because upon a successful receive, the latest snapshot (as specified in the zfs send command from the source dataset) becomes the destination’s “live filesystem”. There’s a caveat to that. You can only use it as a read-only filesystem. If you write anything to it, even something as simple as updating a file’s access timestamp, you break future incremental replications. It can no longer be used to receive incremental replications.

In order to continue receiving incremental replications, you’ll need to roll back the destination dataset to the latest snapshot it had received from the source, which will delete or undo anything that was created or modified after that point.

This is why there are options to enforce “read-only” on the destination dataset in the Replication Task settings.
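The behavior described above can be sketched as a toy Python model of the destination. The class `Destination` and its methods are hypothetical names for illustration only. Real ZFS tracks this very differently, but the rules of the game are the same: an incremental stream is only accepted if the destination’s latest snapshot is the stream’s base, and only if the live filesystem has not changed since then.

```python
class Destination:
    """Toy model of a replication target (not how ZFS really stores things)."""

    def __init__(self):
        self.snapshots = {}  # snapshot name -> frozenset of block IDs
        self.latest = None   # name of the most recently received snapshot
        self.live = set()    # blocks visible in the live filesystem

    def receive(self, base, name, added, removed=frozenset()):
        # The stream says which snapshot it is based on (None = full send).
        if base != self.latest:
            raise RuntimeError(f"latest snapshot here is not {base!r}")
        if base is not None and self.live != self.snapshots[base]:
            raise RuntimeError("destination was modified; roll back first")
        start = self.snapshots[base] if base is not None else frozenset()
        new_state = (start - removed) | added
        self.snapshots[name] = frozenset(new_state)
        self.latest = name
        self.live = set(new_state)  # live filesystem now matches this snapshot

    def write(self, block):
        # Any local change, however small, makes live differ from the snapshot.
        self.live.add(block)

    def rollback(self):
        # Discard everything changed since the latest received snapshot.
        self.live = set(self.snapshots[self.latest])


nas_b = Destination()
nas_b.receive(None, "day1", added={1, 2, 3})
nas_b.write(99)  # someone touched the destination
try:
    nas_b.receive("day1", "day2", added={4}, removed={3})
except RuntimeError as err:
    print("refused:", err)
nas_b.rollback()
nas_b.receive("day1", "day2", added={4}, removed={3})
print(sorted(nas_b.live))
print(sorted(nas_b.snapshots["day1"]))
```

After the second receive, the live filesystem no longer shows block 3, but the older "day1" snapshot on the destination still contains it.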


NAS B doesn’t know what the differences are. It is NAS A that is responsible for comparing two of its snapshots, knowing which blocks are unique between the two snapshots, and then sending only those blocks to NAS B. NAS B happily receives this stream of data, as long as it has the same “base” snapshot that exists on NAS A.

To demonstrate this, you can issue a “dry run” on NAS A with the -n and -v flags of the incremental zfs send command. This is done locally without involving another dataset, pool, or server. It will print the estimated size of this “incremental stream”. You can even output this stream into a file. (The file won’t be of much use to you, but it will contain only the differences of blocks between the two specified snapshots in the command.)


Sorry for the late reply. Thank you so much for that write up! This really helps.

One thing that came to my mind as needing more explanation:

There are no “incremental snapshots”. All snapshots contain all the data at the time they were taken.

So an incremental replication does not send an “incremental snapshot”, either.

The source explicitly computes the difference between, e.g., the new midnight snapshot and the last one. It then sends that difference to the destination, including the information that this increment is based on the last midnight snapshot.

So the destination can build the new midnight snapshot based on the last one and the increment it receives. If it does not have the last midnight snapshot, replication fails.

All differences/incrementals are computed and sent explicitly on the fly - there are no differences/incrementals stored, anywhere.


That’s why I responded to @cordvision when he said