ZFS Pool "Checkpoints": They work just like seatbelts! (Not really)

What is a ZFS Pool “Checkpoint”?


If you do destructive or experimental actions against files and folders within a filesystem, you can always resort to rolling back to a dataset’s snapshot.

Even if you don’t wish to do a full rollback (which will ultimately lose any new data written after the snapshot’s point-in-time), you can still retrieve old or deleted files from the read-only filesystem (i.e., the snapshot) on an individual basis.
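As a refresher, both approaches look roughly like this (a sketch only; the dataset mypool/mydata and the snapshot before-changes are made-up names for illustration):

# Copy a single file back out of the snapshot's hidden .zfs directory
cp /mypool/mydata/.zfs/snapshot/before-changes/important.txt /mypool/mydata/

# Or roll the whole dataset back to the snapshot (anything written afterwards is lost)
zfs rollback mypool/mydata@before-changes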

But what safeguard exists if you do something outside of a filesystem?

What happens if you…

  • …destroy a snapshot?
  • …destroy an entire dataset?
  • …add a new vdev to the pool that you instantly regret?
  • …create chaos with a poorly vetted batch script that contains zfs commands?
  • …enable pool features that you immediately regret, thus forfeiting backwards compatibility?
  • …rename / shuffle your dataset structure, only to immediately realize it was a bad idea?

This is where a pool checkpoint can come in handy.

You never want to find yourself in a situation where you need to resort to “rewinding” your pool back to a checkpoint, just as you never want to be in a situation where a seatbelt saves your life from a vehicle collision.

Ideally, you never mess up your pool.

Ideally, you never get into a car accident.

But just as seatbelts exist, so do pool checkpoints.

So then what is a ZFS Pool “Checkpoint”?
It is an immutable point-in-time state of the entire ZFS pool.


Managing Checkpoints with the command-line

To check the existence of a pool checkpoint, use the zpool get command, and look for a “size” under the VALUE column.

In this example, the pool “mypool” has no checkpoint:

zpool get checkpoint mypool

NAME       PROPERTY    VALUE    SOURCE
mypool     checkpoint  -        -

To create a checkpoint, use the zpool checkpoint command:

zpool checkpoint mypool

Now we can see the VALUE column has a “size”:

zpool get checkpoint mypool

NAME       PROPERTY    VALUE    SOURCE
mypool     checkpoint  540K     -

If you want to remember when you created a checkpoint:

zpool status mypool | grep checkpoint

checkpoint: created Tue June 4 14:40:30 2024, consumes 540K

:information_source: An empty output means that no checkpoint exists.

To discard a checkpoint, use the -d flag in the command:

zpool checkpoint -d mypool

Now we see that there is no “size” under the VALUE column once again:

zpool get checkpoint mypool

NAME       PROPERTY    VALUE    SOURCE
mypool     checkpoint  -        -

Actually “viewing” or “rewinding” to a checkpoint requires that the pool first be exported and then re-imported.
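The export itself is just the usual command (anything actively using the pool, such as services or shares, must be stopped first, otherwise the export will fail):

zpool export mypool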

To access a pool’s checkpoint in a read-only state (such as retrieving particular data that exists on a dataset you outright destroyed):

zpool import --read-only=on --rewind-to-checkpoint mypool

To rewind to a checkpoint (which will discard everything you did after the checkpoint’s creation), remove the --read-only flag:

zpool import --rewind-to-checkpoint mypool

:warning: Remember, you will lose everything after the checkpoint’s creation (including any newly added vdevs), and you will no longer have an existing checkpoint in the pool post-importation.

:warning: **For TrueNAS systems**, you must also include -R /mnt in the import parameters. This is not needed for vanilla ZFS systems, but it is required for TrueNAS.
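For example, on TrueNAS the rewind import would look something like this (-R /mnt sets the temporary “altroot” that TrueNAS mounts its pools under):

zpool import -R /mnt --rewind-to-checkpoint mypool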


Important caveats about Checkpoints

:warning: Do not treat pool checkpoints as you would dataset snapshots.

There are some important caveats and distinctions:

  • A pool can only have a single checkpoint
  • A checkpoint’s contents cannot be accessed from a (normal) imported pool; you must export and re-import with the --rewind-to-checkpoint option to access checkpoint-exclusive content
  • A scrub on a (normal) imported pool will not check the data that only exists in the checkpoint
  • A checkpoint is pool-wide, thus “rewinding” back to a checkpoint will undo everything in the pool that you’ve done after its creation
  • You are not supposed to “sit” on a checkpoint: After you create one and then do some “stuff”, you should very soon make a decision on whether you want to discard the checkpoint or rewind to it
  • You cannot remove or modify vdevs if a checkpoint exists (see the quick check after this list)
  • You can add a new vdev after creating a checkpoint, in which case rewinding to the checkpoint will act as if the new vdev (including any files saved after its addition) never existed
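Regarding the vdev caveat, a simple guard in any maintenance script is to check the checkpoint property first (a small sketch, using the same mypool from the earlier examples):

# zpool get prints "-" for the checkpoint value when no checkpoint exists
if [ "$(zpool get -H -o value checkpoint mypool)" != "-" ]; then
    echo "mypool has a checkpoint; discard it or rewind before touching vdevs."
    exit 1
fi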

TL;DR: What should I do?

  1. You want to try something that affects the entire pool or dataset(s). This includes “upgrading” pool features, destroying datasets or snapshots, adding a new vdev, trying out a batch script that uses zfs commands, receiving a replication stream to a dedicated backup pool that you might reconsider, and so on
  2. Before doing so, you create a checkpoint with zpool checkpoint mypool
  3. You go ahead and continue with whatever you decided on
  4. You assess the results. You need to make a decision, since it’s unwise to let a checkpoint “sit” in a pool for too long.
    4a. Are you happy with the results? :partying_face: Discard the checkpoint with zpool checkpoint -d mypool
    4b. Are you unhappy with the results? :scream: Export the pool and then rewind to the checkpoint with zpool import --rewind-to-checkpoint mypool
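Put together as a bare command sequence (same mypool as in the examples above):

zpool checkpoint mypool                      # step 2: take the checkpoint
# ...do the risky work...
zpool checkpoint -d mypool                   # step 4a: happy? discard it
# or, if you regret it:
zpool export mypool                          # step 4b: unhappy? export the pool...
zpool import --rewind-to-checkpoint mypool   # ...then rewind (TrueNAS: add -R /mnt)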

Always remember, kids!™

Use pool checkpoints as a safety net, with the mindset that you’ll never have to actually rewind your pool.

Wear seatbelts as a safety measure, with the mindset that you’ll never depend on them to save your life from a car accident.


Here is a forum post feature request for the TrueNAS GUI to incorporate the checkpoint feature.

Here is a Jira ticket you can vote on.

Excellent post! Question: will a checkpoint also work if you add an additional vdev to the pool by mistake?


I believe it’s only possible if the added vdev is a mirror (and all existing vdevs are mirrors).

I remember reading about this very specific condition. I’ll have to retrieve the article. (I think it was posted in the FreeBSD Journal.)

EDIT: Found it. While not from the FreeBSD Journal, this blog post was written by the same author who published the article about checkpoints in the 2018 FreeBSD Journal.

Apparently, there is no distinction on the vdev type. So this is not limited to only mirrors. :slightly_smiling_face:


I updated the guide to reflect what was discovered about adding new vdev(s).

Glad you asked, @somethingweird! :+1:


Well, I finally had a need to try it out. You have to disable the jailmkr startup script and reboot before you can export.

But on reimport, I got this:

root@truenas[~]# zpool import --rewind-to-checkpoint 10077816991006409476
cannot mount '/main': failed to create mountpoint: Read-only file system
Import was successful, but unable to mount some datasets
root@truenas[~]#       

So it was re-imported OK, but NOT mounted.

Trying a reboot now.

That makes sense for vanilla ZFS, but not for TrueNAS.

You’ll need to add -R /mnt to the import parameters. The reason I left this out is that an “altroot” is not necessary for non-TrueNAS systems.

However, I’ll go ahead and edit my post to include -R /mnt as a needed parameter for TrueNAS users.


Added a note to the post. :point_down:


I figured this out after spending 5 hours in agony, wondering why it was trying to mount to / instead of /mnt.

Thank you, this will save people incredible pain.

I just followed your instructions and my system was toast. I had to manually set the mountpoint property for main and for ix-apps. Learned a lot though!

Can you edit the original post to correct it? That’s what people will see.

Thanks!

It’s already edited. :slightly_smiling_face:

It looks the same as the original even after page reload:

There is no -R option added to the import. Am I missing something?


@winnielinnie I’d suggest emboldening “For TrueNAS systems” in your [i] block.

And maybe dropping a link to your “revised” comment.


Bolded.

The thing about TrueNAS vs “vanilla” ZFS is that you jump away from upstream defaults.

-R /mnt is one of them, but so is -d /dev/disk/by-partuuid, and the irregular location of the “cachefile” (/data/zfs/zpool.cache) for TrueNAS, compared to upstream OpenZFS (/etc/zfs/zpool.cache).

That’s why I leave the examples as “vanilla” as possible, since they can be applied and tweaked for any system.
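For instance, a TrueNAS-flavoured rewind of this guide's example pool would likely end up looking something like this (a sketch only; verify the device directory and altroot for your own system):

zpool import -R /mnt -d /dev/disk/by-partuuid --rewind-to-checkpoint mypool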


It also highlights why it’s important for TrueNAS to incorporate these useful features into their GUI, since a lot happens “under the hood”.

My ticket on Jira was “closed” because they want us to use the forums for feature requests instead.
