ZFS Snapshots Not Being Deleted

I have a single periodic task configured on my Scale box to take a daily snapshot of the Apps dataset, and retain snapshots for 2 days. However, when I looked in the Datasets/Snapshots section of the UI just now I noticed that I have more than 2000 snapshots dating back to May.

Any idea why they’re not getting deleted, what I can do to fix that, and how I can clear out the old ones now?

Here’s a screenshot of the snapshot task configuration:

Closer inspection of the snapshots shows that it is correctly keeping only 2 snapshots, as I have just one each for today and yesterday in September. But then I have loads more from back in May/June that I need to get rid of. I guess I should learn how to use ‘zfs destroy’ :slight_smile:

What is their naming convention?

Are they recursive, or only for the Apps dataset directly?

How were they originally created?

They’re recursive, so there’s one for each child dataset under Apps, with a naming convention as follows:

Apps/ubooquity@auto-2024-06-12_06-00 260K - 11.2M -
Apps/ubooquity@auto-2024-06-13_06-00 276K - 11.3M -
Apps/ubooquity@auto-2024-06-14_06-00 268K - 11.3M -
Apps/ubooquity@auto-2024-06-15_06-00 272K - 11.3M -
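
A listing like this can be pulled straight from the shell, by the way; the grep just narrows it down to the automated snapshots:

zfs list -t snapshot -r Apps | grep @auto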

They were originally created by this same periodic snapshot task back when I was keeping snapshots for longer. I think when I shortened the retention period, it didn’t automatically clear out the old snapshots I no longer wanted.

The ‘@auto’ part of the name is what’s pertinent to these automated task snapshots.

That’s 6am.

Your screenshot shows 5am.

This is why zettarepl[1] is not pruning them: the timestamps in their names no longer match the task’s naming schema and schedule.
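
To illustrate, assuming the task uses the common auto-%Y-%m-%d_%H-%M naming schema with a daily 5am schedule (your exact schema may differ), zettarepl only treats snapshots whose parsed timestamp matches that schedule as its own:

Apps@auto-2024-09-10_05-00 ← 5am, matches the schedule, subject to retention
Apps@auto-2024-06-12_06-00 ← 6am, never produced by the current schedule, left alone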


  1. zettarepl is name-based. It does not use a companion database, metadata, or any other means of understanding a snapshot other than its actual “parseable” name.


Ah, so I must have changed the snapshot time at the same time. Makes sense when explained, not at all obvious when configuring it.

Thank you :pray:


You can use the % symbol to tell zfs destroy to delete an entire sequence of snapshots. Just be aware: it will destroy ALL snapshots that fall within this range, ordered by creation time, with a single command.

An example looks like the following.

To do a non-destructive “dry run”:

zfs destroy -nvr Apps@auto-2024-01-01_06-00%auto-2024-07-31_06-00

The above should cover all snapshots between January 1 and July 31, on the Apps dataset and all of its children.

It will tell you which snapshots would be destroyed, and how much space would be reclaimed.
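
The dry-run output looks something like this (names and sizes here are illustrative):

would destroy Apps/ubooquity@auto-2024-06-12_06-00
would destroy Apps/ubooquity@auto-2024-06-13_06-00
...
would reclaim 536K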

The -n flag makes it a dry run, -v lists each snapshot it would act on, and -r applies the command recursively to child datasets.

If you’re feeling brave, you can remove the -n flag, and run the command again with admin/sudo privileges.
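
That is, something along these lines:

sudo zfs destroy -vr Apps@auto-2024-01-01_06-00%auto-2024-07-31_06-00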

I’d advise that you create a checkpoint first, just in case.

zpool checkpoint <nameofpool>

And then afterwards, if everything appears fine, you can remove the checkpoint:

zpool checkpoint -d <nameofpool>
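
While the checkpoint exists, the space it pins is visible as the pool’s read-only checkpoint property:

zpool get checkpoint <nameofpool>

And if something does go wrong before you discard it, the pool can be rewound to the checkpoint at import time with zpool import --rewind-to-checkpoint.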

Perfect, thanks again

Quick question, though, @winnielinnie: other than taking up space, does leaving snapshots and checkpoints in place have any performance impact?

In VMware, access to a storage volume really slows down if you’re making changes on top of a snapshot, so it’s considered bad practice to leave them around for longer than you need to. Is the same, or something similar, true of ZFS?

ZFS snapshots (and checkpoints) don’t have that issue.

Leaving them around does let used space accumulate over time, as well as create clutter when it comes to management.

But other than filling your pool’s capacity beyond 80%, you shouldn’t see any decrease in performance.
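
You can keep an eye on that with the standard pool listing:

zpool list -o name,size,allocated,free,capacity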


VMware and ZFS snapshots are very different beasts that work in completely different ways. ZFS snapshots and new data are in the same dataset. I have a feeling that VMware snapshots hold the new data on a completely different virtual disk.
