Hi, I am using the recursive option on my main pool to take fairly frequent snapshots, with retention settings that should cap each dataset at 210 snapshots at any one time. Is there a way to get that number down, like merging snapshots? It’s come to a point where I can’t even use the GUI to see the snapshots anymore, which will become a problem if I ever want to restore. Now I got a warning that I have more than 10,000 snapshots even though I only have 38 datasets, which should mean roughly 8,000 snapshots at most (38 × 210 = 7,980). So how could this result in over 10,000? How do I get the total number of snapshots down, and how do I fix this “misdetection”?
You need a small script to get rid of all snapshots older than X days.
Look online. I modified one that takes as arguments the pool name, a grep pattern, and a number of seconds, hours, or days. It deletes all snapshots that match the pattern and are older than the given age.
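Not my actual script, but a minimal sketch of the selection part, assuming the snapshot list comes from `zfs list -H -p`, which prints the creation time as epoch seconds. The function name `select_old_snapshots` and the pool name `tank` are made up for illustration. It only prints the matching names, and the usage example pipes them into a dry run (`-n`) so nothing is destroyed.

```shell
#!/bin/sh
# Reads "name<TAB>creation_epoch" lines on stdin and prints the snapshot names
# that match the regex $1 and are older than $2 seconds, relative to "now" ($3).
# Hypothetical helper for illustration; it never deletes anything itself.
select_old_snapshots() {
    awk -F '\t' -v pat="$1" -v max="$2" -v now="$3" \
        '$1 ~ pat && (now - $2) > max { print $1 }'
}

# Usage on a live system ("tank" is a placeholder pool name, 30 days shown):
#   zfs list -H -p -t snapshot -o name,creation -r tank \
#     | select_old_snapshots 'auto-' $((30 * 86400)) "$(date +%s)" \
#     | while read -r snap; do zfs destroy -nv "$snap"; done
```

Drop the `-n` only after checking the dry-run output line by line.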
But, this also means that snapshots are not being deleted automatically.
This can happen when the snapshot naming scheme is changed, or maybe with some other unusual snapshot or replication setup?
I have properly set up the keep for x days on the snapshot tasks. I am not using anything special for replication, it’s simply the regular snapshot task creation tool. That’s why it’s weird that it doesn’t seem to be deleting any snapshots even though that is what was configured.
Maybe you should use CLI to handle snapshots. This is what I do 99% of the time.
Personally, having a huge amount of snapshots isn’t a problem on its own.
The question you have to ask yourself is what the snapshots are intended for.
Besides locking the state of the ZFS blocks over a long period, are you making use of replication?
If you have used replication in the past, it is possible a flag has been set to prevent deletion of stale snapshots which haven’t been replicated yet.
I haven’t used replication yet. I wanted to first set up snapshots and test them for some time before starting replication. I don’t have anything else configured except these snapshots… I don’t understand why it is not deleting them properly. Should I delete the snapshot tasks and all snapshots, and then use the CLI to create them? I’m honestly lost, since I was under the impression that everything was working fine, as the snapshot count of each dataset never went over the specified maximum.
Post how you have your snapshots set up. It might give a clue as to what is going on
Before you start deleting snapshots, you should read up on the subject and understand what a snapshot is.
For now, there is nothing wrong with having 10k snapshots. Maybe you could update the snapshot task so that snapshots are created less often.
I am aware of what a snapshot is. The frequency I have selected can be reduced, sure, but my main problem is that I’m getting alerts of having more snapshots than should be possible with my current configuration and retention settings. Will send the configured UI once I get back to my PC. Thank you
Are you using apps? In the pool that you selected for apps, some hidden datasets are created, and the recursive flag on the snapshot task affects those too… Maybe it’s just that?
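One way to check this is to count snapshots per dataset and see which ones dominate. A minimal sketch, assuming the snapshot names come from `zfs list` (a read-only command); the function name and the pool name `tank` are placeholders.

```shell
#!/bin/sh
# Reads snapshot names ("dataset@snapshot", one per line) on stdin and prints
# a snapshot count per dataset, datasets with the most snapshots first.
count_snapshots_per_dataset() {
    cut -d@ -f1 | sort | uniq -c | sort -rn
}

# Usage on a live system ("tank" is a placeholder pool name):
#   zfs list -H -t snapshot -o name -r tank | count_snapshots_per_dataset
```

If hidden datasets such as the ones under `.ix-apps` show up in that list, they would explain a total above 38 × 210.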
Ah yes, that’s probably it. I thought it only affected visible datasets, which I shouldn’t have assumed. How do I exclude all iX apps datasets from backups, since all applications with important data have a dedicated dataset created by me?
Seems simple enough, thank you! How do I get rid of those snapshots specifically?
Do they show up in the Snapshots page of the UI? Can they be filtered out?
If not, you’ll need to use the command-line, which is risky if you accidentally type or paste the wrong thing. Even missing a single character in the command can destroy your data.
There is a way to do this in “batch” and destroy all snapshots on a dataset with a single command. It requires grabbing the name of the earliest snapshot first. It’s also less complicated to do this one by one for each child, since the earliest snapshot of a child might not match the parent’s.
Honestly, I could get rid of all snapshots right now and it wouldn’t be a problem, so that’s an option. The UI sadly doesn’t load anymore so that’s not a possibility to filter that way. I think I can just wipe snapshots, keep current data and then fix the snapshot tasks. Otherwise I’m unsure how to delete singular snapshots, the docs haven’t been incredibly enlightening (probably because of my own incompetence lol).
Proceed at your own risk.
For these steps, replace <poolname> with the name of your pool.
1. Make a checkpoint of the pool.
    zpool checkpoint <poolname>
2. Confirm the checkpoint.
    zpool status <poolname> | grep checkpoint
3. Find the name of the earliest snapshot in the parent dataset.
    zfs list -H -t snap -o name -s creation <poolname>/.ix-apps | head -n 1
4. Do a dry run of what the batch destruction will look like. (Do not omit the % symbol. It is required for the batch operation.)
    zfs destroy -nv <poolname>/.ix-apps@<nameofearliestsnapshot>%
5. Destroy all snapshots on the dataset, using the name of the snapshot you found in step 3. (Do not omit the % symbol. It is required for the batch operation.)
    zfs destroy -v <poolname>/.ix-apps@<nameofearliestsnapshot>%
6. Repeat steps 3 to 5 with each child dataset under <poolname>/.ix-apps.[1]
7. If you believe everything went well, you can now discard the pool checkpoint.
    zpool checkpoint -d <poolname>
Read over every step carefully. Some lines wrap because of how Discourse limits the reading pane.
[1] If you need to list all children under .ix-apps, you can use this command:
    zfs list -H -t fs -r -o name <poolname>/.ix-apps
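If there are many children, the "find the earliest snapshot, then dry run" part can be looped. This is only a sketch under the same assumptions as the manual steps; the function names `earliest_to_range` and `dry_run_children` are made up here. It only ever runs `zfs destroy` with `-n`, so it prints what would be destroyed and changes nothing.

```shell
#!/bin/sh
# Turn a list of snapshot names (oldest first, one per line on stdin) into the
# "<dataset>@<earliest>%" range argument used for the batch destroy.
# Prints nothing if the dataset has no snapshots.
earliest_to_range() {
    head -n 1 | sed 's/$/%/'
}

# Dry run only: for the dataset $1 and every child below it, show what a
# batch destroy would remove. Hypothetical helper; -n means nothing is deleted.
dry_run_children() {
    zfs list -H -t filesystem -r -o name "$1" | while read -r ds; do
        range=$(zfs list -H -t snapshot -o name -s creation "$ds" | earliest_to_range)
        if [ -n "$range" ]; then
            zfs destroy -nv "$range"
        fi
    done
}

# Example (replace <poolname> first):
#   dry_run_children <poolname>/.ix-apps
```

Only once the dry-run output looks right would I rerun the individual `zfs destroy -v` commands by hand, with the checkpoint still in place.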
Thank you so much!
I deleted all unnecessary snapshots and excluded those datasets from future snapshots. I think this solved my problem. I also reduced the retention of very frequent snapshots as I don’t think I’ll need them to be available for long.
Again thank you all for your help!
