Thought I would post here to see if anyone knows of a workaround to my issue.
I have Truenas1 and Truenas2 machines. The Truenas2 machine is offline most of the time and I only power it on to replicate snapshots.
I have a few snapshot tasks set to a 1-day retention on a large dataset, because the data changes frequently and I don’t have enough storage on the source system to keep snapshots any longer.
Everything works well as long as I replicate snapshots within 1 day. If I try to replicate snapshots after a few days I get this error:
No incremental base on dataset
I think this is happening because the source snapshot has ‘aged’ out: when the replication task runs, it also deletes the matching destination snapshot before starting, so there is no common base snapshot left for the incremental replication to start from.
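To make the failure concrete, this is roughly what is happening at the ZFS level (the pool, dataset and snapshot names below are just placeholders for illustration, not my actual layout):

    # Incremental replication needs a snapshot that exists on BOTH sides.
    # Compare the snapshot lists on source and destination:
    zfs list -t snapshot -o name,creation tank/data
    ssh truenas2 zfs list -t snapshot -o name,creation backup/data

    # An incremental send goes from a common base to a newer snapshot:
    zfs send -i tank/data@common tank/data@new | ssh truenas2 zfs receive backup/data

    # If retention has already destroyed @common on the source, there is
    # no base left to send the increment from, hence the error above.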
I could enable the option ‘Replicate from scratch’, but that would mean copying TBs of data over the network again, which is exactly what I want to avoid.
Surely the best solution here would be for the destination system to delete the ‘aged’ snapshots only after the replication completes. Is this possible? I didn’t see any option for this.
Am I doing something wrong here, or is this just how TrueNAS snapshot replication works?
The issue only comes up when my destination system has been offline for a few days and I haven’t had a chance to replicate the snapshots.
Any ideas or comments from anyone who has had a similar issue would be appreciated. Thanks!
That’s exactly it. The solution is to retain at least some snapshots long enough for them to be present on both the source and destination; that’s how incremental replication works.
Or keep the destination powered on.
There’s no option to retain a snapshot “as long as it is not replicated somewhere else”, if that’s what you mean.
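If you want to check whether a common base still exists before a run, comparing snapshot GUIDs on both sides is more reliable than comparing names, since the guid property survives send/receive (the dataset and snapshot names here are only examples):

    # Matching guids on source and destination identify a true common base.
    zfs get -H -o value guid tank/data@auto-2024-01-01-0000
    ssh truenas2 zfs get -H -o value guid backup/data@auto-2024-01-01-0000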
I have been reading through the documentation and I am going to try the ‘Save Pending Snapshots’ option.
I am hoping that if the destination system is offline and the replication task has not run (or has failed), the source snapshots won’t be destroyed until they have been replicated.
I am running a test now and will report back later today.
Been doing some testing today.
I created a snapshot task that retains snapshots for only 1 hour, set it to take hourly snapshots, and created a replication task with the “Save Pending Snapshots” option enabled.
I then shut down the destination TrueNAS and left it for over 5 hours.
When I checked the source I could see 5 snapshots, which was perfect and exactly what I wanted. I then ran the replication job and it completed with no issues. After that, the 4 older snapshots were purged from the source, and both source and destination were left with the same snapshot.
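For anyone wanting to double-check the same behaviour from a shell rather than the UI, the checks I effectively ran look something like this (names are made up for illustration):

    # While the destination is off, watch snapshots accumulate on the source:
    zfs list -t snapshot -o name,creation,used tank/data

    # After the replication run, both sides should converge on the same
    # newest snapshot:
    zfs list -t snapshot -o name tank/data
    ssh truenas2 zfs list -t snapshot -o name backup/data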
It seems to be working exactly how I need it to now. The option I was missing was ‘Save Pending Snapshots’ on the replication task.
I will test this over the next few days and see how it goes, but I suspect this was the fix for the issue.
As a comparison, I have two systems.
One serves as a backup of the main system.
The backup system is only turned on once a month to pull relevant snapshots.
The main system is set to retain those snapshots for 3 months, giving me 2 months leeway in case I, for whatever reason, am unable to follow my normal backup schedule.
In your case, as I take it you are pushing the backups (?), the “Save Pending Snapshots” option might indeed be enough, although I have not tried that method myself.
The “Save Pending Snapshots” option seems to be working for me. It looks like it is retaining the snapshots that have not yet been copied, while also keeping the chain intact and not deleting the last successfully replicated snapshot until replication completes.
This keeps the base snapshot as well, which is exactly what I wanted.
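I don’t know exactly how the middleware enforces this, but if it is done with ZFS holds (an assumption on my part), you can inspect them directly; a held snapshot cannot be destroyed until the hold is released (the snapshot name below is just an example):

    # List any holds protecting a snapshot from destruction:
    zfs holds tank/data@auto-2024-01-01-0000

    # A held snapshot fails 'zfs destroy' with 'dataset is busy' until
    # the hold is released.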
I have now set up my daily snapshot schedules, so tonight at 11pm they will create all the base snapshots.
Tomorrow they will all replicate to the destination TrueNAS, so the base will be replicated.
I will then need to leave the destination server offline for a few days to build up some daily snapshots and see if this works. (Fingers crossed.)
I find that when I move datasets around or deliberately clear out snapshots, I can run into this issue at times.
My usual fix is temporarily enabling ‘Replication from scratch’ on the replication task, which will delete the destination dataset if there is no incremental base and re-replicate it in full.
This is a dangerous option that can be misused and can cause data loss if you aren’t aware of how and where it’s replicating to, so be warned.
If you want to be more manual about it, you can also delete the destination dataset yourself, which will again trigger a full replication of that dataset.
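For reference, the fully manual version of that from a shell would be along these lines (destructive, and the dataset names are placeholders, so treat it as a sketch only):

    # Remove the out-of-sync destination copy entirely...
    ssh truenas2 zfs destroy -r backup/data

    # ...then seed it again with a full, non-incremental send:
    zfs send tank/data@latest | ssh truenas2 zfs receive backup/data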
Thanks, but you shouldn’t have to replicate from scratch unless the snapshots are out of sync.
My issue here was that the snapshot retention period was expiring before I had a chance to replicate, so when the replication job ran it found no previous base snapshot and therefore could not send the next incremental.
I am hoping that the ‘Save Pending Snapshots’ option will fix this. It certainly seemed to fix it in my testing with 1-hour snapshots, but we’ll see in a few days’ time whether it works.