Hello!
A couple of months ago I finally decided to back up my main TrueNAS machine to a remote machine (also running TrueNAS Scale). To do so, I use a replication task that sends the most recent snapshots of my three pools (pool0, pool1, pool2) to the dataset backuptank/backup/ on the remote machine.
In particular, I snapshot my main pools daily, retaining the snapshots for two weeks, and replicate them daily (both tasks run at midnight).
Well, it has been a pain in the neck.
I guess the replication is straightforward to configure, and everything seems to work… until it doesn’t.
Every now and then, the replication fails with a message of this kind:
[EFAULT] cannot receive incremental stream: most recent snapshot of backuptank/backup/pool0 does not match incremental source.
The full log message is the following:
Error: [2024/12/16 00:00:05] INFO [Thread-267] [zettarepl.paramiko.replication_task__task_18] Connected (version 2.0, client OpenSSH_9.2p1)
[2024/12/16 00:00:05] INFO [Thread-267] [zettarepl.paramiko.replication_task__task_18] Authentication (publickey) successful!
[2024/12/16 00:01:36] INFO [replication_task__task_18] [zettarepl.retention.calculate] Not destroying 'auto-2024-12-15_00-00' as it is the only snapshot left for naming schema 'auto-%Y-%m-%d_%H-%M'
[2024/12/16 00:01:36] INFO [replication_task__task_18] [zettarepl.retention.calculate] Not destroying 'auto-2024-12-15_00-00' as it is the only snapshot left for naming schema 'auto-%Y-%m-%d_%H-%M'
[2024/12/16 00:01:36] INFO [replication_task__task_18] [zettarepl.retention.calculate] Not destroying 'auto-2024-12-15_00-00' as it is the only snapshot left for naming schema 'auto-%Y-%m-%d_%H-%M'
[2024/12/16 00:01:37] INFO [replication_task__task_18] [zettarepl.replication.pre_retention] Pre-retention destroying snapshots: []
[2024/12/16 00:01:37] INFO [replication_task__task_18] [zettarepl.replication.run] For replication task 'task_18': doing push from 'pool0' to 'backuptank/backup/pool0' of snapshot='auto-2024-12-16_00-00' incremental_base='auto-2024-12-15_00-00' include_intermediate=False receive_resume_token=None encryption=False
[2024/12/16 00:01:38] ERROR [replication_task__task_18] [zettarepl.replication.run] For task 'task_18' unhandled replication error ExecException(1, 'cannot receive incremental stream: most recent snapshot of backuptank/backup/pool0 does not\nmatch incremental source\n')
Traceback (most recent call last):
File "/usr/lib/python3/dist-packages/zettarepl/replication/run.py", line 181, in run_replication_tasks
... 16 more lines ...
raise self.process_exception
File "/usr/lib/python3/dist-packages/zettarepl/replication/process_runner.py", line 37, in _wait_process
self.replication_process.wait()
File "/usr/lib/python3/dist-packages/zettarepl/transport/ssh.py", line 167, in wait
stdout = self.async_exec.wait()
^^^^^^^^^^^^^^^^^^^^^^
File "/usr/lib/python3/dist-packages/zettarepl/transport/async_exec_tee.py", line 104, in wait
raise ExecException(exit_event.returncode, self.output)
zettarepl.transport.interface.ExecException: cannot receive incremental stream: most recent snapshot of backuptank/backup/pool0 does not
match incremental source
A few notes:
- checking the list of snapshots on both machines, the names coincide (minus the most recent one, obviously), but the USED and REFER columns differ;
- if I delete the last two or three snapshots (recursively), I can manually restart the replication and it works for a few days;
- I set the readonly flag on, recursively on backuptank/backup;
- as far as I know, the datasets are not mounted on the backup machine (they are not at the moment, and I guess they won’t be during the replication);
- as far as I know, the datasets on the backup machine are not in use by apps or anything else (I don’t have apps, sharing services are disabled).
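Regarding the first note: as far as I understand, an incremental receive matches the base snapshot by its GUID rather than by its name, so identical names on both sides may not prove the snapshots are really the same. Here is a rough sketch of how the two sides could be compared by GUID. It assumes the output of `zfs list -H -d 1 -s creation -t snapshot -o name,guid <dataset>` from each machine; the helper names (`parse_snapshots`, `compare`, `list_snapshots`) are made up for illustration, and I haven't verified this against my own pools.

```python
import subprocess
from typing import Dict, List, Optional


def parse_snapshots(zfs_output: str) -> Dict[str, str]:
    """Map each snapshot name (the part after '@') to its GUID.

    Expects tab-separated lines of the form '<dataset>@<snapshot>\t<guid>'.
    """
    snapshots = {}
    for line in zfs_output.splitlines():
        if not line.strip():
            continue
        full_name, guid = line.split("\t")
        snapshots[full_name.split("@", 1)[1]] = guid.strip()
    return snapshots


def compare(source: Dict[str, str], target: Dict[str, str]) -> Dict[str, List[str]]:
    """Classify every snapshot found on the target relative to the source."""
    report = {"matching": [], "guid_mismatch": [], "only_on_target": []}
    for name, guid in target.items():
        if name not in source:
            # Exists only on the backup side (e.g. already expired on the source).
            report["only_on_target"].append(name)
        elif source[name] == guid:
            report["matching"].append(name)
        else:
            # Same name, different snapshot: unusable as an incremental base.
            report["guid_mismatch"].append(name)
    return report


def list_snapshots(dataset: str, ssh_host: Optional[str] = None) -> str:
    """Run 'zfs list' locally, or over SSH when ssh_host is given."""
    cmd = ["zfs", "list", "-H", "-d", "1", "-s", "creation",
           "-t", "snapshot", "-o", "name,guid", dataset]
    if ssh_host:
        cmd = ["ssh", ssh_host] + cmd
    return subprocess.run(cmd, check=True, capture_output=True, text=True).stdout
```

Something like `compare(parse_snapshots(list_snapshots("pool0")), parse_snapshots(list_snapshots("backuptank/backup/pool0", ssh_host="backup-host")))` would then produce the report, where `backup-host` is just a placeholder for the remote machine. Any entry under `guid_mismatch`, or a newest target snapshot that is not under `matching`, would at least narrow down where the two sides diverge.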
Does anyone have any advice or hints on how to solve the issue, or at least on how to understand where the problem is?
I really don’t know what is causing this and it’s unnerving.
Thanks a lot!