I’m running TrueNAS Scale ElectricEel-24.10.2.1 and am trying to understand some unexpected data replication behaviors.
Background
I have multiple disks that I rotate in and out of my system as part of my backup strategy. Every day, a snapshot is taken of all of my datasets, and the data is backed up to whichever disk is currently attached via a Replication Task.
Yesterday, I rotated one of the disks and ran the Replication Task. TrueNAS threw an error when replicating one of the datasets in the Task:
[EFAULT] No incremental base dataset ‘[REDACTED]’ and replication from scratch is not allowed
Unless I’m misunderstanding, this isn’t surprising: it had been a long time since I’d last run the Replication Task against this particular backup disk, and a lot of data/snapshots have come and gone since its last backup. I enabled “Replication from scratch” and ran the Replication Task again. This completed successfully.
Issue 1
With “Replication from scratch” enabled, I noticed that while the Task was replicating my largest dataset, TrueNAS appeared to copy all of the data/snapshots first and only then delete the old data/snapshots. I can see that being helpful if the Task fails or is interrupted, but as a side effect it almost maxed out my backup disk (~92% full). The Task did delete the old data afterward, which brought storage utilization back down to 63%. Is this expected behavior?
Issue 2
Since I allowed a full Replication Task to run with “Replication from scratch”, everything should have been clean, so I disabled “Replication from scratch”. The Replication Task is running again now, but I’m seeing something that doesn’t make sense to me.
The Task is replicating a dataset that is 5.81TiB in size*. These are images and videos from my NVR. Every day, the NVR adds today’s recordings and deletes the oldest day’s recordings. A day of recording is ~700GB of data. I would expect the Replication Task to be writing ~700GB of new data since it runs every day and ~700GB of new data was written to the dataset since the last Replication Task was completed.
However, TrueNAS is indicating this Task is replicating 4.71TiB of data for this NVR dataset. I don’t understand how this is possible. What would cause TrueNAS to think the backup drive needs 4.71TiB of data to be written to it, with the “Replication from scratch” setting disabled, if there’s no way this much data has changed or been added since the last Replication Task was executed?
I also note that the total amount of data being written by the Replication Task is 8.54TiB, which would mean 3.83TiB for the other datasets. That doesn’t make sense either. Including the NVR, less than 1TiB of data changed between these two Replication Task runs, so TrueNAS is writing more than 8x the amount of data it actually needs to.
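To sanity-check the figures above, here is a minimal Python sketch of the arithmetic. It only uses the numbers quoted in this post. The variable names are my own, and it assumes “700GB” means decimal gigabytes (10^9 bytes), which may not be how the NVR reports it.

```python
# Rough arithmetic check of the sizes quoted in this post.
# Assumption: "GB" is decimal (10**9 bytes) and "TiB" is binary (2**40 bytes).
TIB = 2**40
GB = 10**9

total_written_tib = 8.54   # total reported for the whole Replication Task
nvr_written_tib = 4.71     # reported for the NVR dataset alone
daily_nvr_gb = 700         # approximate size of one day of recordings

# Share of the transfer attributed to all the other datasets.
other_written_tib = round(total_written_tib - nvr_written_tib, 2)

# What I would expect an incremental NVR transfer to be, in TiB.
expected_nvr_tib = daily_nvr_gb * GB / TIB

# How much larger the reported NVR transfer is than the expected one.
nvr_ratio = nvr_written_tib / expected_nvr_tib

print(f"Other datasets: {other_written_tib} TiB")
print(f"Expected NVR increment: {expected_nvr_tib:.2f} TiB")
print(f"Reported vs. expected NVR transfer: {nvr_ratio:.1f}x")
```

Under that assumption, the NVR dataset alone is sending roughly seven times what a single day of recordings should account for.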
It seems to me that TrueNAS is replicating a lot of this data from scratch even though I’ve disabled “Replication from scratch”.
*This is the figure reported in the Datasets tab in the TrueNAS web UI. Am I correct in assuming this is the size of the data + the size of all snapshots for this dataset?