Ran replication task on wrong target - is recovery even possible?

Hello everyone,

I made a mistake and accidentally ran a replication task against the wrong target.

I have the following configuration:

  • opendataset0/subset1
  • opendataset0/subset2
  • encryptedset0/subset3

Since I wanted to decommission and expand encryptedset0, I created opendataset0/subset3, but when setting up the replication task I accidentally selected the wrong target level.

The task then ran from encryptedset0/subset3 to opendataset0, and it ended up deleting the existing subsets, including their snapshots.

Is there any chance of recovering the subsets somehow? I’ve already deactivated all scrub tasks, etc.

Thanks in advance for any replies.

How? There’s a safeguard to prevent this.

As long as that option was disabled, the replication task should have aborted.


Not unless you have a pool checkpoint.

My guess is that he bypassed the GUI and ran destructive CLI commands without realizing what they could do. The GUI is there for this specific reason.

Unfortunately that data is gone unless you had pool checkpoints, backups, or mirrors of it. I’m assuming you didn’t, because you’re here asking if it’s recoverable.
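
For reference, the pool checkpoint workflow looks roughly like this (a minimal sketch; tank is just a placeholder pool name, and a pool can only hold one checkpoint at a time):

# Take a checkpoint before doing anything risky
zpool checkpoint tank

# If something goes wrong, export the pool and rewind to the checkpoint
zpool export tank
zpool import --rewind-to-checkpoint tank

# Once you are happy with the state of the pool, discard the checkpoint
zpool checkpoint -d tank

Rewinding discards everything written after the checkpoint was taken, so it is a last-resort undo, not a substitute for backups.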

Edit: Sentences are hard.

-T

Hi, thanks for the answer. I haven’t executed any shell commands yet; I accidentally clicked something in the target’s tree menu via the GUI. I came across this article: serverastra.com/docs/Tutorials/How-to-Recover-files-from-a-stable-and-working-ZFS-pool-when-snapshots-are-unavailable and I’m inclined to try mounting the pool at the oldest available txg. It was a manual replication task through the GUI.

I don’t think that will help since it’s been 9 hours already.

Did you enable the circled option?

Actually, I don’t remember clicking any checkboxes.
But it has already happened, and I know I’m stupid.
The last txg is from 2025-10-14 01:13 UTC.
I tried to mount it with
zpool import -N -o readonly=on -f -R /mnt/ro -F -T 12287190 int-hdd-z1
but the system just freezes and dmesg gets spammed with middleware timeouts.
I think I have to face the fact that my data is lost.
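
For anyone trying this kind of rewind import: before picking a txg for -T, it can help to check which uberblocks are actually still present on the disks. A sketch, with an example device path:

# Dump the labels and uberblocks of one pool member; each uberblock entry
# lists a txg number and timestamp that could be used with -T
zdb -ul /dev/disk/by-id/ata-EXAMPLE-part2

Each label only keeps a small ring of recent uberblocks, so after many hours of pool activity the older txgs are usually already overwritten, which matches the comment above that 9 hours is probably too long.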

I think that, regardless of the ‘Replication from scratch’ checkbox, if you point it at a parent dataset on the receive end it will blow away the children.

It’s a very unfortunate situation and one that should ideally be much harder to do.


Out of interest, what version of TrueNAS are you running?

It absolutely should not. If this is how the TrueNAS GUI/middleware behaves, then I would consider it a dangerous bug.

This is what happens on the command line if you try to do it without the destructive -F flag:

cannot receive new filesystem stream: destination 'testpool/destinationfs' exists
must specify -F to overwrite it

I cannot proceed unless I override it with -F. The destination dataset and its 3 children were left untouched when I tried this:

zfs send -R oldpool/sourcefs@migrate | zfs recv testpool/destinationfs

This is a safeguard to prevent people from accidentally destroying data on the destination.
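
For completeness, this is roughly what the forced variant looks like (same made-up pool and dataset names as above). According to the zfs receive man page, -F rolls the destination back to its most recent snapshot before receiving, and for an incremental replication stream it additionally destroys snapshots and file systems on the destination that do not exist on the sending side:

# DANGEROUS: forces the receive even though the destination already exists
zfs send -R oldpool/sourcefs@migrate | zfs recv -F testpool/destinationfs

If the GUI option maps to something like this under the hood, that would explain how existing children on the target can disappear.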

Does TrueNAS always invoke -F for every replication process? :flushed:


I know I’ve done this before, luckily only in testing, but I’m not sure whether the behaviour has changed between versions. I would most likely have done this on CORE. Just testing on CE now.

Yep, as you say, unless I select ‘Replication from scratch’ it won’t let me.

Be interesting to hear what version @Gumble is running.

Even more interesting: even if I select ‘Replication from scratch’, it still preserves my child dataset on the receiving system, assuming you are sending recursively from a layer above and not specifically going from one dataset to another.

Worth noting I’m doing this with a dedicated replication user rather than root/admin, using PULL replication. Other variations may behave differently.
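
If anyone wants to poke at this behaviour from a shell without risking data, zfs receive has a dry-run mode (hypothetical pool and dataset names):

# -n parses the incoming stream and reports what would be received without
# writing anything to the destination; -v prints the details
zfs send -R srcpool/parent@snap | zfs recv -n -v -F destpool/parent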


Why -R /mnt/ro? Why not just -R /mnt?

I’m using SCALE 25.04.2.4.
The ro was just a reminder to myself that it’s read-only - the folder exists.
I did the replication as admin.