Send / Receive replication challenges when source has unrecoverable files

From time to time ZFS tells me I have uncorrectable files in some snapshots. I have an offsite zfs send / receive target at a friend’s house, which is a complete replica including snapshots, sent nightly and kept the same as the source.

Fortunately, in this case the file is not important and I just deleted it; however, this means it’s still in the snapshots. Perhaps the file was still OK too, but I’m unsure.

Here’s what ZFS says:

zpool status -xv
  pool: hdd1pool
 state: ONLINE
status: One or more devices has experienced an error resulting in data
	corruption.  Applications may be affected.
action: Restore the file in question if possible.  Otherwise restore the
	entire pool from backup.
   see: https://openzfs.github.io/openzfs-docs/msg/ZFS-8000-8A
  scan: scrub repaired 0B in 1 days 07:50:16 with 1 errors on Mon Mar  2 07:50:29 2026
config:

	NAME                                      STATE     READ WRITE CKSUM
	hdd1pool                                  ONLINE       0     0     0
	  raidz2-0                                ONLINE       0     0     0
	    074df82f-bf92-4bd7-8bea-f8981dd3c9c4  ONLINE       0     0     2
	    4a42c51f-7a27-4624-81c9-db643fe46b74  ONLINE       0     0     2
	    0840a677-c089-4baa-af1c-3aa14652219b  ONLINE       0     0     2
	    602e8af4-7bf3-428a-b783-8ae3e88acc7a  ONLINE       0     0     2
	    7f2b8c03-a897-45b9-855d-7dff5467ac38  ONLINE       0     0     2
	    bdbdc737-0eec-45cf-80ec-6f053af4b99a  ONLINE       0     0     2
	special
	  mirror-1                                ONLINE       0     0     0
	    5b80329c-fc5a-47c7-9fbe-b306d786c8cc  ONLINE       0     0     0
	    7dbe232d-f23d-4257-b178-0bc22a47fe0f  ONLINE       0     0     0
	    d453f430-1080-4f63-8e07-e360f623e5ed  ONLINE       0     0     0


errors: Permanent errors have been detected in the following files:
        hdd1pool/vroot_media/home_videos@auto_daily-2026-02-28_00-00:/Masters/8mm/HD_Scans-Progressive/8mm Film 3/Proxy/pict.mov
        hdd1pool/vroot_media/home_videos@auto-monthly-2026-03-01_00-00:/Masters/8mm/HD_Scans-Progressive/8mm Film 3/Proxy/pict.mov
        hdd1pool/vroot_media/home_videos@auto_daily-2026-02-27_00-00:/Masters/8mm/HD_Scans-Progressive/8mm Film 3/Proxy/pict.mov
        hdd1pool/vroot_media/home_videos@auto_daily-2026-02-24_00-00:/Masters/8mm/HD_Scans-Progressive/8mm Film 3/Proxy/pict.mov
        hdd1pool/vroot_media/home_videos@auto_daily-2026-02-26_00-00:/Masters/8mm/HD_Scans-Progressive/8mm Film 3/Proxy/pict.mov
        hdd1pool/vroot_media/home_videos@auto_daily-2026-03-02_00-00:/Masters/8mm/HD_Scans-Progressive/8mm Film 3/Proxy/pict.mov
        hdd1pool/vroot_media/home_videos@auto_weekly-2026-03-01_00-00:/Masters/8mm/HD_Scans-Progressive/8mm Film 3/Proxy/pict.mov
        hdd1pool/vroot_media/home_videos@auto_weekly-2026-02-08_00-00:/Masters/8mm/HD_Scans-Progressive/8mm Film 3/Proxy/pict.mov
        hdd1pool/vroot_media/home_videos@auto_daily-2026-02-25_00-00:/Masters/8mm/HD_Scans-Progressive/8mm Film 3/Proxy/pict.mov
        hdd1pool/vroot_media/home_videos@auto_daily-2026-03-01_00-00:/Masters/8mm/HD_Scans-Progressive/8mm Film 3/Proxy/pict.mov
        hdd1pool/vroot_media/home_videos@auto_weekly-2026-02-15_00-00:/Masters/8mm/HD_Scans-Progressive/8mm Film 3/Proxy/pict.mov

If I delete these snapshots it breaks replication. Doing a from-scratch replication seems to work only while “from scratch” is enabled; if I turn it off afterwards, it fails again. This requires me to ring my friend and ask him to delete the replicated data (after I’ve restored it if necessary) and then replicate it again, truly from scratch, and then it’s good. Anyone else have that problem? I’m not going to test it on this folder as it’s truly massive, but this has been very annoying as it’s happened quite a few times over the years.
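For context on why deleting snapshots can break things: an incremental send needs a base snapshot that still exists on both sides. A minimal sketch of that check follows, assuming you can get a snapshot listing from each end, oldest first (for example from `zfs list -t snapshot -o name -s creation`). The function names are made up for illustration, and comparing names is only a stand-in, since ZFS itself matches snapshots by GUID rather than by name.

```python
def short_name(full_name):
    """Return the part after '@', e.g. 'pool/ds@snap1' -> 'snap1'."""
    return full_name.partition("@")[2]


def latest_common_snapshot(source_snaps, dest_snaps):
    """Return the newest snapshot name present on both sides, or None.

    Both lists are assumed to be ordered oldest to newest. Dataset paths
    may differ between source and destination, so only the part after
    '@' is compared. This is a name-based approximation of the real
    GUID-based matching.
    """
    dest_names = {short_name(s) for s in dest_snaps}
    for full_name in reversed(source_snaps):
        name = short_name(full_name)
        if name in dest_names:
            return name
    return None
```

If this returns None, there is no shared base left and only a full send can work. If it returns something older than the newest snapshot on the destination, the destination would have to be rolled back to that point before an incremental can apply.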

It’s interesting that the checksum failed across all physical disks in this case. It picked this up as part of a scrub, but it must have been weeks since I wrote this file. These are new disks I got last year. Is this some kind of controller glitch, do you think?

Keen on any thoughts, and also to hear from any others that replicate to a friend’s house but don’t have access to their server otherwise (I can’t ssh to it). Thanks.

Is the file corrupt on the destination?

Is it feasible to run a scrub on the destination and then manually compute the SHA256 hash of that file on the source and destination to see if they differ?
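In case it helps, here is a minimal sketch of the hashing step, using only the Python standard library. It would have to be run once on each machine, against the same file in the same snapshot (snapshots are normally reachable under the dataset’s `.zfs/snapshot/<snapshot name>/` directory; the mountpoint depends on your setup). The function name is just for illustration. On the source, reading the damaged file may well fail with an I/O error instead of producing a hash, which would be an answer in itself.

```python
import hashlib


def sha256_of_file(path, chunk_size=1024 * 1024):
    """Return the hex SHA-256 digest of the file at `path`.

    The file is read in chunks so large video files do not need to fit
    in memory.
    """
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

If the two digests match, the destination copy is byte-for-byte the same as whatever the source can still read.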

There is the new feature of “corrective receives”, but it’s not really well thought out and might not help in this particular case.

I assume the destination file is not corrupt, but given I can only access that server via a send or receive, I can’t tell. Since it’s not important I’ll just recreate it and let the snapshots age out. My friend is on bereavement and I can’t ask. There is probably a way to give limited ssh access for this kind of scenario, but I haven’t figured it out, and I’d need to convince my friend, who is fairly security conscious, as am I. It probably needs some fairly escalated permissions depending on what exactly I do. This is a bit of a gap with the product at the moment, I think. A lot of us home labbers probably do a reciprocal backup with a friend. Perhaps I’m missing something obvious though.
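One possible shape for that limited ssh access, as a rough sketch only: OpenSSH lets a key in `authorized_keys` be tied to a forced command with the `command="..."` option, and the command the client asked for is then passed in the `SSH_ORIGINAL_COMMAND` environment variable. A small wrapper could check that against an allow-list. The allow-list below is purely an example I made up, and whether those commands need elevated permissions on the friend’s system is something I haven’t verified.

```python
import os
import shlex
import subprocess
import sys

# Hypothetical allow-list: only read-only status commands are permitted.
ALLOWED_PREFIXES = [
    ["zpool", "status"],
    ["zfs", "list"],
]


def is_allowed(command_line):
    """Return True if the requested command starts with an allowed prefix."""
    try:
        argv = shlex.split(command_line)
    except ValueError:
        # Unbalanced quotes or similar: reject rather than guess.
        return False
    return any(argv[:len(prefix)] == prefix for prefix in ALLOWED_PREFIXES)


def main():
    """Run the requested command only if it passes the allow-list."""
    requested = os.environ.get("SSH_ORIGINAL_COMMAND", "")
    if not is_allowed(requested):
        print("command not permitted", file=sys.stderr)
        return 1
    # No shell is involved, so ';' or '|' in the request are passed as
    # plain arguments instead of being interpreted.
    return subprocess.call(shlex.split(requested))

# A real wrapper script would finish with: sys.exit(main())
```

The idea is that the friend stays in control of the list, and anything outside it is refused before it runs.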

Yes, I would say I could restore the file and do a hash of it. I’ve deleted it on the source now, though, to start the age-out process. Until then I have to live with these errors in my logs.

Just thought I’d ask and see what everyone else does.