Hello.
I am confused about how a zfs send/receive replication can result in a dataset whose size differs from the source.
Could you please explain, or point me to resources?
Below is a description of what I did and what confuses me:
I have two TrueNAS SCALE machines, both running the latest TrueNAS SCALE 25.04-RC.1.
Each has a RaidZ2 pool - one is my old machine, one the newer one.
Because I did not originally create the ZFS pool with TrueNAS, it is not set up quite right. So I wanted to copy all the data onto my old system, re-create the whole pool (the right way this time, without partitions) and migrate all the datasets back.
Here I noticed something strange:
While the data inside the datasets is identical (including the total size of all files), the size of each ZFS dataset as displayed by TrueNAS/zfs is different (larger after replication):
Source:
Dataset Name         Used / Available
RAIDZ2_Pool          11.42 TiB / 802.62 GiB
Backups              1.13 TiB / 802.62 GiB
Images               322.14 GiB / 802.62 GiB
VM_Backups           3.59 TiB / 802.62 GiB
XFS_RAID6_Content    6.38 TiB / 802.62 GiB
Destination:
Dataset Name         Used / Available
RaidZ2               11.59 TiB / 2.81 TiB
Backups              1.17 TiB / 2.81 TiB
Images               332.22 GiB / 2.81 TiB
VM_Backups           3.7 TiB / 2.81 TiB
XFS_RAID6_Content    6.4 TiB / 2.81 TiB
All datasets are unencrypted. The destination pool was freshly created with TrueNAS for the purpose of this data migration.
I did all the replication with the following commands:
zfs snapshot -r RAIDZ2_Pool/<dataset-name>@migration
zfs send -R RAIDZ2_Pool/<dataset-name>@migration | ssh root@TargetTruenasMachine "zfs receive -Fu RaidZ2/<dataset-name>"
I have done some checks and it seems the data was copied properly and all files are identical. All snapshots were copied as well.
As you can see, the VM_Backups dataset in particular is noticeably larger (at least as reported by TrueNAS). How can it use roughly 100 GiB more, or at least be reported that way?
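To put rough numbers on the gap, here is a small sketch that computes the per-dataset difference from the figures in the two tables above. The helper name size_diff is made up for illustration, and since the inputs are the rounded values shown by the UI, the results are only approximate.

```shell
# Hypothetical helper: reads "name src_value src_unit dst_value dst_unit"
# lines and prints how much larger the destination is, in GiB and percent.
size_diff() {
  awk '
    function to_gib(v, u) { return (u == "TiB") ? v * 1024 : v }
    {
      src = to_gib($2, $3); dst = to_gib($4, $5)
      printf "%s %.2f GiB (%.2f%%)\n", $1, dst - src, 100 * (dst - src) / src
    }'
}

# Values copied from the two tables above (already rounded by the UI).
printf '%s\n' \
  "Backups 1.13 TiB 1.17 TiB" \
  "Images 322.14 GiB 332.22 GiB" \
  "VM_Backups 3.59 TiB 3.7 TiB" \
  "XFS_RAID6_Content 6.38 TiB 6.4 TiB" | size_diff
```

With these rounded inputs the destination comes out roughly 41 GiB, 10 GiB, 113 GiB and 20 GiB larger, which is about 3% for the first three datasets and about 0.3% for XFS_RAID6_Content.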
When I run du -s on a dataset folder, the results are identical with --apparent-size, but the bytes actually taken up on disk differ. I checked the Images dataset: there, the bytes taken up on disk are smaller than the apparent size on both machines/pools, but slightly larger on the target.
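For anyone who wants to reproduce that comparison, this is a minimal sketch of the two du invocations, assuming GNU du and using a throwaway temporary file as a stand-in for the dataset folder (the variable names are only for illustration). On the real pools the same two commands would be pointed at the dataset's mount point instead.

```shell
# Stand-in file: 1 MiB long, with no data written to it (typically sparse).
demo_file=$(mktemp)
truncate -s 1M "$demo_file"

# Length of the file in bytes, which is what --apparent-size reports (GNU du).
apparent_bytes=$(du -s --apparent-size --block-size=1 "$demo_file" | cut -f1)
# Bytes actually allocated on disk, which is what plain du reports.
disk_bytes=$(du -s --block-size=1 "$demo_file" | cut -f1)

echo "apparent: $apparent_bytes bytes, on disk: $disk_bytes bytes"
rm -f "$demo_file"
```

The apparent size depends only on the file contents, so it should match on both machines, while the on-disk figure depends on how the filesystem stores the data.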
Thank you