Block Cloning - Dataset Space Used

So I’ve just started experimenting with block cloning today and thinking about how it would work in practice in my environment. In testing it appears to be working well, but I’ve noticed that the dataset’s ‘USED’ value doesn’t seem to understand what’s going on from a reporting point of view. If I take a copy of a 10 GB file on the dataset, block cloning is definitely working, because (a) the copy is stupidly fast and (b) I can see it via `zpool get all tank | grep bclone`. As far as the dataset is concerned, though, I have used another 10 GB of space.
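For reference, this is roughly what I ran to reproduce it (the pool, dataset, and file names are just placeholders for my setup):

```
# Clone a 10 GB file within the same dataset. On Linux, GNU cp's
# --reflink goes through copy_file_range(), which OpenZFS 2.2+ can
# satisfy with block cloning (FreeBSD's cp does this by default).
cp --reflink=always /tank/data/bigfile.bin /tank/data/bigfile-copy.bin

# The pool-level counters confirm the clone happened...
zpool get bcloneused,bclonesaved,bcloneratio tank

# ...but the dataset still charges the full 10 GB to USED.
zfs list -o name,used,referenced tank/data
```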

My dilemma here is that I share out multiple datasets to various groups in my org, each with a quota attached (for obvious reasons), and as it stands ‘I’ (the system) will benefit from block cloning but they won’t directly.
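To make it concrete, here’s the situation with a hypothetical group dataset (`tank/groupA` and the file names are made up):

```
# Give the group a 100 GB quota.
zfs set quota=100G tank/groupA

# Cloning a 10 GB file inside the dataset is nearly free at the pool
# level, but the full 10 GB is still charged against the quota.
cp --reflink=always /tank/groupA/report.bin /tank/groupA/report-copy.bin
zfs get quota,used tank/groupA
```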

I wonder if other people have noticed this and if there are any plans down the road for ZFS to address this?

PS: I’ve never really used dedup before, so I’m starting to wonder: does it work the same way?

Yes. With block cloning, or anything else that references the same data blocks, you can even end up with a dataset whose ‘USED’ is larger than the pool itself. This is because the dataset’s space accounting does not take pool-level features and properties into account.
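A quick way to see the two views diverge (pool and dataset names are illustrative):

```
# Dataset-level accounting: every clone is charged in full.
zfs list -o name,used,logicalused tank/data

# Pool-level accounting: cloned blocks are stored, and counted, once.
zpool list -o name,size,allocated,free tank
```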


That’s a good question. I wonder if it’s a caveat of using dataset quotas with deduplication and block cloning? :open_mouth:
