I would like to use the ZFS deduplication feature for my new TrueNAS NAS / SAN, as I believe it would be worthwhile for my use case.
The point that makes me a little unsure is the choice of SSD for the de-duplication table.
I currently have a new Intel Optane 900P available - would this SSD on its own be sufficient without a mirror, or is it more advisable to use a Sun Accelerator card as Craft Computing did and run several SSD modules in a mirror?
Do you already have experience with this topic?
Can you recommend one of the approaches I mentioned, or would you do it completely differently?
The general recommendation about deduplication (as with most of the special vdev types) is that you should only use it if you have a specific issue and you know for certain that it will fix it for you - and such issues tend to occur only on very, very large-scale ZFS pools, and even then only for specific types of files or workloads.
Deduplication is possibly the worst of these. Not only can you NOT remove it once you have implemented it - if you find that you want to remove it, you will need to copy your data to a different pool and destroy the de-dup pool completely to free up the disks for a new non-de-dup pool - but it also comes with some very significant performance hits for certain types of file operations (such as deletes).
So unless you have a VERY VERY VERY good reason, and you know for certain that it will be beneficial in your specific circumstances, don’t implement de-dup.
As mentioned by @Protopia, deduplication comes with a few caveats around its use. It’s best used for very targeted datasets that you know will benefit from it - for existing data, you can simulate deduplication by running sudo zdb -S yourpoolname, although this causes a significant amount of I/O on your pool as it reads and hashes all blocks.
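If you want to try that, here is a minimal sketch (assuming your pool is named tank - substitute your own pool name). The -S flag only simulates the dedup table; it changes nothing on disk, but it is I/O heavy:

```
# Simulate deduplication on an existing pool (read-only, but reads/hashes every block)
sudo zdb -S tank
```

The summary line at the end of the histogram reports an estimated dedup ratio; if that figure is not well above 1.0, the RAM and SSD cost of a real dedup table is generally hard to justify.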
Deduplication tables (and therefore the vdevs they reside on) are critical to pool health - losing them will leave your pool unmountable, so the dedup vdev should have the same level of redundancy as the rest of your pool. At minimum a mirror is advised to avoid a single point of failure, and for pools with double-fault tolerance such as RAIDZ2, a 3-way mirror is recommended.
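For completeness, and purely as a sketch (pool name and device paths below are placeholders), a mirrored dedup vdev would be added along these lines:

```
# Add a mirrored allocation-class vdev to hold the deduplication tables.
# Losing this vdev loses the pool, hence the mirror (or 3-way mirror).
sudo zpool add tank dedup mirror /dev/disk/by-id/ssd-A /dev/disk/by-id/ssd-B
```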
Can you offer some additional insight into the workload and data that you feel would benefit from deduplication? Traditional file storage does not usually qualify for this; often you can see similar space savings by increasing or changing the compression algorithm in use, with less impact on overall performance.
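As a quick, hedged example (the dataset name is just a placeholder), switching to a stronger compressor and then checking the achieved ratio looks like this:

```
# Try a heavier compression algorithm on the dataset holding the data
sudo zfs set compression=zstd tank/vmstore

# Only newly written blocks are affected; check the result after data is rewritten
zfs get compression,compressratio tank/vmstore
```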
First of all, thank you very much for your answers!
I would have a relatively large number of identical VMs as a workload, so I think deduplication could be worthwhile.
But thinking it over, I also have certain reservations about the longevity and criticality of the pool where the deduplication table is concerned.
For the time being, I’m going to abandon the idea of using deduplication, as the risk of a possible failure of the NAS is a little too critical for me, despite having backups.
I run my VMs on a three-node VMware cluster, so in terms of performance this is only my secondary storage; my primary high-performance storage is provided by VMware vSAN.
I’d argue you can probably get some pretty screaming performance out of TrueNAS vs. vSAN if you tune it right.
I meant to loop back on this later, but got busy with work, so here goes.
Generally speaking, I’ll only put deduplication into use when higher-level abstraction fails. For example, if I’m trying to make a pile of VMs, I’ll try to use a hypervisor’s “linked-clone” feature such that it’s only writing the delta changes from the parent disk. (See VMware’s Horizon product for virtual desktops.)
If I can’t use that functionality because of technical or license limitations - well, now I’ll consider things like doing snapshot/clones at the storage level. For example, for a KVM-based hypervisor, I might create a base disk as a zvol, snapshot and clone it at the ZFS level.
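A minimal sketch of that pattern (the pool, zvol, and VM names here are placeholders I'm assuming for illustration):

```
# Create a base zvol, install the "golden" guest image into it, then snapshot it
sudo zfs snapshot tank/vols/base@golden

# Each new VM gets a clone of the snapshot; only changed blocks consume new space
sudo zfs clone tank/vols/base@golden tank/vols/vm01
sudo zfs clone tank/vols/base@golden tank/vols/vm02
```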
The introduction of block cloning at the ZFS file level also helps - although I don’t know if VMware’s NFS client can leverage that remotely. But for other systems? Do the same thing with the base disk, and then cp --reflink=auto it on the back end. It’ll increment the BRT values, and effectively only have to write new data for the deltas. Not as efficient or easy to revert/clean up as a snapshot at a higher level, but still workable.
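Roughly like this, assuming a pool with the block_cloning feature enabled, a dataset mounted at /mnt/tank/vmstore, and image file names that are just placeholders; the pool properties at the end show how much space block cloning is saving:

```
# Copy the base image; with block cloning in effect this references the
# existing blocks instead of rewriting them
cp --reflink=auto /mnt/tank/vmstore/base.qcow2 /mnt/tank/vmstore/vm01.qcow2

# Inspect the Block Reference Table (BRT) savings at the pool level
zpool get bcloneused,bclonesaved,bcloneratio tank
```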
These are all pretty heavy lifts from an administrative standpoint - but if properly used can mitigate a good chunk of duplicate storage.
This has definitely helped me reach a decision on whether to use deduplication under ZFS, and I will not use it for my purpose: the cost/benefit ratio is too poor, and my concerns about a failure of the deduplication table and the recovery effort that would follow are too great.