During a snapshot task, TrueNAS 25.04.1 triggers ESXi to take a VM snapshot before snapshotting the dataset, but it's not deleting the ESXi snapshot afterwards… so VM performance will begin to crater as those ESXi-side snapshots start piling up.
Is this standard logic, or is this another Fangtooth bug?
TrueNAS manages a share that serves an NFS dataset to ESXi.
TrueNAS has an ESXi integration that gets ESXi to take a snapshot of the VMs in that dataset to ensure data consistency, then snapshots the ZFS dataset… and then should delete the ESXi snapshots…
The task works 'correctly' in that it creates the VM snapshots before ZFS creates the dataset snapshot… so the ESXi credentials and authentication are working.
The script is flawed in that TrueNAS doesn't go back and delete the VM snapshots… if you are snapshotting a VM dataset even every hour, that means up to 24 snapshots a day pile up on that VM… and performance craters…
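To put rough numbers on the pile-up, here is a minimal back-of-the-envelope sketch. The function names are made up for illustration, and the default chain limit of 32 is the figure I recall from VMware's snapshot guidance, so treat it as an assumption and check it against your ESXi version.

```python
def leftover_vm_snapshots(runs_per_day: int, days: int) -> int:
    """VM snapshots left behind if none of them are ever deleted."""
    return runs_per_day * days


def hours_until_limit(runs_per_day: int, chain_limit: int = 32) -> float:
    """Hours until a chain reaches chain_limit, assuming evenly spaced runs.

    chain_limit=32 is an assumed per-VM snapshot chain limit, not a value
    taken from this thread.
    """
    return chain_limit * 24 / runs_per_day


# An hourly task that runs around the clock:
print(leftover_vm_snapshots(runs_per_day=24, days=7))
print(hours_until_limit(runs_per_day=24))
```

Even well before any hard limit, every extra delta disk in the chain adds read overhead, which is where the performance hit comes from.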
Even if TrueNAS deletes the un-held ZFS snapshot after it ages out, the ESXi VM snapshots remain… so VAAI or whatever script logic is behind this is flawed.
TrueNAS should command ESXi to delete the VM snapshots it generated once the ZFS dataset snapshot is taken… that is the way it should work.
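The ordering described above can be sketched like this. It is not TrueNAS's actual code, just an illustration of the expected logic; the three callables are hypothetical stand-ins for whatever really talks to ESXi and ZFS.

```python
from contextlib import contextmanager
from typing import Callable, Iterator


@contextmanager
def vm_snapshot(create: Callable[[], str],
                delete: Callable[[str], None]) -> Iterator[str]:
    """Hold a temporary hypervisor snapshot for the duration of the block."""
    snap_id = create()
    try:
        yield snap_id
    finally:
        # Always remove the VM snapshot, even if the ZFS snapshot fails.
        delete(snap_id)


def consistent_zfs_snapshot(create_vm_snap: Callable[[], str],
                            delete_vm_snap: Callable[[str], None],
                            zfs_snapshot: Callable[[], None]) -> None:
    """VM snapshot -> ZFS snapshot -> delete VM snapshot."""
    with vm_snapshot(create_vm_snap, delete_vm_snap):
        zfs_snapshot()


# Demo with fake callables that only record the order of operations.
events = []
consistent_zfs_snapshot(
    create_vm_snap=lambda: events.append("vm-snap-create") or "snap-1",
    delete_vm_snap=lambda sid: events.append(f"vm-snap-delete:{sid}"),
    zfs_snapshot=lambda: events.append("zfs-snapshot"),
)
print(events)
```

The point of the `finally` is that the cleanup step is not optional: whatever happens to the ZFS snapshot, the ESXi-side snapshot should not be left behind.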
I don't have a Jira account and I'm not going to create one for TrueNAS bug reports… posting here should be sufficient.
Do you have that reference so I could read it? I’m curious about this integration that is built in.
You never specified if this setup worked before 25.04, so did it? If so, what version?
I run my TrueNAS on ESXi 8 but I do not make VM snapshots so now maybe something to learn.
EDIT: Also, the developers rarely visit these forums so a bug report is how they find out something is wrong. If you are a paying customer then I’m sure a phone call would generate some action.
For those running ESXi with vCenter… here is a workaround.
In vCenter 8 (and I am assuming 9 as well), you can set up snapshot creation and/or deletion tasks per VM.
So either don't use the TrueNAS integrated ESXi snapshot at all and set up 2 tasks: one to snapshot the VM before TrueNAS snapshots the dataset, then one to delete the ESXi snapshot a couple of minutes later.
Or use the TrueNAS integration to ensure snapshots are taken on ESXi prior to the ZFS snapshot, and set up a single task in vCenter to periodically delete snapshots so they don't pile up.
At least until the TrueNAS devs address the bug.
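If you would rather script the cleanup than click it together in vCenter, the selection logic is simple. This is a minimal sketch only: `VmSnapshot`, `stale_snapshots` and the `truenas-` name prefix are all made up for illustration (I don't know what names the integration really gives its snapshots), and the part that actually lists and removes snapshots through the vSphere API is left out.

```python
from datetime import datetime, timedelta, timezone
from typing import List, NamedTuple


class VmSnapshot(NamedTuple):
    vm: str
    name: str
    created: datetime


def stale_snapshots(snapshots: List[VmSnapshot],
                    now: datetime,
                    max_age: timedelta = timedelta(minutes=5),
                    name_prefix: str = "") -> List[VmSnapshot]:
    """Return snapshots older than max_age whose name starts with name_prefix.

    name_prefix is a safety filter so manually created snapshots are left
    alone; the prefix used in the demo below is hypothetical.
    """
    return [
        s for s in snapshots
        if s.name.startswith(name_prefix) and now - s.created > max_age
    ]


now = datetime(2025, 6, 1, 12, 0, tzinfo=timezone.utc)
snaps = [
    VmSnapshot("vm1", "truenas-a", now - timedelta(hours=1)),
    VmSnapshot("vm1", "truenas-b", now - timedelta(minutes=1)),
    VmSnapshot("vm2", "manual-before-upgrade", now - timedelta(days=3)),
]
for s in stale_snapshots(snaps, now, name_prefix="truenas-"):
    print("would delete", s.vm, s.name)
```

The age threshold keeps the job from deleting a snapshot while the ZFS snapshot is still in progress, which matches the "couple of minutes later" idea above.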
For those wondering… why…
For most of my clients that use or would consider something like TrueNAS instead of NetApp or other top-tier datacenter storage, the reason is cost savings… and in SMB and mid-size business, hyperconverged is the buzzword.
Why have separate storage servers running when the storage server can be hyperconverged to run on the same iron as the applications?
So in an ESXi environment (this works for other hypervisor types as well),
TrueNAS serves the ZFS backing storage that the VMs live on, as well as application data.
When using other ZFS storage appliances with ESXi integration, getting them to snapshot the VM for data consistency, snapshot the ZFS dataset, then delete the ESXi VM snapshot just works… not here in TrueNAS land.
So to back up VM workloads:
snapshot the VM
snapshot the ZFS dataset
pull that snapshot to another backup server via netcat/zfs send/recv, and you can have near-atomic, immutable backups of the VM environment with very little overhead and no license costs for exotic ESXi backup software
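The ZFS side of those steps can be sketched as follows. This only builds the command lines and does not run anything. The dataset, snapshot and host names are placeholders, and I've used ssh as the transport here rather than netcat, purely to keep the example to one pipeline.

```python
import shlex
from typing import List, Optional


def snapshot_cmd(dataset: str, snap: str) -> List[str]:
    """Command to create dataset@snap."""
    return ["zfs", "snapshot", f"{dataset}@{snap}"]


def send_cmd(dataset: str, snap: str, prev: Optional[str] = None) -> List[str]:
    """Full send, or an incremental send from prev if one is given."""
    cmd = ["zfs", "send"]
    if prev:
        cmd += ["-i", f"{dataset}@{prev}"]
    cmd.append(f"{dataset}@{snap}")
    return cmd


def recv_cmd(host: str, target: str) -> List[str]:
    """Receive on the backup host over ssh (placeholder host and dataset)."""
    return ["ssh", host, "zfs", "receive", target]


def pipeline(dataset: str, snap: str, host: str, target: str,
             prev: Optional[str] = None) -> str:
    """Shell pipeline string: zfs send ... | ssh host zfs receive ..."""
    return (shlex.join(send_cmd(dataset, snap, prev))
            + " | " + shlex.join(recv_cmd(host, target)))


print(shlex.join(snapshot_cmd("tank/vms", "auto-2025-06-01_00-00")))
print(pipeline("tank/vms", "auto-2025-06-01_00-00", "backup-host",
               "backup/vms", prev="auto-2025-05-31_23-00"))
```

After the first full send, incremental sends only move the blocks that changed between the two snapshots, which is why the overhead stays low.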