Let's start a new thread in General. Post the link here.
Document what the VMs and hardware are and whether there were any signs before the freeze. After that we can submit a ticket if you have the diagnostics.
Is anyone else having IO performance issues with Incus VMs that overwrite data? I'm seeing some very poor results if I fill up a sparse zvol with random data and then try to write over that data again. The write performance is at least halved.
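Roughly the kind of test I mean (not my exact benchmark; the zvol device path, block size, and pass size below are just placeholders):

```python
#!/usr/bin/env python3
# Sketch of the fill-then-overwrite pattern: write random data across a
# zvol twice and compare throughput between the first fill and the
# overwrite pass. Path and sizes are placeholders.
import os
import time

DEV = "/dev/zvol/tank/testvol"   # placeholder zvol device path
BLOCK = 1 << 20                  # 1 MiB per write
TOTAL = 4 << 30                  # 4 GiB per pass

def write_pass(label: str) -> None:
    buf = os.urandom(BLOCK)
    fd = os.open(DEV, os.O_WRONLY | os.O_DSYNC)  # synchronous writes
    try:
        written = 0
        start = time.monotonic()
        while written < TOTAL:
            written += os.write(fd, buf)
        elapsed = time.monotonic() - start
        print(f"{label}: {written / elapsed / 2**20:.1f} MiB/s")
    finally:
        os.close(fd)

write_pass("first fill")  # fresh allocations on the sparse zvol
write_pass("overwrite")   # same LBAs again; ZFS has to free the old blocks
```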
I don't have an answer to your question, but I do wonder: is a sparse volume really the best pick for a workload that is this write intensive?
Well, I wouldn't think it would matter, since ZFS is a copy-on-write filesystem. I use sparse for space efficiency, but this example is just the test case I'm exploring right now.
The data I'm gathering seems to indicate that the IO code path with an Incus VM is… not good. I want to know if anyone else is seeing these issues.
There is no such thing as a free lunch…
When data is overwritten between snapshots, the blocks are written to new space, as you expect. BUT the old blocks and their space then have to be reclaimed. That requires metadata work. If your metadata is on HDDs, it can have a performance impact. It is also slower on RAIDZ.
The ZFS log spacemap is the mechanism that does all the work of keeping track of the free space.
https://sdimitro.github.io/post/zfs-lsm-flushing/
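If you want to watch that allocate/free churn while the overwrite pass runs, something like this (pool name is a placeholder) polls a few pool properties once a second so you can line them up with the throughput numbers:

```python
#!/usr/bin/env python3
# Minimal sketch: poll the pool's allocated/free/freeing properties each
# second during the fill and overwrite passes. Pool name is a placeholder.
import subprocess
import time

POOL = "tank"  # placeholder pool name

while True:
    out = subprocess.run(
        ["zpool", "get", "-Hp", "allocated,free,freeing", POOL],
        capture_output=True, text=True, check=True,
    ).stdout
    values = {}
    for line in out.splitlines():          # -H output: pool, property, value, source
        _, prop, value, _ = line.split("\t")
        values[prop] = value
    print(time.strftime("%H:%M:%S"), values)
    time.sleep(1)
```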
With thick-provisioned zvols, more space is allocated up front, so my guess is the performance degradation might be less severe, or take longer to reach a steady state. Perhaps you can validate that.
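For that comparison, something along these lines would create a matched pair of test zvols (pool, names, and size are placeholders); then run the same fill/overwrite passes against each device:

```python
#!/usr/bin/env python3
# Sketch for the thick-vs-sparse comparison: create one sparse and one
# fully allocated zvol of the same size. Pool, names, and size are
# placeholders; clean up with "zfs destroy" afterwards.
import subprocess

POOL = "tank"   # placeholder pool
SIZE = "20G"    # placeholder volume size

def zfs_create(name: str, sparse: bool) -> None:
    cmd = ["zfs", "create"]
    if sparse:
        cmd.append("-s")                 # sparse (thin-provisioned) volume
    cmd += ["-V", SIZE, f"{POOL}/{name}"]
    subprocess.run(cmd, check=True)

zfs_create("sparse-test", sparse=True)   # /dev/zvol/tank/sparse-test
zfs_create("thick-test", sparse=False)   # /dev/zvol/tank/thick-test
```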
There should be little performance difference from Electric Eel to Fangtooth… if there is, then it's worth looking at.
Thanks for this explanation of how ZFS works…
I did some testing in Electric Eel yesterday and confirmed I'm seeing a notable performance degradation. At best there's a notable increase in CPU processing overhead; at worst there's a disk IO drop-off and stability issues with the VMs.
I think we're at the point where this won't be fixed in this release; I'm just trying to get it confirmed as an issue for future point releases.
The next question is whether it's related to Incus or just general behavior of iSCSI zvols on small RAID-Z HDD pools.
If it's Incus related, it's an issue.
If it's a degradation from 24.10… it's an issue.
If it's just RAID-Z HDD performance, we know how to solve that: more vdevs, an svdev, or flash.
We test iSCSI zvols extensively… on HDDs and NVMe. These are all working fine.
What CPU are you running? It's possible there were some changes to speculative execution mitigations that may have negatively impacted your system.
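For what it's worth, a quick way to see which mitigations the kernel currently has active (standard Linux sysfs path, nothing TrueNAS specific; it shows current state only, it doesn't compare releases):

```python
#!/usr/bin/env python3
# Print the kernel's reported speculative-execution mitigation status
# from the standard Linux sysfs entries.
from pathlib import Path

VULN_DIR = Path("/sys/devices/system/cpu/vulnerabilities")

for entry in sorted(VULN_DIR.iterdir()):
    print(f"{entry.name}: {entry.read_text().strip()}")
```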
AMD Ryzen 5 5600, and no, I don't think that was any part of the issue.
My issues seem to have mostly been something a little weird about how Incus was handling a disk it managed vs a disk that was attached from somewhere else. More in this thread, but we found some CLI tweaks that made it work a lot better for me and there are a few code changes coming in the release tomorrow that may help as well.