Which version of Ubuntu are you using? The 24.04 LTS or newer? Red Hat/Oracle Linux 9 are on a pretty old kernel, 5.15. I wonder if that’s contributing to my issues?
I assume by this you mean you didn't run into any issues?
root@intersect:/opt/gravwell/etc# lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description: Ubuntu 24.10
Release: 24.10
Codename: oracular
root@intersect:/opt/gravwell/etc# uname -r
6.11.0-9-generic
Indeed, much better behaved right out of the gate. I was just following the instructions for migrating an existing VM, that’s why I’ve been attaching drives this whole time.
I tried Ubuntu 24.10, and the performance is somehow worse. The overall system load is even higher, though it doesn't seem to be throwing NVMe errors like the Oracle Linux 9 VM did.
The problem you’ve run into appears to be due to how Incus handles the cache behavior for “disks” versus “Volumes”. In the next release the UI will encourage users to import their existing ZVOLs into an Incus-managed “Volume”, where the proper ZFS driver optimizations and caching mechanisms will be in play automatically.
Then I’m happy my suffering was not in vain.
So look for this in the next point release? (ie, not the 25.04 release in a few days)
The PR looks to be in 25.04.0
Is this the PR?
That reminds me, I should probably retest things on the release code.
Found this thread just searching for OOM errors as my Windows VM crashes daily after moving to Fangtooth.
It’s a Windows VM, and any heavy IO bogs down the Windows UI to the point of being pretty much unusable. swtpm kept me from being able to start the crashed VM again, so I removed the TPM device thinking it might somehow be the issue. At least now I can restart the VM when it crashes, but it didn’t help at all with the OOM crashing.
While OS is different, symptoms look very similar to OP…
edit: can’t add link to my JIRA ticket for some reason… # 135499 btw.
Changing from NVMe to virtio-scsi for my Windows VM seems to have improved things drastically. Going on 24 hours now, where usually my VM crashes with an OOM every day or so since moving to Fangtooth… I’ll report back if things are stable for a few more days, but it’s looking promising.
How large is the memory footprint of your VM and your system as a whole? I’ve been chasing down a problem I can’t quite figure out still. Seems like it might be related to memory fragmentation, but I’m not sure yet.
I did some retesting, and my system as a whole is a lot better than before, with virtio-blk devices and all my zvols imported into the Incus volume management. However, if I load up the IO, things still seem less than awesome. For example, I was noticing CPU steal time on my other VMs, despite what seemed to be a not-fully-loaded CPU. When I looked at it closer, it seemed to come in waves: every 20 seconds or so there was a blip of steal time. The host OS seemed to show blips of zfs/zvol CPU usage too. ‘top’ will show 2-3 running processes, then a blip of 20-30 for a split second. This IO pipeline just seems poorly optimized, but I’m not sure how to articulate it in a way I could present to the devs as something actionable.
The sheer number of zfs threads I see in top makes me think that maybe it’s just been written for a more modern server with tons of cores, and that my 6-core system is left in context-switch hell as a result. When I’m not leaning on it hard in one of these test cases, I never notice a problem.
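Those blips can be caught numerically instead of eyeballing top. A minimal sketch using only stock Linux /proc interfaces (nothing TrueNAS- or Incus-specific): field 9 of the aggregate “cpu” line in /proc/stat is cumulative steal jiffies, and “procs_running” is the instantaneous run-queue length.

```shell
#!/bin/sh
# Sample steal time and run-queue length once per second for a few seconds.
prev=$(awk '/^cpu /{print $9}' /proc/stat)
for i in 1 2 3; do
    sleep 1
    cur=$(awk '/^cpu /{print $9}' /proc/stat)
    run=$(awk '/^procs_running/{print $2}' /proc/stat)
    echo "steal jiffies: $((cur - prev))  runnable tasks: $run"
    prev=$cur
done
```

Left running for a minute or two, a spike in runnable tasks every ~20 seconds would line up with the waves of zfs threads described above.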
I was running a 24 GiB Windows VM on a 128 GB system. It is my largest VM by far (my main desktop); the rest are marginal. I run a 5900X, so 12 cores… which should be way overkill for what I’m doing, so all considered, there seems to be something wonky.
Right now things are still looking ok with virtio … so fingers crossed.
If it’s memory fragmentation related, it took about a week to show up for me. I was running a 32 GiB VM that I increased to 48 GiB, because zvols in the Incus datastore don’t get data cached to the ARC (other than metadata, anyway). Are you running your volumes from the Incus volume manager, or are they attached from another location? The latter will cause a lot of churn in the ARC that would make a memory fragmentation problem worse… in theory.
When I changed it to 48GiB the other day and tried to fire it up, it almost immediately OOM’ed, despite 60+GiB free. I had manually forced the kernel to run memory compaction first as an experiment, so I’m not filing a ticket just yet. I tried again a few minutes later and it didn’t OOM, but ZFS went crazy, the CPU went to 100% and the system load just went into orbit. I force quit the VM when the system load hit 200+ after a few minutes and then rebooted. On a clean boot all was fine.
Anyway, that’s why I’m poking at memory fragmentation, since it seems to happen after the system has been up for a while, and only seems to be a problem with these large memory VMs.
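For anyone who wants to poke at the fragmentation theory themselves, here is a rough sketch of the probe-and-compact cycle mentioned above, using the stock kernel interfaces. Each column of /proc/buddyinfo counts free blocks of a given order (order 0 through 10, i.e. 4 KiB up to 4 MiB on x86-64); the write to compact_memory needs root, hence the guard.

```shell
#!/bin/sh
# Snapshot free-block counts per order. Sparse right-hand (high-order) columns
# alongside plenty of "free" memory is the signature of external fragmentation.
cat /proc/buddyinfo

# Manually trigger kernel memory compaction (root only).
if [ -w /proc/sys/vm/compact_memory ]; then
    echo 1 > /proc/sys/vm/compact_memory
    # Re-check: the high-order counts should rise if compaction helped.
    cat /proc/buddyinfo
else
    echo "need root to trigger compaction"
fi
```

If a large VM still OOMs right after the high-order counts recover, fragmentation probably isn’t the whole story.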
Interesting… I am using a zvol from outside of Incus, as this was a VM I migrated over from pre-Fangtooth, and that was how I read I was supposed to do it…
Are you saying the ARC issue would get resolved if I copy/migrate “into” incus as opposed to referencing external zvol?
Possibly. In 25.04.0 you are required to either clone or move the zvol into an incus-managed dataset when creating a new VM based off of it. Performance was one of the benefits I heard cited for why this decision was made.
May want to investigate this option
AFAICT, the only difference is the cache properties mentioned.
It’s a bit more than that. If your VM’s drive is mounted as an “external ZVOL”, Incus treats it no differently than an external hard drive. If it’s imported into the Incus “pool”, it uses Incus’s ZFS driver, and so you benefit from those additional optimizations.
I’ll find some time to convert my zvol into an Incus one later on. For now, I wanted to report that moving from NVMe to virtio-scsi now has me at 3 days (and counting) of uptime on the VM, where previously it would crash almost daily. Definitely seems to have improved things on my end…
This is my point.
The only practical difference is the zfs driver disables the cache when activating the zvol device before attaching it to the vm.
(I’m familiar enough with the internal workings of the zfs driver to have submitted bug fixes on it ;))
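For what it’s worth, the cache properties being discussed here are visible from the shell. A sketch with a hypothetical zvol path (substitute your own; on a host without the zfs tools this just prints a notice):

```shell
#!/bin/sh
# Hypothetical dataset name -- replace with the path to your own zvol.
ZVOL="tank/vms/win11-disk0"

if command -v zfs >/dev/null 2>&1; then
    # primarycache/secondarycache control whether zvol reads are cached in
    # the ARC/L2ARC. Comparing an external zvol against an Incus-managed
    # volume shows what the driver changes when activating the device.
    zfs get -H -o property,value primarycache,secondarycache "$ZVOL"
else
    echo "zfs tools not available on this host"
fi
```

Running the same `zfs get` against both an external zvol and an Incus-managed volume makes the difference concrete.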