Incus VM can't add GPU via UI

When adding a GPU via the UI I get this error:

middlewared.service_exception.ValidationErrors: [EINVAL] 
device.GPU.gpu_type: Field required

Adding the device from the command line works: sudo incus config device add frigate gpu0 gpu pci=0000:0e:00.0. But that breaks the UI with more validation errors, resulting in a broken device section that shows nothing; I guess because it's not setting gputype: physical?
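For anyone else hitting this, the hand-added device can at least be inspected and backed out from the CLI while the UI is broken. Setting gputype explicitly is only an untested guess at satisfying the middleware validation:

# inspect what the device record actually contains
incus config device show frigate
# untested guess: explicitly set the type the middleware is asking for
incus config device set frigate gpu0 gputype=physical
# or back the change out entirely
incus config device remove frigate gpu0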

It also breaks the datasets view if you add the GPU by hand to the Incus container. Ouch.

I saw that you are using Frigate. I use it here too, but in Docker rather than a VM; I just passed the correct settings for my GPU in Docker Compose and it is working.
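Something along these lines in the compose file (a sketch only: it assumes an NVIDIA card and the NVIDIA container toolkit, since that's the card discussed later in this thread; an Intel/AMD card would map /dev/dri devices instead):

# docker-compose.yml sketch: hand one GPU to the Frigate container
services:
  frigate:
    image: ghcr.io/blakeblackshear/frigate:stable
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia       # requires the NVIDIA container toolkit
              count: 1
              capabilities: [gpu]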

Thanks. While this has nothing to do with the OP, I got Frigate working with the GPU by using Docker, as the nested virtualization was fraught at best. I actually want to use my Hailo-8 AI card, but there is no way to load the drivers on TrueNAS, which is why I was trying to use a nested VM (my TrueNAS is on Proxmox).

But nested virt seems a no-go with PCIe passthrough: when I finally got the Hailo driver to load in the nested VM, it reset the whole physical server :) I quickly tried the GPU in the nested VM; that didn't reset the server, but the NVIDIA drivers wouldn't load.

So for now it's GPU passthrough from Proxmox to TrueNAS, with Docker on top. I wouldn't be messing with any of this if TrueNAS provided a way to load drivers with DKMS - say, a way to build sysext packages so the base OS is not touched (rough sketch of the idea below).
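To illustrate what I mean (purely a sketch of the systemd-sysext mechanics, not something TrueNAS supports today; the hailo_pci.ko module name and paths are my assumptions):

# hypothetical: package a pre-built out-of-tree module as a sysext image
# so the read-only base OS is never modified
EXT=hailo
KVER=$(uname -r)
mkdir -p "$EXT/usr/lib/modules/$KVER/extra" "$EXT/usr/lib/extension-release.d"
cp hailo_pci.ko "$EXT/usr/lib/modules/$KVER/extra/"
# sysext refuses to merge without a matching extension-release file
echo "ID=_any" > "$EXT/usr/lib/extension-release.d/extension-release.$EXT"
mksquashfs "$EXT" "/var/lib/extensions/$EXT.raw"
systemd-sysext merge                                  # overlay onto /usr
insmod "/usr/lib/modules/$KVER/extra/hailo_pci.ko"    # load from the overlay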

tl;dr: I'd prefer not to use this GPU for this, but I needed it working ASAP for reasons, and (not unexpectedly) nested virt is a bust.

I wouldn't bother with Incus for now. I moved from 24.10 to 25.04 and NOTHING is working regarding VMs. You can't install Windows easily, and you have to fiddle around a lot to get anything else working; it's absurd. Like you, I tried GPU passthrough and the GPU isn't detected or used in the Incus container at all.

I agree it definitely wasn't ready for prime time; it seems like they should have kept two branches, with Incus staying in beta.

FYI, I eventually traced the resets to an issue with the card and how it likes to do bus resets - that caused my server to reboot with a PCI SERR. I found the solution: a specific QEMU config setting to disable hotplug within the level-1 TrueNAS VM. I haven't had time to retry, as I currently have corrupted BMC firmware on my server causing lots of issues :(
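From memory it was something along these lines, so treat this as a sketch rather than the exact line (the VM ID is a placeholder):

# sketch from memory, may not be the precise setting: disable hotplug on
# every PCIe root port of the level-1 TrueNAS VM (VM ID 100 is a placeholder)
qm set 100 --args '-global pcie-root-port.hotplug=off'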

I'm having the same issue: I was not able to enable hardware acceleration inside my VM, and I can't add the GPU to it. Do you have any hints?

My mobo is with ASRock right now, so I can't check.

I think the patches in Fangtooth broadly fixed this for me. The command-line syntax is arcane, but I just added the GPU at the command line (after making sure no NVIDIA drivers were loaded for Apps) and it just worked.
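Roughly what that looked like here (instance name and PCI address from earlier in the thread, so substitute your own):

# make sure nothing on the TrueNAS host is still holding the card
lsmod | grep nvidia    # ideally prints nothing
# then attach it to the instance
sudo incus config device add frigate gpu0 gpu pci=0000:0e:00.0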

Are you getting an error, or something else?
And you're referring to an Incus VM, not an Incus LXC container, right?

I'm getting the following error in the guest VM:

dmesg | grep amdgpu
[    2.339626] [drm] amdgpu kernel modesetting enabled.
[    2.339871] amdgpu: CRAT table not found
[    2.339877] amdgpu: Virtual CRAT table created for CPU
[    2.339893] amdgpu: Topology: Add CPU node
[    2.344940] amdgpu 0000:09:00.0: BAR 6: can't assign [??? 0x00000000 flags 0x20000000] (bogus alignment)
[    2.348736] amdgpu 0000:09:00.0: amdgpu: Unable to locate a BIOS ROM
[    2.348756] amdgpu 0000:09:00.0: amdgpu: Fatal error during GPU init
[    2.348831] amdgpu 0000:09:00.0: amdgpu: amdgpu: finishing device.
[    2.350183] amdgpu: probe of 0000:09:00.0 failed with error -22

Incus VM, since I'm on 25.04.

Sorry, not many ideas about that; I don't use AMD.

You seem to have a BAR issue. Have you tried flipping the Resizable BAR setting in your BIOS?

Instead of the gpu device type you could also try the pci device type (check with the Incus command line which type the device currently is, gpu vs pci).
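A sketch of checking and switching (instance and device names as used earlier in the thread; the address comes from your dmesg output):

# see what is currently attached and whether its type is gpu or pci
incus config device show frigate
# swap the gpu-type device for a raw PCI passthrough of the same card
incus config device remove frigate gpu0
incus config device add frigate gpu0 pci address=0000:09:00.0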