Issue:
I cannot deploy any application that relies on ROCm compute in an Incus container because /dev/kfd is not passed through. I can access /dev/dri, but that is not enough to deploy ROCm applications.
Goal:
Deploy faster whisper with proper ROCm support and configuration. Technically the same thing can be done with the TrueNAS apps/docker compose, BUT there is no published docker image for my use case, and after changing the settings one apparently needs to rebuild the image, which makes it cumbersome to publish one myself (I guess, no experience there).
Tried:
Isolating the GPU and passing it through to a VM → the VM hangs (food for another post)
Recreating the container, both with and without passing the GPU in the settings
Restarting the system and the container
Checking the Incus container config in the TrueNAS shell (I was expecting to find /dev/dri there and just add /dev/kfd, but nope).
I do believe everything I need is there, I just lack the know-how to deploy it properly.
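For anyone wanting to reproduce the symptom, here is a minimal sketch of the check from inside the container shell. The function name check_gpu_nodes and its optional root-directory argument are made up for illustration; only the two device paths come from the post above.

```shell
#!/bin/sh
# Hypothetical helper (not part of Incus or ROCm): report which GPU device
# nodes are visible. The optional first argument is a root prefix, which
# defaults to the real filesystem root.
check_gpu_nodes() {
  root="${1:-}"
  for node in /dev/dri /dev/kfd; do
    if [ -e "${root}${node}" ]; then
      echo "present: ${node}"
    else
      echo "missing: ${node}"
    fi
  done
}

# Run against the container's own /dev
check_gpu_nodes
```

In the situation described here, this should report /dev/dri as present and /dev/kfd as missing.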
LXC won’t go away; the management plane will just switch from Incus to libvirt, and existing Incus LXC containers should automatically migrate to the libvirt backend.
You are right, and that seems to have done the trick! In case anyone else ends up here, these are the steps:
Set up the Incus container using the GUI, and make sure to add the GPU during initial setup.
Open a container shell and confirm that ls /dev/dri works; at this point ls /dev/kfd is expected to fail.
Give the container access to /dev/kfd for applications that depend on ROCm by opening a TrueNAS shell and running the following command (NOTE: change the gid to your TrueNAS system's render group ID):