Linux Jails (Containers/VMs) with Incus

Running into some weird issues with DinI (Docker in Incus) that I haven’t been able to pin down quite yet. They show up in one Incus container, but not the other.

I have two containers that are identically configured:

  1. apps1
  2. apps2

As seen in the configs in the OP, the mknod intercept is enabled:

  security.nesting: "true"
  security.syscalls.intercept.mknod: "true"
  security.syscalls.intercept.setxattr: "true"
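
For anyone applying these from the CLI rather than editing the YAML, a quick sketch (instance name from this thread):

incus config set apps2 security.nesting=true
incus config set apps2 security.syscalls.intercept.mknod=true
incus config set apps2 security.syscalls.intercept.setxattr=true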

apps1 will pull the image just fine and run it, but apps2 will fail. The file it fails on varies by image: this example is rocketchat, but I had similar issues with audiobookshelf failing on /usr/lib/libncurses*.

[+] Running 11/1
 ⠦ rocketchat 11 layers [⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿]      0B/0B      Pulling                                      73.6s 
failed to register layer: failed to mknod('/usr/bin/gcc-nm', S_IFCHR, 0): file exists
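
The whiteout files Docker creates while extracting layers are character devices with major:minor 0:0, which is exactly what the failed mknod() call above is making, and exactly what the mknod intercept is supposed to allow. A quick sketch to test the intercept directly in the failing instance (path is illustrative):

incus exec apps2 -- mknod /tmp/whiteout-test c 0 0
incus exec apps2 -- rm /tmp/whiteout-test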

I also see the following in the lxc.log:

lxc apps2 20250318190015.788 ERROR    seccomp - ../src/lxc/seccomp.c:seccomp_notify_handler:1555 - No such file or directory - Failed to send seccomp notification

It’s a head scratcher for sure… I’ll keep pecking away at it.

Decided to try rebooting TNCE to kick out all the cobwebs and got this from two of my instances. They just happen to be the ones that have nvidia.runtime enabled.

Traceback (most recent call last):
  File "/usr/lib/python3/dist-packages/middlewared/job.py", line 515, in run
    await self.future
  File "/usr/lib/python3/dist-packages/middlewared/job.py", line 560, in __run_body
    rv = await self.method(*args)
         ^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3/dist-packages/middlewared/plugins/virt/global.py", line 218, in setup
    await self._setup_impl()
  File "/usr/lib/python3/dist-packages/middlewared/plugins/virt/global.py", line 390, in _setup_impl
    raise CallError(result.get('error'))
middlewared.service_exception.CallError: [EFAULT] The following instances failed to update (profile change still saved):
 - Project: default, Instance: apps1: Failed to create instance update operation: Instance is busy running a "stop" operation
 - Project: default, Instance: apps2: Failed to create instance update operation: Instance is busy running a "stop" operation

All other instances started without issue.

incus ls
+-------+---------+------------------------------+------+-----------+-----------+
| NAME  |  STATE  |             IPV4             | IPV6 |   TYPE    | SNAPSHOTS |
+-------+---------+------------------------------+------+-----------+-----------+
| apps1 | STOPPED |                              |      | CONTAINER | 0         |
+-------+---------+------------------------------+------+-----------+-----------+
| apps2 | STOPPED |                              |      | CONTAINER | 0         |
+-------+---------+------------------------------+------+-----------+-----------+
| some  | RUNNING | 192.168.5.4 (eth0)           |      | CONTAINER | 0         |
|       |         | 172.19.0.1 (br-17a59effbcd4) |      |           |           |
|       |         | 172.18.0.1 (br-2a52f4d7eccb) |      |           |           |
|       |         | 172.17.0.1 (docker0)         |      |           |           |
+-------+---------+------------------------------+------+-----------+-----------+
| dns   | RUNNING | 192.168.0.8 (eth0)           |      | CONTAINER | 0         |
|       |         | 172.20.0.1 (br-5ad964b36f2a) |      |           |           |
|       |         | 172.19.0.1 (br-b973de70b0e7) |      |           |           |
|       |         | 172.18.0.1 (br-e3edf144ee96) |      |           |           |
|       |         | 172.17.0.1 (docker0)         |      |           |           |
+-------+---------+------------------------------+------+-----------+-----------+
| mgmt  | RUNNING | 192.168.0.20 (eth0)          |      | CONTAINER | 0         |
|       |         | 172.19.0.1 (br-75f5880b647f) |      |           |           |
|       |         | 172.18.0.1 (br-f8e25a1bda4e) |      |           |           |
|       |         | 172.17.0.1 (docker0)         |      |           |           |
+-------+---------+------------------------------+------+-----------+-----------+
| proxy | RUNNING | 192.168.0.10 (eth0)          |      | CONTAINER | 0         |
|       |         | 172.20.0.1 (br-5b3b983430b9) |      |           |           |
|       |         | 172.19.0.1 (br-528a9380741f) |      |           |           |
|       |         | 172.18.0.1 (br-eccb1c2fe7b9) |      |           |           |
|       |         | 172.17.0.1 (docker0)         |      |           |           |
+-------+---------+------------------------------+------+-----------+-----------+

The UI is borked too.

Attempting to start the instances fails…

incus start apps1
Error: Failed to run: /usr/libexec/incus/incusd forkstart apps1 /var/lib/incus/containers /run/incus/apps1/lxc.conf: exit status 1
Try `incus info --show-log apps1` for more info

and logs don’t work…

incus info --show-log apps1
Error: stat /proc/-1: no such file or directory

Restarting middlewared brings the UI back, machines still won’t start.

systemctl restart middlewared

Bug report submitted…


The workaround is loading a missing module, which I’ll add to the docker-init-nvidia.yaml config, or starting a jailmaker nvidia-enabled VM:

nvidia_uvm

Need to see about running a modprobe script on boot, I guess.
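
In the meantime, a sketch of loading it by hand; the modules-load.d drop-in is how stock Debian would persist it, though I’m assuming TNCE may not keep /etc changes across updates:

modprobe nvidia_uvm
echo nvidia_uvm > /etc/modules-load.d/nvidia-uvm.conf  # may not survive TNCE updates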

What’s weird is that even after a reboot, I can’t get those instances to start, even with the updated kernel modules:

  linux.kernel_modules: br_netfilter,drm,drm_kms_helper,nvidia,video,drm_ttm_helper,nvidia_modeset,nvidia_drm,nvidia_uvm
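
For reference, that key is set per instance with something like the following (a sketch using this thread’s instance name):

incus config set apps1 linux.kernel_modules=br_netfilter,drm,drm_kms_helper,nvidia,video,drm_ttm_helper,nvidia_modeset,nvidia_drm,nvidia_uvm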

As soon as I start my old jailmaker jail with nvidia enabled, I can then start those jails. I need to figure out what jailmaker is doing to the system to allow them to run…

Across my reboots, something isn’t quite right between docker, zfs, overlayfs, and xattrs.
config:

security.syscalls.intercept.setxattr: "true"

dmesg:

[  800.201465] overlayfs: failed to set xattr on upper
[  800.201725] overlayfs: ...falling back to redirect_dir=nofollow.
[  800.201945] overlayfs: ...falling back to uuid=null.
[  800.202146] overlayfs: try mounting with 'userxattr' option
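
The probe that’s failing here can be reproduced by hand. overlayfs stores its metadata in trusted.overlay.* xattrs, which an unprivileged user namespace can’t set; the userxattr option the kernel suggests switches it to user.overlay.* instead. A sketch, run inside the container with illustrative paths:

mkdir -p /tmp/ovl/lower /tmp/ovl/upper /tmp/ovl/work /tmp/ovl/merged
# The mount Docker attempts; watch dmesg for the same xattr fallbacks:
mount -t overlay overlay -o lowerdir=/tmp/ovl/lower,upperdir=/tmp/ovl/upper,workdir=/tmp/ovl/work /tmp/ovl/merged
umount /tmp/ovl/merged
# The variant dmesg suggests, using user.* xattrs:
mount -t overlay overlay -o lowerdir=/tmp/ovl/lower,upperdir=/tmp/ovl/upper,workdir=/tmp/ovl/work,userxattr /tmp/ovl/merged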

Looking through the code, this looks like something jailmaker handles:

    # Run nvidia-smi to initialize the nvidia driver
    # If we can't run nvidia-smi successfully,
    # then nvidia-container-cli list will fail too:
    if subprocess.run(["nvidia-smi", "-f", "/dev/null"]).returncode != 0:

EDIT: Yup… that needs to be run to initialize the nvidia drivers to allow jails to boot… but what else is it doing besides loading nvidia_uvm?

I guess it’s time to break out good old strace and lsof.
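
Something along these lines should show what nvidia-smi actually touches beyond loading modules (a sketch, run on the host):

# Which device nodes nvidia-smi opens or creates during init:
strace -f -e trace=openat,mknod,mknodat nvidia-smi -f /dev/null 2>&1 | grep /dev/nvidia
# What’s loaded and who’s holding the devices afterwards:
lsmod | grep nvidia
lsof /dev/nvidia* 2>/dev/null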

Ugh… WHY NVIDIA!!!

More references to this issue…


I added a POSTINIT job for:

/usr/bin/nvidia-smi -f /dev/null && /usr/bin/echo "NVIDIA Initialized"
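
If you’d rather script it than click through the UI, middlewared’s init/shutdown script service should accept something like this (a sketch; field names assumed from the initshutdownscript plugin):

midclt call initshutdownscript.create '{"type": "COMMAND", "command": "/usr/bin/nvidia-smi -f /dev/null", "when": "POSTINIT", "enabled": true}'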


While not perfect, apps2 booted automatically on reboot this time. Seems there is a timeout issue between Incus starting the instances and TNCE’s reboot sequence.

Maybe PREINIT will work; I’ll test that as well.


Success! It has to be set as PREINIT for the Incus containers to start successfully.

+-------+---------+------------------------------+------+-----------+-----------+
| NAME  |  STATE  |             IPV4             | IPV6 |   TYPE    | SNAPSHOTS |
+-------+---------+------------------------------+------+-----------+-----------+
| apps1 | RUNNING | 192.168.5.2 (eth0)           |      | CONTAINER | 0         |
|       |         | 172.24.0.1 (br-a5d596c6a63d) |      |           |           |
|       |         | 172.23.0.1 (br-861f8618b3bc) |      |           |           |
|       |         | 172.22.0.1 (br-4406e090596c) |      |           |           |
|       |         | 172.21.0.1 (br-4d3076e18000) |      |           |           |
|       |         | 172.20.0.1 (br-e68a758b1218) |      |           |           |
|       |         | 172.19.0.1 (br-771d47b7112e) |      |           |           |
|       |         | 172.18.0.1 (br-c1117e3a55c6) |      |           |           |
|       |         | 172.17.0.1 (docker0)         |      |           |           |
+-------+---------+------------------------------+------+-----------+-----------+
| apps2 | RUNNING | 192.168.5.3 (eth0)           |      | CONTAINER | 0         |
|       |         | 172.25.0.1 (br-29181b3b1866) |      |           |           |
|       |         | 172.24.0.1 (br-a641a717524a) |      |           |           |
|       |         | 172.23.0.1 (br-497452f186ef) |      |           |           |
|       |         | 172.22.0.1 (br-fae1c9ebcb8c) |      |           |           |
|       |         | 172.21.0.1 (br-18c612165e31) |      |           |           |
|       |         | 172.20.0.1 (br-17618b4b1048) |      |           |           |
|       |         | 172.19.0.1 (br-5ce19f3e680a) |      |           |           |
|       |         | 172.18.0.1 (br-f1b3172a99c0) |      |           |           |
|       |         | 172.17.0.1 (docker0)         |      |           |           |
+-------+---------+------------------------------+------+-----------+-----------+
| some  | RUNNING | 192.168.5.4 (eth0)           |      | CONTAINER | 0         |
|       |         | 172.19.0.1 (br-17a59effbcd4) |      |           |           |
|       |         | 172.18.0.1 (br-2a52f4d7eccb) |      |           |           |
|       |         | 172.17.0.1 (docker0)         |      |           |           |
+-------+---------+------------------------------+------+-----------+-----------+
| dns   | RUNNING | 192.168.0.8 (eth0)           |      | CONTAINER | 0         |
|       |         | 172.20.0.1 (br-5ad964b36f2a) |      |           |           |
|       |         | 172.19.0.1 (br-b973de70b0e7) |      |           |           |
|       |         | 172.18.0.1 (br-e3edf144ee96) |      |           |           |
|       |         | 172.17.0.1 (docker0)         |      |           |           |
+-------+---------+------------------------------+------+-----------+-----------+
| mgmt  | RUNNING | 192.168.0.20 (eth0)          |      | CONTAINER | 0         |
|       |         | 172.19.0.1 (br-75f5880b647f) |      |           |           |
|       |         | 172.18.0.1 (br-f8e25a1bda4e) |      |           |           |
|       |         | 172.17.0.1 (docker0)         |      |           |           |
+-------+---------+------------------------------+------+-----------+-----------+
| proxy | RUNNING | 192.168.0.10 (eth0)          |      | CONTAINER | 0         |
|       |         | 172.20.0.1 (br-5b3b983430b9) |      |           |           |
|       |         | 172.19.0.1 (br-528a9380741f) |      |           |           |
|       |         | 172.18.0.1 (br-eccb1c2fe7b9) |      |           |           |
|       |         | 172.17.0.1 (docker0)         |      |           |           |
+-------+---------+------------------------------+------+-----------+-----------+

The UI isn’t borked anymore either.

Maybe this should be considered a bug or feature request for Incus? After all, it has direct support for Nvidia.


I agree. When you add the nvidia.runtime=true config, it should do all of that automatically so everything functions correctly.


Updated the GPU FAQ in the OP with the new nvidia procedure.

Shows what an amazing job @Jip-Hop did with Jailmaker :slight_smile:


Yeah, it was a good project and worked very well.

If you find any bugs or things to improve in Incus, please file an issue on their GitHub.
It would be great if it’s fixed upstream in time for TrueNAS 25.10.

TrueNAS uses the LTS version of Incus via Debian, and that gets updated roughly every 3 months.

Yeah, if I run into an Incus-specific issue, I’ll definitely report it. This isn’t an Incus issue, but rather something iX will need to handle in middlewared for this to be done properly. Nvidia is really a mess on Linux.


Ok, this may be a missing container storage config; testing…

zfs.delegate=true
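
Note that zfs.delegate is a storage volume key rather than an instance key, so setting it looks something like this (a sketch; “default” as the pool name is an assumption):

incus storage volume set default container/apps2 zfs.delegate=true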

This is beyond annoying now. I’m open to ideas.

I’ve tried:

  1. sysctl values
  2. fuse-overlayfs
  3. additional security.* configs in Incus
  4. zfs.delegate=true

Going to test a non-nvidia container to see if this is still happening. EDIT: it is…

EDIT2: going to check out some of the security.syscalls.* options… Going back in.

I am a little lost. Is this about the nvidia devices that are present on the host but for some reason are not mounted in the instance?

No, see the post I replied to in the thread. It’s the mknod issues. Nvidia is finally squared away :blush:

Since I’m looking at anything right now: even though ZFS is the best option he lists in that thread, he suggests using block mode for the containers, i.e. a block-backed /var/lib/docker. I don’t feel like this should be needed, since jailmaker used ZFS just fine, but systemd-nspawn and Incus probably interact very differently with the system. At this point, I’m willing to give it a shot…

I did find this file in overlay2, but it’s down a different path?

root@apps2:/var/lib/docker# rgrep -i gcc-nm
grep: overlay2/39ab609bd8f1d4587f80c09a37a7e6ea8a7c3c44b9c6eca117d281cb9c59a7d5/diff/usr/bin/gcc-nm: binary file matches

Yes.

Also, when delegating ZFS, the container should have access to more ZFS features too.
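
A quick sketch to confirm the delegation took effect (assumes zfsutils-linux is installed inside the container):

incus exec apps2 -- zfs list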