Linux Jails (containers/vms) with Incus

Everything works great and is no longer running with shift or privileged.

I need to get feature requests in for read-only passthrough mounts, cloud-init support, and GPU passthrough via the Web UI, but otherwise those are easy enough to do manually through the CLI for now.

For now, I can still push the instance creation with incus create (instead of incus launch) using my custom YAML with cloud-init, then trigger the start with the middleware, and everything works as expected with the idmapping applied correctly in the instance.
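In practice, that flow looks roughly like this (the instance name and YAML file below are placeholders for illustration):

incus create images:ubuntu/noble/cloud myapp < myapp.yml
midclt call virt.instance.start myapp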

Great update overall.

midclt call user.query '[["builtin", "=", false], ["userns_idmap", "=", null], ["local", "=", true]]' '{"select": ["username", "id"]}' | jq

This returns an array of JSON objects with only the username and id keys, limited to local users that aren’t builtin and don’t already have the idmap set.


All-in-one post summarizing the previous info, with examples for working with userns_idmap.


Modified user query:

midclt call user.query '[["builtin", "=", false], ["userns_idmap", "=", null], ["local", "=", true]]' '{"select": ["username", "id", "uid", "userns_idmap"]}' | jq

Example output:

[
  {
    "id": 78,
    "uid": 373,
    "username": "etesync",
    "userns_idmap": null
  },
...
]

Update user userns_idmap:

midclt call user.update 78 '{"userns_idmap": "DIRECT"}'

Updated user:

[
  {
    "id": 78,
    "uid": 373,
    "username": "etesync",
    "userns_idmap": "DIRECT"
  }
]


Groups query:

midclt call group.query '[["builtin", "=", false], ["userns_idmap", "=", null], ["local", "=", true]]' '{"select": ["name", "id", "gid", "userns_idmap"]}' | jq

Example output:

[
  {
    "id": 117,
    "gid": 373,
    "name": "etesync",
    "userns_idmap": null
  },
...
]

Update group userns_idmap:

midclt call group.update 117 '{"userns_idmap": "DIRECT"}'

Updated group:

[
  {
    "id": 117,
    "gid": 373,
    "name": "etesync",
    "userns_idmap": "DIRECT"
  }
]

Pull in updated userns_idmap.

Restart container:

midclt call virt.instance.restart docker1 -j
Status: (none)
Total Progress: [########################################] 100.00%

Container config output for the raw.idmap:

  incus config show docker1 | grep raw.idmap -A 4
  raw.idmap: |-
    uid 568 568
    uid 373 373
    gid 568 568
    gid 373 373

Unfortunately, I don’t think this one can be done via cloud-init, since restarting via the Web UI will wipe out custom raw.idmap entries. Technically, if you never interacted with the Web UI and managed it all from cloud-init and local incus commands, it could work, but that would likely get messy quickly.

I’ll research and see what needs to happen or if a script needs to be made to call the midclt to perform these tasks.

I added the apps user (kept the name the same to avoid confusion) to the docker-init.yaml and docker-init-nvidia.yaml configs in the OP. I also removed shift: true from all examples; it’s no longer needed.

This means you can create a new dataset owned by apps, spin up a new container with that dataset mounted, and be able to write to it by default.
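As a rough sketch (the pool/dataset and device names here are placeholders, not from the OP):

# Dataset owned by the apps user/group (UID/GID 568 on TrueNAS)
zfs create tank/appdata
chown apps:apps /mnt/tank/appdata

# Pass it through to the container as a disk device
incus config device add docker1 appdata disk source=/mnt/tank/appdata path=/mnt/appdata

The same mount could also be added through the Web UI or the cloud-init configs from the OP; this is just the CLI form.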

You can still use cloud-init for now if you want to and understand the risks. I redeployed all of my containers last night to test again, other than unifi, as it’s a nightmare to deploy automatically right now. I manually changed the config there to remove the shifting and privileged mode, and it worked fine after multiple reboots from midclt; after a full server reboot it started automatically without issue.

incus create images:ubuntu/noble/cloud nginx < nginx.yml
midclt call virt.instance.start nginx
incus console nginx --show-log

You can see the whole cloud-init process go through properly in the console from start to finish (Ctrl+A, Q to leave). Alternatively, you can watch the cloud-init process live with the command below instead of the console if you prefer.

incus exec nginx -- tail -f /var/log/cloud-init-output.log

The only real process change with a custom config is that you can’t start/restart the instance with incus directly, as it won’t apply the idmap; starting it with midclt, or letting the middleware start it during server boot, still applies it correctly.
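For example (using the nginx instance from the earlier example):

# Restarting directly with incus won't re-apply the middleware-managed raw.idmap
incus restart nginx

# Restarting through the middleware applies it correctly
midclt call virt.instance.restart nginx -j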

Yeah, it could work, but I want to make sure containers/VMs made by the scripts in the OP survive and don’t need anything weird for them to be used in the Web UI as well.

Currently, if you add the userns_idmap and do an incus restart <container>, it will not pick up those idmaps from the middleware (unless you use midclt to restart), which causes funky permission issues because the instance isn’t inheriting the raw.idmap.

This is fine if you’re adding that via cloud-init and don’t ever intend to use the Web UI. Plus it would likely break in the future. That’s why I’m trying to keep it within the ecosystem.

Is it safe to clear the DIRECT mapping property from the automatically added apps user if it isn’t needed, to minimize the amount of mapping pushed, since it applies to all instances?

I don’t think there’s much risk from this direct mapping. What is the concern?

The only thing I do via the CLI is the initial setup via YAML with cloud-init. All of my config options are Web UI friendly right now other than the Intel GPU passed to Plex, but that currently doesn’t error in the Web UI like the read-only mount did. I can edit the containers there fine without breaking anything. I killed my read-only mounts for now until I see whether it ends up being accepted as a feature request once I get time to make it. Everything can be started/stopped/edited from the Web UI without issue. I’m happy to share some of my YAML files directly; however, I’m not putting them up publicly like I did with my past jail deployment guides in the old CORE forum resources, as they have some deployment stuff specific to my use cases/deployment methods for some apps and I don’t have time to troubleshoot for others like I did back then.


I may just leave it, but as a general preference I’d rather not have extra uid/gid mappings that I’m not explicitly using, even though it doesn’t harm anything.

Sure, I always like to see how others handle similar tasks. :slight_smile:

Updated the configs to add the local default and root users to the apps group. Also modified the OP with information about setting the permissions on your host folders. If you set your host data to apps:apps, then your container shouldn’t have any issues reading/writing to the datasets passed through, as long as the container users are members of the apps group and your host permissions are 775 for directories and 664 for files.
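A minimal sketch of that host-side setup (the dataset path is a placeholder):

chown -R apps:apps /mnt/tank/appdata
find /mnt/tank/appdata -type d -exec chmod 775 {} +
find /mnt/tank/appdata -type f -exec chmod 664 {} +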

I have successfully cut over 2 of my jailmaker instances using the scripts in the OP (modified for my environment, of course :wink:). Should be pretty straightforward for most with experience. Working great. :slight_smile:


Have to figure out what is going on with sockets in Dockge. After switching over to Incus, even over a bridge, I’m getting slow connections and disconnects between Dockge agents, which I didn’t have an issue with in jailmaker. My jails were on Docker 27.5.1; the Incus instances are on 28.0.1. I’ll see if that plays into anything here.

2025-03-13T19:01:53Z [AGENT-MANAGER] ERROR: Error from the socket server: apps1:5001
2025-03-13T19:01:53Z [AGENT-MANAGER] ERROR: Error from the socket server: apps2:5001
2025-03-13T19:01:53Z [AGENT-MANAGER] ERROR: Error from the socket server: dns:5001
2025-03-13T19:01:53Z [AGENT-MANAGER] ERROR: Error from the socket server: proxy:5001
2025-03-13T19:02:14Z [AGENT-MANAGER] ERROR: Error from the socket server: apps2:5001
2025-03-13T19:02:14Z [AGENT-MANAGER] ERROR: Error from the socket server: dns:5001
2025-03-13T19:02:14Z [AGENT-MANAGER] ERROR: Error from the socket server: proxy:5001
2025-03-13T19:02:15Z [AGENT-MANAGER] ERROR: Error from the socket server: apps1:5001

It is working, just slow and annoying right now.

:thinking:

Not working in RC1… Looking…

Playing with Fangtooth RC1 on a box: I created an Incus instance and enabled the GPU in devices, but I can’t run nvidia-smi in the instance (I did install the container toolkit). Comparing it to my jlmkr instance, a lot of libnvidia* files that I would expect to see are missing from the /usr/lib/x86_64-linux-gnu folder.

Using incus config show, I noticed the nvidia.runtime: true line is missing from the config.

Is there some way to take this Web UI-built instance and ‘add’ this to the config? Will this likely resolve my issue and get the GPU visible and usable in the Incus instance, or is this just not something expected to work?

Just something I found out about Incus: you need to set image.os: Windows in the config for it to apply some Windows-specific things to QEMU.
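A minimal sketch of that (the VM name is hypothetical, and this assumes the key is set at creation time with -c):

# Empty VM shell for a Windows install; image.os=Windows tells Incus to apply
# its Windows-specific handling to QEMU
incus init win11-vm --empty --vm -c image.os=Windows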


Can you run the following command and paste the results formatted as a code block for readability? Thanks.

incus config show <instance-name> --expanded

But getting Nvidia working can be configured with (off the top of my head):

incus config set <instance-name> nvidia.runtime=true

PS - I have a config in the OP that will configure Nvidia from the ground up.

Going to go through these. I know the default TNS values used to be set too low… not sure if that’s been tweaked since apps moved to Docker.

So, something DNS-related since moving to Incus… looking. Going directly to the IP connects immediately.