Jailmaker sandbox: suddenly can’t ping or access it, but it has internet access

I have set up a Jailmaker sandbox using the tutorial from @Stux (thank you for that!) with the intention of migrating all my TrueCharts apps to it.

I named it docker and installed Dockge to manage my Docker Compose files, like in the tutorial.

Everything worked fine initially. I got my dashboard running, Jellyfin, speedtest… But this evening I wanted to migrate my arr stack, and suddenly I can’t access my sandbox over the network anymore. I can still access it using the TrueNAS shell and the “jlmkr shell docker” command.

I can’t ping the sandbox (it should be on IP 192.168.68.55 according to my router), but I can ping my laptop and google.com just fine from inside the sandbox shell.

Prior to losing the ability to connect to the sandbox, I changed nothing in the network settings. I was just restoring my backups from Radarr and Sonarr.

When I use ip a in the sandbox shell, I am greeted with 24 entries instead of the 3 that the tutorial shows. My knowledge of Linux is pretty limited, and I honestly have no idea where to start to fix this issue. Rebooting the sandbox, rebooting TrueNAS, and rebooting my router did not work. I have set a static IP address for the sandbox in my router (TP-Link Deco).

“ip a” output:

1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host noprefixroute
       valid_lft forever preferred_lft forever
2: mv-enp2s0@if3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
    link/ether 36:95:1b:6f:20:c4 brd ff:ff:ff:ff:ff:ff link-netnsid 0
    inet 192.168.68.55/22 metric 1024 brd 192.168.71.255 scope global dynamic mv-enp2s0
       valid_lft 6150sec preferred_lft 6150sec
    inet6 fe80::3495:1bff:fe6f:20c4/64 scope link
       valid_lft forever preferred_lft forever
3: br-094f61d70a4c: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue state DOWN group default
    link/ether 02:42:f3:4f:9c:07 brd ff:ff:ff:ff:ff:ff
    inet 172.22.0.1/16 brd 172.22.255.255 scope global br-094f61d70a4c
       valid_lft forever preferred_lft forever
4: br-0d7452a0f7be: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default
    link/ether 02:42:cc:a1:28:a2 brd ff:ff:ff:ff:ff:ff
    inet 172.21.0.1/16 brd 172.21.255.255 scope global br-0d7452a0f7be
       valid_lft forever preferred_lft forever
    inet6 fe80::42:ccff:fea1:28a2/64 scope link
       valid_lft forever preferred_lft forever
5: br-134597c215b4: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue state DOWN group default
    link/ether 02:42:66:7b:3c:fc brd ff:ff:ff:ff:ff:ff
    inet 172.24.0.1/16 brd 172.24.255.255 scope global br-134597c215b4
       valid_lft forever preferred_lft forever
6: br-5c46e4b8c66d: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default
    link/ether 02:42:3d:d4:ab:0a brd ff:ff:ff:ff:ff:ff
    inet 172.29.0.1/16 brd 172.29.255.255 scope global br-5c46e4b8c66d
       valid_lft forever preferred_lft forever
    inet6 fe80::42:3dff:fed4:ab0a/64 scope link
       valid_lft forever preferred_lft forever
7: br-941e37664e98: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue state DOWN group default
    link/ether 02:42:f8:f7:05:4d brd ff:ff:ff:ff:ff:ff
    inet 192.168.0.1/20 brd 192.168.15.255 scope global br-941e37664e98
       valid_lft forever preferred_lft forever
8: br-96470ae0dfb9: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default
    link/ether 02:42:21:82:5a:fb brd ff:ff:ff:ff:ff:ff
    inet 172.28.0.1/16 brd 172.28.255.255 scope global br-96470ae0dfb9
       valid_lft forever preferred_lft forever
    inet6 fe80::42:21ff:fe82:5afb/64 scope link
       valid_lft forever preferred_lft forever
9: br-53a8d608cb14: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue state DOWN group default
    link/ether 02:42:64:66:63:3b brd ff:ff:ff:ff:ff:ff
    inet 172.19.0.1/16 brd 172.19.255.255 scope global br-53a8d608cb14
       valid_lft forever preferred_lft forever
10: br-77bd0ee14cf4: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default
    link/ether 02:42:e8:54:a9:b5 brd ff:ff:ff:ff:ff:ff
    inet 172.20.0.1/16 brd 172.20.255.255 scope global br-77bd0ee14cf4
       valid_lft forever preferred_lft forever
    inet6 fe80::42:e8ff:fe54:a9b5/64 scope link
       valid_lft forever preferred_lft forever
11: br-798a8bd67805: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue state DOWN group default
    link/ether 02:42:56:fe:d9:7a brd ff:ff:ff:ff:ff:ff
    inet 172.23.0.1/16 brd 172.23.255.255 scope global br-798a8bd67805
       valid_lft forever preferred_lft forever
12: br-79a252d5c45b: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default
    link/ether 02:42:49:f2:f0:2f brd ff:ff:ff:ff:ff:ff
    inet 172.18.0.1/16 brd 172.18.255.255 scope global br-79a252d5c45b
       valid_lft forever preferred_lft forever
    inet6 fe80::42:49ff:fef2:f02f/64 scope link
       valid_lft forever preferred_lft forever
13: br-e7c6e724012f: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue state DOWN group default
    link/ether 02:42:b9:4b:15:92 brd ff:ff:ff:ff:ff:ff
    inet 172.31.0.1/16 brd 172.31.255.255 scope global br-e7c6e724012f
       valid_lft forever preferred_lft forever
14: docker0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue state DOWN group default
    link/ether 02:42:50:38:27:87 brd ff:ff:ff:ff:ff:ff
    inet 172.17.0.1/16 brd 172.17.255.255 scope global docker0
       valid_lft forever preferred_lft forever
16: vethdb2c95b@if15: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue master br-77bd0ee14cf4 state UP group default
    link/ether 26:0d:4c:00:fd:73 brd ff:ff:ff:ff:ff:ff link-netnsid 5
    inet6 fe80::240d:4cff:fe00:fd73/64 scope link
       valid_lft forever preferred_lft forever
18: veth8e2330f@if17: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue master br-0d7452a0f7be state UP group default
    link/ether 02:cf:ca:55:11:fe brd ff:ff:ff:ff:ff:ff link-netnsid 2
    inet6 fe80::cf:caff:fe55:11fe/64 scope link
       valid_lft forever preferred_lft forever
20: veth5d249dc@if19: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue master br-79a252d5c45b state UP group default
    link/ether ea:e1:79:f1:75:f3 brd ff:ff:ff:ff:ff:ff link-netnsid 1
    inet6 fe80::e8e1:79ff:fef1:75f3/64 scope link
       valid_lft forever preferred_lft forever
22: veth3e9819f@if21: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue master br-5c46e4b8c66d state UP group default
    link/ether 32:2d:50:9a:da:1d brd ff:ff:ff:ff:ff:ff link-netnsid 3
    inet6 fe80::302d:50ff:fe9a:da1d/64 scope link
       valid_lft forever preferred_lft forever
24: veth69da93f@if23: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue master br-96470ae0dfb9 state UP group default
    link/ether ca:08:da:d0:3b:0e brd ff:ff:ff:ff:ff:ff link-netnsid 4
    inet6 fe80::c808:daff:fed0:3b0e/64 scope link
       valid_lft forever preferred_lft forever
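
Side note for anyone reading along: with this many Docker bridges, one thing that can be ruled out quickly is whether any bridge subnet overlaps the LAN subnet on mv-enp2s0. Below is a minimal sketch in plain POSIX shell, using the addresses copied from the output above. The helper names (ip_to_int, cidr_range, cidr_overlap) are made up for this example and are not part of jlmkr or Docker.

```shell
#!/bin/sh
# Hypothetical helpers for illustration only.

# Convert a dotted-quad IPv4 address to a 32-bit integer.
ip_to_int() {
  old_ifs=$IFS
  IFS=.
  set -- $1            # split the address on dots into $1..$4
  IFS=$old_ifs
  echo $(( ($1 << 24) | ($2 << 16) | ($3 << 8) | $4 ))
}

# Print "first last" (as integers) for the network that contains addr/prefix.
cidr_range() {
  addr=${1%/*}
  prefix=${1#*/}
  ip=$(ip_to_int "$addr")
  mask=$(( (4294967295 << (32 - prefix)) & 4294967295 ))
  first=$(( ip & mask ))
  last=$(( first | (4294967295 - mask) ))
  echo "$first $last"
}

# Exit status 0 when the two CIDR blocks share at least one address.
cidr_overlap() {
  set -- $(cidr_range "$1") $(cidr_range "$2")
  [ "$1" -le "$4" ] && [ "$3" -le "$2" ]
}

# Addresses copied from the "ip a" output in this post.
lan="192.168.68.55/22"
bridges="172.22.0.1/16 172.21.0.1/16 172.24.0.1/16 172.29.0.1/16 192.168.0.1/20 172.28.0.1/16 172.19.0.1/16 172.20.0.1/16 172.23.0.1/16 172.18.0.1/16 172.31.0.1/16 172.17.0.1/16"

found=0
for b in $bridges; do
  if cidr_overlap "$lan" "$b"; then
    echo "overlap: $b shares addresses with $lan"
    found=1
  fi
done
if [ "$found" -eq 0 ]; then
  echo "no bridge overlaps $lan"
fi
```

For the addresses listed here it reports no overlap, so this particular snapshot does not show a clash. A hypothetical bridge at 192.168.64.1/20 would be flagged, since that block contains 192.168.68.x.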

Is anyone able to help me troubleshoot this?

The first question: is Dockge running in the sandbox?

If you run docker ps in the jail, does it show Dockge running?

Yes. docker stats shows that all my containers are running.
It’s good to note that it’s not just that I can’t open Dockge: I can’t ping the sandbox at all, while I was able to do so just fine before. To test, I’ve set up a quick new sandbox, and I can ping that one. I’m starting to think that it might be easier to restore the binds in that new sandbox, install Docker, and just restart all containers there. However, I would love to find out what on earth went wrong so that I can prevent it in the future (this also doesn’t instill much trust in this system for my critical self-hosted apps).

One of the issues that comes up, especially when using qBittorrent, is running out of kernel handles for watch resources etc.

Try searching the jailmaker thread for “too many files”

I suspect this could explain a networking failure.
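
If it helps, here is a small sketch for looking at the current limits. It assumes that the “too many files” posts are about inotify watch limits and system-wide file handles, which is my reading of the hint above and not something confirmed in this thread. The function name show_limits is made up for this example; the /proc paths are the standard Linux ones.

```shell
#!/bin/sh
# Print the inotify limits found under a directory as "name = value".
# The directory argument is optional and defaults to the real kernel path.
show_limits() {
  dir=${1:-/proc/sys/fs/inotify}
  for f in "$dir"/max_user_watches "$dir"/max_user_instances; do
    if [ -r "$f" ]; then
      echo "$(basename "$f") = $(cat "$f")"
    else
      echo "$(basename "$f") = (not readable)"
    fi
  done
}

show_limits

# System-wide file handles: allocated, unused, maximum.
if [ -r /proc/sys/fs/file-nr ]; then
  echo "file-nr = $(cat /proc/sys/fs/file-nr)"
fi
```

It only reads values and changes nothing, so it should be safe to paste into the jail shell or the TrueNAS shell.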

OK, so the problem is fairly fundamental then. It’s at the jail level.

Can you share the following please:

  1. Hardware Config with specific details on your network devices and configuration
  2. Your jail config (jlmkr edit [name])
  3. How is the IP assigned to the jail?

I did indeed have a qBittorrent container, but I had not set it up properly yet, and it was not running when the problems occurred or in the 2 days before. Of course, it might still be possible though.

Of course. Let me know if you need extra info, but if it involves using grep commands in the terminal, then you’ll have to tell me exactly what I need to enter, I’m afraid :sweat_smile:

1: CPU: Celeron J6413
RAM: 24GB (16 + 8GB) non-ECC
Storage: RAIDZ1 5x6TB with 250GB NVMe L2ARC
250GB NVMe boot drive
500GB SATA SSD (unmirrored) as app storage pool.
Network: 2x Intel i226-V and 1x RTL8125BG 2.5 Gbit. At the moment of failure I was using one of the Intel i226-V ports.
Connected to a TP-Link Deco XE75 Pro mesh network.
Version: Dragonfish-24.04.1.1

2: Jail config:

gpu_passthrough_intel=1
gpu_passthrough_nvidia=0
seccomp=1

systemd_nspawn_user_args=--network-macvlan=enp2s0
        --resolv-conf=bind-host
        --system-call-filter='add_key keyctl bpf'

        --bind='/mnt/apps/docker/stacks:/opt/stacks'
        --bind='/mnt/apps/docker/data:/mnt/data'
        --bind='/mnt/Klemmers/Klemmers/media:/mnt/media'
        --bind='/mnt/Klemmers/Klemmers/apps:/mnt/apps'
        --bind='/mnt/Klemmers/Klemmers/audiobooks:/mnt/audiobooks'

pre_start_hook=#!/usr/bin/bash
        set -euo pipefail
        echo 'PRE_START_HOOK'
        echo 1 > /proc/sys/net/ipv4/ip_forward
        modprobe br_netfilter
        echo 1 > /proc/sys/net/bridge/bridge-nf-call-iptables
        echo 1 > /proc/sys/net/bridge/bridge-nf-call-ip6tables

distro=debian
release=bookworm

initial_setup=#!/usr/bin/bash
        set -euo pipefail

        apt-get update && apt-get -y install ca-certificates curl
        install -m 0755 -d /etc/apt/keyrings
        curl -fsSL https://download.docker.com/linux/debian/gpg -o /etc/apt/keyrings/docker.asc
        chmod a+r /etc/apt/keyrings/docker.asc

        echo \
        "deb [arch=$(dpkg --print-architecture) signed-by=/etc/apt/keyrings/docker.asc] https://download.docker.com/lin>        $(. /etc/os-release && echo "$VERSION_CODENAME") stable" | \
        tee /etc/apt/sources.list.d/docker.list > /dev/null


        apt-get update
        apt-get install -y docker-ce docker-ce-cli containerd.io docker-buildx-plugin docker-compose-plugin dbus

        if [ -f /usr/bin/nvidia-smi ]; then
        curl -fsSL https://nvidia.github.io/libnvidia-container/gpgkey -o /etc/apt/keyrings/nvidia.asc
        chmod a+r /etc/apt/keyrings/nvidia.asc
        curl -s -L https://nvidia.github.io/libnvidia-container/stable/deb/nvidia-container-toolkit.list | \
        sed 's#deb https://#deb [signed-by=/etc/apt/keyrings/nvidia.asc] https://#g' | \
        tee /etc/apt/sources.list.d/nvidia-container-toolkit.list

        apt-get update
        apt-get install -y nvidia-container-toolkit

        nvidia-ctk runtime configure --runtime=docker
        systemctl restart docker
        fi

        docker info

systemd_run_default_args=--property=KillMode=mixed
        --property=Type=notify
        --property=RestartForceExitStatus=133
        --property=SuccessExitStatus=133
        --property=Delegate=yes
        --property=TasksMax=infinity
        --collect
        --setenv=SYSTEMD_NSPAWN_LOCK=0

systemd_nspawn_default_args=--keep-unit
        --quiet
        --boot
        --bind-ro=/sys/module
        --inaccessible=/sys/module/apparmor

3: The IP is assigned by a DHCP address reservation in my router.

By the way: I’ve spun up a new docker jail using the same jail config, installed Dockge there, and loaded all my old containers there. Everything is working again in that new jail with not much effort, but for how long… I sure hope I don’t need to spin up a new jail every week.

If you need more info, just ask.

Well, I don’t see anything wrong there.

:frowning:

I have found the trigger to reproduce the problem:

It happens whenever I try to install a container with this volume mapping:

volumes:
  - /var/run/docker.sock:/var/run/docker.sock

Then Dockge gets to “creating network” and shits the bed after that.
Luckily I have made myself a copy-paste tutorial to recreate my specific setup, so getting everything back up and running only takes 10 minutes now :sweat_smile: