TrueNAS Scale Jailmaker fails to start after power outage

I recently migrated from TrueCharts apps on Bluefin to Dragonfish with jailmaker/Docker. After an embarrassing amount of time getting it running, I had it up for a couple of weeks. This morning I woke up to a power outage, and tonight I noticed that my jail hadn’t restarted. Since I completed the migration it hasn’t started automatically after any reboot, but manual starts had been working. Now, when I run `jlmkr start docker`, I get "Job for jlmkr-docker.service failed" along with the usual pointer to the systemctl/journalctl logs. Here are the outputs:

`jlmkr start docker`:

```
admin@truenas[~]$ jlmkr list
[sudo] password for admin: 
NAME   RUNNING STARTUP GPU_INTEL GPU_NVIDIA OS     VERSION ADDRESSES
docker False   True    True      False      debian 12      -
admin@truenas[~]$ jlmkr start docker

Starting jail docker with the following command:

systemd-run --property=KillMode=mixed --property=Type=notify --property=RestartForceExitStatus=133 --property=SuccessExitStatus=133 --property=Delegate=yes --property=TasksMax=infinity --collect --setenv=SYSTEMD_NSPAWN_LOCK=0 --unit=jlmkr-docker --working-directory=./jails/docker '--description=My nspawn jail docker [created with jailmaker]' --property=ExecStartPre=/mnt/lackskill_pool01/jailmaker/jails/docker/.ExecStartPre -- systemd-nspawn --keep-unit --quiet --boot --bind-ro=/sys/module --inaccessible=/sys/module/apparmor --machine=docker --directory=rootfs --bind=/dev/dri --network-macvlan=enp7s0 --resolv-conf=bind-host '--system-call-filter=add_key keyctl bpf' --bind=/mnt/lackskill_pool01/jailmaker/data/docker/data:/mnt/data --bind=/mnt/lackskill_pool01/jailmaker/data/docker/stacks:/opt/stacks --bind=/mnt/lackskill_pool01/jailmaker:/mnt/jailmaker --bind=/mnt/lackskill_pool01/jailmaker/data/apps:/mnt/apps --bind=/mnt/lackskill_pool01/jailmaker/data:/data

Job for jlmkr-docker.service failed.
See "systemctl status jlmkr-docker.service" and "journalctl -xeu jlmkr-docker.service" for details.

Failed to start jail docker...
In case of a config error, you may fix it with:
jlmkr.py edit docker
```

`systemctl`/`journalctl` outputs:

```
admin@truenas[~]$ sudo systemctl status jlmkr-docker.service
Unit jlmkr-docker.service could not be found.
admin@truenas[~]$ sudo journalctl -xeu jlmkr-docker.service 
░░ 
░░ The unit jlmkr-docker.service has entered the 'failed' state with result 'exit-code'.
Jul 20 21:35:02 truenas systemd[1]: Failed to start jlmkr-docker.service - My nspawn jail docker [created with jailmaker].
░░ Subject: A start job for unit jlmkr-docker.service has failed
░░ Defined-By: systemd
░░ Support: https://www.debian.org/support
░░ 
░░ A start job for unit jlmkr-docker.service has finished with a failure.
░░ 
░░ The job identifier is 509 and the job result is failed.
Jul 20 21:35:40 truenas systemd[1]: Starting jlmkr-docker.service - My nspawn jail docker [created with jailmaker]...
░░ Subject: A start job for unit jlmkr-docker.service has begun execution
░░ Defined-By: systemd
░░ 
░░ A start job for unit jlmkr-docker.service has begun execution.
░░ 
░░ The job identifier is 1004.
Jul 20 21:35:40 truenas .ExecStartPre[5796]: PRE_START_HOOK
Jul 20 21:35:40 truenas systemd-nspawn[5799]: Failed to stat /mnt/lackskill_pool01/jailmaker/data/docker/data: No such file or directory
Jul 20 21:35:40 truenas systemd[1]: jlmkr-docker.service: Main process exited, code=exited, status=1/FAILURE
░░ Subject: Unit process exited
░░ Defined-By: systemd
░░ 
░░ An ExecStart= process belonging to unit jlmkr-docker.service has exited.
░░ 
░░ The process' exit code is 'exited' and its exit status is 1.
Jul 20 21:35:40 truenas systemd[1]: jlmkr-docker.service: Failed with result 'exit-code'.
░░ Subject: Unit failed
░░ Defined-By: systemd
░░ 
░░ The unit jlmkr-docker.service has entered the 'failed' state with result 'exit-code'.
Jul 20 21:35:40 truenas systemd[1]: Failed to start jlmkr-docker.service - My nspawn jail docker [created with jailmaker].
░░ Subject: A start job for unit jlmkr-docker.service has failed
░░ Defined-By: systemd
░░ 
░░ A start job for unit jlmkr-docker.service has finished with a failure.
░░ 
░░ The job identifier is 1004 and the job result is failed.
```

Here’s my config:

```
startup=1
gpu_passthrough_intel=1
gpu_passthrough_nvidia=0
# Turning off seccomp filtering improves performance at the expense of security
seccomp=1

# Use macvlan networking to provide an isolated network namespace,
# so docker can manage firewall rules
# Alternatively use --network-macvlan=eno1 instead of --network-bridge
# Ensure to change eno1/br1 to the interface name you want to use
# You may want to add additional options here, e.g. bind mounts
systemd_nspawn_user_args=--network-macvlan=enp7s0
        --resolv-conf=bind-host
        --system-call-filter='add_key keyctl bpf'
        --bind='/mnt/my-pool/jailmaker/data/docker/data:/mnt/data'
        --bind='/mnt/my-pool/jailmaker/data/docker/stacks:/opt/stacks'
        --bind='/mnt/my-pool/jailmaker:/mnt/jailmaker'
        --bind='/mnt/my-pool/jailmaker/data/apps:/mnt/apps'
        --bind='/mnt/my-pool/jailmaker/data:/data'

# Script to run on the HOST before starting the jail
# Load kernel module and config kernel settings required for docker
pre_start_hook=#!/usr/bin/bash
        set -euo pipefail
        echo 'PRE_START_HOOK'
        echo 1 > /proc/sys/net/ipv4/ip_forward
        modprobe br_netfilter
        echo 1 > /proc/sys/net/bridge/bridge-nf-call-iptables
        echo 1 > /proc/sys/net/bridge/bridge-nf-call-ip6tables

# Only used while creating the jail
distro=debian
release=bookworm

# Install docker inside the jail:
# https://docs.docker.com/engine/install/debian/#install-using-the-repository
# Will also install the NVIDIA Container Toolkit if gpu_passthrough_nvidia=1 during initial setup
# https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/latest/install-guide.html
initial_setup=#!/usr/bin/bash
        set -euo pipefail

        apt-get update && apt-get -y install ca-certificates curl
        install -m 0755 -d /etc/apt/keyrings
        curl -fsSL https://download.docker.com/linux/debian/gpg -o /etc/apt/keyrings/docker.asc
        chmod a+r /etc/apt/keyrings/docker.asc

        echo \
        "deb [arch=$(dpkg --print-architecture) signed-by=/etc/apt/keyrings/docker.asc] https://download.docker.com/linux/debian \
        $(. /etc/os-release && echo "$VERSION_CODENAME") stable" | \
        tee /etc/apt/sources.list.d/docker.list > /dev/null

        apt-get update
        apt-get install -y docker-ce docker-ce-cli containerd.io docker-buildx-plugin docker-compose-plugin

        # The /usr/bin/nvidia-smi will be present when gpu_passthrough_nvidia=1
        if [ -f /usr/bin/nvidia-smi ]; then
        curl -fsSL https://nvidia.github.io/libnvidia-container/gpgkey -o /etc/apt/keyrings/nvidia.asc
        chmod a+r /etc/apt/keyrings/nvidia.asc
        curl -s -L https://nvidia.github.io/libnvidia-container/stable/deb/nvidia-container-toolkit.list | \
        sed 's#deb https://#deb [signed-by=/etc/apt/keyrings/nvidia.asc] https://#g' | \
        tee /etc/apt/sources.list.d/nvidia-container-toolkit.list

        apt-get update
        apt-get install -y nvidia-container-toolkit

        nvidia-ctk runtime configure --runtime=docker
        systemctl restart docker
        fi

        docker info

# You generally will not need to change the options below
systemd_run_default_args=--property=KillMode=mixed
        --property=Type=notify
        --property=RestartForceExitStatus=133
        --property=SuccessExitStatus=133
        --property=Delegate=yes
        --property=TasksMax=infinity
        --collect
        --setenv=SYSTEMD_NSPAWN_LOCK=0

systemd_nspawn_default_args=--keep-unit
        --quiet
        --boot
        --bind-ro=/sys/module
        --inaccessible=/sys/module/apparmor
```

Updated to 24.04.2 tonight.

Tired and frustrated, so I’m probably missing something. Go easy on me. Thanks!

In the logged command, the ' in `'--system-call-filter=add_key keyctl bpf'` is in the wrong place. It should be after the =

Edit: that could be a bug in the logging from jailmaker though.
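For what it’s worth, a shell treats both quote placements as the same single argument, so the logged form may simply be how the command was printed rather than a different option being passed. A quick sketch to check this in bash (`show_args` is a helper made up for this illustration, not part of jailmaker):

```shell
#!/usr/bin/env bash
# Print each argument on its own line, in brackets, so word boundaries are visible.
show_args() {
    printf '[%s]\n' "$@"
}

# Quote around the whole word, as it appears in the logged command
show_args '--system-call-filter=add_key keyctl bpf'

# Quote only after the =, as it appears in the config
show_args --system-call-filter='add_key keyctl bpf'
```

Both calls print one bracketed line with identical content, which means the quote position alone should not change what systemd-nspawn receives.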

Binding the whole jailmaker directory into the jail (`--bind='/mnt/my-pool/jailmaker:/mnt/jailmaker'`) seems like a bad idea to me :-/

You’re mounting all the jails into the jail.

This is your problem: `Failed to stat /mnt/lackskill_pool01/jailmaker/data/docker/data: No such file or directory`.

So. It seems some of your mounts don’t exist. You seem a bit confused too. You need to mount stacks at /opt/stacks.
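One way to find which bind sources are missing before starting the jail is to check each host path by hand. Below is a minimal bash sketch, run on the host; `check_bind_sources` is a made-up helper name, and it assumes plain `SOURCE:TARGET` pairs with no colons inside the source path. The paths are the ones from the logged command, so adjust them to your pool.

```shell
#!/usr/bin/env bash
# Report which host-side bind sources exist.
# Each argument is a SOURCE:TARGET pair, as passed to --bind=.
# Assumption: the source path itself contains no ':' characters.
check_bind_sources() {
    local missing=0 pair src
    for pair in "$@"; do
        src=${pair%%:*}    # host path is everything before the first colon
        if [ -e "$src" ]; then
            echo "ok       $src"
        else
            echo "MISSING  $src"
            missing=1
        fi
    done
    return "$missing"      # non-zero if at least one source is missing
}

# Paths taken from the failing command in the log above
check_bind_sources \
    /mnt/lackskill_pool01/jailmaker/data/docker/data:/mnt/data \
    /mnt/lackskill_pool01/jailmaker/data/docker/stacks:/opt/stacks \
    /mnt/lackskill_pool01/jailmaker:/mnt/jailmaker \
    /mnt/lackskill_pool01/jailmaker/data/apps:/mnt/apps \
    /mnt/lackskill_pool01/jailmaker/data:/data \
    || echo "Create the missing directories (or fix the paths) before starting the jail."
```

Any line marked MISSING is a path systemd-nspawn would fail to stat, like the one in the journal output.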

Confirmed that the log output doesn’t match the config. The config has `--system-call-filter='add_key keyctl bpf'`

I’m full of bad ideas… In reality this was an evolution of not being able to get containers to see mounts that weren’t located within the jail, so I concocted that little gem of an idea after hours of troubleshooting that problem.

You sir are a gentleman and a scholar. That did fix it. Can I buy you a coffee or a beer?

My confusion is a 100% certainty. I thrashed for 50 hours over 4 days while on vacation between figuring out that TrueCharts wasn’t the path forward, not being able to get the official charts working, finding your guide on YouTube, and then messing up those instructions like 4 times. lol.