## Summary
After upgrading to 26.0.0-BETA.1, Apps refuse to start on every boot. `docker.status` returns:
`{"description": "Application(s) have failed to start:\n[EFAULT] Unable to determine default interface", "status": "FAILED"}`
`docker.service` stays inactive, and any new app install (e.g. Tailscale) fails with "docker not installed". 25.10.3 was unaffected; the issue started immediately after the BETA upgrade.
By the time the system is reachable, the default route is up and the interface is healthy. The error is stale: it is set during early boot, and middlewared never retries.
## Reproduction
1. 26.0.0-BETA.1, Apps pool selected, at least one app configured (Tailscale, SearXNG, etc.).
2. Reboot.
3. Once SSH is up:
```bash
sudo midclt call docker.status
# -> {"description": "...Unable to determine default interface", "status": "FAILED"}
sudo systemctl is-active docker
# -> inactive
ip route show default
# -> default via 192.168.0.1 dev enp0s31f6 proto dhcp src 192.168.0.48 metric 1002
cat /sys/class/net/enp0s31f6/operstate
# -> up
```

So the default route and the interface are both up, yet `docker.status` is still FAILED.
## Root cause (middlewared source dive)
In `/usr/lib/python3/dist-packages/middlewared/utils/interface.py`:
```python
def wait_for_default_interface_link_state_up() -> tuple[str | None, bool]:
    default_interface = get_default_interface()
    if default_interface is None:
        return default_interface, False
    return default_interface, wait_on_interface_link_state_up(default_interface)
```
`get_default_interface()` is called exactly once. If `/proc/net/route` does not yet contain a default route at that instant (race against NetworkManager / DHCP), it returns `None` and the helper gives up immediately. The 60 s `IFACE_LINK_STATE_MAX_WAIT` budget is only consumed when polling `operstate` after a default interface has already been resolved, so it does not cover discovery of the default interface itself.

When this happens at boot, `docker.state.start_service` fails, sets the state to FAILED, and never retries. The state stays FAILED until the operator intervenes manually.
Verification that the helper itself works post-boot:
```bash
sudo python3 -c "from middlewared.utils.interface import \
    get_default_interface, wait_for_default_interface_link_state_up; \
    print('default:', get_default_interface()); \
    print('wait:', wait_for_default_interface_link_state_up())"
# -> default: enp0s31f6
# -> wait: ('enp0s31f6', True)
```
The helper itself is correct; only the single-shot discovery is racy.
## Workaround (every reboot)
```bash
sudo midclt call -j docker.fs_manage.mount
sudo midclt call docker.state.start_service
sudo midclt call docker.status
# -> {"description": "Application(s) are currently running", "status": "RUNNING"}
```
After this, all installed apps come up and new app installs work.
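
If you would rather not type these by hand after every reboot, the same three calls can be wrapped in a small script. The following is only a sketch: `recover_docker` and `run_midclt` are names I made up, it assumes `midclt call docker.status` prints a JSON object with a `status` key (as in the output above), and it has to run as root.

```python
import json
import subprocess
from typing import Callable

# The two recovery calls from the workaround above, in the same order.
RECOVERY_CALLS = [
    ["midclt", "call", "-j", "docker.fs_manage.mount"],
    ["midclt", "call", "docker.state.start_service"],
]
STATUS_CALL = ["midclt", "call", "docker.status"]


def run_midclt(argv: list[str]) -> str:
    """Run one midclt command and return its stdout (raises on non-zero exit)."""
    return subprocess.run(argv, check=True, capture_output=True, text=True).stdout


def recover_docker(run: Callable[[list[str]], str] = run_midclt) -> str:
    """Apply the workaround only when docker.status reports FAILED.

    Returns the final status string. `run` is injectable so the logic can be
    exercised without a TrueNAS system.
    """
    status = json.loads(run(STATUS_CALL))["status"]
    if status != "FAILED":
        return status  # nothing to do
    for argv in RECOVERY_CALLS:
        run(argv)
    return json.loads(run(STATUS_CALL))["status"]
```

Calling `recover_docker()` as root after boot should leave a healthy system alone and only issue the mount and start calls when the stale FAILED state is present.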
## Suggested fix
Poll `get_default_interface()` within the existing `IFACE_LINK_STATE_MAX_WAIT` budget, then use the remaining budget for the `operstate` check. A patch is ready; the total worst-case wait stays bounded at the same 60 s.
## Why I’m posting here instead of Jira
I created a jira.ixsystems.com account but the Create Issue page returns “You are not authorized to perform this operation.” Posting here so iX staff can either re-file in Jira or unblock my account; happy to attach a GitHub PR against truenas/middleware:master once a Jira ticket exists.
## Environment
- TrueNAS 26.0.0-BETA.1 (multi-boot with 25.10.3, 25 path was unaffected)
- Hardware: <fill in: motherboard / NIC>
- Default route iface: `enp0s31f6` (Intel I219-V, 1G), DHCP from upstream router
- Apps pool: `data`
- Apps installed: tailscale, searxng (both come up after manual `start_service`)

I can attach `journalctl -u middlewared -b 0` and `cat /proc/net/route` snapshots from boot if helpful.