Error in Apps Service after Dragonfish update from Cobia

I’m struggling after an update from Cobia to Dragonfish (went straight to 24.04.1) today. I had zero issues when I migrated from Bluefin to Cobia so I didn’t expect to have this problem.
The upgrade went through, but my Apps are not loading and I’m getting an error that says Error in Apps Service: Application(s) have failed to start: list index outside of range.
When I try to check or change the Application settings, everything fails with the following networking error:

Error: route_v4_interface
Timed out waiting 60 seconds for bond1: eno1=eno2 to come up

More info…
Error: Traceback (most recent call last):
  File "/usr/lib/python3/dist-packages/middlewared/job.py", line 469, in run
    await self.future
  File "/usr/lib/python3/dist-packages/middlewared/job.py", line 511, in __run_body
    rv = await self.method(*args)
         ^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3/dist-packages/middlewared/schema/processor.py", line 47, in nf
    res = await f(*args, **kwargs)
          ^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3/dist-packages/middlewared/schema/processor.py", line 187, in nf
    return await func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3/dist-packages/middlewared/plugins/kubernetes_linux/update.py", line 506, in do_update
    await self.validate_data(config, 'kubernetes_update', old_config)
  File "/usr/lib/python3/dist-packages/middlewared/plugins/kubernetes_linux/update.py", line 377, in validate_data
    verrors.check()
  File "/usr/lib/python3/dist-packages/middlewared/service_exception.py", line 70, in check
    raise self
middlewared.service_exception.ValidationErrors: [EINVAL] kubernetes_update.route_v4_interface: Timed out waiting 60 seconds for bond1: eno1=eno2 to come up

I have eno1 and eno2 set up for LACP (bond1) with what I thought was a pretty much default configuration. I have no issues connecting to my SMB shares or the truenas.local GUI.

There were not any problems under Cobia and I’m not sure how to resolve this.

I found the guide here: Cobia to DragonFish Storage Migration | TrueCharts Charts
But when I try to run the first command line, it errors out, again, because it can’t get bond1 to come up.

Help please!

For now, I have rolled back to Cobia 23.10.2

There seems to be a conflict in 24.04.1 between Kubernetes and my bonded LACP interface. Kubernetes wouldn’t connect, but it also wouldn’t release the interface, so I couldn’t find a way to delete and re-create it. I couldn’t get any apps to load with their previous settings, and I couldn’t install any new apps.

Did you add a description to your LACP interface; specifically in the description field?
If so, try removing the description and see if it helps.

I am suggesting this because it looks like there’s a bug in how the apps code handles the name of the interface if a description has been added.
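
To make the suspected failure mode concrete, here is a minimal Python sketch. It is a guess based only on the error text above ("bond1: eno1=eno2"), not the actual middlewared code, and the helper names `choice_label` and `interface_is_known` are hypothetical. The idea: if a label built from the name plus the description is used where a plain interface name is expected, nothing with that name ever comes up, so the wait times out.

```python
# Hypothetical illustration only; this is NOT the real middlewared code.
# Assumption: the apps code ends up with a "name: description" label
# where it needs the bare interface name.

def choice_label(name: str, description: str) -> str:
    """Build a display label such as 'bond1: eno1=eno2' (assumed format)."""
    return f"{name}: {description}" if description else name


def interface_is_known(candidate: str, kernel_names: set[str]) -> bool:
    """A wait-for-interface check can only succeed for a real interface name."""
    return candidate in kernel_names


kernel_names = {"eno1", "eno2", "bond1"}

with_description = choice_label("bond1", "eno1=eno2")
without_description = choice_label("bond1", "")

# The label with a description does not match any real interface.
print(with_description, interface_is_known(with_description, kernel_names))
# Without a description, the label is just the interface name.
print(without_description, interface_is_known(without_description, kernel_names))
```

That would also explain why clearing the description is enough to work around it.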

I didn’t try that. For now, I’m just happy that nothing was wrecked with my Cobia configuration. Everything is up and running for now. I’ll hold off on the Dragonfish for a few more maintenance updates and then try when I have more time on a weekend.

You can keep tabs on the issue, as I see it, in this report:
https://ixsystems.atlassian.net/browse/NAS-129150

Any (?) interface with a description would likely cause this issue.

Thanks for the additional information in the bug report.

Yes, I appreciate the additional information and bug report. It sounds like the exact issue I am having.

Neofusion, your suggestion to remove the description on my bridged interface worked for me. Thanks!

Just created an account for that. Thank you! It’s not only link aggregation devices that have this issue: even my “normal” LAN device stopped Kubernetes from working. After I deleted the description in the network settings, everything seems to be back to normal again.

Thank you a lot!

Any tips on getting the “description” removed? When I try, it errors saying that eno2 is in use and it won’t let me save changes.

Bingo!! Worked for me immediately. Had the same issue on upgrading to 24.04.1 and removing the description in the network interface dialog solved it right away.

NB - mine was not an aggregated LAN setup, just my one nic LAN interface. I had given it a name in the description field which, when removed, solved the post-upgrade kubernetes problem.
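
Since this apparently affects any interface with a description, a quick way to spot candidates is to filter the interface list for non-empty descriptions. A rough Python sketch follows. It assumes you have the interface list as JSON (I believe `midclt call interface.query` prints it, but treat that command and the `name` / `description` field names as assumptions to verify on your own system). The sample data is made up for illustration.

```python
import json


def interfaces_with_description(interfaces):
    """Return (name, description) pairs for interfaces whose description is non-empty."""
    return [
        (iface.get("name"), iface["description"])
        for iface in interfaces
        # Treat a missing key, null, or whitespace-only text as "no description".
        if (iface.get("description") or "").strip()
    ]


# Made-up sample shaped like the assumed JSON output.
sample = json.loads("""
[
  {"name": "bond1", "description": "eno1=eno2"},
  {"name": "eno1", "description": ""},
  {"name": "eno2", "description": null}
]
""")

for name, description in interfaces_with_description(sample):
    print(f"{name} has a description set: {description!r}")
```

Any interface this reports is one where you might try clearing the description before upgrading.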

Hi everyone,

I have the same issue, and it persisted after I rolled back to 23.10.

Screenshots below:

Thanks!

I also rolled back, went to the network settings, edited my bond (removed the description and unchecked Autoconfigure IPv6), saved, and then updated to Dragonfish with no issues. By the way, I’m interested in setting up sandboxes with Jailmaker. I’m done with TrueCharts apps.

TrueNAS Scale: Setting up Sandboxes with Jailmaker

Hi again,

After a few hours, I still see the “Initializing apps service” error, and all my apps stay in the “Deploying” state.

I did remove the description from the network interface, and IPv6 autoconfiguration is disabled.

Please help, thanks!

Now the service is initializing, and this is the status output for k3s:

systemctl status k3s
● k3s.service - Lightweight Kubernetes
     Loaded: loaded (/lib/systemd/system/k3s.service; disabled; preset: disabled)
     Active: activating (start) since Thu 2024-05-30 07:39:52 EEST; 1min 32s ago
       Docs: https://k3s.io
    Process: 178389 ExecStartPre=/sbin/modprobe br_netfilter (code=exited, status=0/SUCCESS)
    Process: 178390 ExecStartPre=/sbin/modprobe overlay (code=exited, status=0/SUCCESS)
   Main PID: 178391 (k3s-server)
      Tasks: 164
     Memory: 961.2M
        CPU: 5min 10.893s
     CGroup: /system.slice/k3s.service
             ├─178391 "/usr/local/bin/k3s server"
             └─178520 "containerd "

It appears the 24.04.1.1 update fixed this issue. I tried the upgrade again after checking the Bug thread linked above and everything is up and running with no issues (so far).

I never removed the description from my bond1.

Thanks for fixing this quickly!