Tailscale fails to deploy

HP ProLiant DL360 Gen9
32 GB RAM
Intel(R) Xeon(R) CPU E5-2620 v3 @ 2.40GHz
Dragonfish-24.04.1.1 (this happened before I upgraded to this version, so it can’t be caused by the version I’m on now)
2x 2TB SSD Mirror

Tailscale is not working. I have already deleted it and added it back, but it does the same thing: it gets stuck on deploying, and it has been like that for days. Last time this happened I had to reload TrueNAS. Thank goodness I’ve only been testing it out, but I don’t want to reload TrueNAS again. Can anyone help?

2024-06-04 11:47:36 Startup probe failed: Logged out.

2024-06-04 11:07:37 Container image "tailscale/tailscale:v1.66.4" already present on machine

2024-06-03 20:12:45 Back-off restarting failed container tailscale in pod tailscale-677457658-brpsq_ix-tailscale(4361cb6b-692e-4be0-a31f-b7a3285752db)


This may be related to a long-standing bug that we have not been able to reproduce previously. I suggest using the “Report a bug” link above to submit a bug report. After you submit it a comment will be added with a link to securely upload a debug file from the system so we can investigate it.

OK thanks

Same thing here.
It’s been 3 days that Tailscale is SUDDENLY not deploying.
Very annoying, since I got no notification about it and I made no changes that could have caused this.

My Tailscale does deploy, but in my case, I can’t deploy Jellyfin, it just redeploys and redeploys all the time, is that the same behaviour as what you have?

Hello,
I’m using Tailscale 1.66.3 on TrueNAS SCALE.
Version 1.66.4 crashes with a CrashLoopBackOff error.
I had to modify the image to reinstall version 1.66.3.
Am I the only one?

same here

I also tried upgrading tailscale to 1.66.4 but it failed to deploy. I ended up rolling back to 1.66.3 (Chart Version: 1.0.39).

OS Version: TrueNAS-SCALE-24.04.1
Product: FREENAS-MINI-2.0
Model: Intel(R) Atom™ CPU C2750 @ 2.40GHz
Memory: 31 GiB

Edit: I found the solution.

After updating, edit the tailscale settings.

Under: Extra Arguments
Click Add and enter --reset

Save and deploy.


Same here, I tried first updating Tailscale while on Cobia. Failed, so rolled back.

Then I updated to Dragonfish 24.04.1.1 today, and attempted to update Tailscale again. Same problem, stuck at Deploying…

Rolled back to 1.0.39, which is working OK now.


“--reset”

It had been stuck on deploying for a week; I added said argument and now it is running.

I had a similar issue. I only had one app - tailscale - and on one truenas it worked fine and on another it would continuously deploy. The pod never started because the auth key for tailscale never got used and the new machine never showed on tailscale admin page. I had even removed the app service, deleted the ix-applications dataset area and recreated it all to no avail.
Even with the extra argument --reset added, it still won’t deploy, and I’m unable to find much in the way of docs for connecting directly to Kubernetes here, or any real log info about what is wrong. I can see log info for the pod as shown below (and this seems consistent across multiple attempts):

2024-06-24 22:28:08.000523-04:00boot: 2024/06/25 02:28:08 error checking get permission on secret tailscale-tailscale-secret: Post "https://kubernetes.default.svc/apis/authorization.k8s.io/v1/selfsubjectaccessreviews": dial tcp: lookup kubernetes.default.svc on 172.17.0.10:53: write udp 192.168.0.80:52550->172.17.0.10:53: write: operation not permitted
2024-06-24 22:28:08.000824-04:00boot: 2024/06/25 02:28:08 error checking update permission on secret tailscale-tailscale-secret: Post "https://kubernetes.default.svc/apis/authorization.k8s.io/v1/selfsubjectaccessreviews": dial tcp: lookup kubernetes.default.svc on 172.17.0.10:53: write udp 192.168.0.80:60240->172.17.0.10:53: write: operation not permitted
2024-06-24 22:28:08.001026-04:00boot: 2024/06/25 02:28:08 error checking patch permission on secret tailscale-tailscale-secret: Post "https://kubernetes.default.svc/apis/authorization.k8s.io/v1/selfsubjectaccessreviews": dial tcp: lookup kubernetes.default.svc on 172.17.0.10:53: write udp 192.168.0.80:43507->172.17.0.10:53: write: operation not permitted
2024-06-24 22:28:08.001273-04:00boot: 2024/06/25 02:28:08 error setting up for running on Kubernetes: Getting Tailscale state Secret tailscale-tailscale-secret: Get "https://kubernetes.default.svc/api/v1/namespaces/ix-tailscale/secrets/tailscale-tailscale-secret": dial tcp: lookup kubernetes.default.svc on 172.17.0.10:53: write udp 192.168.0.80:35040->172.17.0.10:53: write: operation not permitted

This looks to be some sort of configuration or permissions issue but I can’t seem to find anything but dead ends on the web here. Any tips appreciated, as the --reset did not help me at least.
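One thing the four log lines above share: every failure happens during the DNS lookup of kubernetes.default.svc, before any request reaches the Kubernetes API. A minimal Python sketch that pulls those fields out of one of the lines (the regex and the function name are illustrative only, not part of any Tailscale or TrueNAS tooling):

```python
import re

# Matches the Go resolver error that ends each pod log line above.
DNS_ERR = re.compile(
    r"lookup (?P<host>\S+) on (?P<resolver>[\d.]+:\d+): "
    r"write udp (?P<src>[\d.]+:\d+)->(?P<dst>[\d.]+:\d+): "
    r"write: (?P<reason>.+)$"
)

def parse_dns_failure(line):
    """Return the DNS failure fields of a log line, or None if it has none."""
    match = DNS_ERR.search(line.strip())
    return match.groupdict() if match else None

# First log line from the post, split only for readability.
sample = (
    "2024-06-24 22:28:08.000523-04:00boot: 2024/06/25 02:28:08 "
    "error checking get permission on secret tailscale-tailscale-secret: "
    'Post "https://kubernetes.default.svc/apis/authorization.k8s.io/v1/'
    'selfsubjectaccessreviews": dial tcp: lookup kubernetes.default.svc '
    "on 172.17.0.10:53: write udp 192.168.0.80:52550->172.17.0.10:53: "
    "write: operation not permitted"
)

info = parse_dns_failure(sample)
print(info["resolver"], "<-", info["src"], ":", info["reason"])
```

Read this way, the packet that fails is a UDP write from 192.168.0.80 to the resolver at 172.17.0.10:53. On Linux, "operation not permitted" on a send is often the local kernel refusing the packet (for example because of a firewall rule) rather than a Kubernetes permission denial, but that interpretation is an assumption and is not confirmed anywhere in this thread.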

Also, it would be nice to know how --reset was discovered or where that is documented - i.e. why does that parameter help in some cases?

@tegra did you file a bug? if so, can we have the link here? i may have something to add to it.

The Tailscale changelog lists the updates and changes made. There is a new release, 1.68.1, currently available in the TrueNAS catalog of apps.
TrueCharts appears to be on 1.66.3.

Tailscale changelog: Changelog · Tailscale

Perhaps the answer to the issues is in the list of recent changes.

yes…huh will try this now - i can see yesterday night there was an update to the container version as highlighted here

ah to no avail. same kind of error. here’s more log info in case this helps identify the issue

2024-06-26 09:00:36
Back-off restarting failed container tailscale in pod tailscale-7b44c768b5-rvxr7_ix-tailscale(8a9f7ae8-b09f-4ae0-8c96-b76369ae7f15)
2024-06-26 08:59:27
Startup probe errored: rpc error: code = Unknown desc = failed to exec in container: container is in CONTAINER_EXITED state
2024-06-26 08:59:08
Pod sandbox changed, it will be killed and re-created.
2024-06-26 08:58:55
Stopping container tailscale
2024-06-26 08:58:54
Started container tailscale
2024-06-26 08:58:51
Created container tailscale
2024-06-26 08:58:45
Container image "tailscale/tailscale:v1.68.1" already present on machine
2024-06-26 08:58:40
Scaled up replica set tailscale-7b44c768b5 to 1
2024-06-26 08:58:40
Created pod: tailscale-7b44c768b5-rvxr7
2024-06-26 08:58:40
Successfully assigned ix-tailscale/tailscale-7b44c768b5-rvxr7 to ix-truenas
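Read oldest-first, the events above show the container being stopped roughly a second after it was started. A small Python sketch that makes the gap explicit, using timestamps copied from the list (the function and variable names are illustrative, and the last message is abbreviated):

```python
from datetime import datetime

# (timestamp, message) pairs copied from the event list above, oldest first.
EVENTS = [
    ("2024-06-26 08:58:51", "Created container tailscale"),
    ("2024-06-26 08:58:54", "Started container tailscale"),
    ("2024-06-26 08:58:55", "Stopping container tailscale"),
    ("2024-06-26 08:59:27", "Startup probe errored"),
]

def seconds_between(events, first, second):
    """Seconds from the first event starting with `first` to the first starting with `second`."""
    fmt = "%Y-%m-%d %H:%M:%S"
    times = {}
    for stamp, message in events:
        for prefix in (first, second):
            if message.startswith(prefix) and prefix not in times:
                times[prefix] = datetime.strptime(stamp, fmt)
    return (times[second] - times[first]).total_seconds()

uptime = seconds_between(EVENTS, "Started container", "Stopping container")
print(f"container ran for about {uptime:.0f} s before being stopped")
```

The event timestamps only have one-second resolution, so the figure is approximate. It does suggest the container exits almost immediately and the startup probe then finds it already gone, which would fit a failure during startup rather than a slow start.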

pod logs

2024-06-26 09:01:12.508381-04:00boot: 2024/06/26 13:01:12 error checking get permission on secret tailscale-tailscale-secret: Post "https://kubernetes.default.svc/apis/authorization.k8s.io/v1/selfsubjectaccessreviews": dial tcp: lookup kubernetes.default.svc on 172.17.0.10:53: write udp 192.168.0.80:56760->172.17.0.10:53: write: operation not permitted
2024-06-26 09:01:12.508534-04:00boot: 2024/06/26 13:01:12 error checking update permission on secret tailscale-tailscale-secret: Post "https://kubernetes.default.svc/apis/authorization.k8s.io/v1/selfsubjectaccessreviews": dial tcp: lookup kubernetes.default.svc on 172.17.0.10:53: write udp 192.168.0.80:59987->172.17.0.10:53: write: operation not permitted
2024-06-26 09:01:12.508819-04:00boot: 2024/06/26 13:01:12 error checking patch permission on secret tailscale-tailscale-secret: Post "https://kubernetes.default.svc/apis/authorization.k8s.io/v1/selfsubjectaccessreviews": dial tcp: lookup kubernetes.default.svc on 172.17.0.10:53: write udp 192.168.0.80:44615->172.17.0.10:53: write: operation not permitted
2024-06-26 09:01:12.509003-04:00boot: 2024/06/26 13:01:12 error setting up for running on Kubernetes: Getting Tailscale state Secret tailscale-tailscale-secret: Get "https://kubernetes.default.svc/api/v1/namespaces/ix-tailscale/secrets/tailscale-tailscale-secret": dial tcp: lookup kubernetes.default.svc on 172.17.0.10:53: write udp 192.168.0.80:49965->172.17.0.10:53: write: operation not permitted

so the pod logs look to still be showing trouble getting a secret (although the error message looks to mention a write operation is not permitted).

additional searching on something like “Getting Tailscale state Secret tailscale-tailscale-secret dial tcp write operation not permitted” leads us to a dead-end post that i had found previously on the old truenas forum - Tailscale app: error checking get permission on secret file | TrueNAS Community

fwiw - i’m attempting to bind the tailscale container to the host network by checking the app’s host network checkbox - i have seen this work fine on another truenas system of the same version - just not for the one i need it on. i have also tried unchecking that host network option - no real difference.

is there a documented/supported way to connect to the kubernetes cluster that runs here? that may be necessary to troubleshoot further.
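For what it is worth, a hedged sketch of one way to look at the cluster from the host. It assumes the apps on this release run on k3s and that the `k3s kubectl` wrapper is available in a root shell on the TrueNAS host; whether that counts as a supported path is not established in this thread. The namespace ix-tailscale is taken from the pod names in the events above, and the helper name is made up for illustration.

```python
import shutil
import subprocess

NAMESPACE = "ix-tailscale"  # taken from the pod names in the events above

def k3s_kubectl(*args, namespace=NAMESPACE):
    """Build a `k3s kubectl` command line scoped to one namespace."""
    return ["k3s", "kubectl", "--namespace", namespace, *args]

if __name__ == "__main__":
    if shutil.which("k3s") is None:
        # Assumption: this script is meant to run on the TrueNAS host as root.
        print("k3s not found on PATH; nothing to run here")
    else:
        commands = (
            k3s_kubectl("get", "pods", "-o", "wide"),
            k3s_kubectl("get", "events", "--sort-by=.lastTimestamp"),
        )
        for cmd in commands:
            print("$", " ".join(cmd))
            # check=False: keep going even if one command fails.
            subprocess.run(cmd, check=False)
```

Both subcommands are read-only, so this only lists pods and recent events for the app and changes nothing in the cluster.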

I began a separate thread, believing my problems to be different from this thread, but now I am not too sure. Sorry for duplicating, here’s mine (so far):

Ought I to attempt to move it to become a post on the end of this one here?

I shall be pleased to submit a bug report too if you want me to - please let me know.

EB

Edit: I’m using the official truenas app for Tailscale found in the apps and it is working.

I didn’t make it complicated and followed Tom Lawrence’s video on setting up the Tailscale app on TrueNAS SCALE. I initially set it up under Cobia, but after updating to the latest stable Dragonfish (24.04.1.1) the install broke when the Tailscale app wanted an update. I had to roll back to the previous working version of the app, then set up a new instance and remove the old one.

When I first did the setup, following Tom’s video, I did the following:
entered the auth key from Tailscale
hostname to appear on tailscale (I set a host name different from my Truenas servers hostname).
set the route as the route of my network as in this example 192.168.1.0/24 so I could use exit node
Checked advertise exit node
Checked userspace >> I think this needs to be checked
Checked host network. >>without doing this I couldn’t get the app to talk to the local network.
logged into my tailscale network (at some point, I don’t remember exactly when in the setup)
Allowed app to deploy
set app to run
In my tailscale site I went to the server name, then to edit route settings, and enabled the options there.
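As a rough illustration of what the settings in this list amount to, here is a Python sketch that maps them onto the environment variables read by the Tailscale container image (TS_AUTHKEY, TS_HOSTNAME, TS_ROUTES, TS_USERSPACE, TS_EXTRA_ARGS). How the TrueNAS app actually wires its form fields to the container is an assumption here, and the function name, hostname, and auth key are placeholders.

```python
import ipaddress

def containerboot_env(auth_key, hostname, routes, exit_node, userspace, extra_args=()):
    """Sketch: translate the app form fields into container environment variables."""
    # Validate each advertised route; ip_network raises ValueError on a typo.
    checked = [str(ipaddress.ip_network(route)) for route in routes]
    args = list(extra_args)
    if exit_node:
        # Assumption: exit-node advertising is passed as a `tailscale up` flag.
        args.append("--advertise-exit-node")
    env = {
        "TS_AUTHKEY": auth_key,
        "TS_HOSTNAME": hostname,
        "TS_ROUTES": ",".join(checked),
        "TS_USERSPACE": "true" if userspace else "false",
    }
    if args:
        env["TS_EXTRA_ARGS"] = " ".join(args)
    return env

env = containerboot_env(
    auth_key="tskey-auth-EXAMPLE",  # placeholder, not a real key
    hostname="truenas-ts",          # hypothetical name, different from the server's
    routes=["192.168.1.0/24"],      # the example route from the list above
    exit_node=True,
    userspace=True,
    extra_args=["--reset"],         # the flag suggested earlier in the thread
)
for key, value in env.items():
    print(f"{key}={value}")
```

The host network checkbox is not represented here, since it is a property of the pod rather than something the container reads from its environment.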

As Tom said this is really basic, but does work. If you can get this to work, then you can build from there.

To connect a computer to your tailscale network, you need to install tailscale on the computer, and some things may need special handling, but tailscale goes over almost everything to get stuff working in their docs.

They also have docs on how to setup and run your own

Ahh yes, thanks for reminding me: I remember that video from several months ago. I used it to set up the truecharts (tc) instance, to which I easily connected my tablets and phones. It all worked perfectly with TrueNAS advertising as an exit node and the route of 192.168.1.0/24 exposing what I wanted.

I have still failed to get the truenas (tn) tailscale app working, but these are the steps I have attempted.

  • I stopped the working tailscale app.

  • I stopped and deleted the “deploying” tn app and reinstalled it.

  • I stopped and deleted the “deploying” tn app and reinstalled it again.

  • I found the container image in the tn UI docker.io/tailscale/tailscale:v1.68.1 and deleted it

  • I reinstalled the tn app and tried again.

  • I searched for “tailscale” in the containers and found two unlabelled containers (SHA256 listed only), and double-checked that these two were showing up when searching for “tailscale” … I then deleted these two

  • I reinstalled the tn app and tried again. It stuck on “deploying”.

  • I stopped and deleted the “deploying” tn app and reinstalled it again.

  • I stopped and deleted the “deploying” tn app and reinstalled it again, this time removing any advertised routes.

  • I stopped and deleted the “deploying” tn app and reinstalled it again, this time deciding not to advertise as an exit node.

  • I stopped and deleted the “deploying” tn app and reinstalled it again, this time without any advertised routes and also deciding not to advertise as an exit node.

  • I stopped and deleted the “deploying” tn app and reinstalled it again, this time without any advertised routes and also deciding not to advertise as an exit node and also unselecting “run as userspace”.

  • I confirmed there weren’t any “residual” containers left over when I searched them again for “tailscale”

  • I installed the tc tailscale with the settings borrowed from the video, like I did before Christmas, and it installed and it is running; my devices are connected to tailscale but, this time, the TN is not being seen as an exit node even though it is set to do so and the settings are the same as before (I took a screenshot so I could copy them, when I uninstalled it earlier today).

  • I generated a new TS auth key and used it in the tc app - all OK now - perhaps this is the problem I have been experiencing all along. Perhaps I needed a new auth key when I tried the tn app.

  • I reinstalled the tn app again, but this time I used this new key, having set the key to be reusable. The tn app fired up into “running” after a few seconds, I can see it in the tailscale web dashboard, and my phones and tablets are connected and using it as an exit node. Problem solved for me: use a reusable Tailscale key.

None of this is mission critical for me so it has only been some hobby frustration/success/education, but if I can usefully report some logs to a bug tracker somewhere, please let me know where and what to do.

EB, running Dragonfish-24.04.1.1