LetsEncrypt ssl cert distribution tips?

John · March 7, 2025, 9:45pm

While not looking at any code (I’ve seen enough ‘spaghetti code’ and APIs) the comment (above) about ‘websocket design’ caught my eye.

A ‘websocket design’ would mean a client-server model, i.e., not as “user friendly”: a user would have to build a client, which isn’t a bad thing if you provide a skeleton and/or are dealing with other developers who understand what a websocket is. What I’m saying is that a websocket has moving parts like “authorization”, so depending on how the TrueNAS websocket API is written, you (read: users of your script) would probably need a key or something like that, which is far more overhead than necessary for something like this.

I’d be tempted to deploy/design this as a pull operation where each target uploads their pub key to a source and makes requests of the source based on a scheduler. The source would “know about” targets (aka: have an ‘authorization’ list based upon these keys) and errors in the system would be isolated to the source being down vs the source script potentially crashing if one target is down (and not completing tasks for the other targets) used in a push method.

However, if you insist on the push method, I would have probably started with modifying the ssh-copy-id script (which has been thoroughly vetted) instead of trying to write a python solution (a sledge hammer in this case).

All that being said, there is already a tool for this called Anvil (which I have not looked at either).

dan · March 7, 2025, 10:08pm

Fortunately, iX have already done this; I gave the link up-topic.

I think you’ve just conceptually reinvented the ACME protocol.

I’ll happily entertain any alternative suggestions for TrueNAS. Because as far as I know, there are two valid ways to get a cert into TrueNAS:

Have TrueNAS get the cert itself, or
Using the API, which in turn offers
- Manually import the cert using the web UI, or
- Code something to call the API to import/deploy the cert on an automatic basis.

The first is only an option if you’re using a DNS host that your version of TrueNAS supports–if you’re using CORE and Cloudflare, sucks to be you; CORE doesn’t support Cloudflare and never will. If you’re using SCALE and anything other than Route53, Cloudflare, or OVH, you’re in the same boat, with the exception that iX might someday add support for your DNS host. The second is a non-starter for short-lived certs. So the third is where I’ve landed, I wrote the script with input and assistance from many others, and it’s worked adequately for some time.

Using the new API and iX’ client, it seems I can do the same thing (import a cert, tell the GUI to use it, tell FTP to use it, tell any apps that use certs to use it, delete any old certs) in a few dozen lines of code, but their client seems to have a bug with connecting to remote hosts.
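As a rough sketch of those steps (this is neither iX’s client nor the script discussed in this thread), assume a hypothetical `call(method, *params)` helper that performs one middleware call and returns its result. The method and field names (`certificate.create`, `certificate.query`, `system.general.update`, `ftp.update`, `certificate.delete`, `ui_certificate`, `ssltls_certificate`) are assumptions recalled from the TrueNAS API and should be checked against the API docs for your version. Updating apps is left out.

```python
from typing import Any, Callable

# Hypothetical signature: call(method, *params) performs one middleware call
# and returns its result (for example a thin wrapper over a websocket client).
ApiCall = Callable[..., Any]


def deploy_cert(call: ApiCall, name: str, cert_pem: str, key_pem: str,
                prefix: str = "letsencrypt") -> int:
    """Import a cert, point the GUI and FTP at it, then prune older copies."""
    # Method and field names below are assumptions recalled from the
    # TrueNAS middleware API; check them against your version's API docs.
    call("certificate.create", {
        "name": name,
        "certificate": cert_pem,
        "privatekey": key_pem,
        "create_type": "CERTIFICATE_CREATE_IMPORTED",
    })
    # Look the new cert up by name rather than relying on the create result.
    matches = call("certificate.query", [["name", "=", name]])
    if not matches:
        raise RuntimeError(f"certificate {name!r} not found after import")
    new_id = matches[0]["id"]

    call("system.general.update", {"ui_certificate": new_id})
    call("ftp.update", {"ssltls_certificate": new_id})

    # Delete earlier imports that share the prefix, keeping the new one.
    for cert in call("certificate.query"):
        if cert["name"].startswith(prefix) and cert["id"] != new_id:
            call("certificate.delete", cert["id"])
    return new_id
```

Passing `call` in as a parameter keeps the sequence of steps separate from the transport, so the same function could sit on top of a local or a remote connection.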

I’ll look forward to your modification of ssh-copy-id that gets a cert integrated into a TrueNAS system in such a way that the GUI and middleware are tracking it and that any cert-aware apps are using it.

John · March 7, 2025, 11:27pm

Sorry, seems as though I hit a nerve. I don’t have a need for this but I did 5 minutes worth of reading on ACME.sh. Got bored and distracted; played with my 7 year old in the snow. Got cold. Came inside. Remembered this. Went to github. Found this.

github.com/acmesh-official/acme.sh

deploy/truenas_ws.sh

master

#!/usr/bin/env sh

# TrueNAS deploy script for SCALE/CORE using websocket
# It is recommend to use a wildcard certificate
#
# Websocket Documentation: https://www.truenas.com/docs/api/scale_websocket_api.html
#
# Tested with TrueNAS Scale - Electric Eel 24.10
# Changes certificate in the following services:
#  - Web UI
#  - FTP
#  - iX Apps
#
# The following environment variables must be set:
# ------------------------------------------------
#
# # API KEY
# # Use the folowing URL to create a new API token: <TRUENAS_HOSTNAME OR IP>/ui/apikeys
# export DEPLOY_TRUENAS_APIKEY="<API_KEY_GENERATED_IN_THE_WEB_UI"
#


dan · March 8, 2025, 12:21am

Maybe a bit. It seems to me like you made a comment without having any idea what you were talking about in the context of this thread, and are trying to deflect from that fact. Your complete lack of substantive response to the points I made reinforces that impression.

John · March 8, 2025, 1:20am

No, I’m not trying to deflect at all (I’m truly trying to have a discussion about ideas/concepts so I can help with code–which is what I’m okay at). I don’t know anything about ACME at all (I thought I said that much at least). …however, on to your points. Please bear with me because I have 5 minutes of reading on ACME. As far as I know you can generate a cert without having to deploy it with ACME, correct?

Assumptions so far:

We are using ACME on a non-truenas device (like a RasPi, VM, Jail, whatever). Call it “cert.local”.
We don’t actually have to use a websocket to communicate with this device; meaning, we can communicate via SSH (for example) from “truenas.local” to cert.local?
The only time we’d need websocket is if we try to PUSH from cert.local TO truenas.local.

essinghigh · March 8, 2025, 1:25am

Before putting in my two cents, maybe it’s worth getting an idea of how your home lab systems are set up.

Possibly a central reverse proxy deployment (NPM, Caddy, Traefik, etc.) would work for you instead of having to handle certificate management across a number of systems?

In regards to push/pull, if you’re exposing something to the internet it’s probably a better idea to push the cert to it in order to avoid storing any sort of credentials on an internet-facing system. If it’s internal only, then I personally don’t think it makes a massive difference.

John · March 8, 2025, 1:31am

Agree on the “internet facing/push” concept, but I thought this was to eliminate the “challenge” aspect for the PUSH method on a local setup. Is that wrong?

essinghigh · March 8, 2025, 1:40am

Based on their initial post I’m assuming they want to avoid running an ACME client on each machine. I’m not very familiar with ACME but my understanding is that if they were to run it standalone on each machine and grab a cert from, say, LetsEncrypt, then it would generate a new cert for each instance of the ACME client running - probably not great for ratelimiting:

Each time you request a certificate from Let’s Encrypt, a new order is created
Up to 5 certificates can be issued per exact same set of hostnames every 7 days.

So I gathered they were instead trying to request a cert once, and then distribute it to their other environments.
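To put numbers on the quoted limit, here is a minimal sketch that treats it as a plain sliding window of 5 certificates per 7 days for one exact set of hostnames. The sliding-window model and the function name are assumptions for illustration; Let’s Encrypt’s real accounting may differ.

```python
from datetime import datetime, timedelta

LIMIT = 5                     # certificates per identical hostname set
WINDOW = timedelta(days=7)    # window length from the quoted limit


def can_issue(previous: list, now: datetime) -> bool:
    """True if another cert for the same hostname set fits under the limit."""
    # Only issuances inside the last 7 days count against the limit.
    recent = [t for t in previous if now - t < WINDOW]
    return len(recent) < LIMIT
```

With six machines each running their own client for the same wildcard name, the sixth request inside a week would be refused under this model, whereas a single client that issues once and distributes the result uses one slot.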

Could be completely wrong but that’s just how I interpreted it from my very limited understanding of ACME/certbot

John · March 8, 2025, 1:53am

From what I know you don’t have to have ACME on each node. You want it on one node and distribute from there to all the other nodes.

And I am assuming the ‘cert.local’ node can run ACME.SH using the standalone method.
acme.sh --issue -d example.com --standalone -d www.example.com

This leads me to believe/assume my assumptions above are valid.

essinghigh · March 8, 2025, 2:06am

Yes, I think we’re getting at the same thing.
Original post mentions running acme all over the lab, which reads to me like multiple instances of an ACME client.
Instead, central management via a single client, then pushing (or pulling) via whatever method.

Of course if they only plan on (or already are) running web-based services, perhaps cert management via a reverse proxy would simplify things a lot as there wouldn’t be any distribution required.

John · March 8, 2025, 3:11am

So, I started toying with concepts and:

I found the part of ssh-copy-id I’d need to change to implement a down-and-dirty push method to truenas.local (this would bypass the websocket). All I’d need is the middleware call.
I thought I’d build the pull method using similar tactics as I use on my headless git server. but…
I took another quick look at anvil and I learned that it is using a pull methodology so, I’d probably use it instead of rolling my own.

Here is the first blog in his series on the subject.
https://dan.langille.org/2017/07/04/acme-sh-getting-free-ssl-certificates-installation-configuration-on-freebsd/

mntbighker · March 10, 2025, 5:36pm

My final solution was to run an FQDN cert on TrueNAS, and a wildcard on the VM running the web services with certbot.

dan · March 10, 2025, 5:56pm

Fair enough.

Sure. Though it may not get the cert installed.

This goes back to my earlier post: to get a cert into TrueNAS properly, you have to use the API. You can do that through the web GUI (which is a web-based frontend to the middleware, which in turn uses the API–or maybe it’s a frontend to the API, which in turn uses the middleware; either way, the API is involved), or you can do that directly using API calls. In SCALE 24.10 and earlier, you can use either the REST API or the Websocket API; I understand the REST API will be deprecated in 25.04. Whether that means it won’t be available at all, or just that you’ll get some kind of warning, I don’t know.

As to the mechanics of that, lots of options:

You can run a script^[1] on cert.local that makes the appropriate API calls to deploy the cert to your NAS–this is probably the simplest option.
You can push the cert out to your NAS via something like SCP or rsync, and then run a script on the NAS itself to import/deploy the cert
You can pull the cert from cert.local to your NAS using SCP/rsync, and then run a script on your NAS to import/deploy the cert.

But the common thread in all of these is that you’d need to use the API, and at this point, if you’re using SCALE, it should probably be the websocket API. I don’t believe CORE has that one available; you’d use the REST API there.

Note that I’m talking here specifically about deploying the cert to TrueNAS; any other system is going to have its own requirements.

Correct. A more sensible approach–though not without its own challenges–is to do what OP’s asking about, which is to get a wildcard cert on one system, and then distribute it to the others.

However, if you’re getting individual certs for each of the LAN resources, you’re probably getting a cert for each specific FQDN, rather than a whole mess of identical certs. In that case, the rate limits are less of an issue.

^[1] which in turn can be one of my scripts, one of the deploy scripts from acme.sh, or something else, including something you develop yourself if you prefer

dan · March 10, 2025, 6:02pm

I guess I was, though the Websocket version of the script hasn’t had much testing yet:

John · March 10, 2025, 8:31pm

Sorry, I’ll have to give you some bullet points (running away at the moment).

You do not have to use the API from cert.local, you can use the API via a script on the truenas.local to do the install part (easier than dealing with the web socket/authorization aspects which can cause you problems if that API changes later).

I wouldn’t build your script in python unless it’s absolutely necessary (acme.sh is built in shell script). …I don’t know python but I’m sure I can scrape together enough ability in a weekend or so to help you out if you need/want.

To build a cert updater, sans API, I would:

BRIEF:

ABC on cert.local only calls ACME.sh for a given client.
XYZ on cert.local only echoes the cert for a given client argument.

NOTE:

cert.local is the VM, jail, raspi, etc. which has acme.sh.
truenas.local is our local thing which needs the cert (and we need to use the API to get the cert “installed”).

STEPS:
on each client (truenas.local for example) a script calls ABC@cert.local via ssh based on a scheduler (e.g., cron) to get a cert built for it.

on each client (truenas.local) at a later time, a script calls XYZ@cert.local via ssh. if a cert is returned, install it, restart services, etc.

This is essentially how I communicate in my local network with headless services like my git server.

I don’t know how long it takes to generate a cert when you call/use ACME.sh but I assume it takes a while (hence why I would split the operations up into two pieces; ‘order’ and ‘receive’).
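A minimal sketch of the client side of those two steps, in Python only to illustrate the flow. `ABC` and `XYZ` are the placeholder script names from the steps above; the `user@cert.local` account, the key path, and the check for a PEM header are assumptions.

```python
import os
import subprocess
from typing import List, Optional

CERT_HOST = "user@cert.local"   # hypothetical account on the acme.sh host


def ssh_argv(remote_command: str, host: str = CERT_HOST,
             key: str = "~/.ssh/id_rsa") -> List[str]:
    """Build the ssh command line for one remote script call."""
    # BatchMode makes ssh fail instead of prompting when run from cron.
    return ["ssh", "-i", os.path.expanduser(key),
            "-o", "BatchMode=yes", host, remote_command]


def order_cert(client: str) -> None:
    """Step 1: ask cert.local to issue or renew the cert for this client."""
    subprocess.run(ssh_argv(f"ABC {client}"), check=True)


def receive_cert(client: str) -> Optional[str]:
    """Step 2: fetch the cert text; None means nothing usable came back."""
    result = subprocess.run(ssh_argv(f"XYZ {client}"),
                            capture_output=True, text=True, check=False)
    pem = result.stdout.strip()
    if result.returncode == 0 and "BEGIN CERTIFICATE" in pem:
        return pem
    return None
```

Each function would be run from its own scheduler entry on the client, with the second one handing the returned text to whatever does the install. The private key would have to reach the client by a similar route, which this sketch does not cover.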

Again, sorry for the bullet point response but I have written down (most of my setup) for my headless git server which you can see how I use a simple ssh call to a jail which returns my repository list and repository logs here (similar mindset):

Which is basically small scripts that look like this (I gave you a link to my not-so-great writeup/example so you can see how I treated script arguments; a trivial task, I know, but worth the mention).

ssh                     \
    -l user             \
    -i ~/.ssh/id_rsa    \
    user@remote.local   \
    -t "scriptcall.sh"

Here is a secondary form of that, for when you want to issue a series of commands (just in case you’re calling me an idiot because calls to acme.sh are almost instant).

ssh                     \
    -l user             \
    -i ~/.ssh/id_rsa    \
    user@remote.local   \
<< EOF
echo "Hello World!"
# run_another_command
# A newline is important before final EOF.

EOF

Last note:
If you really wanted to push these certs from cert.local via ssh you can (I only spent 10 minutes or so looking but I’m almost sure) use ssh-copy-id as a framework.

Sorry, gotta run.

dan · March 10, 2025, 8:59pm

Correct.

Not really–it’s a websocket API either way. You’re making the same calls to it either way. If those methods change or go away, you’re going to be in trouble either way. And regardless of whether you’re running my script locally or on a remote system, you need an API token to authenticate, because the API rate-limits unauthenticated requests.^[1]

It’s pretty much built at this point (link right above your post), so I don’t see why not. And it does things I don’t think I’d be able to do using a shell script, like testing the validity of the cert/key pair and stepping through all the installed apps to update the certs for only those apps that are already configured with one. Not to mention a proper logging facility with selectable log levels (though at this point it only logs to stderr).
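One way to test a cert/key pair from Python with only the standard library is to let OpenSSL try to load the two files together. This is a sketch of that idea, not necessarily how the script does it; it assumes PEM files and an unencrypted key, and it does not check expiry or the hostnames on the cert.

```python
import ssl


def pair_is_valid(cert_path: str, key_path: str) -> bool:
    """Return True if OpenSSL accepts the cert and key as a matching pair."""
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
    try:
        # load_cert_chain raises ssl.SSLError when the files cannot be
        # parsed or the private key does not match the certificate.
        ctx.load_cert_chain(certfile=cert_path, keyfile=key_path)
    except (ssl.SSLError, OSError):
        # OSError also covers missing or unreadable files.
        return False
    return True
```

Running a check like this before any API call means a bad pair is rejected before the NAS is touched.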

I’m not married to Python as such–I’m using it because the original code snippet I found was in Python, so I built the rest of my script around it. The websocket version is pretty much a complete rewrite (the only code in common is the part that sets the configuration), but I was already using Python, so didn’t see a reason to change it.

There’s definitely room for improvement; the most obvious area to me is in terms of error handling. But it seems to be in a functional state, and work on SCALE 23.10-25.04. The code could certainly be tightened up a bit, but I think it’s reasonably logical and non-duplicative.

…using a Python client for the API–the midclt command that script uses is just a CLI wrapper around the Python client library I’m using. And I’m obviously biased, but I think if you compare my code against what acme.sh is using, you’d find mine considerably easier to follow. If nothing else, Python avoids the need for incantations like $(printf "%s" "$_ws_response" | jq -r '."version"' | cut -d '-' -f 2 | tr '[:lower:]' '[:upper:]').
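For comparison, a sketch of the same extraction in Python: take the `version` field, keep the second `-`-separated field, and upper-case it. The shape of the response (a JSON object with a `version` string such as `TrueNAS-SCALE-24.10.0`) and the function name are assumptions for illustration.

```python
import json


def product_from_response(ws_response: str) -> str:
    """Mirror the jq | cut | tr pipeline: second '-' field, upper-cased."""
    version = json.loads(ws_response)["version"]
    fields = version.split("-")
    # cut prints the whole line when no delimiter is present; do the same.
    field = fields[1] if len(fields) > 1 else version
    return field.upper()
```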

But with that said, it’s good to have options. I knew acme.sh had a deployment script for the old API, it was based on my original deployment script. But I hadn’t realized they had one for the websocket API as well.

…which presumably would be for some other system than TrueNAS. I’ve already shown the way I handle that up-thread, but I don’t have any particularly strong feelings about how it “should” be done on those.

^[1] I obviously can’t speak to how any other script might handle it, but the one from acme.sh also requires one.