Support local ACME providers without DNS challenges

Problem/Justification
I would like to use a local step-ca certificate authority that supports the ACME protocol, so that local TLS challenges can be used to automate certificate renewal.

Impact
A local certificate authority can currently only be used by issuing certificates manually, which leads to longer-lived certificates and negates the security benefits of an ACME provider.

User Story
Users who run an internal CA that supports the ACME protocol (such as Step CA) cannot fully use the ACME functionality in TrueNAS, because it does not support sane challenge types for an on-premise scenario.

The main issues I see are:

  1. Not allowing longer than 30 days for certificate issuance.

  2. Requiring DNS challenges when the local provider is able to use TLS or HTTP challenges. DNS challenges cannot be used for IP address SAN entries, while other challenge types support them. This is a valid on-premise use case.

I would suggest allowing more open usage of the ACME client with non-DNS challenges, and removing the hardcoded restrictions unless the built-in Let’s Encrypt provider URIs are being used.
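
For context, here is a minimal sketch of the kind of request I mean, using a generic ACME client (acme.sh) against a local step-ca instance. The directory URL, hostnames, port, and IP below are placeholders, and whether the client and CA accept an IP identifier (RFC 8738) is an assumption:

#!/bin/sh
# Issue a certificate from an internal step-ca over ACME using an
# HTTP-01 challenge instead of DNS-01. The step-ca root certificate
# is assumed to already be trusted on this system.
acme.sh --issue \
        --server https://ca.internal:9000/acme/acme/directory \
        --standalone \
        -d nas.internal.lan \
        -d 192.168.1.50
# --standalone answers the HTTP-01 challenge on port 80 itself.
# The second -d is an IP identifier (RFC 8738); support for it varies
# by client and CA.

Nothing beyond the directory URL and a reachable port is needed on the client side; the TrueNAS UI currently only exposes the DNS-01 path.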

Nope.

While I agree with the principle, it’s simply not the case that certs have to be installed manually.

While I appreciate the workaround, that doesn’t really solve the problem suggested here. That script doesn’t tie into all the reloads required to deploy the new cert to the web-ui or other services on its own, nor does it support SANs.

This will also get wiped out with each upgrade, and realistically should just be something supported in the UI, thus a feature request.

At the end of the day, Step CA, the local CA I am referencing here, supports DNS challenges. The core issue is that I cannot use a DNS challenge for an IP address SAN, and I should not need multiple scripts and cron jobs to work around that basic requirement.

No, that’s done by the deploy-freenas script linked in its docs.

…and this is done by the ACME client with which you get the cert (which would likely be acme.sh)

I have no idea why you think this, and it certainly isn’t correct.

I don’t at all disagree with the feature request, but what you want can certainly be automated.

I don’t see how that really solves the issue. If I use that script and add IP SANs to the acme.sh command, but nginx isn’t bound to port 80 on those interfaces, the challenges fail. That wouldn’t be an issue in standalone mode with TLS-ALPN, but that doesn’t appear to be what the script is doing. This would only work on interfaces where the web UI/nginx already has a binding, correct?
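
To illustrate the distinction (a sketch only, not what that script actually does; the flags, webroot path, and hostname are assumptions):

# HTTP-01 via a webroot served by the existing nginx: this only works
# on addresses where nginx already listens on port 80.
acme.sh --issue -w /usr/local/www/acme -d nas.internal.lan

# TLS-ALPN-01 in standalone mode: acme.sh answers the challenge on its
# own listener, so no pre-existing nginx binding is needed (the port
# must be free while the challenge runs).
acme.sh --issue --alpn -d nas.internal.lan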

I have had several occasions where, during an upgrade, the OS filesystem was cleared and scripts tied to cron jobs were removed. Regardless, this would be far more durable if it were managed by the backed-up configuration that TrueNAS maintains.

The script binds nginx to port 80, period.

I’ve never seen the boot pool cleared like that, but the easy answer here is to put the script somewhere other than on your boot pool.

@Destari
Forgive me if I misunderstood the situation, but from what I gather you want to manage certs from one location and distribute them to nodes. I can think of two methods for this. However, I do not have this need, so to be honest I am offering a blind suggestion or two, i.e. “grain of salt”.

One is using anvil (I haven’t looked at the code, but it is supposedly built for the reason(s) you mention): cert-puller: using anvil to pull down & install new certificates, then restart services – Dan Langille's Other Diary

Two, roll your own using entr, like the following demonstrates (though I would change the sync script if it were me): Vincent's blog

Let me know if you want to roll your own method (sounds a little fun).

I “rolled my own” using pfSense to generate all the certificates from Let’s Encrypt. I added an “after renewal” script that gets passed a host name, and the script grabs it from the local filesystem and uploads it via SFTP to the correct host. The SFTP is to a non-root user with no login shell, and the file is dropped into their home directory.

Each host has a cron job (running as root) that looks for a new cert that was uploaded. If so, it runs the right script to put the file in place and restart services.
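
A rough sketch of what that cron job could look like (the paths, upload user, and service name here are all assumptions):

#!/bin/sh
# Runs from root's crontab. If the SFTP-only user has uploaded a newer
# bundle than the one currently installed, install it and reload the
# services that consume it.
UPLOAD=/home/certdrop/nas.internal.lan.pem
INSTALLED=/usr/local/etc/ssl/nas.internal.lan.pem

if [ -f "$UPLOAD" ] && [ "$UPLOAD" -nt "$INSTALLED" ]; then
        install -m 600 "$UPLOAD" "$INSTALLED"
        # Restart/reload whatever uses the cert on this host.
        service nginx reload
fi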

The advantage to this method is that there are only two requirements for the systems with certs:

  1. Running an SSH server
  2. Running some sort of job scheduler (cron, Windows Task Scheduler, etc.)

Although it is better security for the host to “reach out” and grab the cert, everything I use that requires certs has an SSH server running, and all of them can talk to the firewall (even if they might not be able to send packets through it).

Using public-key authentication for the SSH allows me to set it up quickly on a new host (create a user, create the authorized_keys file, and put the public key into it).

That’s not the feature I am looking for here. I am running an internal certificate authority, like a miniature version of Let’s Encrypt inside my data center, which supports the ACME protocol. I am looking for TrueNAS to make certificate requests to my internal CA server without requiring changes to DNS records. ACME supports other types of “challenges” that verify the domain/IP address is owned by the system making the cert request, but TrueNAS only supports DNS record challenges via the UI.

I don’t want to distract from the topic/concept of this post too much; besides, this subject is above my head. I don’t know certs, the ACME protocol, Let’s Encrypt, etc.

I’m a bit invested in a side-project at the moment (so my brain is a bit “off-topic”), but if I were forced to design a system for this task, I would do something similar to your solution, @nabsltd, but heavier on the “pull” (using a scheduler to pull the data down from a known provider) instead of “pushing” to known users.

As a quick demonstration (typing this here; not actually tested).

This function/script represents a cert generation thing that resides on the provider (this script can call whatever internet service it needs to get or generate the cert; the only requirement is that it echoes the cert).
HOST:

#!/bin/sh
# cert_generate --
#   This function gets the certificate from wherever it needs to and
#   echoes the output to STDOUT. The openssl call below is just a
#   stand-in for the real issuance step.
cert_generate() {
        openssl rand -hex 64
}
cert_generate

This function represents a script on the CLIENT:

#!/bin/sh
# cert_get --
#   Call the HOST to get the cert and write it to the appropriate
#   file. BTW, you don't actually have to use port 22. :)
#   Note: cert_generate must be an executable script on the HOST's
#   PATH, and ssh options have to come before the destination.
cert_get() {
        ssh \
        -p 22 \
        -l <USER> \
        -i ~/.ssh/id_ed25519 \
        192.168.1.10 \
        "cert_generate" > /path/to/file
        # service restart
        # blah, blah, blah
}
cert_get

Then you can just set up a scheduler on each CLIENT to run cert_get, which calls cert_generate on the HOST.
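
The scheduler part is then just a crontab entry on each CLIENT, e.g. (the path and schedule are placeholders):

# root's crontab on the CLIENT: fetch a fresh cert every Monday at 03:00.
0 3 * * 1 /usr/local/bin/cert_get.sh >> /var/log/cert_get.log 2>&1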

Obvious problems (someone smart should work out):

  1. The CLIENT(s)’ public key needs to be added to the HOST (though a combination of a non-root user on the HOST and a plain-text password in the CLIENT function could work in a pinch).
  2. I am assuming a cert can be generated on the fly but if it cannot, then the HOST needs to manage the certs (but this can be automated based on the “known_hosts” file).
  3. If #2 above applies, then I would do this in a “single-process ‘jail’” or “thin jail” (using FreeBSD lingo; use your babelfish to translate).
  4. I don’t know why the anvil tool is written to be a “pull process” but there must be a reason.
  5. Code is represented as monolithic functions (too many uses of “and” in my function headers); replace with actual “good code”.
  6. I am assuming more time is spent on the CLIENT(s) and thus my design would be easier to employ vs the “push” method (but related to #4 above).

Sorry, @Destari: 1. that I could not help, and 2. if I distracted the thread too much. I will not pollute your thread any further (good luck, though).