Problem/Justification
The Docker engine can export Prometheus metrics about itself. This is configured in the engine's daemon.json file, which is fully (and properly) under TrueNAS control, and the option is currently off. The only ways to turn it on are hacks that violate the TrueNAS appliance boundary.
Note that this isn’t about “Prometheus in TrueNAS”. It is about a Prometheus instance running anywhere being able to scrape metrics from TrueNAS’s core Docker engine, the way Docker intends them to be pulled.
Impact
Engine metrics give direct visibility into container states. That’s particularly important when containers go wrong, such as getting stuck in the “created” state, where they’re largely invisible to cAdvisor and similar tools. Other metrics paths (cAdvisor etc.) focus on container resource usage, not engine state.
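To make this concrete, here is a rough sketch of reading per-state container counts out of the engine's metrics output. It assumes the engine exposes a gauge named engine_daemon_container_states_containers with a state label, as in Docker's published sample output; the sample text and its numbers are made up for illustration, and which state labels actually appear may depend on the Docker version.

```python
import re

# Assumed metric name, taken from Docker's sample metrics output.
STATE_METRIC = "engine_daemon_container_states_containers"

# Matches lines such as: engine_daemon_container_states_containers{state="running"} 12
_LINE = re.compile(
    r"^" + re.escape(STATE_METRIC) + r'\{state="([^"]+)"\}\s+([0-9.eE+-]+)\s*$'
)


def container_state_counts(metrics_text: str) -> dict[str, int]:
    """Return {state: count} parsed from Prometheus text-format output."""
    counts = {}
    for line in metrics_text.splitlines():
        match = _LINE.match(line.strip())
        if match:
            counts[match.group(1)] = int(float(match.group(2)))
    return counts


# Hypothetical scrape result; the values are invented for this example.
SAMPLE = """\
# HELP engine_daemon_container_states_containers The count of containers in various states
# TYPE engine_daemon_container_states_containers gauge
engine_daemon_container_states_containers{state="paused"} 0
engine_daemon_container_states_containers{state="running"} 12
engine_daemon_container_states_containers{state="stopped"} 2
"""

if __name__ == "__main__":
    print(container_state_counts(SAMPLE))
```

In a real setup Prometheus would scrape the endpoint itself and an alert rule would watch the gauge; the parser only shows what kind of data becomes available.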
Suggested Approaches
Simple: Just add "metrics-addr": "0.0.0.0:9323" to Docker’s daemon.json. Done. Users who don’t care won’t notice. The performance impact is negligible (Prometheus is a pull model, so if nobody pulls…), maybe a few kilobytes of RAM. (9323 is the port conventionally used for Docker’s Prometheus endpoint.)
Better: Add a “Docker metrics address” setting somewhere in Advanced Settings, defaulting to blank or “none”, and write its value into daemon.json as metrics-addr. Advantage: no impact at all by default, and more configurable. Drawback: requires a UI change.
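As a rough illustration of the "Better" option, the sketch below shows how a blank-by-default setting could be merged into the daemon configuration before it is written out. The function name with_metrics_addr and the example data-root path are hypothetical; this is not how TrueNAS middleware actually generates the file, only the shape of the logic.

```python
import json


def with_metrics_addr(daemon_config: dict, metrics_addr: str) -> dict:
    """Return a copy of the config with metrics-addr set, or removed if blank/"none"."""
    result = dict(daemon_config)
    addr = metrics_addr.strip()
    if addr and addr.lower() != "none":
        result["metrics-addr"] = addr
    else:
        # Blank or "none" means the feature stays off: no key at all.
        result.pop("metrics-addr", None)
    return result


if __name__ == "__main__":
    # Hypothetical existing daemon configuration.
    existing = {"data-root": "/mnt/pool/docker"}
    print(json.dumps(with_metrics_addr(existing, "0.0.0.0:9323"), indent=2))
```

With the setting left blank the generated file is unchanged, which is what keeps the default impact at zero.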
Existing Workarounds
A. Hack daemon.json with a post-boot script. This violates the appliance boundary and requires a stutter-restart of the Docker engine right when everything is still flapping around during boot.
B. Rubber-band together some shell scripts around the Docker CLI and feed their output into our Prometheus pipeline.
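For a sense of what workaround B involves, here is a minimal sketch that turns a list of container states into Prometheus text-format lines. It assumes the input is one state per line, such as the output of docker ps -a --format '{{.State}}', and the metric name docker_cli_container_states is invented for this example. Something still has to run this on a schedule and serve or push the result, which is exactly the kind of glue the native endpoint would make unnecessary.

```python
from collections import Counter


def states_to_exposition(ps_output: str, metric: str = "docker_cli_container_states") -> str:
    """Convert one-state-per-line text into Prometheus text-format gauge lines."""
    counts = Counter(line.strip() for line in ps_output.splitlines() if line.strip())
    lines = [f"# TYPE {metric} gauge"]
    for state in sorted(counts):
        lines.append(f'{metric}{{state="{state}"}} {counts[state]}')
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical CLI output: two running containers and one stuck in "created".
    print(states_to_exposition("running\ncreated\nrunning\n"), end="")
```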
User Story
Shops using Prometheus for metrics collection want all the Prometheus feeds available. Docker is at the core of your app story, so we’re highly sensitive to Docker irregularities. The Docker-provided metrics are highly useful for building dashboards and alerts around this. A practical example is containers stuck in odd states (like “created”) because something unusual went wrong while creating them. These do show up in Portainer logs etc. if you know what to look for, but there’s no clean way to build reliable alerts from there.
Ultimately this is a blind spot in TrueNAS observability that is entirely unnecessary. Exposing the Docker engine’s own metrics is easy and cheap. Just do it.