Immich not using GPU

I’m running the default app container. I can ‘see’ my GPU from inside the ML container with nvidia-smi, but nvidia-smi reports no running processes. The ML container logs only show ‘Setting execution providers to CPUExecutionProvider’ for each ML task.

What can I do to make Immich use the GPU?

I have tried restarting the container with GPU selected and unselected, and tried moving the ML directory. No effect. I’m not seeing any errors.

+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 570.172.08             Driver Version: 570.172.08     CUDA Version: 12.8     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  NVIDIA GeForce RTX 3060        Off |   00000000:0B:00.0 Off |                  N/A |
|  0%   49C    P0             40W /  170W |       0MiB /  12288MiB |      2%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+

+-----------------------------------------------------------------------------------------+
| Processes:                                                                              |
|  GPU   GI   CI              PID   Type   Process name                        GPU Memory |
|        ID   ID                                                               Usage      |
|=========================================================================================|
|  No running processes found                                                             |
+-----------------------------------------------------------------------------------------+

2026-01-19 11:09:20.865440+00:00 [01/19/26 22:09:20] INFO Starting gunicorn 23.0.0
2026-01-19 11:09:20.866138+00:00 [01/19/26 22:09:20] INFO Listening at: http://[::]:32002 (64)
2026-01-19 11:09:20.866625+00:00 [01/19/26 22:09:20] INFO Using worker: immich_ml.config.CustomUvicornWorker
2026-01-19 11:09:20.872761+00:00 [01/19/26 22:09:20] INFO Booting worker with pid: 76
2026-01-19 11:09:22.400895+00:00 [01/19/26 22:09:22] INFO Started server process [76]
2026-01-19 11:09:22.401315+00:00 [01/19/26 22:09:22] INFO Waiting for application startup.
2026-01-19 11:09:22.401910+00:00 [01/19/26 22:09:22] INFO Created in-memory cache with unloading after 300s of inactivity.
2026-01-19 11:09:22.402439+00:00 [01/19/26 22:09:22] INFO Initialized request thread pool with 12 threads.
2026-01-19 11:09:22.402916+00:00 [01/19/26 22:09:22] INFO Application startup complete.
2026-01-19 11:10:39.134139+00:00 [01/19/26 22:10:39] INFO Downloading visual model 'ViT-B-16-SigLIP2__webli' to /mlcache/clip/ViT-B-16-SigLIP2__webli/visual/model.onnx. This may take a while.
2026-01-19 11:10:55.361754+00:00 Fetching 9 files: 100%|██████████| 9/9 [00:15<00:00, 1.77s/it]
2026-01-19 11:10:55.363191+00:00 [01/19/26 22:10:55] INFO Loading visual model 'ViT-B-16-SigLIP2__webli' to memory
2026-01-19 11:10:55.364079+00:00 [01/19/26 22:10:55] INFO Setting execution providers to ['CPUExecutionProvider'], in descending order of preference
2026-01-19 11:10:55.803194+00:00 [01/19/26 22:10:55] INFO Downloading detection model 'buffalo_l' to /mlcache/facial-recognition/buffalo_l/detection/model.onnx. This may take a while.
2026-01-19 11:10:59.776998+00:00 Fetching 4 files: 100%|██████████| 4/4 [00:03<00:00, 1.08it/s]
2026-01-19 11:10:59.778034+00:00 [01/19/26 22:10:59] INFO Loading detection model 'buffalo_l' to memory
2026-01-19 11:10:59.778704+00:00 [01/19/26 22:10:59] INFO Setting execution providers to ['CPUExecutionProvider'], in descending order of preference
2026-01-19 11:11:00.062309+00:00 [01/19/26 22:11:00] INFO Loading recognition model 'buffalo_l' to memory
2026-01-19 11:11:00.063004+00:00 [01/19/26 22:11:00] INFO Setting execution providers to ['CPUExecutionProvider'], in descending order of preference
2026-01-19 11:11:01.295672+00:00 [01/19/26 22:11:01] INFO Setting execution providers to ['CPUExecutionProvider'], in descending order of preference
2026-01-19 11:16:22.419421+00:00 [01/19/26 22:16:22] INFO Shutting down due to inactivity.
2026-01-19 11:16:22.466292+00:00 [01/19/26 22:16:22] INFO Shutting down
2026-01-19 11:16:22.567601+00:00 [01/19/26 22:16:22] INFO Waiting for application shutdown.
2026-01-19 11:16:22.621585+00:00 [01/19/26 22:16:22] INFO Application shutdown complete.
2026-01-19 11:16:22.622076+00:00 [01/19/26 22:16:22] INFO Finished server process [76]
2026-01-19 11:16:22.667847+00:00 [01/19/26 22:16:22] ERROR Worker (pid:76) was sent SIGINT!
2026-01-19 11:16:22.671872+00:00 [01/19/26 22:16:22] INFO Booting worker with pid: 247
2026-01-19 11:16:24.026229+00:00 [01/19/26 22:16:24] INFO Started server process [247]
2026-01-19 11:16:24.026662+00:00 [01/19/26 22:16:24] INFO Waiting for application startup.
2026-01-19 11:16:24.027319+00:00 [01/19/26 22:16:24] INFO Created in-memory cache with unloading after 300s of inactivity.
2026-01-19 11:16:24.027833+00:00 [01/19/26 22:16:24] INFO Initialized request thread pool with 12 threads.
2026-01-19 11:16:24.028339+00:00 [01/19/26 22:16:24] INFO Application startup complete.

Usually you have to run nvidia-smi on the host, not inside the container, to see whether a process is using the GPU. At least that’s the case for my Jellyfin app.
Can you check whether the host also shows no running processes?
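
If you'd rather script the host-side check than eyeball the table, something like this rough Python sketch might do. It assumes nvidia-smi is on the host's PATH and uses its `--query-compute-apps` CSV output; the function names `parse_compute_apps` and `list_gpu_processes` are just made up for this example.

```python
import csv
import io
import subprocess


def parse_compute_apps(csv_text):
    """Parse the CSV text printed by
    `nvidia-smi --query-compute-apps=pid,process_name,used_memory --format=csv,noheader`.
    Returns one dict per GPU compute process; all values are kept as strings."""
    apps = []
    for row in csv.reader(io.StringIO(csv_text)):
        if len(row) != 3:
            continue  # skip blank or unexpected lines
        pid, name, mem = (field.strip() for field in row)
        apps.append({"pid": pid, "name": name, "used_memory": mem})
    return apps


def list_gpu_processes():
    """Run nvidia-smi (assumed to be installed on the host) and return
    the compute processes it reports. An empty list means nothing is
    currently running compute work on the GPU."""
    result = subprocess.run(
        [
            "nvidia-smi",
            "--query-compute-apps=pid,process_name,used_memory",
            "--format=csv,noheader",
        ],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_compute_apps(result.stdout)
```

Calling `list_gpu_processes()` on the host while an ML job is running should return a non-empty list if the container is really using the GPU. The parsing is kept in its own function so it can be tried on saved output without a GPU present.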

I’d neglected to select CUDA under the ‘Machine Learning Image Type’ menu. Getting errors now, but I can work through them. The GPU is doing work though, so at least one model is working.
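
For anyone else landing here, a quick way to tell which provider the ML container picked is to scan its log for the ‘Setting execution providers’ entries. Below is a minimal Python sketch of that idea. The helper names are hypothetical, and it assumes that with the CUDA image the log lists ONNX Runtime's `CUDAExecutionProvider` in the same place where mine only showed `CPUExecutionProvider`.

```python
import re

# Matches ANSI colour sequences such as ESC[32m, which the ML log is full of.
ANSI_RE = re.compile(r"\x1b\[[0-9;]*m")
# Matches names like CPUExecutionProvider or CUDAExecutionProvider.
PROVIDER_RE = re.compile(r"\b(\w+ExecutionProvider)\b")


def providers_in_log(log_text):
    """Return the distinct execution provider names mentioned in the log,
    in order of first appearance. The whole text is scanned because the
    log wraps the provider list onto its own line."""
    clean = ANSI_RE.sub("", log_text)
    seen = []
    for name in PROVIDER_RE.findall(clean):
        if name not in seen:
            seen.append(name)
    return seen


def uses_cuda(log_text):
    """True if the log mentions CUDAExecutionProvider anywhere
    (assumption: that is how a CUDA-enabled container reports it)."""
    return "CUDAExecutionProvider" in providers_in_log(log_text)
```

Paste the container log into a string (or read it from a file) and call `uses_cuda(log_text)`. If it comes back `False` and only `CPUExecutionProvider` is listed, the image type is probably still set to the CPU default.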

Thank you for pointing out I can run nvidia-smi on the host, I had not realized that.
