Hi,
When trying to run Apps from the store [like Immich, OWU, Llama, etc.] I am only getting “Passthrough available (non-Nvidia) GPU’s”.
The GPU [Quadro P2000; I tried a 3080 before, which did the same thing] is recognized in SCALE and by lspci, but seemingly has no drivers and cannot be used by Apps.
There also used to be an “Install Nvidia Drivers” option on EE. Where is it now? Is it just gone?
I just got mine working on SCALE 25.04, plex app, GTX 950:
Enable IOMMU in BIOS/UEFI:
Reboot your TrueNAS SCALE server.
Enter the BIOS/UEFI settings during boot (usually by pressing DEL, F2, F11, or F12).
Locate the IOMMU settings. This is often under sections like:
Advanced
System Agent (for Intel)
Northbridge/Chipset (for AMD)
Virtualization Technology
Enable IOMMU (or the equivalent setting).
Save the changes and exit the BIOS/UEFI.
Enable IOMMU in TrueNAS SCALE:
Open the TrueNAS SCALE Shell: You can do this from the web interface or via SSH.
Execute the midclt command to set the kernel extra options. Remember to use the correct IOMMU parameters for your CPU:
Intel:
midclt call system.advanced.update '{ "kernel_extra_options": "intel_iommu=on iommu=pt" }'
AMD:
midclt call system.advanced.update '{ "kernel_extra_options": "amd_iommu=on iommu=pt" }'
If you encounter issues with AMD, you might also need to include pci=assign-busses:
midclt call system.advanced.update '{ "kernel_extra_options": "amd_iommu=on iommu=pt pci=assign-busses" }'
Verify the setting: You can verify that the option has been set by retrieving the advanced system settings:
midclt call system.advanced.get
Look for the “kernel_extra_options” field in the output and confirm it contains the parameters you set.
Reboot your TrueNAS SCALE server: The kernel extra options are applied during the boot process, so a reboot is necessary.
sudo reboot
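After the reboot, a quick sanity check (standard Linux commands, nothing SCALE-specific; the exact log wording varies between Intel and AMD platforms) is to confirm the options actually reached the kernel:
cat /proc/cmdline                          # should now include intel_iommu=on (or amd_iommu=on) and iommu=pt
sudo dmesg | grep -iE 'dmar|iommu|amd-vi'  # Intel platforms log DMAR/IOMMU lines, AMD logs AMD-Vi lines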
Once IOMMU is enabled, we do the following:
1. Verify GPU Detection
Open the TrueNAS SCALE shell and run:
lspci | grep NVIDIA
You should see output similar to:
09:00.0 VGA compatible controller: NVIDIA Corporation GM206 [GeForce GTX 950] (rev a1)
09:00.1 Audio device: NVIDIA Corporation GM206 High Definition Audio Controller (rev a1)
Note the PCI Bus ID (e.g., 09:00.0).
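Step 6 below wants the domain-qualified form of that address (e.g. 0000:09:00.0). If you'd rather have lspci print it that way directly, the standard pciutils -D flag adds the domain prefix:
lspci -D | grep -i nvidia   # same listing, but addresses come out as 0000:09:00.0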
2. Enable NVIDIA for Docker
Run:
midclt call -j docker.update '{"nvidia": true}'
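If you want to confirm the flag stuck before rebooting, the Docker service config can usually be read back through the middleware. I believe docker.config is the matching read call on current SCALE releases; if it errors on your version, just rely on step 4 below:
midclt call docker.config   # look for "nvidia": true in the JSON it prints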
3. Reboot TrueNAS
Run:
sudo reboot
4. Verify Docker NVIDIA Runtime
After rebooting, run:
docker info | grep -i nvidia
You should see nvidia listed under Runtimes: and Default Runtime:.
5. Identify GPU UUID
Run:
nvidia-smi -L
Example output:
GPU 0: NVIDIA GeForce GTX 950 (UUID: GPU-7ae8def0-6839-1bbb-4b26-272e7560256f)
Note the UUID.
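If you want to tie the UUID back to the PCI address from step 1 in one shot, nvidia-smi can print both side by side (plain nvidia-smi query, nothing TrueNAS-specific):
nvidia-smi --query-gpu=index,name,pci.bus_id,uuid --format=csv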
6. Update Application Configuration with UUID
Replace the placeholders (APP_NAME, PCI_SLOT, GPU_UUID) with your values.
Example command:
midclt call -j app.update plex '{"values": {"resources": {"gpus": {"use_all_gpus": false, "nvidia_gpu_selection": {"0000:09:00.0": {"use_gpu": true, "uuid": "GPU-7ae8def0-6839-1bbb-4b26-272e7560256f"}}}}}}'
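Before running that, it may be worth dumping the app's current values so you have something to roll back to. This assumes the app.config middleware call is available on your SCALE release (skip it if not), uses plex as the app name as in the example, and the output filename is just an illustration:
midclt call app.config plex > ~/plex-values-backup.json   # keep a copy of the current config for later reference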
7. Restart the Application
Through the TrueNAS SCALE web interface:
Apps → Installed Applications → Plex → Restart
8. Verify GPU Usage
In Plex Web UI:
Settings → Transcoder → Ensure Hardware Acceleration is enabled
Play a transcoded video and:
Check Plex Activity Dashboard → Status → Now Playing → Look for (hw) next to the stream
(Optional) Run on TrueNAS shell during transcoding:
nvidia-smi
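If you want to watch utilization live while the stream plays (standard tools, nothing Plex-specific):
watch -n 2 nvidia-smi   # refresh every 2 seconds; the transcoder process should appear in the list while (hw) transcoding
nvidia-smi dmon -s u    # or one utilization sample per second; Ctrl+C to stop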
Hi there. Alas, this method does not work for me, to the point that if I start the Jellyfin app, it completely crashes the entire TrueNAS system.
I've tried to unbind the GPU with the command
midclt call -j app.update jellyfin '{"values": {"resources": {"gpus": {}}}}'
To no avail.
I've also tried resetting to the default behaviour with
midclt call -j app.update jellyfin '{"values": {"resources": {"gpus": {"use_all_gpus": true}}}}'
It still crashes TrueNAS if I start the app.
Just for confirmation: yes, IOMMU is on, and yes, the variables are correct in the command
midclt call -j app.update jellyfin '{"values": {"resources": {"gpus": {"use_all_gpus": false, "nvidia_gpu_selection": {"0000:21:00.0": {"use_gpu": true, "uuid": "GPU-8d457232-05dc-e284-d0d5-3d69018befec"}}}}}}'
Two points to note, though, which may or may not have an effect:
1/ It's a dual CPU server. One graphics card (P40) is linked to CPU 1 for my ComfyUI instance, the other (GTX 1060) to CPU 2. Could it be that it's the hardware that crashes if CPU 1 tries to access GPU 2? If so, can I 'link' CPU 2 to the Jellyfin app?
2/ Not all commands gave me the reply you described. (But nvidia-smi does list the 1060 as properly installed.)
Somewhat at a loss here and unsure how to get Jellyfin back to its original state before I started tinkering to get the transcoding working…
Help much appreciated, and thanks in advance!
Also, for anyone stumbling across this post: the webpage changes straight quotes into rounded ones, so you need to either type the commands in manually or copy them into a text editor and find/replace the quotes…
Not sure if you tried this, but I asked AI (I know, that's cheating, right?):
Likely Causes of the Crash
1. Dual CPU Architecture (NUMA Split)
- The server uses two CPUs, and the GPUs are attached to different CPUs:
- P40 → CPU 1
- GTX 1060 → CPU 2
- If Jellyfin (or its Kubernetes pod) runs on a core tied to CPU 1 and tries to access the GPU on CPU 2, cross-CPU memory access (especially for IOMMU and DMA) may lead to instability or total system crashes.
- This is a known issue in NUMA systems without proper isolation or affinity control.
→ Fix Suggestion:
Set CPU affinity or pin Jellyfin's pod to a core on CPU 2 to match the GPU topology. Unfortunately, TrueNAS SCALE doesn't currently offer direct CPU core pinning for apps in its GUI or midclt, but manually editing the Kubernetes pod deployment (if you drop to the k8s layer) could help.
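A quick way to check the NUMA placement yourself is via sysfs; the kernel exposes a numa_node attribute per PCI device (using the GTX 1060's address from your earlier command as an example; a result of -1 means the platform doesn't report an affinity):
cat /sys/bus/pci/devices/0000:21:00.0/numa_node   # which NUMA node (CPU socket) the GTX 1060 hangs off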
2. Incorrect PCI Binding or VFIO Conflict
- It’s possible that the NVIDIA driver or TrueNAS’s k8s pod runtime is trying to use the GPU before it’s fully bound or available, or is trying to unbind it while it’s in use.
- A misconfiguration here (like conflicting VFIO bindings, or not releasing a GPU from another container or service) can also hard crash the system.
→ Fix Suggestion:
Ensure that no other service is using the GPU and confirm that VFIO-pci isn’t trying to claim it. Check with:
lspci -nnk | grep -A 3 -i nvidia
Make sure the NVIDIA driver is bound, not vfio-pci.
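A quick way to see what vfio-pci has already claimed (and therefore what the NVIDIA driver can't touch) is to list its sysfs driver directory. On SCALE a GPU usually ends up bound to vfio-pci because it was marked as isolated under System Settings → Advanced (the exact menu location varies a bit between releases):
ls /sys/bus/pci/drivers/vfio-pci/ 2>/dev/null   # any PCI addresses listed here are held by vfio-pci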
3. NVIDIA P40 + GTX 1060 Driver Incompatibility
- Mixing data center (P40) and consumer GPUs (GTX 1060) can cause instability if the driver doesn’t fully support both in the same instance, especially in containerized environments.
- The TrueNAS SCALE host uses a unified NVIDIA runtime. One driver version must work for both cards, and sometimes issues arise if one GPU uses features unsupported by the other.
→ Fix Suggestion:
Check that both GPUs are supported by the same NVIDIA driver version (nvidia-smi can confirm the driver version).
4. Container Runtime GPU Assignment Issues
- When configuring the GPU UUID in the app config, if the pod is misassigned or misconfigured (e.g., GPU not visible in the container), it may attempt a low-level GPU call and crash if access is denied or corrupt.
→ Fix Suggestion:
Try setting use_all_gpus: true temporarily and test if Jellyfin works without any GPU assignment. If it does, it confirms that the crash is GPU-passthrough related.
How to Restore Jellyfin to a Working State
To fully reset the Jellyfin app’s GPU config:
midclt call -j app.update jellyfin '{"values": {"resources": {"gpus": {}}}}'
Then restart the app:
midclt call app.restart jellyfin
If that doesn’t work, uninstall and reinstall the app via the TrueNAS SCALE Web UI to wipe the config clean.
Additional Suggestions
- Use journalctl -xe right after the crash (if accessible via SSH) to see kernel panic logs or GPU driver errors (a narrower filter example follows this list).
- Run Jellyfin without GPU passthrough first. If it works fine, then the GPU passthrough is confirmed to be causing the issue.
- Consider using only one GPU at a time per workload to simplify GPU passthrough setup.
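One extra journal trick (assuming the journal survives the crash; after a hard reset you may only have the current boot): filter the kernel messages for NVIDIA entries, which is where NVRM/Xid errors show up:
sudo journalctl -k | grep -iE 'nvrm|xid|nvidia'        # NVIDIA kernel messages from the current boot
sudo journalctl -k -b -1 | grep -iE 'nvrm|xid|nvidia'  # same, from the previous boot (needs a persistent journal)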
My setup involves only one CPU, so it's hard to replicate/test these suggestions.
Out of curiosity, which one is this? (I've been troubleshooting with the help of DeepSeek.) The differences are… interesting. (I'll troubleshoot with that one then.)
As to feedback:
1/ Both CPUs should (according to HP) be able to access all GPUs. The difference is apparently in bandwidth.
2/ My P40 is bound to vfio-pci, not the 1060.
(results from command:
admin@truenas[~]$ lspci -nnk | grep -A 3 -i nvidia
0a:00.0 3D controller [0302]: NVIDIA Corporation GP102GL [Tesla P40] [10de:1b38] (rev a1)
Subsystem: NVIDIA Corporation GP102GL [Tesla P40] [10de:11d9]
Kernel driver in use: vfio-pci
Kernel modules: nouveau, nvidia_drm, nvidia
1f:08.0 System peripheral [0880]: Intel Corporation Xeon E7 v2/Xeon E5 v2/Core i7 QPI Link 0 [8086:0e80] (rev 04)
Subsystem: Hewlett-Packard Company Xeon E7 v2/Xeon E5 v2/Core i7 QPI Link 0 [103c:18a8]
1f:09.0 System peripheral [0880]: Intel Corporation Xeon E7 v2/Xeon E5 v2/Core i7 QPI Link 1 [8086:0e90] (rev 04)
--
21:00.0 VGA compatible controller [0300]: NVIDIA Corporation GP106 [GeForce GTX 1060 6GB] [10de:1c03] (rev a1)
Subsystem: PC Partner Limited / Sapphire Technology GP106 [GeForce GTX 1060 6GB] [174b:2438]
Kernel driver in use: nvidia
Kernel modules: nouveau, nvidia_drm, nvidia
21:00.1 Audio device [0403]: NVIDIA Corporation GP106 High Definition Audio Controller [10de:10f1] (rev a1)
Subsystem: PC Partner Limited / Sapphire Technology GP106 High Definition Audio Controller [174b:2438]
Kernel driver in use: snd_hda_intel
Kernel modules: snd_hda_intel
So I gotta check that
3/ The driver is 550.142, and yes, it supports both the P40 and the 1060.
nvidia-smi only lists the 1060 (probably because the P40 is bound to vfio-pci for some reason…).
4/ Jellyfin worked fine without the GPU. Once connected - death. Even if removed with that command (so it should be back to default) - death. So yeah, I agree the issue is passthrough related.
This said (as extra info):
I've tried making a Linux instance and connecting the P40 to that one. Trying to create the instance with the GPU attached - death.
I did not know the journal command, though. Running it now (the system has been running for a day) does give me this interesting feedback:
May 11 02:11:20 truenas kernel: NVRM: GPU at PCI:0000:21:00: GPU-8d457232-05dc-e284-d0d5-3d69018befec
May 11 02:11:20 truenas kernel: NVRM: Xid (PCI:0000:21:00): 62, pid='', name=, 0001c02e 0001c01a 0001c074 0001bec5 0001bebe 0001bd88 00000011 00000000
That's the 1060. (The P40 is 0a:00.0.)
In fairness, I do have to mention I'm doing things the server was never designed for… It was never designed for USB 3.2, nor 2.5 Gbps networking, nor the eclectic combination of supported and unsupported disks (some SSDs, one NVMe) spread over an LFF drive bay and an SFF one (not allowed by HP), plus a PCI card holding the NVMe drive…
Every step of this build has been a pain I would not dare to inflict upon my worst enemies…
I'll check the logs tomorrow and connect to the management console (which for some reason refuses to connect to the network, but that's not TrueNAS's fault) and report back. If meanwhile you (or the AI) have further suggestions (at least the name), I'm all ears!
Thanks for the help!