25.10.4 Nvidia GPU driver/smi/app issue - identical system works fine

I’m helping a friend with his HexOS deployment, which uses nearly identical hardware to a deployment of my own. It’s running TrueNAS 25.10.4, and he can’t get nvidia-smi to detect the GPU even though it shows up in midclt. I’ve emptied my bag of tricks with no result and need more experienced help.

Chronology:

GPU (5060 Ti 16 GB) not detected during HexOS install.

Reseated GPU, ensured Secure Boot disabled and Resizable BAR enabled.

Still not detected by the HexOS installer, so I removed the GPU and went ahead with the install. All troubleshooting from here on was done in the TrueNAS interface.

Verified GPU works in a Windows system.

Reinstalled GPU, checked BIOS settings again, manually set PCIe from Auto to Gen 5.

Apps > Settings > “Install NVIDIA Drivers” checked.

GPU does not appear in immich/jellyfin settings.

Verified GPU not isolated in advanced settings. Appears as “NVIDIA Corporation VGA compatible controller” instead of 5060 Ti.

Shell > nvidia-smi returns No devices were found
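
When nvidia-smi reports no devices, the kernel log is usually the place that says why the driver could not initialize the card. Below is a minimal sketch for pulling out the relevant lines. The function name is my own, and the note about open kernel modules reflects NVIDIA's stated requirement for Blackwell (RTX 50-series) cards, not something confirmed on this system.

```shell
# Filter a kernel log stream (stdin) down to lines that mention the NVIDIA
# driver. "NVRM" is the prefix the NVIDIA kernel module uses for its messages.
nvidia_kernel_lines() {
    grep -iE 'nvrm|nvidia' || true
}

# Usage on the affected host (as root):
#   dmesg | nvidia_kernel_lines
#   cat /proc/driver/nvidia/version   # version of the loaded kernel module
#   modinfo -F license nvidia         # "Dual MIT/GPL" should indicate the open
#                                     # modules, "NVIDIA" the proprietary ones
```

If the license comes back as NVIDIA, that would be worth raising, since as far as I know the proprietary modules do not support 50-series cards at all.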

Tried manually installing drivers:

# 1. Download the compatible driver extension for TrueNAS 25.10
wget -O /tmp/nvidia.raw https://truenas-drivers.zhouyou.info/25.10.4/nvidia.raw

# 2. Unmerge the current (broken) system extension
systemd-sysext unmerge

# 3. Make the usr dataset writable temporarily
zfs set readonly=off "$(zfs list -H -o name /usr)"

# 4. Copy the new driver into the extensions directory
cp /tmp/nvidia.raw /usr/share/truenas/sysext-extensions/nvidia.raw

# 5. Lock the usr dataset again
zfs set readonly=on "$(zfs list -H -o name /usr)"

# 6. Merge the new extension
systemd-sysext merge

Restarted system.
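
One weakness of the steps above is that a failed copy leaves the /usr dataset writable. Below is a sketch of a wrapper that always restores readonly=on. The function name with_writable_usr is hypothetical, and it assumes the same zfs commands used in steps 3 and 5.

```shell
# Run a command with the /usr dataset temporarily writable, and restore
# readonly=on afterwards even if the command fails.
with_writable_usr() {
    local ds rc
    ds="$(zfs list -H -o name /usr)" || return 1
    zfs set readonly=off "$ds" || return 1
    rc=0
    "$@" || rc=$?
    zfs set readonly=on "$ds"
    return "$rc"
}

# Usage (as root), mirroring steps 2 through 6:
#   systemd-sysext unmerge
#   with_writable_usr cp /tmp/nvidia.raw /usr/share/truenas/sysext-extensions/nvidia.raw
#   systemd-sysext merge
#   systemd-sysext status   # confirm the extension is actually merged
```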

GPU now appears under immich app settings but nvidia-smi still returns No devices were found

When the GPU is checked in app settings, the app fails to start:

[EFAULT] Failed 'up' action for 'immich' app. Please check /var/log/app_lifecycle.log for more details
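
Since the error points at /var/log/app_lifecycle.log, a quick filter on its tail helps narrow things down. This is a minimal sketch: the function name and the keyword list are my own guesses at what is relevant, not anything TrueNAS defines.

```shell
# Show recent lines from the app lifecycle log that mention the GPU or an error.
# $1: log path (defaults to the path named in the error), $2: lines to scan.
lifecycle_hits() {
    local log="${1:-/var/log/app_lifecycle.log}" n="${2:-200}"
    tail -n "$n" "$log" | grep -iE 'nvidia|gpu|error' || true
}

# Usage (as root):
#   lifecycle_hits
#   lifecycle_hits /var/log/app_lifecycle.log 1000
```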

Queried middleware for GPUs:

midclt call app.gpu_choices | jq
{
  "0000:02:00.0": {
    "vendor": "NVIDIA",
    "description": "NVIDIA GeForce RTX 5060 Ti",
    "error": null,
    "vendor_specific_config": {
      "uuid": "GPU-00dbeb0b-01e1-69ed-7a4c-0f7ed36d6ab7"
    },
    "gpu_details": {
      "addr": {
        "pci_slot": "0000:02:00.0",
        "domain": "0000",
        "bus": "02",
        "slot": "00"
      },
      "description": "NVIDIA Corporation VGA compatible controller",
      "devices": [
        {
          "pci_id": "10DE:2D04",
          "pci_slot": "0000:02:00.0",
          "vm_pci_slot": "pci_0000_02_00_0"
        },
        {
          "pci_id": "10DE:22EB",
          "pci_slot": "0000:02:00.1",
          "vm_pci_slot": "pci_0000_02_00_1"
        }
      ],
      "vendor": "NVIDIA",
      "uses_system_critical_devices": false,
      "critical_reason": null,
      "available_to_host": true
    },
    "pci_slot": "0000:02:00.0"
  },

Verified the UUID is correct in /mnt/.ix-apps/user_config.yaml
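
The UUID check can be scripted rather than eyeballed. Below is a sketch that assumes the JSON shape shown above and simply searches for each UUID as a plain string in the config file. Both function names are hypothetical.

```shell
# Print the UUID of every GPU the middleware reports (JSON on stdin).
# Entries without a vendor_specific_config.uuid are skipped.
gpu_uuids() {
    jq -r '.[] | .vendor_specific_config.uuid // empty'
}

# For each UUID on stdin, report whether it appears in the given file.
check_uuids_in_config() {
    local cfg="$1" uuid
    while IFS= read -r uuid; do
        if grep -qF -- "$uuid" "$cfg"; then
            echo "found   $uuid"
        else
            echo "missing $uuid"
        fi
    done
}

# Usage (as root):
#   midclt call app.gpu_choices | gpu_uuids \
#       | check_uuids_in_config /mnt/.ix-apps/user_config.yaml
```

On a system where the driver works, the same UUID should also show up in the output of nvidia-smi -L.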

Repeated manual driver install to no effect.

Any ideas?

The GPU may be too new for that version of TrueNAS; some newer cards require newer drivers. Try searching the forum for that model. If you have a spare boot drive, you could install 26 Beta 2 as a test to see whether it works under that version. The actual release of 26 should be sometime later this year.

I see some related posts from last year and earlier this year, but nvidia-smi was working for them, and Goldeye apparently works with the 5000 series.

Also I have another HexOS instance with similar hardware running 25.10.4 with a 5060 Ti and no such issues…

Comparing sudo lspci -k | grep -A4 -i nvidia from both systems:

Working:

02:00.0 VGA compatible controller: NVIDIA Corporation Device 2d04 (rev a1)
        Subsystem: ZOTAC International (MCO) Ltd. Device 1772
        Kernel driver in use: nvidia
        Kernel modules: nouveau, nvidia_drm, nvidia
02:00.1 Audio device: NVIDIA Corporation Device 22eb (rev a1)
        Subsystem: NVIDIA Corporation Device 0000
        Kernel driver in use: snd_hda_intel
        Kernel modules: snd_hda_intel
03:00.0 Non-Volatile memory controller: Samsung Electronics Co Ltd NVMe SSD Controller S4LV008[Pascal]
        Subsystem: Samsung Electronics Co Ltd NVMe SSD Controller S4LV008[Pascal]

Not working:

02:00.0 VGA compatible controller: NVIDIA Corporation Device 2d04 (rev a1)
        Subsystem: ZOTAC International (MCO) Ltd. Device 1777
        Kernel driver in use: nvidia
        Kernel modules: nouveau, nvidia_drm, nvidia
02:00.1 Audio device: NVIDIA Corporation Device 22eb (rev a1)
        Subsystem: NVIDIA Corporation Device 0000
        Kernel driver in use: snd_hda_intel
        Kernel modules: snd_hda_intel
80:14.0 USB controller: Intel Corporation Device 7f6e (rev 10)
        Subsystem: Micro-Star International Co., Ltd. [MSI] Device 7e32
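
To avoid comparing the two outputs by eye, the GPU entries can be diffed directly. In the outputs above, the only difference in the 02:00.0 entry is the subsystem ID (1772 on the working card, 1777 on the other), so the cards are not quite the same model variant. Below is a sketch that assumes each system's lspci -k output has been saved to a file. The function names are hypothetical.

```shell
# Print the NVIDIA VGA entry (header plus three detail lines) from a saved
# `lspci -k` output file.
gpu_block() {
    grep -A3 -i 'VGA compatible controller: NVIDIA' "$1" || true
}

# Diff the GPU entries of two saved outputs.
# Returns 0 if they are identical, 1 if they differ.
compare_gpu_blocks() {
    local a b rc
    a=$(mktemp) b=$(mktemp)
    gpu_block "$1" > "$a"
    gpu_block "$2" > "$b"
    rc=0
    diff "$a" "$b" || rc=$?
    rm -f "$a" "$b"
    return "$rc"
}

# Usage:
#   sudo lspci -k > working.txt        # on the working system
#   sudo lspci -k > not-working.txt    # on the failing system
#   compare_gpu_blocks working.txt not-working.txt
```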

I think it depends on the firmware version of the cards. Some work, others were reported as not working.

You can try checking out this thread, or one linked inside it, and see if it helps diagnose things a bit.
