Fangtooth VM or app with Blackwell?

Is there any scenario where I can pass my 5060 Ti through to a Linux VM in Fangtooth, or will Goldeye be the only option? It’s not clear to me what driver support is needed on the TrueNAS host to get the GPU passed to the guest. I’m perfectly fine with installing CUDA, etc. in the guest so I can run a small LLM, but what is needed to get the GPU passed through? Is Docker ultimately going to be the “better” route to running Ollama, either in Fang or Gold? I assume it’s safe to say that checking “Install NVIDIA Drivers” in Fang is useless for Blackwell. And if host drivers aren’t needed for VM passthrough, maybe I have some BIOS settings wrong, since the isolation option isn’t appearing in advanced settings?

Honestly? With the Goldeye release later this month, I’d probably wait.
But nothing should prevent you from passing it through to a VM. The only thing required is that TrueNAS sees the PCIe device, which you then isolate during VM setup so it can be passed through.
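
A quick sanity check from a host shell (just a sketch; no NVIDIA driver is needed for this, the card only has to enumerate):

# confirm the host sees the card at all
lspci -nn | grep -i nvidia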


Isolating a GPU for a VM doesn’t require driver support in the host - so as mentioned you can isolate your 5060Ti in 25.04 and pass it to a VM even though the host driver is too old.

But we’ve got Blackwell support in 25.10 which is presently out as RC1, so that release should let you run Ollama as a Docker container - likely with less overhead than a full VM.
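
Once you’re on 25.10 with working drivers, the stock upstream container invocation is roughly all it takes (a sketch, assuming the NVIDIA container runtime is available to Docker; adjust ports and volumes to taste):

# run Ollama with GPU access and persist models in a named volume
docker run -d --gpus=all -v ollama:/root/.ollama -p 11434:11434 --name ollama ollama/ollama
# then pull/run a model that fits the card's VRAM, e.g.
docker exec -it ollama ollama run llama3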

Thanks, but so far no luck with passthrough. Could be my mobo is the culprit. I can’t see a way to prevent it from latching onto the 5060 Ti as the primary display, and most of the advanced BIOS options for virtualization and PCI don’t seem to be exposed. I have a Gigabyte Aorus Z590, so not exactly a “server” board. The card appears in an IOMMU group, but it’s not offered for isolation. It may be that my best option will be Goldeye Docker; it’s just that I have a LOT of VM experience and very little experience with containers. Another option, I suppose, would be Proxmox with TrueNAS as a VM, but I expect that’s not an optimal use of resources.

edit:

GPU passthrough only works on TrueNAS if there are no other devices in the same IOMMU group (other than an audio device belonging to the GPU).

Consumer motherboards are notorious for putting everything in one group. Server/workstation boards often separate devices nicely.
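
If you want to check the grouping yourself, a minimal sketch along these lines (presumably similar to the run.sh used later in this thread) walks sysfs and prints every group:

#!/bin/sh
# list each IOMMU group and the PCI devices it contains
for g in /sys/kernel/iommu_groups/*; do
    [ -d "$g" ] || continue
    printf 'IOMMU Group %s:\n' "${g##*/}"
    for d in "$g"/devices/*; do
        lspci -s "${d##*/}"
    done
done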

IOMMU Group 2:
00:01.0 PCI bridge: Intel Corporation 6th-10th Gen Core Processor PCIe Controller (x16) (rev 05)
00:01.1 PCI bridge: Intel Corporation Xeon E3-1200 v5/E3-1500 v5/6th Gen Core Processor PCIe Controller (x8) (rev 05)
01:00.0 RAID bus controller: Broadcom / LSI SAS2008 PCI-Express Fusion-MPT SAS-2 [Falcon] (rev 03)
02:00.0 VGA compatible controller: NVIDIA Corporation Device 2d04 (rev a1)
02:00.1 Audio device: NVIDIA Corporation Device 22eb (rev a1)

From the looks of that overloaded IOMMU group, TrueNAS as a VM won’t work either because you won’t be able to split the HBA out to pass it through to TrueNAS, unless you use ACS overrides or similar to eliminate the IOMMU/PCI security fencing.

Is it possible there’s a BIOS update for your board that might improve the IOMMU handling (or expose a setting that lets you split things up better?)

Yep, so it’s goldeye/docker or a different mobo. Nothing interesting on the BIOS update page.

@HoneyBadger

root@nas[~]# sh ./run.sh
IOMMU Group 0:
00:02.0 VGA compatible controller: Intel Corporation CometLake-S GT2 [UHD Graphics 630] (rev 05)

IOMMU Group 2:
00:01.0 PCI bridge: Intel Corporation 6th-10th Gen Core Processor PCIe Controller (x16) (rev 05)
01:00.0 VGA compatible controller: NVIDIA Corporation Device 2d04 (rev a1)
01:00.1 Audio device: NVIDIA Corporation Device 22eb (rev a1)

IOMMU Group 18:
06:00.0 PCI bridge: ASPEED Technology, Inc. AST1150 PCI-to-PCI Bridge (rev 04)
07:00.0 VGA compatible controller: ASPEED Technology, Inc. ASPEED Graphics Family (rev 41)

Got an affordable Supermicro mobo, and the above is the latest output. Any suggestions on what to look at next to isolate the GPU? The BIOS has a zillion more options than the Gigabyte had, so there may be a PCIe option I missed.

Is it showing up in the GPU options for isolation? I suspect we’re tripping a false positive in the middleware (one that’s been addressed in 25.10), where it sees the PCI bridge device in IOMMU Group 2 and thinks it’s a critical device.

If it’s still not there in the regular drop-downs, you can

  1. Upgrade to 25.10-RC1 and see if it’s resolved (which will also get you native driver support and let you run Ollama as an App)
  2. Manually add the two components as PCIe devices (01:00.0 and 01:00.1) to your VM - although you probably don’t need audio here.

What model? Should only need IOMMU/VT-d enabled.

It is not showing as an option for isolation.

It’s an X12SCZ-QF-B.

Drop to a shell and show me the result of

midclt call system.advanced.get_gpu_pci_choices

please 🙂

It should give you JSON output including a couple fields:

"uses_system_critical_devices": false,
"critical_reason": null

If they’re anything other than false/null, then you might have the issue in question, which would hopefully be resolved by running 25.10-RC1.

root@nas[~]# midclt call system.advanced.get_gpu_pci_choices

{}

That’s … not what I expected, certainly. Is your display monitor plugged into the ASPEED?

yes

I’m quite confused here.

How about midclt call virt.device.pci_choices | jq and scrolling to find your NVIDIA card? Is it in there?
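
If the full dump is unwieldy, a filter along these lines should narrow it to just the NVIDIA entries (a sketch; assumes the output is an object keyed by PCI address, which is what I’d expect):

midclt call virt.device.pci_choices | jq 'to_entries[] | select((.value.capability.vendor // "") | test("NVIDIA"))'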

{
  "0000:01:00.0": {
    "pci_addr": "0000:01:00.0",
    "capability": {
      "class": "0x030000",
      "domain": "0",
      "bus": "1",
      "slot": "0",
      "function": "0",
      "product": "Not Available",
      "vendor": "NVIDIA Corporation"
    },
    "controller_type": "VGA compatible controller",
    "critical": false,
    "iommu_group": {
      "number": 2,
      "addresses": [
        {
          "domain": "0x0000",
          "bus": "0x00",
          "slot": "0x01",
          "function": "0x0"
        },
        {
          "domain": "0x0000",
          "bus": "0x01",
          "slot": "0x00",
          "function": "0x0"
        },
        {
          "domain": "0x0000",
          "bus": "0x01",
          "slot": "0x00",
          "function": "0x1"
        }

Both the video and the audio.

How odd that it’s masking it off, but not telling you why.

Is it already tagged for isolation? lspci -k | grep vfio
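
Or, for a fuller picture of which kernel driver currently owns each function of the card (assuming it’s still at 01:00):

# shows both functions with their IDs; after isolation you'd expect "Kernel driver in use: vfio-pci"
lspci -nnk -s 01:00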

You should be able to add it to a VM as a PCI device by its slot/device ID, though, as mentioned in post #8 under “option 2”.

Returns nothing

raise CallError('\n'.join(errors))

middlewared.service_exception.CallError: [EFAULT] internal error: qemu unexpectedly closed the monitor: 2025-10-10T21:04:31.931336Z qemu-system-x86_64: -device {"driver":"vfio-pci","host":"0000:01:00.0","id":"hostdev0","bus":"pci.5","addr":"0x0"}: vfio 0000:01:00.0: group 2 is not viable
Please ensure all devices within the iommu_group are bound to their vfio bus driver.
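
From what I can tell, that “group 2 is not viable” message means something in IOMMU group 2 (most likely the audio function at 01:00.1) is still bound to a host driver rather than vfio-pci. The generic Linux workaround that gets suggested is roughly the sketch below (untested on my box, and adding both functions to the VM as in option 2 above is probably the cleaner fix):

# hypothetical manual rebind of both GPU functions to vfio-pci;
# TrueNAS's own isolation normally does this for you
modprobe vfio-pci
for dev in 0000:01:00.0 0000:01:00.1; do
    echo vfio-pci > /sys/bus/pci/devices/$dev/driver_override
    # detach from whatever host driver currently owns the function, if any
    [ -e /sys/bus/pci/devices/$dev/driver ] && echo $dev > /sys/bus/pci/devices/$dev/driver/unbind
    echo $dev > /sys/bus/pci/drivers_probe
done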

I was going to throw Goldeye on a couple of spare SSDs, but it refuses to install without wiping the existing NVMe boot drive, and I’m not ripping the box apart to perform a test at this point. Tomorrow I have some hardware coming to shuffle some things around, so I may disconnect some things to get a clean Goldeye install and run a few tests. I won’t upgrade the existing system to an RC1.

OK, I managed to get a temporary Goldeye install going. The GPU appears in all the right places, including isolation. So I’m going back to Fang, and when Gold goes GA I’ll look forward to actually using the GPU I bought 😉 Not sure what’s different that prevents Fang from working for passthrough.

–Thanks