HELP - GPU Passthrough to VM Issues

Hello everyone, I am new here, so please be patient with me. I have read a lot of guides and TrueNAS forum posts looking for solutions, but I still have not managed to set up GPU passthrough to my VMs.

Before getting into detail, here is my setup:
Mobo: ASUS P8B75-V (with latest bios)
CPU: Intel i7 2600
RAM: (4x) 4GB 1333MHz DDR3 (16GB total)
Storage:
-SanDisk 128GB SATA SSD
-(3x) Seagate Constellation ES.3 ST2000NM0033 2TB 7200 RPM 128MB Cache
GPU: MSI NVIDIA GTX 960 Gaming 4GB
PSU: Thermaltake Smart 500W

So, I set up my server using HexOS without a GPU, and I am now getting more familiar (or trying to) with TrueNAS SCALE. I have set up Immich and Plex and have had no issues with them so far. Recently I upgraded my mobo to the one mentioned above and tried to add my GTX 960, updating the BIOS to the most recent available version. After trying many things for hours and hours, I realised two things.
First of all, even though my CPU has a built-in iGPU (and it is selected as the main display device), if I plug the dGPU into the first PCIe slot it is not recognized in the Advanced Settings GPU isolation dropdown, even though I can go to the shell, run nvidia-smi, and see it recognized. So I don’t really know what the deal is with that.
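For reference, this is roughly how I confirmed from the shell that the host sees the card at all (just generic commands I picked up from guides, nothing TrueNAS-specific):

# list the NVIDIA GPUs the host driver can currently see
nvidia-smi -L

# show every VGA-class device along with the kernel driver bound to it
lspci -nnk | grep -i -A3 vga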
The second thing is that if I move the GPU to the secondary PCIe slot, it becomes visible in the GPU isolation dropdown and I can select it. After that I go to my already created Windows VM and add it as two PCI devices (one listed as VGA output and one as an audio device), but when I try to launch the VM it fails. I have tried so many things that I don’t even know where to start.

Ideally it would be nice to figure out why the GPU in the first PCIe slot does not even show up for isolation, but honestly I am not planning to add a second PCIe device to this machine anytime soon, so we can skip that. I would really appreciate it if someone could help me with the VM failing to launch once the GPU is isolated and added to it.

Here is the error I get when I try to launch the VM with the GPU attached to the VM:

[EFAULT] internal error: qemu unexpectedly closed the monitor: 2025-07-30T17:54:18.723358Z qemu-system-x86_64: -device {"driver":"vfio-pci","host":"0000:02:00.0","id":"hostdev0","bus":"pci.0","addr":"0x7"}: VFIO_MAP_DMA failed: Bad address
2025-07-30T17:54:18.761342Z qemu-system-x86_64: -device {"driver":"vfio-pci","host":"0000:02:00.0","id":"hostdev0","bus":"pci.0","addr":"0x7"}: vfio 0000:02:00.0: failed to setup container for group 12: memory listener initialization failed: Region pc.ram: vfio_dma_map(0x5560bb4135d0, 0x100000000, 0x1c0000000, 0x7fbe23600000) = -2 (No such file or directory)

Clicking “More info…” shows me this:

 Error: Traceback (most recent call last):
  File "/usr/lib/python3/dist-packages/middlewared/plugins/vm/supervisor/supervisor.py", line 189, in start
    if self.domain.create() < 0:
       ^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3/dist-packages/libvirt.py", line 1373, in create
    raise libvirtError('virDomainCreate() failed')
libvirt.libvirtError: internal error: qemu unexpectedly closed the monitor: 2025-07-30T17:54:18.723358Z qemu-system-x86_64: -device {"driver":"vfio-pci","host":"0000:02:00.0","id":"hostdev0","bus":"pci.0","addr":"0x7"}: VFIO_MAP_DMA failed: Bad address
2025-07-30T17:54:18.761342Z qemu-system-x86_64: -device {"driver":"vfio-pci","host":"0000:02:00.0","id":"hostdev0","bus":"pci.0","addr":"0x7"}: vfio 0000:02:00.0: failed to setup container for group 12: memory listener initialization failed: Region pc.ram: vfio_dma_map(0x5560bb4135d0, 0x100000000, 0x1c0000000, 0x7fbe23600000) = -2 (No such file or directory)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/lib/python3/dist-packages/middlewared/main.py", line 211, in call_method
    result = await self.middleware.call_with_audit(message['method'], serviceobj, methodobj, params, self)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3/dist-packages/middlewared/main.py", line 1529, in call_with_audit
    result = await self._call(method, serviceobj, methodobj, params, app=app,
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3/dist-packages/middlewared/main.py", line 1460, in _call
    return await methodobj(*prepared_call.args)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3/dist-packages/middlewared/schema/processor.py", line 179, in nf
    return await func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3/dist-packages/middlewared/schema/processor.py", line 49, in nf
    res = await f(*args, **kwargs)
          ^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3/dist-packages/middlewared/plugins/vm/vm_lifecycle.py", line 58, in start
    await self.middleware.run_in_thread(self._start, vm['name'])
  File "/usr/lib/python3/dist-packages/middlewared/main.py", line 1367, in run_in_thread
    return await self.run_in_executor(io_thread_pool_executor, method, *args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3/dist-packages/middlewared/main.py", line 1364, in run_in_executor
    return await loop.run_in_executor(pool, functools.partial(method, *args, **kwargs))
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.11/concurrent/futures/thread.py", line 58, in run
    result = self.fn(*self.args, **self.kwargs)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3/dist-packages/middlewared/plugins/vm/vm_supervisor.py", line 68, in _start
    self.vms[vm_name].start(vm_data=self._vm_from_name(vm_name))
  File "/usr/lib/python3/dist-packages/middlewared/plugins/vm/supervisor/supervisor.py", line 198, in start
    raise CallError('\n'.join(errors))
middlewared.service_exception.CallError: [EFAULT] internal error: qemu unexpectedly closed the monitor: 2025-07-30T17:54:18.723358Z qemu-system-x86_64: -device {"driver":"vfio-pci","host":"0000:02:00.0","id":"hostdev0","bus":"pci.0","addr":"0x7"}: VFIO_MAP_DMA failed: Bad address
2025-07-30T17:54:18.761342Z qemu-system-x86_64: -device {"driver":"vfio-pci","host":"0000:02:00.0","id":"hostdev0","bus":"pci.0","addr":"0x7"}: vfio 0000:02:00.0: failed to setup container for group 12: memory listener initialization failed: Region pc.ram: vfio_dma_map(0x5560bb4135d0, 0x100000000, 0x1c0000000, 0x7fbe23600000) = -2 (No such file or directory)

Note: If I detach the GPU from the VM (regardless of whether the GPU is physically installed or not), I can launch the VM without any issues and can connect to it through the SPICE interface.

Note 2: There is no IOMMU setting in the BIOS, but Virtualisation is on, and I don’t know what other settings I should be changing or whether this is even related. The mobo is pretty old; I built this system because I had most of the parts lying around and am trying not to spend too much on it if possible. (The mobo upgrade was only so I could fit the GPU, as on the old board it was hitting the SATA ports and not going in.)

Here is a screenshot of the devices when I add the GPU:

Thank you so much for your help in advance

Some Updates:

  • GPU is recognized in lspci in both slots

  • For some reason, when I plug the GPU into the first PCIe slot, the only thing that shows up for isolation is an unrecognized “02:00.0 Unknown” slot, which happens to be the iGPU of the CPU (i7 2600).

  • Similarly, if I go to my already set-up and working VMs and look at the dropdown menu to add/pass through a GPU, the only thing available is that unknown slot 02:00.0, and when I try to choose it I get the “1 GPU” error:

“At least 1 GPU is required by the host for its functions.
With your selection, no GPU is available for the host to consume.”

P.S.: For both of these screenshots I had set the discrete GPU as the main output device in the BIOS (I know it should normally be set to the iGPU, but I wanted to test it, as that felt like the only thing I had not tried so far).
Since doing that, though, the 02:00.0 device has become invisible in lspci, and I have no idea how.

This image also shows that, for the audio part of the discrete GPU, the kernel driver in use is snd_hda_intel. I am not sure if that is normal, but hopefully that is not the problem.

  • Also, as a small piece of additional info, I am currently on TrueNAS version 24.10.2.4; I forgot to add that to the main post.
    OS Version: TrueNAS-SCALE-24.10.2.4

Please let me know if additional info is required or if you want me to post the output of a specific command.

Thank you

From a quick glance at the manual, have you set the Primary Display to iGPU, and enabled the iGPU Multi-Monitor setting? You may also need to plug a monitor into the onboard display ports instead of your GTX960.

The motherboard looks like it’s based on the rather old Intel B75 chipset, so IOMMU groups may be unoptimized or even unavailable; as you mentioned, there doesn’t seem to be a setting for them. It might be called “Intel VT-d” on boards of that era as well.
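To confirm whether the kernel actually found an IOMMU, you can also check the boot log; on Intel, VT-d announces itself as DMAR there (this is a generic check, not TrueNAS-specific):

# look for VT-d / IOMMU messages from early boot; no output usually means it is disabled or unsupported
dmesg | grep -i -e DMAR -e IOMMU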

Check to see if there are any subdirectories present in the folder /sys/kernel/iommu_groups/ - if there are, you can show the group breakdowns with the following script:

# walk every device registered under an IOMMU group, sorted by group number
for d in $(find /sys/kernel/iommu_groups/ -type l | sort -n -k5 -t/); do
    # extract the group number from the path
    n=${d#*/iommu_groups/*}; n=${n%%/*}
    printf 'IOMMU Group %s ' "$n"
    # print the PCI device (with vendor/device IDs) for this entry
    lspci -nns "${d##*/}"
done;

Your GPU will need to be in a group by itself (or with its own audio device, at most), because the whole group gets passed to a VM at once.
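If you just want to look at the GPU on its own, you can also query it directly through sysfs. The 0000:02:00.0 address and group 12 below are taken from your error output, so substitute whatever your system actually reports:

# print which IOMMU group the GPU belongs to
readlink /sys/bus/pci/devices/0000:02:00.0/iommu_group

# list every device that shares that group (group 12 according to your error message)
ls /sys/kernel/iommu_groups/12/devices/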

Thank you for your reply,

So I switched the primary display back to iGPU and re-enabled the iGPU Multi-Monitor setting. Should I keep everything else the same (like Render Standby, etc.)?

I also made sure Intel Virtualization Technology is enabled.

Here is an updated screenshot of lspci after doing those:

and lspci -k:

There seem to be 13 groups under /sys/kernel/iommu_groups/:

And finally, I created and ran the small script you gave me, which showed that the GPU and the PCI bridge seem to be in the same IOMMU group (if I am understanding it right). Is that a problem? Here is the output:

Is it possible to change a device’s IOMMU group manually?
And is it possible that the two PCIe slots are in different IOMMU groups (and if so, should I switch the GPU to the second slot)?

Thank you so much
I finally feel like there is some progress!

Hey HoneyBadger,

While searching for situations similar to mine, I came across the following post:

It is very similar to my situation, and I was wondering whether it is actually fine to just pass the GPU through to the VM without isolating it in the Advanced Settings menu.

Could you please explain whether this would work for me, and whether there is any risk in doing it this way?

P.S. This also answers my question:

And finally, I created and ran the small script you gave me, which showed that the GPU and the PCI bridge seem to be in the same IOMMU group. Is that a problem?

Probably not, as other people who also have the PCI bridge in the group seem to be able to pass their GPUs through to VMs just fine. But I’d still appreciate an answer if there is more to the story.

Thank you

Likely yes.

It could fail to pass through, and the VM won’t start if the system is using the GPU for something.

Uhh - additional info that might be useful: sometimes the GPU has an ‘audio’ component to it, and that also needs to be passed through, else things just don’t work.
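If you want to double-check, listing all the functions on the card’s slot should show both parts (I’m borrowing the 02:00 slot number from your error message, so adjust if needed):

# .0 is usually the VGA function and .1 the HDMI audio function of the same card
lspci -nn -s 02:00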

Hello HoneyBadger and Fleshmauler,

Thank you for your help; I have made some more progress!

Turns out there was another issue:

  • I have 16GB of RAM in total and was trying to assign 10GB to the VM. For some reason it could run like that when I was not passing the GPU through, but it gave an error once I added the two PCIe passthrough devices (one for the VGA part of the GPU and one for the audio part). Decreasing it to 8GB got rid of that error (see the quick memory check after this list).
    And it can boot into the VM now!!!

  • It is still not isolated, though. When I first boot up the server everything works nicely: I can run the VM, and I installed the GPU drivers in Windows inside the VM without any issues, so I can now use the GPU in the VM. But… when I am done using the VM and shut it down, the GPU somehow gets “recognized” (grabbed) by TrueNAS and becomes unavailable in the device dropdown menu, and I cannot boot into the VM again without a full restart of the server. (So I guess that is what happens if you use the GPU without isolation.)

  • Finally (I remember reading this in a few other forum posts as well), when the GPU is passed through to the VM I cannot make any changes to the VM configuration, meaning I cannot edit the amount of RAM, core counts, etc. without removing the passthrough, making the changes, and re-adding the two passthrough entries in the Devices section of the VM. (I don’t know if this is a big problem, and once I find a good balance for the system I can just leave the settings alone, but it does not feel like it can be normal.)
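Regarding the RAM point above, this is the quick sanity check I do now before starting the VM. My (possibly wrong) understanding is that with PCIe passthrough the whole guest memory gets pinned up front, so the host’s available memory has to comfortably exceed whatever the VM is assigned:

# check how much memory the host actually has available before starting the VM
free -h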

My current biggest issue is that I need to reboot the server every time I shut down the VM in order to run it again (so: isolation).

Thank you so much once again though; this at least lets me get work done in the VM, and rebooting is just an annoyance, but it would be nice to fix if possible.


That’s what’s presently preventing it from showing in the dropdown. We’ve made some changes to the “critical device detection” in the upcoming 25.10 to address situations like this where the GPU is the only downstream device from a PCI bridge.

If you don’t have or need any NVIDIA cards to be accessible by the host (e.g. for Apps), then you can uninstall the driver from the Apps page. This should hopefully prevent the card from binding to the nvidia driver when the VM is powered off or restarted (because the driver won’t be loaded), which should address that challenge for now.
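If you would rather keep the driver installed for Apps, a manual unbind/rebind over sysfs might also get you out of the reboot cycle. This is only a rough sketch I have not tested on your setup, using the 0000:02:00.0 address from your logs; the audio function would need the same treatment:

# make sure the vfio-pci module is loaded
modprobe vfio-pci

# detach the GPU from whatever driver has grabbed it after the VM shuts down
echo 0000:02:00.0 > /sys/bus/pci/devices/0000:02:00.0/driver/unbind

# tell the kernel to hand the device to vfio-pci next, then trigger a re-probe
echo vfio-pci > /sys/bus/pci/devices/0000:02:00.0/driver_override
echo 0000:02:00.0 > /sys/bus/pci/drivers_probe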