No GPUs available for isolation when enabling IOMMU in the BIOS

Hello,

I got new hardware and was pretty stoked to finally have a GPU to pass through to my VM.
That's why I also bought a CPU with an iGPU, so that my main GPU stays available for passthrough.

At first, I did not enable IOMMU in the BIOS, but I could see both my iGPU and my GPU in TrueNAS under “Isolated GPU Device(s)”. So I isolated my GPU and tried to assign it in my VM settings.

I got the following error (expected):

[EINVAL] attribute.pptdev: Not a valid choice. The PCI device is not available for passthru: Unable to determine iommu group.

Traceback (most recent call last):
  File "/usr/lib/python3/dist-packages/middlewared/main.py", line 211, in call_method
    result = await self.middleware.call_with_audit(message['method'], serviceobj, methodobj, params, self)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3/dist-packages/middlewared/main.py", line 1529, in call_with_audit
    result = await self._call(method, serviceobj, methodobj, params, app=app,
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3/dist-packages/middlewared/main.py", line 1460, in _call
    return await methodobj(*prepared_call.args)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3/dist-packages/middlewared/service/crud_service.py", line 230, in create
    return await self.middleware._call(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3/dist-packages/middlewared/main.py", line 1460, in _call
    return await methodobj(*prepared_call.args)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3/dist-packages/middlewared/service/crud_service.py", line 261, in nf
    rv = await func(*args, **kwargs)
         ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3/dist-packages/middlewared/schema/processor.py", line 49, in nf
    res = await f(*args, **kwargs)
          ^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3/dist-packages/middlewared/schema/processor.py", line 179, in nf
    return await func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3/dist-packages/middlewared/plugins/vm/vm_devices.py", line 161, in do_create
    data = await self.validate_device(data, update=False)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3/dist-packages/middlewared/plugins/vm/vm_devices.py", line 292, in validate_device
    await self.middleware.run_in_thread(device_obj.validate, device, old, vm_instance, update)
  File "/usr/lib/python3/dist-packages/middlewared/main.py", line 1367, in run_in_thread
    return await self.run_in_executor(io_thread_pool_executor, method, *args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3/dist-packages/middlewared/main.py", line 1364, in run_in_executor
    return await loop.run_in_executor(pool, functools.partial(method, *args, **kwargs))
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.11/concurrent/futures/thread.py", line 58, in run
    result = self.fn(*self.args, **self.kwargs)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3/dist-packages/middlewared/plugins/vm/devices/device.py", line 45, in validate
    verrors.check()
  File "/usr/lib/python3/dist-packages/middlewared/service_exception.py", line 72, in check
    raise self
middlewared.service_exception.ValidationErrors: [EINVAL] attribute.pptdev: Not a valid choice. The PCI device is not available for passthru: Unable to determine iommu group

Then I enabled IOMMU in the BIOS (along with other settings, see below).
When I check the groups, I get the following output:

IOMMU Group 0 00:01.0 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Renoir PCIe Dummy Host Bridge [1022:1632]
IOMMU Group 0 00:01.1 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] Renoir PCIe GPP Bridge [1022:1633]
IOMMU Group 0 01:00.0 PCI bridge [0604]: Intel Corporation Device [8086:4fa1] (rev 01)
IOMMU Group 0 02:01.0 PCI bridge [0604]: Intel Corporation Device [8086:4fa4]
IOMMU Group 0 02:04.0 PCI bridge [0604]: Intel Corporation Device [8086:4fa4]
IOMMU Group 0 03:00.0 VGA compatible controller [0300]: Intel Corporation DG2 [Arc A380] [8086:56a5] (rev 05)
IOMMU Group 0 04:00.0 Audio device [0403]: Intel Corporation DG2 Audio Controller [8086:4f92]
IOMMU Group 1 00:02.0 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Renoir PCIe Dummy Host Bridge [1022:1632]
IOMMU Group 1 00:02.1 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] Renoir/Cezanne PCIe GPP Bridge [1022:1634]
IOMMU Group 1 00:02.2 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] Renoir/Cezanne PCIe GPP Bridge [1022:1634]
IOMMU Group 1 05:00.0 USB controller [0c03]: Advanced Micro Devices, Inc. [AMD] 500 Series Chipset USB 3.1 XHCI Controller [1022:43ee]
IOMMU Group 1 05:00.1 SATA controller [0106]: Advanced Micro Devices, Inc. [AMD] 500 Series Chipset SATA Controller [1022:43eb]
IOMMU Group 1 05:00.2 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] 500 Series Chipset Switch Upstream Port [1022:43e9]
IOMMU Group 1 06:00.0 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] Device [1022:43ea]
IOMMU Group 1 06:04.0 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] Device [1022:43ea]
IOMMU Group 1 06:08.0 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] Device [1022:43ea]
IOMMU Group 1 07:00.0 Serial Attached SCSI controller [0107]: Broadcom / LSI SAS2308 PCI-Express Fusion-MPT SAS-2 [1000:0087] (rev 05)
IOMMU Group 1 08:00.0 Non-Volatile memory controller [0108]: Shenzhen Longsys Electronics Co., Ltd. Lexar NM610 PRO NVME SSD (DRAM-less) [1d97:1202] (rev 01)
IOMMU Group 1 09:00.0 Ethernet controller [0200]: Realtek Semiconductor Co., Ltd. RTL8111/8168/8411 PCI Express Gigabit Ethernet Controller [10ec:8168] (rev 15)
IOMMU Group 1 0a:00.0 Non-Volatile memory controller [0108]: Sandisk Corp WD Blue SN580 NVMe SSD (DRAM-less) [15b7:5041] (rev 01)
IOMMU Group 2 00:08.0 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Renoir PCIe Dummy Host Bridge [1022:1632]
IOMMU Group 2 00:08.1 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] Renoir Internal PCIe GPP Bridge to Bus [1022:1635]
IOMMU Group 2 0b:00.0 VGA compatible controller [0300]: Advanced Micro Devices, Inc. [AMD/ATI] Renoir [1002:1636] (rev d9)
IOMMU Group 2 0b:00.1 Audio device [0403]: Advanced Micro Devices, Inc. [AMD/ATI] Renoir Radeon High Definition Audio Controller [1002:1637]
IOMMU Group 2 0b:00.2 Encryption controller [1080]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 10h-1fh) Platform Security Processor [1022:15df]
IOMMU Group 2 0b:00.3 USB controller [0c03]: Advanced Micro Devices, Inc. [AMD] Renoir/Cezanne USB 3.1 [1022:1639]
IOMMU Group 2 0b:00.4 USB controller [0c03]: Advanced Micro Devices, Inc. [AMD] Renoir/Cezanne USB 3.1 [1022:1639]
IOMMU Group 2 0b:00.6 Audio device [0403]: Advanced Micro Devices, Inc. [AMD] Family 17h/19h HD Audio Controller [1022:15e3]
IOMMU Group 3 00:14.0 SMBus [0c05]: Advanced Micro Devices, Inc. [AMD] FCH SMBus Controller [1022:790b] (rev 51)
IOMMU Group 3 00:14.3 ISA bridge [0601]: Advanced Micro Devices, Inc. [AMD] FCH LPC Bridge [1022:790e] (rev 51)
IOMMU Group 4 00:18.0 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Renoir Device 24: Function 0 [1022:1448]
IOMMU Group 4 00:18.1 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Renoir Device 24: Function 1 [1022:1449]
IOMMU Group 4 00:18.2 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Renoir Device 24: Function 2 [1022:144a]
IOMMU Group 4 00:18.3 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Renoir Device 24: Function 3 [1022:144b]
IOMMU Group 4 00:18.4 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Renoir Device 24: Function 4 [1022:144c]
IOMMU Group 4 00:18.5 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Renoir Device 24: Function 5 [1022:144d]
IOMMU Group 4 00:18.6 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Renoir Device 24: Function 6 [1022:144e]
IOMMU Group 4 00:18.7 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Renoir Device 24: Function 7 [1022:144f]

sudo dmesg | grep IOMMU
[    0.014506] DMAR: IOMMU enabled
[    0.397703] pci 0000:00:00.2: AMD-Vi: IOMMU performance counters supported
[    0.398674] pci 0000:00:00.2: AMD-Vi: Found IOMMU cap 0x40
[    0.603615] perf/amd_iommu: Detected AMD IOMMU #0 (2 banks, 4 counters/bank).
[    0.858701] AMD-Vi: AMD IOMMUv2 loaded and initialized

sudo lspci -nn | grep -iP "VGA|audio"
03:00.0 VGA compatible controller [0300]: Intel Corporation DG2 [Arc A380] [8086:56a5] (rev 05)
04:00.0 Audio device [0403]: Intel Corporation DG2 Audio Controller [8086:4f92]
0b:00.0 VGA compatible controller [0300]: Advanced Micro Devices, Inc. [AMD/ATI] Renoir [1002:1636] (rev d9)
0b:00.1 Audio device [0403]: Advanced Micro Devices, Inc. [AMD/ATI] Renoir Radeon High Definition Audio Controller [1002:1637]
0b:00.6 Audio device [0403]: Advanced Micro Devices, Inc. [AMD] Family 17h/19h HD Audio Controller [1022:15e3]

But now no devices (neither iGPU nor GPU) are listed as available GPUs for isolation, so I can't assign them to my VM.

What's wrong? Could I fix it with manual editing, or by changing my BIOS settings?
I am running TrueNAS SCALE 24.10.2.

Hardware:

Mainboard: ASRock B550M Pro4
CPU: AMD Ryzen 5 Pro 4650G
GPU: Intel Arc A380
PCIe devices: HP H220 LSI 9205-8i HBA

Relevant BIOS settings:

Advanced\CPU Configuration\SVM Mode: Enabled

Advanced\PCI Configuration\Above 4G Decoding (& Re-Size BAR Support): Disabled
Advanced\PCI Configuration\SR-IOV Support: Enabled

Advanced\AMD PBS\Graphics Features\Primary Video Adaptor: Int Graphics (IGD)

Advanced\AMD CBS\NBIO Common Options\IOMMU: Enabled
Advanced\AMD CBS\NBIO Common Options\DMA Protection: Enabled
Advanced\AMD CBS\NBIO Common Options\DMAr Support: Enabled
Advanced\AMD CBS\NBIO Common Options\PCIe ARI Support: Enabled
Advanced\AMD CBS\NBIO Common Options\PCIe ARI Enumeration: Enabled
Advanced\AMD CBS\NBIO Common Options\PSPP Policy: Disabled

Advanced\AMD CBS\NBIO Common Options\GFX Configuration\iGPU Configuration: UMA_AUTO

Boot\CSM\CSM: Disabled

Hi, I am having the same issue. Did you figure out a fix, by chance?

In the above example, the GPU (Intel Arc) is not in its own IOMMU group. It is in Group 0, together with other devices.

GPU passthrough only works on TrueNAS if there are no other devices in the same IOMMU group (other than an audio device belonging to the GPU).

Consumer motherboards are notorious for putting everything in one group. Server/workstation boards often separate devices nicely.
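
To make the rule above concrete, here is a rough Python sketch that checks a flat listing in the format posted in the first post ("IOMMU Group N <address> <description> [class]: ..."). The function names and the line format are assumptions for illustration, and this is not how TrueNAS itself decides. It simply treats any group member that is neither a VGA controller (class 0300) nor an audio device (class 0403) as blocking passthrough, which is a simplification: it cannot tell whether an audio device really belongs to the GPU.

```python
import re
from collections import defaultdict

# Assumed line format, as in the listing from the first post:
#   "IOMMU Group 0 03:00.0 VGA compatible controller [0300]: Intel ..."
LINE_RE = re.compile(
    r"^IOMMU Group (?P<group>\d+)\s+(?P<bdf>[0-9a-f:.]+)\s+.*?\[(?P<cls>[0-9a-f]{4})\]:"
)

VGA_CLASS = "0300"    # PCI class code for "VGA compatible controller"
AUDIO_CLASS = "0403"  # PCI class code for "Audio device"


def parse_groups(lines):
    """Map group number -> list of (address, class code) tuples."""
    groups = defaultdict(list)
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match:  # lines in any other format are silently skipped
            groups[int(match.group("group"))].append(
                (match.group("bdf"), match.group("cls"))
            )
    return dict(groups)


def gpu_isolation_report(lines):
    """Map each VGA device address -> other devices sharing its group.

    An empty list means the group holds only VGA/audio devices.
    """
    report = {}
    for devices in parse_groups(lines).values():
        for bdf, cls in devices:
            if cls != VGA_CLASS:
                continue
            report[bdf] = [
                other for other, other_cls in devices
                if other_cls not in (VGA_CLASS, AUDIO_CLASS)
            ]
    return report


# Three lines taken from Group 0 of the listing in the first post.
SAMPLE = [
    "IOMMU Group 0 00:01.1 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] Renoir PCIe GPP Bridge [1022:1633]",
    "IOMMU Group 0 03:00.0 VGA compatible controller [0300]: Intel Corporation DG2 [Arc A380] [8086:56a5] (rev 05)",
    "IOMMU Group 0 04:00.0 Audio device [0403]: Intel Corporation DG2 Audio Controller [8086:4f92]",
]

print(gpu_isolation_report(SAMPLE))  # {'03:00.0': ['00:01.1']}
```

For the Arc A380 the report is non-empty (the bridge at 00:01.1 shares Group 0), which matches the situation described above.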

I gave up eventually. I ended up installing Jellyfin as a TrueNAS app (and not via a Docker VM) and assigned my GPU to it.

A bit frustrating, because otherwise I wouldn’t have bought a CPU with integrated graphics in the first place (which was expensive and hard to get).

So my problem is hardware-related and cannot be fixed with software tweaks?

On TrueNAS, sadly yes. It's a hardware limitation.

Look at my IOMMU groups for comparison. There are a total of 93, and the GPU sits in a group of its own (Group 1):

IOMMU Group 0:
        b2:00.0 PCI bridge [0604]: Intel Corporation Sky Lake-E PCI Express Root Port A [8086:2030] (rev 07)
IOMMU Group 1:
        b3:00.0 VGA compatible controller [0300]: NVIDIA Corporation AD104 [GeForce RTX 4070] [10de:2786] (rev a1)
        b3:00.1 Audio device [0403]: NVIDIA Corporation Device [10de:22bc] (rev a1)
IOMMU Group 2:
        64:00.0 PCI bridge [0604]: Intel Corporation Sky Lake-E PCI Express Root Port A [8086:2030] (rev 07)
IOMMU Group 3:
        64:02.0 PCI bridge [0604]: Intel Corporation Sky Lake-E PCI Express Root Port C [8086:2032] (rev 07)
IOMMU Group 4:
        66:00.0 Non-Volatile memory controller [0108]: Intel Corporation Optane SSD 900P Series [8086:2700]
IOMMU Group 5:
        16:00.0 PCI bridge [0604]: Intel Corporation Sky Lake-E PCI Express Root Port A [8086:2030] (rev 07)
IOMMU Group 6:
        16:02.0 PCI bridge [0604]: Intel Corporation Sky Lake-E PCI Express Root Port C [8086:2032] (rev 07)
IOMMU Group 7:
        16:03.0 PCI bridge [0604]: Intel Corporation Sky Lake-E PCI Express Root Port D [8086:2033] (rev 07)
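
A listing like the one above can be rebuilt from sysfs, where Linux exposes each group as a directory under /sys/kernel/iommu_groups/<N>/devices/. The following is only a minimal Python sketch; the function name and the `root` parameter are made up for illustration (the parameter exists so the function can be pointed at a test directory). It prints bare PCI addresses without the device descriptions.

```python
from pathlib import Path


def list_iommu_groups(root="/sys/kernel/iommu_groups"):
    """Return {group number: sorted list of PCI addresses} read from sysfs."""
    base = Path(root)
    groups = {}
    if not base.is_dir():
        return groups  # directory missing: no IOMMU groups exposed
    for group_dir in base.iterdir():
        devices_dir = group_dir / "devices"
        if not group_dir.name.isdigit() or not devices_dir.is_dir():
            continue  # skip anything that is not a numbered group
        # Each entry in devices/ is named after a PCI address,
        # e.g. 0000:b3:00.0
        groups[int(group_dir.name)] = sorted(
            entry.name for entry in devices_dir.iterdir()
        )
    return dict(sorted(groups.items()))


if __name__ == "__main__":
    for group, devices in list_iommu_groups().items():
        print(f"IOMMU Group {group}:")
        for address in devices:
            print(f"        {address}")
```

If this prints nothing, no groups are exposed, which is what one would expect with the IOMMU disabled in the BIOS. Running `lspci -nns <address>` on an address adds the description and the vendor/device IDs.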

On Proxmox or vanilla Debian, you could use the ACS override patch to force devices into separate groups, at the cost of weaker isolation between them.

Any clue what I should watch out for in motherboard specifications to get cleanly separated groups when buying my next board?

Yes. Buy a server-grade board for either an Intel Xeon or an AMD EPYC CPU.

I refuse to believe that there aren’t any AM4/AM5 boards out there with correct separation.
That would mean every AM4 user cannot use GPU passthrough.

Maybe some manufacturers or more expensive boards support it?

There is no certain way to know before buying and trying it out. That being said, virtualisation and PCI passthrough are more of a “server” thing, less of an “office or gaming” thing. Therefore manufacturers don't optimise their consumer boards/BIOS for cleanly separated IOMMU groups.

The problem might be solved in software… if you could convince ASRock technical support to release a BIOS with proper separation of devices into IOMMU groups. (Of course, for a cheap consumer motherboard, that is hopeless…)
Everything would likely work to satisfaction with a Gigabyte MC12-LE0 board (Gigabyte even recently released an updated BIOS to enable the iGPU on APUs like your PRO 4650G) or a B550D4 board from ASRock Rack (the sister company addressing the server market).

I had the MC12-LE0 on my list, but since Wolfgang promoted it in one of his videos the price skyrocketed. Should have bought one for 159 Euro a couple of months ago… Sigh