Devices Using PCIe Tunneling Having Issues (i.e. over USB4/TB4)

I have a NAS that has USB4/TB4 ports.

I have an NVIDIA GPU, a Broadcom (BCM) 10Gb network card, and a Mellanox ConnectX-4 Lx, all connected via TB4/USB4.

None of these devices work.

All of these devices work fine with the right mix of Linux kernels. For example, the Broadcom NIC and the NVIDIA GPU work fine this way on ZimaOS.

The common root cause for all of them is “Unable to change power state from D3cold to D0, device inaccessible”. I had this issue on Ubuntu 24.04, and there it was resolved by upgrading to kernel 6.8.4-060804: Ubuntu 24.04 - Unable to change power state from D3cold to D0, device inaccessible - Graphics / Linux / Linux - NVIDIA Developer Forums

With the rise of USB 40 Gbps-capable hardware (i.e. USB4 / TB4), the PCIe tunneling enabled by this technology becomes interesting in multiple NAS scenarios in the coming year. I don’t think it is urgent to fix this; I would advocate targeting it for the first 2025 release, as the first round of USB4/TB4 motherboards supporting the software connection manager will be released in Sept/Oct 2024.

For the NVIDIA 2080 Ti, dmesg only shows this (device 25 is the NVIDIA card):

[  471.106763] pci 0000:25:00.2: Unable to change power state from D3cold to D0, device inaccessible
[  583.781488] pci 0000:25:00.2: Unable to change power state from D3cold to D0, device inaccessible
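
For anyone hitting the same wall, this is roughly what I poke at to confirm the power state from userland. The addresses below are illustrative (my card is at 25:00.x; substitute your own), and the power_state / real_power_state attributes only exist on newer kernels and ACPI platforms:

cat /sys/bus/pci/devices/0000:25:00.0/power_state                      # D0 / D3hot / D3cold as the kernel sees it
cat /sys/bus/pci/devices/0000:25:00.0/firmware_node/real_power_state   # what ACPI reports, if the node exists
lspci -s 25:00.0 -vv | grep -i -e lnksta -e power                      # link and power status
echo 1 > /sys/bus/pci/devices/0000:25:00.0/remove                      # drop the device...
echo 1 > /sys/bus/pci/rescan                                           # ...and rescan to force a re-probe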

For the Mellanox, dmesg shows:

root@truenas[~]# dmesg | grep mlx
[    1.536529] mlx5_core 0000:39:00.0: firmware version: 14.32.1010
[    1.536568] mlx5_core 0000:39:00.0: 8.000 Gb/s available PCIe bandwidth, limited by 2.5 GT/s PCIe x4 link at 0000:00:07.2 (capable of 63.008 Gb/s with 8.0 GT/s PCIe x8 link)
[    3.546198] mlx5_core 0000:39:00.0: poll_health:819:(pid 0): Fatal error 1 detected
[    3.546234] mlx5_core 0000:39:00.0: print_health_info:423:(pid 0): PCI slot is unavailable
[   62.582296] mlx5_core 0000:39:00.0: wait_func:1172:(pid 257): INIT_HCA(0x102) timeout. Will cause a leak of a command resource
[   62.582308] mlx5_core 0000:39:00.0: mlx5_function_open:1242:(pid 257): init hca failed
[   62.597225] mlx5_core 0000:39:00.0: probe_one:1952:(pid 257): mlx5_init_one failed with error code -110
[   62.597250] mlx5_core 0000:39:00.0: mlx5_fw_fatal_reporter_err_work:679:(pid 98): health works are not permitted at this stage
[   62.598666] mlx5_core: probe of 0000:39:00.0 failed with error -110
[   62.600153] mlx5_core 0000:39:00.1: Unable to change power state from D3cold to D0, device inaccessible
[   62.600263] mlx5_core 0000:39:00.1: mlx5_pci_vsc_init:61:(pid 257): Failed to get valid vendor specific ID
[   62.600271] mlx5_core 0000:39:00.1: firmware version: 65535.65535.65535
[   62.600277] mlx5_core 0000:39:00.1: 8.000 Gb/s available PCIe bandwidth, limited by 2.5 GT/s PCIe x4 link at 0000:00:07.2 (capable of 4032.000 Gb/s with 64.0 GT/s PCIe x63 link)
[   82.602352] mlx5_core 0000:39:00.1: wait_fw_init:206:(pid 257): Waiting for FW initialization, timeout abort in 100s (0xffffffff)
[  102.606349] mlx5_core 0000:39:00.1: wait_fw_init:206:(pid 257): Waiting for FW initialization, timeout abort in 79s (0xffffffff)
[  122.610351] mlx5_core 0000:39:00.1: wait_fw_init:206:(pid 257): Waiting for FW initialization, timeout abort in 59s (0xffffffff)
[  142.614353] mlx5_core 0000:39:00.1: wait_fw_init:206:(pid 257): Waiting for FW initialization, timeout abort in 39s (0xffffffff)
[  162.618353] mlx5_core 0000:39:00.1: wait_fw_init:206:(pid 257): Waiting for FW initialization, timeout abort in 19s (0xffffffff)
[  182.610350] mlx5_core 0000:39:00.1: mlx5_function_enable:1145:(pid 257): Firmware over 120000 MS in pre-initializing state, aborting
[  182.610408] mlx5_core 0000:39:00.1: probe_one:1952:(pid 257): mlx5_init_one failed with error code -16
[  182.614454] mlx5_core: probe of 0000:39:00.1 failed with error -16
root@truenas[~]# dmesg | grep bnx2x  
[    1.479752] bnx2x 0000:0f:00.0: msix capability found
[    1.480196] bnx2x 0000:0f:00.0: part number 0-0-0-0
[   11.510310] bnx2x: [bnx2x_fw_command:3054(eth%d)]FW failed to respond!
[   11.510317] bnx2x 0000:0f:00.0 (unnamed net_device) (uninitialized): bc 7.13.75
[   11.510322] bnx2x: [bnx2x_fw_dump_lvl:794(eth%d)]\x013MCP PC at 0xffffffff
[   11.510324] bnx2x: [bnx2x_fw_dump_lvl:815(eth%d)]Trace buffer signature is missing.
[   11.510326] bnx2x: [bnx2x_prev_unload:10893(eth%d)]MCP response failure, aborting
[   11.510474] bnx2x 0000:0f:00.1: msix capability found
[   11.510485] bnx2x 0000:0f:00.0: msix capability found
[   11.510780] bnx2x 0000:0f:00.0: Unable to change power state from D3cold to D0, device inaccessible
[   11.510789] bnx2x 0000:0f:00.1: Unable to change power state from D3cold to D0, device inaccessible
[   11.510945] bnx2x: PCI device error, probably due to fan failure, aborting
[   11.511035] bnx2x: PCI device error, probably due to fan failure, aborting

(Funny, as the card has no fan and works perfectly with ZimaOS, a no-name startup NAS OS.)

This is the bleeding edge. You yourself realize that the hardware for doing this is brand new. If a newer kernel fixes this issue, then I suspect it will come to TrueNAS whenever it ends up in an LTS.

TrueNAS is a stable enterprise product. ZimaOS can afford to run on less conservative software trains.

Yup, that’s why I said this would be good for the 2025 release; this isn’t my first rodeo.

For sure. I’m just saying, I don’t know that any TrueNAS changes are required other than a future kernel :slight_smile: They are just PCI-E devices over a different fabric.

USB4 incorporates Thunderbolt 3; Thunderbolt 4 is a step above USB4.

Anyway, the official position so far is that Thunderbolt is not officially supported by TrueNAS and therefore not tested. I doubt that any enterprise customer of TrueNAS cares about Thunderbolt, and that defines the amount of money and developer time that iX is willing to spend on it.

@NickF1227 agreed, that’s likely, but I make no assumptions about what TrueNAS does and doesn’t choose in the kernel config / make menuconfig options, so I’m just logging it here for posterity on the off chance it might influence them.

To refine that further, Thunderbolt 4 is just a certification and a specific USB4 controller. It is the superset of all USB4 mandatory and optional features, including USB 40 Gbps, inter-domain channel bonding, tunneled PCIe, tunneled USB (as in 2.0 and 3.x), DP alt mode, mandatory PD support (I think), and lastly a logo and, in theory, a quality certification.

But as I am currently fighting with a motherboard manufacturer who says their board is TB4 yet can’t do some of the things I listed, I am unclear whether the certification is worth much (not to mention a cable supplier with certified cables that don’t do inter-domain channel bonding).

I am quite excited for when we have broad penetration and understanding of TB4 / USB 40 Gbps (and the upcoming USB 80 Gbps). This was my first dabble in all of this two years ago, and it turned out quite well: proxmox cluster proof of concept (github.com)

Wrong Storage OS :stuck_out_tongue:
In general, TB would “work” if the devices are just exposed as PCI-E devices. The problem you have right now is a power state problem, which is not surprising to see (very early days!).

But as @etorix said, this would not be an officially supported feature used in Enterprise, and iXsystems probably wouldn’t be testing it.

If there are bugs or things that don’t work once that kernel is here, you would have to make a “Feature Request”, and iX would have to choose whether to work on it or not. Additional packages would likely not be included, so it’s really up to the Linux kernel here. Latest Feature Requests topics - TrueNAS Community Forums

oops woke up 15m ago, still on top third of my giant mug-o-tea™

The power problem is a persistent on-again, off-again bug; in fact, there are earlier mainline kernels than the one in TrueNAS SCALE 24.10 that work just fine with this.

It is about finding the right combo of kernel version, PCIe power management settings, and kernel flags (examples below); there are 5.x series kernels where this works fine.
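
For reference, these are the sort of knobs I mean. pcie_port_pm=off and pcie_aspm=off are standard kernel parameters, but whether they actually help is very platform-dependent, and the PCI address below is just an example:

# Boot-time, on a stock Debian/Ubuntu dev box (not something the TrueNAS UI exposes):
#   GRUB_CMDLINE_LINUX_DEFAULT="quiet pcie_port_pm=off pcie_aspm=off"
# in /etc/default/grub, then update-grub and reboot.

# Per device, without a reboot: stop runtime PM from ever suspending it.
echo on > /sys/bus/pci/devices/0000:25:00.0/power/control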

So it is a testing and prioritization issue, i.e. is the scenario worth supporting and worrying about? It is not a question of ‘waiting for the upstream kernel to be fixed’, per se.

Thanks for pointing me to the feature request thread; I will go log something. I just started evaluating TrueNAS as a candidate for my future Synology replacement at the end of the year. eGPU over TB is core to that, as it’s a great way to add a GPU for AI and other VFs.

Right, but the LTS kernel that’s in EE is from the 6.6 branch. The link you shared for Ubuntu 24.04 was about moving to the 6.8 branch, which is not LTS.

root@prod[~]# uname -r
6.6.44-production+truenas

See: The Linux Kernel Archives - Releases

If a patch comes into the 6.6 branch, it may wind up in a TrueNAS 24.10.x release or potentially you’d have to wait for TrueNAS 25.04 (assuming the cadence is the same).

I assume you mean 6.6 LTS. I admit I have only had this working on normal kernel.org 6.1, 6.5, etc.

What else can I try with a kernel.org 6.6 LTS kernel? (I also want to see if it has my 6.5 connection and IPv6 fixes for thunderbolt-net.)

Seems like a useful link

https://www.reddit.com/r/debian/comments/18ko759/does_debian_12_have_kernel_66/

It suggests distros with 6.6, and also suggests compiling your own :wink:

TNS generally only adopts a Linux kernel once kernel.org declares it LTS. The most recent LTS kernel is currently 6.6.

Maybe 6.11 will be declared LTS… who knows. We’re about due.

Thanks, I can roll my own if/when I have time. I’ve done it before so it’s not a huge PITA, but I only do it once every year or two, so I’d have to put a dev image back together with all the right tools (luckily I saved the instructions, lol).
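
For my own notes (and anyone else who lands here), the rough recipe for a vanilla kernel.org 6.6 build on a Debian-ish dev image looks like this; the exact 6.6.x version is just an example:

apt install build-essential libncurses-dev libssl-dev libelf-dev flex bison bc
wget https://cdn.kernel.org/pub/linux/kernel/v6.x/linux-6.6.44.tar.xz
tar xf linux-6.6.44.tar.xz && cd linux-6.6.44
cp /boot/config-$(uname -r) .config     # start from the running kernel's config
make olddefconfig                       # take defaults for any new options
make -j$(nproc) bindeb-pkg              # build installable .deb packages
# then dpkg -i the resulting linux-image/linux-headers packages and reboot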

You probably are not interested; just updating the post in case anyone searches and arrives here.

Turns out this is a regression from Dragonfish; I only tried Dragonfish for the first time today.

Of course, this doesn’t mean iX will class it as a regression to be fixed :slight_smile:

[NAS-132394] NVIDIA Drivers Don’t Work with USB4 connected card [regression from 24.04.00] - iXsystems TrueNAS Jira

I’d certainly question the enterprise worthiness of, as mentioned above, hanging a (PCIe x16) GPU, a (PCIe x4) 10Gbps NIC, and a (PCIe x8) 25Gbps NIC all off a tunneled implementation of PCIe x4 – but I’d think there could be a sensible enterprise use case for being able to directly attach an external NVMe enclosure to the NAS, at full PCIe x4/NVMe/Thunderbolt 3 speed, for say:

  1. fast import (cp/rsync) of (read only mounted non-ZFS) data into NAS datastores
  2. with the right tooling, maybe even forensic drive image capture, ingest, and storage of images in TrueNAS, or
  3. fast zfs send/receives between ZFS internal and external media for backups/restores/etc. (see the sketch below this list).
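
For case 3, a minimal sketch of what I have in mind (pool and dataset names are placeholders):

zpool import extbackup                                     # pool living on the TB-attached NVMe
zfs snapshot -r tank/data@offsite-2024-11
zfs send -R tank/data@offsite-2024-11 | zfs receive -F extbackup/data
zpool export extbackup                                     # export before unplugging the enclosure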

I’ll have to test out how well this works now in EE and FT on my lab TrueNAS with TB3.

Just eyeballing now, I see module support:

root@TrueNAS02[/mnt/TANK_NVME/_Apps]# lsmod |grep bolt
intel_wmi_thunderbolt    16384  0
thunderbolt           491520  0
wmi                    36864  5 video,intel_wmi_thunderbolt,asus_wmi,wmi_bmof,mxm_wmi

root@TrueNAS02[/mnt/TANK_NVME/_Apps]# cat /sys/bus/thunderbolt/devices/domain0/security 
user

but no userland tools

root@TrueNAS02[/mnt/TANK_NVME/_Apps]# boltctl
zsh: command not found: boltctl
root@TrueNAS02[/mnt/TANK_NVME/_Apps]# tbtadm
zsh: command not found: tbtadm
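
(With the domain in “user” security mode and no boltctl/tbtadm, I’d expect devices can still be authorized by hand through sysfs; the 0-1 path below is illustrative, so check ls /sys/bus/thunderbolt/devices/ for the real one:)

ls /sys/bus/thunderbolt/devices/                       # e.g. domain0, 0-0 (host), 0-1 (first attached device)
cat /sys/bus/thunderbolt/devices/0-1/device_name
echo 1 > /sys/bus/thunderbolt/devices/0-1/authorized   # authorize the device for this boot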

I think hanging a GPU off USB4 for AI workloads is absolutely an (edge-case) enterprise scenario, along with any other PCIe hardware that cannot be easily added. I agree this doesn’t apply to most rack-mount hardware yet, which lacks USB4; far more likely those machines will get external MCIO connectors.

Secondly, TrueNAS has invested time in plenty of non-enterprise features for the community edition, so the argument “non-enterprise features won’t get invested in” is a specious one folks throw out when they think “I don’t want that, and it might take investment away from what I want, so I will dismiss it.”

We now see NAS devices shipping with USB4 (and soon USB5) in the community domain. As I accurately predicted, this requirement will start increasing among community-based installs on this hardware.

And, hilariously, remember this was a broad kernel change that affected almost all distributions of Linux, including older LTS kernels, and is easily fixed with a kernel parameter.

But this community is more interested in telling people what not to do than in actually helping, unfortunately. I never asked iX to support this; I just asked for help from the community, and all I got was grief. Not a friendly community at all, IMHO.

The basic idea is that users here value their data (if you don’t, why bother with ZFS to begin with?), hence should not play with poorly supported/unsupported features or waste time trying to coerce dubious hardware.
This is a geeky community, but not a hacking community.

The hardware is not dubious; it is industry standard.

The Linux kernel has these as mainline features across many versions, including LTS releases.

If you don’t want to use it, awesome, don’t; no one is asking you to. But stop telling others their requirements are not relevant.

To be clear, if TrueNAS wanted to pull out all the things they have built for non-enterprise scenarios, such as kernel drivers for hardware irrelevant in the enterprise space, I would understand that. But they haven’t, and they continue to add features and support for hardware irrelevant in the enterprise space.

You are gatekeeping what you think is geeky vs. hacking, and TBH, thanks for proving my point.

I am muting this topic and have nothing more to add. Marking as solved, as I solved it for myself.

I’m pretty sympathetic to Thunderbolt noodling – within arm’s reach I still have an Akitio Node (a TB3 PCIe x16/GPU enclosure from before the eGPU standard came and forbade daisy-chained TB3 ports) as well as a Sonnet Echo Express SE IIIe (a 3-slot TB3 PCIe adapter enclosure) – but (speaking as someone who has pushed the limits), realistically, it’s a pretty fragile scenario to try to hang too much off a tunneled external PCIe bus.

Connecting an eGPU for hashing, or as a boost for video editing work (where bandwidth to the GPU isn’t that relevant). Fine. Attaching an NVMe enclosure. Perfect. Both at the same time? You’re already pushing it.

That’s probably why the eGPU standard forbade the downstream TB3 port – a support nightmare, especially with marketing creating unrealistic expectations by saying, “One port does it all!” Way too many people would oversubscribe the bus in terms of bandwidth, PCI addressing, you name it. Things would get connected as if Thunderbolt were magic, something that would automatically connect anything, rather than a protocol with specific engineering limits, and things would flake out.

Aside from limited uses, the age of Thunderbolt and eGPUs has largely passed by. On egpu.io, there was certainly no shortage of troubleshooting all sorts of eGPU system hangs:
https://egpu.io/gsearch/?q=system+hang
And, over 2 years ago, the rise of OcuLink was evident:
https://egpu.io/osmeta-oculink-egpu-review-and-installation-guide/

Today, it makes less and less sense to piece together an octopus of external peripherals rather than just getting a different case/system to bring things in internally. Especially now, in an age of 128-PCIe-lane Epyc systems, it doesn’t make much sense to use an expensive external enclosure to hang some rare, expensive GPU off a slow, kludgy external bus where bandwidth is important, like if, say, you were doing deep learning or real-time inference involving datasets on your NAS. The same goes for work with LLMs: it’s easier to just disaggregate the GPU into another system and connect to it via an API, like with Ollama.
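
(For example, something like the following from the NAS side; the hostname and model are placeholders, and Ollama’s HTTP API listens on port 11434 by default:)

curl http://gpu-box.local:11434/api/generate -d '{
  "model": "llama3",
  "prompt": "Summarize the latest replication report",
  "stream": false
}'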

To be fair, I did shove an aging NVIDIA RTX 3080 Ti into one of my lab TrueNAS boxes – and it’s jury-rigged into the x4-mode slot (because I needed the x16 and x8-mode slots for a dual-U.2 NVMe adapter and a ConnectX-4 Lx adapter). And it’s a fun addition to a TrueNAS box, especially given TrueNAS’ built-in NVIDIA driver/container toolkit support, even if limited at present to CUDA 12.4.

With the proper container versions for CUDA 12.4, it’s perfectly functional for:

hashcat

# build a hashcat image pinned to the CUDA 12.4 runtime (to match the driver TrueNAS ships)
mkdir docker-hashcat-cuda-12.4
cd docker-hashcat-cuda-12.4
git clone https://github.com/dizcza/docker-hashcat.git
cd docker-hashcat
sed -i 's/cuda:12.8.1/cuda:12.4.1/' Dockerfile   # swap the Dockerfile's CUDA 12.8.1 base image for 12.4.1
docker build -t docker-hashcat:cuda12.4 .
cd ..
mkdir {data,dicts}                               # host dirs that get mounted into the container below
docker run --rm -it --gpus all -v ./data:/root/hashcat/data -v ./dicts:/root/hashcat/dicts docker-hashcat:cuda12.4

and

openai-whisper:

mkdir openai-whisper
cd openai-whisper/
git clone https://github.com/manzolo/openai-whisper-docker
cd openai-whisper-docker/
docker build -t openai-whisper .
cd ..
mkdir {audio-files,models}                       # input audio and the Whisper model cache
cp <file_to_voice_transcribe.mp4> audio-files/
# transcribe on the GPU; models are cached in ./models between runs
docker run --gpus all -it --rm -v ./models:/root/.cache/whisper -v ./audio-files:/app openai-whisper whisper <file_to_voice_transcribe.mp4> --device cuda --model turbo --language English --output_dir /app --output_format all

… and I’ll eventually maybe see how well it can handle some bigger stuff, hopefully tied into data in an Elastic stack running in TrueNAS:

# docker run --rm -ti --gpus=all nvcr.io/nvidia/pytorch:24.05-py3 bash 
# docker run --rm -ti --gpus=all nvcr.io/nvidia/tensorflow:24.05-tf2-py3 bash
# docker run --rm -ti --gpus=all nvcr.io/nvidia/morpheus/morpheus:24.06-runtime bash 

… but realistically, this is just a lab box, for experimentation and maybe some proof of concept stuff. It’d be even less viable, if I were trying to do tunneled PCIe on it.

Although, I am going to give a single TB3 NVMe a shot – especially because I believe an SED (Self-Encrypting Drive) NVMe will work attached via tunneled PCIe, whereas it doesn’t with USB-attached drives. So there’s an interesting use case there: zfs send/receive to an external SED that can be used for secure offsite backup/data transfer. The offsite data custodian wouldn’t even have to get the SED unlock password or the encrypted ZFS dataset keys.
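
(Roughly the workflow I have in mind, going from memory of the sedutil-cli docs, so treat the exact flags as approximate and the device path as a placeholder:)

sedutil-cli --scan                                            # check the TB3-attached NVMe shows up as Opal 2
sedutil-cli --setlockingrange 0 rw <SEDpassword> /dev/nvme1   # unlock the global locking range
sedutil-cli --setmbrdone on <SEDpassword> /dev/nvme1          # expose the real data area
zpool import extbackup                                        # then import the pool that lives on it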
