GPU integrations for AI/Apps - Fangtooth 25.04.0

Thread created at the request of Captain_Morgan:

Searching for Radeon (on this forum, not the old one, because why bother?) brings up multiple threads in relation to transcoding on Jellyfin and Plex.

I am using a PowerColor Red Devil AMD Radeon RX 6750 XT 12GB graphics card, and it is set up to transcode for Jellyfin, which runs as an app. The OS is TrueNAS 25.04.0, upgraded via EE > 25.04 Beta > RC1 > .0.

Evidence of effective transcoding from the Jellyfin log:


2025-04-27 07:13:34.244395+00:00[08:13:34] [INF] [21] MediaBrowser.MediaEncoding.Transcoding.TranscodeManager: /usr/lib/jellyfin-ffmpeg/ffmpeg -analyzeduration 200M -probesize 1G -f matroska -init_hw_device drm=dr:/dev/dri/renderD128 -init_hw_device vaapi=va@dr -init_hw_device vulkan=vk@dr -filter_hw_device vk -hwaccel vaapi -hwaccel_output_format vaapi -noautorotate -i file:"XXX" -noautoscale -map_metadata -1 -map_chapters -1 -threads 0 -map 0:0 -map 0:1 -map -0:s -codec:v:0 h264_vaapi -rc_mode VBR -b:v 9208800 -maxrate 9208800 -bufsize 18417600 -sei -a53_cc -force_key_frames:0 "expr:gte(t,n_forced*3)" -vf "setparams=color_primaries=bt2020:color_trc=smpte2084:colorspace=bt2020nc,hwmap=derive_device=drm,format=drm_prime,libplacebo=upscaler=none:downscaler=none:w=1920:h=804:format=bgra:tonemapping=bt.2390:peak_detect=0:color_primaries=bt709:color_trc=bt709:colorspace=bt709,format=vulkan,hwmap=derive_device=vaapi,format=vaapi,scale_vaapi=format=nv12" -codec:a:0 libfdk_aac -ac 2 -vbr:a 5 -af "volume=2" -copyts -avoid_negative_ts disabled -max_muxing_queue_size 2048 -f hls -max_delay 5000000 -hls_time 3 -hls_segment_type mpegts -start_number 0 -hls_segment_filename "/cache/transcodes/29e6aafb1dd5bde305d81fc944a0cee7%d.ts" -hls_playlist_type vod -hls_list_size 0 -y "/cache/transcodes/29e6aafb1dd5bde305d81fc944a0cee7.m3u8"

(As an aside, this is mostly a proof of concept. The CPU running TrueNAS is better at transcoding than the Radeon, but since I only use transcoding to watch stuff on a phone, power efficiency matters more to me than quality. That is likely not the case for most people in the community, who will need a GPU for decent transcoding ability.)

Now to the matter proper: Ollama, rocm, and driver availability.

The Radeon GPU is passed through to Ollama, which is set up as an app. Below is the output from the Ollama log:

2025-04-26 17:47:29.930256+00:00time=2025-04-26T17:47:29.930Z level=INFO source=gpu.go:217 msg="looking for compatible GPUs"

2025-04-26 17:47:29.948207+00:00time=2025-04-26T17:47:29.947Z level=WARN source=amd_linux.go:61 msg="ollama recommends running the https://www.amd.com/en/support/linux-drivers" error="amdgpu version file missing: /sys/module/amdgpu/version stat /sys/module/amdgpu/version: no such file or directory"

2025-04-26 17:47:29.949612+00:00time=2025-04-26T17:47:29.949Z level=WARN source=amd_linux.go:443 msg="amdgpu detected, but no compatible rocm library found. Either install rocm v6, or follow manual install instructions at https://github.com/ollama/ollama/blob/main/docs/linux.md#manual-install"

2025-04-26 17:47:29.949740+00:00time=2025-04-26T17:47:29.949Z level=WARN source=amd_linux.go:348 msg="unable to verify rocm library: no suitable rocm found, falling back to CPU"

2025-04-26 17:47:29.949789+00:00time=2025-04-26T17:47:29.949Z level=INFO source=gpu.go:377 msg="no compatible GPUs were discovered"

2025-04-26 17:47:29.949809+00:00time=2025-04-26T17:47:29.949Z level=INFO source=types.go:130 msg="inference compute" id=0 library=cpu variant="" compute="" driver=0.0 name="" total="251.5 GiB" available="7.1 GiB"
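For anyone who wants to check their own Ollama log without reading every line, here is a minimal Python sketch that pulls out the WARN messages and the compute library Ollama ended up with. It assumes the `key=value` layout seen in the lines above; the function names `parse_line` and `summarize_gpu_discovery` are made up for illustration and are not part of Ollama.

```python
import re

# Matches logfmt-style fields such as level=WARN or msg="some text".
FIELD_RE = re.compile(r'(\w+)=("(?:[^"\\]|\\.)*"|\S+)')

# Two lines copied from the Ollama log above.
SAMPLE_LOG = '''
2025-04-26 17:47:29.949740+00:00time=2025-04-26T17:47:29.949Z level=WARN source=amd_linux.go:348 msg="unable to verify rocm library: no suitable rocm found, falling back to CPU"
2025-04-26 17:47:29.949809+00:00time=2025-04-26T17:47:29.949Z level=INFO source=types.go:130 msg="inference compute" id=0 library=cpu variant="" compute="" driver=0.0 name="" total="251.5 GiB" available="7.1 GiB"
'''


def parse_line(line):
    """Return the key=value fields of one log line as a dict (quotes stripped)."""
    fields = {}
    for key, value in FIELD_RE.findall(line):
        if len(value) >= 2 and value.startswith('"') and value.endswith('"'):
            value = value[1:-1]
        fields[key] = value
    return fields


def summarize_gpu_discovery(log_text):
    """Collect WARN messages and the library named on the 'inference compute' line."""
    warnings = []
    library = None
    for line in log_text.splitlines():
        fields = parse_line(line)
        if fields.get("level") == "WARN":
            warnings.append(fields.get("msg", ""))
        if fields.get("msg") == "inference compute":
            library = fields.get("library")
    return {"library": library, "warnings": warnings}


if __name__ == "__main__":
    summary = summarize_gpu_discovery(SAMPLE_LOG)
    print("compute library:", summary["library"])
    for message in summary["warnings"]:
        print("warning:", message)
```

A `library` value of `cpu`, as in my log, means Ollama did not pick up the GPU.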

This raises a few questions. Is ROCm on TrueNAS? RDNA2 GPUs (specifically the 6800 XT, but ROCm works with the 6750 XT in Mint Cinnamon) are compatible with previous versions of ROCm (5.x was the last IIRC, so I acknowledge this may not be worth supporting from iXsystems’ perspective; the current version is 6.4.0). I acknowledge I am a lay person and I’m likely looking in the wrong places (lspci, sudo modinfo amdgpu etc.), but I can’t find any evidence that ROCm is installed.
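A rough way to look for the pieces on the host is to check whether a few device nodes and directories exist. The sketch below is only that, a sketch. `/dev/dri/renderD128` and `/sys/module/amdgpu/version` come from the logs above, while `/dev/kfd` (the compute device node ROCm containers are normally given) and `/opt/rocm` (the usual ROCm install prefix) are my assumptions about a typical Linux layout, not something confirmed for TrueNAS. `probe_paths` is a hypothetical helper name.

```python
import os

# Paths to check. Two come from the logs in this post; /dev/kfd and /opt/rocm
# are assumptions based on how ROCm is usually laid out on Linux.
CANDIDATE_PATHS = [
    "/dev/kfd",
    "/dev/dri/renderD128",
    "/sys/module/amdgpu",
    "/sys/module/amdgpu/version",
    "/opt/rocm",
]


def probe_paths(paths):
    """Return {path: True/False} depending on whether each path exists."""
    return {path: os.path.exists(path) for path in paths}


if __name__ == "__main__":
    for path, present in probe_paths(CANDIDATE_PATHS).items():
        print(("found  " if present else "missing"), path)
```

If `/sys/module/amdgpu` is found but `/sys/module/amdgpu/version` is missing, that matches the first Ollama warning and probably just means the kernel's own amdgpu driver is loaded rather than AMD's packaged one. A missing `/dev/kfd` would be the more worrying result, since ROCm containers are normally given that device.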

There are two posts on ROCm that the forum search identifies: the exchange that led to this post in the 25.04.0 release thread, and the Coral gasket driver feature request thread. It looks like this is not a high-profile discussion.

Gasket drivers have been requested for quite a while, and as Honeybadger notes, systemd-sysext is a potential route. Realistically, this should also be a potential route to enable ROCm if it is not installed.

The greater point I’d like to make, from a community perspective, is that if the Coral (etc.) drivers are not included, this reduces the effectiveness of TrueNAS as a Docker host (after all, what is the point of using a Docker app if it cannot use the tools because TrueNAS lacks the drivers?). This reinforces the strategy of relying on instances and virtual machines, but that then pushes against the strategy of using Docker apps natively, as we can’t split the equipment between the VM and bare metal.

And as a follow up point:

If Intel GPUs are not supported by Ollama, AMD GPUs are basically not supported by ROCm (not a TrueNAS fault but still a problem for iX to consider), and Nvidia GPUs can’t be passed through to the app, then is there actually any GPU integration in TrueNAS Community Edition?

I’m mindful of this chat comment from two months ago:

Reality of where storage is going these days. Gone are the days where 90%+ of your horsepower was needed to "do storage". Now we are in a situation where even a box the size of a Mac Mini has SO much extra horse power to spare that it is just itching to run more workload directly.

Plus with the growing abundance of Apps / Services which are far more efficient with direct access to underlying storage, its kind of inevitable. Hosting Photos? Immich over an SMB/NFS share would be ridiculous. Direct access to the data makes for so much better of an experience and with 90% of my CPU to spare, why not?

And this post this week:

iXsystems risk the downside of vendor lock in with none of the positives (can’t even promote specific hardware or profit from the lockin/lockout with no control over it). After all, using a Truenas system as the NVR is perhaps the lowest hanging fruit available for that compute power, as every business needs security cameras, for a company that exists to sell storage capacity.

It feels to me like there is more work to do, and given the community are the testers the road map kinda needs to be communicated to us ahead of anyone else.

I’m not sure what other people’s views and thoughts are, but I welcome any input from iX on the paths forward with GPU (and TPU) integration for AI and apps.

I’m not sure how to take this statement. Nvidia passthrough works great here; I’m using it for my Ollama. Intel GPU passthrough works just fine as well. It’s not TrueNAS’s issue that Ollama hasn’t added proper support for Intel GPU acceleration. Plenty of other apps leverage iGPUs and discrete Arc cards just fine, along with AMD as you already pointed out.

If you have specific problems with specific configurations, then we need debugs and tickets to examine and figure out what’s going on host-side. If it’s issues with specific applications not being able to leverage your particular GPU of choice, then that’s more of an upstream issue.

Coral TPU is another story, and one we should address at some point on our end though :slight_smile:

The one way I don’t want this to be taken is as accusatory! Blame Morgan, he put me up to it, it’s all his fault you’re stuck replying to me on Sunday morning! j/k

Thanks for the post. It’s good to hear that nvidia cards are working: my eye was drawn to the “Passthrough available (non-nvidia) GPUs option”, and short of dismantling two boxes I had no way to test.

I agree wholeheartedly with the Intel GPU issue being nothing to do with iX in terms of “fault”, and frankly the AMD issue is more likely to be 90% my own incompetence and 10% AMD hating RDNA development. That “If” at the start of the statement should be bolded, underlined and in size 72 font.

I must admit to keeping an eye out to see if that 24GB B580 rumour turns out to be true.

But the acknowledgement re: Coral is welcome :slight_smile: I think iX would make many people happy with that resolution, more so than anything else GPU TBH.

Can you run rocm-smi in a shell in the container?

You mean in the app screen, choose frigate, shell, run the command? Or in the cli? Or something else entirely?

I meant the container shell as described under the Workloads widget:

But I thought this was about ollama, not frigate…

Sorry, Frigate on the brain tbh, and cooking/eating dinner.

Output: /bin/sh: 1: rocm-smi: not found

Maybe the path isn’t set appropriately when you enter the container like that.
Do you see anything if you run ls -al /opt/rocm/bin in the container?

If you see rocm-smi there, try using the full path: /opt/rocm/bin/rocm-smi

No such file or directory :frowning:

Then I don’t know.
It looked like the ollama-rocm container would include the rocm-dev package.

Maybe it does, but my rudimentary Docker knowledge keeps me from fully understanding the workflow. As such I don’t know where it’s actually installed inside the container.

Thanks, you’ve already taught me two things I didn’t know :slight_smile:

I’m responsible for a lot…

But in this case, I was recommending getting specific on support for your GPU.

We need to decide whether it’s a TrueNAS issue, a Linux issue, or an app issue.

To be fair, I wrote the post in the knowledge that one reply should be “Is this really true?” And it would be a fair answer.

Because AMD ROCm support on RDNA is appalling, and I agree that this is not a TrueNAS issue. (That said, I’d like to know how to find out which version of ROCm Fangtooth is using. If it is 6.4.0, then no wonder I’ve been finding it difficult, as that version only supports 7900 cards, so there isn’t much to integrate.) And the Intel support is an Ollama issue. I’ve already accepted that this is likely not within the community knowledge base to fix because it isn’t broken, it’s working as designed.

If that leaves Nvidia as the only viable option, well, so be it. I’ll just have to swap graphics cards around again (the 4070 Super is in use in my main gaming PC, but when I don’t have time to game what’s the point, right?).

But Coral drivers have been a feature request for months, with significant community support and no iX response. Getting them into Truenas reduces the need for GPU integration (and is a low power solution).

It’s a multi-stage process to get features added:

  1. Identify the requirement and what is needed technically
  2. Size the demand (Feature Request)
  3. Schedule in a TrueNAS release (2 releases a year, so 6-12 months)
  4. Develop code and middleware integration
  5. Test QA and BETA

Some of the features happen because of upstream improvements in Linux kernel or Debian.

https://tracker.debian.org/pkg/rocminfo

Some web searching found that ROCm on Debian Bookworm is an ongoing issue (see this Reddit thread).