Intel ARC temperature and fan control sensor support

Problem/Justification
(What is the problem you are trying to solve with this feature/improvement or why should it be considered?)

A lot of people use TrueNAS Scale for streaming media and use ARC GPUs to transcode.
Currently there is no way that I know of to monitor fan speed and temperature.
Fan control for these cards is very wonky, so it would be beneficial to be able to monitor fan speed and GPU temperature.

Impact
(How is this feature going to impact all TrueNAS users? What are the benefits and advantages? Are there disadvantages?)
It would not affect all TrueNAS users. Just the ones using their NASes for streaming media, although there are a lot of those.
It would make monitoring these excellent transcoding GPUs much easier.

User Story
(Please give a short description on how you envision some user taking advantage of this feature, what are the steps a user will follow to accomplish it)

I would personally be able to try and create a custom cooling solution for my Sparkle A310 Eco. I have not yet tried it because I would not be able to control the fan based on GPU temperature without knowing it.

A patch was added to the i915 driver quite recently.
It would have to be backported and integrated into the TrueNAS Scale kernel.

I wholly understand if this feature request is dismissed, as transcoding is not one of the primary uses of a TrueNAS system and the use to enterprise customers would likely be nil. I still think a lot of users would benefit from this and hope this may be considered.

Fan speed monitoring (not control) should show up in kernel 6.12, which will hopefully be included in Fangtooth (25.04), but this is an additional feature ask.

That’s nice to know.
I couldn’t find the kernel version on the Fangtooth release page yet.
Temperature sensing would be far more important for my specific use case, though.

You can expect some of that information to start getting built out before the end of the year.

That’s also great to hear. I’m excited about what’s coming with 25.04.
24.10 was probably the most useful update for me since I started using TrueNAS (or rather FreeNAS back then).

I’m just curious: I see you have a Ryzen Pro 4650G APU, which is exactly the same as my system. I’m just using the built-in iGPU for transcoding, and it seems to work fine (in fact, this was exactly why I got this CPU in the first place, so I wouldn’t need an external GPU and could also use ECC memory). Why did you add an extra GPU card?

Tone mapping for HDR content, (arguably) higher-quality transcoding, the ability to transcode more streams simultaneously, support for more codecs (e.g. AV1), having the budget and wanting to fill up available PCIe slots, etc.

As @Fleshmauler said: HDR tone mapping, AV1 and overall speed. But most importantly, the iGPU couldn’t be properly passed through to my Ubuntu VM, which I used before TrueNAS could properly handle a Docker stack by itself.

I see, thanks for the reply. Out of curiosity, do you have any idea what the idle power draw on that Intel GPU card is?

Personally, I haven’t had to deal with AV1 video much yet (maybe it’ll be an issue in the future though), but the HDR tonemapping could become an issue, I’m not sure yet.

With ASPM enabled it’s basically negligible; I’d guess 1-3 W.

I am out of votes, but owning an Arc GPU myself, this would be a welcome addition.

I just updated my NAS to Fangtooth and I don’t get a temperature reading via sensors:

i915-pci-0300
Adapter: PCI adapter
in0:           0.00 V
fan1:        3035 RPM
power1:           N/A  (max =  31.25 W)
energy1:      29.18 kJ

Edit: ah damn 6.12 only brought fan speed reporting…
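
For anyone who wants to check this without lm-sensors, the same values can be read straight from the kernel's hwmon sysfs tree. This is only a minimal sketch. It assumes the card's hwmon device is named i915 (as in the sensors output above) and that a temperature, if a kernel exposes one, would appear as temp1_input. The function name i915_hwmon_report is made up for illustration.

```shell
# Print fan speed and (if exposed) temperature for every hwmon device named "i915".
# The hwmon root can be passed as $1, which makes it possible to try the function on a fake tree.
i915_hwmon_report() {
    local root="${1:-/sys/class/hwmon}"
    local dir
    for dir in "$root"/hwmon*; do
        [ -r "$dir/name" ] || continue
        [ "$(cat "$dir/name")" = "i915" ] || continue
        if [ -r "$dir/fan1_input" ]; then
            echo "fan1: $(cat "$dir/fan1_input") RPM"
        fi
        if [ -r "$dir/temp1_input" ]; then
            # hwmon reports temperatures in millidegrees Celsius
            echo "temp1: $(( $(cat "$dir/temp1_input") / 1000 )) C"
        else
            echo "temp1: not exposed by this kernel"
        fi
    done
}
```

Calling i915_hwmon_report with no argument reads /sys/class/hwmon. On a kernel that only brings fan speed reporting, I would expect it to print the fan line followed by the "not exposed" line.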

From sensors we can see the fan ramping up and down, but there is no way to control it for now, which is sad.

I just recently came across this thread on the Jellyfin forums where @TheDreadPirate wrote an excellent guide on how to update the ARC firmware on Linux.

I followed this specific post for my Sparkle A310 Eco and the fan isn’t ramping up and down anymore!

My feature request is still relevant (or so I think) as temperature and fan speed reporting (and control) would still be very useful for monitoring or maybe noise tuning.

Thank you for the update. I will give it a try when upgrading my instance to 25.04.2.

Thanks for the info @TheColin21.

I didn’t follow that method, because I had a spare Sparkle A310 ECO and a Dell R730 I use as a test chassis. I initially downloaded the Windows Server 2022 180-day Evaluation and installed that. DON’T DO THAT! The Sparkle card is not supported under Windows Server, and it blue-screens and crashes when you try to install the Intel driver.

I then downloaded Windows 11 Enterprise (90-day evaluation), and burned it to a USB stick using Rufus with all the “hacks” enabled (disable Bitlocker, disable TPM 2.0 check, create local account etc). After the installation, and after installing all the Windows Updates, I ran the Arc Firmware Tool (Windows tool) version 1.44, in order to get a baseline on the stock firmwares. Note that I didn’t use this tool for any updates - it was purely to check driver/firmware/oprom versions.

I then went to the Intel® Arc™ A310 Graphics page and pulled down the current latest driver, version 32.0.101.6972 (“gfx_win_101.6972.exe”), which also contains firmware updates.

I installed that, and then checked the firmware versions afterwards using the aforementioned Arc Firmware Tool. Diffing the logs pre and post, I can see the main things that changed in terms of firmware / OPROMs are as follows:

FW Version: DG02_2.2353 (old)
FW Version: DG02_2.2357 (new)
OPROM CODE Version: 14 00 2C 04 00 00 00 00 (old)
OPROM CODE Version: 14 00 31 04 00 00 00 00 (new)
OPROM DATA Version: 14 00 2C 04 00 00 00 00 (old and new / no change)
Device: Fw Data Version: Format 1, Major Version: 101, OEM Manufacturing Data Version: 2, Major VCN: 1 (old)
Device: Fw Data Version: Format 1, Major Version: 101, OEM Manufacturing Data Version: 3, Major VCN: 1 (new)

The specific post that you linked to above does all of these, except for the OPROM Code version. It’s incorrect in saying there’s no OPROM change - there is a change. If you want to get the equivalent upgrade to a native Intel software installation on Windows 11, then I recommend also upgrading the OPROM Code version, which you can extract from the Windows file download (in gfx_win_101.6972\Graphics\ifwi\acm).

The OPROM helps the motherboard recognize and initialize the graphics card before the operating system loads (during POST). If your system uses a legacy BIOS (not UEFI), the OPROM ensures that the graphics card can output video during boot, allowing you to see the POST screen and access BIOS settings. In UEFI mode, though (which most people should be using), the OPROM is not used after POST, as the system uses UEFI drivers instead.

Sparkle use Intel’s default ROMs, which include OPROM, which is why it was updated when updating using the Intel driver. For best compatibility I recommend upgrading OPROM as well - especially if using legacy BIOS (not UEFI). If you’re not using the display output from the card, and just using it for transcode, then it probably doesn’t matter. If it works, it works :slight_smile: But I like to apply all the updates, rather than skipping some.

Thanks for the extensive description :smiley:

I could not find a newer OPROM version. Could you dump it?
I would’ve flashed it if I found a newer one.

Edit: nvm I didn’t read properly. Overlooked this:

My board does have UEFI and I would actually like it not to output to the A310 but that’s another story

I just updated a couple of my Sparkle A310 ECO cards in production boxes. They run Proxmox VE 8.4, but unfortunately the ldd version in it (Debian 12 based) is too old, so instead I booted from a recent Ubuntu Desktop live CD. Here’s my text file of commands for the whole process.

Find device path

It will usually be “/dev/mei1” (or “/dev/mei2”)

sudo ./igsc list-devices

Check versions

sudo ./igsc fw version --device /dev/mei1
sudo ./igsc oprom-data version --device /dev/mei1
sudo ./igsc oprom-code version --device /dev/mei1
sudo ./igsc fw-data version --device /dev/mei1
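
The four version checks can also be wrapped in a small loop, which makes it easy to save the output before and after flashing. This is only a sketch: the igsc subcommands are exactly the ones listed above, while the function name igsc_versions and the IGSC override variable are my own additions for illustration.

```shell
# Print the version of each flashable component for one device.
# IGSC can be overridden (for example IGSC=echo) to preview the commands without running them.
igsc_versions() {
    local dev="${1:-/dev/mei1}"
    local igsc="${IGSC:-sudo ./igsc}"
    local part
    for part in fw oprom-data oprom-code fw-data; do
        echo "== $part =="
        # $igsc is intentionally unquoted so that "sudo ./igsc" splits into two words
        $igsc "$part" version --device "$dev"
    done
}
```

For example, running igsc_versions /dev/mei1 and redirecting the output to a file before and after the update gives two files that can be compared with diff.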

FLASHING

Flash Firmware Code

sudo ./igsc fw update --device /dev/mei1 --image ./fwcode/dg2_gfx_fwupdate_SOC2.bin

No update currently for OPROM Data (for Sparkle A310 ECO)

Flash OPROM Code

sudo ./igsc oprom-code update --device /dev/mei1 --image ./opromcode/dg2_c_oprom.rom

Flash Firmware Data

sudo ./igsc fw-data update --device /dev/mei1 --image ./fwdata/dg2_sparkle_lp-eco-a310_config-data.bin

Reboot and you’re done.

Note that I had to have the “igsc” and “libigsc.so.0” files in the directory I ran this from, and I had to make igsc executable. This is all stuff you know about, but maybe it’s good to have it all in the one place here, and someone finds it useful.
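
If you would rather run the three flashing steps in one go, something like the following could work. Treat it as a rough sketch under a few assumptions: the image paths are the ones from my commands above (specific to the Sparkle A310 ECO), and the function name flash_a310_eco and the IGSC override variable are invented for this example. It checks that all three image files exist before flashing anything, and it stops at the first step that fails.

```shell
# Flash firmware code, OPROM code and firmware data in the order listed above.
# IGSC can be overridden (for example IGSC=echo) for a dry run that only prints the commands.
flash_a310_eco() {
    local dev="${1:-/dev/mei1}"
    local igsc="${IGSC:-sudo ./igsc}"
    local fw=./fwcode/dg2_gfx_fwupdate_SOC2.bin
    local oprom=./opromcode/dg2_c_oprom.rom
    local data=./fwdata/dg2_sparkle_lp-eco-a310_config-data.bin
    local img
    # Refuse to start if any image is missing, so the card is never left half-updated by a typo.
    for img in "$fw" "$oprom" "$data"; do
        if [ ! -f "$img" ]; then
            echo "missing image: $img" >&2
            return 1
        fi
    done
    # Each step only runs if the previous one succeeded.
    $igsc fw update --device "$dev" --image "$fw" &&
        $igsc oprom-code update --device "$dev" --image "$oprom" &&
        $igsc fw-data update --device "$dev" --image "$data"
}
```

Setting IGSC=echo before calling flash_a310_eco /dev/mei1 prints the three commands without touching the card, which is a cheap way to double-check the device path and file names first.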
