Fan/temperature issues with virtualized TrueNAS

I’m having several fan/temperature related issues on my new build, and while I appreciate that most of them relate to my underlying system rather than TrueNAS itself, there is a TrueNAS question; also, you guys are generally really good about this sort of thing so if you can help me out more broadly, I’d be extremely grateful!

tl;dr: I have a new system where the fans/sensors seem to be exclusively controlled by the mobo, and I need to know if there’s any way to control this via software; also, I don’t know how to get the disk temperatures (which virtualized TrueNAS can access) anywhere else.

I’ve recently finished a new build, running Proxmox on an i7-14700k CPU and an ASRock Rack W680D4U-2L2T motherboard. TrueNAS Scale is running on a VM on this, with an LSI 9211-8i passed through (supporting 6 spinning drives), and two other PCIe devices passed through (for U.2 NVMe drives). I have a half-dozen other VMs (one running a dozen Docker apps, the rest very small) and containers. This is all working fine; there’s rarely a heavy load on the system.

The most TrueNAS-related fan question is about how I can relate the HD temperatures to the fans. The broader question is that I’m not clear on how to control the fans at all, in Proxmox.

I currently have a CPU fan and two case fans (that blow over the HD backplane), all plugged into separate fan headers on the motherboard. My initial surprise was that any time there was any stress on the CPU, all the fans would noisily spin up; I didn’t have time to investigate it. Eventually, I installed fancontrol and related utilities, which revealed that the CPU temps are consistently rather high (in the 60s, C) even with almost no load, and also that there’s no fan data coming to the sensors; sensors gives 0 RPM for every fan, and increasing the fan divisor (as is sometimes recommended) has no effect. So pwmconfig can’t do anything.
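A quick way to confirm what `sensors` is reporting is to read the kernel's hwmon files directly. The sketch below assumes the standard Linux sysfs layout (`/sys/class/hwmon/hwmon*/fan*_input`); the function name is made up for illustration. If it returns nothing, or only zeros, the tachometer signals are most likely not reaching the OS at all (for example because the BMC owns them).

```python
from pathlib import Path


def read_hwmon_fans(root="/sys/class/hwmon"):
    """Return {"chip/fanN": rpm} for every fan*_input file under root."""
    readings = {}
    for chip_dir in sorted(Path(root).glob("hwmon*")):
        name_file = chip_dir / "name"
        # Fall back to the directory name if the driver exposes no "name" file.
        chip = name_file.read_text().strip() if name_file.exists() else chip_dir.name
        for fan_file in sorted(chip_dir.glob("fan*_input")):
            label = fan_file.name.replace("_input", "")
            try:
                readings[f"{chip}/{label}"] = int(fan_file.read_text().strip())
            except (OSError, ValueError):
                # Some drivers expose the file but fail on read; skip those.
                continue
    return readings


if __name__ == "__main__":
    fans = read_hwmon_fans()
    if not fans:
        print("No fan*_input files found; the kernel exposes no tachometer data.")
    for label, rpm in fans.items():
        print(f"{label}: {rpm} RPM")
```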

Poking into the various BIOS settings via the mobo’s BMC (and I’m really getting out of my depth here), I see that the BIOS at least knows that there are three fans plugged in, all of which are associated with the CPU sensor, which is why the case fans spin up when the CPU is under load. I’m exploring the open/closed loop control tables (I’m really out of my depth here). But meanwhile, the HD temperatures aren’t reported in the BIOS anywhere. TrueNAS shows that the HDs are at more or less fine temperatures–I assume it gets this directly from the disks–but I don’t know how to get this info to any other place. If I remove the case fans from the CPU temperature sensor, I don’t know where else to associate them. I also don’t know why the CPU temperature is so high even when the CPU fan (which is good, and which I’m fairly sure is correctly installed), is spinning.

So the main TrueNAS question would be if there’s a way to take the disk temperature data that it has access to, and get that anywhere else. Broader questions would be why the motherboard sensors aren’t going through to Proxmox, and why the CPU is so hot, and, well, anything else I’m not understanding about the process. Thanks.
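For getting the disk temperatures out of the TrueNAS VM, one possible route is to read them with `smartctl` inside the VM (where the HBA is visible) and hand the numbers to whatever needs them. This is only a sketch: it assumes smartmontools 7.0 or later, whose `-j` flag produces JSON with a `temperature.current` field, and the function names are hypothetical.

```python
import json
import subprocess


def parse_smart_temperature(smartctl_json):
    """Extract the current temperature (deg C) from `smartctl -j -A` output."""
    try:
        data = json.loads(smartctl_json)
    except json.JSONDecodeError:
        return None
    return data.get("temperature", {}).get("current")


def read_disk_temperature(device):
    """Run smartctl against one device, e.g. "/dev/sda"; needs root."""
    result = subprocess.run(
        ["smartctl", "-j", "-A", device],
        capture_output=True, text=True, check=False,
    )
    # smartctl uses a bit-mask exit status, so a non-zero code is not
    # necessarily fatal; rely on the JSON body instead.
    return parse_smart_temperature(result.stdout)
```

Calling `read_disk_temperature("/dev/sda")` for each passed-through disk gives a plain integer per drive, which can then be logged or sent elsewhere over the network.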

6 VMs + 12 Docker Containers + Apps is not almost no load. Any number of things could be happening in your system at any given point. Have you checked htop during these fan-ramps / temperature spikes to confirm that it is actually TrueNAS loading the system?
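htop only shows the moment you happen to be looking at it. To catch what the host is doing during a fan ramp, a small logger that samples `/proc/stat` on the Proxmox host may be easier. The following is a rough sketch of the usual busy-time calculation, with illustrative function names:

```python
def parse_cpu_times(stat_text):
    """Return (busy, total) jiffies from the aggregate "cpu" line of /proc/stat."""
    for line in stat_text.splitlines():
        fields = line.split()
        if fields and fields[0] == "cpu":
            # user nice system idle iowait irq softirq steal (guest times are
            # already included in user/nice, so they are left out).
            values = [int(v) for v in fields[1:9]]
            idle = values[3] + values[4]
            total = sum(values)
            return total - idle, total
    raise ValueError("no aggregate cpu line found")


def cpu_busy_percent(before_text, after_text):
    """Percent of CPU time spent non-idle between two /proc/stat snapshots."""
    busy0, total0 = parse_cpu_times(before_text)
    busy1, total1 = parse_cpu_times(after_text)
    elapsed = total1 - total0
    return 100.0 * (busy1 - busy0) / elapsed if elapsed > 0 else 0.0
```

Read `/proc/stat` twice a few seconds apart, pass both texts to `cpu_busy_percent`, and log the result next to the CPU temperature to see whether the ramps line up with real load.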

As for fan control and linking it to the disk temperatures: if you can’t control the fans connected directly to the motherboard via Linux, then you will either need to look into third-party solutions or rely on fan control via UEFI/BIOS. If you are unable to set your temperature source to a specific sensor (like a disk), then there isn’t really a whole lot you can do, though personally I’d be keeping it on CPU in any case, as letting the CPU run wild will invariably cause the rest of the system to heat up along with it.

I’ve checked the documentation for your motherboard, assuming W680D4U-2L2T is up to date, and could not find any mention of software control for the fans. I also couldn’t find any mention online. I’d consider looking at third-party fan controllers that have Linux support. The Corsair Commander Pro has had support since kernel version 5.9 and can likely be found for pretty cheap on eBay. You may be able to pass this through to TrueNAS as well and control the fans there to get the integration with disk temps that you’re looking for, though I can’t say this is an area I’m familiar with.

When I ran my TrueNAS system virtualized under ESXi it was in a Supermicro board. Supermicro uses IPMI to control fans.

I could use TrueNAS-hosted fan scripts to connect to the IPMI over the network to control the baseboard’s fans.
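For anyone wanting to try the same thing, here is a minimal sketch of talking to a BMC over the network with `ipmitool` from inside a VM. The host and user values are placeholders and the function names are invented for illustration. It only reads sensors: the commands that actually set fan speeds are vendor-specific raw commands, so they are deliberately left out rather than guessed at.

```python
import subprocess


def ipmitool_command(host, user, *args):
    """Build an ipmitool call over the LAN interface.

    -E makes ipmitool take the password from the IPMI_PASSWORD environment
    variable, which keeps it out of the process list.
    """
    return ["ipmitool", "-I", "lanplus", "-H", host, "-U", user, "-E", *args]


def parse_sdr(output):
    """Parse `ipmitool sdr type ...` lines into {sensor name: reading text}."""
    readings = {}
    for line in output.splitlines():
        columns = [c.strip() for c in line.split("|")]
        # Expected columns: name | sensor id | status | entity | reading
        if len(columns) == 5 and columns[4]:
            readings[columns[0]] = columns[4]
    return readings


def read_fans(host, user):
    """Query the BMC for its fan sensors; host and user are placeholders."""
    result = subprocess.run(
        ipmitool_command(host, user, "sdr", "type", "Fan"),
        capture_output=True, text=True, check=True,
    )
    return parse_sdr(result.stdout)
```

If `read_fans` returns sensible RPM values from inside the VM, the BMC is reachable and a TrueNAS-hosted script has something to work with.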

Of course, if ASRock Rack doesn’t support BMC/IPMI fan control that’s irrelevant :)

Yes, those pesky fan controls, Open Loop and Closed Loop drove me nuts as well.

First, I don’t know if you can pass the hard drive temperatures through to the motherboard or even have Proxmox control the fans.

Second, I don’t know your fan arrangement; however, proper flow is critical. The air must flow evenly across the hard drives to cool them. If you can space them out, that really helps. If you toss in a few photos then I can provide some good help.

As for fan speeds, concentrate on the CPU fan; it needs to work properly. If you are having 60C temps at idle, that is not good. You either have poor airflow through the case or the heatsink is not installed correctly. Fix that first. If TrueNAS is not throwing temperature alarms then maybe the drives are not too warm; check the SMART data to view the drive temps. Note which ones are warmer. If all your drives are 45C or less, that is good. Many prefer 40C and below. My drives tend to run at 42C, but my system is upstairs in a warm room.

So let’s say you have fixed the crazy CPU temps and idle is now at ~46C or less (60C is way too high).

Fan speeds. Ideally the CPU fan will be running at about 500 RPM when idle.
Next set your case fans to minimum speed, no open/closed loop, use manual setting so they start rotating once the power comes on.

You should be able to use other temperature sensors on the motherboard besides the CPU to control the case fans; however, it is just easier to fix them to a low speed. And make sure your component temps stay within specifications.

If you are unable to drop the hardware temp adequately then you may need to rethink your case arrangement or maybe just replace the case entirely. I have been known to modify my cases to relocate a fan or just add a fan. Whatever it takes to get and maintain proper air flow usually means a quieter system.

Good Luck

I would suggest looking at my Fan Control Script for a starting point. Since you are virtualizing TrueNAS you may need to substantially change the ipmiWrite and ipmiRead functions in order to communicate with the BMC. If you want the case fans to be controlled by the HDD temps you likely will have to run the script from TrueNAS rather than Proxmox.
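To give an idea of the core logic involved, here is a generic sketch of mapping the hottest disk temperature to a case-fan duty cycle. It is not taken from the Fan Control Script: the curve points and function names are made-up placeholders, and the step that sends the duty cycle to the BMC is omitted because it depends on the board.

```python
def duty_for_temperature(temp_c, curve):
    """Linearly interpolate a fan duty cycle (percent) from (temp, duty) points."""
    points = sorted(curve)
    if temp_c <= points[0][0]:
        return points[0][1]
    if temp_c >= points[-1][0]:
        return points[-1][1]
    for (t0, d0), (t1, d1) in zip(points, points[1:]):
        if t0 <= temp_c <= t1:
            return d0 + (d1 - d0) * (temp_c - t0) / (t1 - t0)


# Hypothetical curve: thresholds are illustrative, not taken from any script.
HDD_CURVE = [(35, 30), (40, 50), (45, 100)]


def case_fan_duty(disk_temps_c, curve=HDD_CURVE):
    """Drive the case fans from the hottest disk, rounded to a whole percent."""
    return round(duty_for_temperature(max(disk_temps_c), curve))
```

Using the hottest disk rather than the average is a conservative choice: one poorly cooled drive is enough to raise the fan speed.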

Most of the time the system CPU is at 1–2%, which seems like “almost no load” to me. The various VMs and containers are mostly idle. The fan-ramps have occurred when there has been a little more load, but this has typically been in the range of say 10–15%. Except for brief stress-testing, the CPU is never that, um, stressed.

This is clearly my next effort! The case doesn’t have great airflow, I knew that, but it should be good enough. (My old case has famously bad ventilation, and the disks are warmer than they should be, but the CPU is fine.) I don’t ever have the expectation that the CPU will really be hammered for long periods.

I’m not eager to tear everything down to redo the heatsink—I’m pretty sure it’s installed correctly—but it seems like the thing I need to check.

Meanwhile, I did try to change things around with the BMC settings, and it’s so complicated that I couldn’t figure it out; the simple things I did try to do (manual settings) didn’t work (at least, the temperature went even higher), so it’s back to the manual for this.

Yes, this at least is OK; TrueNAS reports that the spinning disks are in the range of 35–40C. Which also suggests that the overall airflow/ventilation can’t be that bad.

I appreciate all the responses. This is very frustrating to me. The idea of a third-party fan controller is worth considering, but if there’s an underlying problem I need to solve that first.

The manual sucks! My ASUS manual barely mentions it and then I had to search the internet and eventually just test things out.

When you pack a few drives in there, if the airflow is not “distributed” across the surfaces then you get the heat buildup. It is not all about high airflow, it is more about moving air across the surfaces to pull the heat out. I had one case where two fans were pushing air across the hard drives but it wasn’t doing very well. The air was going around the hard drives through openings on the sides of the internal case. I added two foam tubes (what I had at the time) to shove into the space which now forced the airflow across the hard drives, things were cool again.

If you send me a few photos, I might be able to suggest some changes to make, or at least get your brain thinking like I do with respect to fluid dynamics.

But you also said the drives are in the range of 35-40C, which is acceptable. Now, if you have a lot of data, run a scrub and watch how high the temps get. You can run my script with the ‘-m’ switch every 10 minutes while the scrub is happening to collect a set of data.
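If you would rather collect the numbers by hand, a small logger along these lines would do a similar job. This is a separate sketch, not the script mentioned above; the function names and CSV layout are assumptions. It appends one row per drive per sample and then reports each drive's starting and peak temperature.

```python
import csv
import io
from datetime import datetime


def append_sample(csv_file, temps_by_drive, when=None):
    """Write one row per drive: ISO timestamp, drive name, temperature in C."""
    stamp = (when or datetime.now()).isoformat(timespec="seconds")
    writer = csv.writer(csv_file)
    for drive, temp in sorted(temps_by_drive.items()):
        writer.writerow([stamp, drive, temp])


def temperature_rise(csv_text):
    """Return {drive: (first, peak)} from the rows written by append_sample."""
    summary = {}
    for stamp, drive, temp in csv.reader(io.StringIO(csv_text)):
        temp = int(temp)
        first, peak = summary.get(drive, (temp, temp))
        summary[drive] = (first, max(peak, temp))
    return summary
```

Call `append_sample` from cron or a loop every 10 minutes with the current drive temperatures, then run `temperature_rise` on the file contents once the scrub has finished.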

I have to run out the door, running late for a medical appointment. When I return I will post my BIOS setting for the Open/Closed loop thing. Hopefully it will help.

I appreciate that, and maybe I’ll try when I open things up. But it honestly doesn’t seem like there’s much I can do. The way this case (a Silverstone CS381, by the way, which I don’t seem to have mentioned) is set up, the HD cage is positioned closely over the CPU part of the mobo, so there’s not much room to do anything. I needed a fairly low profile fan, even.

I’ve looked at a few build videos of this case, and it doesn’t seem like I’ve done anything completely wrong.

I tried this, and after 10m of scrubbing, the drives were barely affected. One or two of them went up by 1 or 2 degrees, that’s it. The drives are all a little bit different (35, 37, 38 degrees), and some of them did go up a bit, or I would worry that the sensors are broken. But it appears that drive temperatures aren’t a major concern.

The only sensible explanation would seem to be that I really badly fucked up the heatsink connection. Which I guess is possible.

Coming back to this after a break. The original situation, if you recall, was that my new build—running Proxmox with TrueNAS SCALE in a VM—was running a very hot CPU (60s C) even unloaded; I rebuilt it with more attention to the cooling, and there was no difference. As posted just above, I thought that “[t]he only sensible explanation would seem to be that I really badly fucked up the heatsink connection”, but I eventually gave up worrying about it.

Well, as discussed elsewhere, I decided to move my install from Proxmox to bare metal. This is complete; I’m running SCALE on the original machine.

And: temperatures are totally fine. In the 30s C unloaded. And this is reported by TrueNAS, by the sensors command, by IPMI; it’s not my imagination, or Proxmox mis-reporting the temperature.

I have no idea why it would be the case that Proxmox would run so vastly hotter. It’s true that I don’t have any VMs running on the SCALE build, but I was getting temperatures in the 60s on the Proxmox build even when completely unloaded.

So, I guess this is a good result, and I guess this is one more reason to be glad at having ditched Proxmox.

Nice :)