Supermicro FAN Control!

This is the config section from my version of the hybrid fan control script that is running on my Node 304.

###############################################################################################
## CONFIGURATION
################

## DEBUG LEVEL
## 0 means no debugging. 1,2,3,4 provide more verbosity
## You should run this script in at least level 1 to verify it's working correctly on your system
$debug = 1;

## CPU THRESHOLD TEMPS
## A modern CPU can heat up from 35C to 60C in a second or two. The fan duty cycle is set based on this
$high_cpu_temp = 70;		# will go HIGH when we hit
$med_cpu_temp = 60;	 	# will go MEDIUM when we hit, or drop below again
$low_cpu_temp = 50;		# will go LOW when we fall below this again

## HD THRESHOLD TEMPS
## HDs change temperature slowly. 
## This is the temperature that we regard as being uncomfortable. The higher this is the
## more silent your system.
## Note, it is possible for your HDs to go above this... but if your cooling is good, they shouldn't.
$hd_max_allowed_temp = 40;	# celsius. you will hit 100% duty cycle when your HDs hit this temp.

## CPU TEMP TO OVERRIDE HD FANS
## when the CPU climbs above this temperature, the HD fans will be overridden
## this prevents the HD fans from spinning up when the CPU fans are capable of providing 
## sufficient cooling.
$cpu_hd_override_temp = 75;

## CPU/HD SHARED COOLING
## If your HD fans contribute to the cooling of your CPU you should set this value.
## It will mean when your CPU heats up your HD fans will be turned up to help cool the
## case/cpu. This would only not apply if your HDs and fans are in a separate thermal compartment.
$hd_fans_cool_cpu = 1;		# 1 if the hd fans should spin up to cool the cpu, 0 otherwise


#######################
## FAN CONFIGURATION
####################

## FAN SPEEDS
## You need to determine the actual max fan speeds that are achieved by the fans
## Connected to the cpu_fan_header and the hd_fan_header.
## These values are used to verify high/low fan speeds and trigger a BMC reset if necessary.
$cpu_max_fan_speed 	= 6500;
$hd_max_fan_speed 	= 1800;


## CPU FAN DUTY LEVELS
## These levels are used to control the CPU fans
$fan_duty_high	= 100;		# percentage on, ie 100% is full speed.
$fan_duty_med 	= 70; # was 70
$fan_duty_low 	= 30;

## HD FAN DUTY LEVELS
## These levels are used to control the HD fans
$hd_fan_duty_high 	= 100;	# percentage on, ie 100% is full speed.
$hd_fan_duty_med_high 	= 85;
$hd_fan_duty_med_low	= 50;
$hd_fan_duty_low 	= 30;	# some 120mm fans stall below 30.


## FAN ZONES
# Your CPU/case fans should probably be connected to the main fan sockets, which are in fan zone zero
# Your HD fans should be connected to FANA which is in Zone 1
# You could switch the CPU/HD fans around, as long as you change the zones and fan header configurations.
#
# 0 = FAN1..5
# 1 = FANA
$cpu_fan_zone = 1;
$hd_fan_zone = 0;


## FAN HEADERS
## these are the fan headers which are used to verify the fan zone is high. FAN1+ are all in Zone 0, FANA is Zone 1.
## cpu_fan_header should be in the cpu_fan_zone
## hd_fan_header should be in the hd_fan_zone
$cpu_fan_header = "FAN4";	
$hd_fan_header = "FAN1";



################
## MISC
#######

## PLATFORM
## The platform is either "FreeBSD" or "Linux", and is determined by calling uname
$platform = `/usr/bin/uname`; # "FreeBSD" when on CORE or "Linux" on SCALE.
chomp $platform;

## IPMITOOL PATH
## ipmitool is used to invoke the IPMI tool to access the SuperMicro BMC
$ipmitool = "ipmitool";

## uncomment the following line, and replace the host address, <username> and <password> with your IPMI credentials to access IPMI over the network
#$ipmitool = "$ipmitool -I lanplus -H 192.168.1.209 -U <username> -P <password>";	# network access, necessary when running in a VM

## HD POLLING INTERVAL
## The controller will only poll the hard drives periodically. Since hard drives change temperature slowly
## this is a good thing. 180 seconds is a good value.
$hd_polling_interval = 180;	# seconds

## FAN SPEED CHANGE DELAY TIME
## It takes the fans a few seconds to change speeds, so we allow a grace period before verifying. If we fail the verify
## we'll reset the BMC
$fan_speed_change_delay = 10; # seconds

## BMC REBOOT TIME
## It takes the BMC a number of seconds to reset and start providing sensible output. We'll only
## Reset the BMC if it's still providing rubbish after this time.
$bmc_reboot_grace_time = 120; # seconds

## BMC RETRIES BEFORE REBOOTING
## We verify high/low of fans, and if they're not where they should be we reboot the BMC after so many failures
$bmc_fail_threshold	= 1; 	# will retry n times before rebooting

# edit nothing below this line
########################################################################################################################

If you have the same build as me, with the same fans, connected to the same headers, this should work fairly well.
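To make the CPU thresholds and duty levels above a bit more concrete, here is a rough sketch of how they could fit together. This is not the script's actual code: `pick_cpu_duty` is a hypothetical helper, and the step-down behaviour is only an assumption based on the comments in the config (go HIGH/MEDIUM on reaching a threshold, go LOW only once below `$low_cpu_temp`).

```perl
use strict;
use warnings;

# Values copied from the config section above.
my ($high_cpu_temp, $med_cpu_temp, $low_cpu_temp) = (70, 60, 50);
my ($fan_duty_high, $fan_duty_med, $fan_duty_low) = (100, 70, 30);

# Hypothetical helper (not part of the real script): returns the new duty
# cycle for a given temperature and the duty cycle currently in effect.
sub pick_cpu_duty {
    my ($temp, $current_duty) = @_;
    return $fan_duty_high if $temp >= $high_cpu_temp;
    return $fan_duty_med  if $temp >= $med_cpu_temp;
    return $fan_duty_low  if $temp <  $low_cpu_temp;
    # Between low and med (assumed behaviour): step down from HIGH to
    # MEDIUM, otherwise keep whatever level is already set.
    return $current_duty == $fan_duty_high ? $fan_duty_med : $current_duty;
}

my $duty = $fan_duty_low;
for my $temp (45, 62, 72, 65, 55, 48) {
    $duty = pick_cpu_duty($temp, $duty);
    print "$temp C -> $duty%\n";
}
```

With this shape the fan does not flap between levels while the temperature hovers between 50 and 60, which is the point of having a separate low threshold.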

Meanwhile, these are my current IPMI fan thresholds, as reported by ipmitool sensor list all:

root@chronus[/mnt/tank/server/scripts]# ipmitool sensor list all | grep FAN
FAN1             | 1000.000   | RPM        | ok    | 100.000   | 200.000   | 300.000   | 25300.000 | 25400.000 | 25500.000 
FAN2             | 800.000    | RPM        | ok    | 0.000     | 100.000   | 200.000   | 25300.000 | 25400.000 | 25500.000 
FAN3             | 1000.000   | RPM        | ok    | 100.000   | 200.000   | 300.000   | 25300.000 | 25400.000 | 25500.000 
FAN4             | 3500.000   | RPM        | ok    | 300.000   | 500.000   | 700.000   | 25300.000 | 25400.000 | 25500.000 

and it shows you the current fan speeds too :stuck_out_tongue:
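If you want to pull those numbers out in a script, something like the following should do it. It is only a sketch: `parse_fan_line` is a hypothetical helper, the sample lines are copied from the output above, and the field names follow the usual ipmitool column order (reading, unit, status, then the three lower and three upper thresholds). It needs Perl 5.14 or newer for the `/r` flag.

```perl
use strict;
use warnings;

# Sample lines copied from the output above; in practice you would read
# them from `ipmitool sensor list all` instead.
my @lines = (
    'FAN1             | 1000.000   | RPM        | ok    | 100.000   | 200.000   | 300.000   | 25300.000 | 25400.000 | 25500.000 ',
    'FAN4             | 3500.000   | RPM        | ok    | 300.000   | 500.000   | 700.000   | 25300.000 | 25400.000 | 25500.000 ',
);

# Hypothetical helper: split one sensor line into named fields.
# Returns a hash reference, or nothing if the line is not an RPM sensor.
sub parse_fan_line {
    my ($line) = @_;
    my @f = map { s/^\s+|\s+$//gr } split /\|/, $line;
    return unless @f == 10 && $f[2] eq 'RPM';
    my %fan;
    @fan{qw(name rpm unit status lnr lcr lnc unc ucr unr)} = @f;
    return \%fan;
}

for my $line (@lines) {
    my $fan = parse_fan_line($line) or next;
    printf "%s: %d RPM (lower critical %d, upper critical %d)\n",
        $fan->{name}, $fan->{rpm}, $fan->{lcr}, $fan->{ucr};
}
```

The lower critical value is the one to watch: if a fan's low-duty speed drops under it, the BMC tends to panic and spin everything up to full.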

And this is after I added the heatsink to my boot m.2

Also, my version adds support on this platform for monitoring the NVMe drive in the M.2 slot.

This is a modification to the get_cpu_temp_direct function

sub get_cpu_temp_direct
{
    # the following command needs to return a list of temps for the cores, output is something like "50.0\n51.0\n"
        my $core_temps = $platform eq "FreeBSD" ?
                `sysctl -a dev.cpu | egrep -E \"dev.cpu\.[0-9]+\.temperature\" | awk '{print \$2}' | sed 's/.\$//'`
        :
                `sensors -A coretemp-isa-0000 | egrep 'Package id [0-9]:' | awk '{print \$4}' | sed 's/[^0-9\.]*//g'`
        ;

        # the below line adds the temp sensors from nvme0 (boot drive) to the list. The CPU fan cools these drives
        # so I want them to go into the core temps list too.
        my $nvme_temps = `smartctl -a /dev/nvme0 | grep "Temperature Sensor" |  awk '{print \$4}'`;
        $core_temps = $core_temps.$nvme_temps;

        chomp($core_temps);

        dprint(3,"core_temps:\n$core_temps\n");

        # ... the remainder of the function is not shown here

Which basically makes it so the nvme temps count as a cpu core temp for the purpose of controlling the CPU fan… which is what cools the m.2.

Works quite well when the heatsink is installed.
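For illustration, this is roughly what happens to the combined list, assuming the rest of the function picks the highest value in it (that part is not shown above). The temperatures here are made-up sample values, not readings from my system.

```perl
use strict;
use warnings;
use List::Util qw(max);

# Made-up sample values: two CPU readings followed by two NVMe sensor
# readings, newline-separated like the command output in the function above.
my $core_temps = "50.0\n51.0\n";
my $nvme_temps = "44\n58\n";

# Same concatenation as in the modified function.
my $combined = $core_temps . $nvme_temps;
chomp($combined);

# Keep only lines that look like numbers, then take the hottest one.
my @temps   = grep { /^\d+(?:\.\d+)?$/ } split /\n/, $combined;
my $hottest = max(@temps);

print "hottest reading: $hottest\n";
```

So a warm NVMe drive raises the CPU fan duty in exactly the same way a warm core would.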
