I can't stop my disks from spinning down every 3 seconds

I have an issue that has been driving me a bit insane for many months now, enough to make me install Xpenology on my machine to get rid of it. I have two Seagate Exos drives and they keep making this annoying noise, in unison, every 3 seconds.

I always thought this was due to the system logs, or the apps, or something else flushing to the drive, and I couldn’t get rid of it. But now I’ve finally figured out that the noise stops when the disks are spinning. I got there by accident when I ran a SMART long test: for the more than one day it took, the noise stopped.

Anyway, so I tried to mess with the options for disk Power Management, and to my surprise, they don’t do anything. I can put whatever I want in “Advanced Power Management”, and the disks continue doing the same.

Is this a bug in TrueNAS? Is there any workaround to keep my drives from spinning down? (I’m pretty sure spinning down every 3 seconds is also ruining the drives.)

Which version are you running?
Have you ever tried to spin down your drives? What is the spin-up count in the latest SMART report?

TN will not spin down your drives by default, so it has to be something set by yourself.

You want to have Always ON in HDD Standby parameter on TN side, as well as at least Level 128 in APM.
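
For reference, the APM numbers follow the convention described in the hdparm man page for the -B option: values 1 through 127 permit spin-down, values 128 through 254 do not, and 255 disables APM altogether on drives that support it. A minimal Python sketch of that mapping, where the function name describe_apm_level is purely illustrative and not part of TrueNAS or hdparm:

```python
def describe_apm_level(level: int) -> str:
    """Classify an ATA Advanced Power Management level (hdparm -B convention)."""
    if not 1 <= level <= 255:
        raise ValueError("APM level must be between 1 and 255")
    if level == 255:
        return "APM disabled"
    if level >= 128:
        return "power saving without spin-down"
    return "power saving, spin-down permitted"


# Show where the boundaries fall.
for lvl in (1, 127, 128, 254, 255):
    print(lvl, "->", describe_apm_level(lvl))
```

This is why Level 128 is the lowest setting that should keep the platters spinning, assuming the drive honours APM at all.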

Some (typically consumer) drives are set to aggressively spin down in their firmware, although 3 seconds is quite quick…

What hardware do you have?

Hi, thank you for your answer.
I’m running the latest version, Dragonfish, but like I said, this has been happening for a long time. It was the same about 1 year ago, when I last tried TrueNAS.

And yes, I also have tried both Advanced Power Management disabled, 128, 254… and well, everything in there actually.

Also, this happens from the get go, I tried re-installing TrueNAS fresh, and the result is exactly the same.

The smart output is the following:

ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x000f   077   064   044    Pre-fail  Always       -       54297628
  3 Spin_Up_Time            0x0003   092   090   000    Pre-fail  Always       -       0
  4 Start_Stop_Count        0x0032   100   100   020    Old_age   Always       -       128
  5 Reallocated_Sector_Ct   0x0033   100   100   010    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x000f   083   061   045    Pre-fail  Always       -       192701054
  9 Power_On_Hours          0x0032   096   096   000    Old_age   Always       -       4199
 10 Spin_Retry_Count        0x0013   100   100   097    Pre-fail  Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   020    Old_age   Always       -       76
184 End-to-End_Error        0x0032   100   100   099    Old_age   Always       -       0
187 Reported_Uncorrect      0x0032   100   100   000    Old_age   Always       -       0
188 Command_Timeout         0x0032   100   100   000    Old_age   Always       -       0
190 Airflow_Temperature_Cel 0x0022   060   048   040    Old_age   Always       -       40 (Min/Max 38/40)
192 Power-Off_Retract_Count 0x0032   100   100   000    Old_age   Always       -       152
193 Load_Cycle_Count        0x0032   100   100   000    Old_age   Always       -       295
194 Temperature_Celsius     0x0022   040   052   000    Old_age   Always       -       40 (0 13 0 0 0)
195 Hardware_ECC_Recovered  0x001a   008   005   000    Old_age   Always       -       54297628
197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0010   100   100   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x003e   200   200   000    Old_age   Always       -       0
200 Multi_Zone_Error_Rate   0x0023   100   100   001    Pre-fail  Always       -       0
240 Head_Flying_Hours       0x0000   100   253   000    Old_age   Offline      -       4143 (219 179 0)
241 Total_LBAs_Written      0x0000   100   253   000    Old_age   Offline      -       83500498557
242 Total_LBAs_Read         0x0000   100   253   000    Old_age   Offline      -       188277695749

Hi there.

It’s those Seagate Enterprise 12TB (which are actually Seagate Exos according to the output).

But like I said, this only happens on TrueNAS; I can get it to stop by installing other OSes. So it’s something TrueNAS is doing here that I can’t identify.

Please post the full smart output of one of the drives instead of the abridged version.

Here it goes:

admin@nas-h[~]$ sudo smartctl -a /dev/sda
smartctl 7.4 2023-08-01 r5530 [x86_64-linux-6.6.29-production+truenas] (local build)
Copyright (C) 2002-23, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Device Model:     ST12000NM0127
Serial Number:    ZJV18271
LU WWN Device Id: 5 000c50 0b15ff0db
Firmware Version: G005
User Capacity:    12,000,138,625,024 bytes [12.0 TB]
Sector Sizes:     512 bytes logical, 4096 bytes physical
Rotation Rate:    7200 rpm
Form Factor:      3.5 inches
Device is:        Not in smartctl database 7.3/5528
ATA Version is:   ACS-3 T13/2161-D revision 5
SATA Version is:  SATA 3.1, 6.0 Gb/s (current: 6.0 Gb/s)
Local Time is:    Sun Jun 23 10:38:46 2024 WEST
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x82) Offline data collection activity
                                        was completed without error.
                                        Auto Offline Data Collection: Enabled.
Self-test execution status:      (   0) The previous self-test routine completed
                                        without error or no self-test has ever
                                        been run.
Total time to complete Offline
data collection:                (  575) seconds.
Offline data collection
capabilities:                    (0x7b) SMART execute Offline immediate.
                                        Auto Offline data collection on/off support.
                                        Suspend Offline collection upon new
                                        command.
                                        Offline surface scan supported.
                                        Self-test supported.
                                        Conveyance Self-test supported.
                                        Selective Self-test supported.
SMART capabilities:            (0x0003) Saves SMART data before entering
                                        power-saving mode.
                                        Supports SMART auto save timer.
Error logging capability:        (0x01) Error logging supported.
                                        General Purpose Logging supported.
Short self-test routine
recommended polling time:        (   1) minutes.
Extended self-test routine
recommended polling time:        (1114) minutes.
Conveyance self-test routine
recommended polling time:        (   2) minutes.
SCT capabilities:              (0x50bd) SCT Status supported.
                                        SCT Error Recovery Control supported.
                                        SCT Feature Control supported.
                                        SCT Data Table supported.

SMART Attributes Data Structure revision number: 10
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x000f   077   064   044    Pre-fail  Always       -       54305692
  3 Spin_Up_Time            0x0003   092   090   000    Pre-fail  Always       -       0
  4 Start_Stop_Count        0x0032   100   100   020    Old_age   Always       -       128
  5 Reallocated_Sector_Ct   0x0033   100   100   010    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x000f   083   061   045    Pre-fail  Always       -       192721743
  9 Power_On_Hours          0x0032   096   096   000    Old_age   Always       -       4200
 10 Spin_Retry_Count        0x0013   100   100   097    Pre-fail  Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   020    Old_age   Always       -       76
184 End-to-End_Error        0x0032   100   100   099    Old_age   Always       -       0
187 Reported_Uncorrect      0x0032   100   100   000    Old_age   Always       -       0
188 Command_Timeout         0x0032   100   100   000    Old_age   Always       -       0
190 Airflow_Temperature_Cel 0x0022   060   048   040    Old_age   Always       -       40 (Min/Max 38/40)
192 Power-Off_Retract_Count 0x0032   100   100   000    Old_age   Always       -       152
193 Load_Cycle_Count        0x0032   100   100   000    Old_age   Always       -       295
194 Temperature_Celsius     0x0022   040   052   000    Old_age   Always       -       40 (0 13 0 0 0)
195 Hardware_ECC_Recovered  0x001a   008   005   000    Old_age   Always       -       54305692
197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0010   100   100   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x003e   200   200   000    Old_age   Always       -       0
200 Multi_Zone_Error_Rate   0x0023   100   100   001    Pre-fail  Always       -       0
240 Head_Flying_Hours       0x0000   100   253   000    Old_age   Offline      -       4143 (116 254 0)
241 Total_LBAs_Written      0x0000   100   253   000    Old_age   Offline      -       83500498557
242 Total_LBAs_Read         0x0000   100   253   000    Old_age   Offline      -       188277703813

SMART Error Log Version: 1
No Errors Logged

SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Short offline       Completed without error       00%      4199         -
# 2  Extended offline    Aborted by host               10%      4199         -
# 3  Short offline       Completed without error       00%      4161         -
# 4  Extended offline    Aborted by host               90%      4154         -
# 5  Extended offline    Completed without error       00%      4021         -
# 6  Extended offline    Aborted by host               80%      1274         -
# 7  Short offline       Completed without error       00%        95         -

SMART Selective self-test log data structure revision number 1
 SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
    1        0        0  Not_testing
    2        0        0  Not_testing
    3        0        0  Not_testing
    4        0        0  Not_testing
    5        0        0  Not_testing
Selective self-test flags (0x0):
  After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.

The above only provides legacy SMART information - try 'smartctl -x' for more

admin@nas-h[~]$

This does not look like a drive that’s spinning up and down every 3 seconds though.
Where is your system dataset?
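
To put numbers on that, here is a rough back-of-the-envelope check in Python using the counters from the SMART report above. It assumes, as a worst case, that the 3-second cycle had been going on for every powered-on hour, which overstates things since the noise is only reported under TrueNAS; even a small fraction of that figure would dwarf the reported counters.

```python
# Values copied from the SMART report above.
power_on_hours = 4200
start_stop_count = 128
load_cycle_count = 295

# One spin-down (or head park) every 3 seconds means 1200 cycles per hour.
cycles_per_hour_at_3s = 3600 / 3
expected_cycles = cycles_per_hour_at_3s * power_on_hours

print(f"expected at one cycle per 3 s: {expected_cycles:,.0f}")
print(f"reported Start_Stop_Count:     {start_stop_count}")
print(f"reported Load_Cycle_Count:     {load_cycle_count}")
print(f"hours per load cycle:          {power_on_hours / load_cycle_count:.1f}")
```

The reported counters are several orders of magnitude below what a 3-second spin-down or head-park cycle would produce.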

This does not look like a drive that’s spinning up and down every 3 seconds though.

Maybe it’s not spinning, but the heads, in unison, make this seek noise every 3 seconds.

Where is your system dataset?

It’s on another drive, an NVMe. I already tried the usual culprits for these cases: moving the system dataset to an NVMe and moving the Apps dataset to an NVMe. I even tried changing zfs_txg_timeout to see if it did anything. The noise remains exactly the same unless the drive is fully spinning (doing a SMART test, for instance).

Agreed.
Not having heard Exos drives myself, I am not familiar with what they can sound like.
Something else is making the sound.

For a while I was thinking you may have SAS drives doing background medium scans. On repurposed enterprise drives those are often set to start automatically whenever the drive is otherwise idle.

But your Exos drives are SATA and do not appear to support that type of background scan.


Well, I’ve just found out something that might be pertinent:

hdparm reports that APM is not supported. While this is very strange, it explains why changing the APM for the drives from the TrueNAS GUI didn’t do anything.

admin@nas-h[~]$ sudo hdparm -B 254 /dev/sdb

/dev/sdb:
 setting Advanced Power Management level to 0xfe (254)
SG_IO: bad/missing sense data, sb[]:  70 00 05 00 00 00 00 0a 04 51 40 fe 21 04 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
 APM_level      = not supported
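
For anyone wondering what the sb[] bytes mean: they look like fixed-format SCSI sense data (response code 0x70), where the low nibble of byte 2 is the sense key and bytes 12 and 13 are the additional sense code and qualifier. A minimal Python sketch that decodes just those fields, assuming that layout applies here (the helper name decode_fixed_sense is made up for this example):

```python
# Sense buffer copied from the hdparm output above (first 18 bytes).
SENSE_HEX = "70 00 05 00 00 00 00 0a 04 51 40 fe 21 04 00 00 00 00"

# Sense key names from the SCSI Primary Commands standard (subset).
SENSE_KEYS = {
    0x0: "NO SENSE",
    0x1: "RECOVERED ERROR",
    0x2: "NOT READY",
    0x3: "MEDIUM ERROR",
    0x4: "HARDWARE ERROR",
    0x5: "ILLEGAL REQUEST",
    0x6: "UNIT ATTENTION",
    0x7: "DATA PROTECT",
    0xB: "ABORTED COMMAND",
}


def decode_fixed_sense(hex_bytes):
    """Extract sense key, ASC and ASCQ from fixed-format sense data."""
    sb = bytes.fromhex("".join(hex_bytes.split()))
    if len(sb) < 14:
        raise ValueError("sense buffer too short")
    response_code = sb[0] & 0x7F
    if response_code not in (0x70, 0x71):
        raise ValueError("not fixed-format sense data")
    key = sb[2] & 0x0F
    return {
        "response_code": response_code,
        "sense_key": SENSE_KEYS.get(key, f"0x{key:X}"),
        "asc": sb[12],
        "ascq": sb[13],
    }


info = decode_fixed_sense(SENSE_HEX)
print(f"sense key: {info['sense_key']}, "
      f"ASC/ASCQ: {info['asc']:02x}/{info['ascq']:02x}")
```

The sense key decodes to ILLEGAL REQUEST, meaning the command was rejected rather than executed, which is consistent with hdparm reporting APM as not supported.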

Take a look here, should solve your issue.

Could you elaborate on why?
None of the Idle settings trigger at the frequency described by the OP.
The shortest Idle timer mentioned in that Reddit thread is 2 minutes.

The thread explains how to tune the drives’ behaviour with Seagate’s software. OP could try disabling either idle_b or simply everything.

Specific post I would suggest reading.

This is some interesting info, but unfortunately, it didn’t fix the issue.

Issuing a spin-down with this utility does stop the noise. So at least that confirms it’s not TrueNAS writing something to the disks that causes the noise.

I would run something like zpool iostat -vly 1 in a shell to see if you observe corresponding zfs activity when you hear the noise.

Press Ctrl-C to exit the self-updating stats.

If you do, there’s reason to look at what could be writing to your pool.

Nothing happening there (it’s the mainPool where those disks are).

But as you can see in my answer right before this one, I successfully managed to spin down the disks and they stayed spun down. So TrueNAS is not trying to write anything there.

                                            capacity     operations     bandwidth    total_wait     disk_wait    syncq_wait    asyncq_wait  scrub   trim  rebuild
pool                                      alloc   free   read  write   read  write   read  write   read  write   read  write   read  write   wait   wait   wait
----------------------------------------  -----  -----  -----  -----  -----  -----  -----  -----  -----  -----  -----  -----  -----  -----  -----  -----  -----
boot-pool                                 2.54G  99.5G      0     77      0  1.08M      -    9ms      -    1ms      -    1us      -    8ms      -      -      -
  nvme2n1p3                               2.54G  99.5G      0     77      0  1.08M      -    9ms      -    1ms      -    1us      -    8ms      -      -      -
----------------------------------------  -----  -----  -----  -----  -----  -----  -----  -----  -----  -----  -----  -----  -----  -----  -----  -----  -----
fastPool                                  1.66G   926G      0      0      0      0      -      -      -      -      -      -      -      -      -      -      -
  7547edc6-dce0-44ca-ae1a-d79d5a1dd7b3    1.66G   926G      0      0      0      0      -      -      -      -      -      -      -      -      -      -      -
----------------------------------------  -----  -----  -----  -----  -----  -----  -----  -----  -----  -----  -----  -----  -----  -----  -----  -----  -----
mainPool                                  5.95T  4.95T      0      0      0      0      -      -      -      -      -      -      -      -      -      -      -
  mirror-0                                5.95T  4.95T      0      0      0      0      -      -      -      -      -      -      -      -      -      -      -
    fc9f853e-77df-40c5-b1c3-18f063bbabc5      -      -      0      0      0      0      -      -      -      -      -      -      -      -      -      -      -
    fc35a953-bc5d-45c2-a82a-9e6326311e3f      -      -      0      0      0      0      -      -      -      -      -      -      -      -      -      -      -
----------------------------------------  -----  -----  -----  -----  -----  -----  -----  -----  -----  -----  -----  -----  -----  -----  -----  -----  -----
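
If you want to watch this over a longer period without staring at the table, a small filter can flag which pools or vdevs show non-zero operations in a sample. This is only a sketch in Python: the function name active_devices is made up, the sample rows are trimmed from the output above, and it assumes the column order shown there (name, alloc, free, read ops, write ops).

```python
# Rows trimmed from the zpool iostat output above (first seven columns only).
SAMPLE = """\
pool          alloc   free   read  write   read  write
------------  -----  -----  -----  -----  -----  -----
boot-pool     2.54G  99.5G      0     77      0  1.08M
  nvme2n1p3   2.54G  99.5G      0     77      0  1.08M
------------  -----  -----  -----  -----  -----  -----
mainPool      5.95T  4.95T      0      0      0      0
  mirror-0    5.95T  4.95T      0      0      0      0
"""


def active_devices(iostat_text: str) -> list[str]:
    """Return the names of pools/vdevs whose read or write ops are non-zero."""
    active = []
    for line in iostat_text.splitlines():
        fields = line.split()
        # Data rows have a name plus at least alloc, free, read ops, write ops.
        if len(fields) < 5:
            continue
        name = fields[0]
        # Skip the two header rows and the dashed separator rows.
        if name in ("pool", "capacity") or set(name) == {"-"}:
            continue
        read_ops, write_ops = fields[3], fields[4]
        if read_ops != "0" or write_ops != "0":
            active.append(name)
    return active


print(active_devices(SAMPLE))
```

On the sample above only the boot pool and its NVMe device are flagged; mainPool and its mirror show no operations, which matches the observation that nothing is touching those disks.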