HDD pool (Seagate Exos) waking up periodically despite standby / EPC tuning

Hi,

I’m trying to debug regular wake-ups on a cold-storage HDD pool on TrueNAS SCALE.

My goal is for this HDD pool to stay spun down during the day and only wake up during nightly backup/replication jobs.

Setup

  • TrueNAS SCALE
  • Main pool: FastPool (SSD pool, active)
  • Cold pool: SlowPool
  • SlowPool = mirror of 2x Seagate Exos 28TB (ST28000NM000C-3WM103, firmware SN04)
  • boot-pool is on SSDs
  • System dataset is not on SlowPool

Problem
The Exos drives do enter standby correctly, but they wake up regularly during the day even though the pool should be mostly idle outside the nightly backup window.

What I already checked

  • `atime=off` on SlowPool
  • compression = lz4
  • pool health is clean
  • SMART health is clean
  • no reallocated/pending/uncorrectable sectors
  • EPC tuned with openSeaChest: Idle_B disabled, Idle_C disabled, Standby_Z enabled at 300 seconds
  • manual spin down works correctly, and `smartctl -n standby` then reports STANDBY
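
To timestamp when the drives leave standby without the check itself spinning them up, the `smartctl -n standby` call from the last item can be wrapped in a small poller. This is a minimal Python sketch, not a tested tool. It assumes that smartctl returns a non-zero status and mentions STANDBY in its output when it skips a sleeping drive. The function names and the 60-second interval are illustrative.

```python
import subprocess
import time
from datetime import datetime


def classify_power_state(output: str, returncode: int) -> str:
    """Map one `smartctl -n standby` result to 'standby', 'active' or 'unknown'.

    Assumption: when the drive is spun down, smartctl skips it, mentions
    STANDBY in its output and returns a non-zero exit status.
    """
    if returncode != 0 and "STANDBY" in output.upper():
        return "standby"
    if returncode == 0:
        return "active"
    return "unknown"  # e.g. permission problem or missing device


def query_power_state(device: str) -> str:
    """Run smartctl once; `-n standby` asks it not to wake a sleeping drive."""
    result = subprocess.run(
        ["smartctl", "-n", "standby", "-i", device],
        capture_output=True, text=True, check=False,
    )
    return classify_power_state(result.stdout, result.returncode)


def log_transitions(devices, interval_s=60):
    """Print a timestamped line each time a drive changes power state."""
    last = {}
    while True:
        for dev in devices:
            state = query_power_state(dev)
            if last.get(dev) != state:
                stamp = datetime.now().isoformat(timespec="seconds")
                print(f"{stamp} {dev} -> {state}")
                last[dev] = state
        time.sleep(interval_s)
```

Running `log_transitions(["/dev/sdb", "/dev/sdc"])` as root for a day should make any regular wake-up cadence visible in the timestamps.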

Current relevant ZFS properties on SlowPool

  • mountpoint=/mnt/SlowPool
  • compression=lz4
  • atime=off
  • relatime=on
  • sharesmb=off
  • sharenfs=off
  • sync=standard
  • primarycache=all
  • secondarycache=all

Pool status

  • FastPool: ONLINE, no errors
  • SlowPool: ONLINE, no errors
  • boot-pool: ONLINE, no errors

Observed activity
From `zpool iostat -v 5`, I occasionally see small but real I/O on SlowPool even when the system is otherwise mostly idle:

  • around 3 read ops / 1 write op
  • around 1.36M read / 646K write

Then it goes back to 0/0.

SMART / health
Both disks look healthy:

  • SMART overall-health: PASSED
  • Reallocated_Sector_Ct = 0
  • Current_Pending_Sector = 0
  • Offline_Uncorrectable = 0
  • UDMA_CRC_Error_Count = 0
  • no errors logged

Wake-up / cycle data
A while ago I had:

  • Start_Stop_Count ~2025 / 2008 at 2462 Power_On_Hours

Now I have:

  • /dev/sdb Start_Stop_Count = 2560, Power_On_Hours = 3069
  • /dev/sdc Start_Stop_Count = 2576, Power_On_Hours = 3069

So the disks are still definitely cycling.
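
The two readings also give a rough average interval between spin-ups. Here is a short Python sketch of that arithmetic, using the approximate earlier counts. Which earlier count belongs to which disk is not recorded above, so the pairing is an assumption and the result is only indicative.

```python
def minutes_per_cycle(count_before, count_after, hours_before, hours_after):
    """Average powered-on minutes per start/stop cycle between two SMART readings."""
    cycles = count_after - count_before
    hours = hours_after - hours_before
    if cycles <= 0:
        raise ValueError("Start_Stop_Count did not increase between readings")
    return hours * 60 / cycles


# Earlier reading: ~2025 / 2008 at 2462 h. Pairing with sdb/sdc is assumed.
for name, before, after in (("sdb", 2025, 2560), ("sdc", 2008, 2576)):
    avg = minutes_per_cycle(before, after, 2462, 3069)
    print(f"{name}: about {avg:.0f} minutes per cycle")
```

With these numbers the average is roughly 64 to 68 minutes of power-on time per cycle over the whole period, nightly jobs and manual spin-down tests included.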

Snapshots
SlowPool contains replicated backup datasets and many snapshots, mostly created nightly at 02:00, which explains nighttime activity but not the daytime wake-ups.

Running services

  • cifs RUNNING
  • nfs RUNNING
  • ssh RUNNING

Questions

  • What would you investigate next on SCALE to identify what is periodically waking these disks?
  • Is this likely middlewared / reporting / collectd / netdata related?
  • Can SMB or NFS services wake disks even if the pool itself is not actively shared?
  • Are there known SCALE background tasks that can generate small periodic reads/writes on an otherwise idle ZFS pool?
  • What is the best way to trace exactly which process is issuing I/O to this pool?
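
For the last question, one low-impact first step is to timestamp exactly when block I/O reaches the member disks, then compare those times with middleware logs and job schedules. This is a minimal Python sketch, assuming a Linux host where `/proc/diskstats` is readable and the disks are `sdb` and `sdc` as in the SMART output above. It shows when I/O happens, not which process issued it, and passthrough commands such as SMART queries may not be counted there. The function names are illustrative.

```python
import time
from datetime import datetime


def parse_diskstats(text, devices):
    """Return {device: (reads_completed, writes_completed)} from /proc/diskstats text."""
    counters = {}
    for line in text.splitlines():
        fields = line.split()
        # Fields: major, minor, name, reads completed, ..., writes completed (index 7)
        if len(fields) >= 8 and fields[2] in devices:
            counters[fields[2]] = (int(fields[3]), int(fields[7]))
    return counters


def changed_devices(previous, current):
    """Names of devices whose read or write counters moved between two samples."""
    return sorted(d for d in current if d in previous and current[d] != previous[d])


def watch(devices=("sdb", "sdc"), interval_s=5, path="/proc/diskstats"):
    """Print a timestamped line whenever a watched disk completes new I/O."""
    with open(path) as fh:
        previous = parse_diskstats(fh.read(), devices)
    while True:
        time.sleep(interval_s)
        with open(path) as fh:
            current = parse_diskstats(fh.read(), devices)
        for dev in changed_devices(previous, current):
            reads = current[dev][0] - previous[dev][0]
            writes = current[dev][1] - previous[dev][1]
            stamp = datetime.now().isoformat(timespec="seconds")
            print(f"{stamp} {dev} +{reads} reads +{writes} writes")
        previous = current
```

Calling `watch()` in a long-lived shell session (tmux or similar) across a few wake-up cycles gives timestamps that can be matched against other logs.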

Thanks

Since 25.10, spindown has been broken because a new kernel module for monitoring temperatures was added.

@LarsR Thanks for the information! Is a fix expected in the next major version? (I think in April?)

I already filed a bug report for 25.10 beta, and recently a feature request:

HDD Standby feature request

No reaction from the TrueNAS team yet.

As I understand it, my problem is not exactly that the drives fail to spin down.

As Exos drives are enterprise drives, they use EPC rather than APM to spin down.

I’m able to spin down the drives without any problems. The problem is the wake-ups, which occur more or less every hour, and I can’t work out why…

TrueNAS also changed the release schedule from 6 months to 12 months, so the next major update will be more like August/September. The first beta of TrueNAS 26 has been out for a week or two now, but as far as I’ve seen it’s not fixed in the beta.

The problem is that the kernel module probes the drives for temperature data so the TrueNAS GUI can display it in the reporting tab, and that probing wakes the disks up.

@T13nou which version of TrueNAS are you running?

Version:

25.10.1 - Goldeye

On what pool is your system dataset?

On the boot-pool (mirror of 2 SSDs)

One interesting thing is that it wakes up every 1h30 precisely.

It feels like a rolling process is occurring.

I’ve disabled SMB and NFS shares and it’s the same, still waking up every 1h30.

All my snapshots are scheduled to happen once a day at night.

I’ve got one replication PULL from an off-site TrueNAS server, which also happens once a day at night.

When it happens, it’s a really, really small read and write, like a control check or something like that.

The only cron job I have is related to SMART and does not run that often.

And finally, on my SlowPool I’ve got no shares activated (neither SMB nor NFS).

Maybe it is this that causes it?

Not sure that polling is enough to wake up drives, but the 1h30 would fit…

Wow, thanks! 90 minutes is exactly my case… Now I’ll need to find out how to increase this polling interval.

As far as I understand it, it’s a hard coded backup process that can’t be changed/edited

That 90 min is the smartctl polling job, but it is not the only thing waking disks without actual use.

Before 25.10, the polling could be set not to wake disks in standby with the power mode option: S.M.A.R.T. Service Screen | TrueNAS Documentation Hub
But that option was removed in 25.10.

See HDD Sleep/Spindown/Standby - #71 by ark for a patch that solves the known related issues with 25.10.

I have a similar setup with Exos drives, and with the patch they only wake up when I use data on them or during scrubs.

Works perfectly! I’ve applied the patch and the drives now stay spun down all the time, except during backups of course. Thanks @ark! Does it still work with the new update?

Yes, same patch file works up to 25.10.3.

None of the affected Python files changed with the updates to 25.10.
26 will probably need a new patch file; I will check that when GA is released.

And does it need to be reapplied after an update, or do I not have to touch anything?

Yes, every time you update you need to redo the patch after rebooting, because each update has its own copy of the files.

You can see the different copies with `zfs list | grep ROOT`.
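
Because each update boots from its own dataset, it can help to list those copies from a script before re-applying the patch. This is a small Python sketch that applies the same ROOT filter as the command above. The helper names are illustrative, and the exact dataset layout on a given system is not assumed.

```python
import subprocess


def root_datasets(zfs_list_output: str):
    """Return dataset names containing 'ROOT' (same idea as `grep ROOT`)."""
    names = []
    for line in zfs_list_output.splitlines():
        fields = line.split()
        if fields and "ROOT" in fields[0]:
            names.append(fields[0])
    return names


def list_boot_environments():
    """Ask ZFS for all dataset names (-H: no header, -o name: name column only)."""
    out = subprocess.run(
        ["zfs", "list", "-H", "-o", "name"],
        capture_output=True, text=True, check=True,
    ).stdout
    return root_datasets(out)
```

`list_boot_environments()` returns one name per copy, which makes it easy to check that the currently booted one has been patched.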