Unwanted spindown on 2 out of 8 disks

tokyo · October 17, 2024, 12:22pm

Hi, 2 days ago, 2 out of my 8 disks started to spin down after 3 minutes of idling. The other 6 disks are fine and are permanently spinning. This happened after rebooting the VM containing TrueNAS and rebooting the host machine (Proxmox). Previously, I had performed these kinds of graceful reboots, and everything was fine. (I shut down TrueNAS via the GUI and waited until it stopped, then I did a reboot via the Proxmox GUI.)

I’m running TrueNAS Core TrueNAS-13.0-U6.2 in a Proxmox 8.2.7 VM on a Dell T320 with an HBA RAID controller in passthrough mode, which basically means that Proxmox doesn’t even know the disks are connected—only TrueNAS sees and manages them.

I have 1 RAIDZ2 pool containing 8x 12TB HDDs. Each disk is configured in the same way:

HDD Standby: Always On
Advanced Power Management: Disabled
Force HDD Standby: Disabled (unchecked/empty box)
Acoustic Level: Disabled
Enable SMART: Enabled (box checked)

I tried rebooting again, I tried setting HDD Standby to 5 minutes and then back to Always On, and I ran a scrub. Still, after 3 minutes of idling, disks /dev/da6 and /dev/da7 spin down. I can hear them spin down, I see the power consumption drop on a meter, and SMART shows much higher Start_Stop_Count and Power_Cycle_Count values on those 2 disks than on the others.
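
For anyone who wants to track those two SMART counters over time instead of eyeballing them, here is a rough Python sketch. It assumes smartctl is installed and that `smartctl -A` prints the usual ten-column ATA attribute table; the function names are made up for illustration.

```python
import subprocess

# The two attributes mentioned above; their raw values grow on each spin-down/up.
WATCHED = ("Start_Stop_Count", "Power_Cycle_Count")


def parse_smart_attributes(text, names=WATCHED):
    """Extract raw values of the named attributes from `smartctl -A` output."""
    found = {}
    for line in text.splitlines():
        fields = line.split()
        # Attribute rows: ID# NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
        if len(fields) >= 10 and fields[0].isdigit() and fields[1] in names:
            found[fields[1]] = int(fields[9])
    return found


def read_counters(device):
    """Run `smartctl -A <device>` and return the watched counters as a dict."""
    # smartctl uses a bitmask exit status, so a non-zero code is not treated as fatal.
    result = subprocess.run(
        ["smartctl", "-A", device], capture_output=True, text=True, check=False
    )
    return parse_smart_attributes(result.stdout)


def counter_deltas(before, after):
    """Return how much each counter grew between two readings."""
    return {name: after[name] - before[name] for name in before if name in after}
```

Calling `read_counters("/dev/da6")` before and after an idle period and passing both results to `counter_deltas` would show whether the drive cycled in between.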

No SMART problems or errors.

I applied a temporary fix by writing to the disk every 2 minutes, but that’s just a workaround. I’d like to fix it so that all the disks stay permanently spinning.
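
A keep-alive of that kind could look roughly like the Python sketch below. This is only an illustration of the idea, not necessarily the exact script in use here; the file path and function names are hypothetical, and the 120-second interval simply matches the "every 2 minutes" above.

```python
import os
import time


def touch_keepalive(path):
    """Write the current timestamp to `path` and force it out to disk."""
    with open(path, "w") as f:
        f.write(f"{time.time()}\n")
        f.flush()
        # fsync so the write actually reaches the pool instead of sitting in a cache.
        os.fsync(f.fileno())


def keepalive_loop(path, interval_s=120, iterations=None):
    """Touch `path` every `interval_s` seconds; run forever if `iterations` is None."""
    count = 0
    while iterations is None or count < iterations:
        touch_keepalive(path)
        count += 1
        if iterations is None or count < iterations:
            time.sleep(interval_s)
```

Something like `keepalive_loop("/mnt/tank/.keepalive")` (hypothetical pool path) would then keep writing to the pool until it is stopped.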

How can I start to debug this problem? Where should I look?

Let me know if you need anything else!

tokyo · October 20, 2024, 11:12am

Any idea what's going on here?

etorix · October 20, 2024, 1:43pm

This, unfortunately, is not quite the case. See this thread

and act accordingly: blacklist the HBA, disable Proxmox scanning, or move TrueNAS to bare metal. Also, is it really a “RAID controller”? Unrelated to your issue, that is not good either.

Chris_Holzer · October 28, 2024, 12:15pm

If Proxmox grabs the HBA and/or the disks, then (in my experience) the TrueNAS VM will not start: the passthrough fails because the HBA is in use by Proxmox and the SAS driver is loaded.

If Proxmox had imported the ZFS pool, it would show up inside Proxmox.
@tokyo did you check that? I am 99.9% sure that this is not the case.

(Making sure that the card ID loads the vfio driver rather than the SAS driver
on the Proxmox host is highly recommended for passthrough, though.)
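
One way to check which driver the host has bound to the HBA is to look at the "Kernel driver in use" line in `lspci -k`. Below is a small Python sketch that pulls that line out for a given PCI slot. The slot address `01:00.0` is only an example, the function name is made up, and the parsing assumes the usual indented `lspci -k` layout.

```python
import subprocess


def kernel_driver_in_use(lspci_text, slot):
    """Return the driver bound to PCI `slot` according to `lspci -k` output, or None."""
    in_device = False
    for line in lspci_text.splitlines():
        if line and not line[0].isspace():
            # A non-indented line starts a new device record, e.g. "01:00.0 Serial ...".
            in_device = line.startswith(slot)
        elif in_device and "Kernel driver in use:" in line:
            return line.split(":", 1)[1].strip()
    return None


def hba_driver(slot):
    """Run `lspci -k` on the host and report the driver bound to `slot`."""
    result = subprocess.run(["lspci", "-k"], capture_output=True, text=True, check=True)
    return kernel_driver_in_use(result.stdout, slot)
```

On the Proxmox host, `hba_driver("01:00.0")` returning `vfio-pci` would indicate the card is reserved for passthrough; a SAS driver name there would mean the host has claimed it.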

To me, the OP's issue sounds more like something is wrong with these 2 disks, such as a spindown delay being enforced in the drive's own configuration (which used to be possible with specific tools).
Or the HBA is misbehaving for some reason, though I have never seen that kind of issue.

tokyo · October 28, 2024, 10:33pm

@etorix @Chris_Holzer I do not know what the issue was, but downgrading from Proxmox 8.2.7 to 8.2.2 fixed it, and the disks no longer spin down.

It was not the issue where Proxmox takes over or imports my ZFS pool, because the pool was always successfully imported by TrueNAS. Those 2 disks also worked fine and I could read and write data on them; they just spun down when idle. I also tried disabling the ZFS import service on Proxmox and blacklisting the HBA card on the Proxmox host, but that didn’t help either (I guess because I had no problems importing the pool).

Chris_Holzer · October 29, 2024, 6:20am

Thanks for the info! That is very interesting.

I am still running 8.1.3 on my production system. I will see if I can do a test with 8.2.2 vs 8.2.7 on my test system. Maybe I can reproduce the behaviour.

Would be cool if you could report this in the Proxmox forums too.

Chris_Holzer · October 30, 2024, 8:50pm

I just tried to reproduce this on my test machine with PVE 8.2.7, an LSI SAS 9300-8i HBA with 8 HDDs attached, and TrueNAS Scale 24.10.

After several reboots and now 2 hours of uptime, none of the HDDs spun down.

So maybe this is a TrueNAS Core-specific issue. I will see if I can run a test with that during the weekend.

tokyo · October 30, 2024, 9:38pm

Well, what can I say—it’s working for me after that downgrade, and I have zero issues, so for now, I have no plans to upgrade Proxmox. Thanks for trying to reproduce the issues on your end—I appreciate it!

tokyo · December 30, 2024, 4:08pm

Update

It worked fine since my last reply. Today, I performed a planned and safe reboot, but since then, the problem has returned. After 4 minutes 30 seconds of idling, I can hear four disks spinning down. The TrueNAS version has not changed, Proxmox updates are disabled, and I have not performed any manual updates to any software since my last reply.