Has anyone else noticed that running multiple SMART long tests at once on SCALE takes noticeably longer to complete than on CORE?
I appreciate my example is rather extreme with 90 drives. On my CORE systems I’ve scheduled a SMART long test on all my drives at the same time on a given date and never noticed an issue over many years. On my two new SCALE systems (version 24.10.2), however, the tests take much longer and often don’t complete before they are due to run again (monthly).
A quick look online does suggest some fundamental differences in how FreeBSD and Linux run SMART tests.
I’m going to try staggering them in batches of, say, 20 and see if that improves things, but I was curious to hear whether anyone else had noticed this.
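For anyone wanting to try the same staggering idea, here is a minimal Python sketch of splitting a drive list into batches of 20. The device names are hypothetical placeholders, and the helper only builds `smartctl -t long` command lines without running them; how and when each batch actually gets started is left open.

```python
from typing import List


def make_batches(drives: List[str], batch_size: int = 20) -> List[List[str]]:
    """Split the drive list into consecutive batches of at most batch_size."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    return [drives[i:i + batch_size] for i in range(0, len(drives), batch_size)]


def long_test_commands(batch: List[str]) -> List[List[str]]:
    """Build (but do not run) one `smartctl -t long` command per drive."""
    return [["smartctl", "-t", "long", dev] for dev in batch]


# Hypothetical device names; a real system would enumerate its own drives.
drives = [f"/dev/disk{n}" for n in range(90)]
batches = make_batches(drives, 20)

# 90 drives in batches of 20 gives four full batches and one of 10.
print([len(b) for b in batches])
```

Each batch could then be started on a different day, so no more than 20 drives are under test at once.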
I’d strongly recommend you take a look at Multi-Report.
It’s a handy script, written specifically for TrueNAS, that schedules SMART testing for the HDDs in your systems with the intention of spreading the tests over a calendar month. It emails you a report on your overall disk health each time it runs.
Impossible. Okay, possible, but only because the drives must be significantly more active on your CORE system. That is the only thing that will slow a SMART test down: priority goes to requested operations, and SMART tests are at the bottom of the priority list.
Drive_Selftest (part of Multi-Report) will schedule all your drives for you; 90 drives is a lot. It can perform drive tests “All Daily”, Once a Week, or Once a Month.
To see what it would do, grab the script from GitHub and run it using the -demo switch. That will give you the default setup output. What you will really want to see is -demo long 1 month, which will let you know which drives would be Long tested on which day of the month. If you do not want testing on the weekend, edit the script settings to remove Saturday and Sunday testing, then run the demo again.
The catch, for now, is that you must run the script every day. If you miss a day, the drive(s) scheduled for that day are missed. I am working to resolve that; it is something I didn’t plan to need to do, but it is in the next version, which I hope will be out next month.
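To illustrate the general idea of spreading Long tests over a month while skipping weekends, here is a rough Python sketch. It is not how Drive_Selftest actually assigns drives; the function name, the round-robin assignment, and the drive names are all assumptions made for illustration.

```python
import calendar
from datetime import date
from typing import Dict, List, Sequence


def monthly_schedule(
    drives: Sequence[str],
    year: int,
    month: int,
    skip_weekdays: Sequence[int] = (5, 6),  # 5 = Saturday, 6 = Sunday
) -> Dict[date, List[str]]:
    """Assign drives round-robin to the allowed days of one calendar month."""
    days_in_month = calendar.monthrange(year, month)[1]
    test_days = [
        date(year, month, d)
        for d in range(1, days_in_month + 1)
        if date(year, month, d).weekday() not in skip_weekdays
    ]
    if not test_days:
        raise ValueError("no test days left after skipping weekdays")
    schedule: Dict[date, List[str]] = {day: [] for day in test_days}
    for index, drive in enumerate(drives):
        schedule[test_days[index % len(test_days)]].append(drive)
    return schedule


# Hypothetical drive names, spread across March 2025 without weekends.
drives = [f"drive{n:02d}" for n in range(90)]
schedule = monthly_schedule(drives, 2025, 3)
for day, assigned in schedule.items():
    print(day.isoformat(), len(assigned))
```

A scheme like this also shows why a missed day matters: each drive belongs to exactly one day, so skipping that day skips its drives until the next month.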
Thanks all, it’s nice to just have a sounding board sometimes.
Mystery solved. It appears to be a difference in the drives’ firmware versions.
On these new systems, which are coincidentally running SCALE (hence my assumption), I’m running 24TB Seagate EXOS SAS drives. Checking the 180 drives (two systems), I have a mixture of versions EE03, EE04 and EE05. It appears that EE03 doesn’t like doing long SMART tests and seemingly freezes at 100%. EE05 is quite happy and works as expected, while EE04 seems a bit hit and miss.
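For anyone wanting to run the same check on their own drives, here is a small Python sketch that groups drives by firmware version from saved `smartctl -i` output. As far as I know, smartctl reports the firmware on a `Revision:` line for SAS/SCSI drives and a `Firmware Version:` line for ATA drives, so the pattern accepts either. The sample text and device names below are made up for illustration and are not output from the systems described above.

```python
import re
from collections import defaultdict
from typing import Dict, List, Optional

# Matches the firmware line in `smartctl -i` text (SCSI or ATA style).
_FIRMWARE_RE = re.compile(
    r"^(?:Revision|Firmware Version):\s*(\S+)", re.MULTILINE
)


def firmware_from_info(info_text: str) -> Optional[str]:
    """Return the firmware string found in the text, or None if absent."""
    match = _FIRMWARE_RE.search(info_text)
    return match.group(1) if match else None


def group_by_firmware(info_by_drive: Dict[str, str]) -> Dict[str, List[str]]:
    """Map each firmware version to the list of drives reporting it."""
    groups: Dict[str, List[str]] = defaultdict(list)
    for drive, text in info_by_drive.items():
        groups[firmware_from_info(text) or "unknown"].append(drive)
    return dict(groups)


# Made-up sample text standing in for real `smartctl -i` output.
sample = {
    "/dev/diskA": "Vendor:               SEAGATE\nRevision:             EE03\n",
    "/dev/diskB": "Vendor:               SEAGATE\nRevision:             EE05\n",
    "/dev/diskC": "Vendor:               SEAGATE\nRevision:             EE03\n",
}
print(group_by_firmware(sample))
```

Grouping this way makes it easy to compare which firmware versions finish their long tests and which stall.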