HDD Reported Faulty in TrueNAS But OK in HD Sentinel

Recently my system has been reporting a faulty hard drive. I reset the warnings once as a check, but eventually the same drive became “faulty” again and the pool went into a degraded state. smartctl did seem to indicate the drive was “pre-fail”, “old age” etc. on a few attributes. I decided to replace the hard drive (WD Red 3TB) with a WD Red Pro (6TB).

NB: 3TB versions are no longer available and the 6TB WD Red Plus wasn’t readily available at my local computer store.

I decided to do a quick recheck of the drive using HD Sentinel on my Windows machine. I did a full drive reinitialization [it took 24 hours and includes wiping the disk] and the check reported that the drive was OK. No issues. Output below.

I’m now somewhat confused. Was the drive actually faulty? If not, I can keep it aside as a spare. What could cause one system to report “pre-fail” whilst another says “a-ok!”? Are there any additional tests I can do to try and get a more definitive outcome?

As I wasn’t expecting different outcomes, I did not keep a copy of the smartctl output from TrueNAS.

Thanks in advance

This may help?


You don’t need to “keep” it: it’s still available on the drive. Get it!
(smartctl -a /dev/adaN or sdX on TrueNAS, whatever it takes on Windows)
I’d trust TrueNAS rather than a graphical utility on Windows to correctly report on potential issues.

HDD Sentinel is a good application, however I’m not sure why the sector map was presented instead of the SMART stats it shows prior to a scan.

@wraith what was Sentinel showing on the disk health page or SMART info? Why is it showing 746GB instead of 3TB? And finally…

8MB/s for the test is EXTREMELY slow. Is this one of those SMR drives? EDIT: WD30EFAX is supposed to be SMR and WD30EFRX CMR, so … Did you have it on a very slow old USB dock while you were doing a scan, or something?

Okedokey
Thanks for the link. Will have a look through and try those items out and revert.

@etorix
Thanks for this, I didn’t realise it! I used HD Sentinel to get what appears to be the SMART history. Being a new user I’m unable to upload the text file; the report is 34000 characters long. But the report has a list of Extended Self-tests that were completed successfully, extract below. Apologies for the formatting, I'm struggling a little with the new forum format.

-- Summary SMART error log (Page: 1, subpage: 0) --

Version : 1
Most recent index : 0 [0]
No error information recorded

-- Comprehensive SMART error log (Page: 2, subpage: 0) --

Version : 1
Most recent index : 0 [0]
Device error count : 0 [0]
No error information recorded

-- Extended comprehensive SMART error log (Page: 3, subpage: 0) --

Version : 1
Most recent index : 0 [0]
Device error count : 0 [0]
No error information recorded

-- Extended comprehensive SMART error log (Page: 3, subpage: 1) --

No entries found

-- SMART self-test log (Page: 6, subpage: 0) --

Version : 1
Most recent index : 12 [+1]
No. Timestamp Test type Result Remaining LBA (Info)
1 5330 (222 days, 2 hours) Short Self-test Successfully Completed - -
2 5498 (229 days, 2 hours) Short Self-test Successfully Completed - -
3 5666 (236 days, 2 hours) Short Self-test Successfully Completed - -
4 5680 (236 days, 16 hours) Extended Self-test Successfully Completed - -
5 5834 (243 days, 2 hours) Short Self-test Successfully Completed - -
6 5835 (243 days, 3 hours) Short Self-test Successfully Completed - -
7 5835 (243 days, 3 hours) Conveyance Self-test Aborted By Host 80% 3444664496 (105)
8 5837 (243 days, 5 hours) Short Self-test Successfully Completed - -
9 5884 (245 days, 4 hours) Extended Self-test Successfully Completed - -
10 5890 (245 days, 10 hours) Short Self-test Successfully Completed - -
11 5928 (247 days, 0 hours) Extended Self-test Aborted By Host 90% 833072 (103)
12 5935 (247 days, 7 hours) Extended Self-test Successfully Completed - -
13 4160 (173 days, 8 hours) Short Self-test Successfully Completed - -
14 4327 (180 days, 7 hours) Short Self-test Successfully Completed - -
15 4359 (181 days, 15 hours) Extended Self-test Successfully Completed - -
16 4495 (187 days, 7 hours) Short Self-test Successfully Completed - -
17 4663 (194 days, 7 hours) Short Self-test Successfully Completed - -
18 4827 (201 days, 3 hours) Short Self-test Successfully Completed - -
19 4994 (208 days, 2 hours) Short Self-test Successfully Completed - -
20 5098 (212 days, 10 hours) Extended Self-test Successfully Completed - -
21 5162 (215 days, 2 hours) Short Self-test Successfully Completed

S.M.A.R.T. ------------

No. Attribute Thre… Value Worst Data Status Flags

1 Raw Read Error Rate 51 200 200 0 OK
3 Spin Up Time 21 178 177 6075
4 Start/Stop Count 0 100 100 149 OK (Always passing)
5 Reallocated Sectors Co… 140 200 200 0 OK
7 Seek Error Rate 0 200 200 0 OK (Always passing)
9 Power On Time Count 0 3 3 71488 OK (Always passing)
10 Spin Retry Count 0 100 100 0 OK (Always passing)
11 Drive Calibration Retr… 0 100 100 0 OK (Always passing)
12 Drive Power Cycle Count 0 100 100 147 OK (Always passing)
192 Power off Retract Cycl… 0 200 200 146 OK (Always passing)
193 Load/Unload Cycle Count 0 200 200 2 OK (Always passing)
194 Disk Temperature 0 120 101 30 OK (Always passing)
196 Reallocation Event Count 0 200 200 0 OK (Always passing)
197 Current Pending Sector… 0 200 200 0 OK (Always passing)
198 Off-Line Uncorrectable… 0 100 253 0 OK (Always passing)
199 Ultra ATA CRC Error Co… 0 200 200 0 OK (Always passing)
200 Write Error Rate 0 200 200 0 OK (Always passing)

Hi @Jorsher
The graphical view was shown because I wasn’t aware I could get the SMART history from the drive. Plus it visually showed all the “green” :slight_smile:
I also noted the drive capacity anomaly and thought the reinitialization would resolve that. It didn’t. After a bit of hunting around I found that this error is discussed on the HD Sentinel website and is a function of my older caddy, which is slow and can only support drives up to 2TB. I’ll be getting a new drive caddy this weekend and will try again. I assume this won’t impact SMART tests, as these are run by the drive…?

Hi All.

I got myself a new caddy that supports drives up to 16TB. The hard drive’s 3TB capacity is now correctly identified. I repeated the surface reinitialization check via HD Sentinel and ran another SMART test, and:

  • HD Sentinel reinitialization reported no errors and that the disk is PERFECT. Picture attached for completeness. Getting speeds up to 40MB/sec now :slight_smile:
  • SMART extended test was completed successfully. A few earlier tests show as “aborted by host”, but the more recent ones completed. Report from HD Sentinel attached.

Any thoughts?

Disk report 2024 09 12.txt (15.9 KB)

The drive looks OK from the SMART data. It might have been a problem with a cable or something transient. If your replacement drive on the same port and cable also throws an error at some point, I’d take a hard look at the cable.

Thanks @Alexey ,

I did consider a faulty cable but didn’t think that a faulty cable would distort the SMART results…? There are other drives in the case showing “old age” and “pre-fail” and I’m now wondering…

“Old age” and “Pre-fail” are just SMART attribute type tags. They mean “if Value drops to or below Threshold for this attribute, the drive is considered worn beyond its design limit” or “if Value drops to or below Threshold for this one, the drive is likely to fail soon”.

smartctl always displays the tags, whatever the drive's condition. To know whether the condition has actually occurred, compare the columns:
VALUE at or below THRESH = condition exists now.
WORST at or below THRESH but VALUE above THRESH = condition existed at some point but has since recovered.
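
As a rough illustration of that rule, here is a minimal Python sketch. The function name `smart_condition` and its return strings are made up for this example and are not part of smartctl or HD Sentinel; the sample rows are attributes 1, 3, 5 and 9 from the HD Sentinel table posted earlier in the thread. It assumes the smartctl convention that an attribute counts as failed when the normalized value is less than or equal to the threshold.

```python
def smart_condition(value: int, worst: int, thresh: int) -> str:
    """Classify one SMART attribute from its normalized VALUE, WORST and THRESH.

    Hypothetical helper for illustration only.
    """
    if thresh == 0:
        # A threshold of 0 can never be reached, which is why HD Sentinel
        # labels these rows "Always passing".
        return "always passing"
    if value <= thresh:
        return "failing now"
    if worst <= thresh:
        return "failed in the past"
    return "ok"


# (attribute id, threshold, value, worst) copied from the posted table
rows = [
    (1, 51, 200, 200),
    (3, 21, 178, 177),
    (5, 140, 200, 200),
    (9, 0, 3, 3),
]

for attr_id, thresh, value, worst in rows:
    print(f"attribute {attr_id}: {smart_condition(value, worst, thresh)}")
```

None of the sample rows is at or below its threshold, so by this rule the tags alone do not indicate a failure on this drive.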

I scrolled up and I don’t see any smartctl output, only HD Sentinel. If you send some smartctl output from the actual unit, I can interpret it for you.

Hi @Alexey

Is this it [about a third of the way down the .txt file]?

S.M.A.R.T.

No. Attribute Thre… Value Worst Data Status Flags
1 Raw Read Error Rate 51 200 200 0 OK Self Preserving, Error-Rate, Performance, Statistica…
3 Spin Up Time 21 181 177 5950 OK Self Preserving, Performance, Statistical, Critical
4 Start/Stop Count 0 100 100 160 OK (Always passing) Self Preserving, Event Count, Statistical
5 Reallocated Sectors Co… 140 200 200 0 OK Self Preserving, Event Count, Statistical, Critical
7 Seek Error Rate 0 200 200 0 OK (Always passing) Self Preserving, Error-Rate, Performance, Statistical
9 Power On Time Count 0 2 2 71632 OK (Always passing) Self Preserving, Event Count, Statistical
10 Spin Retry Count 0 100 100 0 OK (Always passing) Self Preserving, Event Count, Statistical
11 Drive Calibration Retr… 0 100 100 0 OK (Always passing) Self Preserving, Event Count, Statistical
12 Drive Power Cycle Count 0 100 100 152 OK (Always passing) Self Preserving, Event Count, Statistical
192 Power off Retract Cycl… 0 200 200 148 OK (Always passing) Self Preserving, Event Count, Statistical
193 Load/Unload Cycle Count 0 200 200 11 OK (Always passing) Self Preserving, Event Count, Statistical
194 Disk Temperature 0 118 101 32 OK (Always passing) Self Preserving, Statistical
196 Reallocation Event Count 0 200 200 0 OK (Always passing) Self Preserving, Event Count, Statistical
197 Current Pending Sector… 0 200 200 0 OK (Always passing) Self Preserving, Event Count, Statistical
198 Off-Line Uncorrectable… 0 100 253 0 OK (Always passing) Self Preserving, Event Count
199 Ultra ATA CRC Error Co… 0 200 200 0 OK (Always passing) Self Preserving, Event Count, Statistical
200 Write Error Rate 0 200 200 0 OK (Always passing) Error-Rate

Attribute 9 (Power On Time Count), which is Threshold 0, Value 2, is the only thing I find interesting, because it normally counts down from Value 100, and 2 roughly translates to “2% of design life span remaining”. The hour count of 71632 suggests the drive is 8 years old or thereabouts.

Interesting on the hour count. I’ve found the receipt for this very drive and I bought it early November 2012. It’s nearly 12 years old!
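
The two figures are not in conflict: the hour count measures powered-on time, not calendar age. A small Python sketch of the arithmetic follows. The helper names are hypothetical, the purchase day is assumed to be 1 November 2012 (the post only says "early November"), and the report date is assumed from the attached file name.

```python
from datetime import date

HOURS_PER_YEAR = 24 * 365.25  # average year length, including leap days


def power_on_years(hours: int) -> float:
    """Convert a SMART power-on hour count into years of running time."""
    return hours / HOURS_PER_YEAR


def powered_fraction(hours: int, bought: date, reported: date) -> float:
    """Fraction of the drive's calendar life that it spent powered on."""
    calendar_hours = (reported - bought).days * 24
    return hours / calendar_hours


hours = 71632                 # raw value of attribute 9 in the second report
bought = date(2012, 11, 1)    # assumption: exact day in November is unknown
reported = date(2024, 9, 12)  # assumption: taken from the report's file name

print(f"{power_on_years(hours):.1f} years powered on")
print(f"{powered_fraction(hours, bought, reported):.0%} of its calendar life")
```

Under those assumptions the drive has a little over 8 years of running time spread across nearly 12 calendar years, which is consistent with a NAS that was powered off for part of its life.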