How broken is my drive?

  • At first I could not get it to complete a SMART test

  • Ran badblocks (0/0/8192 errors)

  • After badblocks it passed both a short and a long SMART test
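
For anyone reading the badblocks summary above: as far as I know, the three numbers in `(0/0/8192 errors)` are read errors, write errors and corruption (compare) errors, in that order. Here is a minimal Python sketch that splits the triple under that assumption. The function name `parse_badblocks_errors` is made up for illustration and is not part of badblocks.

```python
import re

def parse_badblocks_errors(summary: str) -> dict:
    """Split a badblocks '(R/W/C errors)' triple into named counters.

    Assumes the order is read / write / corruption errors.
    """
    match = re.search(r"(\d+)/(\d+)/(\d+)\s+errors", summary)
    if match is None:
        raise ValueError(f"no R/W/C triple found in: {summary!r}")
    read, write, corruption = (int(group) for group in match.groups())
    return {"read": read, "write": write, "corruption": corruption}

# Summary string as quoted in the post
counts = parse_badblocks_errors("(0/0/8192 errors)")
print(counts)
```

Read that way, the run reported no failed reads or writes, but 8192 blocks whose contents did not match what had been written.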

Are the bad sectors fixed, and is the drive usable for a little while longer?
Are these errors actually serious?

=== START OF INFORMATION SECTION ===
Model Family:     Seagate Desktop HDD.15
Device Model:     ST4000DM000-1F2168
Serial Number:    Z301GDZ1
LU WWN Device Id: 5 000c50 06676669c
Firmware Version: CC52
User Capacity:    4,000,787,030,016 bytes [4.00 TB]
Sector Sizes:     512 bytes logical, 4096 bytes physical
Rotation Rate:    5900 rpm
Form Factor:      3.5 inches
Device is:        In smartctl database 7.3/5528
ATA Version is:   ATA8-ACS T13/1699-D revision 4
SATA Version is:  SATA 3.1, 6.0 Gb/s (current: 6.0 Gb/s)
Local Time is:    Sun Apr 20 23:57:40 2025 PDT
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x82) Offline data collection activity
                                        was completed without error.
                                        Auto Offline Data Collection: Enabled.
Self-test execution status:      (   0) The previous self-test routine completed
                                        without error or no self-test has ever 
                                        been run.
Total time to complete Offline 
data collection:                (  623) seconds.
Offline data collection
capabilities:                    (0x7b) SMART execute Offline immediate.
                                        Auto Offline data collection on/off support.
                                        Suspend Offline collection upon new
                                        command.
                                        Offline surface scan supported.
                                        Self-test supported.
                                        Conveyance Self-test supported.
                                        Selective Self-test supported.
SMART capabilities:            (0x0003) Saves SMART data before entering
                                        power-saving mode.
                                        Supports SMART auto save timer.
Error logging capability:        (0x01) Error logging supported.
                                        General Purpose Logging supported.
Short self-test routine 
recommended polling time:        (   1) minutes.
Extended self-test routine
recommended polling time:        ( 523) minutes.
Conveyance self-test routine
recommended polling time:        (   2) minutes.
SCT capabilities:              (0x1085) SCT Status supported.

SMART Attributes Data Structure revision number: 10
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x000f   116   099   006    Pre-fail  Always       -       111677008
  3 Spin_Up_Time            0x0003   092   091   000    Pre-fail  Always       -       0
  4 Start_Stop_Count        0x0032   098   098   020    Old_age   Always       -       2580
  5 Reallocated_Sector_Ct   0x0033   100   100   010    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x000f   073   060   030    Pre-fail  Always       -       12959692552
  9 Power_On_Hours          0x0032   087   087   000    Old_age   Always       -       11894
 10 Spin_Retry_Count        0x0013   100   100   097    Pre-fail  Always       -       0
 12 Power_Cycle_Count       0x0032   099   099   020    Old_age   Always       -       1480
183 Runtime_Bad_Block       0x0032   001   001   000    Old_age   Always       -       166
184 End-to-End_Error        0x0032   100   100   099    Old_age   Always       -       0
187 Reported_Uncorrect      0x0032   001   001   000    Old_age   Always       -       507
188 Command_Timeout         0x0032   100   093   000    Old_age   Always       -       6 6 516
189 High_Fly_Writes         0x003a   096   096   000    Old_age   Always       -       4
190 Airflow_Temperature_Cel 0x0022   062   049   045    Old_age   Always       -       38 (Min/Max 22/40)
191 G-Sense_Error_Rate      0x0032   100   100   000    Old_age   Always       -       0
192 Power-Off_Retract_Count 0x0032   100   100   000    Old_age   Always       -       321
193 Load_Cycle_Count        0x0032   086   086   000    Old_age   Always       -       29716
194 Temperature_Celsius     0x0022   038   051   000    Old_age   Always       -       38 (0 12 0 0 0)
197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0010   100   100   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x003e   200   199   000    Old_age   Always       -       494
240 Head_Flying_Hours       0x0000   100   253   000    Old_age   Offline      -       8081h+45m+12.332s
241 Total_LBAs_Written      0x0000   100   253   000    Old_age   Offline      -       91890145154
242 Total_LBAs_Read         0x0000   100   253   000    Old_age   Offline      -       101018261090

SMART Error Log Version: 1
ATA Error Count: 507 (device log contains only the most recent five errors)
        CR = Command Register [HEX]
        FR = Features Register [HEX]
        SC = Sector Count Register [HEX]
        SN = Sector Number Register [HEX]
        CL = Cylinder Low Register [HEX]
        CH = Cylinder High Register [HEX]
        DH = Device/Head Register [HEX]
        DC = Device Command Register [HEX]
        ER = Error register [HEX]
        ST = Status register [HEX]
Powered_Up_Time is measured from power on, and printed as
DDd+hh:mm:SS.sss where DD=days, hh=hours, mm=minutes,
SS=sec, and sss=millisec. It "wraps" after 49.710 days.

Error 507 occurred at disk power-on lifetime: 11713 hours (488 days + 1 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  40 51 00 ff ff ff 0f  Error: UNC at LBA = 0x0fffffff = 268435455

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  60 00 00 ff ff ff 4f 00      00:24:51.735  READ FPDMA QUEUED
  60 00 00 ff ff ff 4f 00      00:24:51.735  READ FPDMA QUEUED
  60 00 00 ff ff ff 4f 00      00:24:51.735  READ FPDMA QUEUED
  60 00 00 ff ff ff 4f 00      00:24:51.735  READ FPDMA QUEUED
  60 00 00 ff ff ff 4f 00      00:24:51.735  READ FPDMA QUEUED

Error 506 occurred at disk power-on lifetime: 11713 hours (488 days + 1 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  40 51 00 ff ff ff 0f  Error: UNC at LBA = 0x0fffffff = 268435455

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  60 00 00 ff ff ff 4f 00      00:14:43.697  READ FPDMA QUEUED
  60 00 00 ff ff ff 4f 00      00:14:43.697  READ FPDMA QUEUED
  60 00 00 ff ff ff 4f 00      00:14:43.697  READ FPDMA QUEUED
  60 00 00 ff ff ff 4f 00      00:14:43.697  READ FPDMA QUEUED
  60 00 00 ff ff ff 4f 00      00:14:43.697  READ FPDMA QUEUED

Error 505 occurred at disk power-on lifetime: 11693 hours (487 days + 5 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  40 51 00 ff ff ff 0f  Error: UNC at LBA = 0x0fffffff = 268435455

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  60 00 08 ff ff ff 4f 00      07:15:01.757  READ FPDMA QUEUED
  2f 00 01 10 00 00 e0 00      07:15:01.671  READ LOG EXT
  60 00 08 ff ff ff 4f 00      07:14:58.190  READ FPDMA QUEUED
  60 00 80 ff ff ff 4f 00      07:14:58.190  READ FPDMA QUEUED
  60 00 80 ff ff ff 4f 00      07:14:58.189  READ FPDMA QUEUED

Error 504 occurred at disk power-on lifetime: 11693 hours (487 days + 5 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  40 51 00 ff ff ff 0f  Error: UNC at LBA = 0x0fffffff = 268435455

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  60 00 08 ff ff ff 4f 00      07:14:58.190  READ FPDMA QUEUED
  60 00 80 ff ff ff 4f 00      07:14:58.190  READ FPDMA QUEUED
  60 00 80 ff ff ff 4f 00      07:14:58.189  READ FPDMA QUEUED
  60 00 08 ff ff ff 4f 00      07:14:58.189  READ FPDMA QUEUED
  60 00 08 ff ff ff 4f 00      07:14:58.189  READ FPDMA QUEUED

Error 503 occurred at disk power-on lifetime: 11693 hours (487 days + 5 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  40 51 00 ff ff ff 0f  Error: UNC at LBA = 0x0fffffff = 268435455

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  60 00 80 ff ff ff 4f 00      07:14:54.651  READ FPDMA QUEUED
  60 00 00 ff ff ff 4f 00      07:14:54.650  READ FPDMA QUEUED
  60 00 00 ff ff ff 4f 00      07:14:54.649  READ FPDMA QUEUED
  60 00 00 ff ff ff 4f 00      07:14:54.647  READ FPDMA QUEUED
  60 00 00 ff ff ff 4f 00      07:14:54.646  READ FPDMA QUEUED

SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Extended offline    Completed without error       00%     11888         -
# 2  Short offline       Completed without error       00%     11879         -
# 3  Extended offline    Aborted by host               90%     11876         -
# 4  Extended offline    Aborted by host               90%     11850         -
# 5  Conveyance offline  Completed: read failure       90%     11811         -
# 6  Extended offline    Completed: read failure       90%     11800         -
# 7  Extended offline    Completed: read failure       90%     11778         -
# 8  Extended offline    Completed: read failure       90%     11777         -
# 9  Extended offline    Completed: read failure       90%     11761         -
5 of 5 failed self-tests are outdated by newer successful extended offline self-test # 1

SMART Selective self-test log data structure revision number 1
 SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
    1        0        0  Not_testing
    2        0        0  Not_testing
    3        0        0  Not_testing
    4        0        0  Not_testing
    5        0        0  Not_testing
Selective self-test flags (0x0):
  After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.

The above only provides legacy SMART information - try 'smartctl -x' for more
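
A side note on the error log above: all five entries report the same address, `0x0fffffff`. The sketch below rebuilds that number from the logged registers using the classic 28-bit LBA layout (low nibble of DH, then CH, CL, SN) and compares it with the drive size. It is my assumption, not something the log states, that this is a saturated placeholder because the legacy log cannot hold the real address of a queued read; `smartctl -x` may show more.

```python
def lba28_from_registers(sn: int, cl: int, ch: int, dh: int) -> int:
    """Assemble a 28-bit LBA from the legacy ATA task-file registers."""
    return ((dh & 0x0F) << 24) | (ch << 16) | (cl << 8) | sn

# Register values from the "40 51 00 ff ff ff 0f" line in the error log
lba = lba28_from_registers(sn=0xFF, cl=0xFF, ch=0xFF, dh=0x0F)

# Largest address that fits in 28 bits
max_lba28 = (1 << 28) - 1

# Drive size from the information section: capacity in bytes / logical sector size
total_sectors = 4_000_787_030_016 // 512

print(lba)                    # 268435455
print(lba == max_lba28)       # True
print(total_sectors)          # 7814037168
```

So the logged value is exactly the 28-bit ceiling, on a drive with far more sectors than 28 bits can address.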

I don’t know how you arrive at this conclusion…
UDMA CRC errors point rather to the cable and/or controller, and a quick search suggests that the drive is CMR. Seems fine to me, although way too small.

These errors are not great:
187 Reported_Uncorrect      0x0032   001   001   000    Old_age   Always       -       507

Going by Backblaze’s interpretation, a non-zero 187 is a common sign of a failing drive.

There’s also the history of failed long SMART tests. The last one went okay, but what happened during the previous four, plus the failed conveyance test?

The drive looks dodgy to me. I recommend you monitor the numbers to see whether they keep going up, and get a replacement strategy in place.
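
One way to monitor the numbers is to save the attribute table periodically and compare raw values between snapshots. Below is a minimal Python sketch, assuming the table layout shown in the first post. The helper names, the list of watched IDs, and the "new" snapshot (where 187 has risen to 509) are all made up for illustration.

```python
def parse_raw_values(smartctl_text: str) -> dict:
    """Map attribute ID -> (name, raw value string) from attribute table rows."""
    attrs = {}
    for line in smartctl_text.splitlines():
        fields = line.split(None, 9)  # the tenth field keeps the whole RAW_VALUE
        # Attribute rows start with a numeric ID and carry a hex FLAG column
        if len(fields) == 10 and fields[0].isdigit() and fields[2].startswith("0x"):
            attrs[int(fields[0])] = (fields[1], fields[9].strip())
    return attrs

def changed_counters(old_text: str, new_text: str, watch=(5, 187, 197, 198, 199)) -> dict:
    """Return {id: (old_raw, new_raw)} for watched attributes whose raw value changed."""
    old, new = parse_raw_values(old_text), parse_raw_values(new_text)
    return {
        attr_id: (old[attr_id][1], new[attr_id][1])
        for attr_id in watch
        if attr_id in old and attr_id in new and old[attr_id][1] != new[attr_id][1]
    }

# Rows copied from the posted output
OLD = """
  5 Reallocated_Sector_Ct   0x0033   100   100   010    Pre-fail  Always       -       0
187 Reported_Uncorrect      0x0032   001   001   000    Old_age   Always       -       507
199 UDMA_CRC_Error_Count    0x003e   200   199   000    Old_age   Always       -       494
"""

# Hypothetical later snapshot, invented for this example
NEW = """
  5 Reallocated_Sector_Ct   0x0033   100   100   010    Pre-fail  Always       -       0
187 Reported_Uncorrect      0x0032   001   001   000    Old_age   Always       -       509
199 UDMA_CRC_Error_Count    0x003e   200   199   000    Old_age   Always       -       494
"""

print(changed_counters(OLD, NEW))
```

In this invented example only 187 moved, while 199 stayed at 494, which would point at the drive rather than the cable.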

My guess, with very limited understanding, is that the bad sectors/blocks were not corrected or replaced until badblocks tried to write data to them. That would explain why the earlier SMART tests failed and the latest ones (after badblocks) passed.

Thanks for your advice, will monitor!

I would say that the drive is fine and, as @etorix said, the UDMA_CRC errors are likely a cable. However, if those are not incrementing, the problem is likely an old one.

Note that ID 187 is flagged as an Old_age attribute, not Pre-fail. You have no failure indications.
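
To make the Old_age versus Pre-fail point concrete, here is a small Python sketch that reads the TYPE, VALUE and THRESH columns of two rows from the posted table. The rule it encodes (an attribute trips only when its normalized value drops to or below a non-zero threshold) is my understanding of how smartctl evaluates attributes, so treat it as an assumption. The `classify` helper is made up for illustration.

```python
def classify(row: str) -> dict:
    """Summarise one attribute table row: type, threshold and whether it can trip."""
    fields = row.split(None, 9)
    value, thresh = int(fields[3]), int(fields[5])
    return {
        "id": int(fields[0]),
        "name": fields[1],
        "type": fields[6],          # "Pre-fail" or "Old_age"
        "value": value,
        "thresh": thresh,
        # Assumption: a threshold of 0 means the attribute can never be flagged
        "can_trip": thresh > 0,
        "failing_now": thresh > 0 and value <= thresh,
    }

# Rows copied from the posted output
row_187 = "187 Reported_Uncorrect      0x0032   001   001   000    Old_age   Always       -       507"
row_5 = "  5 Reallocated_Sector_Ct   0x0033   100   100   010    Pre-fail  Always       -       0"

for row in (row_187, row_5):
    info = classify(row)
    print(info["id"], info["type"], info["value"], info["thresh"], info["can_trip"], info["failing_now"])
```

Under that assumption, 187 cannot change the overall PASSED verdict however high its raw count goes: its normalized value is already at 001 against a threshold of 000. That is worth keeping in mind when reading "no failure indications".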

To address the SMART test failures: typically, if a drive has a few failures (especially Extended test failures), they do not go away, but in your case they did. Sounds crazy. I would advise you to run a weekly Long/Extended test and daily Short tests. Keep an eye on the drive, and if the failures scroll off the list (meaning the newer tests pass), just keep running those tests. A Long test takes just under 9 hours, which is well worth it.
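
The "just under 9 hours" figure matches the recommended polling times in the posted smartctl output (523 minutes for the extended test, 1 minute for the short test). Here is a quick Python check of that arithmetic, plus the weekly total for one long and seven short tests. The constant names are only for illustration, and actual run times may be longer if the drive is busy.

```python
EXTENDED_POLL_MINUTES = 523  # "Extended self-test routine recommended polling time"
SHORT_POLL_MINUTES = 1       # "Short self-test routine recommended polling time"

# Convert the extended test time to hours and minutes
hours, minutes = divmod(EXTENDED_POLL_MINUTES, 60)
print(f"Long test: {hours} h {minutes} min")          # Long test: 8 h 43 min

# One long test per week plus one short test per day
weekly_minutes = EXTENDED_POLL_MINUTES + 7 * SHORT_POLL_MINUTES
minutes_per_week = 7 * 24 * 60
print(f"Weekly testing: {weekly_minutes} min")        # Weekly testing: 530 min
print(f"Share of the week: {weekly_minutes / minutes_per_week:.1%}")
```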

An alternative option: leave those test failures in the drive's log and RMA it.

thanks for your input!

will schedule weekly tests

it's long out of warranty, and I'm kind of hoping it will fail so I need to buy a bigger drive, or you know, a few of them :upside_down_face:

Do not wait for an actual failure, with the risk to your data and the need to act in an emergency. Buy and burn in new drives as you find opportunities.

Well since you put it that way…

Those SMART long/extended test results are hard failures; this drive should be replaced before it fails completely. And do not forget the other drives in your system, which may be just as old. :joy:

In all seriousness :clown_face:, I am surprised that the extended tests started to pass. I have never seen that before, at least not the way your drive shows it. I really would replace that drive, especially if you already had plans to replace your drives anyway. As @etorix said above, don’t wait. Do it on your schedule rather than after a complete failure; it is never enjoyable when that happens.