Currently Unreadable (pending) Sectors all over system

My TrueNAS build is new, and it is quite a large system. I have been getting the “2 Currently unreadable (pending) sectors” error.
It is happening to random drives throughout the system.

The current layout is as follows.
OS: 2x 240 GB SSD in a mirror, connected to SATA on the motherboard
VM: 2x 1 TB NVMe SSD in a mirror, in the NVMe slots on the motherboard
Array1: 8x 4 TB HDD in RAIDZ2, connected via an Adaptec 82885T expander, which is connected to a 9305-16E HBA running IT firmware.
All of the above are in a Norco RPC-4308 case.

Array2: 24x 4 TB SSD, 3 vdevs of 8x SSD each in RAIDZ2
Connected via an Adaptec 82885T expander, which is connected to the HBA above via an 8044 cable. This is in a standalone Norco RPC-4224 case.

Array3: 24x 4 TB SSD, 3 vdevs of 8x SSD each in RAIDZ2
Connected via an Adaptec 82885T expander, which is connected to the HBA above via an 8044 cable. This is in a standalone Norco RPC-4224 case.

At first I thought it might be a bug with the SAS system, since it was only happening to the SSDs connected to it, but then yesterday it happened to one of the HDDs. Then today it happened to one of the OS SSDs and one of the VM SSDs. I have run tests on all of the drives this popped up for, and they all check out. What’s even weirder is that almost all of these drives are new.

Any input would be appreciated. I have searched Google and this forum, and I can’t seem to find any info relevant to this particular case.

Thanks Folks

Yes, you can have too much storage, it happens when you cannot afford the power bill. :slight_smile:

As for your little problem… perform the following (you may need to start the commands with sudo if you are not logged in as root):

  1. Run smartctl -x /dev/sdb and post the output in code tags (the </> button above).
  2. Note the drive serial number.
  3. Record the RAW values for IDs 5, 187, and 198. If any of these is not zero (0), that is a failure indication.
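
If you would rather not pick the numbers out by eye, step 3 can be scripted. This is only a minimal Python sketch, not an official tool: it assumes the attribute table layout that smartctl -x prints (ID#, name, FLAGS, VALUE, WORST, THRESH, FAIL, RAW_VALUE), the function names are made up for illustration, and the sample rows are illustrative.

```python
import re

# One row of the attribute table as printed by `smartctl -x` (assumed layout):
# ID# ATTRIBUTE_NAME FLAGS VALUE WORST THRESH FAIL RAW_VALUE
ROW = re.compile(
    r"^\s*(\d+)\s+(\S+)\s+([A-Z-]{6})\s+(\S+)\s+(\S+)\s+(\S+)\s+(\S+)\s+(\d+)"
)

WATCHED_IDS = (5, 187, 198)  # the IDs named in step 3

# Illustrative rows only; paste real smartctl output here instead.
SAMPLE = """\
  5 Reallocated_Sector_Ct   PO--CK   200   200   140    -    0
197 Current_Pending_Sector  -O--CK   200   200   000    -    2
198 Offline_Uncorrectable   ----CK   100   253   000    -    0
"""


def parse_raw_values(smart_text):
    """Return {attribute_id: (name, raw_value)} for every table row found."""
    rows = {}
    for line in smart_text.splitlines():
        m = ROW.match(line)
        if m:
            rows[int(m.group(1))] = (m.group(2), int(m.group(8)))
    return rows


def nonzero_watched(smart_text, watched=WATCHED_IDS):
    """Return the watched attributes whose raw value is not zero."""
    rows = parse_raw_values(smart_text)
    return {i: rows[i] for i in watched if i in rows and rows[i][1] != 0}


if __name__ == "__main__":
    print(nonzero_watched(SAMPLE))
```

An empty result means none of the three watched IDs has a nonzero raw value. Rows whose raw value does not start with a plain decimal number are skipped by this sketch.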

Feel free to try out my Drive Troubleshooting Flowcharts linked in my signature. I would be glad to hear feedback on how I could improve them. They will also tell you if the drive should be replaced.

And after you post that output from step 1, I will tell you what those results are.

So true… Before the SSD upgrade I was running SAS drives, and yes, the power bill was ouch. On the other hand, even though there are 48+ SSDs, they are not too bad on power. At idle the entire system only draws 115-125 watts, and under load 175-200.

As requested.

This is the HDD from the screenshot in my post. It is a slightly older 4 TB WD. Yes, I know it is a Purple drive, but I have a ton of these from a CCTV project, and I have found them to be the longest-lasting drives I have encountered.

Model Family:     Western Digital Purple (Pro)
Device Model:     WDC WD40PURX-64GVNY0
Serial Number:    WD-WCC4EE1YTPD1
LU WWN Device Id: 5 0014ee 260192364
Firmware Version: 80.00A80
User Capacity:    4,000,787,030,016 bytes [4.00 TB]
Sector Sizes:     512 bytes logical, 4096 bytes physical
Rotation Rate:    5400 rpm
Device is:        In smartctl database 7.3/5671
ATA Version is:   ACS-2 (minor revision not indicated)
SATA Version is:  SATA 3.0, 6.0 Gb/s (current: 6.0 Gb/s)
Local Time is:    Tue Apr 22 08:57:46 2025 CDT
SMART support is: Available - device has SMART capability.
SMART support is: Enabled
AAM feature is:   Unavailable
APM feature is:   Unavailable
Rd look-ahead is: Enabled
Write cache is:   Enabled
DSN feature is:   Unavailable
ATA Security is:  Disabled, NOT FROZEN [SEC1]
Wt Cache Reorder: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x04) Offline data collection activity
                                        was suspended by an interrupting command from host.
                                        Auto Offline Data Collection: Disabled.
Self-test execution status:      (   0) The previous self-test routine completed
                                        without error or no self-test has ever 
                                        been run.
Total time to complete Offline 
data collection:                (52980) seconds.
Offline data collection
capabilities:                    (0x7b) SMART execute Offline immediate.
                                        Auto Offline data collection on/off support.
                                        Suspend Offline collection upon new
                                        command.
                                        Offline surface scan supported.
                                        Self-test supported.
                                        Conveyance Self-test supported.
                                        Selective Self-test supported.
SMART capabilities:            (0x0003) Saves SMART data before entering
                                        power-saving mode.
                                        Supports SMART auto save timer.
Error logging capability:        (0x01) Error logging supported.
                                        General Purpose Logging supported.
Short self-test routine 
recommended polling time:        (   2) minutes.
Extended self-test routine
recommended polling time:        ( 530) minutes.
Conveyance self-test routine
recommended polling time:        (   5) minutes.
SCT capabilities:              (0x703d) SCT Status supported.
                                        SCT Error Recovery Control supported.
                                        SCT Feature Control supported.
                                        SCT Data Table supported.

SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAGS    VALUE WORST THRESH FAIL RAW_VALUE
  1 Raw_Read_Error_Rate     POSR-K   200   200   051    -    2
  3 Spin_Up_Time            POS--K   181   178   021    -    7941
  4 Start_Stop_Count        -O--CK   099   099   000    -    1410
  5 Reallocated_Sector_Ct   PO--CK   200   200   140    -    0
  7 Seek_Error_Rate         -OSR-K   200   200   000    -    0
  9 Power_On_Hours          -O--CK   071   071   000    -    21489
 10 Spin_Retry_Count        -O--CK   100   100   000    -    0
 11 Calibration_Retry_Count -O--CK   100   100   000    -    0
 12 Power_Cycle_Count       -O--CK   100   100   000    -    743
192 Power-Off_Retract_Count -O--CK   200   200   000    -    679
193 Load_Cycle_Count        -O--CK   200   200   000    -    1540
194 Temperature_Celsius     -O---K   110   103   000    -    42
196 Reallocated_Event_Count -O--CK   200   200   000    -    0
197 Current_Pending_Sector  -O--CK   200   200   000    -    2
198 Offline_Uncorrectable   ----CK   100   253   000    -    0
199 UDMA_CRC_Error_Count    -O--CK   200   200   000    -    0
200 Multi_Zone_Error_Rate   ---R--   100   253   000    -    0
                            ||||||_ K auto-keep
                            |||||__ C event count
                            ||||___ R error rate
                            |||____ S speed/performance
                            ||_____ O updated online
                            |______ P prefailure warning

General Purpose Log Directory Version 1
SMART           Log Directory Version 1 [multi-sector log support]
Address    Access  R/W   Size  Description
0x00       GPL,SL  R/O      1  Log Directory
0x01           SL  R/O      1  Summary SMART error log
0x02           SL  R/O      5  Comprehensive SMART error log
0x03       GPL     R/O      6  Ext. Comprehensive SMART error log
0x06           SL  R/O      1  SMART self-test log
0x07       GPL     R/O      1  Extended self-test log
0x09           SL  R/W      1  Selective self-test log
0x10       GPL     R/O      1  NCQ Command Error log
0x11       GPL     R/O      1  SATA Phy Event Counters log
0x21       GPL     R/O      1  Write stream error log
0x22       GPL     R/O      1  Read stream error log
0x80-0x9f  GPL,SL  R/W     16  Host vendor specific log
0xa0-0xa7  GPL,SL  VS      16  Device vendor specific log
0xa8-0xb7  GPL,SL  VS       1  Device vendor specific log
0xbd       GPL,SL  VS       1  Device vendor specific log
0xc0       GPL,SL  VS       1  Device vendor specific log
0xc1       GPL     VS      93  Device vendor specific log
0xe0       GPL,SL  R/W      1  SCT Command/Status
0xe1       GPL,SL  R/W      1  SCT Data Transfer

SMART Extended Comprehensive Error Log Version: 1 (6 sectors)
Device Error Count: 2
        CR     = Command Register
        FEATR  = Features Register
        COUNT  = Count (was: Sector Count) Register
        LBA_48 = Upper bytes of LBA High/Mid/Low Registers ]  ATA-8
        LH     = LBA High (was: Cylinder High) Register    ]   LBA
        LM     = LBA Mid (was: Cylinder Low) Register      ] Register
        LL     = LBA Low (was: Sector Number) Register     ]
        DV     = Device (was: Device/Head) Register
        DC     = Device Control Register
        ER     = Error register
        ST     = Status register
Powered_Up_Time is measured from power on, and printed as
DDd+hh:mm:SS.sss where DD=days, hh=hours, mm=minutes,
SS=sec, and sss=millisec. It "wraps" after 49.710 days.

Error 2 [1] occurred at disk power-on lifetime: 21199 hours (883 days + 7 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER -- ST COUNT  LBA_48  LH LM LL DV DC
  -- -- -- == -- == == == -- -- -- -- --
  40 -- 51 00 00 00 01 b7 29 5e 48 40 00  Error: UNC at LBA = 0x1b7295e48 = 7367908936

  Commands leading to the command that caused the error were:
  CR FEATR COUNT  LBA_48  LH LM LL DV DC  Powered_Up_Time  Command/Feature_Name
  -- == -- == -- == == == -- -- -- -- --  ---------------  --------------------
  60 08 00 00 08 00 01 b7 29 62 f8 40 00     00:20:51.190  READ FPDMA QUEUED
  60 08 00 00 00 00 01 b7 29 5a f8 40 00     00:20:51.190  READ FPDMA QUEUED
  60 08 00 00 08 00 01 b7 29 52 f8 40 00     00:20:51.164  READ FPDMA QUEUED
  60 08 00 00 00 00 01 b7 29 4a f8 40 00     00:20:51.164  READ FPDMA QUEUED
  60 08 00 00 08 00 01 b7 29 42 f8 40 00     00:20:51.139  READ FPDMA QUEUED

Error 1 [0] occurred at disk power-on lifetime: 21199 hours (883 days + 7 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER -- ST COUNT  LBA_48  LH LM LL DV DC
  -- -- -- == -- == == == -- -- -- -- --
  40 -- 51 00 00 00 01 b7 40 83 b0 40 00  Error: UNC at LBA = 0x1b74083b0 = 7369425840

  Commands leading to the command that caused the error were:
  CR FEATR COUNT  LBA_48  LH LM LL DV DC  Powered_Up_Time  Command/Feature_Name
  -- == -- == -- == == == -- -- -- -- --  ---------------  --------------------
  60 08 00 00 08 00 01 b7 40 84 40 40 00     00:20:12.085  READ FPDMA QUEUED
  60 08 00 00 00 00 01 b7 40 7c 40 40 00     00:20:12.084  READ FPDMA QUEUED
  60 08 00 00 08 00 01 b7 40 74 40 40 00     00:20:12.057  READ FPDMA QUEUED
  60 08 00 00 00 00 01 b7 40 6c 40 40 00     00:20:12.057  READ FPDMA QUEUED
  60 08 00 00 08 00 01 b7 40 64 40 40 00     00:20:12.029  READ FPDMA QUEUED

SMART Extended Self-test Log Version: 1 (1 sectors)
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Short offline       Completed without error       00%     21274         -

SMART Selective self-test log data structure revision number 1
 SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
    1        0        0  Not_testing
    2        0        0  Not_testing
    3        0        0  Not_testing
    4        0        0  Not_testing
    5        0        0  Not_testing
Selective self-test flags (0x0):
  After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.

SCT Status Version:                  3
SCT Version (vendor specific):       258 (0x0102)
Device State:                        Active (0)
Current Temperature:                    42 Celsius
Power Cycle Min/Max Temperature:     38/42 Celsius
Lifetime    Min/Max Temperature:      3/49 Celsius
Under/Over Temperature Limit Count:   0/0
Vendor specific:
01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00

SCT Temperature History Version:     2
Temperature Sampling Period:         1 minute
Temperature Logging Interval:        1 minute
Min/Max recommended Temperature:      0/60 Celsius
Min/Max Temperature Limit:           -41/85 Celsius
Temperature History Size (Index):    478 (278)

Index    Estimated Time   Temperature Celsius
 279    2025-04-22 01:00    42  ***********************
 ...    ..(114 skipped).    ..  ***********************
 394    2025-04-22 02:55    42  ***********************
 395    2025-04-22 02:56    41  **********************
 ...    ..(  8 skipped).    ..  **********************
 404    2025-04-22 03:05    41  **********************
 405    2025-04-22 03:06    42  ***********************
 ...    ..(350 skipped).    ..  ***********************
 278    2025-04-22 08:57    42  ***********************

SCT Error Recovery Control:
           Read:      1 (0.1 seconds)
          Write:      1 (0.1 seconds)

Device Statistics (GP/SMART Log 0x04) not supported

Pending Defects log (GP Log 0x0c) not supported

SATA Phy Event Counters (GP Log 0x11)
ID      Size     Value  Description
0x0001  2            0  Command failed due to ICRC error
0x0002  2            0  R_ERR response for data FIS
0x0003  2            0  R_ERR response for device-to-host data FIS
0x0004  2            0  R_ERR response for host-to-device data FIS
0x0005  2            0  R_ERR response for non-data FIS
0x0006  2            0  R_ERR response for device-to-host non-data FIS
0x0007  2            0  R_ERR response for host-to-device non-data FIS
0x0008  2            0  Device-to-host non-data FIS retries
0x0009  2            0  Transition from drive PhyRdy to drive PhyNRdy
0x000a  2            1  Device-to-host register FISes sent due to a COMRESET
0x000b  2            0  CRC errors within host-to-device FIS
0x000f  2            0  R_ERR response for host-to-device data FIS, CRC
0x0012  2            0  R_ERR response for host-to-device non-data FIS, CRC
0x8000  4       228708  Vendor specific

This was from the OS drive error I got yesterday.

Model Family:     Crucial/Micron Client SSDs
Device Model:     CT240BX500SSD1
Serial Number:    2448E9976742
LU WWN Device Id: 5 00a075 1e9976742
Firmware Version: M6CR056
User Capacity:    240,057,409,536 bytes [240 GB]
Sector Size:      512 bytes logical/physical
Rotation Rate:    Solid State Device
Form Factor:      2.5 inches
TRIM Command:     Available
Device is:        In smartctl database 7.3/5671
ATA Version is:   ACS-3 T13/2161-D revision 4
SATA Version is:  SATA 3.3, 6.0 Gb/s (current: 6.0 Gb/s)
Local Time is:    Tue Apr 22 09:01:57 2025 CDT
SMART support is: Available - device has SMART capability.
SMART support is: Enabled
AAM feature is:   Unavailable
APM feature is:   Unavailable
Rd look-ahead is: Enabled
Write cache is:   Enabled
DSN feature is:   Unavailable
ATA Security is:  Disabled, frozen [SEC2]
Wt Cache Reorder: Unavailable

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED
See vendor-specific Attribute list for marginal Attributes.

General SMART Values:
Offline data collection status:  (0x00) Offline data collection activity
                                        was never started.
                                        Auto Offline Data Collection: Disabled.
Self-test execution status:      (   0) The previous self-test routine completed
                                        without error or no self-test has ever 
                                        been run.
Total time to complete Offline 
data collection:                (  120) seconds.
Offline data collection
capabilities:                    (0x11) SMART execute Offline immediate.
                                        No Auto Offline data collection support.
                                        Suspend Offline collection upon new
                                        command.
                                        No Offline surface scan supported.
                                        Self-test supported.
                                        No Conveyance Self-test supported.
                                        No Selective Self-test supported.
SMART capabilities:            (0x0002) Does not save SMART data before
                                        entering power-saving mode.
                                        Supports SMART auto save timer.
Error logging capability:        (0x01) Error logging supported.
                                        General Purpose Logging supported.
Short self-test routine 
recommended polling time:        (   2) minutes.
Extended self-test routine
recommended polling time:        (  10) minutes.

SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAGS    VALUE WORST THRESH FAIL RAW_VALUE
  1 Raw_Read_Error_Rate     POSR-K   100   100   000    -    0
  5 Reallocate_NAND_Blk_Cnt -O--CK   000   000   010    NOW  0
  9 Power_On_Hours          -O--CK   100   100   000    -    233
 12 Power_Cycle_Count       -O--CK   100   100   000    -    22
171 Program_Fail_Count      -O--CK   100   100   000    -    0
172 Erase_Fail_Count        -O--CK   100   100   000    -    0
173 Ave_Block-Erase_Count   -O--CK   100   100   000    -    1
174 Unexpect_Power_Loss_Ct  -O--CK   100   100   000    -    6
180 Unused_Reserve_NAND_Blk PO--CK   100   100   000    -    0
183 SATA_Interfac_Downshift -O--CK   100   100   000    -    0
184 Error_Correction_Count  -O--CK   100   100   000    -    0
187 Reported_Uncorrect      -O--CK   100   100   000    -    0
194 Temperature_Celsius     -O---K   074   070   000    -    26 (Min/Max 23/30)
196 Reallocated_Event_Count -O--CK   100   100   000    -    0
197 Current_Pending_ECC_Cnt -O--CK   100   100   000    -    0
198 Offline_Uncorrectable   ----CK   100   100   000    -    0
199 UDMA_CRC_Error_Count    -O--CK   100   100   000    -    0
202 Percent_Lifetime_Remain ----CK   100   100   001    -    0
206 Write_Error_Rate        -OSR--   100   100   000    -    0
210 Success_RAIN_Recov_Cnt  -O--CK   100   100   000    -    0
246 Total_LBAs_Written      -O--CK   100   100   000    -    27474878
247 Host_Program_Page_Count -O--CK   100   100   000    -    858589
248 FTL_Program_Page_Count  -O--CK   100   100   000    -    0
249 Unkn_CrucialMicron_Attr -O--CK   100   100   000    -    0
250 Read_Error_Retry_Rate   -O--CK   100   100   000    -    0
251 Unkn_CrucialMicron_Attr -O--CK   100   100   000    -    10629649
252 Unkn_CrucialMicron_Attr -O--CK   100   100   000    -    0
253 Unkn_CrucialMicron_Attr -O--CK   100   100   000    -    0
254 Unkn_CrucialMicron_Attr -O--CK   100   100   000    -    0
223 Unkn_CrucialMicron_Attr -O--CK   100   100   000    -    0
                            ||||||_ K auto-keep
                            |||||__ C event count
                            ||||___ R error rate
                            |||____ S speed/performance
                            ||_____ O updated online
                            |______ P prefailure warning

General Purpose Log Directory Version 1
SMART           Log Directory Version 1 [multi-sector log support]
Address    Access  R/W   Size  Description
0x00       GPL,SL  R/O      1  Log Directory
0x06           SL  R/O      1  SMART self-test log
0x07       GPL     R/O      1  Extended self-test log
0x10       GPL     R/O      1  NCQ Command Error log
0x11       GPL     R/O      1  SATA Phy Event Counters log
0x24       GPL     R/O     88  Current Device Internal Status Data log
0x30       GPL,SL  R/O      9  IDENTIFY DEVICE data log
0x80-0x9f  GPL,SL  R/W     16  Host vendor specific log

SMART Extended Comprehensive Error Log (GP Log 0x03) not supported

SMART Error Log not supported

SMART Extended Self-test Log Version: 1 (1 sectors)
No self-tests have been logged.  [To run self-tests, use: smartctl -t]

Selective Self-tests/Logging not supported

SCT Commands not supported

Device Statistics (GP/SMART Log 0x04) not supported

Pending Defects log (GP Log 0x0c) not supported

SATA Phy Event Counters (GP Log 0x11)
ID      Size     Value  Description
0x0001  4            0  Command failed due to ICRC error
0x0002  4            0  R_ERR response for data FIS
0x0005  4            0  R_ERR response for non-data FIS
0x000a  4            2  Device-to-host register FISes sent due to a COMRESET

And this last one is from one of the 4 TB SSDs in the large array, which had this error this morning.

Device Model:     SPCC Solid State Disk
Serial Number:    230119975171001
LU WWN Device Id: 0 000000 000000000
Firmware Version: VE0R6365
User Capacity:    4,000,787,030,016 bytes [4.00 TB]
Sector Size:      512 bytes logical/physical
Rotation Rate:    Solid State Device
Form Factor:      2.5 inches
TRIM Command:     Available, deterministic
Device is:        Not in smartctl database 7.3/5671
ATA Version is:   ACS-3, ATA8-ACS T13/1699-D revision 6
SATA Version is:  SATA 3.0, 6.0 Gb/s (current: 6.0 Gb/s)
Local Time is:    Tue Apr 22 09:10:05 2025 CDT
SMART support is: Available - device has SMART capability.
SMART support is: Enabled
AAM feature is:   Unavailable
APM feature is:   Unavailable
Rd look-ahead is: Enabled
Write cache is:   Enabled
DSN feature is:   Unavailable
ATA Security is:  Disabled, NOT FROZEN [SEC1]
Wt Cache Reorder: Unavailable

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x00) Offline data collection activity
                                        was never started.
                                        Auto Offline Data Collection: Disabled.
Self-test execution status:      (   0) The previous self-test routine completed
                                        without error or no self-test has ever 
                                        been run.
Total time to complete Offline 
data collection:                (    1) seconds.
Offline data collection
capabilities:                    (0x59) SMART execute Offline immediate.
                                        No Auto Offline data collection support.
                                        Suspend Offline collection upon new
                                        command.
                                        Offline surface scan supported.
                                        Self-test supported.
                                        No Conveyance Self-test supported.
                                        Selective Self-test supported.
SMART capabilities:            (0x0002) Does not save SMART data before
                                        entering power-saving mode.
                                        Supports SMART auto save timer.
Error logging capability:        (0x01) Error logging supported.
                                        General Purpose Logging supported.
Short self-test routine 
recommended polling time:        (   1) minutes.
Extended self-test routine
recommended polling time:        (   2) minutes.

SMART Attributes Data Structure revision number: 10
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAGS    VALUE WORST THRESH FAIL RAW_VALUE
  1 Raw_Read_Error_Rate     POSR-K   100   100   050    -    0
  5 Reallocated_Sector_Ct   PO--CK   100   100   010    -    0
  9 Power_On_Hours          -O--CK   100   100   000    -    6924
 12 Power_Cycle_Count       -O--CK   100   100   000    -    55
161 Unknown_Attribute       -O--CK   100   100   050    -    0
162 Unknown_Attribute       -O--CK   100   100   000    -    168
163 Unknown_Attribute       -O--CK   100   100   000    -    3000
164 Unknown_Attribute       -O--CK   100   100   000    -    0
166 Unknown_Attribute       -O--CK   100   100   000    -    205
167 Unknown_Attribute       -O--CK   100   100   000    -    0
168 Unknown_Attribute       -O--CK   100   100   000    -    0
169 Unknown_Attribute       -O--CK   100   100   000    -    100
171 Unknown_Attribute       -O--CK   100   100   000    -    0
172 Unknown_Attribute       -O--CK   100   100   000    -    0
174 Unknown_Attribute       -O--CK   100   100   000    -    15
175 Program_Fail_Count_Chip -O--CK   100   100   000    -    0
181 Program_Fail_Cnt_Total  -O---K   100   100   000    -    21048
187 Reported_Uncorrect      -O--CK   100   100   000    -    0
194 Temperature_Celsius     -O---K   100   100   000    -    22
195 Hardware_ECC_Recovered  -O-RCK   100   100   000    -    0
196 Reallocated_Event_Count -O--CK   100   100   000    -    0
199 UDMA_CRC_Error_Count    -O--CK   100   100   000    -    0
206 Unknown_SSD_Attribute   -O--CK   100   100   000    -    0
207 Unknown_SSD_Attribute   -O--CK   100   100   000    -    1
232 Available_Reservd_Space -O--CK   100   100   000    -    100
241 Total_LBAs_Written      -O--CK   100   100   000    -    1301
242 Total_LBAs_Read         -O--CK   100   100   000    -    277
249 Unknown_Attribute       -O--CK   100   100   000    -    435
250 Read_Error_Retry_Rate   -O--CK   100   100   000    -    1306
                            ||||||_ K auto-keep
                            |||||__ C event count
                            ||||___ R error rate
                            |||____ S speed/performance
                            ||_____ O updated online
                            |______ P prefailure warning

General Purpose Log Directory Version 1
SMART           Log Directory Version 1 [multi-sector log support]
Address    Access  R/W   Size  Description
0x00       GPL,SL  R/O      1  Log Directory
0x01           SL  R/O      1  Summary SMART error log
0x02           SL  R/O      1  Comprehensive SMART error log
0x03       GPL     R/O      2  Ext. Comprehensive SMART error log
0x04       GPL,SL  R/O      5  Device Statistics log
0x06           SL  R/O      1  SMART self-test log
0x07       GPL     R/O      1  Extended self-test log
0x09           SL  R/W      1  Selective self-test log
0x10       GPL     R/O      1  NCQ Command Error log
0x11       GPL     R/O      1  SATA Phy Event Counters log
0x30       GPL,SL  R/O      8  IDENTIFY DEVICE data log
0x80-0x9f  GPL,SL  R/W     16  Host vendor specific log
0xe0       GPL,SL  R/W      1  SCT Command/Status
0xe1       GPL,SL  R/W      1  SCT Data Transfer

SMART Extended Comprehensive Error Log Version: 1 (2 sectors)
No Errors Logged

SMART Extended Self-test Log Version: 1 (1 sectors)
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Short offline       Completed without error       00%      6855         -

SMART Selective self-test log data structure revision number 0
Note: revision number not 1 implies that no selective self-test has ever been run
 SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
    1        0        0  Not_testing
    2        0        0  Not_testing
    3        0        0  Not_testing
    4        0        0  Not_testing
    5        0        0  Not_testing
Selective self-test flags (0x0):
  After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.

SCT Commands not supported

Device Statistics (GP Log 0x04)
Page  Offset Size        Value Flags Description
0x01  =====  =               =  ===  == General Statistics (rev 1) ==
0x01  0x008  4              55  ---  Lifetime Power-On Resets
0x01  0x010  4            6924  ---  Power-on Hours
0x01  0x018  6      2730093018  ---  Logical Sectors Written
0x01  0x020  6        10980975  ---  Number of Write Commands
0x01  0x028  6       582018605  ---  Logical Sectors Read
0x01  0x030  6         3819146  ---  Number of Read Commands
0x01  0x038  6     24700151996  ---  Date and Time TimeStamp
0x04  =====  =               =  ===  == General Errors Statistics (rev 1) ==
0x04  0x008  4               0  ---  Number of Reported Uncorrectable Errors
0x04  0x010  4               0  ---  Resets Between Cmd Acceptance and Completion
0x06  =====  =               =  ===  == Transport Statistics (rev 1) ==
0x06  0x008  4              66  ---  Number of Hardware Resets
0x06  0x018  4               0  ---  Number of Interface CRC Errors
0x07  =====  =               =  ===  == Solid State Device Statistics (rev 1) ==
0x07  0x008  1               0  ---  Percentage Used Endurance Indicator
                                |||_ C monitored condition met
                                ||__ D supports DSN
                                |___ N normalized value

Pending Defects log (GP Log 0x0c) not supported

SATA Phy Event Counters (GP Log 0x11)
ID      Size     Value  Description
0x0001  4            0  Command failed due to ICRC error
0x000a  4            1  Device-to-host register FISes sent due to a COMRESET

You should be doing regular SMART long tests (say monthly) and even more frequent SMART short tests (say weekly).
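
As a rough illustration of that cadence, here is a small Python sketch. The schedule it encodes (long test on the 1st of the month, short test every Sunday) is just one assumed way to get "monthly" and "weekly", the function names are hypothetical, and /dev/sdb is a placeholder device. The only smartctl options it relies on are -t short and -t long.

```python
from datetime import date


def due_test(today):
    """Pick the SMART self-test to start today, or None.

    Assumed schedule (adjust to taste): long test on the 1st of the
    month, short test every Sunday, nothing otherwise.
    """
    if today.day == 1:
        return "long"
    if today.weekday() == 6:  # Monday is 0, Sunday is 6
        return "short"
    return None


def smartctl_command(test_type, device):
    """Build the argument list that starts a self-test on one device."""
    if test_type not in ("short", "long"):
        raise ValueError(f"unsupported test type: {test_type}")
    return ["smartctl", "-t", test_type, device]


if __name__ == "__main__":
    kind = due_test(date.today())
    if kind is not None:
        # /dev/sdb is a placeholder device name
        print(" ".join(smartctl_command(kind, "/dev/sdb")))
```

The sketch only prints the command it would run; with this many drives you would loop over the device list and stagger the long tests rather than start them all at once.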

Once you have resolved this you should think about implementing @joeschmuck’s Multi-Report script.

Nothing wrong with Purple.

How are temperatures on the HBA and expander? Do these have direct cooling?

I have had colleagues flip me shit for using CCTV drives in my NASes. From my experience over the years implementing large CCTV systems, I have come to respect the Purple drives over almost any other spinning platters…

I keep the cases very tidy, and all the fans are upgraded to 3500 rpm PWM fans, so they can move an impressive amount of air. I have taken thermal shots of them at load. The hottest I have seen them is about 49 °C (120 °F).

This is the controller.



Also one of the storage arrays.



I thought it might be thermals or something with the SAS until I started getting the same errors on drives connected directly to the MB.

The Crucial SSD has failed; replace it immediately. RMA it.

As said, Purple drives are fine; I used one as a data drive for a while myself. This particular drive is failing, though. The raw value for ID 1 should be zero (yours is 2). Keep an eye on it; it can return to zero over time.
Run a SMART long test on this drive now. Hopefully it will pass.

The pending sector errors can go away but typically ID 5 will increment.

That should give you enough fuel to replace 2 drives.

That Crucial SSD is an odd one. The ID 5 THRESH is 010 and the reported VALUE is 000, which is well below 010. It also tells you NOW!
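
For anyone wondering where the NOW comes from: as far as I understand it, the FAIL column is driven by comparing the normalized VALUE (and WORST) against THRESH, not by the raw value. A rough Python sketch of that rule follows. The function name fail_column is my own, and the exact comparison is an assumption based on how smartctl appears to behave, not a copy of its source.

```python
def fail_column(value, worst, thresh):
    """Approximate the FAIL column of the smartctl attribute table.

    Assumed rule: a threshold of 0 can never trip; otherwise the
    attribute is failing now if VALUE <= THRESH, and failed in the
    past if only WORST <= THRESH.
    """
    if thresh == 0:
        return "-"
    if value <= thresh:
        return "NOW"
    if worst <= thresh:
        return "Past"
    return "-"


if __name__ == "__main__":
    # ID 5 on the Crucial SSD: VALUE 000, WORST 000, THRESH 010
    print(fail_column(0, 0, 10))
    # ID 5 on the WD Purple: VALUE 200, WORST 200, THRESH 140
    print(fail_column(200, 200, 140))
```

This is why the Crucial trips the flag even though its raw count for ID 5 is 0: the normalized value is what gets compared.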

Good luck

Interesting… That Crucial drive is brand new. I will just send it back, since I have only had it for about 7 days.

As for the Purple, I will run a long test on it tonight and get back to you. It is an older drive, so I can swap it if needed.

What are your thoughts on the other one? I also get a few of the errors on that one, and I don’t see any issues with ID 1 or 5 on it.

I do not see anything wrong with the SPCC drive. Disclaimer: I’m on my cellphone and without glasses. I can take another look at it after I get home. Run a long test on it as well.

I have a Crucial SSD as my laptop SSD, and a 2nd SSD from a different manufacturer as a data SSD - my laptop is a normal windows 10 laptop aside from having more than one disk.

This CT500MX500SSD1 500GB SSD is wearing at >3x the rate it should do according to the TBW specs. Here are the relevant SMART attributes:

No.  Attribute                              Thresh.  Value  Worst  Status               Data
  1  Read Error Rate                              0    100    100  OK (Always passing)  0
  5  Reallocated Sectors Count                   10    100    100  OK                   0
  9  Power On Time Count                          0    100    100  OK (Always passing)  10345
 12  Power Cycle Count                            0    100    100  OK (Always passing)  76
171  Program Fail Count                           0    100    100  OK (Always passing)  0
172  Erase Fail Count                             0    100    100  OK (Always passing)  0
173  Wear Leveling Count                          0     56     56  OK (Always passing)  449
174  Unexpected Power Loss Count                  0    100    100  OK (Always passing)  7
180  Unused Reserve (Spare) NAND Blocks           0      0      0  OK (Always passing)  39
183  SATA Interface Downshift                     0    100    100  OK (Always passing)  0
184  Error Correction Count                       0    100    100  OK (Always passing)  0
187  Uncorrectable Error Count                    0    100    100  OK (Always passing)  0
194  Controlled Temperature                       0     56     33  OK (Always passing)  57; 44
196  Reallocation Event Count                     0    100    100  OK (Always passing)  0
197  Current Pending Sector Count                 0    100    100  OK (Always passing)  0
198  Uncorrectable Error Count Off-line           0    100    100  OK (Always passing)  0
199  UltraDMA CRC Error Count                     0    100    100  OK (Always passing)  0
202  Percentage Of The Rated Lifetime Used        1     56     56  OK                   44
206  Write Error Rate                             0    100    100  OK (Always passing)  0
210  Successful RAIN Recovery Count               0    100    100  OK (Always passing)  0
246  Total Host Sector Writes                     0    100    100  OK (Always passing)  62127940186
247  Host Program Sectors Count                   0    100    100  OK (Always passing)  1079173953
248  FTL Program Page Count                       0    100    100  OK (Always passing)  794883098

The key figures are:

  • 10,345 hours which is just over 14 months from installation.
  • 56% wear
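  • 62,127,940,186 sector writes (of 512 B each), so the writes to date are c. 28.93 TiB (about 31.81 TB in decimal units).
  • The stated TBW for this drive is 180 TB, so that amount of writing should mean wear of c. 16.07% (or c. 17.7% if the 180 TB rating is in decimal terabytes), not 56%.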
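
The arithmetic in those bullets is easy to re-run. A short Python sketch, with one caveat: the 28.93 figure is what you get when dividing the byte count by 2^40 (TiB), while TBW ratings are, as far as I know, normally quoted in decimal terabytes. The function name wear_check is made up for illustration; the inputs are the numbers from the SMART table.

```python
def wear_check(sectors_written, rated_tbw_tb, sector_bytes=512):
    """Compare host writes against a TBW rating, in both unit conventions."""
    bytes_written = sectors_written * sector_bytes
    written_tib = bytes_written / 2**40   # binary tebibytes
    written_tb = bytes_written / 10**12   # decimal terabytes
    return {
        "written_tib": written_tib,
        "written_tb": written_tb,
        # wear if the TiB figure is divided by the rating as-is
        "expected_pct_from_tib": written_tib / rated_tbw_tb * 100,
        # wear if the rating is taken as decimal TB
        "expected_pct_decimal": written_tb / rated_tbw_tb * 100,
    }


if __name__ == "__main__":
    result = wear_check(62_127_940_186, 180)
    for key, value in result.items():
        print(f"{key}: {value:.2f}")
```

Either way the expected wear comes out between roughly 16% and 18%, so the reported 56% is still more than three times what the host writes alone would suggest.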

Please bear in mind that these stats come from the Crucial firmware and the Crucial Storage Executive software so they are entirely Crucial’s statistics.

I had a long chat with tech support (TS) demanding a replacement. They never explained how my figures were wrong, but instead accused me of running a crypto miner (which I am not!! On a laptop? How many centuries would it take to mine one bitcoin on a 10-year-old laptop?).

My recommendation is to avoid Crucial if at all possible - they lie about their specification (TBW) and blame the customer when called out about it.

Yeah, that’s sad. Does it say somewhere when you buy an SSD that it CAN’T be used in a miner… lol.

I will probably just grab a couple of Samsungs for the OS.

Now home, have 3D printer running a 13 hour print job (using ABS so very slow) and after that one, I have the other half to do tomorrow.

I examined the SPCC data and it all looks fine, except for the lack of SMART long tests. Hint hint.

Damn it, I just bought 2 Crucial SSDs, 1 NVMe and 1 SATA, a few days ago. :frowning:

At least where I live the warranty is between myself and the brick-and-mortar store that I purchased it from, so getting a replacement is not hard.