Help understanding "Errors" and "Degraded" on a pool

I am running TrueNAS Scale 24.04.2.5. I have two zpools: a 2-vdev raidz2 of 12 spinning drives, and a newer pool of six 2TB SSDs intended to run some VMs. These were not the finest SSDs, but my environment is not so critical that I needed expensive drives; I got mid-tier Amazon deals, and I understood the risks. This was fine for about a month, but then the SSDs began showing errors (initially within fault tolerance). I replaced one, but then errors started showing on the other drives, and as a precaution I moved all storage off that pool. This was prudent, as the pool failed catastrophically very soon afterward, and the data would have been lost.

I completed a single drive replacement and put the removed drive into an external enclosure for analysis… and the drive tested fine. No SMART errors; I wiped it, and it’s seemingly good as new. Meanwhile, the rest of the pool looks like this:

And the only fine drive is the one I replaced. Looking at the drive with the jillion errors, I ran a SMART test and got the following results:

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x02) Offline data collection activity
                                        was completed without error.
                                        Auto Offline Data Collection: Disabled.
Self-test execution status:      (   0) The previous self-test routine completed
                                        without error or no self-test has ever 
                                        been run.
Total time to complete Offline 
data collection:                (   33) seconds.
Offline data collection
capabilities:                    (0x7b) SMART execute Offline immediate.
                                        Auto Offline data collection on/off support.
                                        Suspend Offline collection upon new
                                        command.
                                        Offline surface scan supported.
                                        Self-test supported.
                                        Conveyance Self-test supported.
                                        Selective Self-test supported.
SMART capabilities:            (0x0003) Saves SMART data before entering
                                        power-saving mode.
                                        Supports SMART auto save timer.
Error logging capability:        (0x01) Error logging supported.
                                        General Purpose Logging supported.
Short self-test routine 
recommended polling time:        (   2) minutes.
Extended self-test routine
recommended polling time:        (  85) minutes.
Conveyance self-test routine
recommended polling time:        (   2) minutes.
SCT capabilities:              (0x0031) SCT Status supported.
                                        SCT Feature Control supported.
                                        SCT Data Table supported.

SMART Attributes Data Structure revision number: 20
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  5 Reallocated_Sector_Ct   0x0013   100   100   050    Pre-fail  Always       -       0
  9 Power_On_Hours          0x0012   100   100   000    Old_age   Always       -       1157
 12 Power_Cycle_Count       0x0012   100   100   000    Old_age   Always       -       6
167 Unknown_Attribute       0x0022   100   100   000    Old_age   Always       -       0
168 Unknown_Attribute       0x0012   100   100   000    Old_age   Always       -       0
169 Unknown_Attribute       0x0013   100   100   010    Pre-fail  Always       -       0
171 Unknown_Attribute       0x0032   100   100   000    Old_age   Always       -       0
172 Unknown_Attribute       0x0032   100   100   000    Old_age   Always       -       0
173 Unknown_Attribute       0x0012   200   200   000    Old_age   Always       -       21479555094
174 Unknown_Attribute       0x0022   100   100   000    Old_age   Always       -       15
175 Program_Fail_Count_Chip 0x0022   100   100   010    Old_age   Always       -       0
180 Unused_Rsvd_Blk_Cnt_Tot 0x0033   092   092   000    Pre-fail  Always       -       405
187 Reported_Uncorrect      0x0032   092   000   000    Old_age   Always       -       4
192 Power-Off_Retract_Count 0x0012   100   100   000    Old_age   Always       -       5
194 Temperature_Celsius     0x0022   025   025   000    Old_age   Always       -       25 (Min/Max 24/33)
206 Unknown_SSD_Attribute   0x0032   200   200   000    Old_age   Always       -       5
207 Unknown_SSD_Attribute   0x0032   200   200   000    Old_age   Always       -       72
208 Unknown_SSD_Attribute   0x0032   200   200   000    Old_age   Always       -       22
209 Unknown_SSD_Attribute   0x0032   200   200   000    Old_age   Always       -       1
210 Unknown_Attribute       0x0032   200   200   000    Old_age   Always       -       1951
211 Unknown_Attribute       0x0032   200   200   000    Old_age   Always       -       999
231 Unknown_SSD_Attribute   0x0023   093   093   005    Pre-fail  Always       -       1794
241 Total_LBAs_Written      0x0032   100   100   000    Old_age   Always       -       12590
242 Total_LBAs_Read         0x0032   100   100   000    Old_age   Always       -       1295
243 Unknown_Attribute       0x0032   050   050   000    Old_age   Always       -       327685
245 Unknown_Attribute       0x0032   100   100   000    Old_age   Always       -       25

SMART Error Log Version: 1
No Errors Logged

SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Extended offline    Completed without error       00%      1154         -
# 2  Short offline       Completed without error       00%      1092         -
# 3  Extended offline    Completed without error       00%       900         -
# 4  Short offline       Completed without error       00%       803         -
# 5  Short offline       Completed without error       00%       707         -
# 6  Short offline       Completed without error       00%       611         -
# 7  Extended offline    Completed without error       00%       516         -
# 8  Short offline       Completed without error       00%       419         -
# 9  Short offline       Completed without error       00%       251         -
#10  Extended offline    Completed without error       00%       156         -
#11  Short offline       Completed without error       00%        59         -

SMART Selective self-test log data structure revision number 1
 SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
    1        0        0  Not_testing
    2        0        0  Not_testing
    3        0        0  Not_testing
    4        0        0  Not_testing
    5        0        0  Not_testing
  128        0    65535  Read_scanning was completed without error
Selective self-test flags (0x0):
  After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.

The above only provides legacy SMART information - try 'smartctl -x' for more

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x02) Offline data collection activity
                                        was completed without error.
                                        Auto Offline Data Collection: Disabled.
Self-test execution status:      (   0) The previous self-test routine completed
                                        without error or no self-test has ever 
                                        been run.
Total time to complete Offline 
data collection:                (   33) seconds.
Offline data collection
capabilities:                    (0x7b) SMART execute Offline immediate.
                                        Auto Offline data collection on/off support.
                                        Suspend Offline collection upon new
                                        command.
                                        Offline surface scan supported.
                                        Self-test supported.
                                        Conveyance Self-test supported.
                                        Selective Self-test supported.
SMART capabilities:            (0x0003) Saves SMART data before entering
                                        power-saving mode.
                                        Supports SMART auto save timer.
Error logging capability:        (0x01) Error logging supported.
                                        General Purpose Logging supported.
Short self-test routine 
recommended polling time:        (   2) minutes.
Extended self-test routine
recommended polling time:        (  85) minutes.
Conveyance self-test routine
recommended polling time:        (   2) minutes.
SCT capabilities:              (0x0031) SCT Status supported.
                                        SCT Feature Control supported.
                                        SCT Data Table supported.

SMART Attributes Data Structure revision number: 20
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAGS    VALUE WORST THRESH FAIL RAW_VALUE
  5 Reallocated_Sector_Ct   PO--C-   100   100   050    -    0
  9 Power_On_Hours          -O--C-   100   100   000    -    1157
 12 Power_Cycle_Count       -O--C-   100   100   000    -    6
167 Unknown_Attribute       -O---K   100   100   000    -    0
168 Unknown_Attribute       -O--C-   100   100   000    -    0
169 Unknown_Attribute       PO--C-   100   100   010    -    0
171 Unknown_Attribute       -O--CK   100   100   000    -    0
172 Unknown_Attribute       -O--CK   100   100   000    -    0
173 Unknown_Attribute       -O--C-   200   200   000    -    21479555094
174 Unknown_Attribute       -O---K   100   100   000    -    15
175 Program_Fail_Count_Chip -O---K   100   100   010    -    0
180 Unused_Rsvd_Blk_Cnt_Tot PO--CK   092   092   000    -    405
187 Reported_Uncorrect      -O--CK   092   000   000    -    4
192 Power-Off_Retract_Count -O--C-   100   100   000    -    5
194 Temperature_Celsius     -O---K   025   025   000    -    25 (Min/Max 24/33)
206 Unknown_SSD_Attribute   -O--CK   200   200   000    -    5
207 Unknown_SSD_Attribute   -O--CK   200   200   000    -    72
208 Unknown_SSD_Attribute   -O--CK   200   200   000    -    22
209 Unknown_SSD_Attribute   -O--CK   200   200   000    -    1
210 Unknown_Attribute       -O--CK   200   200   000    -    1951
211 Unknown_Attribute       -O--CK   200   200   000    -    999
231 Unknown_SSD_Attribute   PO---K   093   093   005    -    1794
241 Total_LBAs_Written      -O--CK   100   100   000    -    12590
242 Total_LBAs_Read         -O--CK   100   100   000    -    1295
243 Unknown_Attribute       -O--CK   050   050   000    -    327685
245 Unknown_Attribute       -O--CK   100   100   000    -    25
                            ||||||_ K auto-keep
                            |||||__ C event count
                            ||||___ R error rate
                            |||____ S speed/performance
                            ||_____ O updated online
                            |______ P prefailure warning

General Purpose Log Directory Version 1
SMART           Log Directory Version 1 [multi-sector log support]
Address    Access  R/W   Size  Description
0x00       GPL,SL  R/O      1  Log Directory
0x01           SL  R/O      1  Summary SMART error log
0x02           SL  R/O     51  Comprehensive SMART error log
0x03       GPL     R/O     64  Ext. Comprehensive SMART error log
0x04       GPL,SL  R/O      8  Device Statistics log
0x06           SL  R/O      1  SMART self-test log
0x07       GPL     R/O      1  Extended self-test log
0x09           SL  R/W      1  Selective self-test log
0x10       GPL     R/O      1  NCQ Command Error log
0x11       GPL     R/O      1  SATA Phy Event Counters log
0x30       GPL,SL  R/O      9  IDENTIFY DEVICE data log
0x80-0x9f  GPL,SL  R/W     16  Host vendor specific log
0xe0       GPL,SL  R/W      1  SCT Command/Status
0xe1       GPL,SL  R/W      1  SCT Data Transfer

SMART Extended Comprehensive Error Log Version: 1 (64 sectors)
Device Error Count: 4
        CR     = Command Register
        FEATR  = Features Register
        COUNT  = Count (was: Sector Count) Register
        LBA_48 = Upper bytes of LBA High/Mid/Low Registers ]  ATA-8
        LH     = LBA High (was: Cylinder High) Register    ]   LBA
        LM     = LBA Mid (was: Cylinder Low) Register      ] Register
        LL     = LBA Low (was: Sector Number) Register     ]
        DV     = Device (was: Device/Head) Register
        DC     = Device Control Register
        ER     = Error register
        ST     = Status register
Powered_Up_Time is measured from power on, and printed as
DDd+hh:mm:SS.sss where DD=days, hh=hours, mm=minutes,
SS=sec, and sss=millisec. It "wraps" after 49.710 days.

Error 4 [3] occurred at disk power-on lifetime: 1024 hours (42 days + 16 hours)
  When the command that caused the error occurred, the device was in an unknown state.

  After command completion occurred, registers were:
  ER -- ST COUNT  LBA_48  LH LM LL DV DC
  -- -- -- == -- == == == -- -- -- -- --
  40 -- 41 00 28 00 00 ee 77 4e 10 40 00  Error: 

  Commands leading to the command that caused the error were:
  CR FEATR COUNT  LBA_48  LH LM LL DV DC  Powered_Up_Time  Command/Feature_Name
  -- == -- == -- == == == -- -- -- -- --  ---------------  --------------------
  00 00 00 00 00 00 00 02 67 38 08 40 00     00:05:59.500  NOP [Abort queued commands]
  2f 00 00 00 01 00 00 00 00 00 10 00 80     00:05:56.600  READ LOG EXT
  00 00 00 00 00 00 00 02 67 37 c0 40 01     00:05:56.600  NOP [Abort queued commands]
  ea 00 00 00 00 00 00 00 00 00 00 00 80     00:05:50.000  FLUSH CACHE EXT
  ea 00 00 00 00 00 00 00 00 00 00 00 80     00:05:50.000  FLUSH CACHE EXT

Error 3 [2] occurred at disk power-on lifetime: 1024 hours (42 days + 16 hours)
  When the command that caused the error occurred, the device was in an unknown state.

  After command completion occurred, registers were:
  ER -- ST COUNT  LBA_48  LH LM LL DV DC
  -- -- -- == -- == == == -- -- -- -- --
  40 -- 41 00 10 00 00 02 67 39 00 40 00  Error: 

  Commands leading to the command that caused the error were:
  CR FEATR COUNT  LBA_48  LH LM LL DV DC  Powered_Up_Time  Command/Feature_Name
  -- == -- == -- == == == -- -- -- -- --  ---------------  --------------------
  00 00 00 00 00 00 00 02 67 37 c0 40 01     00:05:56.600  NOP [Abort queued commands]
  ea 00 00 00 00 00 00 00 00 00 00 00 80     00:05:50.000  FLUSH CACHE EXT
  ea 00 00 00 00 00 00 00 00 00 00 00 80     00:05:50.000  FLUSH CACHE EXT
  ea 00 00 00 00 00 00 00 00 00 00 00 80     00:05:40.000  FLUSH CACHE EXT
  ea 00 00 00 00 00 00 00 00 00 00 00 80     00:05:40.000  FLUSH CACHE EXT

Error 2 [1] occurred at disk power-on lifetime: 1017 hours (42 days + 9 hours)
  When the command that caused the error occurred, the device was in an unknown state.

  After command completion occurred, registers were:
  ER -- ST COUNT  LBA_48  LH LM LL DV DC
  -- -- -- == -- == == == -- -- -- -- --
  40 -- 41 00 30 00 00 cf 30 ab 90 40 00  Error: 

  Commands leading to the command that caused the error were:
  CR FEATR COUNT  LBA_48  LH LM LL DV DC  Powered_Up_Time  Command/Feature_Name
  -- == -- == -- == == == -- -- -- -- --  ---------------  --------------------
  00 00 00 00 00 00 00 02 67 38 08 40 00 42d+09:01:13.500  NOP [Abort queued commands]
  ea 00 00 00 00 00 00 00 00 00 00 00 80 42d+09:01:10.800  FLUSH CACHE EXT
  2f 00 00 00 01 00 00 00 00 00 10 00 80 42d+09:01:10.800  READ LOG EXT
  60 00 00 00 00 00 00 02 67 37 c0 00 80 42d+09:01:10.800  READ FPDMA QUEUED
  ea 00 00 00 00 00 00 00 00 00 00 00 80 42d+09:01:08.100  FLUSH CACHE EXT

Error 1 [0] occurred at disk power-on lifetime: 1017 hours (42 days + 9 hours)
  When the command that caused the error occurred, the device was in an unknown state.

  After command completion occurred, registers were:
  ER -- ST COUNT  LBA_48  LH LM LL DV DC
  -- -- -- == -- == == == -- -- -- -- --
  40 -- 41 01 00 00 00 02 67 36 d8 40 00  Error: UNC at LBA = 0x026736d8 = 40318680

  Commands leading to the command that caused the error were:
  CR FEATR COUNT  LBA_48  LH LM LL DV DC  Powered_Up_Time  Command/Feature_Name
  -- == -- == -- == == == -- -- -- -- --  ---------------  --------------------
  60 00 00 00 00 00 00 02 67 37 c0 00 80 42d+09:01:10.800  READ FPDMA QUEUED
  ea 00 00 00 00 00 00 00 00 00 00 00 80 42d+09:01:08.100  FLUSH CACHE EXT
  ea 00 00 00 00 00 00 00 00 00 00 00 80 42d+09:01:08.100  FLUSH CACHE EXT
  ea 00 00 00 00 00 00 00 00 00 00 00 80 42d+09:01:08.100  FLUSH CACHE EXT
  ea 00 00 00 00 00 00 00 00 00 00 00 80 42d+09:01:08.100  FLUSH CACHE EXT

SMART Extended Self-test Log Version: 1 (1 sectors)
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Extended offline    Completed without error       00%      1154         -
# 2  Short offline       Completed without error       00%      1092         -
# 3  Extended offline    Completed without error       00%       900         -
# 4  Short offline       Completed without error       00%       803         -
# 5  Short offline       Completed without error       00%       707         -
# 6  Short offline       Completed without error       00%       611         -
# 7  Extended offline    Completed without error       00%       516         -
# 8  Short offline       Completed without error       00%       419         -
# 9  Short offline       Completed without error       00%       251         -
#10  Extended offline    Completed without error       00%       156         -
#11  Short offline       Completed without error       00%        59         -

SMART Selective self-test log data structure revision number 1
 SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
    1        0        0  Not_testing
    2        0        0  Not_testing
    3        0        0  Not_testing
    4        0        0  Not_testing
    5        0        0  Not_testing
  128        0    65535  Read_scanning was completed without error
Selective self-test flags (0x0):
  After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.

SCT Status Version:                  3
SCT Version (vendor specific):       1 (0x0001)
Device State:                        Active (0)
Current Temperature:                    28 Celsius
Power Cycle Min/Max Temperature:      ?/31 Celsius
Lifetime    Min/Max Temperature:      ?/ ? Celsius
Under/Over Temperature Limit Count:   0/0

SCT Temperature History Version:     2
Temperature Sampling Period:         1 minute
Temperature Logging Interval:        1 minute
Min/Max recommended Temperature:     -127/127 Celsius
Min/Max Temperature Limit:           -127/127 Celsius
Temperature History Size (Index):    478 (87)

Index    Estimated Time   Temperature Celsius
  88    2025-03-07 10:10    28  *********
 ...    ..(  5 skipped).    ..  *********
  94    2025-03-07 10:16    28  *********
  95    2025-03-07 10:17    27  ********
  96    2025-03-07 10:18    28  *********
 ...    ..(  3 skipped).    ..  *********
 100    2025-03-07 10:22    28  *********
 101    2025-03-07 10:23    27  ********
 102    2025-03-07 10:24    28  *********
 103    2025-03-07 10:25    27  ********
 104    2025-03-07 10:26    28  *********
 ...    ..( 48 skipped).    ..  *********
 153    2025-03-07 11:15    28  *********
 154    2025-03-07 11:16    27  ********
 155    2025-03-07 11:17    28  *********
 ...    ..( 15 skipped).    ..  *********
 171    2025-03-07 11:33    28  *********
 172    2025-03-07 11:34    27  ********
 173    2025-03-07 11:35    28  *********
 ...    ..(  7 skipped).    ..  *********
 181    2025-03-07 11:43    28  *********
 182    2025-03-07 11:44    27  ********
 183    2025-03-07 11:45    28  *********
 ...    ..( 11 skipped).    ..  *********
 195    2025-03-07 11:57    28  *********
 196    2025-03-07 11:58    27  ********
 197    2025-03-07 11:59    28  *********
 ...    ..(  5 skipped).    ..  *********
 203    2025-03-07 12:05    28  *********
 204    2025-03-07 12:06    27  ********
 205    2025-03-07 12:07    27  ********
 206    2025-03-07 12:08    28  *********
 ...    ..(  6 skipped).    ..  *********
 213    2025-03-07 12:15    28  *********
 214    2025-03-07 12:16    27  ********
 215    2025-03-07 12:17    28  *********
 ...    ..(  4 skipped).    ..  *********
 220    2025-03-07 12:22    28  *********
 221    2025-03-07 12:23    27  ********
 ...    ..(  3 skipped).    ..  ********
 225    2025-03-07 12:27    27  ********
 226    2025-03-07 12:28    28  *********
 ...    ..( 21 skipped).    ..  *********
 248    2025-03-07 12:50    28  *********
 249    2025-03-07 12:51    27  ********
 250    2025-03-07 12:52    28  *********
 251    2025-03-07 12:53    28  *********
 252    2025-03-07 12:54    28  *********
 253    2025-03-07 12:55    27  ********
 254    2025-03-07 12:56    28  *********
 ...    ..( 11 skipped).    ..  *********
 266    2025-03-07 13:08    28  *********
 267    2025-03-07 13:09    27  ********
 268    2025-03-07 13:10    28  *********
 ...    ..(  4 skipped).    ..  *********
 273    2025-03-07 13:15    28  *********
 274    2025-03-07 13:16    27  ********
 275    2025-03-07 13:17    27  ********
 276    2025-03-07 13:18    28  *********
 277    2025-03-07 13:19    27  ********
 278    2025-03-07 13:20    28  *********
 ...    ..(  2 skipped).    ..  *********
 281    2025-03-07 13:23    28  *********
 282    2025-03-07 13:24    27  ********
 283    2025-03-07 13:25    28  *********
 284    2025-03-07 13:26    27  ********
 285    2025-03-07 13:27    27  ********
 286    2025-03-07 13:28    27  ********
 287    2025-03-07 13:29    28  *********
 288    2025-03-07 13:30    28  *********
 289    2025-03-07 13:31    28  *********
 290    2025-03-07 13:32    27  ********
 ...    ..(  2 skipped).    ..  ********
 293    2025-03-07 13:35    27  ********
 294    2025-03-07 13:36    28  *********
 ...    ..(  2 skipped).    ..  *********
 297    2025-03-07 13:39    28  *********
 298    2025-03-07 13:40    27  ********
 ...    ..(  2 skipped).    ..  ********
 301    2025-03-07 13:43    27  ********
 302    2025-03-07 13:44    28  *********
 ...    ..(  4 skipped).    ..  *********
 307    2025-03-07 13:49    28  *********
 308    2025-03-07 13:50    27  ********
 309    2025-03-07 13:51    27  ********
 310    2025-03-07 13:52    28  *********
 ...    ..(  5 skipped).    ..  *********
 316    2025-03-07 13:58    28  *********
 317    2025-03-07 13:59    27  ********
 ...    ..(  3 skipped).    ..  ********
 321    2025-03-07 14:03    27  ********
 322    2025-03-07 14:04    28  *********
 323    2025-03-07 14:05    28  *********
 324    2025-03-07 14:06    28  *********
 325    2025-03-07 14:07    27  ********
 326    2025-03-07 14:08    27  ********
 327    2025-03-07 14:09    28  *********
 ...    ..(  3 skipped).    ..  *********
 331    2025-03-07 14:13    28  *********
 332    2025-03-07 14:14    27  ********
 ...    ..(  4 skipped).    ..  ********
 337    2025-03-07 14:19    27  ********
 338    2025-03-07 14:20    28  *********
 ...    ..( 11 skipped).    ..  *********
 350    2025-03-07 14:32    28  *********
 351    2025-03-07 14:33    30  ***********
 352    2025-03-07 14:34    28  *********
 353    2025-03-07 14:35    27  ********
 354    2025-03-07 14:36    27  ********
 355    2025-03-07 14:37    28  *********
 ...    ..( 14 skipped).    ..  *********
 370    2025-03-07 14:52    28  *********
 371    2025-03-07 14:53    27  ********
 372    2025-03-07 14:54    28  *********
 ...    ..(  6 skipped).    ..  *********
 379    2025-03-07 15:01    28  *********
 380    2025-03-07 15:02    27  ********
 381    2025-03-07 15:03    27  ********
 382    2025-03-07 15:04    27  ********
 383    2025-03-07 15:05    28  *********
 ...    ..(  2 skipped).    ..  *********
 386    2025-03-07 15:08    28  *********
 387    2025-03-07 15:09    27  ********
 388    2025-03-07 15:10    27  ********
 389    2025-03-07 15:11    28  *********
 ...    ..(  3 skipped).    ..  *********
 393    2025-03-07 15:15    28  *********
 394    2025-03-07 15:16    27  ********
 ...    ..(  4 skipped).    ..  ********
 399    2025-03-07 15:21    27  ********
 400    2025-03-07 15:22    28  *********
 ...    ..(  3 skipped).    ..  *********
 404    2025-03-07 15:26    28  *********
 405    2025-03-07 15:27    27  ********
 406    2025-03-07 15:28    27  ********
 407    2025-03-07 15:29    27  ********
 408    2025-03-07 15:30    28  *********
 ...    ..(  2 skipped).    ..  *********
 411    2025-03-07 15:33    28  *********
 412    2025-03-07 15:34    27  ********
 ...    ..(  3 skipped).    ..  ********
 416    2025-03-07 15:38    27  ********
 417    2025-03-07 15:39    28  *********
 418    2025-03-07 15:40    28  *********
 419    2025-03-07 15:41    28  *********
 420    2025-03-07 15:42    27  ********
 ...    ..(  4 skipped).    ..  ********
 425    2025-03-07 15:47    27  ********
 426    2025-03-07 15:48    28  *********
 ...    ..(  2 skipped).    ..  *********
 429    2025-03-07 15:51    28  *********
 430    2025-03-07 15:52    27  ********
 431    2025-03-07 15:53    27  ********
 432    2025-03-07 15:54    27  ********
 433    2025-03-07 15:55    28  *********
 ...    ..(  2 skipped).    ..  *********
 436    2025-03-07 15:58    28  *********
 437    2025-03-07 15:59    27  ********
 438    2025-03-07 16:00    28  *********
 439    2025-03-07 16:01    27  ********
 440    2025-03-07 16:02    28  *********
 441    2025-03-07 16:03    27  ********
 442    2025-03-07 16:04    28  *********
 ...    ..(  2 skipped).    ..  *********
 445    2025-03-07 16:07    28  *********
 446    2025-03-07 16:08    27  ********
 447    2025-03-07 16:09    28  *********
 448    2025-03-07 16:10    27  ********
 449    2025-03-07 16:11    27  ********
 450    2025-03-07 16:12    27  ********
 451    2025-03-07 16:13    28  *********
 ...    ..(  5 skipped).    ..  *********
 457    2025-03-07 16:19    28  *********
 458    2025-03-07 16:20    27  ********
 459    2025-03-07 16:21    27  ********
 460    2025-03-07 16:22    28  *********
 ...    ..(  6 skipped).    ..  *********
 467    2025-03-07 16:29    28  *********
 468    2025-03-07 16:30    27  ********
 469    2025-03-07 16:31    28  *********
 ...    ..(  7 skipped).    ..  *********
 477    2025-03-07 16:39    28  *********
   0    2025-03-07 16:40    27  ********
   1    2025-03-07 16:41    27  ********
   2    2025-03-07 16:42    28  *********
 ...    ..(  3 skipped).    ..  *********
   6    2025-03-07 16:46    28  *********
   7    2025-03-07 16:47    27  ********
 ...    ..(  2 skipped).    ..  ********
  10    2025-03-07 16:50    27  ********
  11    2025-03-07 16:51    28  *********
 ...    ..(  3 skipped).    ..  *********
  15    2025-03-07 16:55    28  *********
  16    2025-03-07 16:56    27  ********
  17    2025-03-07 16:57    27  ********
  18    2025-03-07 16:58    27  ********
  19    2025-03-07 16:59    28  *********
  20    2025-03-07 17:00    27  ********
  21    2025-03-07 17:01    27  ********
  22    2025-03-07 17:02    28  *********
 ...    ..(  2 skipped).    ..  *********
  25    2025-03-07 17:05    28  *********
  26    2025-03-07 17:06    27  ********
  27    2025-03-07 17:07    27  ********
  28    2025-03-07 17:08    28  *********
 ...    ..(  5 skipped).    ..  *********
  34    2025-03-07 17:14    28  *********
  35    2025-03-07 17:15    27  ********
  36    2025-03-07 17:16    27  ********
  37    2025-03-07 17:17    27  ********
  38    2025-03-07 17:18    28  *********
  39    2025-03-07 17:19    28  *********
  40    2025-03-07 17:20    28  *********
  41    2025-03-07 17:21    27  ********
 ...    ..(  5 skipped).    ..  ********
  47    2025-03-07 17:27    27  ********
  48    2025-03-07 17:28    28  *********
 ...    ..(  2 skipped).    ..  *********
  51    2025-03-07 17:31    28  *********
  52    2025-03-07 17:32    27  ********
 ...    ..(  3 skipped).    ..  ********
  56    2025-03-07 17:36    27  ********
  57    2025-03-07 17:37    28  *********
 ...    ..(  2 skipped).    ..  *********
  60    2025-03-07 17:40    28  *********
  61    2025-03-07 17:41    27  ********
 ...    ..(  2 skipped).    ..  ********
  64    2025-03-07 17:44    27  ********
  65    2025-03-07 17:45    28  *********
 ...    ..(  2 skipped).    ..  *********
  68    2025-03-07 17:48    28  *********
  69    2025-03-07 17:49    27  ********
 ...    ..(  3 skipped).    ..  ********
  73    2025-03-07 17:53    27  ********
  74    2025-03-07 17:54    28  *********
  75    2025-03-07 17:55    28  *********
  76    2025-03-07 17:56    28  *********
  77    2025-03-07 17:57    27  ********
 ...    ..(  5 skipped).    ..  ********
  83    2025-03-07 18:03    27  ********
  84    2025-03-07 18:04    28  *********
 ...    ..(  2 skipped).    ..  *********
  87    2025-03-07 18:07    28  *********

SCT Error Recovery Control command not supported

Device Statistics (GP Log 0x04)
Page  Offset Size        Value Flags Description
0x01  =====  =               =  ===  == General Statistics (rev 1) ==
0x01  0x008  4               6  ---  Lifetime Power-On Resets
0x01  0x010  4            1157  ---  Power-on Hours
0x01  0x018  6     26404982330  ---  Logical Sectors Written
0x01  0x020  6       998505594  ---  Number of Write Commands
0x01  0x028  6      2717776180  ---  Logical Sectors Read
0x01  0x030  6        37823576  ---  Number of Read Commands
0x01  0x038  6   1065014793012  ---  Date and Time TimeStamp
0x07  =====  =               =  ===  == Solid State Device Statistics (rev 1) ==
0x07  0x008  1              14  N--  Percentage Used Endurance Indicator
                                |||_ C monitored condition met
                                ||__ D supports DSN
                                |___ N normalized value

Pending Defects log (GP Log 0x0c) not supported

SATA Phy Event Counters (GP Log 0x11)
ID      Size     Value  Description
0x0001  2            4  Command failed due to ICRC error
0x0003  2            0  R_ERR response for device-to-host data FIS
0x0004  2            0  R_ERR response for host-to-device data FIS
0x0006  2            0  R_ERR response for device-to-host non-data FIS
0x0007  2            0  R_ERR response for host-to-device non-data FIS
0x0008  2            0  Device-to-host non-data FIS retries
0x0009  4            0  Transition from drive PhyRdy to drive PhyNRdy
0x000a  4            1  Device-to-host register FISes sent due to a COMRESET
0x000f  2            0  R_ERR response for host-to-device data FIS, CRC
0x0010  2            0  R_ERR response for host-to-device data FIS, non-CRC
0x0012  2            0  R_ERR response for host-to-device non-data FIS, CRC
0x0013  2            0  R_ERR response for host-to-device non-data FIS, non-CRC

I see several errors in there, but the disk shows “4” errors, which is a far cry from the TrueNAS console reading of 150k.

Can someone help explain what the “Error” count is in the above screen, and how it translates to actual failing drives? I’m taking this pool out and figuring out how to fund ordering better drives, but I’m confused by the disconnect between SMART and the errors displayed.

SMART ID 187 Reported_Uncorrect = 4; that’s not good.
Power_On_Hours is 1157, or a little over 48 days.
Is this drive new? It also shows very little data written, only about 6 MB, which I find hard to believe.
You may have found a bad batch of drives. But with all drives showing problems, I would start by checking the cables, and if you’re using an HBA for the drives, check that it isn’t cooking itself; they run hot and need lots of cooling.
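
As a rough cross-check of the figures quoted above, here is a small Python sketch that redoes the arithmetic from the smartctl output. It assumes 512-byte logical sectors, which the pasted output does not actually state, so treat the byte totals as estimates. Read as a sector count, attribute 241 gives only a few MB, while the Device Statistics log works out to roughly 13.5 TB. That is about 12,590 GiB, which suggests (but does not prove) that this drive reports attribute 241 in GiB-sized units rather than sectors.

```python
# Raw values copied from the smartctl output in this thread.
POWER_ON_HOURS = 1157
ATTR_241_RAW = 12590                    # attribute 241, Total_LBAs_Written
LOGICAL_SECTORS_WRITTEN = 26404982330   # Device Statistics (GP Log 0x04)

# Assumption: 512-byte logical sectors (not shown in the pasted output).
SECTOR_BYTES = 512


def hours_to_days(hours):
    """Convert power-on hours to days."""
    return hours / 24


def sectors_to_bytes(count, sector_bytes=SECTOR_BYTES):
    """Convert a sector count to bytes under the assumed sector size."""
    return count * sector_bytes


attr_bytes = sectors_to_bytes(ATTR_241_RAW)
stats_bytes = sectors_to_bytes(LOGICAL_SECTORS_WRITTEN)

print(f"Power-on time:                 {hours_to_days(POWER_ON_HOURS):.1f} days")
print(f"Attribute 241 read as sectors: {attr_bytes / 1e6:.1f} MB")
print(f"Device Statistics log:         {stats_bytes / 1e12:.2f} TB")
print(f"Device Statistics log in GiB:  {stats_bytes // 2**30}")
```

If the GiB reading is right, the drive has seen a substantial amount of writes in its 48 days, which fits the 14% "Percentage Used Endurance Indicator" better than 6 MB would.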

TrueNAS uses ZFS for data storage. When it reads data, it verifies a checksum on what it gets back from the drives (and, as long as you’re not simply striping all the drives together, it can repair a bad copy from redundancy). If the check fails, it logs that and will even tell you which drive failed its checksum. That’s where the 150k errors are from. If TrueNAS logs a lot of errors in quick succession from a drive, it will kick the drive out, but all your drives are doing that, so it can’t, and it drops the pool instead. I suspect a cabling failure.
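
To illustrate why the pool's error count can be so much larger than the drive's own, here is a toy Python sketch of checksum-on-read. It is not how ZFS is implemented: ZFS keeps checksums in its block pointers and defaults to fletcher4, whereas this uses SHA-256 and an in-memory dictionary purely as stand-ins. The `Pool` class and the device name `sda` are hypothetical. The point is that the filesystem counts an error whenever the bytes it receives do not match what it wrote, even if the drive itself reported a successful read.

```python
import hashlib


def checksum(data: bytes) -> str:
    """Stand-in checksum (ZFS itself defaults to fletcher4, not SHA-256)."""
    return hashlib.sha256(data).hexdigest()


class Pool:
    """Toy model: remember a checksum per block and verify it on every read."""

    def __init__(self):
        self.blocks = {}        # block_id -> (device, data, expected checksum)
        self.cksum_errors = {}  # device -> number of failed verifications

    def write(self, block_id, device, data):
        self.blocks[block_id] = (device, data, checksum(data))

    def read(self, block_id, returned=None):
        """`returned` simulates what actually arrived over the cable."""
        device, data, expected = self.blocks[block_id]
        got = data if returned is None else returned
        if checksum(got) != expected:
            # The drive may have reported success; the filesystem still counts it.
            self.cksum_errors[device] = self.cksum_errors.get(device, 0) + 1
            return None
        return got


pool = Pool()
pool.write(0, "sda", b"hello world")
pool.read(0)                          # clean read, no error counted
pool.read(0, returned=b"hellp world") # corrupted in transit, error counted
print(pool.cksum_errors)
```

In this model a flaky cable or backplane would raise the per-device count on every affected read without the drive ever logging anything, which is one way a pool could show 150k errors next to a SMART log showing 4.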

This is a new chassis, so I guess that’s possible, but the HBA is from the old one and it’s been rock solid for a long time. I went from a 12-bay chassis to a 36-bay chassis, the whole thing runs off the same HBA, and there are no issues with the other zpool, so I don’t suspect the backplane or HBA.

The drives were in service for less than 45 days, so I suspect a bad batch more than anything, but I’m not sure. I may make a big VM and spam it with data and run it on there just to stress test it a bit and see if I can generate more errors.
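
One simple way to "spam it with data" without building a whole VM is a write-then-verify loop. The following is a minimal Python sketch, not a proper burn-in tool: the function name, chunk counts, and any path you pass in are placeholders. Note that reading the file back right away may be served from cache rather than from the SSDs, so running a scrub afterwards (`zpool scrub`) and checking `zpool status -v` is still the more reliable way to see whether new errors appear.

```python
import hashlib
import os


def write_and_verify(path, chunks=64, chunk_size=1024 * 1024):
    """Write random chunks to `path`, then read them back and compare hashes.

    Returns the number of chunks whose contents changed between write and read.
    """
    digests = []
    with open(path, "wb") as f:
        for _ in range(chunks):
            data = os.urandom(chunk_size)
            digests.append(hashlib.sha256(data).digest())
            f.write(data)
        f.flush()
        os.fsync(f.fileno())  # ask the OS to push the data to stable storage

    mismatches = 0
    with open(path, "rb") as f:
        for expected in digests:
            if hashlib.sha256(f.read(chunk_size)).digest() != expected:
                mismatches += 1
    return mismatches


# Example call (hypothetical dataset path on the pool under test):
# print(write_and_verify("/mnt/ssdpool/stress/test.bin", chunks=1024))
```

Because the data is random, compression will not shrink it, so the full amount should actually be written to the drives.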