Bad Resilver problem and getting worse

Can you connect the flaky drive to your motherboard’s SATA headers? Even if you have to disassemble the case on a bench, as a test. It eliminates a lot of variables. Sorry if this has already been suggested, I’ve searched the thread and did not see it.


I thought it was best practice NOT to run SMART tests at the same time as a scrub/resilver? Not that it seemed to matter in this case, but I wanted to double-check my understanding.

IMO best next step.


No, you did not come off badly. I simply wanted to review the ZFS background, both for you and for other readers now and in the future. Sorry if I sounded like a teacher (I wrote computer documentation as part of my prior jobs…).

A simple cable disconnect of the drive trying to resilver. I’d do both power and data cables, just to isolate it completely.

As for ZFS scrubs every 4 weeks, that is mostly okay. Some people prefer every 2 weeks, and some want weekly (which I think is excessive). More than 4 weeks is, I think, too long. So 2 to 4 weeks is about right, unless the hardware is older, such as used disks; then closer to 2 weeks is probably better.

Make sure your disks get enough cooling during scrubs. They basically do non-stop seeks and reads until the scrub completes.


Firmware is good (there’s a P16.00.14.00 somewhere, but I think that beyond .10 it’s all about SSD behaviour). Is the additional power supply connected? (9300-16i typically have a 4-pin Molex).
75 cm should be within specs if backplane traces are short, but I’d be wary about these, especially if the problematic drives are behind these cables. I concur that using SATA cables from the motherboard could help.

If possible, yes indeed: running both at once puts maximal stress on the drives and extends the duration of both operations. But it can be done.

That, unfortunately, is NOT possible. Jonsbo N cases in general appear to have poor drive cooling because they use consumer-style fans to try to pull air from behind a backplane, and the N5 puts the PSU behind 4 of the 12 bays, so for the leftmost drives there really is no good path for airflow.


The 8 vertical drives behind the half-height backplane on the right may enjoy some air flow from the two 120mm fans:

But the 4 horizontal drives can hardly have any joy through this, with the PSU just behind:


No, it is not connected, but I figured this should be fine: an LSI 9300-16i should not draw more than roughly 30 watts, and the PCIe slot should be able to provide 75 watts if I’m not mistaken. As one possible troubleshooting step, I could buy an LSI 9400-16i or an LSI 9305-16i for a reasonable sum from the same reputable seller of refurbished server hardware that sold me my original LSI 9300-16i. It should be at my doorstep in a few days.

I have now started running long SMART tests on all 11 drives of the pool, using GSmartControl on the SystemRescue distro booted from a USB stick. I guess this will take at least 20 to 24 hours. I hope that all disks pass this long SMART test.
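As a rough sanity check on that time estimate, each drive reports its own recommended polling time for the extended self-test. The sketch below converts that value to hours and shows the equivalent `smartctl` command line for anyone not using GSmartControl. The function names are invented for illustration, and the 1374-minute figure is the one reported by the 16 TB Seagate in the report further down this thread; other drives will report different values.

```python
def long_test_command(device: str) -> list[str]:
    """Build (but do not run) the smartctl call that starts an extended self-test."""
    return ["smartctl", "--test=long", device]


def polling_hours(polling_minutes: int) -> float:
    """Convert the drive's recommended polling time from minutes to hours."""
    return polling_minutes / 60


if __name__ == "__main__":
    # /dev/sds is the device name that appears in the report in this thread.
    print(" ".join(long_test_command("/dev/sds")))
    # 1374 minutes is the extended self-test polling time from the same report.
    print(f"{polling_hours(1374):.1f} hours")
```

For that drive the result is just under 23 hours, which fits the 20 to 24 hour guess.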

Yes, airflow is suboptimal and worse than I imagined when I decided on the case. I have two Noctua NF-A12x25 fans in the back running at 100%; the drives are always at roughly 40-43 degrees, and at roughly 50-52 while a scrub happens. I think this is far from ideal but not absolutely terrible, but I might be mistaken…

Interestingly, the physical placement of the drives doesn’t make such a big difference (as one would expect from the layout of the case): the drives are normally all within 2-3 degrees of each other. Also, all drives that reported errors were on the board with eight drives. To be precise, the drive that originally faulted was in position 6. I put the replacement that I received from Seagate in the same slot, where it also faulted twice, as I described. The first time, the resilver completed. The second time, when I shut down the server to investigate, the drives right next to it (positions 7 and 8) reported the errors I described. The drive that is being resilvered is still in position 6 but no longer reports any errors. The drive that now always reports errors was initially in position 7 (and therefore on the cable to the HBA that serves positions 5, 6, 7 and 8). I have since moved it to position 1, where it shows the behaviour I told you about even though it is now on the cable that serves positions 1, 2, 3 and 4.

75 W is for a full x16 card. The 9300-16i is limited to 25 W and this is not enough for two LSI 3008 and a switch. Please connect additional power: It’s there for a reason.
A 9305 or 9400 would be fully powered from the PCIe slot, and run cooler, but this may not be necessary. (A plain 9300-8i supplemented by motherboard SATA ports would do as well.)
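The arithmetic behind this can be written down as a tiny sketch. The 30 W draw is the rough estimate given earlier in the thread, not a measured value, and the 25 W and 75 W slot limits are the figures quoted in this post; the function name is made up for illustration.

```python
def needs_aux_power(card_draw_w: float, slot_limit_w: float) -> bool:
    """Return True when the card's estimated draw exceeds what the slot may supply."""
    return card_draw_w > slot_limit_w


if __name__ == "__main__":
    estimated_draw_w = 30  # rough estimate from the thread, not a measurement
    print(needs_aux_power(estimated_draw_w, slot_limit_w=75))  # assumed x16 budget
    print(needs_aux_power(estimated_draw_w, slot_limit_w=25))  # limit quoted for the 9300-16i
```

Against the 75 W figure the card looks fine; against the 25 W limit it does not, which is the point of the auxiliary connector.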

If the long SMART tests show no issues, we might have nailed the hardware problem: power to the HBA!

I’d say this is a fair assessment (50°C is still “within specs” for warranty purposes), but some here would not be happy with these numbers and want to see 10°C less to preserve their drives for longer.
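To put numbers on that, readings can be compared with the limits the drive itself reports via SCT. In the smartctl report posted further down in this thread, the recommended maximum is 40°C and the hard limit is 60°C. A minimal sketch, with an invented function name and invented category labels:

```python
def classify_temperature(temp_c: float,
                         recommended_max_c: float = 40,
                         limit_max_c: float = 60) -> str:
    """Place a reading relative to the drive's own SCT temperature limits.

    The defaults are the values from the report in this thread; other
    drive models may report different limits.
    """
    if temp_c > limit_max_c:
        return "over limit"
    if temp_c > recommended_max_c:
        return "above recommended"
    return "within recommended"


if __name__ == "__main__":
    for reading in (39, 43, 52):  # idle and scrub temperatures mentioned above
        print(reading, classify_temperature(reading))
```

By that yardstick the scrub temperatures of 50-52°C are above the recommended range but still below the drive's stated limit.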


Once again, I am very thankful for all of your help and for making me understand the issues at hand.

OK, I understand. I had to search for the PCIe power cables of my power supply and luckily found them. I have now attached them to the LSI 9300-16i. Sadly, it didn’t fix my problem: the same behaviour as before continues. I completed the long SMART tests on all drives of the pool: 10 of them finished and passed (including the new 18TB drive, which is being resilvered and which initially faulted twice). But the test on the one that seems to be causing the problems, ST16000NM000H-3KW103 ZYD00P5Y, always gets interrupted after a few minutes. Here is the report.
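The self-test log in the report below can be tallied with a short script to see how often extended tests on this drive were interrupted. This is only a sketch: the log lines are copied from the report, while the regular expression and function name are my own and assume the column layout smartctl printed here.

```python
import re
from collections import Counter

# Self-test log entries copied from the smartctl report in this thread.
SELF_TEST_LOG = """\
# 1  Extended offline    Interrupted (host reset)      00%      7633         -
# 2  Extended offline    Interrupted (host reset)      00%      7631         -
# 3  Short offline       Completed without error       00%      7630         -
# 4  Extended offline    Interrupted (host reset)      00%      7507         -
# 5  Extended offline    Interrupted (host reset)      00%      7464         -
# 6  Short offline       Completed without error       00%      7330         -
# 7  Extended offline    Interrupted (host reset)      00%      7311         -
# 8  Short offline       Completed without error       00%      7242         -
# 9  Extended offline    Completed without error       00%      7165         -
#10  Short offline       Completed without error       00%      7040         -
#11  Extended offline    Interrupted (host reset)      00%      6994         -
#12  Short offline       Completed without error       00%      6971         -
#13  Extended offline    Interrupted (host reset)      00%      6959         -
#14  Short offline       Completed without error       00%      6902         -
#15  Extended offline    Completed without error       00%      6875         -
#16  Extended offline    Interrupted (host reset)      00%      6784         -
#17  Short offline       Completed without error       00%      6596         -
#18  Extended offline    Interrupted (host reset)      00%      6535         -
#19  Extended offline    Completed without error       00%      6492         -
"""

# Columns: number, test description, status, remaining %, lifetime hours, first-error LBA.
# Description and status contain single spaces, so columns are split on runs of 2+ spaces.
LINE_RE = re.compile(r"#\s*(\d+)\s+(.+?)\s{2,}(.+?)\s{2,}(\d+)%\s+(\d+)\s+(\S+)")


def tally_self_tests(log_text: str) -> Counter:
    """Count log entries per (test description, status) pair."""
    counts = Counter()
    for line in log_text.splitlines():
        match = LINE_RE.match(line.strip())
        if match:
            counts[(match.group(2), match.group(3))] += 1
    return counts


if __name__ == "__main__":
    for (test, status), n in sorted(tally_self_tests(SELF_TEST_LOG).items()):
        print(f"{n:2d}  {test:<18} {status}")
```

On these 19 entries, 9 of the 12 extended tests were interrupted by a host reset, while all 7 short tests completed without error.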

	
json_format_version	
0	1
1	0
smartctl	
version	
0	7
1	5
pre_release	false
svn_revision	"5714"
platform_info	"x86_64-linux-6.18.34-1-lts"
build_info	"(local build)"
argv	
0	"smartctl"
1	"--health"
2	"--info"
3	"--get=all"
4	"--capabilities"
5	"--attributes"
6	"--format=brief"
7	"--log=xerror,50,error"
8	"--log=xselftest,50,selftest"
9	"--log=selective"
10	"--log=directory"
11	"--log=scttemp"
12	"--log=scterc"
13	"--log=devstat"
14	"--log=sataphy"
15	"--json=o"
16	"/dev/sds"
output	
0	"smartctl 7.5 2025-04-30 r5714 [x86_64-linux-6.18.34-1-lts] (local build)"
1	"Copyright (C) 2002-25, Bruce Allen, Christian Franke, www.smartmontools.org"
2	""
3	"=== START OF INFORMATION SECTION ==="
4	"Device Model:     ST16000NM000H-3KW103"
5	"Serial Number:    ZYD00P5Y"
6	"LU WWN Device Id: 5 000c50 0e7fa6d6d"
7	"Firmware Version: EN01"
8	"User Capacity:    16,000,900,661,248 bytes [16.0 TB]"
9	"Sector Sizes:     512 bytes logical, 4096 bytes physical"
10	"Rotation Rate:    7200 rpm"
11	"Form Factor:      3.5 inches"
12	"Device is:        Not in smartctl database 7.5/5706"
13	"ATA Version is:   ACS-5 (minor revision not indicated)"
14	"SATA Version is:  SATA 3.3, 6.0 Gb/s (current: 6.0 Gb/s)"
15	"Local Time is:    Tue Jun 23 21:39:53 2026 UTC"
16	"SMART support is: Available - device has SMART capability."
17	"SMART support is: Enabled"
18	"AAM feature is:   Unavailable"
19	"APM feature is:   Unavailable"
20	"Rd look-ahead is: Enabled"
21	"Write cache is:   Enabled"
22	"DSN feature is:   Disabled"
23	"ATA Security is:  Disabled, NOT FROZEN [SEC1]"
24	""
25	"=== START OF READ SMART DATA SECTION ==="
26	"SMART overall-health self-assessment test result: PASSED"
27	""
28	"General SMART Values:"
29	"Offline data collection status:  (0x82)\tOffline data collection activity"
30	"\t\t\t\t\twas completed without error."
31	"\t\t\t\t\tAuto Offline Data Collection: Enabled."
32	"Self-test execution status:      (  32)\tThe self-test routine was interrupted"
33	"\t\t\t\t\tby the host with a hard or soft reset."
34	"Total time to complete Offline "
35	"data collection: \t\t(  567) seconds."
36	"Offline data collection"
37	"capabilities: \t\t\t (0x7b) SMART execute Offline immediate."
38	"\t\t\t\t\tAuto Offline data collection on/off support."
39	"\t\t\t\t\tSuspend Offline collection upon new"
40	"\t\t\t\t\tcommand."
41	"\t\t\t\t\tOffline surface scan supported."
42	"\t\t\t\t\tSelf-test supported."
43	"\t\t\t\t\tConveyance Self-test supported."
44	"\t\t\t\t\tSelective Self-test supported."
45	"SMART capabilities:            (0x0003)\tSaves SMART data before entering"
46	"\t\t\t\t\tpower-saving mode."
47	"\t\t\t\t\tSupports SMART auto save timer."
48	"Error logging capability:        (0x01)\tError logging supported."
49	"\t\t\t\t\tGeneral Purpose Logging supported."
50	"Short self-test routine "
51	"recommended polling time: \t (   2) minutes."
52	"Extended self-test routine"
53	"recommended polling time: \t (1374) minutes."
54	"Conveyance self-test routine"
55	"recommended polling time: \t (   3) minutes."
56	"SCT capabilities: \t       (0x50bd)\tSCT Status supported."
57	"\t\t\t\t\tSCT Error Recovery Control supported."
58	"\t\t\t\t\tSCT Feature Control supported."
59	"\t\t\t\t\tSCT Data Table supported."
60	""
61	"SMART Attributes Data Structure revision number: 10"
62	"Vendor Specific SMART Attributes with Thresholds:"
63	"ID# ATTRIBUTE_NAME          FLAGS    VALUE WORST THRESH FAIL RAW_VALUE"
64	"  1 Raw_Read_Error_Rate     POSR--   070   064   044    -    9581200"
65	"  3 Spin_Up_Time            PO----   093   090   000    -    0"
66	"  4 Start_Stop_Count        -O--CK   100   100   020    -    438"
67	"  5 Reallocated_Sector_Ct   PO--CK   100   100   010    -    0"
68	"  7 Seek_Error_Rate         POSR--   087   060   045    -    520340282"
69	"  9 Power_On_Hours          -O--CK   092   092   000    -    7634"
70	" 10 Spin_Retry_Count        PO--C-   100   100   097    -    0"
71	" 12 Power_Cycle_Count       -O--CK   100   100   020    -    438"
72	" 18 Unknown_Attribute       PO-R--   100   100   050    -    0"
73	"187 Reported_Uncorrect      -O--CK   100   100   000    -    0"
74	"188 Command_Timeout         -O--CK   100   100   000    -    0"
75	"190 Airflow_Temperature_Cel -O---K   061   036   000    -    39 (Min/Max 39/40)"
76	"192 Power-Off_Retract_Count -O--CK   100   100   000    -    265"
77	"193 Load_Cycle_Count        -O--CK   097   097   000    -    7447"
78	"194 Temperature_Celsius     -O---K   039   064   000    -    39 (0 21 0 0 0)"
79	"197 Current_Pending_Sector  -O--C-   100   100   000    -    0"
80	"198 Offline_Uncorrectable   ----C-   100   100   000    -    0"
81	"199 UDMA_CRC_Error_Count    -OSRCK   200   200   000    -    0"
82	"200 Multi_Zone_Error_Rate   PO---K   100   100   001    -    0"
83	"240 Head_Flying_Hours       ------   100   100   000    -    4345 (110 72 0)"
84	"241 Total_LBAs_Written      ------   100   253   000    -    73966167809"
85	"242 Total_LBAs_Read         ------   100   253   000    -    558276226123"
86	"                            ||||||_ K auto-keep"
87	"                            |||||__ C event count"
88	"                            ||||___ R error rate"
89	"                            |||____ S speed/performance"
90	"                            ||_____ O updated online"
91	"                            |______ P prefailure warning"
92	""
93	"General Purpose Log Directory Version 1"
94	"SMART           Log Directory Version 1 [multi-sector log support]"
95	"Address    Access  R/W   Size  Description"
96	"0x00       GPL,SL  R/O      1  Log Directory"
97	"0x01           SL  R/O      1  Summary SMART error log"
98	"0x02           SL  R/O      5  Comprehensive SMART error log"
99	"0x03       GPL     R/O      5  Ext. Comprehensive SMART error log"
100	"0x04       GPL     R/O    256  Device Statistics log"
101	"0x04       SL      R/O      8  Device Statistics log"
102	"0x06           SL  R/O      1  SMART self-test log"
103	"0x07       GPL     R/O      1  Extended self-test log"
104	"0x08       GPL     R/O      2  Power Conditions log"
105	"0x09           SL  R/W      1  Selective self-test log"
106	"0x0a       GPL     R/W      8  Device Statistics Notification"
107	"0x0c       GPL     R/O   2048  Pending Defects log"
108	"0x0f       GPL     R/O      2  Sense Data for Successful NCQ Cmds log"
109	"0x10       GPL     R/O      1  NCQ Command Error log"
110	"0x11       GPL     R/O      1  SATA Phy Event Counters log"
111	"0x13       GPL     R/O      1  SATA NCQ Send and Receive log"
112	"0x21       GPL     R/O      1  Write stream error log"
113	"0x22       GPL     R/O      1  Read stream error log"
114	"0x24       GPL     R/O    768  Current Device Internal Status Data log"
115	"0x2f       GPL     R/O      1  Sector Configuration log"
116	"0x30       GPL,SL  R/O      9  IDENTIFY DEVICE data log"
117	"0x80-0x9f  GPL,SL  R/W     16  Host vendor specific log"
118	"0xa1       GPL,SL  VS     160  Device vendor specific log"
119	"0xa2       GPL     VS   16320  Device vendor specific log"
120	"0xa4       GPL,SL  VS     160  Device vendor specific log"
121	"0xa6       GPL     VS     192  Device vendor specific log"
122	"0xa8-0xa9  GPL,SL  VS     136  Device vendor specific log"
123	"0xab       GPL     VS       1  Device vendor specific log"
124	"0xad       GPL     VS      16  Device vendor specific log"
125	"0xb1       GPL,SL  VS     160  Device vendor specific log"
126	"0xb4       GPL,SL  VS      16  Device vendor specific log"
127	"0xb6       GPL     VS    1920  Device vendor specific log"
128	"0xbe-0xbf  GPL     VS   65535  Device vendor specific log"
129	"0xc1       GPL,SL  VS       8  Device vendor specific log"
130	"0xc3       GPL,SL  VS      32  Device vendor specific log"
131	"0xc6       GPL     VS    5184  Device vendor specific log"
132	"0xc7       GPL,SL  VS       8  Device vendor specific log"
133	"0xc9       GPL,SL  VS       8  Device vendor specific log"
134	"0xca       GPL,SL  VS      16  Device vendor specific log"
135	"0xcd       GPL,SL  VS       1  Device vendor specific log"
136	"0xce       GPL     VS       1  Device vendor specific log"
137	"0xcf       GPL     VS     512  Device vendor specific log"
138	"0xd1       GPL     VS     656  Device vendor specific log"
139	"0xd2       GPL     VS   10256  Device vendor specific log"
140	"0xd4       GPL     VS    2048  Device vendor specific log"
141	"0xda       GPL,SL  VS       1  Device vendor specific log"
142	"0xe0       GPL,SL  R/W      1  SCT Command/Status"
143	"0xe1       GPL,SL  R/W      1  SCT Data Transfer"
144	""
145	"SMART Extended Comprehensive Error Log Version: 1 (5 sectors)"
146	"No Errors Logged"
147	""
148	"SMART Extended Self-test Log Version: 1 (1 sectors)"
149	"Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error"
150	"# 1  Extended offline    Interrupted (host reset)      00%      7633         -"
151	"# 2  Extended offline    Interrupted (host reset)      00%      7631         -"
152	"# 3  Short offline       Completed without error       00%      7630         -"
153	"# 4  Extended offline    Interrupted (host reset)      00%      7507         -"
154	"# 5  Extended offline    Interrupted (host reset)      00%      7464         -"
155	"# 6  Short offline       Completed without error       00%      7330         -"
156	"# 7  Extended offline    Interrupted (host reset)      00%      7311         -"
157	"# 8  Short offline       Completed without error       00%      7242         -"
158	"# 9  Extended offline    Completed without error       00%      7165         -"
159	"#10  Short offline       Completed without error       00%      7040         -"
160	"#11  Extended offline    Interrupted (host reset)      00%      6994         -"
161	"#12  Short offline       Completed without error       00%      6971         -"
162	"#13  Extended offline    Interrupted (host reset)      00%      6959         -"
163	"#14  Short offline       Completed without error       00%      6902         -"
164	"#15  Extended offline    Completed without error       00%      6875         -"
165	"#16  Extended offline    Interrupted (host reset)      00%      6784         -"
166	"#17  Short offline       Completed without error       00%      6596         -"
167	"#18  Extended offline    Interrupted (host reset)      00%      6535         -"
168	"#19  Extended offline    Completed without error       00%      6492         -"
169	""
170	"SMART Selective self-test log data structure revision number 1"
171	" SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS"
172	"    1        0        0  Not_testing"
173	"    2        0        0  Not_testing"
174	"    3        0        0  Not_testing"
175	"    4        0        0  Not_testing"
176	"    5        0        0  Not_testing"
177	"Selective self-test flags (0x0):"
178	"  After scanning selected spans, do NOT read-scan remainder of disk."
179	"If Selective self-test is pending on power-up, resume after 0 minute delay."
180	""
181	"SCT Status Version:                  3"
182	"SCT Version (vendor specific):       522 (0x020a)"
183	"Device State:                        Active (0)"
184	"Current Temperature:                    39 Celsius"
185	"Power Cycle Min/Max Temperature:     39/41 Celsius"
186	"Lifetime    Min/Max Temperature:     21/64 Celsius"
187	"Specified Max Operating Temperature:    60 Celsius"
188	"Under/Over Temperature Limit Count:   0/0"
189	"SMART Status:                        0xc24f (PASSED)"
190	"Vendor specific:"
191	"00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00"
192	"00 00 00 00 01 00 00 00 00 00 00 00 00 00 00 00"
193	""
194	"SCT Temperature History Version:     2"
195	"Temperature Sampling Period:         5 minutes"
196	"Temperature Logging Interval:        59 minutes"
197	"Min/Max recommended Temperature:     10/40 Celsius"
198	"Min/Max Temperature Limit:            5/60 Celsius"
199	"Temperature History Size (Index):    128 (26)"
200	""
201	"Index    Estimated Time   Temperature Celsius"
202	"  27    2026-06-18 16:07    33  **************"
203	"  28    2026-06-18 17:06     ?  -"
204	"  29    2026-06-18 18:05    26  *******"
205	"  30    2026-06-18 19:04    43  ************************"
206	"  31    2026-06-18 20:03    45  **************************"
207	"  32    2026-06-18 21:02    44  *************************"
208	"  33    2026-06-18 22:01    40  *********************"
209	"  34    2026-06-18 23:00     ?  -"
210	"  35    2026-06-18 23:59    26  *******"
211	"  36    2026-06-19 00:58    37  ******************"
212	"  37    2026-06-19 01:57    48  *****************************"
213	"  38    2026-06-19 02:56    50  *******************************"
214	"  39    2026-06-19 03:55    49  ******************************"
215	"  40    2026-06-19 04:54    48  *****************************"
216	"  41    2026-06-19 05:53    48  *****************************"
217	"  42    2026-06-19 06:52    49  ******************************"
218	"  43    2026-06-19 07:51    50  *******************************"
219	"  44    2026-06-19 08:50    49  ******************************"
220	"  45    2026-06-19 09:49    49  ******************************"
221	"  46    2026-06-19 10:48    49  ******************************"
222	"  47    2026-06-19 11:47    50  *******************************"
223	"  48    2026-06-19 12:46    49  ******************************"
224	"  49    2026-06-19 13:45    50  *******************************"
225	"  50    2026-06-19 14:44    50  *******************************"
226	"  51    2026-06-19 15:43    49  ******************************"
227	" ...    ..(  3 skipped).    ..  ******************************"
228	"  55    2026-06-19 19:39    49  ******************************"
229	"  56    2026-06-19 20:38    48  *****************************"
230	"  57    2026-06-19 21:37    49  ******************************"
231	" ...    ..(  2 skipped).    ..  ******************************"
232	"  60    2026-06-20 00:34    49  ******************************"
233	"  61    2026-06-20 01:33    48  *****************************"
234	"  62    2026-06-20 02:32    38  *******************"
235	"  63    2026-06-20 03:31    38  *******************"
236	"  64    2026-06-20 04:30    47  ****************************"
237	"  65    2026-06-20 05:29    51  ********************************"
238	"  66    2026-06-20 06:28    51  ********************************"
239	"  67    2026-06-20 07:27    50  *******************************"
240	" ...    ..(  2 skipped).    ..  *******************************"
241	"  70    2026-06-20 10:24    50  *******************************"
242	"  71    2026-06-20 11:23    49  ******************************"
243	"  72    2026-06-20 12:22     ?  -"
244	"  73    2026-06-20 13:21    33  **************"
245	"  74    2026-06-20 14:20     ?  -"
246	"  75    2026-06-20 15:19    36  *****************"
247	"  76    2026-06-20 16:18     ?  -"
248	"  77    2026-06-20 17:17    27  ********"
249	"  78    2026-06-20 18:16     ?  -"
250	"  79    2026-06-20 19:15    31  ************"
251	"  80    2026-06-20 20:14     ?  -"
252	"  81    2026-06-20 21:13    32  *************"
253	"  82    2026-06-20 22:12     ?  -"
254	"  83    2026-06-20 23:11    28  *********"
255	"  84    2026-06-21 00:10     ?  -"
256	"  85    2026-06-21 01:09    32  *************"
257	"  86    2026-06-21 02:08     ?  -"
258	"  87    2026-06-21 03:07    32  *************"
259	"  88    2026-06-21 04:06     ?  -"
260	"  89    2026-06-21 05:05    33  **************"
261	"  90    2026-06-21 06:04     ?  -"
262	"  91    2026-06-21 07:03    34  ***************"
263	"  92    2026-06-21 08:02     ?  -"
264	"  93    2026-06-21 09:01    34  ***************"
265	"  94    2026-06-21 10:00     ?  -"
266	"  95    2026-06-21 10:59    29  **********"
267	"  96    2026-06-21 11:58     ?  -"
268	"  97    2026-06-21 12:57    33  **************"
269	"  98    2026-06-21 13:56     ?  -"
270	"  99    2026-06-21 14:55    35  ****************"
271	" 100    2026-06-21 15:54     ?  -"
272	" 101    2026-06-21 16:53    35  ****************"
273	" 102    2026-06-21 17:52     ?  -"
274	" 103    2026-06-21 18:51    39  ********************"
275	" 104    2026-06-21 19:50     ?  -"
276	" 105    2026-06-21 20:49    39  ********************"
277	" 106    2026-06-21 21:48     ?  -"
278	" 107    2026-06-21 22:47    39  ********************"
279	" 108    2026-06-21 23:46     ?  -"
280	" 109    2026-06-22 00:45    40  *********************"
281	" 110    2026-06-22 01:44     ?  -"
282	" 111    2026-06-22 02:43    37  ******************"
283	" 112    2026-06-22 03:42     ?  -"
284	" 113    2026-06-22 04:41    39  ********************"
285	" 114    2026-06-22 05:40     ?  -"
286	" 115    2026-06-22 06:39    40  *********************"
287	" 116    2026-06-22 07:38    37  ******************"
288	" 117    2026-06-22 08:37    37  ******************"
289	" 118    2026-06-22 09:36    37  ******************"
290	" 119    2026-06-22 10:35     ?  -"
291	" 120    2026-06-22 11:34    37  ******************"
292	" 121    2026-06-22 12:33     ?  -"
293	" 122    2026-06-22 13:32    39  ********************"
294	" 123    2026-06-22 14:31     ?  -"
295	" 124    2026-06-22 15:30    28  *********"
296	" 125    2026-06-22 16:29     ?  -"
297	" 126    2026-06-22 17:28    34  ***************"
298	" 127    2026-06-22 18:27     ?  -"
299	"   0    2026-06-22 19:26    34  ***************"
300	"   1    2026-06-22 20:25     ?  -"
301	"   2    2026-06-22 21:24    35  ****************"
302	"   3    2026-06-22 22:23     ?  -"
303	"   4    2026-06-22 23:22    38  *******************"
304	"   5    2026-06-23 00:21     ?  -"
305	"   6    2026-06-23 01:20    39  ********************"
306	"   7    2026-06-23 02:19     ?  -"
307	"   8    2026-06-23 03:18    38  *******************"
308	"   9    2026-06-23 04:17     ?  -"
309	"  10    2026-06-23 05:16    27  ********"
310	"  11    2026-06-23 06:15     ?  -"
311	"  12    2026-06-23 07:14    34  ***************"
312	"  13    2026-06-23 08:13     ?  -"
313	"  14    2026-06-23 09:12    36  *****************"
314	"  15    2026-06-23 10:11     ?  -"
315	"  16    2026-06-23 11:10    31  ************"
316	"  17    2026-06-23 12:09     ?  -"
317	"  18    2026-06-23 13:08    34  ***************"
318	"  19    2026-06-23 14:07     ?  -"
319	"  20    2026-06-23 15:06    28  *********"
320	"  21    2026-06-23 16:05     ?  -"
321	"  22    2026-06-23 17:04    37  ******************"
322	"  23    2026-06-23 18:03    39  ********************"
323	"  24    2026-06-23 19:02     ?  -"
324	"  25    2026-06-23 20:01    40  *********************"
325	"  26    2026-06-23 21:00    39  ********************"
326	""
327	"SCT Error Recovery Control:"
328	"           Read:     70 (7.0 seconds)"
329	"          Write:     70 (7.0 seconds)"
330	""
331	"Device Statistics (GP Log 0x04)"
332	"Page  Offset Size        Value Flags Description"
333	"0x01  =====  =               =  ===  == General Statistics (rev 1) =="
334	"0x01  0x008  4             438  ---  Lifetime Power-On Resets"
335	"0x01  0x010  4            7634  ---  Power-on Hours"
336	"0x01  0x018  6     72175362799  ---  Logical Sectors Written"
337	"0x01  0x020  6      1578807047  ---  Number of Write Commands"
338	"0x01  0x028  6    544071312526  ---  Logical Sectors Read"
339	"0x01  0x030  6      1773117277  ---  Number of Read Commands"
340	"0x01  0x038  6               -  ---  Date and Time TimeStamp"
341	"0x03  =====  =               =  ===  == Rotating Media Statistics (rev 1) =="
342	"0x03  0x008  4            7418  ---  Spindle Motor Power-on Hours"
343	"0x03  0x010  4            4345  ---  Head Flying Hours"
344	"0x03  0x018  4            7447  ---  Head Load Events"
345	"0x03  0x020  4               0  ---  Number of Reallocated Logical Sectors"
346	"0x03  0x028  4               0  ---  Read Recovery Attempts"
347	"0x03  0x030  4               0  ---  Number of Mechanical Start Failures"
348	"0x03  0x038  4               0  ---  Number of Realloc. Candidate Logical Sectors"
349	"0x03  0x040  4             265  ---  Number of High Priority Unload Events"
350	"0x04  =====  =               =  ===  == General Errors Statistics (rev 1) =="
351	"0x04  0x008  4               0  ---  Number of Reported Uncorrectable Errors"
352	"0x04  0x010  4               0  ---  Resets Between Cmd Acceptance and Completion"
353	"0x04  0x018  4               0  -D-  Physical Element Status Changed"
354	"0x05  =====  =               =  ===  == Temperature Statistics (rev 1) =="
355	"0x05  0x008  1              39  ---  Current Temperature"
356	"0x05  0x010  1              46  ---  Average Short Term Temperature"
357	"0x05  0x018  1              40  ---  Average Long Term Temperature"
358	"0x05  0x020  1              63  ---  Highest Temperature"
359	"0x05  0x028  1              24  ---  Lowest Temperature"
360	"0x05  0x030  1              60  ---  Highest Average Short Term Temperature"
361	"0x05  0x038  1              34  ---  Lowest Average Short Term Temperature"
362	"0x05  0x040  1              45  ---  Highest Average Long Term Temperature"
363	"0x05  0x048  1              38  ---  Lowest Average Long Term Temperature"
364	"0x05  0x050  4            3916  ---  Time in Over-Temperature"
365	"0x05  0x058  1              60  ---  Specified Maximum Operating Temperature"
366	"0x05  0x060  4               0  ---  Time in Under-Temperature"
367	"0x05  0x068  1               5  ---  Specified Minimum Operating Temperature"
368	"0x06  =====  =               =  ===  == Transport Statistics (rev 1) =="
369	"0x06  0x008  4             289  ---  Number of Hardware Resets"
370	"0x06  0x010  4              35  ---  Number of ASR Events"
371	"0x06  0x018  4               0  ---  Number of Interface CRC Errors"
372	"0xff  =====  =               =  ===  == Vendor Specific Statistics (rev 1) =="
373	"0xff  0x008  7               0  ---  Vendor Specific"
374	"0xff  0x010  7               0  ---  Vendor Specific"
375	"0xff  0x018  7               0  ---  Vendor Specific"
376	"                                |||_ C monitored condition met"
377	"                                ||__ D supports DSN"
378	"                                |___ N normalized value"
379	""
380	"SATA Phy Event Counters (GP Log 0x11)"
381	"ID      Size     Value  Description"
382	"0x000a  2            1  Device-to-host register FISes sent due to a COMRESET"
383	"0x0001  2            0  Command failed due to ICRC error"
384	"0x0003  2            0  R_ERR response for device-to-host data FIS"
385	"0x0004  2            0  R_ERR response for host-to-device data FIS"
386	"0x0006  2            0  R_ERR response for device-to-host non-data FIS"
387	"0x0007  2            0  R_ERR response for host-to-device non-data FIS"
388	"0x000b  2            0  CRC errors within host-to-device FIS"
389	"0x000d  2            0  Non-CRC errors within host-to-device FIS"
390	""
drive_database_version	
string	"7.5/5706"
exit_status	0
local_time	
time_t	1782250793
asctime	"Tue Jun 23 21:39:53 2026 UTC"
device	
name	"/dev/sds"
info_name	"/dev/sds [SAT]"
type	"sat"
protocol	"ATA"
model_name	"ST16000NM000H-3KW103"
serial_number	"ZYD00P5Y"
wwn	
naa	5
oui	3152
id	3891948909
firmware_version	"EN01"
user_capacity	
blocks	31251759104
bytes	16000900661248
logical_block_size	512
physical_block_size	4096
rotation_rate	7200
form_factor	
ata_value	2
name	"3.5 inches"
trim	
supported	false
in_smartctl_database	false
ata_version	
string	"ACS-5 (minor revision not indicated)"
major_value	8160
minor_value	65535
sata_version	
string	"SATA 3.3"
value	511
interface_speed	
max	
sata_value	14
string	"6.0 Gb/s"
units_per_second	60
bits_per_unit	100000000
current	
sata_value	3
string	"6.0 Gb/s"
units_per_second	60
bits_per_unit	100000000
smart_support	
available	true
enabled	true
read_lookahead	
enabled	true
write_cache	
enabled	true
ata_dsn	
enabled	false
ata_security	
state	33
string	"Disabled, NOT FROZEN [SEC1]"
enabled	false
frozen	false
master_password_id	65534
smart_status	
passed	true
ata_smart_data	
offline_data_collection	
status	
value	130
string	"was completed without error"
passed	true
completion_seconds	567
self_test	
status	
value	32
string	"was interrupted by the host with a reset"
polling_minutes	
short	2
extended	1374
conveyance	3
capabilities	
values	
0	123
1	3
exec_offline_immediate_supported	true
offline_is_aborted_upon_new_cmd	false
offline_surface_scan_supported	true
self_tests_supported	true
conveyance_self_test_supported	true
selective_self_test_supported	true
attribute_autosave_enabled	true
error_logging_supported	true
gp_logging_supported	true
ata_sct_capabilities	
value	20669
error_recovery_control_supported	true
feature_control_supported	true
data_table_supported	true
ata_smart_attributes	
revision	10
table	
0	
id	1
name	"Raw_Read_Error_Rate"
value	70
worst	64
thresh	44
when_failed	""
flags	
value	15
string	"POSR-- "
prefailure	true
updated_online	true
performance	true
error_rate	true
event_count	false
auto_keep	false
raw	
value	9581200
string	"9581200"
1	
id	3
name	"Spin_Up_Time"
value	93
worst	90
thresh	0
when_failed	""
flags	
value	3
string	"PO---- "
prefailure	true
updated_online	true
performance	false
error_rate	false
event_count	false
auto_keep	false
raw	
value	0
string	"0"
2	
id	4
name	"Start_Stop_Count"
value	100
worst	100
thresh	20
when_failed	""
flags	
value	50
string	"-O--CK "
prefailure	false
updated_online	true
performance	false
error_rate	false
event_count	true
auto_keep	true
raw	
value	438
string	"438"
3	
id	5
name	"Reallocated_Sector_Ct"
value	100
worst	100
thresh	10
when_failed	""
flags	
value	51
string	"PO--CK "
prefailure	true
updated_online	true
performance	false
error_rate	false
event_count	true
auto_keep	true
raw	
value	0
string	"0"
4	
id	7
name	"Seek_Error_Rate"
value	87
worst	60
thresh	45
when_failed	""
flags	
value	15
string	"POSR-- "
prefailure	true
updated_online	true
performance	true
error_rate	true
event_count	false
auto_keep	false
raw	
value	520340282
string	"520340282"
5	
id	9
name	"Power_On_Hours"
value	92
worst	92
thresh	0
when_failed	""
flags	
value	50
string	"-O--CK "
prefailure	false
updated_online	true
performance	false
error_rate	false
event_count	true
auto_keep	true
raw	
value	7634
string	"7634"
6	
id	10
name	"Spin_Retry_Count"
value	100
worst	100
thresh	97
when_failed	""
flags	
value	19
string	"PO--C- "
prefailure	true
updated_online	true
performance	false
error_rate	false
event_count	true
auto_keep	false
raw	
value	0
string	"0"
7	
id	12
name	"Power_Cycle_Count"
value	100
worst	100
thresh	20
when_failed	""
flags	
value	50
string	"-O--CK "
prefailure	false
updated_online	true
performance	false
error_rate	false
event_count	true
auto_keep	true
raw	
value	438
string	"438"
8	
id	18
name	"Unknown_Attribute"
value	100
worst	100
thresh	50
when_failed	""
flags	
value	11
string	"PO-R-- "
prefailure	true
updated_online	true
performance	false
error_rate	true
event_count	false
auto_keep	false
raw	
value	0
string	"0"
9	
id	187
name	"Reported_Uncorrect"
value	100
worst	100
thresh	0
when_failed	""
flags	
value	50
string	"-O--CK "
prefailure	false
updated_online	true
performance	false
error_rate	false
event_count	true
auto_keep	true
raw	
value	0
string	"0"
10	
id	188
name	"Command_Timeout"
value	100
worst	100
thresh	0
when_failed	""
flags	
value	50
string	"-O--CK "
prefailure	false
updated_online	true
performance	false
error_rate	false
event_count	true
auto_keep	true
raw	
value	0
string	"0"
11	
id	190
name	"Airflow_Temperature_Cel"
value	61
worst	36
thresh	0
when_failed	""
flags	
value	34
string	"-O---K "
prefailure	false
updated_online	true
performance	false
error_rate	false
event_count	false
auto_keep	true
raw	
value	673644583
string	"39 (Min/Max 39/40)"
12	
id	192
name	"Power-Off_Retract_Count"
value	100
worst	100
thresh	0
when_failed	""
flags	
value	50
string	"-O--CK "
prefailure	false
updated_online	true
performance	false
error_rate	false
event_count	true
auto_keep	true
raw	
value	265
string	"265"
13	
id	193
name	"Load_Cycle_Count"
value	97
worst	97
thresh	0
when_failed	""
flags	
value	50
string	"-O--CK "
prefailure	false
updated_online	true
performance	false
error_rate	false
event_count	true
auto_keep	true
raw	
value	7447
string	"7447"
14	
id	194
name	"Temperature_Celsius"
value	39
worst	64
thresh	0
when_failed	""
flags	
value	34
string	"-O---K "
prefailure	false
updated_online	true
performance	false
error_rate	false
event_count	false
auto_keep	true
raw	
value	90194313255
string	"39 (0 21 0 0 0)"
15	
id	197
name	"Current_Pending_Sector"
value	100
worst	100
thresh	0
when_failed	""
flags	
value	18
string	"-O--C- "
prefailure	false
updated_online	true
performance	false
error_rate	false
event_count	true
auto_keep	false
raw	
value	0
string	"0"
16	
id	198
name	"Offline_Uncorrectable"
value	100
worst	100
thresh	0
when_failed	""
flags	
value	16
string	"----C- "
prefailure	false
updated_online	false
performance	false
error_rate	false
event_count	true
auto_keep	false
raw	
value	0
string	"0"
17	
id	199
name	"UDMA_CRC_Error_Count"
value	200
worst	200
thresh	0
when_failed	""
flags	
value	62
string	"-OSRCK "
prefailure	false
updated_online	true
performance	true
error_rate	true
event_count	true
auto_keep	true
raw	
value	0
string	"0"
18	
id	200
name	"Multi_Zone_Error_Rate"
value	100
worst	100
thresh	1
when_failed	""
flags	
value	35
string	"PO---K "
prefailure	true
updated_online	true
performance	false
error_rate	false
event_count	false
auto_keep	true
raw	
value	0
string	"0"
19	
id	240
name	"Head_Flying_Hours"
value	100
worst	100
thresh	0
when_failed	""
flags	
value	0
string	"------ "
prefailure	false
updated_online	false
performance	false
error_rate	false
event_count	false
auto_keep	false
raw	
value	121255516705017
string	"4345 (110 72 0)"
20	
id	241
name	"Total_LBAs_Written"
value	100
worst	253
thresh	0
when_failed	""
flags	
value	0
string	"------ "
prefailure	false
updated_online	false
performance	false
error_rate	false
event_count	false
auto_keep	false
raw	
value	73966167809
string	"73966167809"
21	
id	242
name	"Total_LBAs_Read"
value	100
worst	253
thresh	0
when_failed	""
flags	
value	0
string	"------ "
prefailure	false
updated_online	false
performance	false
error_rate	false
event_count	false
auto_keep	false
raw	
value	558276226123
string	"558276226123"
spare_available	
current_percent	100
threshold_percent	10
power_on_time	
hours	7634
power_cycle_count	438
temperature	
current	39
power_cycle_min	39
power_cycle_max	41
lifetime_min	24
lifetime_max	63
op_limit_max	60
op_limit_min	5
limit_min	5
limit_max	60
lifetime_over_limit_minutes	3916
lifetime_under_limit_minutes	0
ata_log_directory	
gp_dir_version	1
smart_dir_version	1
smart_dir_multi_sector	true
table	
0	
address	0
name	"Log Directory"
read	true
write	false
gp_sectors	1
smart_sectors	1
1	
address	1
name	"Summary SMART error log"
read	true
write	false
smart_sectors	1
2	
address	2
name	"Comprehensive SMART error log"
read	true
write	false
smart_sectors	5
3	
address	3
name	"Ext. Comprehensive SMART error log"
read	true
write	false
gp_sectors	5
4	
address	4
name	"Device Statistics log"
read	true
write	false
gp_sectors	256
smart_sectors	8
5	
address	6
name	"SMART self-test log"
read	true
write	false
smart_sectors	1
6	
address	7
name	"Extended self-test log"
read	true
write	false
gp_sectors	1
7	
address	8
name	"Power Conditions log"
read	true
write	false
gp_sectors	2
8	
address	9
name	"Selective self-test log"
read	true
write	true
smart_sectors	1
9	
address	10
name	"Device Statistics Notification"
read	true
write	true
gp_sectors	8
10	
address	12
name	"Pending Defects log"
read	true
write	false
gp_sectors	2048
11	
address	15
name	"Sense Data for Successful NCQ Cmds log"
read	true
write	false
gp_sectors	2
12	
address	16
name	"NCQ Command Error log"
read	true
write	false
gp_sectors	1
13	
address	17
name	"SATA Phy Event Counters log"
read	true
write	false
gp_sectors	1
14	
address	19
name	"SATA NCQ Send and Receive log"
read	true
write	false
gp_sectors	1
15	
address	33
name	"Write stream error log"
read	true
write	false
gp_sectors	1
16	
address	34
name	"Read stream error log"
read	true
write	false
gp_sectors	1
17	
address	36
name	"Current Device Internal Status Data log"
read	true
write	false
gp_sectors	768
18	
address	47
name	"Sector Configuration log"
read	true
write	false
gp_sectors	1
19	
address	48
name	"IDENTIFY DEVICE data log"
read	true
write	false
gp_sectors	9
smart_sectors	9
20	
address	128
name	"Host vendor specific log"
read	true
write	true
gp_sectors	16
smart_sectors	16
21	
address	129
name	"Host vendor specific log"
read	true
write	true
gp_sectors	16
smart_sectors	16
22	
address	130
name	"Host vendor specific log"
read	true
write	true
gp_sectors	16
smart_sectors	16
23	
address	131
name	"Host vendor specific log"
read	true
write	true
gp_sectors	16
smart_sectors	16
24	
address	132
name	"Host vendor specific log"
read	true
write	true
gp_sectors	16
smart_sectors	16
25	
address	133
name	"Host vendor specific log"
read	true
write	true
gp_sectors	16
smart_sectors	16
26	
address	134
name	"Host vendor specific log"
read	true
write	true
gp_sectors	16
smart_sectors	16
27	
address	135
name	"Host vendor specific log"
read	true
write	true
gp_sectors	16
smart_sectors	16
28	
address	136
name	"Host vendor specific log"
read	true
write	true
gp_sectors	16
smart_sectors	16
29	
address	137
name	"Host vendor specific log"
read	true
write	true
gp_sectors	16
smart_sectors	16
30	
address	138
name	"Host vendor specific log"
read	true
write	true
gp_sectors	16
smart_sectors	16
31	
address	139
name	"Host vendor specific log"
read	true
write	true
gp_sectors	16
smart_sectors	16
32	
address	140
name	"Host vendor specific log"
read	true
write	true
gp_sectors	16
smart_sectors	16
33	
address	141
name	"Host vendor specific log"
read	true
write	true
gp_sectors	16
smart_sectors	16
34	
address	142
name	"Host vendor specific log"
read	true
write	true
gp_sectors	16
smart_sectors	16
35	
address	143
name	"Host vendor specific log"
read	true
write	true
gp_sectors	16
smart_sectors	16
36	
address	144
name	"Host vendor specific log"
read	true
write	true
gp_sectors	16
smart_sectors	16
37	
address	145
name	"Host vendor specific log"
read	true
write	true
gp_sectors	16
smart_sectors	16
38	
address	146
name	"Host vendor specific log"
read	true
write	true
gp_sectors	16
smart_sectors	16
39	
address	147
name	"Host vendor specific log"
read	true
write	true
gp_sectors	16
smart_sectors	16
40	
address	148
name	"Host vendor specific log"
read	true
write	true
gp_sectors	16
smart_sectors	16
41	
address	149
name	"Host vendor specific log"
read	true
write	true
gp_sectors	16
smart_sectors	16
42	
address	150
name	"Host vendor specific log"
read	true
write	true
gp_sectors	16
smart_sectors	16
43	
address	151
name	"Host vendor specific log"
read	true
write	true
gp_sectors	16
smart_sectors	16
44	
address	152
name	"Host vendor specific log"
read	true
write	true
gp_sectors	16
smart_sectors	16
45	
address	153
name	"Host vendor specific log"
read	true
write	true
gp_sectors	16
smart_sectors	16
46	
address	154
name	"Host vendor specific log"
read	true
write	true
gp_sectors	16
smart_sectors	16
47	
address	155
name	"Host vendor specific log"
read	true
write	true
gp_sectors	16
smart_sectors	16
48	
address	156
name	"Host vendor specific log"
read	true
write	true
gp_sectors	16
smart_sectors	16
49	
address	157
name	"Host vendor specific log"
read	true
write	true
gp_sectors	16
smart_sectors	16
50	
address	158
name	"Host vendor specific log"
read	true
write	true
gp_sectors	16
smart_sectors	16
51	
address	159
name	"Host vendor specific log"
read	true
write	true
gp_sectors	16
smart_sectors	16
52	
address	161
name	"Device vendor specific log"
gp_sectors	160
smart_sectors	160
53	
address	162
name	"Device vendor specific log"
gp_sectors	16320
54	
address	164
name	"Device vendor specific log"
gp_sectors	160
smart_sectors	160
55	
address	166
name	"Device vendor specific log"
gp_sectors	192
56	
address	168
name	"Device vendor specific log"
gp_sectors	136
smart_sectors	136
57	
address	169
name	"Device vendor specific log"
gp_sectors	136
smart_sectors	136
58	
address	171
name	"Device vendor specific log"
gp_sectors	1
59	
address	173
name	"Device vendor specific log"
gp_sectors	16
60	
address	177
name	"Device vendor specific log"
gp_sectors	160
smart_sectors	160
61	
address	180
name	"Device vendor specific log"
gp_sectors	16
smart_sectors	16
62	
address	182
name	"Device vendor specific log"
gp_sectors	1920
63	
address	190
name	"Device vendor specific log"
gp_sectors	65535
64	
address	191
name	"Device vendor specific log"
gp_sectors	65535
65	
address	193
name	"Device vendor specific log"
gp_sectors	8
smart_sectors	8
66	
address	195
name	"Device vendor specific log"
gp_sectors	32
smart_sectors	32
67	
address	198
name	"Device vendor specific log"
gp_sectors	5184
68	
address	199
name	"Device vendor specific log"
gp_sectors	8
smart_sectors	8
69	
address	201
name	"Device vendor specific log"
gp_sectors	8
smart_sectors	8
70	
address	202
name	"Device vendor specific log"
gp_sectors	16
smart_sectors	16
71	
address	205
name	"Device vendor specific log"
gp_sectors	1
smart_sectors	1
72	
address	206
name	"Device vendor specific log"
gp_sectors	1
73	
address	207
name	"Device vendor specific log"
gp_sectors	512
74	
address	209
name	"Device vendor specific log"
gp_sectors	656
75	
address	210
name	"Device vendor specific log"
gp_sectors	10256
76	
address	212
name	"Device vendor specific log"
gp_sectors	2048
77	
address	218
name	"Device vendor specific log"
gp_sectors	1
smart_sectors	1
78	
address	224
name	"SCT Command/Status"
read	true
write	true
gp_sectors	1
smart_sectors	1
79	
address	225
name	"SCT Data Transfer"
read	true
write	true
gp_sectors	1
smart_sectors	1
ata_smart_error_log	
extended	
revision	1
sectors	5
count	0
ata_smart_self_test_log	
extended	
revision	1
sectors	1
table	
0	
type	
value	2
string	"Extended offline"
status	
value	32
string	"Interrupted (host reset)"
lifetime_hours	7633
1	
type	
value	2
string	"Extended offline"
status	
value	32
string	"Interrupted (host reset)"
lifetime_hours	7631
2	
type	
value	1
string	"Short offline"
status	
value	0
string	"Completed without error"
passed	true
lifetime_hours	7630
3	
type	
value	2
string	"Extended offline"
status	
value	32
string	"Interrupted (host reset)"
lifetime_hours	7507
4	
type	
value	2
string	"Extended offline"
status	
value	32
string	"Interrupted (host reset)"
lifetime_hours	7464
5	
type	
value	1
string	"Short offline"
status	
value	0
string	"Completed without error"
passed	true
lifetime_hours	7330
6	
type	
value	2
string	"Extended offline"
status	
value	32
string	"Interrupted (host reset)"
lifetime_hours	7311
7	
type	
value	1
string	"Short offline"
status	
value	0
string	"Completed without error"
passed	true
lifetime_hours	7242
8	
type	
value	2
string	"Extended offline"
status	
value	0
string	"Completed without error"
passed	true
lifetime_hours	7165
9	
type	
value	1
string	"Short offline"
status	
value	0
string	"Completed without error"
passed	true
lifetime_hours	7040
10	
type	
value	2
string	"Extended offline"
status	
value	32
string	"Interrupted (host reset)"
lifetime_hours	6994
11	
type	
value	1
string	"Short offline"
status	
value	0
string	"Completed without error"
passed	true
lifetime_hours	6971
12	
type	
value	2
string	"Extended offline"
status	
value	32
string	"Interrupted (host reset)"
lifetime_hours	6959
13	
type	
value	1
string	"Short offline"
status	
value	0
string	"Completed without error"
passed	true
lifetime_hours	6902
14	
type	
value	2
string	"Extended offline"
status	
value	0
string	"Completed without error"
passed	true
lifetime_hours	6875
15	
type	
value	2
string	"Extended offline"
status	
value	32
string	"Interrupted (host reset)"
lifetime_hours	6784
16	
type	
value	1
string	"Short offline"
status	
value	0
string	"Completed without error"
passed	true
lifetime_hours	6596
17	
type	
value	2
string	"Extended offline"
status	
value	32
string	"Interrupted (host reset)"
lifetime_hours	6535
18	
type	
value	2
string	"Extended offline"
status	
value	0
string	"Completed without error"
passed	true
lifetime_hours	6492
count	19
error_count_total	0
error_count_outdated	0
ata_smart_selective_self_test_log	
revision	1
table	
0	
lba_min	0
lba_max	0
status	
value	32
string	"Not_testing"
1	
lba_min	0
lba_max	0
status	
value	32
string	"Not_testing"
2	
lba_min	0
lba_max	0
status	
value	32
string	"Not_testing"
3	
lba_min	0
lba_max	0
status	
value	32
string	"Not_testing"
4	
lba_min	0
lba_max	0
status	
value	32
string	"Not_testing"
flags	
value	0
remainder_scan_enabled	false
power_up_scan_resume_minutes	0
ata_sct_status	
format_version	3
sct_version	522
device_state	
value	0
string	"Active"
temperature	
current	39
power_cycle_min	39
power_cycle_max	41
lifetime_min	21
lifetime_max	64
op_limit_max	60
under_limit_count	0
over_limit_count	0
smart_status	
passed	true
vendor_specific	
0	0
1	0
2	0
3	0
4	0
5	0
6	0
7	0
8	0
9	0
10	0
11	0
12	0
13	0
14	0
15	0
16	0
17	0
18	0
19	0
20	1
21	0
22	0
23	0
24	0
25	0
26	0
27	0
28	0
29	0
30	0
31	0
ata_sct_temperature_history	
version	2
sampling_period_minutes	5
logging_interval_minutes	59
temperature	
op_limit_min	10
op_limit_max	40
limit_min	5
limit_max	60
size	128
index	26
table	
0	33
1	null
2	26
3	43
4	45
5	44
6	40
7	null
8	26
9	37
10	48
11	50
12	49
13	48
14	48
15	49
16	50
17	49
18	49
19	49
20	50
21	49
22	50
23	50
24	49
25	49
26	49
27	49
28	49
29	48
30	49
31	49
32	49
33	49
34	48
35	38
36	38
37	47
38	51
39	51
40	50
41	50
42	50
43	50
44	49
45	null
46	33
47	null
48	36
49	null
50	27
51	null
52	31
53	null
54	32
55	null
56	28
57	null
58	32
59	null
60	32
61	null
62	33
63	null
64	34
65	null
66	34
67	null
68	29
69	null
70	33
71	null
72	35
73	null
74	35
75	null
76	39
77	null
78	39
79	null
80	39
81	null
82	40
83	null
84	37
85	null
86	39
87	null
88	40
89	37
90	37
91	37
92	null
93	37
94	null
95	39
96	null
97	28
98	null
99	34
100	null
101	34
102	null
103	35
104	null
105	38
106	null
107	39
108	null
109	38
110	null
111	27
112	null
113	34
114	null
115	36
116	null
117	31
118	null
119	34
120	null
121	28
122	null
123	37
124	39
125	null
126	40
127	39
ata_sct_erc	
read	
enabled	true
deciseconds	70
write	
enabled	true
deciseconds	70
ata_device_statistics	
pages	
0	
number	1
name	"General Statistics"
revision	1
table	
0	
offset	8
name	"Lifetime Power-On Resets"
size	4
value	438
flags	
value	192
string	"V--- "
valid	true
normalized	false
supports_dsn	false
monitored_condition_met	false
1	
offset	16
name	"Power-on Hours"
size	4
value	7634
flags	
value	192
string	"V--- "
valid	true
normalized	false
supports_dsn	false
monitored_condition_met	false
2	
offset	24
name	"Logical Sectors Written"
size	6
value	72175362799
flags	
value	192
string	"V--- "
valid	true
normalized	false
supports_dsn	false
monitored_condition_met	false
3	
offset	32
name	"Number of Write Commands"
size	6
value	1578807047
flags	
value	192
string	"V--- "
valid	true
normalized	false
supports_dsn	false
monitored_condition_met	false
4	
offset	40
name	"Logical Sectors Read"
size	6
value	544071312526
flags	
value	192
string	"V--- "
valid	true
normalized	false
supports_dsn	false
monitored_condition_met	false
5	
offset	48
name	"Number of Read Commands"
size	6
value	1773117277
flags	
value	192
string	"V--- "
valid	true
normalized	false
supports_dsn	false
monitored_condition_met	false
6	
offset	56
name	"Date and Time TimeStamp"
size	6
flags	
value	128
string	"---- "
valid	false
normalized	false
supports_dsn	false
monitored_condition_met	false
1	
number	3
name	"Rotating Media Statistics"
revision	1
table	
0	
offset	8
name	"Spindle Motor Power-on Hours"
size	4
value	7418
flags	
value	192
string	"V--- "
valid	true
normalized	false
supports_dsn	false
monitored_condition_met	false
1	
offset	16
name	"Head Flying Hours"
size	4
value	4345
flags	
value	192
string	"V--- "
valid	true
normalized	false
supports_dsn	false
monitored_condition_met	false
2	
offset	24
name	"Head Load Events"
size	4
value	7447
flags	
value	192
string	"V--- "
valid	true
normalized	false
supports_dsn	false
monitored_condition_met	false
3	
offset	32
name	"Number of Reallocated Logical Sectors"
size	4
value	0
flags	
value	192
string	"V--- "
valid	true
normalized	false
supports_dsn	false
monitored_condition_met	false
4	
offset	40
name	"Read Recovery Attempts"
size	4
value	0
flags	
value	192
string	"V--- "
valid	true
normalized	false
supports_dsn	false
monitored_condition_met	false
5	
offset	48
name	"Number of Mechanical Start Failures"
size	4
value	0
flags	
value	192
string	"V--- "
valid	true
normalized	false
supports_dsn	false
monitored_condition_met	false
6	
offset	56
name	"Number of Realloc. Candidate Logical Sectors"
size	4
value	0
flags	
value	192
string	"V--- "
valid	true
normalized	false
supports_dsn	false
monitored_condition_met	false
7	
offset	64
name	"Number of High Priority Unload Events"
size	4
value	265
flags	
value	192
string	"V--- "
valid	true
normalized	false
supports_dsn	false
monitored_condition_met	false
2	
number	4
name	"General Errors Statistics"
revision	1
table	
0	
offset	8
name	"Number of Reported Uncorrectable Errors"
size	4
value	0
flags	
value	192
string	"V--- "
valid	true
normalized	false
supports_dsn	false
monitored_condition_met	false
1	
offset	16
name	"Resets Between Cmd Acceptance and Completion"
size	4
value	0
flags	
value	192
string	"V--- "
valid	true
normalized	false
supports_dsn	false
monitored_condition_met	false
2	
offset	24
name	"Physical Element Status Changed"
size	4
value	0
flags	
value	208
string	"V-D- "
valid	true
normalized	false
supports_dsn	true
monitored_condition_met	false
3	
number	5
name	"Temperature Statistics"
revision	1
table	
0	
offset	8
name	"Current Temperature"
size	1
value	39
flags	
value	192
string	"V--- "
valid	true
normalized	false
supports_dsn	false
monitored_condition_met	false
1	
offset	16
name	"Average Short Term Temperature"
size	1
value	46
flags	
value	192
string	"V--- "
valid	true
normalized	false
supports_dsn	false
monitored_condition_met	false
2	
offset	24
name	"Average Long Term Temperature"
size	1
value	40
flags	
value	192
string	"V--- "
valid	true
normalized	false
supports_dsn	false
monitored_condition_met	false
3	
offset	32
name	"Highest Temperature"
size	1
value	63
flags	
value	192
string	"V--- "
valid	true
normalized	false
supports_dsn	false
monitored_condition_met	false
4	
offset	40
name	"Lowest Temperature"
size	1
value	24
flags	
value	192
string	"V--- "
valid	true
normalized	false
supports_dsn	false
monitored_condition_met	false
5	
offset	48
name	"Highest Average Short Term Temperature"
size	1
value	60
flags	
value	192
string	"V--- "
valid	true
normalized	false
supports_dsn	false
monitored_condition_met	false
6	
offset	56
name	"Lowest Average Short Term Temperature"
size	1
value	34
flags	
value	192
string	"V--- "
valid	true
normalized	false
supports_dsn	false
monitored_condition_met	false
7	
offset	64
name	"Highest Average Long Term Temperature"
size	1
value	45
flags	
value	192
string	"V--- "
valid	true
normalized	false
supports_dsn	false
monitored_condition_met	false
8	
offset	72
name	"Lowest Average Long Term Temperature"
size	1
value	38
flags	
value	192
string	"V--- "
valid	true
normalized	false
supports_dsn	false
monitored_condition_met	false
9	
offset	80
name	"Time in Over-Temperature"
size	4
value	3916
flags	
value	192
string	"V--- "
valid	true
normalized	false
supports_dsn	false
monitored_condition_met	false
10	
offset	88
name	"Specified Maximum Operating Temperature"
size	1
value	60
flags	
value	192
string	"V--- "
valid	true
normalized	false
supports_dsn	false
monitored_condition_met	false
11	
offset	96
name	"Time in Under-Temperature"
size	4
value	0
flags	
value	192
string	"V--- "
valid	true
normalized	false
supports_dsn	false
monitored_condition_met	false
12	
offset	104
name	"Specified Minimum Operating Temperature"
size	1
value	5
flags	
value	192
string	"V--- "
valid	true
normalized	false
supports_dsn	false
monitored_condition_met	false
4	
number	6
name	"Transport Statistics"
revision	1
table	
0	
offset	8
name	"Number of Hardware Resets"
size	4
value	289
flags	
value	192
string	"V--- "
valid	true
normalized	false
supports_dsn	false
monitored_condition_met	false
1	
offset	16
name	"Number of ASR Events"
size	4
value	35
flags	
value	192
string	"V--- "
valid	true
normalized	false
supports_dsn	false
monitored_condition_met	false
2	
offset	24
name	"Number of Interface CRC Errors"
size	4
value	0
flags	
value	192
string	"V--- "
valid	true
normalized	false
supports_dsn	false
monitored_condition_met	false
5	
number	255
name	"Vendor Specific Statistics"
revision	1
table	
0	
offset	8
name	"Vendor Specific"
size	7
value	0
flags	
value	192
string	"V--- "
valid	true
normalized	false
supports_dsn	false
monitored_condition_met	false
1	
offset	16
name	"Vendor Specific"
size	7
value	0
flags	
value	192
string	"V--- "
valid	true
normalized	false
supports_dsn	false
monitored_condition_met	false
2	
offset	24
name	"Vendor Specific"
size	7
value	0
flags	
value	192
string	"V--- "
valid	true
normalized	false
supports_dsn	false
monitored_condition_met	false
sata_phy_event_counters	
table	
0	
id	10
name	"Device-to-host register FISes sent due to a COMRESET"
size	2
value	1
overflow	false
1	
id	1
name	"Command failed due to ICRC error"
size	2
value	0
overflow	false
2	
id	3
name	"R_ERR response for device-to-host data FIS"
size	2
value	0
overflow	false
3	
id	4
name	"R_ERR response for host-to-device data FIS"
size	2
value	0
overflow	false
4	
id	6
name	"R_ERR response for device-to-host non-data FIS"
size	2
value	0
overflow	false
5	
id	7
name	"R_ERR response for host-to-device non-data FIS"
size	2
value	0
overflow	false
6	
id	11
name	"CRC errors within host-to-device FIS"
size	2
value	0
overflow	false
7	
id	13
name	"Non-CRC errors within host-to-device FIS"
size	2
value	0
overflow	false
reset	false

Should I now try, as Arwen suggested, removing the disk being resilvered and attempting a scrub? Could I also remove the disk ST16000NM000H-3KW103 ZYD00P5Y that seems to be causing problems, since it is a RAIDZ2 pool, and hope that my resilver finishes that way? Or is this the wrong way to think about what’s happening, because I don’t understand how ZFS works?

Could anybody tell me whether anything about my problem can be understood from the smart report? What should my next troubleshooting steps be now? I would greatly appreciate any tips on how to progress. Thank you all for your tips and support.
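For anyone reading a report like the one above, here is a minimal Python sketch that pulls out a few fields worth looking at first. It assumes the report was saved as JSON (for example from `smartctl -x -j`) and loaded into a dict. The function name `summarize_smart` and the trimmed `sample` dict are made up for illustration; the key names are the ones visible in the report above.

```python
def summarize_smart(report: dict) -> dict:
    """Collect a few media, transport and thermal indicators from a smartctl JSON dict."""
    # Extended self-test log: count how many logged tests were interrupted.
    tests = (report.get("ata_smart_self_test_log", {})
                   .get("extended", {})
                   .get("table", []))
    interrupted = sum(
        1 for t in tests
        if "Interrupted" in t.get("status", {}).get("string", "")
    )

    # Index the attribute table by name so raw values are easy to look up.
    attrs = {
        a["name"]: a["raw"]["value"]
        for a in report.get("ata_smart_attributes", {}).get("table", [])
    }

    temp = report.get("temperature", {})
    return {
        "self_tests_logged": len(tests),
        "self_tests_interrupted": interrupted,
        "reallocated": attrs.get("Reallocated_Sector_Ct"),
        "pending": attrs.get("Current_Pending_Sector"),
        "crc_errors": attrs.get("UDMA_CRC_Error_Count"),
        "lifetime_max_temp": temp.get("lifetime_max"),
        "op_limit_max_temp": temp.get("op_limit_max"),
        "minutes_over_temp_limit": temp.get("lifetime_over_limit_minutes"),
    }


# Trimmed, hand-built sample using a few values copied from the report above.
sample = {
    "ata_smart_self_test_log": {"extended": {"table": [
        {"status": {"string": "Interrupted (host reset)"}},
        {"status": {"string": "Completed without error"}},
    ]}},
    "ata_smart_attributes": {"table": [
        {"name": "Reallocated_Sector_Ct", "raw": {"value": 0}},
        {"name": "Current_Pending_Sector", "raw": {"value": 0}},
        {"name": "UDMA_CRC_Error_Count", "raw": {"value": 0}},
    ]},
    "temperature": {
        "lifetime_max": 63,
        "op_limit_max": 60,
        "lifetime_over_limit_minutes": 3916,
    },
}

summary = summarize_smart(sample)
for key, value in summary.items():
    print(f"{key}: {value}")
```

In the full report above, the media counters (reallocated, pending, CRC) are all 0, while 9 of the 19 logged self-tests ended with "Interrupted (host reset)" and the drive has logged 3916 minutes above its 60 °C limit with a lifetime maximum of 63 °C. That might point more toward heat or the link, power, or controller side than toward failing platters, but it is only a hint, not a diagnosis.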

Removing 2 disks from a RAID-Z2 vDev leaves no redundancy at all, which would put the pool in jeopardy. You could remove 1 and try it. If it generates the I/O suspend again, reconnect it and repeat with the other disk removed instead.

It can be a bit hard to diagnose issues remotely. Can you supply the output of zpool status in CODE tags? And point out which disk(s) you think are the problem?

        NAME                                      STATE     READ WRITE CKSUM
        Volume1                                   ONLINE       0     0     0
          raidz2-0                                ONLINE       0     0     0
            2f7321eb-afd0-4d39-8eb7-abaa52c9a4e6  ONLINE       0     0     0
            0bccf225-1ad4-4575-828f-4b3d07b0d0d2  ONLINE       0     0     0
            7a1c14c4-ad25-4e19-af77-4349c998e060  ONLINE       0     0     0  (awaiting resilver)
            8e053548-9acb-4a03-a76a-a7db59cdb31e  ONLINE       0     0     0
            81ff015b-62fd-454a-83cc-c9b793aa4d81  ONLINE       0     0     0
            4bef9b6b-b96a-4698-85e2-8ba9705a5450  DEGRADED 1.16K 11.9K     1  too many errors
            44d811c4-8a6f-4c4b-946b-558c659fb08e  ONLINE       0     0     0
            30ccbe9f-31ba-4b82-97ae-3b4eccbbab3a  ONLINE       0     0     0
            aec216ca-a948-42c2-8317-d8af4b6ffe7b  ONLINE       0     0     0
            0669e04a-4a7b-4396-8647-4450f38a6e7d  ONLINE       0     0     0
            bf8c9c1f-184a-4237-aadc-e18c8720632f  ONLINE       0     0     0  (resilvering)
        cache
          7ab5e132-17b1-4090-a427-6d9625eee0f6    ONLINE       0     0     0

4bef9b6b-b96a-4698-85e2-8ba9705a5450 is the one that is causing problems (the one that immediately interrupts the long SMART test). bf8c9c1f-184a-4237-aadc-e18c8720632f seems to be the 18TB one that I initially replaced. I have no clue why 7a1c14c4-ad25-4e19-af77-4349c998e060 is awaiting a resilver, but that seems to be the issue. Is there any way to figure out which drive (by serial number) this is? How should I proceed? This all seems rather bad now.
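On the serial number question, one possible approach is sketched below in Python. It assumes a Linux-based system where `lsblk` is available and that the long IDs in `zpool status` are partition UUIDs. The helper names `partuuid_to_disk` and `read_lsblk` are made up for this sketch, and the device name, model and serial in the sample are placeholders, not values from this pool.

```python
import json
import subprocess


def partuuid_to_disk(lsblk_tree: dict) -> dict:
    """Map each partition's PARTUUID to (disk name, model, serial) of its parent disk."""
    mapping = {}
    for disk in lsblk_tree.get("blockdevices", []):
        # Partitions are nested under their disk; model/serial live on the disk entry.
        for part in disk.get("children", []):
            uuid = part.get("partuuid")
            if uuid:
                mapping[uuid] = (disk.get("name"), disk.get("model"), disk.get("serial"))
    return mapping


def read_lsblk() -> dict:
    """Run lsblk with JSON output and return the parsed tree (needs lsblk on the host)."""
    out = subprocess.run(
        ["lsblk", "-J", "-o", "NAME,PARTUUID,MODEL,SERIAL"],
        check=True, capture_output=True, text=True,
    ).stdout
    return json.loads(out)


# Hand-built sample in the shape lsblk -J produces; all disk details are placeholders.
sample_tree = {
    "blockdevices": [
        {
            "name": "sdf", "partuuid": None,
            "model": "EXAMPLE-MODEL", "serial": "EXAMPLE-SERIAL",
            "children": [
                {"name": "sdf1",
                 "partuuid": "4bef9b6b-b96a-4698-85e2-8ba9705a5450",
                 "model": None, "serial": None},
            ],
        },
    ]
}

mapping = partuuid_to_disk(sample_tree)
print(mapping["4bef9b6b-b96a-4698-85e2-8ba9705a5450"])
```

On the actual server, `partuuid_to_disk(read_lsblk())` would be the call to try, then look up each ID from `zpool status` in the result and match the serial against the drive labels.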

Having 3 problematic drives in a RAID-Z2 vDev is just a bit beyond me. Sorry.

There is good news. ZFS was specifically designed with knowledge of the data and how it interacts with the redundancy, (RAID-Z2 in this case). What this means is that as long as enough info exists for each RAID-Zx stripe, the data in that stripe is fully safe / recoverable.

There are certain situations where regular RAID-5/6 can fail while replacing a disk. ZFS allows replacing in place, meaning you add the new replacement disk to the server and then tell ZFS to replace the failing disk. This helps in the case where another disk (or 2) has bad blocks but the failing disk being replaced still has the data (or parity): you don’t lose data. In essence, the replacement disk temporarily mirrors the failing disk (until it finds a bad block, then uses the rest of the vDev for recovery).

However, this replace in place is not occurring on your pool’s vDev.

Your disks appear to be out of sync, thus requiring resilvering to bring them back into sync. But with a 3rd disk “DEGRADED” in a RAID-Z2 vDev, that can cause problems: any block that needs resilvering and also cannot be read from the degraded disk may not be recoverable from the remaining disks.
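To make the per-stripe reasoning concrete, here is a deliberately simplified Python model. It is not how ZFS is implemented (real ZFS works with per-block checksums and tracks what each disk is missing); it only illustrates the counting rule that a RAID-Z2 stripe survives as long as no more than 2 of its columns are unavailable. The function names, stripe labels and disk labels are all hypothetical.

```python
def stripe_recoverable(bad_columns: set, parity: int = 2) -> bool:
    """A stripe can be rebuilt if the number of unavailable columns is within parity."""
    return len(bad_columns) <= parity


def unrecoverable_stripes(stripes: dict, parity: int = 2) -> list:
    """Return the ids of stripes that have more unavailable columns than parity covers."""
    return [sid for sid, bad in stripes.items()
            if not stripe_recoverable(bad, parity)]


# Each stripe maps to the set of disks that cannot supply valid data for it.
stripes = {
    # Two stale disks: RAID-Z2 parity still covers this stripe.
    "stripe-A": {"resilvering-disk", "awaiting-resilver-disk"},
    # Two stale disks plus a read error on the degraded disk: beyond RAID-Z2.
    "stripe-B": {"resilvering-disk", "awaiting-resilver-disk", "degraded-disk"},
    # Only the degraded disk fails here: easily recoverable.
    "stripe-C": {"degraded-disk"},
}

lost = unrecoverable_stripes(stripes)
print(lost)
```

The point of the toy model is that the damage is per stripe, not per pool: only stripes where all three problem disks overlap are at risk, which is why getting even one of them healthy again changes the picture a lot.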

As for what to do, sorry. As I said, this is a bit beyond me.