Can you connect the flaky drive to your motherboard’s SATA headers? Even if you have to disassemble the case on a bench, as a test. It eliminates a lot of variables. Sorry if this has already been suggested, I’ve searched the thread and did not see it.
I thought it was best practice NOT to have SMART tests run at the same time as a scrub/resilver? Not that it seemed to matter in this case, but I wanted to double-check my understanding.
IMO best next step.
No, you did not come off badly. I simply wanted to review ZFS background. Both for you, and other readers now and in the future. Sorry if I sounded like a teacher, (I wrote computer documentation as part of my prior jobs…).
A simple cable disconnect of the drive trying to resilver. I’d do both power and data cables, just to isolate it completely.
As for ZFS scrubs every 4 weeks, that is mostly okay. Some people prefer every 2 weeks. And some want weekly (which I think is excessive). More than 4 weeks is too long, I think. So, 2 to 4 weeks is about right, unless there is older hardware, like used disks. Then closer to 2 weeks is probably better.
Make sure your disks get enough cooling during scrubs. They basically do non-stop seeks and reads until the scrub completes.
Firmware is good (there’s a P16.00.14.00 somewhere, but I think that beyond .10 it’s all about SSD behaviour). Is the additional power supply connected? (9300-16i typically have a 4-pin Molex).
75 cm should be within specs if backplane traces are short, but I’d be wary about these, especially if the problematic drives are behind these cables. I concur that using SATA cables from the motherboard could help.
If possible, yes indeed: double duty puts maximal stress on the drives and extends the duration of both operations. But running both at once is possible.
That, unfortunately, is NOT possible. Jonsbo N cases in general appear to have poor drive cooling due to the use of consumer-style fans to try and pull air from behind a backplane, and the N5 puts the PSU behind 4 of the 12 bays, so for the leftmost drives there really is no good path for air flow.
The 8 vertical drives behind the half-height backplane on the right may enjoy some air flow from the two 120mm fans:
But the 4 horizontal drives can hardly have any joy through this, with the PSU just behind:
No, it is not connected, but I figured this should be fine, as an LSI 9300-16i should not draw more than roughly 30 watts and the PCIe slot should be able to provide 75 watts, if I’m not mistaken. As one possible troubleshooting step, I could acquire an LSI 9400-16i or an LSI 9305-16i for a reasonable sum from the same reputable seller of refurbished server hardware that sold me my original LSI 9300-16i. It should be at my doorstep in a few days.
I have now started running long SMART tests on all 11 drives of the pool, using GSmartControl on the SystemRescue distro booted from a USB stick. I guess this will take at least 20 to 24 hours. I hope that all disks pass this long SMART test.
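For readers who prefer the command line over GSmartControl, the same step can be sketched with `smartctl --test=long`. This is only a minimal sketch: the device list is hypothetical (only `/dev/sds` appears in this thread), and `dry_run` defaults to `True` so nothing is sent to a drive unless you change it and run as root.

```python
import subprocess


def long_test_command(device: str) -> list[str]:
    """Build the smartctl command that starts an extended (long) self-test."""
    return ["smartctl", "--test=long", device]


def start_long_tests(devices: list[str], dry_run: bool = True) -> list[list[str]]:
    """Start a long self-test on each device; with dry_run, only build the commands."""
    commands = []
    for device in devices:
        cmd = long_test_command(device)
        if not dry_run:
            # Needs root privileges; the test itself runs inside the drive.
            subprocess.run(cmd, check=True)
        commands.append(cmd)
    return commands


if __name__ == "__main__":
    # Hypothetical device names; replace with the members of your pool.
    for cmd in start_long_tests(["/dev/sda", "/dev/sdb", "/dev/sds"]):
        print(" ".join(cmd))
```

The drive runs the test on its own, so the command returns immediately; the result shows up later in the self-test log.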
Yes, airflow is suboptimal and worse than I imagined when I decided on the case. I have two Noctua NF-A12x25 fans in the back running at 100%. Drives are always at roughly 40-43 degrees, and at roughly 50-52 while a scrub happens. I think this is far from ideal but not absolutely terrible, though I might be mistaken.
Interestingly, the physical placement of the drives doesn’t make such a big difference (as one would believe from the layout of the case): drives are normally all within 2-3 degrees of each other. Also, all drives that reported errors were on the board with eight drives. To be precise, the drive that originally faulted was in position 6. I added the replacement that I received from Seagate to the same slot. There this replacement also faulted as I described, two times. The first time, the resilver completed. The second time, when I shut down the server to investigate, the drives right next to it (positions 7 and 8) reported the errors I described.
The drive that is being resilvered is still in position 6 but has no longer reported any errors. The drive that now always reports errors was initially in position 7 (therefore sharing a cable to the HBA with positions 5, 6, 7, 8). I have now moved it to position 1, where it no longer shares a cable with 5, 6, 7, 8 but with 1, 2, 3, 4, and it still shows the behaviour that I told you about.
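The cable reasoning above can be written down as a tiny sketch. It assumes what the post describes, namely that bays are numbered from 1 and each HBA cable serves four consecutive bays (1-4, 5-8); the function names are made up for illustration.

```python
def cable_group(position: int, bays_per_cable: int = 4) -> int:
    """Return the index of the cable serving a bay (assumption: 4 consecutive bays per cable)."""
    if position < 1:
        raise ValueError("bay positions are numbered from 1")
    return (position - 1) // bays_per_cable


def shares_cable(a: int, b: int) -> bool:
    """True if two bay positions hang off the same HBA cable."""
    return cable_group(a) == cable_group(b)


if __name__ == "__main__":
    print(shares_cable(6, 7))  # original fault slot and the drive's first position
    print(shares_cable(1, 7))  # the drive's new position versus its old one
```

If the same drive keeps misbehaving after moving from bay 7 to bay 1, that one cable is a less likely culprit than the drive itself or something common to both cables, such as the HBA or power. That is a hint, not proof.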
75 W is for a full x16 card. The 9300-16i is limited to 25 W and this is not enough for two LSI 3008 and a switch. Please connect additional power: It’s there for a reason.
A 9305 or 9400 would be fully powered from the PCIe slot, and run cooler, but this may not be necessary. (A plain 9300-8i supplemented by motherboard SATA ports would do as well.)
If the long SMART tests show no issues, we might have nailed the hardware problem: Power to the HBA!
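To judge whether the long tests "show no issues", one option is to tally the self-test log that `smartctl` prints. The following is a rough sketch, not an official parser: the regular expression is my own guess at the line layout, and the sample lines are in the format of the self-test log posted later in this thread.

```python
import re
from collections import Counter

# Matches lines such as "# 1  Extended offline  Interrupted (host reset)  00%  7633  -"
SELFTEST_LINE = re.compile(
    r"^#\s*(\d+)\s+(Short|Extended|Conveyance|Selective) offline\s+"
    r"(.+?)\s+(\d+)%\s+(\d+)\s+(\S+)\s*$"
)


def parse_selftest_line(line: str):
    """Parse one self-test log line into a dict, or return None if it does not match."""
    match = SELFTEST_LINE.match(line.strip())
    if match is None:
        return None
    num, kind, status, remaining, hours, lba = match.groups()
    return {
        "num": int(num),
        "kind": kind,
        "status": status,
        "remaining_pct": int(remaining),
        "lifetime_hours": int(hours),
        "lba_of_first_error": lba,
    }


def summarize(lines):
    """Count how often each (test kind, status) pair occurs."""
    entries = (parse_selftest_line(line) for line in lines)
    return Counter((e["kind"], e["status"]) for e in entries if e is not None)


if __name__ == "__main__":
    sample = [
        "# 1 Extended offline Interrupted (host reset) 00% 7633 -",
        "# 2 Extended offline Interrupted (host reset) 00% 7631 -",
        "# 3 Short offline Completed without error 00% 7630 -",
        "# 9 Extended offline Completed without error 00% 7165 -",
    ]
    for (kind, status), count in summarize(sample).items():
        print(f"{kind}: {status} x{count}")
```

A drive whose extended tests are repeatedly "Interrupted (host reset)" while short tests complete points at something resetting the link during long runs, which is worth separating from a media failure.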
I’d say this is a fair assessment (50°C is still “within specs” for warranty purposes), but some here would not be happy with these numbers and want to see 10°C less to preserve their drives for longer.
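As a rough way to put numbers on this, the `temperature` object in smartctl's JSON output can be checked against the drive's own limit. This is a sketch under assumptions: the field names are the ones visible in the report posted further down in this thread, the sample values are copied from it, and the 40 °C "comfort" threshold is simply the "10°C less" mentioned above, not a vendor figure.

```python
def assess_temperature(temp: dict, comfort_max: int = 40) -> dict:
    """Compare smartctl's JSON `temperature` fields with the drive's stated limit."""
    current = temp.get("current")
    limit = temp.get("op_limit_max")
    lifetime_max = temp.get("lifetime_max")
    over_minutes = temp.get("lifetime_over_limit_minutes", 0)
    return {
        "above_comfort_now": current is not None and current > comfort_max,
        "above_limit_now": None not in (current, limit) and current > limit,
        "ever_above_limit": None not in (lifetime_max, limit) and lifetime_max > limit,
        "hours_over_limit": round(over_minutes / 60, 1),
    }


if __name__ == "__main__":
    # Values as they appear in the report for /dev/sds in this thread.
    sample = {
        "current": 39,
        "lifetime_max": 63,
        "op_limit_max": 60,
        "lifetime_over_limit_minutes": 3916,
    }
    for key, value in assess_temperature(sample).items():
        print(f"{key}: {value}")
```

Idle readings around 40 °C look harmless on their own; the lifetime fields are what show whether a drive has ever been pushed past its specified maximum.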
Once again I am very thankful for all of your help and making me understand the issues at hand.
Ok, I understand, I had to search for the PCIe power cables of my power supply, and luckily found them. I now attached them to the LSI 9300-16i. Sadly, it didn’t fix my problem. The same behaviour as before continues. I completed my long SMART test on all drives of the pool — 10 of them finished and passed (including the new 18TB drive, which is being resilvered and which initially faulted two times). But the one that seems to be causing the problems, ST16000NM000H-3KW103 ZYD00P5Y, always gets interrupted after a few minutes. Here is the report.
json_format_version
0 1
1 0
smartctl
version
0 7
1 5
pre_release false
svn_revision "5714"
platform_info "x86_64-linux-6.18.34-1-lts"
build_info "(local build)"
argv
0 "smartctl"
1 "--health"
2 "--info"
3 "--get=all"
4 "--capabilities"
5 "--attributes"
6 "--format=brief"
7 "--log=xerror,50,error"
8 "--log=xselftest,50,selftest"
9 "--log=selective"
10 "--log=directory"
11 "--log=scttemp"
12 "--log=scterc"
13 "--log=devstat"
14 "--log=sataphy"
15 "--json=o"
16 "/dev/sds"
output
0 "smartctl 7.5 2025-04-30 r5714 [x86_64-linux-6.18.34-1-lts] (local build)"
1 "Copyright (C) 2002-25, Bruce Allen, Christian Franke, www.smartmontools.org"
2 ""
3 "=== START OF INFORMATION SECTION ==="
4 "Device Model: ST16000NM000H-3KW103"
5 "Serial Number: ZYD00P5Y"
6 "LU WWN Device Id: 5 000c50 0e7fa6d6d"
7 "Firmware Version: EN01"
8 "User Capacity: 16,000,900,661,248 bytes [16.0 TB]"
9 "Sector Sizes: 512 bytes logical, 4096 bytes physical"
10 "Rotation Rate: 7200 rpm"
11 "Form Factor: 3.5 inches"
12 "Device is: Not in smartctl database 7.5/5706"
13 "ATA Version is: ACS-5 (minor revision not indicated)"
14 "SATA Version is: SATA 3.3, 6.0 Gb/s (current: 6.0 Gb/s)"
15 "Local Time is: Tue Jun 23 21:39:53 2026 UTC"
16 "SMART support is: Available - device has SMART capability."
17 "SMART support is: Enabled"
18 "AAM feature is: Unavailable"
19 "APM feature is: Unavailable"
20 "Rd look-ahead is: Enabled"
21 "Write cache is: Enabled"
22 "DSN feature is: Disabled"
23 "ATA Security is: Disabled, NOT FROZEN [SEC1]"
24 ""
25 "=== START OF READ SMART DATA SECTION ==="
26 "SMART overall-health self-assessment test result: PASSED"
27 ""
28 "General SMART Values:"
29 "Offline data collection status: (0x82)\tOffline data collection activity"
30 "\t\t\t\t\twas completed without error."
31 "\t\t\t\t\tAuto Offline Data Collection: Enabled."
32 "Self-test execution status: ( 32)\tThe self-test routine was interrupted"
33 "\t\t\t\t\tby the host with a hard or soft reset."
34 "Total time to complete Offline "
35 "data collection: \t\t( 567) seconds."
36 "Offline data collection"
37 "capabilities: \t\t\t (0x7b) SMART execute Offline immediate."
38 "\t\t\t\t\tAuto Offline data collection on/off support."
39 "\t\t\t\t\tSuspend Offline collection upon new"
40 "\t\t\t\t\tcommand."
41 "\t\t\t\t\tOffline surface scan supported."
42 "\t\t\t\t\tSelf-test supported."
43 "\t\t\t\t\tConveyance Self-test supported."
44 "\t\t\t\t\tSelective Self-test supported."
45 "SMART capabilities: (0x0003)\tSaves SMART data before entering"
46 "\t\t\t\t\tpower-saving mode."
47 "\t\t\t\t\tSupports SMART auto save timer."
48 "Error logging capability: (0x01)\tError logging supported."
49 "\t\t\t\t\tGeneral Purpose Logging supported."
50 "Short self-test routine "
51 "recommended polling time: \t ( 2) minutes."
52 "Extended self-test routine"
53 "recommended polling time: \t (1374) minutes."
54 "Conveyance self-test routine"
55 "recommended polling time: \t ( 3) minutes."
56 "SCT capabilities: \t (0x50bd)\tSCT Status supported."
57 "\t\t\t\t\tSCT Error Recovery Control supported."
58 "\t\t\t\t\tSCT Feature Control supported."
59 "\t\t\t\t\tSCT Data Table supported."
60 ""
61 "SMART Attributes Data Structure revision number: 10"
62 "Vendor Specific SMART Attributes with Thresholds:"
63 "ID# ATTRIBUTE_NAME FLAGS VALUE WORST THRESH FAIL RAW_VALUE"
64 " 1 Raw_Read_Error_Rate POSR-- 070 064 044 - 9581200"
65 " 3 Spin_Up_Time PO---- 093 090 000 - 0"
66 " 4 Start_Stop_Count -O--CK 100 100 020 - 438"
67 " 5 Reallocated_Sector_Ct PO--CK 100 100 010 - 0"
68 " 7 Seek_Error_Rate POSR-- 087 060 045 - 520340282"
69 " 9 Power_On_Hours -O--CK 092 092 000 - 7634"
70 " 10 Spin_Retry_Count PO--C- 100 100 097 - 0"
71 " 12 Power_Cycle_Count -O--CK 100 100 020 - 438"
72 " 18 Unknown_Attribute PO-R-- 100 100 050 - 0"
73 "187 Reported_Uncorrect -O--CK 100 100 000 - 0"
74 "188 Command_Timeout -O--CK 100 100 000 - 0"
75 "190 Airflow_Temperature_Cel -O---K 061 036 000 - 39 (Min/Max 39/40)"
76 "192 Power-Off_Retract_Count -O--CK 100 100 000 - 265"
77 "193 Load_Cycle_Count -O--CK 097 097 000 - 7447"
78 "194 Temperature_Celsius -O---K 039 064 000 - 39 (0 21 0 0 0)"
79 "197 Current_Pending_Sector -O--C- 100 100 000 - 0"
80 "198 Offline_Uncorrectable ----C- 100 100 000 - 0"
81 "199 UDMA_CRC_Error_Count -OSRCK 200 200 000 - 0"
82 "200 Multi_Zone_Error_Rate PO---K 100 100 001 - 0"
83 "240 Head_Flying_Hours ------ 100 100 000 - 4345 (110 72 0)"
84 "241 Total_LBAs_Written ------ 100 253 000 - 73966167809"
85 "242 Total_LBAs_Read ------ 100 253 000 - 558276226123"
86 " ||||||_ K auto-keep"
87 " |||||__ C event count"
88 " ||||___ R error rate"
89 " |||____ S speed/performance"
90 " ||_____ O updated online"
91 " |______ P prefailure warning"
92 ""
93 "General Purpose Log Directory Version 1"
94 "SMART Log Directory Version 1 [multi-sector log support]"
95 "Address Access R/W Size Description"
96 "0x00 GPL,SL R/O 1 Log Directory"
97 "0x01 SL R/O 1 Summary SMART error log"
98 "0x02 SL R/O 5 Comprehensive SMART error log"
99 "0x03 GPL R/O 5 Ext. Comprehensive SMART error log"
100 "0x04 GPL R/O 256 Device Statistics log"
101 "0x04 SL R/O 8 Device Statistics log"
102 "0x06 SL R/O 1 SMART self-test log"
103 "0x07 GPL R/O 1 Extended self-test log"
104 "0x08 GPL R/O 2 Power Conditions log"
105 "0x09 SL R/W 1 Selective self-test log"
106 "0x0a GPL R/W 8 Device Statistics Notification"
107 "0x0c GPL R/O 2048 Pending Defects log"
108 "0x0f GPL R/O 2 Sense Data for Successful NCQ Cmds log"
109 "0x10 GPL R/O 1 NCQ Command Error log"
110 "0x11 GPL R/O 1 SATA Phy Event Counters log"
111 "0x13 GPL R/O 1 SATA NCQ Send and Receive log"
112 "0x21 GPL R/O 1 Write stream error log"
113 "0x22 GPL R/O 1 Read stream error log"
114 "0x24 GPL R/O 768 Current Device Internal Status Data log"
115 "0x2f GPL R/O 1 Sector Configuration log"
116 "0x30 GPL,SL R/O 9 IDENTIFY DEVICE data log"
117 "0x80-0x9f GPL,SL R/W 16 Host vendor specific log"
118 "0xa1 GPL,SL VS 160 Device vendor specific log"
119 "0xa2 GPL VS 16320 Device vendor specific log"
120 "0xa4 GPL,SL VS 160 Device vendor specific log"
121 "0xa6 GPL VS 192 Device vendor specific log"
122 "0xa8-0xa9 GPL,SL VS 136 Device vendor specific log"
123 "0xab GPL VS 1 Device vendor specific log"
124 "0xad GPL VS 16 Device vendor specific log"
125 "0xb1 GPL,SL VS 160 Device vendor specific log"
126 "0xb4 GPL,SL VS 16 Device vendor specific log"
127 "0xb6 GPL VS 1920 Device vendor specific log"
128 "0xbe-0xbf GPL VS 65535 Device vendor specific log"
129 "0xc1 GPL,SL VS 8 Device vendor specific log"
130 "0xc3 GPL,SL VS 32 Device vendor specific log"
131 "0xc6 GPL VS 5184 Device vendor specific log"
132 "0xc7 GPL,SL VS 8 Device vendor specific log"
133 "0xc9 GPL,SL VS 8 Device vendor specific log"
134 "0xca GPL,SL VS 16 Device vendor specific log"
135 "0xcd GPL,SL VS 1 Device vendor specific log"
136 "0xce GPL VS 1 Device vendor specific log"
137 "0xcf GPL VS 512 Device vendor specific log"
138 "0xd1 GPL VS 656 Device vendor specific log"
139 "0xd2 GPL VS 10256 Device vendor specific log"
140 "0xd4 GPL VS 2048 Device vendor specific log"
141 "0xda GPL,SL VS 1 Device vendor specific log"
142 "0xe0 GPL,SL R/W 1 SCT Command/Status"
143 "0xe1 GPL,SL R/W 1 SCT Data Transfer"
144 ""
145 "SMART Extended Comprehensive Error Log Version: 1 (5 sectors)"
146 "No Errors Logged"
147 ""
148 "SMART Extended Self-test Log Version: 1 (1 sectors)"
149 "Num Test_Description Status Remaining LifeTime(hours) LBA_of_first_error"
150 "# 1 Extended offline Interrupted (host reset) 00% 7633 -"
151 "# 2 Extended offline Interrupted (host reset) 00% 7631 -"
152 "# 3 Short offline Completed without error 00% 7630 -"
153 "# 4 Extended offline Interrupted (host reset) 00% 7507 -"
154 "# 5 Extended offline Interrupted (host reset) 00% 7464 -"
155 "# 6 Short offline Completed without error 00% 7330 -"
156 "# 7 Extended offline Interrupted (host reset) 00% 7311 -"
157 "# 8 Short offline Completed without error 00% 7242 -"
158 "# 9 Extended offline Completed without error 00% 7165 -"
159 "#10 Short offline Completed without error 00% 7040 -"
160 "#11 Extended offline Interrupted (host reset) 00% 6994 -"
161 "#12 Short offline Completed without error 00% 6971 -"
162 "#13 Extended offline Interrupted (host reset) 00% 6959 -"
163 "#14 Short offline Completed without error 00% 6902 -"
164 "#15 Extended offline Completed without error 00% 6875 -"
165 "#16 Extended offline Interrupted (host reset) 00% 6784 -"
166 "#17 Short offline Completed without error 00% 6596 -"
167 "#18 Extended offline Interrupted (host reset) 00% 6535 -"
168 "#19 Extended offline Completed without error 00% 6492 -"
169 ""
170 "SMART Selective self-test log data structure revision number 1"
171 " SPAN MIN_LBA MAX_LBA CURRENT_TEST_STATUS"
172 " 1 0 0 Not_testing"
173 " 2 0 0 Not_testing"
174 " 3 0 0 Not_testing"
175 " 4 0 0 Not_testing"
176 " 5 0 0 Not_testing"
177 "Selective self-test flags (0x0):"
178 " After scanning selected spans, do NOT read-scan remainder of disk."
179 "If Selective self-test is pending on power-up, resume after 0 minute delay."
180 ""
181 "SCT Status Version: 3"
182 "SCT Version (vendor specific): 522 (0x020a)"
183 "Device State: Active (0)"
184 "Current Temperature: 39 Celsius"
185 "Power Cycle Min/Max Temperature: 39/41 Celsius"
186 "Lifetime Min/Max Temperature: 21/64 Celsius"
187 "Specified Max Operating Temperature: 60 Celsius"
188 "Under/Over Temperature Limit Count: 0/0"
189 "SMART Status: 0xc24f (PASSED)"
190 "Vendor specific:"
191 "00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00"
192 "00 00 00 00 01 00 00 00 00 00 00 00 00 00 00 00"
193 ""
194 "SCT Temperature History Version: 2"
195 "Temperature Sampling Period: 5 minutes"
196 "Temperature Logging Interval: 59 minutes"
197 "Min/Max recommended Temperature: 10/40 Celsius"
198 "Min/Max Temperature Limit: 5/60 Celsius"
199 "Temperature History Size (Index): 128 (26)"
200 ""
201 "Index Estimated Time Temperature Celsius"
202 " 27 2026-06-18 16:07 33 **************"
203 " 28 2026-06-18 17:06 ? -"
204 " 29 2026-06-18 18:05 26 *******"
205 " 30 2026-06-18 19:04 43 ************************"
206 " 31 2026-06-18 20:03 45 **************************"
207 " 32 2026-06-18 21:02 44 *************************"
208 " 33 2026-06-18 22:01 40 *********************"
209 " 34 2026-06-18 23:00 ? -"
210 " 35 2026-06-18 23:59 26 *******"
211 " 36 2026-06-19 00:58 37 ******************"
212 " 37 2026-06-19 01:57 48 *****************************"
213 " 38 2026-06-19 02:56 50 *******************************"
214 " 39 2026-06-19 03:55 49 ******************************"
215 " 40 2026-06-19 04:54 48 *****************************"
216 " 41 2026-06-19 05:53 48 *****************************"
217 " 42 2026-06-19 06:52 49 ******************************"
218 " 43 2026-06-19 07:51 50 *******************************"
219 " 44 2026-06-19 08:50 49 ******************************"
220 " 45 2026-06-19 09:49 49 ******************************"
221 " 46 2026-06-19 10:48 49 ******************************"
222 " 47 2026-06-19 11:47 50 *******************************"
223 " 48 2026-06-19 12:46 49 ******************************"
224 " 49 2026-06-19 13:45 50 *******************************"
225 " 50 2026-06-19 14:44 50 *******************************"
226 " 51 2026-06-19 15:43 49 ******************************"
227 " ... ..( 3 skipped). .. ******************************"
228 " 55 2026-06-19 19:39 49 ******************************"
229 " 56 2026-06-19 20:38 48 *****************************"
230 " 57 2026-06-19 21:37 49 ******************************"
231 " ... ..( 2 skipped). .. ******************************"
232 " 60 2026-06-20 00:34 49 ******************************"
233 " 61 2026-06-20 01:33 48 *****************************"
234 " 62 2026-06-20 02:32 38 *******************"
235 " 63 2026-06-20 03:31 38 *******************"
236 " 64 2026-06-20 04:30 47 ****************************"
237 " 65 2026-06-20 05:29 51 ********************************"
238 " 66 2026-06-20 06:28 51 ********************************"
239 " 67 2026-06-20 07:27 50 *******************************"
240 " ... ..( 2 skipped). .. *******************************"
241 " 70 2026-06-20 10:24 50 *******************************"
242 " 71 2026-06-20 11:23 49 ******************************"
243 " 72 2026-06-20 12:22 ? -"
244 " 73 2026-06-20 13:21 33 **************"
245 " 74 2026-06-20 14:20 ? -"
246 " 75 2026-06-20 15:19 36 *****************"
247 " 76 2026-06-20 16:18 ? -"
248 " 77 2026-06-20 17:17 27 ********"
249 " 78 2026-06-20 18:16 ? -"
250 " 79 2026-06-20 19:15 31 ************"
251 " 80 2026-06-20 20:14 ? -"
252 " 81 2026-06-20 21:13 32 *************"
253 " 82 2026-06-20 22:12 ? -"
254 " 83 2026-06-20 23:11 28 *********"
255 " 84 2026-06-21 00:10 ? -"
256 " 85 2026-06-21 01:09 32 *************"
257 " 86 2026-06-21 02:08 ? -"
258 " 87 2026-06-21 03:07 32 *************"
259 " 88 2026-06-21 04:06 ? -"
260 " 89 2026-06-21 05:05 33 **************"
261 " 90 2026-06-21 06:04 ? -"
262 " 91 2026-06-21 07:03 34 ***************"
263 " 92 2026-06-21 08:02 ? -"
264 " 93 2026-06-21 09:01 34 ***************"
265 " 94 2026-06-21 10:00 ? -"
266 " 95 2026-06-21 10:59 29 **********"
267 " 96 2026-06-21 11:58 ? -"
268 " 97 2026-06-21 12:57 33 **************"
269 " 98 2026-06-21 13:56 ? -"
270 " 99 2026-06-21 14:55 35 ****************"
271 " 100 2026-06-21 15:54 ? -"
272 " 101 2026-06-21 16:53 35 ****************"
273 " 102 2026-06-21 17:52 ? -"
274 " 103 2026-06-21 18:51 39 ********************"
275 " 104 2026-06-21 19:50 ? -"
276 " 105 2026-06-21 20:49 39 ********************"
277 " 106 2026-06-21 21:48 ? -"
278 " 107 2026-06-21 22:47 39 ********************"
279 " 108 2026-06-21 23:46 ? -"
280 " 109 2026-06-22 00:45 40 *********************"
281 " 110 2026-06-22 01:44 ? -"
282 " 111 2026-06-22 02:43 37 ******************"
283 " 112 2026-06-22 03:42 ? -"
284 " 113 2026-06-22 04:41 39 ********************"
285 " 114 2026-06-22 05:40 ? -"
286 " 115 2026-06-22 06:39 40 *********************"
287 " 116 2026-06-22 07:38 37 ******************"
288 " 117 2026-06-22 08:37 37 ******************"
289 " 118 2026-06-22 09:36 37 ******************"
290 " 119 2026-06-22 10:35 ? -"
291 " 120 2026-06-22 11:34 37 ******************"
292 " 121 2026-06-22 12:33 ? -"
293 " 122 2026-06-22 13:32 39 ********************"
294 " 123 2026-06-22 14:31 ? -"
295 " 124 2026-06-22 15:30 28 *********"
296 " 125 2026-06-22 16:29 ? -"
297 " 126 2026-06-22 17:28 34 ***************"
298 " 127 2026-06-22 18:27 ? -"
299 " 0 2026-06-22 19:26 34 ***************"
300 " 1 2026-06-22 20:25 ? -"
301 " 2 2026-06-22 21:24 35 ****************"
302 " 3 2026-06-22 22:23 ? -"
303 " 4 2026-06-22 23:22 38 *******************"
304 " 5 2026-06-23 00:21 ? -"
305 " 6 2026-06-23 01:20 39 ********************"
306 " 7 2026-06-23 02:19 ? -"
307 " 8 2026-06-23 03:18 38 *******************"
308 " 9 2026-06-23 04:17 ? -"
309 " 10 2026-06-23 05:16 27 ********"
310 " 11 2026-06-23 06:15 ? -"
311 " 12 2026-06-23 07:14 34 ***************"
312 " 13 2026-06-23 08:13 ? -"
313 " 14 2026-06-23 09:12 36 *****************"
314 " 15 2026-06-23 10:11 ? -"
315 " 16 2026-06-23 11:10 31 ************"
316 " 17 2026-06-23 12:09 ? -"
317 " 18 2026-06-23 13:08 34 ***************"
318 " 19 2026-06-23 14:07 ? -"
319 " 20 2026-06-23 15:06 28 *********"
320 " 21 2026-06-23 16:05 ? -"
321 " 22 2026-06-23 17:04 37 ******************"
322 " 23 2026-06-23 18:03 39 ********************"
323 " 24 2026-06-23 19:02 ? -"
324 " 25 2026-06-23 20:01 40 *********************"
325 " 26 2026-06-23 21:00 39 ********************"
326 ""
327 "SCT Error Recovery Control:"
328 " Read: 70 (7.0 seconds)"
329 " Write: 70 (7.0 seconds)"
330 ""
331 "Device Statistics (GP Log 0x04)"
332 "Page Offset Size Value Flags Description"
333 "0x01 ===== = = === == General Statistics (rev 1) =="
334 "0x01 0x008 4 438 --- Lifetime Power-On Resets"
335 "0x01 0x010 4 7634 --- Power-on Hours"
336 "0x01 0x018 6 72175362799 --- Logical Sectors Written"
337 "0x01 0x020 6 1578807047 --- Number of Write Commands"
338 "0x01 0x028 6 544071312526 --- Logical Sectors Read"
339 "0x01 0x030 6 1773117277 --- Number of Read Commands"
340 "0x01 0x038 6 - --- Date and Time TimeStamp"
341 "0x03 ===== = = === == Rotating Media Statistics (rev 1) =="
342 "0x03 0x008 4 7418 --- Spindle Motor Power-on Hours"
343 "0x03 0x010 4 4345 --- Head Flying Hours"
344 "0x03 0x018 4 7447 --- Head Load Events"
345 "0x03 0x020 4 0 --- Number of Reallocated Logical Sectors"
346 "0x03 0x028 4 0 --- Read Recovery Attempts"
347 "0x03 0x030 4 0 --- Number of Mechanical Start Failures"
348 "0x03 0x038 4 0 --- Number of Realloc. Candidate Logical Sectors"
349 "0x03 0x040 4 265 --- Number of High Priority Unload Events"
350 "0x04 ===== = = === == General Errors Statistics (rev 1) =="
351 "0x04 0x008 4 0 --- Number of Reported Uncorrectable Errors"
352 "0x04 0x010 4 0 --- Resets Between Cmd Acceptance and Completion"
353 "0x04 0x018 4 0 -D- Physical Element Status Changed"
354 "0x05 ===== = = === == Temperature Statistics (rev 1) =="
355 "0x05 0x008 1 39 --- Current Temperature"
356 "0x05 0x010 1 46 --- Average Short Term Temperature"
357 "0x05 0x018 1 40 --- Average Long Term Temperature"
358 "0x05 0x020 1 63 --- Highest Temperature"
359 "0x05 0x028 1 24 --- Lowest Temperature"
360 "0x05 0x030 1 60 --- Highest Average Short Term Temperature"
361 "0x05 0x038 1 34 --- Lowest Average Short Term Temperature"
362 "0x05 0x040 1 45 --- Highest Average Long Term Temperature"
363 "0x05 0x048 1 38 --- Lowest Average Long Term Temperature"
364 "0x05 0x050 4 3916 --- Time in Over-Temperature"
365 "0x05 0x058 1 60 --- Specified Maximum Operating Temperature"
366 "0x05 0x060 4 0 --- Time in Under-Temperature"
367 "0x05 0x068 1 5 --- Specified Minimum Operating Temperature"
368 "0x06 ===== = = === == Transport Statistics (rev 1) =="
369 "0x06 0x008 4 289 --- Number of Hardware Resets"
370 "0x06 0x010 4 35 --- Number of ASR Events"
371 "0x06 0x018 4 0 --- Number of Interface CRC Errors"
372 "0xff ===== = = === == Vendor Specific Statistics (rev 1) =="
373 "0xff 0x008 7 0 --- Vendor Specific"
374 "0xff 0x010 7 0 --- Vendor Specific"
375 "0xff 0x018 7 0 --- Vendor Specific"
376 " |||_ C monitored condition met"
377 " ||__ D supports DSN"
378 " |___ N normalized value"
379 ""
380 "SATA Phy Event Counters (GP Log 0x11)"
381 "ID Size Value Description"
382 "0x000a 2 1 Device-to-host register FISes sent due to a COMRESET"
383 "0x0001 2 0 Command failed due to ICRC error"
384 "0x0003 2 0 R_ERR response for device-to-host data FIS"
385 "0x0004 2 0 R_ERR response for host-to-device data FIS"
386 "0x0006 2 0 R_ERR response for device-to-host non-data FIS"
387 "0x0007 2 0 R_ERR response for host-to-device non-data FIS"
388 "0x000b 2 0 CRC errors within host-to-device FIS"
389 "0x000d 2 0 Non-CRC errors within host-to-device FIS"
390 ""
drive_database_version
string "7.5/5706"
exit_status 0
local_time
time_t 1782250793
asctime "Tue Jun 23 21:39:53 2026 UTC"
device
name "/dev/sds"
info_name "/dev/sds [SAT]"
type "sat"
protocol "ATA"
model_name "ST16000NM000H-3KW103"
serial_number "ZYD00P5Y"
wwn
naa 5
oui 3152
id 3891948909
firmware_version "EN01"
user_capacity
blocks 31251759104
bytes 16000900661248
logical_block_size 512
physical_block_size 4096
rotation_rate 7200
form_factor
ata_value 2
name "3.5 inches"
trim
supported false
in_smartctl_database false
ata_version
string "ACS-5 (minor revision not indicated)"
major_value 8160
minor_value 65535
sata_version
string "SATA 3.3"
value 511
interface_speed
max
sata_value 14
string "6.0 Gb/s"
units_per_second 60
bits_per_unit 100000000
current
sata_value 3
string "6.0 Gb/s"
units_per_second 60
bits_per_unit 100000000
smart_support
available true
enabled true
read_lookahead
enabled true
write_cache
enabled true
ata_dsn
enabled false
ata_security
state 33
string "Disabled, NOT FROZEN [SEC1]"
enabled false
frozen false
master_password_id 65534
smart_status
passed true
ata_smart_data
offline_data_collection
status
value 130
string "was completed without error"
passed true
completion_seconds 567
self_test
status
value 32
string "was interrupted by the host with a reset"
polling_minutes
short 2
extended 1374
conveyance 3
capabilities
values
0 123
1 3
exec_offline_immediate_supported true
offline_is_aborted_upon_new_cmd false
offline_surface_scan_supported true
self_tests_supported true
conveyance_self_test_supported true
selective_self_test_supported true
attribute_autosave_enabled true
error_logging_supported true
gp_logging_supported true
ata_sct_capabilities
value 20669
error_recovery_control_supported true
feature_control_supported true
data_table_supported true
ata_smart_attributes
revision 10
table
0
id 1
name "Raw_Read_Error_Rate"
value 70
worst 64
thresh 44
when_failed ""
flags
value 15
string "POSR-- "
prefailure true
updated_online true
performance true
error_rate true
event_count false
auto_keep false
raw
value 9581200
string "9581200"
1
id 3
name "Spin_Up_Time"
value 93
worst 90
thresh 0
when_failed ""
flags
value 3
string "PO---- "
prefailure true
updated_online true
performance false
error_rate false
event_count false
auto_keep false
raw
value 0
string "0"
2
id 4
name "Start_Stop_Count"
value 100
worst 100
thresh 20
when_failed ""
flags
value 50
string "-O--CK "
prefailure false
updated_online true
performance false
error_rate false
event_count true
auto_keep true
raw
value 438
string "438"
3
id 5
name "Reallocated_Sector_Ct"
value 100
worst 100
thresh 10
when_failed ""
flags
value 51
string "PO--CK "
prefailure true
updated_online true
performance false
error_rate false
event_count true
auto_keep true
raw
value 0
string "0"
4
id 7
name "Seek_Error_Rate"
value 87
worst 60
thresh 45
when_failed ""
flags
value 15
string "POSR-- "
prefailure true
updated_online true
performance true
error_rate true
event_count false
auto_keep false
raw
value 520340282
string "520340282"
5
id 9
name "Power_On_Hours"
value 92
worst 92
thresh 0
when_failed ""
flags
value 50
string "-O--CK "
prefailure false
updated_online true
performance false
error_rate false
event_count true
auto_keep true
raw
value 7634
string "7634"
6
id 10
name "Spin_Retry_Count"
value 100
worst 100
thresh 97
when_failed ""
flags
value 19
string "PO--C- "
prefailure true
updated_online true
performance false
error_rate false
event_count true
auto_keep false
raw
value 0
string "0"
7
id 12
name "Power_Cycle_Count"
value 100
worst 100
thresh 20
when_failed ""
flags
value 50
string "-O--CK "
prefailure false
updated_online true
performance false
error_rate false
event_count true
auto_keep true
raw
value 438
string "438"
8
id 18
name "Unknown_Attribute"
value 100
worst 100
thresh 50
when_failed ""
flags
value 11
string "PO-R-- "
prefailure true
updated_online true
performance false
error_rate true
event_count false
auto_keep false
raw
value 0
string "0"
9
id 187
name "Reported_Uncorrect"
value 100
worst 100
thresh 0
when_failed ""
flags
value 50
string "-O--CK "
prefailure false
updated_online true
performance false
error_rate false
event_count true
auto_keep true
raw
value 0
string "0"
10
id 188
name "Command_Timeout"
value 100
worst 100
thresh 0
when_failed ""
flags
value 50
string "-O--CK "
prefailure false
updated_online true
performance false
error_rate false
event_count true
auto_keep true
raw
value 0
string "0"
11
id 190
name "Airflow_Temperature_Cel"
value 61
worst 36
thresh 0
when_failed ""
flags
value 34
string "-O---K "
prefailure false
updated_online true
performance false
error_rate false
event_count false
auto_keep true
raw
value 673644583
string "39 (Min/Max 39/40)"
12
id 192
name "Power-Off_Retract_Count"
value 100
worst 100
thresh 0
when_failed ""
flags
value 50
string "-O--CK "
prefailure false
updated_online true
performance false
error_rate false
event_count true
auto_keep true
raw
value 265
string "265"
13
id 193
name "Load_Cycle_Count"
value 97
worst 97
thresh 0
when_failed ""
flags
value 50
string "-O--CK "
prefailure false
updated_online true
performance false
error_rate false
event_count true
auto_keep true
raw
value 7447
string "7447"
14
id 194
name "Temperature_Celsius"
value 39
worst 64
thresh 0
when_failed ""
flags
value 34
string "-O---K "
prefailure false
updated_online true
performance false
error_rate false
event_count false
auto_keep true
raw
value 90194313255
string "39 (0 21 0 0 0)"
15
id 197
name "Current_Pending_Sector"
value 100
worst 100
thresh 0
when_failed ""
flags
value 18
string "-O--C- "
prefailure false
updated_online true
performance false
error_rate false
event_count true
auto_keep false
raw
value 0
string "0"
16
id 198
name "Offline_Uncorrectable"
value 100
worst 100
thresh 0
when_failed ""
flags
value 16
string "----C- "
prefailure false
updated_online false
performance false
error_rate false
event_count true
auto_keep false
raw
value 0
string "0"
17
id 199
name "UDMA_CRC_Error_Count"
value 200
worst 200
thresh 0
when_failed ""
flags
value 62
string "-OSRCK "
prefailure false
updated_online true
performance true
error_rate true
event_count true
auto_keep true
raw
value 0
string "0"
18
id 200
name "Multi_Zone_Error_Rate"
value 100
worst 100
thresh 1
when_failed ""
flags
value 35
string "PO---K "
prefailure true
updated_online true
performance false
error_rate false
event_count false
auto_keep true
raw
value 0
string "0"
19
id 240
name "Head_Flying_Hours"
value 100
worst 100
thresh 0
when_failed ""
flags
value 0
string "------ "
prefailure false
updated_online false
performance false
error_rate false
event_count false
auto_keep false
raw
value 121255516705017
string "4345 (110 72 0)"
20
id 241
name "Total_LBAs_Written"
value 100
worst 253
thresh 0
when_failed ""
flags
value 0
string "------ "
prefailure false
updated_online false
performance false
error_rate false
event_count false
auto_keep false
raw
value 73966167809
string "73966167809"
21
id 242
name "Total_LBAs_Read"
value 100
worst 253
thresh 0
when_failed ""
flags
value 0
string "------ "
prefailure false
updated_online false
performance false
error_rate false
event_count false
auto_keep false
raw
value 558276226123
string "558276226123"
spare_available
current_percent 100
threshold_percent 10
power_on_time
hours 7634
power_cycle_count 438
temperature
current 39
power_cycle_min 39
power_cycle_max 41
lifetime_min 24
lifetime_max 63
op_limit_max 60
op_limit_min 5
limit_min 5
limit_max 60
lifetime_over_limit_minutes 3916
lifetime_under_limit_minutes 0
ata_log_directory
gp_dir_version 1
smart_dir_version 1
smart_dir_multi_sector true
table
0
address 0
name "Log Directory"
read true
write false
gp_sectors 1
smart_sectors 1
1
address 1
name "Summary SMART error log"
read true
write false
smart_sectors 1
2
address 2
name "Comprehensive SMART error log"
read true
write false
smart_sectors 5
3
address 3
name "Ext. Comprehensive SMART error log"
read true
write false
gp_sectors 5
4
address 4
name "Device Statistics log"
read true
write false
gp_sectors 256
smart_sectors 8
5
address 6
name "SMART self-test log"
read true
write false
smart_sectors 1
6
address 7
name "Extended self-test log"
read true
write false
gp_sectors 1
7
address 8
name "Power Conditions log"
read true
write false
gp_sectors 2
8
address 9
name "Selective self-test log"
read true
write true
smart_sectors 1
9
address 10
name "Device Statistics Notification"
read true
write true
gp_sectors 8
10
address 12
name "Pending Defects log"
read true
write false
gp_sectors 2048
11
address 15
name "Sense Data for Successful NCQ Cmds log"
read true
write false
gp_sectors 2
12
address 16
name "NCQ Command Error log"
read true
write false
gp_sectors 1
13
address 17
name "SATA Phy Event Counters log"
read true
write false
gp_sectors 1
14
address 19
name "SATA NCQ Send and Receive log"
read true
write false
gp_sectors 1
15
address 33
name "Write stream error log"
read true
write false
gp_sectors 1
16
address 34
name "Read stream error log"
read true
write false
gp_sectors 1
17
address 36
name "Current Device Internal Status Data log"
read true
write false
gp_sectors 768
18
address 47
name "Sector Configuration log"
read true
write false
gp_sectors 1
19
address 48
name "IDENTIFY DEVICE data log"
read true
write false
gp_sectors 9
smart_sectors 9
20
address 128
name "Host vendor specific log"
read true
write true
gp_sectors 16
smart_sectors 16
21
address 129
name "Host vendor specific log"
read true
write true
gp_sectors 16
smart_sectors 16
22
address 130
name "Host vendor specific log"
read true
write true
gp_sectors 16
smart_sectors 16
23
address 131
name "Host vendor specific log"
read true
write true
gp_sectors 16
smart_sectors 16
24
address 132
name "Host vendor specific log"
read true
write true
gp_sectors 16
smart_sectors 16
25
address 133
name "Host vendor specific log"
read true
write true
gp_sectors 16
smart_sectors 16
26
address 134
name "Host vendor specific log"
read true
write true
gp_sectors 16
smart_sectors 16
27
address 135
name "Host vendor specific log"
read true
write true
gp_sectors 16
smart_sectors 16
28
address 136
name "Host vendor specific log"
read true
write true
gp_sectors 16
smart_sectors 16
29
address 137
name "Host vendor specific log"
read true
write true
gp_sectors 16
smart_sectors 16
30
address 138
name "Host vendor specific log"
read true
write true
gp_sectors 16
smart_sectors 16
31
address 139
name "Host vendor specific log"
read true
write true
gp_sectors 16
smart_sectors 16
32
address 140
name "Host vendor specific log"
read true
write true
gp_sectors 16
smart_sectors 16
33
address 141
name "Host vendor specific log"
read true
write true
gp_sectors 16
smart_sectors 16
34
address 142
name "Host vendor specific log"
read true
write true
gp_sectors 16
smart_sectors 16
35
address 143
name "Host vendor specific log"
read true
write true
gp_sectors 16
smart_sectors 16
36
address 144
name "Host vendor specific log"
read true
write true
gp_sectors 16
smart_sectors 16
37
address 145
name "Host vendor specific log"
read true
write true
gp_sectors 16
smart_sectors 16
38
address 146
name "Host vendor specific log"
read true
write true
gp_sectors 16
smart_sectors 16
39
address 147
name "Host vendor specific log"
read true
write true
gp_sectors 16
smart_sectors 16
40
address 148
name "Host vendor specific log"
read true
write true
gp_sectors 16
smart_sectors 16
41
address 149
name "Host vendor specific log"
read true
write true
gp_sectors 16
smart_sectors 16
42
address 150
name "Host vendor specific log"
read true
write true
gp_sectors 16
smart_sectors 16
43
address 151
name "Host vendor specific log"
read true
write true
gp_sectors 16
smart_sectors 16
44
address 152
name "Host vendor specific log"
read true
write true
gp_sectors 16
smart_sectors 16
45
address 153
name "Host vendor specific log"
read true
write true
gp_sectors 16
smart_sectors 16
46
address 154
name "Host vendor specific log"
read true
write true
gp_sectors 16
smart_sectors 16
47
address 155
name "Host vendor specific log"
read true
write true
gp_sectors 16
smart_sectors 16
48
address 156
name "Host vendor specific log"
read true
write true
gp_sectors 16
smart_sectors 16
49
address 157
name "Host vendor specific log"
read true
write true
gp_sectors 16
smart_sectors 16
50
address 158
name "Host vendor specific log"
read true
write true
gp_sectors 16
smart_sectors 16
51
address 159
name "Host vendor specific log"
read true
write true
gp_sectors 16
smart_sectors 16
52
address 161
name "Device vendor specific log"
gp_sectors 160
smart_sectors 160
53
address 162
name "Device vendor specific log"
gp_sectors 16320
54
address 164
name "Device vendor specific log"
gp_sectors 160
smart_sectors 160
55
address 166
name "Device vendor specific log"
gp_sectors 192
56
address 168
name "Device vendor specific log"
gp_sectors 136
smart_sectors 136
57
address 169
name "Device vendor specific log"
gp_sectors 136
smart_sectors 136
58
address 171
name "Device vendor specific log"
gp_sectors 1
59
address 173
name "Device vendor specific log"
gp_sectors 16
60
address 177
name "Device vendor specific log"
gp_sectors 160
smart_sectors 160
61
address 180
name "Device vendor specific log"
gp_sectors 16
smart_sectors 16
62
address 182
name "Device vendor specific log"
gp_sectors 1920
63
address 190
name "Device vendor specific log"
gp_sectors 65535
64
address 191
name "Device vendor specific log"
gp_sectors 65535
65
address 193
name "Device vendor specific log"
gp_sectors 8
smart_sectors 8
66
address 195
name "Device vendor specific log"
gp_sectors 32
smart_sectors 32
67
address 198
name "Device vendor specific log"
gp_sectors 5184
68
address 199
name "Device vendor specific log"
gp_sectors 8
smart_sectors 8
69
address 201
name "Device vendor specific log"
gp_sectors 8
smart_sectors 8
70
address 202
name "Device vendor specific log"
gp_sectors 16
smart_sectors 16
71
address 205
name "Device vendor specific log"
gp_sectors 1
smart_sectors 1
72
address 206
name "Device vendor specific log"
gp_sectors 1
73
address 207
name "Device vendor specific log"
gp_sectors 512
74
address 209
name "Device vendor specific log"
gp_sectors 656
75
address 210
name "Device vendor specific log"
gp_sectors 10256
76
address 212
name "Device vendor specific log"
gp_sectors 2048
77
address 218
name "Device vendor specific log"
gp_sectors 1
smart_sectors 1
78
address 224
name "SCT Command/Status"
read true
write true
gp_sectors 1
smart_sectors 1
79
address 225
name "SCT Data Transfer"
read true
write true
gp_sectors 1
smart_sectors 1
ata_smart_error_log
extended
revision 1
sectors 5
count 0
ata_smart_self_test_log
extended
revision 1
sectors 1
table
0
type
value 2
string "Extended offline"
status
value 32
string "Interrupted (host reset)"
lifetime_hours 7633
1
type
value 2
string "Extended offline"
status
value 32
string "Interrupted (host reset)"
lifetime_hours 7631
2
type
value 1
string "Short offline"
status
value 0
string "Completed without error"
passed true
lifetime_hours 7630
3
type
value 2
string "Extended offline"
status
value 32
string "Interrupted (host reset)"
lifetime_hours 7507
4
type
value 2
string "Extended offline"
status
value 32
string "Interrupted (host reset)"
lifetime_hours 7464
5
type
value 1
string "Short offline"
status
value 0
string "Completed without error"
passed true
lifetime_hours 7330
6
type
value 2
string "Extended offline"
status
value 32
string "Interrupted (host reset)"
lifetime_hours 7311
7
type
value 1
string "Short offline"
status
value 0
string "Completed without error"
passed true
lifetime_hours 7242
8
type
value 2
string "Extended offline"
status
value 0
string "Completed without error"
passed true
lifetime_hours 7165
9
type
value 1
string "Short offline"
status
value 0
string "Completed without error"
passed true
lifetime_hours 7040
10
type
value 2
string "Extended offline"
status
value 32
string "Interrupted (host reset)"
lifetime_hours 6994
11
type
value 1
string "Short offline"
status
value 0
string "Completed without error"
passed true
lifetime_hours 6971
12
type
value 2
string "Extended offline"
status
value 32
string "Interrupted (host reset)"
lifetime_hours 6959
13
type
value 1
string "Short offline"
status
value 0
string "Completed without error"
passed true
lifetime_hours 6902
14
type
value 2
string "Extended offline"
status
value 0
string "Completed without error"
passed true
lifetime_hours 6875
15
type
value 2
string "Extended offline"
status
value 32
string "Interrupted (host reset)"
lifetime_hours 6784
16
type
value 1
string "Short offline"
status
value 0
string "Completed without error"
passed true
lifetime_hours 6596
17
type
value 2
string "Extended offline"
status
value 32
string "Interrupted (host reset)"
lifetime_hours 6535
18
type
value 2
string "Extended offline"
status
value 0
string "Completed without error"
passed true
lifetime_hours 6492
count 19
error_count_total 0
error_count_outdated 0
ata_smart_selective_self_test_log
revision 1
table
0
lba_min 0
lba_max 0
status
value 32
string "Not_testing"
1
lba_min 0
lba_max 0
status
value 32
string "Not_testing"
2
lba_min 0
lba_max 0
status
value 32
string "Not_testing"
3
lba_min 0
lba_max 0
status
value 32
string "Not_testing"
4
lba_min 0
lba_max 0
status
value 32
string "Not_testing"
flags
value 0
remainder_scan_enabled false
power_up_scan_resume_minutes 0
ata_sct_status
format_version 3
sct_version 522
device_state
value 0
string "Active"
temperature
current 39
power_cycle_min 39
power_cycle_max 41
lifetime_min 21
lifetime_max 64
op_limit_max 60
under_limit_count 0
over_limit_count 0
smart_status
passed true
vendor_specific
0 0
1 0
2 0
3 0
4 0
5 0
6 0
7 0
8 0
9 0
10 0
11 0
12 0
13 0
14 0
15 0
16 0
17 0
18 0
19 0
20 1
21 0
22 0
23 0
24 0
25 0
26 0
27 0
28 0
29 0
30 0
31 0
ata_sct_temperature_history
version 2
sampling_period_minutes 5
logging_interval_minutes 59
temperature
op_limit_min 10
op_limit_max 40
limit_min 5
limit_max 60
size 128
index 26
table
0 33
1 null
2 26
3 43
4 45
5 44
6 40
7 null
8 26
9 37
10 48
11 50
12 49
13 48
14 48
15 49
16 50
17 49
18 49
19 49
20 50
21 49
22 50
23 50
24 49
25 49
26 49
27 49
28 49
29 48
30 49
31 49
32 49
33 49
34 48
35 38
36 38
37 47
38 51
39 51
40 50
41 50
42 50
43 50
44 49
45 null
46 33
47 null
48 36
49 null
50 27
51 null
52 31
53 null
54 32
55 null
56 28
57 null
58 32
59 null
60 32
61 null
62 33
63 null
64 34
65 null
66 34
67 null
68 29
69 null
70 33
71 null
72 35
73 null
74 35
75 null
76 39
77 null
78 39
79 null
80 39
81 null
82 40
83 null
84 37
85 null
86 39
87 null
88 40
89 37
90 37
91 37
92 null
93 37
94 null
95 39
96 null
97 28
98 null
99 34
100 null
101 34
102 null
103 35
104 null
105 38
106 null
107 39
108 null
109 38
110 null
111 27
112 null
113 34
114 null
115 36
116 null
117 31
118 null
119 34
120 null
121 28
122 null
123 37
124 39
125 null
126 40
127 39
ata_sct_erc
read
enabled true
deciseconds 70
write
enabled true
deciseconds 70
ata_device_statistics
pages
0
number 1
name "General Statistics"
revision 1
table
0
offset 8
name "Lifetime Power-On Resets"
size 4
value 438
flags
value 192
string "V--- "
valid true
normalized false
supports_dsn false
monitored_condition_met false
1
offset 16
name "Power-on Hours"
size 4
value 7634
flags
value 192
string "V--- "
valid true
normalized false
supports_dsn false
monitored_condition_met false
2
offset 24
name "Logical Sectors Written"
size 6
value 72175362799
flags
value 192
string "V--- "
valid true
normalized false
supports_dsn false
monitored_condition_met false
3
offset 32
name "Number of Write Commands"
size 6
value 1578807047
flags
value 192
string "V--- "
valid true
normalized false
supports_dsn false
monitored_condition_met false
4
offset 40
name "Logical Sectors Read"
size 6
value 544071312526
flags
value 192
string "V--- "
valid true
normalized false
supports_dsn false
monitored_condition_met false
5
offset 48
name "Number of Read Commands"
size 6
value 1773117277
flags
value 192
string "V--- "
valid true
normalized false
supports_dsn false
monitored_condition_met false
6
offset 56
name "Date and Time TimeStamp"
size 6
flags
value 128
string "---- "
valid false
normalized false
supports_dsn false
monitored_condition_met false
1
number 3
name "Rotating Media Statistics"
revision 1
table
0
offset 8
name "Spindle Motor Power-on Hours"
size 4
value 7418
flags
value 192
string "V--- "
valid true
normalized false
supports_dsn false
monitored_condition_met false
1
offset 16
name "Head Flying Hours"
size 4
value 4345
flags
value 192
string "V--- "
valid true
normalized false
supports_dsn false
monitored_condition_met false
2
offset 24
name "Head Load Events"
size 4
value 7447
flags
value 192
string "V--- "
valid true
normalized false
supports_dsn false
monitored_condition_met false
3
offset 32
name "Number of Reallocated Logical Sectors"
size 4
value 0
flags
value 192
string "V--- "
valid true
normalized false
supports_dsn false
monitored_condition_met false
4
offset 40
name "Read Recovery Attempts"
size 4
value 0
flags
value 192
string "V--- "
valid true
normalized false
supports_dsn false
monitored_condition_met false
5
offset 48
name "Number of Mechanical Start Failures"
size 4
value 0
flags
value 192
string "V--- "
valid true
normalized false
supports_dsn false
monitored_condition_met false
6
offset 56
name "Number of Realloc. Candidate Logical Sectors"
size 4
value 0
flags
value 192
string "V--- "
valid true
normalized false
supports_dsn false
monitored_condition_met false
7
offset 64
name "Number of High Priority Unload Events"
size 4
value 265
flags
value 192
string "V--- "
valid true
normalized false
supports_dsn false
monitored_condition_met false
2
number 4
name "General Errors Statistics"
revision 1
table
0
offset 8
name "Number of Reported Uncorrectable Errors"
size 4
value 0
flags
value 192
string "V--- "
valid true
normalized false
supports_dsn false
monitored_condition_met false
1
offset 16
name "Resets Between Cmd Acceptance and Completion"
size 4
value 0
flags
value 192
string "V--- "
valid true
normalized false
supports_dsn false
monitored_condition_met false
2
offset 24
name "Physical Element Status Changed"
size 4
value 0
flags
value 208
string "V-D- "
valid true
normalized false
supports_dsn true
monitored_condition_met false
3
number 5
name "Temperature Statistics"
revision 1
table
0
offset 8
name "Current Temperature"
size 1
value 39
flags
value 192
string "V--- "
valid true
normalized false
supports_dsn false
monitored_condition_met false
1
offset 16
name "Average Short Term Temperature"
size 1
value 46
flags
value 192
string "V--- "
valid true
normalized false
supports_dsn false
monitored_condition_met false
2
offset 24
name "Average Long Term Temperature"
size 1
value 40
flags
value 192
string "V--- "
valid true
normalized false
supports_dsn false
monitored_condition_met false
3
offset 32
name "Highest Temperature"
size 1
value 63
flags
value 192
string "V--- "
valid true
normalized false
supports_dsn false
monitored_condition_met false
4
offset 40
name "Lowest Temperature"
size 1
value 24
flags
value 192
string "V--- "
valid true
normalized false
supports_dsn false
monitored_condition_met false
5
offset 48
name "Highest Average Short Term Temperature"
size 1
value 60
flags
value 192
string "V--- "
valid true
normalized false
supports_dsn false
monitored_condition_met false
6
offset 56
name "Lowest Average Short Term Temperature"
size 1
value 34
flags
value 192
string "V--- "
valid true
normalized false
supports_dsn false
monitored_condition_met false
7
offset 64
name "Highest Average Long Term Temperature"
size 1
value 45
flags
value 192
string "V--- "
valid true
normalized false
supports_dsn false
monitored_condition_met false
8
offset 72
name "Lowest Average Long Term Temperature"
size 1
value 38
flags
value 192
string "V--- "
valid true
normalized false
supports_dsn false
monitored_condition_met false
9
offset 80
name "Time in Over-Temperature"
size 4
value 3916
flags
value 192
string "V--- "
valid true
normalized false
supports_dsn false
monitored_condition_met false
10
offset 88
name "Specified Maximum Operating Temperature"
size 1
value 60
flags
value 192
string "V--- "
valid true
normalized false
supports_dsn false
monitored_condition_met false
11
offset 96
name "Time in Under-Temperature"
size 4
value 0
flags
value 192
string "V--- "
valid true
normalized false
supports_dsn false
monitored_condition_met false
12
offset 104
name "Specified Minimum Operating Temperature"
size 1
value 5
flags
value 192
string "V--- "
valid true
normalized false
supports_dsn false
monitored_condition_met false
4
number 6
name "Transport Statistics"
revision 1
table
0
offset 8
name "Number of Hardware Resets"
size 4
value 289
flags
value 192
string "V--- "
valid true
normalized false
supports_dsn false
monitored_condition_met false
1
offset 16
name "Number of ASR Events"
size 4
value 35
flags
value 192
string "V--- "
valid true
normalized false
supports_dsn false
monitored_condition_met false
2
offset 24
name "Number of Interface CRC Errors"
size 4
value 0
flags
value 192
string "V--- "
valid true
normalized false
supports_dsn false
monitored_condition_met false
5
number 255
name "Vendor Specific Statistics"
revision 1
table
0
offset 8
name "Vendor Specific"
size 7
value 0
flags
value 192
string "V--- "
valid true
normalized false
supports_dsn false
monitored_condition_met false
1
offset 16
name "Vendor Specific"
size 7
value 0
flags
value 192
string "V--- "
valid true
normalized false
supports_dsn false
monitored_condition_met false
2
offset 24
name "Vendor Specific"
size 7
value 0
flags
value 192
string "V--- "
valid true
normalized false
supports_dsn false
monitored_condition_met false
sata_phy_event_counters
table
0
id 10
name "Device-to-host register FISes sent due to a COMRESET"
size 2
value 1
overflow false
1
id 1
name "Command failed due to ICRC error"
size 2
value 0
overflow false
2
id 3
name "R_ERR response for device-to-host data FIS"
size 2
value 0
overflow false
3
id 4
name "R_ERR response for host-to-device data FIS"
size 2
value 0
overflow false
4
id 6
name "R_ERR response for device-to-host non-data FIS"
size 2
value 0
overflow false
5
id 7
name "R_ERR response for host-to-device non-data FIS"
size 2
value 0
overflow false
6
id 11
name "CRC errors within host-to-device FIS"
size 2
value 0
overflow false
7
id 13
name "Non-CRC errors within host-to-device FIS"
size 2
value 0
overflow false
reset false
Should I now try, as Arwen suggested, removing the disk being resilvered and attempting a scrub? Could I also remove the disk ST16000NM000H-3KW103 ZYD00P5Y that seems to be causing problems, since it is a RAIDZ2 pool, and hope that my resilver finishes that way? Or is this the wrong way to think about what’s happening, because I don’t understand how ZFS works?
Could anybody tell me whether anything about my problem can be understood from the smart report? What should my next troubleshooting steps be now? I would greatly appreciate any tips on how to progress. Thank you all for your tips and support.
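One way to pull the interesting parts out of a report like the one above is to summarise it programmatically. The following is only a minimal sketch, assuming the report was saved as JSON (for example with `smartctl -x -j`) and that the field names match the dump above; `sample_report` is a hand-trimmed excerpt of that dump, and the function names are made up for illustration.

```python
from collections import Counter


def summarize_self_tests(report: dict) -> Counter:
    """Count self-test log entries by (test type, status string)."""
    log = report.get("ata_smart_self_test_log", {}).get("extended", {})
    counts = Counter()
    for entry in log.get("table", []):
        counts[(entry["type"]["string"], entry["status"]["string"])] += 1
    return counts


def over_temperature_minutes(report: dict) -> int:
    """Minutes the drive reports having spent above its temperature limit."""
    return report.get("temperature", {}).get("lifetime_over_limit_minutes", 0)


# Trimmed excerpt of the posted report (only the first three log entries).
sample_report = {
    "temperature": {"current": 39, "lifetime_max": 63,
                    "op_limit_max": 60, "lifetime_over_limit_minutes": 3916},
    "ata_smart_self_test_log": {"extended": {"table": [
        {"type": {"value": 2, "string": "Extended offline"},
         "status": {"value": 32, "string": "Interrupted (host reset)"},
         "lifetime_hours": 7633},
        {"type": {"value": 2, "string": "Extended offline"},
         "status": {"value": 32, "string": "Interrupted (host reset)"},
         "lifetime_hours": 7631},
        {"type": {"value": 1, "string": "Short offline"},
         "status": {"value": 0, "string": "Completed without error", "passed": True},
         "lifetime_hours": 7630},
    ]}},
}

for (test_type, status), n in summarize_self_tests(sample_report).items():
    print(f"{test_type}: {status} x{n}")
print("minutes over temperature limit:", over_temperature_minutes(sample_report))
```

Run against the full log posted above, this kind of count shows 9 of the 12 extended tests ending in "Interrupted (host reset)" while all 7 short tests completed, alongside 3916 minutes over the temperature limit and a lifetime maximum of 63 against a limit of 60.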
Removing 2 disks would put the pool in jeopardy. You could remove 1 and try it. If it generates the I/O suspend again, repeat with the other removed.
It can be a bit hard to diagnose issues remotely. Can you supply the output of zpool status in CODE tags? And point out which disk(s) you think are the problem?
NAME                                        STATE     READ WRITE CKSUM
Volume1                                     ONLINE       0     0     0
  raidz2-0                                  ONLINE       0     0     0
    2f7321eb-afd0-4d39-8eb7-abaa52c9a4e6    ONLINE       0     0     0
    0bccf225-1ad4-4575-828f-4b3d07b0d0d2    ONLINE       0     0     0
    7a1c14c4-ad25-4e19-af77-4349c998e060    ONLINE       0     0     0  (awaiting resilver)
    8e053548-9acb-4a03-a76a-a7db59cdb31e    ONLINE       0     0     0
    81ff015b-62fd-454a-83cc-c9b793aa4d81    ONLINE       0     0     0
    4bef9b6b-b96a-4698-85e2-8ba9705a5450    DEGRADED 1.16K 11.9K     1  too many errors
    44d811c4-8a6f-4c4b-946b-558c659fb08e    ONLINE       0     0     0
    30ccbe9f-31ba-4b82-97ae-3b4eccbbab3a    ONLINE       0     0     0
    aec216ca-a948-42c2-8317-d8af4b6ffe7b    ONLINE       0     0     0
    0669e04a-4a7b-4396-8647-4450f38a6e7d    ONLINE       0     0     0
    bf8c9c1f-184a-4237-aadc-e18c8720632f    ONLINE       0     0     0  (resilvering)
cache
  7ab5e132-17b1-4090-a427-6d9625eee0f6      ONLINE       0     0     0
4bef9b6b-b96a-4698-85e2-8ba9705a5450 is the one that is causing problems (the one which immediately interrupts the long SMART test). bf8c9c1f-184a-4237-aadc-e18c8720632f seems to be the 18TB one that I initially replaced. I have no clue why 7a1c14c4-ad25-4e19-af77-4349c998e060 is awaiting a resilver, but that seems to be the issue. Is there any way to figure out which drive this is (its serial number)? How should I proceed? This all seems rather bad now.
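On matching the long IDs in `zpool status` to a physical drive: a minimal sketch, assuming a Linux-based system where those IDs are partition UUIDs and `lsblk` can report the `NAME`, `SERIAL` and `PARTUUID` columns as JSON. The function names and the sample values (device names, serials, UUIDs) are placeholders for illustration, not taken from this pool.

```python
import json
import subprocess


def read_lsblk() -> dict:
    """Run lsblk and return its JSON tree (disks with nested partitions)."""
    out = subprocess.run(
        ["lsblk", "-J", "-o", "NAME,SERIAL,PARTUUID"],
        capture_output=True, text=True, check=True,
    )
    return json.loads(out.stdout)


def partuuid_to_serial(lsblk: dict) -> dict:
    """Map each partition UUID to (parent disk name, parent disk serial)."""
    mapping = {}
    for disk in lsblk.get("blockdevices", []):
        # Partitions appear as "children" of the disk that holds them.
        for part in disk.get("children") or []:
            if part.get("partuuid"):
                mapping[part["partuuid"]] = (disk["name"], disk.get("serial"))
    return mapping


# Placeholder data in the shape lsblk -J produces; not real output.
sample = {"blockdevices": [
    {"name": "sda", "serial": "EXAMPLESERIAL1", "partuuid": None, "children": [
        {"name": "sda1", "serial": None,
         "partuuid": "00000000-0000-0000-0000-000000000001"}]},
    {"name": "sdb", "serial": "EXAMPLESERIAL2", "partuuid": None, "children": [
        {"name": "sdb1", "serial": None,
         "partuuid": "00000000-0000-0000-0000-000000000002"}]},
]}

for uuid, (disk, serial) in partuuid_to_serial(sample).items():
    print(uuid, "->", disk, serial)
```

On the real system, `partuuid_to_serial(read_lsblk())` would be the call to try; looking up each ID from the pool listing in the result should give the serial printed on the drive label.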
Having 3 problematic drives in a RAID-Z2 vDev is just a bit beyond me. Sorry.
There is good news. ZFS was specifically designed with knowledge of the data and how it interacts with the redundancy, (RAID-Z2 in this case). What this means is that as long as enough info exists for each RAID-Zx stripe, the data in that stripe is fully safe / recoverable.
There are certain situations where regular RAID-5/6 can fail while replacing a disk. ZFS allows replacing in place, meaning you add the new replacement disk to the server and then tell ZFS to replace the failing disk. What this helps prevent is data loss when another disk, (or 2), has bad blocks, but the failing disk that is being replaced still has the data, (or parity). In essence, the replacement disk temporarily mirrors the failing disk, (until it finds a bad block, then it uses the rest of the vDev for recovery).
However, this replace in place is not occurring on your pool’s vDev.
Your disks appear to be out of sync, thus requiring resilvering to bring them back into sync. But with a 3rd disk being “DEGRADED” on a RAID-Z2 pool, that can cause problems if the blocks that need resilvering are not recoverable from the other disks.
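To make the arithmetic concrete, here is a rough sketch that counts the vDev members that are not fully healthy and compares that with the parity level. It is a worst-case view only: it assumes every flagged disk is unusable for a given stripe, which is pessimistic, since a resilvering or DEGRADED disk usually still holds valid data for many blocks. The parsing is a simplification for illustration, and `status_text` is a four-row excerpt of the listing posted above.

```python
import re

# One device row: 36-character ID, state, three counters, optional note.
ROW = re.compile(r"^\s*([0-9a-f-]{36})\s+(\S+)\s+\S+\s+\S+\s+\S+\s*(.*)$")


def at_risk_members(status_text: str) -> list:
    """Return (id, state, note) for rows that are not plainly ONLINE."""
    flagged = []
    for line in status_text.splitlines():
        m = ROW.match(line)
        if not m:
            continue  # header, pool and vdev rows have no 36-character ID
        name, state, note = m.groups()
        if state != "ONLINE" or "resilver" in note:
            flagged.append((name, state, note.strip()))
    return flagged


def worst_case_margin(parity: int, flagged_count: int) -> int:
    """Parity left over if every flagged member were lost at once."""
    return parity - flagged_count


status_text = """\
2f7321eb-afd0-4d39-8eb7-abaa52c9a4e6 ONLINE 0 0 0
7a1c14c4-ad25-4e19-af77-4349c998e060 ONLINE 0 0 0 (awaiting resilver)
4bef9b6b-b96a-4698-85e2-8ba9705a5450 DEGRADED 1.16K 11.9K 1 too many errors
bf8c9c1f-184a-4237-aadc-e18c8720632f ONLINE 0 0 0 (resilvering)
"""

flagged = at_risk_members(status_text)
for name, state, note in flagged:
    print(name, state, note)
print("worst-case margin for RAID-Z2:", worst_case_margin(2, len(flagged)))
```

A negative margin here does not mean data is lost; it means there is no spare redundancy left if all flagged disks misbehave on the same stripe, which is why pulling a second disk would be risky.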
As for what to do, sorry. As I said, this is a bit beyond me.


