I have TrueNAS installed on a Minisforum MS-01, with an 8-bay QNAP TL-D800S JBOD attached. It held a RAIDZ2 pool of eight 6 TB disks. I do have backups of the data (which I only started taking 3 weeks ago, whew!).
A bit over a week ago, one of the drives started throwing errors, so I replaced it with a 14 TB disk. As soon as that resilver finished, two more disks started throwing errors (they all had nearly the same manufacture date, 2016-17), so I replaced those too, one at a time, with 14 TB disks. Then one of the new 14 TB disks was failed out a day later with 7 uncorrectable read errors, so this afternoon I replaced it with yet another 14 TB disk. The only other issue, this morning, was a single checksum error on one of the 6 TB drives.
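For reference, I did each swap through the TrueNAS UI, which as I understand it amounts to roughly the following at the command line (the GUID and device path below are placeholders, not my actual devices):

```shell
# Rough sketch of a single disk swap (placeholder names, not my real devices).
# Take the failing member offline, identified by its partition GUID:
zpool offline tns-qnap-2 <old-disk-guid>

# Replace it with the new 14 TB disk and let the resilver run:
zpool replace tns-qnap-2 <old-disk-guid> /dev/disk/by-partuuid/<new-disk-guid>

# Watch resilver progress:
zpool status -v tns-qnap-2
```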
Tonight, the resilver was nearly complete, then it got stuck at 99.93% done. It sat that way for a long time, and then I got a bunch of alerts that no S.M.A.R.T. tests could be run. I ran `zpool status` and this is the mess it reported:
admin@tns-qnap-2[~]$ sudo zpool status -v
  pool: boot-pool
 state: ONLINE
  scan: scrub repaired 0B in 00:00:03 with 0 errors on Thu Sep 25 03:45:04 2025
config:

        NAME           STATE     READ WRITE CKSUM
        boot-pool      ONLINE       0     0     0
          mirror-0     ONLINE       0     0     0
            nvme1n1p3  ONLINE       0     0     0
            nvme0n1p3  ONLINE       0     0     0

errors: No known data errors

  pool: tns-qnap-2
 state: DEGRADED
status: One or more devices is currently being resilvered. The pool will
        continue to function, possibly in a degraded state.
action: Wait for the resilver to complete.
  scan: resilver in progress since Mon Sep 29 14:04:53 2025
        20.2T / 20.2T scanned, 20.1T / 20.2T issued at 661M/s
        2.46T resilvered, 99.93% done, 00:00:21 to go
config:

        NAME                                        STATE     READ WRITE CKSUM
        tns-qnap-2                                  DEGRADED     0     0     0
          raidz2-0                                  DEGRADED     0     4     0
            12e3e46e-49c6-426e-9e12-2c3134062479    ONLINE       0     0     0
            a92c8bc4-4b86-4c35-83b9-128f0ae22c58    ONLINE       0     0     0
            replacing-2                             DEGRADED     0     0     0
              ad2d653a-0b4a-4eb2-bd83-b920e2f6ee72  REMOVED      0     0     0
              052fe9c6-4cf0-4539-b99b-db58a1812528  ONLINE       0     0     0  (resilvering)
            5313d473-44bf-4590-b392-182fbfbff735    ONLINE       0     0     0
            00a8dbb3-c67c-4a74-b240-a1f55a188f0f    ONLINE       3     4     0
            15d57333-0b20-4b2a-851a-0b0d1a05939e    ONLINE       3     4     1
            937669cf-0dfd-4c6c-b5c5-211b1195ecc7    ONLINE       3     7     0
            315bfe10-63da-47dc-91a5-0268fdda7630    ONLINE       3     8     0

errors: List of errors unavailable: pool I/O is currently suspended
admin@tns-qnap-2[~]$
The four drives showing errors are four of the five remaining 6 TB drives.
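Once the pool's I/O resumes (or after a reboot), I plan to pull SMART data from each of the erroring 6 TB drives, something like this (`sdX` is a placeholder for each actual device node):

```shell
# Hypothetical health check per erroring drive (sdX is a placeholder):
smartctl -a /dev/sdX         # full SMART attributes, self-test results, error log
smartctl -l error /dev/sdX   # just the device error log
```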
Is there any way to save this pool, or is it toast, meaning I'll have to restore from backup (ZFS snapshots, replicated to a TrueNAS box dedicated solely to storing backups) once all the bad drives are replaced?
I haven’t touched it since getting that status report. I’m currently out of spares, but I happen to have two more 14 TB drives in transit to me now, and I can order more.
What should I do now, and then once I have enough replacement drives in (assuming it’s even possible to save this)?
Edited to correct minor errors.
