The story thus far:
I had a drive that started throwing errors and reporting SMART failures. I began backing up my data in preparation for replacing the drive. In the middle of the backup, the drive failed and became unreadable.
I took the drive offline, swapped in the new drive, and pulled out the failed drive, all pretty standard (I followed the manual).
The new drive began resilvering, and everything looked like it was going according to plan. Then it stalled at 26.29% and stayed there for 2 days. Last night there was a power outage. I brought the server back up and resilvering started again, but now I have a variety of issues.
The main issue: the pool that this disk is in no longer shows in the UI, although zpool status still shows it. Most of the data appears to be gone, and both the new disk and a previously functioning disk show errors. I’ve been resilvering all day today and have been stuck at 24.26% for about 9 hours.
  pool: vault
 state: DEGRADED
status: One or more devices is currently being resilvered. The pool will continue to function, possibly in a degraded state.
action: Wait for the resilver to complete.
  scan: resilver in progress since Thu Dec 5 10:20:36 2024
        14.9T / 42.6T scanned at 794M/s, 10.3T / 42.6T issued at 238M/s
        142G resilvered, 24.26% done, 1 days 15:23:06 to go
config:

        NAME                                        STATE     READ WRITE CKSUM
        vault                                       DEGRADED     0     0     0
          raidz1-0                                  DEGRADED 1.74K    29     0
            replacing-0                             DEGRADED 1.74K 62.6K   146
              59595a42-9afa-4254-b857-2b931256ceee  REMOVED      0     0     0
              e4b17417-7915-41f6-bb63-b2cb55ae494a  FAULTED      3   377     0  too many errors
            674d96e7-99cb-41b9-8106-209b07f2728e    ONLINE     275    44     0
            10e9f969-f8d9-4244-aaa1-f175b95869b4    ONLINE       0     0     0
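To make the error counters above easier to compare, here is a minimal Python sketch that picks out the devices with nonzero READ/WRITE/CKSUM values. The config rows are pasted in by hand from the output above, and the helper names (`parse_count`, `devices_with_errors`) are my own, not part of any ZFS tool. It assumes the "K" suffix can be read as roughly a thousand; zpool abbreviates and rounds these counters, so the numbers are only approximate.

```python
# Config rows copied from the zpool status output above.
STATUS_ROWS = """\
NAME STATE READ WRITE CKSUM
vault DEGRADED 0 0 0
raidz1-0 DEGRADED 1.74K 29 0
replacing-0 DEGRADED 1.74K 62.6K 146
59595a42-9afa-4254-b857-2b931256ceee REMOVED 0 0 0
e4b17417-7915-41f6-bb63-b2cb55ae494a FAULTED 3 377 0 too many errors
674d96e7-99cb-41b9-8106-209b07f2728e ONLINE 275 44 0
10e9f969-f8d9-4244-aaa1-f175b95869b4 ONLINE 0 0 0
"""

# Assumption: suffixes are read as decimal multipliers. zpool rounds the
# abbreviated values, so the result is only an approximation.
SUFFIXES = {"K": 1_000, "M": 1_000_000, "G": 1_000_000_000}


def parse_count(text):
    """Turn an abbreviated counter such as '1.74K' into an approximate int."""
    if text[-1] in SUFFIXES:
        return round(float(text[:-1]) * SUFFIXES[text[-1]])
    return int(text)


def devices_with_errors(rows):
    """Return (name, state, read, write, cksum) for rows with any nonzero counter."""
    found = []
    for line in rows.splitlines():
        fields = line.split()
        # Skip blank lines and the header row.
        if len(fields) < 5 or fields[0] == "NAME":
            continue
        name, state = fields[0], fields[1]
        # Only the first three counters matter; trailing notes are ignored.
        read, write, cksum = (parse_count(f) for f in fields[2:5])
        if read or write or cksum:
            found.append((name, state, read, write, cksum))
    return found


if __name__ == "__main__":
    for name, state, read, write, cksum in devices_with_errors(STATUS_ROWS):
        print(f"{name} {state} read={read} write={write} cksum={cksum}")
```

Run against the rows above, this flags the raidz1-0 vdev, the replacing-0 vdev, the new FAULTED disk, and the ONLINE disk ending in 728e, which is the previously healthy disk that now shows errors.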
The UI does not show the vault dataset at all. When I look at the devices, one of the original drives in the pool no longer shows its correct name (sde); it is listed as just a string of numbers and has a few hundred errors. The drive that is being added shows as FAULTED, also with a few hundred errors.
Further, the storage dashboard lists 2 unassigned disks. One of them is the disk from the original pool that is now displayed as numbers. The other is a disk that I was going to set up as a hot spare.
System details:
TrueNAS Version: Dragonfish-24.04.2
CPU: Ryzen 5 3600
64GB RAM
ASRock Rack X470D4U motherboard
LSI HBA card
Is this expected behavior? Is there anything I can do to recover from this?