Somehow all the disks inside my storage pool have been disconnected.
What happened was that a few days ago I got a few notifications about a degraded vdev, which happens once in a while, and I couldn't do anything about it at the time. Today I wanted to watch a movie with the kids and noticed I couldn't access it, so I went online and found that vdev1 had faulted and vdev2 had degraded. So I hit ONLINE on the disks and nothing happened, then I hit restart on TrueNAS SCALE, and when it booted back up I saw 19 disks available and my pool 'rust' offline.
If I run zpool status I can see my pools, except for the one in question, 'rust':
root@truenas[~]# zpool status
  pool: boot-pool
 state: ONLINE
  scan: scrub repaired 0B in 00:01:28 with 0 errors on Tue Jul 9 03:46:30 2024
config:

        NAME        STATE     READ WRITE CKSUM
        boot-pool   ONLINE       0     0     0
          sdg3      ONLINE       0     0     0

errors: No known data errors

  pool: lightning
 state: ONLINE
status: Some supported and requested features are not enabled on the pool.
        The pool can still be used, but some features are unavailable.
action: Enable all features using 'zpool upgrade'. Once this is done,
        the pool may no longer be accessible by software that does not support
        the features. See zpool-features(7) for details.
  scan: scrub repaired 0B in 00:23:05 with 0 errors on Sun Jun 23 00:23:12 2024
config:
What is the output of zpool import rust and zpool online rust?
Hardware list please, because that's a considerable number of drives you have... and I guess you are not using an HBA to connect them.
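For reference, these can be run from the TrueNAS shell. A quick sketch, assuming the pool is named rust and is currently exported; note that zpool online takes both a pool and a device, and only works on an imported pool (the <device> placeholder below is just illustrative):

zpool import                  # with no arguments, lists pools that are available for import
zpool import rust             # attempts the actual import
zpool status -v rust          # once imported, shows per-device state and errors
zpool online rust <device>    # tries to bring one specific faulted device back online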
root@truenas[~]# zpool import
   pool: rust
     id: 10465576058172428127
  state: UNAVAIL
 status: One or more devices contains corrupted data.
 action: The pool cannot be imported due to damaged devices or data.
    see: Message ID: ZFS-8000-5E - OpenZFS documentation
 config:
I would investigate how the drives are connected to the motherboard and to the PSU, since that many drives becoming unavailable points to a different kind of hardware failure than the drives themselves giving up.
And with the full output we can now make sense of the above summary. There are five failed drives in two vdevs, exceeding what the pool can cope with. Check the cables and power. Check whether the drives are spinning. If you cannot bring back at least one of the failed drives in raidz2-1, the pool is lost, and all your data with it.
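A minimal way to check that from the shell, assuming TrueNAS SCALE (device names like /dev/sdX are examples, adjust to your system):

lsblk -o NAME,MODEL,SERIAL,SIZE    # every block device the kernel can see, with model and serial
smartctl -i /dev/sdX               # identity info for one drive; fails if the drive does not respond
smartctl -H /dev/sdX               # quick SMART health verdict for that drive

A drive that has lost power or cabling simply will not show up in lsblk at all, which is the quickest way to tell a disconnected drive from one that ZFS has merely faulted.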
Alright, so it's the Norco backplane that's a bust. I rearranged the disks and now my zpool import looks like this. What do I do now?
root@truenas[~]# zpool import
   pool: rust
     id: 10465576058172428127
  state: ONLINE
 action: The pool can be imported using its name or numeric identifier.
 config:
19 unassigned drives out of 24 in this pool means that the 5 failed drives are not connected at all, so their SMART reports are not available. @Solen needs to map the drives to know what is where. To check the cables, as a loose or damaged cable could have taken out four drives in one go. To check whether the failed drives are spinning; moving them to different bays could be an option here.
And above all, to be thorough, because it seems there are multiple issues: at best, a failed cable and a failed drive; at worst, five old drives having reached end of life.
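A rough sketch of the next steps now that zpool import reports the pool as ONLINE, assuming the pool name rust (importing through the SCALE web UI, which should be under Storage → Import Pool, is the more TrueNAS-friendly route):

ls -l /dev/disk/by-id/ | grep -v part    # map serial-based IDs to sdX names, so you know which bay holds which drive
zpool import rust                        # or import from the web UI instead
zpool status -v rust                     # confirm all 24 drives are present and watch for resilvering
zpool scrub rust                         # scrub afterwards to verify every block is readable

Once the scrub is clean, long SMART self-tests (smartctl -t long) on the five drives that dropped out should tell you whether they are genuinely dying or were only victims of the backplane.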