Zpool import fails due to insufficient replicas (unavailable vdev, cannot open)

Hello all. Our file server’s motherboard and RAID controller were replaced with the same models because the RAID controller was overheating. Since the server’s OS was also problematic, the SAS drives were pulled from the server while it was shut down. The new server was able to rebuild the virtual drive from the 12 drives (40 TB), and the individual drives all show up as online (checked with Lenovo XClarity Provisioning Manager).

The problem comes when importing the zpool. Since it was not properly exported, the zpool command told us to use the -f option to force the import. However, zpool import -f ran into another issue, shown below.

root@truenas[~]# zpool import -f
  pool: WCI-NAS02-P
    id: 8273887505929077642
 state: UNAVAIL
status: The pool was last accessed by another system.
action: The pool cannot be imported due to damaged devices or data.
   see: https://openzfs.github.io/openzfs-docs/msg/ZFS-8000-EY
config: 
       WCI-NAS02-P                                                          UNAVAIL   insufficient replicas
          gptid/4f33427e-2602-11ee-9c97-0894ef6e5000   ONLINE
          gptid/4f65a6ea-2602-11ee-9c97-0894ef6e5000   UNAVAIL   cannot open

When we checked the /dev/gptid directory, this is the result:

root@truenas[~]# ls /dev/gptid/
0ba17ea-11f0-ad35-0090fa5f1dca  4f28c4e5-2602-11ee-9c97-0894ef6e5000 4f33427e-2602-11ee-9c97-0894ef6e5000

We can see that gptid/4f33427e-2602-11ee-9c97-0894ef6e5000 has a corresponding device node in /dev/gptid, but the UNAVAIL gptid that cannot be opened is missing. Instead there is a similar gptid, 4f28c4e5-2602-11ee-9c97-0894ef6e5000.

Right now the server is powered off. Is it possible that this is the same device, but its gptid somehow changed when the controller rebuilt the array? Can we also safely use the -d option to point at the gptid directory without risking the data? We are also considering hiring a data recovery firm if importing the pool turns out to be beyond us.
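For context, the commands we are thinking of trying look like this (a sketch only; the readonly option is our addition to make sure nothing gets written to the disks while we inspect the pool):

```shell
# List importable pools, searching only the gptid device nodes.
# This only scans for labels and does not modify anything.
zpool import -d /dev/gptid

# If the pool shows up, attempt a forced, read-only import so
# no writes ever reach the vdevs.
zpool import -o readonly=on -f -d /dev/gptid WCI-NAS02-P
```

As we understand it, -d only changes where ZFS looks for devices, so the listing step should be safe either way.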


I’m not an expert, but presenting a virtual disk to TrueNAS from a hardware RAID array is a recipe for disaster.

Not sure if even @HoneyBadger can help here.

You know the “oof size = large” meme? Yeah, I made that face.

@ceaguilar You have a ZFS stripe sitting on top of a RAID controller, which is a storage pathology known to be dangerous.

We’ve got a few layers to this particular onion.

An overheating RAID controller can absolutely corrupt data.

ZFS is expecting two “virtual drives” in its stripe configuration - did you build two virtual drives in the Lenovo RAID manager, or only one?

Let’s start with lsblk -o NAME,PARTUUID,LABEL to see what you’ve got.
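To spell that out (assuming SCALE; if you’re on TrueNAS CORE, which is FreeBSD-based and where /dev/gptid lives, lsblk won’t exist and the GEOM tools are the rough equivalent):

```shell
# TrueNAS SCALE (Linux): map device nodes to partition UUIDs and labels
lsblk -o NAME,PARTUUID,LABEL

# TrueNAS CORE (FreeBSD) equivalents:
glabel status          # shows gptid/... labels and the devices behind them
camcontrol devlist     # lists the physical SAS/SATA devices the OS can see
```

Comparing that output against the gptids in the zpool import listing should tell us whether the "missing" device is really gone or just relabeled by the controller.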