Not Able to Import Pool

Hello Everyone,

I am new to the TrueNAS community. I have been running TrueNAS SCALE EE for about 3 months using my old hard drives.

I had a pool “Prime_Pool” in a RAIDZ1 config with 5x1TB drives. About a week ago, one of my drives started having errors, so I replaced it with a new one and started the resilvering process. However, during resilvering the new drive also failed. After that I started getting an exported pool error, where 4 drives were showing up as an exported pool.

I read somewhere that you can export the pool and reimport it via the GUI and it would work. I tried that, and now I am not able to import the pool. I get an error saying one drive in the pool is unavailable, hence it can’t be imported.

I need help on how to solve this.

Following is the info I get when I try to import.

Thanks in advance

admin@truenas[~]$ sudo zpool import
[sudo] password for admin: 
  pool: Prime_Pool
    id: 1678850832080138960
 state: DEGRADED
status: One or more devices were being resilvered.
action: The pool can be imported despite missing or damaged devices.  The
        fault tolerance of the pool may be compromised if imported.
config:

        Prime_Pool                                  DEGRADED
          raidz1-0                                  DEGRADED
            d6dd8ab7-eb46-401b-8214-a8b9bf0980ae    ONLINE
            ec784e90-491c-4864-85d3-0284dc6ab8f9    ONLINE
            395ae291-83e1-4cd5-84c1-3440e8a60fe1    ONLINE
            b7baebb1-3f23-42a9-b83e-72a3a1722e38    UNAVAIL
            replacing-4                             DEGRADED
              11598372007724218881                  UNAVAIL
              237b256d-298c-4f0f-ba96-cdb67a3f2e98  ONLINE

If you have two unavail drives in a RAIDZ1 it’s over.

So the only way to salvage this is if you somehow manage to get one of those two drives to come back. Typically, if it’s the drive itself (a dodgy head or maybe the platters), your only avenue would be a clean room data recovery service, costing $ with many zeroes. A faulty circuit board is slightly less work, but still on the data recovery service level for the vast majority of people.

If you’re supremely lucky, it’s neither of those things, instead being due to a bad SATA/SAS or power cable or port on your motherboard/HBA.

There are people here who may be able to look into software-related reasons for the drives being unavailable, but it’s a community forum, so help is not guaranteed.
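
To make the "two unavail drives" point concrete, here is a minimal Python sketch that tallies the states of the direct children of raidz1-0. The config text is copied from the `zpool import` output in the first post; the helper names `parse_config` and `children_of` are made up for illustration, and the parsing assumes every line is exactly "name state" with nesting shown by indentation.

```python
from collections import Counter

# Config section copied from the `zpool import` output in the first post.
CONFIG = """\
        Prime_Pool                                  DEGRADED
          raidz1-0                                  DEGRADED
            d6dd8ab7-eb46-401b-8214-a8b9bf0980ae    ONLINE
            ec784e90-491c-4864-85d3-0284dc6ab8f9    ONLINE
            395ae291-83e1-4cd5-84c1-3440e8a60fe1    ONLINE
            b7baebb1-3f23-42a9-b83e-72a3a1722e38    UNAVAIL
            replacing-4                             DEGRADED
              11598372007724218881                  UNAVAIL
              237b256d-298c-4f0f-ba96-cdb67a3f2e98  ONLINE
"""


def parse_config(text):
    """Return (indent, name, state) for each non-empty line."""
    rows = []
    for line in text.splitlines():
        if not line.strip():
            continue
        indent = len(line) - len(line.lstrip())
        name, state = line.split()
        rows.append((indent, name, state))
    return rows


def children_of(rows, parent):
    """Direct children of the named vdev, judged purely by indentation."""
    idx = next(i for i, row in enumerate(rows) if row[1] == parent)
    parent_indent = rows[idx][0]
    child_indent = None
    kids = []
    for indent, name, state in rows[idx + 1:]:
        if indent <= parent_indent:
            break  # left the parent's subtree
        if child_indent is None:
            child_indent = indent  # first line below the parent sets the level
        if indent == child_indent:
            kids.append((name, state))
    return kids


rows = parse_config(CONFIG)
tally = Counter(state for _, state in children_of(rows, "raidz1-0"))
print(dict(tally))  # {'ONLINE': 3, 'UNAVAIL': 1, 'DEGRADED': 1}
```

So of the five RAIDZ1 members, three are healthy, one is gone, and one is a half-finished replacement. Whether that last one holds enough data to count as a working member depends on how far the resilver got, which this output alone does not show.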

It is unclear whether the resilver had completed or not. But the message "The pool can be imported despite missing or damaged devices. The fault tolerance of the pool may be compromised if imported.", together with partuuid 237b256d-298c-4f0f-ba96-cdb67a3f2e98 being shown as ONLINE, suggests that the resilver might have completed OK, and that the pool is only being left un-imported because, once imported, it would have no fault tolerance.

But it looks like you may need another replacement hard drive for another resilver.

First I would see if you can import the pool with:

  • sudo zpool import -R /mnt Prime_Pool first and if that doesn’t work…
  • sudo zpool import -f -R /mnt Prime_Pool and see if that does it.

If neither works, please post the output here inside a </> box.
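
If you would rather script the two attempts above, this is a rough Python sketch of the same sequence: plain import first, then with `-f`, keeping the error text so it can be pasted here. The function names `import_attempts` and `try_import` are invented for this example; the only `zpool` flags used are the `-R` and `-f` already shown above. The `run` parameter exists only so the logic can be exercised without touching a real pool.

```python
import subprocess


def import_attempts(pool, altroot="/mnt"):
    """The two commands suggested above, in order: without -f, then with -f."""
    base = ["sudo", "zpool", "import"]
    return [
        base + ["-R", altroot, pool],
        base + ["-f", "-R", altroot, pool],
    ]


def try_import(pool, run=subprocess.run):
    """Run each attempt in order and stop at the first one that exits 0.

    Returns (argv, result) for the last command that was run.
    """
    argv, result = None, None
    for argv in import_attempts(pool):
        result = run(argv, capture_output=True, text=True)
        if result.returncode == 0:
            break
        # Print the failure so it can be copied into a forum reply.
        print(" ".join(argv), "failed:")
        print(result.stderr)
    return argv, result
```

Calling `try_import("Prime_Pool")` from the TrueNAS shell would run the real commands (and prompt for the sudo password), so only do that when you actually intend to import.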

To me it looks like the resilver was still in progress. If so, adding another drive will not change anything about the current situation.

Yes - I missed that. The pool is probably toast.

BUT, there probably isn’t any harm in attempting to import the pool to see whether the resilver can continue.

P.S. Someone on Reddit said that a pool of 2 vdevs, each a 3-wide RAIDZ1, was as fault tolerant as a single RAIDZ2 vdev. I showed that if 2 drives were to fail simultaneously, the first layout had a 40% chance of the pool being toast, so the second was much more fault tolerant. He said no one ever gets 2 drives failing, and I pointed out that OpenZFS wouldn’t have RAIDZ2 & Z3 if it never happened, and that there were lots of documented examples. So he called me a loud-mouthed idiot (LoL) - definitely not the latter. And yet, here we are on the very same day with another real example showing that RAIDZ1 can have a second drive fail during a resilver.
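
For anyone who wants to check the 40% figure, here is a small Python sketch that enumerates every possible pair of failed drives. It assumes both failures are equally likely to hit any drive, and it assumes a 6-wide RAIDZ2 for the comparison (the width was not stated; any width gives the same answer for two failures). The function name `two_failure_loss` is made up for this example.

```python
from collections import Counter
from fractions import Fraction
from itertools import combinations


def two_failure_loss(widths, parity):
    """Fraction of two-drive failure pairs that lose the pool.

    widths: number of drives in each vdev, e.g. [3, 3]
    parity: failed drives each vdev can tolerate (1 for RAIDZ1, 2 for RAIDZ2)
    """
    # One entry per drive, holding the index of the vdev it belongs to.
    drives = [v for v, width in enumerate(widths) for _ in range(width)]
    pairs = list(combinations(range(len(drives)), 2))
    # The pool is lost if any single vdev has more failures than its parity.
    lost = sum(
        1
        for pair in pairs
        if max(Counter(drives[i] for i in pair).values()) > parity
    )
    return Fraction(lost, len(pairs))


print(two_failure_loss([3, 3], parity=1))  # 2/5
print(two_failure_loss([6], parity=2))     # 0
```

With 6 drives there are 15 possible pairs, and 6 of them land both failures in the same 3-wide RAIDZ1 vdev, hence 6/15 = 40%.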