Nervous to re-add existing drives to pool

I had a drive failure, so I went and offline’d the problem drive. Due to how my drives are stacked, I couldn’t pinpoint which physical drive it was, so I decided to write down the serial and power the machine down to inspect each drive to find the correct one.

After finding and replacing the problem drive, I booted the system back up and was met with none of the drives (even the 3 that were still good) being assigned to the pool.

When I look at the drives themselves I see that the 3 good drives say their pool is “gemini (exported)”. Gemini is the pool they SHOULD be in, but it sounds like they were disconnected at some point?

Is it safe to just add all 4 drives into the pool and try to let it rebuild? Not sure what the best option is.

Thanks in advance!

Daniel

For some reason it won’t let me upload images or include links to Google Drive images… Otherwise I’d show them to help.

So you offlined the offending drive, pulled it, swapped in a good drive, and now your box acts like that pool is a stranger? You didn’t also export the pool by chance?

You should see the other 3 drives in that pool, and the UI should let you import it with a big fat button. Each drive stores pool data and TrueNAS polls them for that. It just seems weird that the pool got yeeted somehow. Usually the pool shows up unhealthy on the next boot, and your new drive shows up as one you can add to that pool to start the resilver.

Ninja edit when you replied: I just copy-paste into this editor, but I think you can’t do that on day one if you are brand new.

Do NOT add the drives to the pool - in fact do not take any actions on the basis of hoping it will fix things because it could make them worse - only take actions that you are certain will help or that you have been advised to take by someone more expert.

We should hopefully be able to help you to import your existing pool in the degraded state and then you can resilver the new drive.

Please run the following diagnostic commands and post the output of each into a separate </> box:

  • lsblk -bo NAME,LABEL,MAJ:MIN,TRAN,ROTA,ZONED,VENDOR,MODEL,SERIAL,PARTUUID,START,SIZE,PARTTYPENAME
  • sudo zpool status -v
  • sudo zpool import
  • lspci
  • sudo sas2flash -list
  • sudo sas3flash -list
  • sudo storcli show all
  • sudo zpool import gemini
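
To make the output easier to paste in one go, the listing commands above could be wrapped in a small helper. This is only a sketch: `run_diag` and the `diag.txt` file name are made-up names, not part of TrueNAS. The final `zpool import gemini` is deliberately left out of the example, since it is the one command in the list that tries to change state rather than just report it.

```shell
#!/bin/sh
# run_diag (hypothetical helper): print a header line, then run one command
# and show its output. A missing tool is noted and skipped so the remaining
# commands still run.
run_diag() {
    printf '===== %s =====\n' "$*"
    if command -v "$1" >/dev/null 2>&1; then
        "$@" 2>&1 || printf '(exit status %s)\n' "$?"
    else
        printf '(%s not found, skipped)\n' "$1"
    fi
    printf '\n'
}

# Example use, collecting only the listing commands into one file:
#   {
#     run_diag lsblk -bo NAME,LABEL,MAJ:MIN,TRAN,ROTA,ZONED,VENDOR,MODEL,SERIAL,PARTUUID,START,SIZE,PARTTYPENAME
#     run_diag sudo zpool status -v
#     run_diag sudo zpool import
#     run_diag lspci
#   } > diag.txt
```

Each section header makes it clear which command produced which output, and a failing command is recorded with its exit status instead of stopping the run.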

P.S. @afrosheen - the issue with your suggestion is that the UI does not always reflect the actual state of pools in Linux and this can introduce complexities to diagnosis and fix that using the CLI avoids. IME it is better first to use the CLI to diagnose and fix issues, and then afterwards do what is necessary to make the UI align to Linux.

Follow the tutorial by the forum bot; you should have received a mail about that.
Whenever possible, please paste the output from a SSH session (or web shell) as formatted text (</>) rather than images. Links to externally hosted images may NOT be followed (we’re willing to voluntarily help… but up to a point).

The solution should be to import your pool again. But if it didn’t automatically import on reboot, you may have a bigger issue.

I try with the UI first and if that does not work I fall back to the command line.

But do not forget to reconcile the TrueNAS config with the underlying system.

Ouch. Yeah, I can’t disagree, just trying to wrap up the problem and how the system SHOULD be acting, not that he should take any action.

Usually when you have a failing drive that’s making your pool sick, you choose the bad disk and offline it, shut down, replace the drive, and reboot, right? You don’t expect TrueNAS to forget your pool and say “oh wow look at all these disks I’ve never seen before”.

Wow! More responses than I expected! Thanks everyone!

Here are the results as requested by @Protopia

dajack05@truenas:/$ lsblk -bo NAME,LABEL,MAJ:MIN,TRAN,ROTA,ZONED,VENDOR,MODEL,SERIAL,PARTUUID,START,SIZE,PARTTYPENAME
NAME LABEL MAJ:MIN TRAN   ROTA ZONED VENDOR MODEL SERIAL PARTUUID                                START          SIZE PARTTYPENAME
sda          8:0   sata      1 none  ATA    Hitac MS7921                                               3000592982016
├─sda1
│            8:1             1 none                      12905df8-e239-4fd5-be65-9bd306456247      128    2147418624 Linux swap
└─sda2
     Gemini
             8:2             1 none                      34e98517-c6ee-41e9-94c6-24e85323f7a5  4194432 2998445415936 Solaris /usr & Apple ZFS
sdb          8:16  sata      1 none  ATA    Hitac MS7921                                               3000592982016
├─sdb1
│            8:17            1 none                      7a99159e-57bf-467c-a9c4-b0cdd07a3c9e      128    2147418624 Linux swap
└─sdb2
     Gemini
             8:18            1 none                      7bcf1aa4-2048-43de-8434-5b04376cf88a  4194432 2998445415936 Solaris /usr & Apple ZFS
sdc          8:32  sata      1 none  ATA    Hitac MS7921                                               3000592982016
├─sdc1
│            8:33            1 none                      8ca3f9af-193d-46dd-82cd-df2ce116509f      128    2147418624 Linux swap
└─sdc2
     Gemini
             8:34            1 none                      37c67f00-2629-43a4-bc80-b6a46169998d  4194432 2998445415936 Solaris /usr & Apple ZFS
sdd          8:48  sata      1 none  ATA    ST400 ZFN1V0                                               4000787030016
sr0         11:0   sata      1 none  HL-DT- HL-DT KZLI1U                                                  1073741312
nvme0n1
           259:0   nvme      0 none         Samsu S64CNJ                                                250059350016
├─nvme0n1p1
│          259:1   nvme      0 none                      cd876572-88d2-47d9-b97c-42f19ca71b0f     4096       1048576 BIOS boot
├─nvme0n1p2
│    EFI   259:2   nvme      0 none                      1c9e91da-ed0c-4025-bcf7-f657d723d3c1     6144     536870912 EFI System
├─nvme0n1p3
│    boot-pool
│          259:3   nvme      0 none                      7708dfaf-41df-4247-90c8-f2d3b274b0cb 34609152  232339447296 Solaris /usr & Apple ZFS
└─nvme0n1p4
           259:4   nvme      0 none                      81409201-0d9d-41b7-a3c6-665fa28dd6d8  1054720   17179869184 Linux swap
  └─nvme0n1p4
           253:0             0 none                                                                      17179869184
dajack05@truenas:/$ sudo zpool status -v
  pool: boot-pool
 state: ONLINE
status: Some supported and requested features are not enabled on the pool.
	The pool can still be used, but some features are unavailable.
action: Enable all features using 'zpool upgrade'. Once this is done,
	the pool may no longer be accessible by software that does not support
	the features. See zpool-features(7) for details.
  scan: scrub repaired 0B in 00:00:14 with 0 errors on Thu Jun 12 03:45:15 2025
config:

	NAME         STATE     READ WRITE CKSUM
	boot-pool    ONLINE       0     0     0
	  nvme0n1p3  ONLINE       0     0     0

errors: No known data errors

The following one makes me wonder if I removed the wrong drive?

dajack05@truenas:/$ sudo zpool import
   pool: Gemini
     id: 11268133550315815319
  state: UNAVAIL
status: One or more devices contains corrupted data.
 action: The pool cannot be imported due to damaged devices or data.
   see: https://openzfs.github.io/openzfs-docs/msg/ZFS-8000-5E
 config:

	Gemini                                    UNAVAIL  insufficient replicas
	  raidz1-0                                UNAVAIL  insufficient replicas
	    34e98517-c6ee-41e9-94c6-24e85323f7a5  ONLINE
	    7bcf1aa4-2048-43de-8434-5b04376cf88a  ONLINE
	    8d45e385-ba59-493c-b7ff-5eebe54a64e4  UNAVAIL
	    37c67f00-2629-43a4-bc80-b6a46169998d  OFFLINE
dajack05@truenas:/$ lspci
00:00.0 Host bridge: Intel Corporation Xeon E3-1200 v6/7th Gen Core Processor Host Bridge/DRAM Registers (rev 05)
00:02.0 VGA compatible controller: Intel Corporation HD Graphics 630 (rev 04)
00:14.0 USB controller: Intel Corporation 200 Series/Z370 Chipset Family USB 3.0 xHCI Controller
00:14.2 Signal processing controller: Intel Corporation 200 Series PCH Thermal Subsystem
00:16.0 Communication controller: Intel Corporation 200 Series PCH CSME HECI #1
00:17.0 SATA controller: Intel Corporation 200 Series PCH SATA controller [AHCI mode]
00:1b.0 PCI bridge: Intel Corporation 200 Series PCH PCI Express Root Port #21 (rev f0)
00:1c.0 PCI bridge: Intel Corporation 200 Series PCH PCI Express Root Port #5 (rev f0)
00:1c.6 PCI bridge: Intel Corporation 200 Series PCH PCI Express Root Port #7 (rev f0)
00:1f.0 ISA bridge: Intel Corporation 200 Series PCH LPC Controller (B250)
00:1f.2 Memory controller: Intel Corporation 200 Series/Z370 Chipset Family Power Management Controller
00:1f.3 Audio device: Intel Corporation 200 Series PCH HD Audio
00:1f.4 SMBus: Intel Corporation 200 Series/Z370 Chipset Family SMBus Controller
01:00.0 Non-Volatile memory controller: Samsung Electronics Co Ltd NVMe SSD Controller 980
02:00.0 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL8111/8168/8411 PCI Express Gigabit Ethernet Controller (rev 15)
03:00.0 SATA controller: Marvell Technology Group Ltd. 88SE9215 PCIe 2.0 x1 4-port SATA 6 Gb/s Controller (rev 11)
dajack05@truenas:/$ sudo sas2flash -list
LSI Corporation SAS2 Flash Utility
Version 20.00.00.00 (2014.09.18)
Copyright (c) 2008-2014 LSI Corporation. All rights reserved

	No LSI SAS adapters found! Limited Command Set Available!
	ERROR: Command Not allowed without an adapter!
	ERROR: Couldn't Create Command -list
	Exiting Program.
dajack05@truenas:/$ sudo sas3flash -list
Avago Technologies SAS3 Flash Utility
Version 16.00.00.00 (2017.05.02)
Copyright 2008-2017 Avago Technologies. All rights reserved.

	No Avago SAS adapters found! Limited Command Set Available!
	ERROR: Command Not allowed without an adapter!
	ERROR: Couldn't Create Command -list
	Exiting Program.
dajack05@truenas:/$ sudo storcli show all
CLI Version = 007.1504.0000.0000 June 22, 2020
Operating system = Linux 6.1.74-production+truenas
Status Code = 0
Status = Success
Description = None

Number of Controllers = 0
Host Name = truenas
Operating System  = Linux 6.1.74-production+truenas
dajack05@truenas:/$ sudo zpool import gemini
cannot import 'gemini': no such pool available

Thanks everyone!
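
A side note on the last command: ZFS pool names are case-sensitive, and the `zpool import` listing shows the pool as `Gemini`, while the import was attempted as `gemini`. That would explain the "no such pool available" message. A minimal sketch for looking up the exact spelling from the listing follows; `pool_name_from_import` is a made-up helper name. Even with the exact name, an import would most likely still be refused while the raidz1 vdev reports insufficient replicas.

```shell
#!/bin/sh
# pool_name_from_import (hypothetical helper): read `zpool import` listing
# text on stdin and print the exact pool name that matches "$1" when case
# is ignored.
pool_name_from_import() {
    awk -v want="$1" '$1 == "pool:" && tolower($2) == tolower(want) { print $2 }'
}

# Example use on a live system:
#   sudo zpool import | pool_name_from_import gemini
```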

Could well be.
Re-check these serial numbers, plug everything back in, and check the cabling.

If it was the wrong drive, then hopefully when you take out the correct one and put the other one back in, it will all burst back to life.

Like Protopia mentioned, if you accidentally removed the wrong drive after setting the bad drive to offline, then the pool thinks it has 2 failed/missing drives. Check the serial numbers more closely, and if you pulled the wrong drive (it happens), fix things by putting back the good drive that was pulled by accident and replacing the bad drive with the new replacement drive.

That should get things back to where they should be.
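
The posted output can be used to check this. The sketch below compares the member PARTUUIDs from the `zpool import` listing with the PARTUUIDs that `lsblk` reported for sda2, sdb2 and sdc2. The values are copied from this thread; `missing_members` is a made-up helper name.

```shell
#!/bin/sh
# PARTUUIDs the pool expects, copied from the `zpool import` listing.
expected="34e98517-c6ee-41e9-94c6-24e85323f7a5
7bcf1aa4-2048-43de-8434-5b04376cf88a
8d45e385-ba59-493c-b7ff-5eebe54a64e4
37c67f00-2629-43a4-bc80-b6a46169998d"

# PARTUUIDs of the ZFS partitions actually present (sda2, sdb2, sdc2),
# copied from the lsblk output. On a live system this list could come from:
#   lsblk -rno PARTUUID
present="34e98517-c6ee-41e9-94c6-24e85323f7a5
7bcf1aa4-2048-43de-8434-5b04376cf88a
37c67f00-2629-43a4-bc80-b6a46169998d"

# missing_members (hypothetical helper): print each expected PARTUUID that
# does not appear as a whole line in the list of present PARTUUIDs.
missing_members() {
    for uuid in $1; do
        printf '%s\n' "$2" | grep -qxF "$uuid" || printf '%s\n' "$uuid"
    done
}

missing_members "$expected" "$present"
# prints: 8d45e385-ba59-493c-b7ff-5eebe54a64e4
```

The only member not found is the one listed as UNAVAIL, while the member marked OFFLINE (37c67f00-...) is still installed as sdc2. That is consistent with a healthy drive having been pulled while the offlined drive stayed in the machine.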

Main thing is to not panic and don’t blindly follow any AI advice or other advice found on the web.

So I reconnected all the original drives as before (I also made sure to write down model and serial numbers).

After booting, the pool came back online. I reenabled the drive that I had disabled before and it is resilvering now.

I’m going to keep an eye on it and see if it starts kicking out errors again.
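
For keeping an eye on it from the shell, one option is to filter `zpool status` for rows with non-zero error counters. A minimal sketch, assuming the usual NAME / STATE / READ / WRITE / CKSUM column layout; `nonzero_errors` is a made-up helper name.

```shell
#!/bin/sh
# nonzero_errors (hypothetical helper): read `zpool status` text on stdin and
# print the name and counters of every config row whose READ, WRITE or CKSUM
# value is anything other than 0.
nonzero_errors() {
    awk 'NF >= 5 &&
         $2 ~ /^(ONLINE|DEGRADED|FAULTED|OFFLINE|UNAVAIL|REMOVED)$/ &&
         ($3 != "0" || $4 != "0" || $5 != "0") { print $1, $3, $4, $5 }'
}

# Example use on a live system:
#   sudo zpool status -v Gemini | nonzero_errors
```

No output means every listed device currently shows zero read, write and checksum errors. The full `zpool status -v` output is still the place to check resilver progress.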

I was gonna say, after seeing that drive list for the pool, it is obvious now what happened, and it explains the inconsistent behavior vs. the expected behavior.

Now once this is all done, do yourself a favor. Get a Dymo or whatever labeler. Shut your box down and label every single drive with its serial number so you know, at a glance, which drive is which. Also spend a minute making a spreadsheet with placement and serials. So if sdb goes down, you can point to the drive map, pop the case and grab the offending drive.

Redundant, but please label the drives in a place that doesn’t require much effort beyond popping the case lid or wall to read them. Even a diagram like the ones electricians make for your breaker box switches, taped to the inside of the case, is better than nothing. Take a photo, print it and write on it if you must. Anything is better than guessing.
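
As a starting point for that spreadsheet, the device names and serials can be dumped to CSV with an empty column for the physical bay. A minimal sketch; `drive_map_csv` and the `drive-map.csv` file name are made up, and the bay column still has to be filled in by hand while looking at the case.

```shell
#!/bin/sh
# drive_map_csv (hypothetical helper): read "NAME SERIAL" pairs on stdin, one
# disk per line, and print CSV rows with an empty "bay" column to fill in
# by hand.
drive_map_csv() {
    printf 'bay,device,serial\n'
    while read -r name serial; do
        if [ -n "$name" ]; then
            printf ',%s,%s\n' "$name" "$serial"
        fi
    done
}

# Example use on a live system (whole disks only, no header row):
#   lsblk -dno NAME,SERIAL | drive_map_csv > drive-map.csv
```

Names like sdb can change between boots, so the serial is the column to match against the label on the drive.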