Cannot import 'pool_a': insufficient replicas = scan: resilvered 156M

Sorry for what is probably a stupid question, but I am a beginner with TrueNAS and I am learning hard :slight_smile: I will write out all the commands I executed - maybe it will be useful to someone in a similar situation.

After a normal restart, my TrueNAS ElectricEel-24.10-RC.2 lost one pool called pool_a, and all three of its disks now show the status pool_a (Exported). I checked the status of the disks themselves and they seem to be OK:

lsblk
NAME        MAJ:MIN RM   SIZE RO TYPE MOUNTPOINTS
sda           8:0    0  10.9T  0 disk 
├─sda1        8:1    0     2G  0 part 
└─sda2        8:2    0  10.9T  0 part 
sdb           8:16   0  10.9T  0 disk 
├─sdb1        8:17   0     2G  0 part 
└─sdb2        8:18   0  10.9T  0 part 
sdc           8:32   0  10.9T  0 disk 
├─sdc1        8:33   0     2G  0 part 
└─sdc2        8:34   0  10.9T  0 part 
sdd           8:48   1   7.2G  0 disk 
nvme0n1     259:0    0 238.5G  0 disk 
├─nvme0n1p1 259:1    0     1M  0 part 
├─nvme0n1p2 259:2    0   512M  0 part 
├─nvme0n1p3 259:3    0   222G  0 part 
└─nvme0n1p4 259:4    0    16G  0 part 

I checked the disks and they all look as good as the one below - I hope:

sudo fdisk -l
Disk /dev/sda: 10.91 TiB, 12000138625024 bytes, 23437770752 sectors
Disk model: ST12000NE0008-2P
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes
Disklabel type: gpt
Disk identifier: 3BDF6087-7854-446B-84E8-9253EC5B7381

Device       Start         End     Sectors  Size Type
/dev/sda1      128     4194304     4194177    2G Linux swap
/dev/sda2  4194432 23437770718 23433576287 10.9T Solaris /usr & Apple ZFS

I tried to import the pool but I got an error message:

sudo zpool import pool_a
cannot import 'pool_a': insufficient replicas
	Destroy and re-create the pool from
	a backup source.

After that I imported the pool in read-only mode and all the pools and vdevs showed up in the web UI, but the pool itself did not mount under /mnt:

sudo zpool import -o readonly=on pool_a
cannot mount '/pool_a': failed to create mountpoint: Read-only file system
Import was successful, but unable to mount some datasets

After that I ran another status query and got a list of errors:

sudo zpool status -v
pool: pool_a
 state: ONLINE
status: One or more devices has experienced an error resulting in data
	corruption.  Applications may be affected.
action: Restore the file in question if possible.  Otherwise restore the
	entire pool from backup.
   see: https://openzfs.github.io/openzfs-docs/msg/ZFS-8000-8A
  scan: resilvered 156M in 00:01:28 with 592 errors on Mon Oct  7 17:55:18 2024
config:

	NAME                                      STATE     READ WRITE CKSUM
	pool_a                                    ONLINE       0     0     0
	  raidz1-0                                ONLINE       0     0     0
	    2fc128e1-b472-42ea-8a56-715eb5305916  ONLINE       0     0     0
	    39d99900-6aa3-4c05-862e-73f24438a182  ONLINE       0     0     8
	    d370c168-7b3b-4c12-9421-af4dd222fc09  ONLINE       0     0     8

errors: Permanent errors have been detected in the following files:

        <metadata>:<0x0>
        <metadata>:<0x475>
        <metadata>:<0x38e>
        pool_a/reolink:/CAM/outdoor/2024/10/07/RLC-833A_00_20241007174308.mp4
        pool_a/VM/HAOS:<0x1>

… and now unfortunately I don't know how to remove or repair this data, because the pool is not mounted. A scrub does nothing either - is that because the pool is not mounted under /mnt?

Can I still fix it, or did I lose my one and only copy of the data??
The situation is very sad because I haven't had any replications / backups for a long time :frowning:

It is entirely unclear to me whether this pool is online or not. The zpool status says it is, and that it was recently resilvered (yesterday), but it has errors.

With the very limited information available (i.e. no SMART details to say what state the actual disks are in, no details of the mount points /pool_a vs. /mnt/pool_a, no comparison of what the command line reports vs. what the UI shows, etc.) and the difficulty of understanding the actual status, it is hard to know what actions would be sensible.
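
For completeness, the missing SMART details can be gathered without any risk to the pool - these are read-only diagnostics (a sketch; the device names are taken from the lsblk output above and may differ):

# SMART health, attributes and self-test log for each data disk
sudo smartctl -a /dev/sda
sudo smartctl -a /dev/sdb
sudo smartctl -a /dev/sdc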

The only advice I can give at this point is NOT to try a bunch of random commands to bring it back online. At this stage it looks like most of the data is recoverable except one camera video and (possibly more of a problem) what looks like a VM zvol. Doing the wrong thing might make this completely irrecoverable.

But in the end the pool will most likely need to be rebuilt from scratch to get rid of the metadata errors, so resign yourself to this and find some hardware you can hopefully copy your data onto.

And promise yourself that you will set up SMART tests, snapshots and backups when it is all back working again.
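
When you do get back to that point, the building blocks are simple - in TrueNAS these would normally be Periodic Snapshot and Replication tasks in the UI, but as a command-line sketch (assuming a healthy pool_a and an empty target dataset on a backup pool; the snapshot name and target are examples only):

# Take a recursive snapshot of everything in pool_a, then replicate it to pool_b
sudo zfs snapshot -r pool_a@manual-2024-10-12
sudo zfs send -R pool_a@manual-2024-10-12 | sudo zfs receive -u pool_b/pool_a-copy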


I have checked the SMART tests on all the disks and they all show Completed without error.
Should I force-import the pool in this situation?

sudo zpool import -f pool_a

then check the condition of the pool and disks:

sudo zpool status -v

and if they are OK, run a scrub?

sudo zpool scrub pool_a

I can’t think of anything else that would make sense to do?

I don’t care about the camera recordings, and losing a few other individual files would be bearable too.
Yeeeees - I promise to dutifully make backups :-))) if there is anything left to back up :-/

“Status” last showed the pool was online. If so, copy all valuable data off it and then destroy the pool.
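
As a sketch of what that copy could look like once pool_a is safely imported read-only and mounted somewhere under /mnt (the source and target paths are examples based on your dataset names, not something to run yet):

# Copy one dataset's files onto the backup pool, preserving permissions, ACLs and xattrs
sudo rsync -aHAX --info=progress2 /mnt/pool_a/reolink/ /mnt/pool_b/reolink/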

I will do that, but since the pool is not mounted, I should probably mount it first?

sudo zpool import -f pool_a

Unfortunately, the force import does nothing.

I went back over what happened before the pool disappeared:
before restarting I confirmed the update 24.10-BETA.1 → ElectricEel-24.10-RC.2, and I also performed the zpool upgrade that was offered. Everything worked fine until the restart.

At the moment, even though the pool is exported (and disconnected?), the UI still shows a prompt offering a pool UPGRADE - should I do it?

I have a question - if the disks claim that pool_a is Exported, then why does the menu still offer the option to export this pool? Is some of this information not true?


If I understand correctly, doing an export will not change the current state of pool_a? The data, if not already destroyed, will remain on the disks?

Do NOT do a pool upgrade at the moment. You don’t want to change anything that might reduce the chances of getting this back online.

Do not do an export - just in case that makes it worse.

We need to establish whether the pool is currently imported and offline or exported - probably imported.
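
A low-risk way to check (a sketch; both commands are read-only queries and change nothing):

sudo zpool list      # lists the pools that are currently imported
sudo zpool import    # with no arguments it only lists exported pools that could be imported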

Your previous sudo zpool status results showed that the pool was online.

Please run sudo zpool status -v again and copy and paste the results.

When you say that sudo zpool import -f pool_a does nothing, do you mean that there is literally zero output from the command, or is “nothing” your summary that it didn’t work?

Oh, wait a moment - let me go over it one more time: right now, after a normal restart, the TrueNAS web UI shows pool_a as missing and exported. From the command line I get this:

sudo zpool status -v

  pool: boot-pool
 state: ONLINE
status: Some supported and requested features are not enabled on the pool.
        The pool can still be used, but some features are unavailable.
action: Enable all features using 'zpool upgrade'. Once this is done,
        the pool may no longer be accessible by software that does not support
        the features. See zpool-features(7) for details.
  scan: scrub repaired 0B in 00:01:01 with 0 errors on Wed Oct  9 03:46:03 2024
config:

        NAME         STATE     READ WRITE CKSUM
        boot-pool    ONLINE       0     0     0
          nvme0n1p3  ONLINE       0     0     0

errors: No known data errors

If I try to import the lost pool_a, then TrueNAS shows me this message:

sudo zpool import pool_a

cannot import 'pool_a': insufficient replicas
	Destroy and re-create the pool from
	a backup source.

For testing purposes I tried to import pool_a in read-only mode (until the next restart):

sudo zpool import -o readonly=on pool_a

cannot mount '/pool_a': failed to create mountpoint: Read-only file system
Import was successful, but unable to mount some datasets



sudo zpool status -v

pool: pool_a
 state: ONLINE
status: One or more devices has experienced an error resulting in data
	corruption.  Applications may be affected.
action: Restore the file in question if possible.  Otherwise restore the
	entire pool from backup.
   see: https://openzfs.github.io/openzfs-docs/msg/ZFS-8000-8A
  scan: resilvered 156M in 00:01:28 with 592 errors on Mon Oct  7 17:55:18 2024
config:

	NAME                                      STATE     READ WRITE CKSUM
	pool_a                                    ONLINE       0     0     0
	  raidz1-0                                ONLINE       0     0     0
	    2fc128e1-b472-42ea-8a56-715eb5305916  ONLINE       0     0     0
	    39d99900-6aa3-4c05-862e-73f24438a182  ONLINE       0     0     8
	    d370c168-7b3b-4c12-9421-af4dd222fc09  ONLINE       0     0     8

errors: Permanent errors have been detected in the following files:

        <metadata>:<0x0>
        <metadata>:<0x475>
        <metadata>:<0x38e>
        pool_a/reolink:/CAM/outdoor/2024/10/07/RLC-833A_00_20241007174308.mp4
        pool_a/VM/HAOS:<0x1>

But unfortunately pool_a DID NOT MOUNT in /mnt, so even though I can see all the datasets in the TrueNAS web UI, I do not have access to them. For the same reason I cannot run a scrub or even delete the damaged files.

Force import says the same as a normal import:

root@truenas[~]# sudo zpool import -f pool_a

cannot import 'pool_a': insufficient replicas
        Destroy and re-create the pool from
        a backup source.

Thank you very much for your help. I keep reading about cases like this in different places, but I have the impression that I am going in circles.

Please stop attempting zpool import because zpool status says it is imported and online.

We have two issues:

  1. Getting you access to your existing data - you say yourself it is not mounted in /mnt, so that is what we need to work on. Once we have that, then we can move on to …
  2. Resolving the metadata and file errors.

Let’s focus on the first of these…

Please run sudo zfs list and copy and paste the results.
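
If pool_a shows up in that list, it may also be worth looking at the mount-related properties (again a read-only query, a sketch only):

sudo zfs get -r mountpoint,mounted,canmount pool_a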

here it is:

root@truenas[~]#  sudo zfs list
NAME                                                         USED  AVAIL  REFER  MOUNTPOINT
boot-pool                                                   48.1G   165G    96K  none
boot-pool/.system                                            222M   165G   120K  legacy
boot-pool/.system/configs-782595d4363048b99e75310cc20adc4f   600K   165G   600K  legacy
boot-pool/.system/cores                                      120K  1024M   120K  legacy
boot-pool/.system/ctdb_shared_vol                             96K   165G    96K  legacy
boot-pool/.system/glusterd                                    96K   165G    96K  legacy
boot-pool/.system/netdata-782595d4363048b99e75310cc20adc4f   205M   165G   205M  legacy
boot-pool/.system/nfs                                        128K   165G   128K  legacy
boot-pool/.system/rrd-782595d4363048b99e75310cc20adc4f      11.5M   165G  11.5M  legacy
boot-pool/.system/samba4                                    1.71M   165G   264K  legacy
boot-pool/.system/services                                    96K   165G    96K  legacy
boot-pool/.system/syslog-782595d4363048b99e75310cc20adc4f   2.16M   165G  2.16M  legacy
boot-pool/.system/webui                                       96K   165G    96K  legacy
boot-pool/ROOT                                              47.9G   165G    96K  none
boot-pool/ROOT/22.02.1                                      4.09G   165G  4.09G  legacy
boot-pool/ROOT/22.02.2                                      2.61G   165G  2.61G  legacy
boot-pool/ROOT/22.02.2.1                                    2.61G   165G  2.61G  legacy
boot-pool/ROOT/22.02.3                                      2.66G   165G  2.66G  legacy
boot-pool/ROOT/22.02.4                                      2.67G   165G  2.67G  legacy
boot-pool/ROOT/22.02.RELEASE                                2.62G   165G  2.61G  legacy
boot-pool/ROOT/22.12.2                                      2.63G   165G  2.63G  legacy
boot-pool/ROOT/22.12.3                                      2.65G   165G  2.65G  legacy
boot-pool/ROOT/22.12.3.1                                    2.65G   165G  2.65G  legacy
boot-pool/ROOT/22.12.3.2                                    2.65G   165G  2.65G  legacy
boot-pool/ROOT/22.12.3.3                                    2.65G   165G  2.65G  legacy
boot-pool/ROOT/22.12.4                                      2.65G   165G  2.65G  legacy
boot-pool/ROOT/22.12.4.1                                    2.65G   165G  2.65G  legacy
boot-pool/ROOT/22.12.4.2                                    2.65G   165G  2.65G  legacy
boot-pool/ROOT/23.10.2                                      2.33G   165G  2.33G  legacy
boot-pool/ROOT/24.04.2.2                                    2.39G   165G   164M  legacy
boot-pool/ROOT/24.04.2.2/audit                               208K   165G   208K  /audit
boot-pool/ROOT/24.04.2.2/conf                                140K   165G   140K  /conf
boot-pool/ROOT/24.04.2.2/data                                152K   165G   324K  /data
boot-pool/ROOT/24.04.2.2/etc                                6.59M   165G  5.66M  /etc
boot-pool/ROOT/24.04.2.2/home                                  0B   165G   124K  /home
boot-pool/ROOT/24.04.2.2/mnt                                 104K   165G   104K  /mnt
boot-pool/ROOT/24.04.2.2/opt                                74.1M   165G  74.1M  /opt
boot-pool/ROOT/24.04.2.2/root                                  8K   165G   980K  /root
boot-pool/ROOT/24.04.2.2/usr                                2.12G   165G  2.12G  /usr
boot-pool/ROOT/24.04.2.2/var                                31.8M   165G  31.0M  /var
boot-pool/ROOT/24.04.2.2/var/ca-certificates                  96K   165G    96K  /var/local/ca-certificates
boot-pool/ROOT/24.04.2.2/var/log                             508K   165G  4.03M  /var/log
boot-pool/ROOT/24.10-BETA.1                                 2.27G   165G   165M  legacy
boot-pool/ROOT/24.10-BETA.1/audit                           8.03M   165G  8.60M  /audit
boot-pool/ROOT/24.10-BETA.1/conf                            6.84M   165G  6.84M  /conf
boot-pool/ROOT/24.10-BETA.1/data                             232K   165G   360K  /data
boot-pool/ROOT/24.10-BETA.1/etc                             7.53M   165G  6.48M  /etc
boot-pool/ROOT/24.10-BETA.1/home                              64K   165G   144K  /home
boot-pool/ROOT/24.10-BETA.1/mnt                              104K   165G   104K  /mnt
boot-pool/ROOT/24.10-BETA.1/opt                               96K   165G    96K  /opt
boot-pool/ROOT/24.10-BETA.1/root                             180K   165G  13.0M  /root
boot-pool/ROOT/24.10-BETA.1/usr                             2.04G   165G  2.04G  /usr
boot-pool/ROOT/24.10-BETA.1/var                             50.9M   165G  31.5M  /var
boot-pool/ROOT/24.10-BETA.1/var/ca-certificates               96K   165G    96K  /var/local/ca-certificates
boot-pool/ROOT/24.10-BETA.1/var/log                         18.3M   165G  63.9M  /var/log
boot-pool/ROOT/24.10-RC.2                                   2.43G   165G   165M  legacy
boot-pool/ROOT/24.10-RC.2/audit                             35.8M   165G  22.6M  /audit
boot-pool/ROOT/24.10-RC.2/conf                              6.83M   165G  6.83M  /conf
boot-pool/ROOT/24.10-RC.2/data                              1.00M   165G   392K  /data
boot-pool/ROOT/24.10-RC.2/etc                               7.57M   165G  6.49M  /etc
boot-pool/ROOT/24.10-RC.2/home                               288K   165G   144K  /home
boot-pool/ROOT/24.10-RC.2/mnt                                104K   165G   104K  /mnt
boot-pool/ROOT/24.10-RC.2/opt                                 96K   165G    96K  /opt
boot-pool/ROOT/24.10-RC.2/root                              13.4M   165G  13.0M  /root
boot-pool/ROOT/24.10-RC.2/usr                               2.05G   165G  2.05G  /usr
boot-pool/ROOT/24.10-RC.2/var                                159M   165G  31.9M  /var
boot-pool/ROOT/24.10-RC.2/var/ca-certificates                 96K   165G    96K  /var/local/ca-certificates
boot-pool/ROOT/24.10-RC.2/var/log                            126M   165G  68.1M  /var/log
boot-pool/ROOT/24.10-RC.2/var/log/journal                   50.3M   165G  50.3M  /var/log/journal
boot-pool/ROOT/Initial-Install                                 8K   165G  2.61G  /
boot-pool/grub                                              8.18M   165G  8.18M  legacy
pool_b                                                       528K  10.8T    96K  /mnt/pool_b

/mnt list:

root@truenas[~]# ls -la /mnt 
total 11
drwxr-xr-x  5 root root  5 Oct 10 09:51 .
drwxr-xr-x 21 root root 29 Oct  4 03:50 ..
drwxr-xr-x  2 root root  2 Oct  6 11:46 .ix-apps
drwxr-xr-x  2 root root  2 Oct  6 11:45 pool_a
drwxr-xr-x  2 root root  2 Oct 10 09:51 pool_b

For the record - pool_b is working; it was set up yesterday on another disk for backup purposes and is empty for now.

Thanks - the ls was useful. Can you do an ls -la /mnt/pool_a?

Then we can try to mount the pool.

I am assuming it is not mounting because there is already a directory called /mnt/pool_a, but we need to know what is in it before deciding what to do with it.

It looks empty?:

root@truenas[~]# ls -la /mnt/pool_a
total 1
drwxr-xr-x 2 root root 2 Oct  6 11:45 .
drwxr-xr-x 5 root root 5 Oct 10 09:51 ..

Ok - let’s try removing the empty directory and mounting the pool (and sub-datasets):

  • sudo rmdir /mnt/pool_a
  • sudo zfs mount -v -R pool_a

done:

root@truenas[~]# sudo rmdir /mnt/pool_a
root@truenas[~]# sudo zfs mount -v -R pool_a
cannot open 'pool_a': dataset does not exist
usage:
        mount [-j]
        mount [-flvO] [-o opts] <-a|-R filesystem|filesystem>

For the property list, run: zfs set|get

For the delegated permission list, run: zfs allow|unallow

For further help on a command or topic, run: zfs help [<topic>]
root@truenas[~]# ls -la /mnt/pool_a     
ls: cannot access '/mnt/pool_a': No such file or directory

root@truenas[~]# ls -la /mnt            
total 10
drwxr-xr-x  4 root root  4 Oct 11 15:09 .
drwxr-xr-x 21 root root 29 Oct  4 03:50 ..
drwxr-xr-x  2 root root  2 Oct  6 11:46 .ix-apps
drwxr-xr-x  2 root root  2 Oct 10 09:51 pool_b
root@truenas[~]# 

I doubt it will work, but let’s try

  • sudo zpool get altroot pool_a to check where it will try to mount it.
  • sudo zfs mount -v -a to do a default mount of all pools.

If you need to manually import a pool in the shell for any reason, you should set the altroot like so:
zpool import -o altroot=/mnt <poolname>
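
For a pool in the state described here, the read-only flag could also be combined with altroot in a single import - a sketch only, not something to run until the current state is clear:

sudo zpool import -o altroot=/mnt -o readonly=on pool_a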

Yes - but this is as much about trying to work out what went wrong as putting it right.

Once we know what the current altroot is, then we can change it if necessary.


The above implies that the issues that followed were at least partially related to an improper manual import: with the newish TrueNAS change to a read-only root filesystem, trying to mount something directly on the root (by not using altroot) was doomed to fail.

But I agree with you that it’s good to make sure before going further.
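
If you want to confirm the read-only root for yourself, the mount options of / can be checked harmlessly (a sketch):

findmnt -o TARGET,FSTYPE,OPTIONS /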

Ok, so first:

root@truenas[~]# sudo zpool get altroot pool_a 

Cannot get properties of pool_a: no such pool available.

As for sudo zfs mount -v -a, nothing happened at all …

root@truenas[~]# sudo zfs mount -v -a 

root@truenas[~]# ls -la /mnt 
total 10
drwxr-xr-x  4 root root  4 Oct 11 15:09 .
drwxr-xr-x 21 root root 29 Oct  4 03:50 ..
drwxr-xr-x  2 root root  2 Oct  6 11:46 .ix-apps
drwxr-xr-x  2 root root  2 Oct 10 09:51 pool_b

Yes - I also want to understand the reason for the failure.
I am waiting for confirmation on whether I should run zpool import -o altroot=/mnt <poolname>

Can we do a sudo zpool status -v again?

The pool was online - but now “no such pool available”.