After Upgrade from CORE to SCALE my pool is OFFLINE

I have upgraded from TrueNAS CORE 13.0-U6.2 to TrueNAS SCALE 24.04.2.3 using the WebUI.

After the upgrade, my main pool “terra” is offline and all 4 disks show as unassigned. The pool contains offline Data VDEVs.

But the disk health is fine.

I have run some CLI commands that might help to find the issue:

root@freenas[~]# zpool status
  pool: freenas-boot
 state: ONLINE
status: Some supported and requested features are not enabled on the pool.
	The pool can still be used, but some features are unavailable.
action: Enable all features using 'zpool upgrade'. Once this is done,
	the pool may no longer be accessible by software that does not support
	the features. See zpool-features(7) for details.
  scan: scrub repaired 0B in 00:00:16 with 0 errors on Thu Oct 17 03:45:16 2024
config:

	NAME                                                     STATE     READ WRITE CKSUM
	freenas-boot                                             ONLINE       0     0     0
	  mirror-0                                               ONLINE       0     0     0
	    ata-Samsung_SSD_850_PRO_256GB_S39KNX0HA12807Z-part2  ONLINE       0     0     0
	    ata-Samsung_SSD_850_PRO_256GB_S251NX0H875840D-part2  ONLINE       0     0     0

errors: No known data errors
root@freenas[~]# 


root@freenas[~]# zpool import
no pools available to import
root@freenas[~]# 


root@freenas[~]# lsblk
NAME   MAJ:MIN RM   SIZE RO TYPE MOUNTPOINTS
sda      8:0    0   3.6T  0 disk 
├─sda1   8:1    0     2G  0 part 
└─sda2   8:2    0   3.6T  0 part 
sdb      8:16   0   3.6T  0 disk 
├─sdb1   8:17   0     2G  0 part 
└─sdb2   8:18   0   3.6T  0 part 
sdc      8:32   0   3.6T  0 disk 
├─sdc1   8:33   0     2G  0 part 
└─sdc2   8:34   0   3.6T  0 part 
sdd      8:48   0 238.5G  0 disk 
├─sdd1   8:49   0   512K  0 part 
└─sdd2   8:50   0 238.5G  0 part 
sde      8:64   0 238.5G  0 disk 
├─sde1   8:65   0   512K  0 part 
└─sde2   8:66   0 238.5G  0 part 
sdf      8:80   0   3.6T  0 disk 
├─sdf1   8:81   0     2G  0 part 
└─sdf2   8:82   0   3.6T  0 part 
root@freenas[~]# lsblk -f
NAME   FSTYPE     FSVER LABEL        UUID                                 FSAVAIL FSUSE% MOUNTPOINTS
sda                                                                                      
├─sda1                                                                                   
└─sda2                                                                                   
sdb                                                                                      
├─sdb1                                                                                   
└─sdb2                                                                                   
sdc                                                                                      
├─sdc1                                                                                   
└─sdc2                                                                                   
sdd                                                                                      
├─sdd1                                                                                   
└─sdd2 zfs_member 5000  freenas-boot 6818064012132738886                                 
sde                                                                                      
├─sde1                                                                                   
└─sde2 zfs_member 5000  freenas-boot 6818064012132738886                                 
sdf                                                                                      
├─sdf1                                                                                   
└─sdf2        

root@freenas[~]# midclt call pool.query | jq
[
  {
    "id": 1,
    "name": "terra",
    "guid": "7680425423279888722",
    "path": "/mnt/terra",
    "status": "OFFLINE",
    "scan": null,
    "topology": null,
    "healthy": false,
    "warning": false,
    "status_code": null,
    "status_detail": null,
    "size": null,
    "allocated": null,
    "free": null,
    "freeing": null,
    "fragmentation": null,
    "size_str": null,
    "allocated_str": null,
    "free_str": null,
    "freeing_str": null,
    "autotrim": {
      "parsed": "off",
      "rawvalue": "off",
      "source": "DEFAULT",
      "value": "off"
    }
  }
]
             

Any help is highly appreciated.

Many thanks!

Please post the output from zpool import.

Hi Protopia,

Thank you for your reply.
I added the output already to my initial post.

No pools available to import :frowning:

Well, all I can see from the output is that sda/b/c/f are all 4TB drives, each with a 2GB (swap) partition and a 3.6TB (presumably ZFS data) partition, but any signature that would tell lsblk that these are ZFS pool members is gone.

I would assume that the data is still there, but the partition ID information is so outdated that it is not showing up as ZFS under SCALE.
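
If you want to double-check that while still on SCALE, zdb can try to read the ZFS labels directly from one of the 3.6TB data partitions. A quick test, using sda2 from your lsblk output as an example:

# try to read the ZFS label from the raw data partition (sda2 taken from the lsblk output above)
zdb -l /dev/sda2

If the partitions are GELI-encrypted, zdb will find no label there, because the ZFS label only exists inside the decrypted provider.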

I have read here about someone who had encrypted pools where the encryption used was no longer supported in SCALE. Are your drives encrypted?

My only advice would be to reinstall CORE and see if that makes the pool visible again. If it does, then we can investigate how to make it compatible and you can try an upgrade again after that.

Do you have your configuration file saved?
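
Once you are back on CORE, you could also check for GELI directly from the shell. A rough sketch, where ada2p2 is only a placeholder for one of the 4TB data partitions:

# list any GELI providers that are currently attached
geli status

# dump GELI metadata from a data partition, if any is present
geli dump /dev/ada2p2

If geli dump prints metadata for the data partitions, the disks are GELI-encrypted even if the GUI suggests otherwise.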

Okay, I will try to reinstall CORE now; I have a backup of the config.

Let’s see how this works.

That is a good thought. If the CORE instance’s disks were GELI-encrypted, that is totally incompatible with SCALE. GELI has to be removed via CORE (one disk at a time, assuming a redundant pool) before even considering migrating to SCALE.
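
For what it is worth, once the pool can be unlocked and is healthy again, the usual approach is to do this one member at a time and let ZFS resilver onto the plain partition. A very rough sketch only, with placeholder names, assuming a redundant pool and a completed resilver before moving to the next disk (please do not treat this as a verified procedure):

# offline one encrypted member and detach its GELI provider
zpool offline terra gptid/XXXX.eli
geli detach gptid/XXXX.eli

# wipe the GELI metadata on that partition (this destroys that copy of the data)
geli clear /dev/gptid/XXXX

# resilver onto the now-unencrypted partition, then repeat for the next disk
zpool replace terra gptid/XXXX.eli gptid/XXXX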

I have reinstalled Core and uploaded the backup configuration, but the situation is similar.

root@freenas:~ # zpool status -v
  pool: boot-pool
 state: ONLINE
config:

	NAME        STATE     READ WRITE CKSUM
	boot-pool   ONLINE       0     0     0
	  mirror-0  ONLINE       0     0     0
	    ada0p2  ONLINE       0     0     0
	    ada1p2  ONLINE       0     0     0

errors: No known data errors
root@freenas:~ # zpool import
no pools available to import

But in the GUI I have the option to unlock the pool. When I do that, it fails with the message “[EFAULT] Pool could not be imported: 4 devices failed to decrypt.” and the following errors:

Error: concurrent.futures.process._RemoteTraceback: 
"""
Traceback (most recent call last):
  File "/usr/local/lib/python3.9/concurrent/futures/process.py", line 246, in _process_worker
    r = call_item.fn(*call_item.args, **call_item.kwargs)
  File "/usr/local/lib/python3.9/site-packages/middlewared/worker.py", line 111, in main_worker
    res = MIDDLEWARE._run(*call_args)
  File "/usr/local/lib/python3.9/site-packages/middlewared/worker.py", line 45, in _run
    return self._call(name, serviceobj, methodobj, args, job=job)
  File "/usr/local/lib/python3.9/site-packages/middlewared/worker.py", line 39, in _call
    return methodobj(*params)
  File "/usr/local/lib/python3.9/site-packages/middlewared/worker.py", line 39, in _call
    return methodobj(*params)
  File "/usr/local/lib/python3.9/site-packages/middlewared/schema.py", line 985, in nf
    return f(*args, **kwargs)
  File "/usr/local/lib/python3.9/site-packages/middlewared/plugins/zfs.py", line 352, in import_pool
    self.logger.error(
  File "libzfs.pyx", line 402, in libzfs.ZFS.__exit__
  File "/usr/local/lib/python3.9/site-packages/middlewared/plugins/zfs.py", line 343, in import_pool
    raise CallError(f'Pool {name_or_guid} not found.', errno.ENOENT)
middlewared.service_exception.CallError: [ENOENT] Pool 7680425423279888722 not found.
"""

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/usr/local/lib/python3.9/site-packages/middlewared/plugins/pool_/encryption_freebsd.py", line 272, in unlock
    await self.middleware.call('zfs.pool.import_pool', pool['guid'], {
  File "/usr/local/lib/python3.9/site-packages/middlewared/main.py", line 1283, in call
    return await self._call(
  File "/usr/local/lib/python3.9/site-packages/middlewared/main.py", line 1248, in _call
    return await self._call_worker(name, *prepared_call.args)
  File "/usr/local/lib/python3.9/site-packages/middlewared/main.py", line 1254, in _call_worker
    return await self.run_in_proc(main_worker, name, args, job)
  File "/usr/local/lib/python3.9/site-packages/middlewared/main.py", line 1173, in run_in_proc
    return await self.run_in_executor(self.__procpool, method, *args, **kwargs)
  File "/usr/local/lib/python3.9/site-packages/middlewared/main.py", line 1156, in run_in_executor
    return await loop.run_in_executor(pool, functools.partial(method, *args, **kwargs))
middlewared.service_exception.CallError: [ENOENT] Pool 7680425423279888722 not found.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/local/lib/python3.9/site-packages/middlewared/job.py", line 355, in run
    await self.future
  File "/usr/local/lib/python3.9/site-packages/middlewared/job.py", line 391, in __run_body
    rv = await self.method(*([self] + args))
  File "/usr/local/lib/python3.9/site-packages/middlewared/schema.py", line 981, in nf
    return await f(*args, **kwargs)
  File "/usr/local/lib/python3.9/site-packages/middlewared/plugins/pool_/encryption_freebsd.py", line 286, in unlock
    raise CallError(msg)
middlewared.service_exception.CallError: [EFAULT] Pool could not be imported: 4 devices failed to decrypt.

The encryption password I used was definitely correct (copy & paste from Bitwarden multiple times).

Regarding GELI, I verified before the upgrade that the disks were not GELI encrypted.

Some screenshots: [screenshots not reproduced here]

EDIT

It seems my check for GELI encryption was not very good:

root@freenas:~ # midclt call pool.query | jq
[
  {
    "id": 1,
    "name": "terra",
    "guid": "7680425423279888722",
    "encrypt": 2,
    "encryptkey": "d46ad3a4-c028-4664-a134-4b51a0817687",
    "path": "/mnt/terra",
    "status": "OFFLINE",
    "scan": null,
    "topology": null,
    "healthy": false,
    "status_detail": null,
    "autotrim": {
      "parsed": "off",
      "rawvalue": "off",
      "source": "DEFAULT",
      "value": "off"
    },
    "is_decrypted": false,
    "encryptkey_path": "/data/geli/d46ad3a4-c028-4664-a134-4b51a0817687.key"
  }
]
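
If I read this correctly, encrypt: 2 means the GELI providers are protected by both the key file in encryptkey_path and a passphrase, and as far as I understand the GUI unlock essentially runs a geli attach per data partition with that key plus the passphrase. A manual attach of a single provider from the CORE shell would look roughly like this (the gptid is a placeholder; glabel status lists the real ones):

# find the gptid labels of the 3.6TB data partitions
glabel status

# try to attach one provider with the pool's key file; geli will prompt for the passphrase
geli attach -k /data/geli/d46ad3a4-c028-4664-a134-4b51a0817687.key /dev/gptid/XXXX

If geli itself rejects the key/passphrase here, the problem is with the GELI metadata or the key rather than with the middleware.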

Any idea what I can do to decrypt the pool again on Core?

Do you know the encryption key?

Yes! And I tried to decrypt it, but it resulted in the error message I posted above.

I tried to find a solution in other threads on the internet but without luck.

The pool is there in the GUI but it has no disks.

Is it worth trying to Export/Disconnect the pool via the GUI? Maybe that would allow me to import it again? But I am afraid that this will make things even worse.