Hello, I have non-critical / backed-up data on a home-built TrueNAS server used for media at home.
A few days ago, I migrated from CORE to SCALE and have been having trouble getting my BLACKBOX01 pool back online.
The pool did not immediately show as available (the following is from memory, so it may be a bit hinky):
Tried to be a good boy and read up on the issue in the forums, etc.
Verified that I've got a backup CONFIG available from CORE and followed suggestions to export the pool via the GUI (not the shell / SSH), deselecting all 3 options.
Attempted to import the pool via the GUI; it is no longer listed.
Did another day of forum reading…
Drunkenly tried several (likely dumb) recommended CLI methods based on sudo zpool import, with options in combinations of "-a", "-f" and what have you (roughly the sort of thing sketched just after this list). Initially, sudo zpool list only showed the boot pool, I believe.
Have since spent several days reading and working on it with no progress.
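For reference, the kind of thing I was running (reconstructed from memory, so treat the exact invocations as examples):

sudo zpool import                      # scan attached disks for importable pools
sudo zpool import -a                   # import everything that turns up
sudo zpool import -f BLACKBOX01        # force-import the pool by name
sudo zpool import -d /dev/disk/by-id   # re-scan, pointing at the disks explicitly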
Anywho, I have an "OK" level of knowledge of TrueNAS, mainly from building and maintaining the system over the years, but I'm definitely not an expert, so I'd really appreciate a bit of help.
Info (not sure what people might need; I can get whatever could be useful):
OS Version: TrueNAS-SCALE-24.04.2.2
2x Xeon E5-2670 CPUs
32GiB ECC RAM
Boot drives are 2x SSDs, mirrored
Data drives are 5x 4TB WD Red, connected directly to the motherboard (I believe)
sudo zpool list output: only the boot pool
sudo lsblk output: shows no mount points for the data drives.
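If fuller diagnostics would help, I can also run something along these lines and post the output (device paths are whatever lsblk reports on my box):

sudo lsblk -o NAME,SIZE,TYPE,FSTYPE,PARTTYPE,MOUNTPOINT   # do the data disks still show partitions?
sudo zpool import -d /dev/disk/by-id                      # scan those disks specifically for pools
sudo zpool status -v                                      # state of whatever is currently imported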
So any help I could get would be great. I can rebuild, but I'd rather not. I did try reloading a prior config (a CORE one; I couldn't locate a SCALE config from before my drunken stupidity), with no change.
This bit is interesting. It may be my ignorance but does it really suggest you export your pool before upgrade? Are you sure you didn’t check the box to mark disks as new when you exported the pool?
…does it really suggest you export your pool before upgrade?
Well, here's where "drunken Jim" f'd over "sober Jim": I didn't really review any documentation at all before upgrading to SCALE, just backed up the config (which now appears to be gone, though I do have older versions) and upgraded. I didn't export the pool until after the OS was upgraded, and I did it through the GUI, which I've read is not the same as doing it from the CLI over SSH (something like the sketch below).
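As I understand it, the CLI equivalent would have been a plain export, which on its own shouldn't touch the data; something like:

sudo zpool export BLACKBOX01   # plain ZFS export; does not wipe disks or partitions

whereas the GUI Export/Disconnect dialog bundles in the extra checkboxes.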
Are you sure you didn’t check the box to mark disks as new when you exported the pool?
The config file isn't needed to access your pool; it holds other information such as your system settings, users, etc., but your pool should be fine without it. Exporting your pool before the upgrade sounds odd to me and isn't something I've ever done, but again, this alone should not have caused you to lose your pool.
The only thing that makes sense to me at the moment is that perhaps drunken Jim, when exporting the pool, ticked the top box "mark disks as new", as this would have wiped all the drives and thus destroyed the pool.
Yeah, it doesn't look good, buddy. Your drives are there (we can see them with lsblk) and your boot pool is good; it even confirms the update, since it's flagged as created with an older version of ZFS. If you hadn't mentioned exporting your pool I would have been at a total loss, but I fear that when you did it you may have accidentally marked the disks as new, as nothing else makes any sense. Let's see if anyone else has any other thoughts before we totally give up.
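One thing worth checking before giving up: whether any ZFS signatures survive on the data disks at all. Something read-only like this would show it (sdb is just an example; substitute your actual data disks):

sudo wipefs /dev/sdb    # with no options this only lists detected signatures, it doesn't erase anything
sudo zdb -l /dev/sdb    # try to dump ZFS vdev labels; "failed to unpack label" on all four means none are readable there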
So, I opted to follow (as best I'm able as a Linux rookie) the process that HoneyBadger performed and described in that post, i.e. fixing the partition table.
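The general shape of that fix, as I understand it, is recreating the GPT layout the pool disk originally had so that ZFS can find its data partition again. The commands below are only my sketch of the idea, assuming one of the other pool disks still has an intact table to copy from (device names are placeholders):

sudo sgdisk -p /dev/sdc                        # print the partition table on a disk that still has one
sudo sgdisk --backup=table.bin /dev/sdc        # save that layout to a file
sudo sgdisk --load-backup=table.bin /dev/sdb   # write the same layout onto the damaged clone
sudo sgdisk -G /dev/sdb                        # randomize the GUIDs so the two disks don't collide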
I've duplicated one of the disks from the missing pool and currently have only that clone connected to the system. I had previously read posts about repairing a partition with missing labels due to an offset in the partition table, but I haven't been able to find a good general guide. Note: I'm not sure I'm using the correct terminology when I say "offset", and I'm not sure that it's my issue.
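For anyone wanting to do the same, a typical way to make that sort of duplicate is a straight block copy (device names below are examples, and the of= target gets overwritten):

sudo dd if=/dev/sdb of=/dev/sdd bs=1M conv=noerror,sync status=progress
# if= the original pool disk, of= the spare disk receiving the clone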
Can someone let me know if I seem to be heading down the right path? Any resources I could use for the next steps?
Oh, the one actual fix I attempted (on the cloned disk) was a zhack fix, with no improvement.
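For the record, it was along the lines of the label-repair subcommand; the device name below is an example, and as I understand it this repairs label checksums, so it can't recreate labels that are gone entirely:

sudo zhack label repair /dev/sdb   # attempt to repair vdev label checksums on the device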
Oh, I did find the post I was referring to that describes using dd to search for the label: Possible to reattach a disconnected drive to an empty pool?. So I'm reading up on how to do that safely. I don't know what metaslab_shift is exactly; it appears to be either the variable name that ZFS/TrueNAS uses to store the value I'm looking for, or something specific to that user's build.
Thanks!
Oct 9 Update:
OK, I figured out what a metaslab is, but I'm still not sure what metaslab_shift refers to.
Ran sudo dd if=/dev/sda status=progress bs=1M | strings | grep metaslab_shift, with the following result: nothing found.
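For context, metaslab_shift is one of the name/value pairs stored in each vdev label's config nvlist (it records the log2 size of the vdev's metaslabs), which is why grepping for that literal string can locate a surviving label. Assuming the scan really covered the whole disk and found nothing, that would unfortunately line up with the "marked as new" theory. The more direct check I plan to try next is asking zdb to decode the labels itself:

sudo zdb -l /dev/sda    # dumps the four vdev labels; "failed to unpack label" on all four means none are readable
sudo zdb -l /dev/sda2   # same, against the data partition, if/when a partition table is recreated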