Can't import after trying extend and power outage

Connected most of the drives to the HBA during lunch after a shutdown. All disks show up, but I still can’t import :sob:

Panic leads to bad decisions. The new install explains the “last accessed by another system” message.

Which SAS controller? Which firmware version?

Trying the Klennet ZFS Recovery trial to see if recovery is even possible.

It may still be possible to recover from TrueNAS. But let Klennet scan now that you’ve launched it.

You should still investigate the issue to prevent a recurrence. A raidz2 should not be faulted without even a single failed drive.
ECC RAM? What HBA? Sufficiently cooled?

Between “reinstalling out of panic” and now running Klennet, I honestly think you’re rushing past what could be a simple solution.

I had to take a break because you still did not put your output into code brackets. It takes away the motivation to help.

Sorry, I don’t understand what you mean by “code brackets”.

I think it had to do with the extend not finishing attaching the 10th drive because the power went off in my block. Once the power was back on, all 10 of my drives showed up as “add to pool” instead of being in the pool.

Will update once the Klennet software is finished. Since this is 10 drives, the first scan is only at 62%.

“Code brackets” means using Preformatted Text: the </> button on the toolbar, or Ctrl+E.

Fair enough. That is the </> button.

Now please answer about your “SAS controller”. A wrong one could be the cause of your issues!

Could be, but raidz expansion should have just resumed when the system came back online.
We’ve seen (way too many) cases of drives getting out of sync and losing their labels, but your drives still have their labels, all set at the same transaction group (txg). It’s not clear why the pool doesn’t import.
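
For anyone who wants to repeat the label check, here is a minimal sketch, assuming `zdb` is available and you pass it the pool's member partitions. The function names are made up for illustration. It only reads the labels and prints the distinct `txg` value(s) per device.

```shell
# Read `zdb -l` output on stdin and print each distinct label txg.
# Matching the first field exactly skips other fields such as create_txg.
label_txgs() {
    awk '$1 == "txg:" { print $2 }' | sort -u
}

# Hypothetical wrapper: list the label txg(s) for each device given as an argument.
# Assumes zdb is on PATH and each argument is a ZFS member partition.
show_label_txgs() {
    for dev in "$@"; do
        printf '%s: %s\n' "$dev" "$(zdb -l "$dev" | label_txgs | tr '\n' ' ')"
    done
}
```

Call it as `show_label_txgs` followed by the partitions that belong to the pool. If every device prints the same single number, the labels agree.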

10:00.0 Serial Attached SCSI controller: Broadcom / LSI SAS3008 PCI-Express Fusion-MPT SAS-3 (rev 02)

Sorry for the late reply, and thank you all for your time. It’s currently at 92% of the first scan in Klennet ZFS Recovery. I notice there is not much activity on most drives now: at 80% only 2 drives showed activity, and now I only see 1. Is this normal? The scan is still progressing, but I don’t know whether it progresses regardless.

Side note:
I contacted one of my friends and he is going to lend me a 9500-16i, since he says it can support 16 drives. What do you think?

For some reason I can’t edit the first post of this thread, so I had to add the correction here:

root@truenas[~]# smartctl -l error /dev/sda
smartctl -l selftest /dev/sda
smartctl 7.4 2023-08-01 r5530 [x86_64-linux-6.12.15-production+truenas] (local build)
Copyright (C) 2002-23, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF READ SMART DATA SECTION ===
SMART Error Log Version: 1
No Errors Logged

smartctl 7.4 2023-08-01 r5530 [x86_64-linux-6.12.15-production+truenas] (local build)
Copyright (C) 2002-23, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF READ SMART DATA SECTION ===
SMART Self-test log structure revision number 1
Num Test_Description Status Remaining LifeTime(hours) LBA_of_first_error
1 Short offline Completed without error 00% 3

root@truenas[~]# smartctl -l error /dev/sdb
smartctl -l selftest /dev/sdb
smartctl 7.4 2023-08-01 r5530 [x86_64-linux-6.12.15-production+truenas] (local build)
Copyright (C) 2002-23, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF READ SMART DATA SECTION ===
SMART Error Log Version: 1
No Errors Logged

smartctl 7.4 2023-08-01 r5530 [x86_64-linux-6.12.15-production+truenas] (local build)
Copyright (C) 2002-23, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF READ SMART DATA SECTION ===
SMART Self-test log structure revision number 1
Num Test_Description Status Remaining LifeTime(hours) LBA_of_first_error
1 Short offline Completed without error 00% 938 -

root@truenas[~]# smartctl -l error /dev/sdc
smartctl -l selftest /dev/sdc
smartctl 7.4 2023-08-01 r5530 [x86_64-linux-6.12.15-production+truenas] (local build)
Copyright (C) 2002-23, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF READ SMART DATA SECTION ===
SMART Error Log Version: 1
No Errors Logged

smartctl 7.4 2023-08-01 r5530 [x86_64-linux-6.12.15-production+truenas] (local build)
Copyright (C) 2002-23, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF READ SMART DATA SECTION ===
SMART Self-test log structure revision number 1
Num Test_Description Status Remaining LifeTime(hours) LBA_of_first_error
1 Short offline Completed without error 00% 15269 -
2 Short offline Completed without error 00% 0 -

root@truenas[~]# smartctl -l error /dev/sdd
smartctl -l selftest /dev/sdd
smartctl 7.4 2023-08-01 r5530 [x86_64-linux-6.12.15-production+truenas] (local build)
Copyright (C) 2002-23, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF READ SMART DATA SECTION ===
SMART Error Log Version: 1
No Errors Logged

smartctl 7.4 2023-08-01 r5530 [x86_64-linux-6.12.15-production+truenas] (local build)
Copyright (C) 2002-23, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF READ SMART DATA SECTION ===
SMART Self-test log structure revision number 1
Num Test_Description Status Remaining LifeTime(hours) LBA_of_first_error
1 Short offline Completed without error 00% 12448 -

root@truenas[~]# smartctl -l error /dev/sde
smartctl -l selftest /dev/sde
smartctl 7.4 2023-08-01 r5530 [x86_64-linux-6.12.15-production+truenas] (local build)
Copyright (C) 2002-23, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF READ SMART DATA SECTION ===
SMART Error Log Version: 1
No Errors Logged

smartctl 7.4 2023-08-01 r5530 [x86_64-linux-6.12.15-production+truenas] (local build)
Copyright (C) 2002-23, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF READ SMART DATA SECTION ===
SMART Self-test log structure revision number 1
Num Test_Description Status Remaining LifeTime(hours) LBA_of_first_error
1 Short offline Completed without error 00% 14420 -

root@truenas[~]# smartctl -l error /dev/sdf
smartctl -l selftest /dev/sdf
smartctl 7.4 2023-08-01 r5530 [x86_64-linux-6.12.15-production+truenas] (local build)
Copyright (C) 2002-23, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF READ SMART DATA SECTION ===
SMART Error Log Version: 1
No Errors Logged

smartctl 7.4 2023-08-01 r5530 [x86_64-linux-6.12.15-production+truenas] (local build)
Copyright (C) 2002-23, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF READ SMART DATA SECTION ===
SMART Self-test log structure revision number 1
Num Test_Description Status Remaining LifeTime(hours) LBA_of_first_error
1 Short offline Completed without error 00% 0 -

root@truenas[~]# smartctl -l error /dev/sdg
smartctl -l selftest /dev/sdg
smartctl 7.4 2023-08-01 r5530 [x86_64-linux-6.12.15-production+truenas] (local build)
Copyright (C) 2002-23, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF READ SMART DATA SECTION ===
SMART Error Log Version: 1
No Errors Logged

smartctl 7.4 2023-08-01 r5530 [x86_64-linux-6.12.15-production+truenas] (local build)
Copyright (C) 2002-23, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF READ SMART DATA SECTION ===
SMART Self-test log structure revision number 1
Num Test_Description Status Remaining LifeTime(hours) LBA_of_first_error
1 Short offline Completed without error 00% 0 -

root@truenas[~]# smartctl -l error /dev/sdh
smartctl -l selftest /dev/sdh
smartctl 7.4 2023-08-01 r5530 [x86_64-linux-6.12.15-production+truenas] (local build)
Copyright (C) 2002-23, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF READ SMART DATA SECTION ===
SMART Error Log Version: 1
No Errors Logged

smartctl 7.4 2023-08-01 r5530 [x86_64-linux-6.12.15-production+truenas] (local build)
Copyright (C) 2002-23, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF READ SMART DATA SECTION ===
SMART Self-test log structure revision number 1
Num Test_Description Status Remaining LifeTime(hours) LBA_of_first_error
1 Short offline Completed without error 00% 0 -

root@truenas[~]# smartctl -l error /dev/sdi
smartctl -l selftest /dev/sdi
smartctl 7.4 2023-08-01 r5530 [x86_64-linux-6.12.15-production+truenas] (local build)
Copyright (C) 2002-23, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF READ SMART DATA SECTION ===
SMART Error Log Version: 1
No Errors Logged

smartctl 7.4 2023-08-01 r5530 [x86_64-linux-6.12.15-production+truenas] (local build)
Copyright (C) 2002-23, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF READ SMART DATA SECTION ===
SMART Self-test log structure revision number 1
Num Test_Description Status Remaining LifeTime(hours) LBA_of_first_error
1 Short offline Completed without error 00% 7502 -

root@truenas[~]# smartctl -l error /dev/sdj
smartctl -l selftest /dev/sdj
smartctl 7.4 2023-08-01 r5530 [x86_64-linux-6.12.15-production+truenas] (local build)
Copyright (C) 2002-23, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF READ SMART DATA SECTION ===
SMART Error Log Version: 1
No Errors Logged

smartctl 7.4 2023-08-01 r5530 [x86_64-linux-6.12.15-production+truenas] (local build)
Copyright (C) 2002-23, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF READ SMART DATA SECTION ===
SMART Self-test log structure revision number 1
Num Test_Description Status Remaining LifeTime(hours) LBA_of_first_error
1 Short offline Completed without error 00% 0
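
The per-drive commands above can be collected into one loop. This is only a sketch: `smart_logs` is a made-up function name, and the `SMARTCTL` override is an assumption added here so the loop can be dry-run without touching the disks.

```shell
# Run the same two SMART log queries used in this thread against each device.
# SMARTCTL can be overridden (an assumption added here so the loop can be dry-run).
smart_logs() {
    for dev in "$@"; do
        echo "=== $dev ==="
        "${SMARTCTL:-smartctl}" -l error "$dev"
        "${SMARTCTL:-smartctl}" -l selftest "$dev"
    done
}
```

Running `smart_logs /dev/sd[a-j]` as root covers the ten drives in one go and labels each block of output with its device name.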

You have a proper HBA. If it is on phase16 firmware, all is fine and I don’t think that a 9500 would help.
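
One possible way to check the firmware phase is to read it from the kernel log. This is a sketch that assumes the mpt3sas driver printed its usual `FWVersion(...)` field at boot and that the message is still in `dmesg`. Both function names are made up for illustration.

```shell
# Extract firmware versions from mpt3sas boot messages read on stdin.
# Assumes the driver logs a "FWVersion(x.y.z.w)" field.
hba_fw_versions() {
    sed -n 's/.*FWVersion(\([0-9.]*\)).*/\1/p'
}

# Succeed only if the version's major number is 16 (the "phase 16" line).
is_phase16() {
    case $1 in
        16.*) return 0 ;;
        *)    return 1 ;;
    esac
}
```

For example, `dmesg | hba_fw_versions` prints the version string, which can then be passed to `is_phase16`. Broadcom's `sas3flash -list` reports the same information if that utility is installed.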

I don’t know how Klennet proceeds to gather all the information it needs. So let’s assume it is working as it should and wait for completion.

Thank you. Currently in the 2nd step in Klennet which is 2nd disk scan.

Curious, do you have your computer set to reboot on a power outage or just remain powered off? I am wondering if some of the power outage and ZFS problems are the systems coming back up after the first outage and then losing power again. A safer option may be just to have it set to stay off upon a power outage. This is just a guess though.

Hi! Yes, it is actually set up to power back on after an outage.

The scan is still going and taking a while. I noticed this in the ZFS structure. Is this normal?

Let us know when the scan is done. No clue what normal looks like for Klennet.
We can’t diagnose in TrueNAS until that is complete.

2nd disk scan step barely at 2% :sob:

Are you trying to recover the data, or are you trying to recover the pool?
If it’s just the data, I’d suggest some out-of-the-box thinking that worked for me when I had ZFS issues. TrueNAS would NOT mount the drives, no how, no way. I set up a Windows test machine with its own boot and storage drive and added an HBA. I installed OpenZFS for Windows (it’s exactly what it sounds like), attached the TrueNAS drives, and imported and mounted the pool. This let me recover ALL of my data and store it elsewhere. Then I rebuilt the pool and moved the drives back to the TrueNAS box to test it. I did see a message about the pool being previously mounted on another machine, but TrueNAS didn’t care about it. Then I rebuilt the pool on TrueNAS and restored the data to it.
I spent weeks trying to get Klennet and another ZFS recovery tool to work, and they sort of did, but this solution took an honest 20 minutes before I was able to start recovering the data.
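
If the goal is only to copy data off, a read-only import is one cautious way to try it on any OpenZFS system. This is a sketch, not a tested recovery procedure. The pool name `tank` and the `/mnt/recovery` path are placeholders, the `ZPOOL` override is an assumption added for dry runs, and whether a pool with an interrupted raidz expansion will import this way is not something this thread confirms.

```shell
# Sketch: import a pool read-only under an alternate root so nothing is written to it.
# The pool name and the /mnt/recovery default are placeholders, not taken from the thread.
import_readonly() {
    pool=$1
    altroot=${2:-/mnt/recovery}
    # -f overrides the "last accessed by another system" check,
    # -o readonly=on keeps ZFS from writing to the pool,
    # -R mounts the datasets under an alternate root.
    "${ZPOOL:-zpool}" import -f -o readonly=on -R "$altroot" "$pool"
}
```

Called as `import_readonly tank`, this runs `zpool import -f -o readonly=on -R /mnt/recovery tank`. Only use `-f` when you are sure no other system currently has the pool imported.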