I’m new to TrueNAS SCALE. My primary pool, which holds several shares, has been offline since my last reboot. Would anyone be willing to help me figure out how to get the pool back without losing data?
I can see the disks, but they now show “N/A” in the Pool column of the Storage Disks display. The SMART test returns the expected result. Are the disks okay?
There are two alerts:
First, this: New ZFS version or feature flags are available for pool n5x1x4TB…error CRITICAL
Second, this: Pool n5x1x4TB state is OFFLINE: None
Help? Can I restore this pool? Funnily enough, the Windows shares that use this pool are still running, but they don’t actually share anything.
root@truenas[~]# zpool list -v n5x1x4TB
cannot open 'n5x1x4TB': no such pool
root@truenas[~]#
root@truenas[~]# zpool status -v n5x1x4TB
cannot open 'n5x1x4TB': no such pool
root@truenas[~]#
The drives are SATA HDDs: WDC_WD40EFAX-68JH4N1, 3.64 TiB.
There are five drives connected to a PCIe SATA controller.
Only four drives show up in the disk listing, and all four show N/A in the Pool column.
FYI
I did a full restart immediately before the pool disappeared. I can see only four of five drives that are / were part of the pool, n5x1x4TB. I’ve read here in this forum, I think in one of your replies to another post, actually, that TrueNAS does not automatically import pools with failed disks.
Could that be an issue? I see only four of five disks listed, so perhaps one has failed?
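One way to check whether a drive has dropped off the bus is to count how many disks of that model the OS can still see. A minimal sketch, assuming `lsblk` is available in the TrueNAS SCALE shell; the helper name `count_model` and the sample lines are made up for illustration:

```shell
#!/bin/sh
# Count the lines of lsblk-style output (read from stdin) that mention
# a given model string. Prints 0 when nothing matches.
count_model() {
    grep -c -- "$1" || true
}

# Made-up sample input, shaped like "NAME MODEL" lines:
printf '%s\n' \
    'sda WDC_WD40EFAX-68JH4N1' \
    'sdb WDC_WD40EFAX-68JH4N1' \
    'sdc SOME_OTHER_DISK' | count_model WD40EFAX

# On the live system the input would come from lsblk instead, e.g.:
#   lsblk -d -n -o NAME,MODEL,SERIAL | count_model WD40EFAX
# For a disk that is present, `zdb -l /dev/<partition>` prints its ZFS
# labels if they are readable (the partition name depends on the system).
```

If the count comes back as four where five are expected, the missing disk is a hardware or cabling question before it is a ZFS one.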
SMR drives and a sata controller… ouchies. I’m hoping that you meant HBA.
Any chance at all you have available sata ports on your motherboard that you can use to see if drives are detected? A full list of your hardware; motherboard, nic, cpu, ram, the exact model of sata controller (link to it if you have to), etc. could be of help.
Quick & dirty tips - have you checked the physical connections? Any chance there is a loose power/sata data cable? Does your bios see the drives?
P.S. @Fleshmauler is absolutely right: your WD Red EFAX drives are SMR drives, and even WD state that these are completely unsuitable for ZFS redundant pools. Their bulk write performance is terrible, and during bulk writes either the drives themselves or ZFS can time out, causing a ZFS drive error that can degrade your pool or take it offline.
root@truenas[~]# sudo sas2flash -list
LSI Corporation SAS2 Flash Utility
Version 20.00.00.00 (2014.09.18)
Copyright (c) 2008-2014 LSI Corporation. All rights reserved
No LSI SAS adapters found! Limited Command Set Available!
ERROR: Command Not allowed without an adapter!
ERROR: Couldn't Create Command -list
Exiting Program.
root@truenas[~]#
root@truenas[~]# sudo sas3flash -list
Avago Technologies SAS3 Flash Utility
Version 16.00.00.00 (2017.05.02)
Copyright 2008-2017 Avago Technologies. All rights reserved.
No Avago SAS adapters found! Limited Command Set Available!
ERROR: Command Not allowed without an adapter!
ERROR: Couldn't Create Command -list
Exiting Program.
root@truenas[~]#
root@truenas[~]# sudo zpool import
pool: n5x1x4TB
id: 5758399647352221700
state: FAULTED
status: One or more devices contains corrupted data.
action: The pool cannot be imported due to damaged devices or data.
The pool may be active on another system, but can be imported using
the '-f' flag.
see: Message ID: ZFS-8000-5E — OpenZFS documentation
config:
The part I bolded was supposed to all be on one line, please try again without the line break.
The OpenZFS page linked to in that last zpool import command has this line: The device listed as FAULTED with ‘corrupted data’ cannot be opened due to a corrupt label. ZFS will be unable to use the pool, and all data within the pool is irrevocably lost. The pool must be destroyed and recreated from an appropriate backup source. Using replicated configurations will prevent this from happening in the future.
Do you have a backup?
I wonder how the label could have been corrupted… bad RAM?
Unfortunately I have only older backups. I will lose data, and be very sad, if I can’t import the pool.
The pool was working before a recent restart. The CLI output in my case includes, “The pool may be active on another system, but can be imported using the '-f' flag.”
I’m hoping that the import can be forced, so that I can make backups.
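For reference, a forced import can be attempted read-only, so that nothing is written to the pool while data is copied off. The sketch below only prints the command unless `RUN=1` is set. The wrapper name `import_readonly` and the `/mnt` alternate root are assumptions for illustration; the `zpool import` options used (`-f`, `-o readonly=on`, `-R`) are standard OpenZFS ones. Whether a pool reported as FAULTED will import at all is a separate question, and as far as I know an import done from the CLI is not tracked by the TrueNAS UI.

```shell
#!/bin/sh
# Print (or, with RUN=1, execute) a forced read-only import of a pool.
#   -f              override the "may be active on another system" check
#   -o readonly=on  import without writing to the pool
#   -R <altroot>    mount the pool's datasets under an alternate root
import_readonly() {
    pool="$1"
    altroot="${2:-/mnt}"    # assumed mount root; adjust as needed
    set -- zpool import -f -o readonly=on -R "$altroot" "$pool"
    if [ "${RUN:-0}" = "1" ]; then
        "$@"
    else
        echo "$*"
    fi
}

# Dry run: shows the command that would be executed.
import_readonly n5x1x4TB
```

Printing first makes it easy to double-check the pool name before anything touches the disks.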
I think I’m okay. We’ll see in a few hours; the pool is online with two degraded/offline disks of six.
The WD40EFAX uses SMR recording technology, according to the manufacturer’s specifications. It is marketed as a NAS drive, and the whole WD Red series is supposed to be the best choice for NAS/RAID applications. I don’t see any mention of SMR being unsuitable for Z2. Amazing. How would I have known?
Are you running TrueNAS in a VM or something along those lines?
In short, for an uninitiated, it would be difficult to know this due to WD’s marketing.
After they snuck these types of drives into their Red line there was a big backlash and they eventually posted (among other things) this blog post while trying to spin it as something positive for the customer:
I personally voted with my wallet and stopped buying their products, period.
I’m going to add the basic details of my current TrueNAS hardware/NAS build by editing my original post above. The point may be moot: if the degraded pool lives long enough for me to make a copy of the current data, then I’ll likely build a new NAS.
To answer your question though, no, the NAS is TrueNAS on bare metal. It’s an ATX mid tower case with an older Gigabyte main board. I use two PCE8SAT-M01 VER0065 cards; these are SATA expansion cards in PCIe x1 slots. Those seem to work extremely well.
I have two pools of 5 drives, one with the WD Red NAS drives and one with 5 Seagate drives.
Exactly - how indeed? Shame on Western Digital for acting the way they did and indeed continuing to act the way they do i.e. not giving an explicit warning on the drives and on the packaging and in their marketing literature that they are SMR drives and unsuitable for ZFS.
I was aware of the SMR fiasco before I bought my drives, so I could have bought Red Plus but instead decided to do my small part in making WD pay for their actions by buying Seagate IronWolf drives instead.
I note that the linked blog post says “While we work with iXsystems on DMSMR solutions for lower-workload ZFS customers…”, and that strikes me as odd, since iX are NOT directly responsible for OpenZFS, and other NAS software (and non-NAS software like that lesser known Linux distro called Ubuntu) also uses ZFS. Perhaps @kris could enlighten us as to whether WDC did indeed “work with iXsystems” and maybe tell us what choice words were spoken by iX when they were approached by WDC.
(And if iX didn’t actively “work with” WDC, does this blog post constitute corporate libel, and I wonder whether iX will request WDC to take it down?)
I sadly have no relevant advice on data recovery other than validating physical connections and checking whether the drives work properly when connected directly to the motherboard. The reason is that even if some CLI magic can bring the pool online, the underlying hardware, and therefore the chance of getting data out, is questionable.
If you don’t have enough sata connections on your motherboard, but have a second system, see if you can temporarily make a TrueNAS boot (usb is fine, this is just to copy data), move the drives over directly to motherboard, and see if that gives any joy.
As for port multipliers, they might ‘work great’ at first glance, but they generally set you up for failure. Frankly, it’d have been better if they had simply never functioned in the first place.
Thank you, (hopefully, non-literally?) Fleshmauler,
I have three of five drives online, and the pool was imported after another reboot. Long may it last. I’m copying data now to a SATA drive, a desktop drive also made by Western Digital.
That drive is in an external bay connected via USB 3, so it will take several hours to copy just part of the data. For the rest, I’m making a prioritized list of directories and preparing rsync commands to copy data from the NAS to a RAID array on my local Fedora machine. That will be quicker than the first directory copy, which is going through the usual Nautilus file manager. If I can get everything copied to my local machine, I’ll make more copies to various SATA drives that are kicking around. Once I have a few backups, I’ll try to replace the failed drives in the TrueNAS pools. If I can repair the existing pools, I’ll have copies on and off site and a working NAS. Then I’ll build two more NAS boxes.
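A prioritized directory list can be turned into rsync commands with a small helper, so the most important data is copied first while the pool is still up. This is only a sketch: the host `root@truenas.local`, the source root `/mnt/n5x1x4TB`, the destination `/srv/raid/nas-rescue`, and the directory names are placeholders rather than details from this thread, and directory names containing spaces would need extra quoting.

```shell
#!/bin/sh
# Placeholder values: adjust to the real NAS host, pool path and target.
NAS_HOST="root@truenas.local"
POOL_ROOT="/mnt/n5x1x4TB"
DEST="/srv/raid/nas-rescue"

# Read one directory name per line (most important first) and print an
# rsync command for each. Blank lines and lines starting with # are skipped.
make_rsync_cmds() {
    while IFS= read -r dir; do
        case "$dir" in
            ''|'#'*) continue ;;
        esac
        printf 'rsync -a --partial --info=progress2 %s:%s/%s/ %s/%s/\n' \
            "$NAS_HOST" "$POOL_ROOT" "$dir" "$DEST" "$dir"
    done
}

# Example priority list (made-up directory names):
printf '%s\n' '# highest priority first' 'photos' 'documents' | make_rsync_cmds
```

The helper only prints the commands, so the list can be reviewed before anything runs. `--partial` keeps partially transferred files, which helps if a flaky drive interrupts a copy and the command has to be rerun.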
It is what I eat! Everyone immediately goes to cannibalism for some reason; I just like meat.
Happy to hear it came online & you’re actively making backups! Even happier to hear you’re rolling up some hardware for a new build man.
Hopefully all the transfers go through successfully and this can be used as nothing more than a (mildly painful) learning experience.
When I first joined, I thought everyone was too crazy conservative on best practices and generally a stick in the mud. Over the years, though, I found myself spreading the same advice as the old guard after learning the hard way myself…