HDDs suddenly gone

Had some severe weather and power went on and off several times before going off for 2 days. The system is on a surge protector. When I powered my system back up no HDDs are visible to either GUI or CLI. I only have one pool and it shows offline.

None of my 5 HDDs (zfs raid) are shown in /dev; the only drive in /dev is da0, which is the boot USB. There are no pools available to import. zpool status does not show my data pool, only the boot USB again.

The entry for the pool in Storage>Pools shows offline and there is an option to export/disconnect, but I’m hesitant to do this without being able to see the physical disks.

The only thing I’ve found (again, I’m an amateur) that might be related is that the S.M.A.R.T service isn’t running and I can’t start it from the services page. Says “SMART service failed to start.”

Not sure what else to look at.

FreeBSD freenas.local 13.1-RELEASE-p2 FreeBSD 13.1-RELEASE-p2 n245412-484f039b1d0 TRUENAS amd64

[EDIT] Sorry, of course the hardware list would be required for troubleshooting. This system is about 3 years old. Really just built it and let it be. I blow out the dust twice a year and update TrueNAS and that’s about it.

Board is a Supermicro X11SSM-F
16G DDR4 RAM
Data disks are (5) WD Red WD30EFAX 3TB

Disks are SATA connected and were in a ZFS raid5 config. I had set up a single pool using all usable space on the array with 2 volumes, one for data and one for backups of my laptop.

It really ought to be on a UPS. But all of your disks disappearing is a very bad thing, and also very unusual. You may want to check for some kind of power problem. Not much else we can suggest without knowing your hardware.

TrueNAS Core?

Please post detailed hardware, software and pool details.

Look at the Storage then Disks tab in TrueNAS GUI. What are you seeing?

How are these 5 HDDs connected and presented to the OS?

As someone else pointed out, please post your hardware.

Edited original post with HW. Let me know if you need more details. Disks tab just shows the boot volume USB. No HDDs listed at all.

Directly to the motherboard? Or are you using some kind of controller card? And if so, what?

Again, very strange for all five to die at once, which suggests the problem lies somewhere other than the disks themselves. Are they actually spinning?

Are your HDDs visible in the BIOS before TrueNAS boots?

Were you booting in UEFI or legacy mode?
Maybe the BIOS wrongly switched the SATA mode to something other than what it was set to, such as RAID instead of AHCI.

Ok…I need to take a step back here and think.

Sorry for my panic post, I’m an amateur photographer and all my images are on the array. I was literally just looking at other redundant backup options when this happened so I kind of lost my mind.

Troubleshooting begins at the physical layer, right? Evidence indicates this was a physical event and all 5 disks disappearing at once points to a central issue. It’s possible that all 5 disks got hosed by an electrical surge, of course, but it seems less probable than something getting fried on the board before it got to the disk.

I can still hope that’s all it is, right? Also the surge protector seems fine and the breaker/fuse didn’t pop, it’s still able to power other devices when I plug them in.

So, I replaced the BIOS battery which, after over 3 years, was due anyway. Powered the system back on and I do not detect any sounds or vibrations from any of the drives. The IPMI GUI doesn’t seem to have any information on drives at all so it’s not helpful. There aren’t even any entries for the SATA channels.

Seems I can’t update the BIOS without some form of license key and any specific info on the board has long been lost.

Going to have to dig through my computer junk drawer and see if I still have a VGA cable and keyboard to watch the startup process. The next step is probably a new board to see if that solves the problem. Fingers crossed.

Appreciate all the responses but in the clear light of morning I see that I’ve still got a lot of variables to attend to here before reaching out for help.

Cheers.

…and more probable than either of those is that something happened to the PSU, or between it and the drives.

You can’t update the BIOS through the IPMI web interface without a license key, but you can still go old-school by putting the new file on a USB stick and using either FreeDOS or the UEFI shell to flash it.

Or, from here:

It appears you can generate your own license key by running:

echo -n 'enter-your-bmc-mac-address-here' | xxd -r -p | openssl dgst -sha1 -mac HMAC -macopt hexkey:8544E3B47ECA58F9583043F8 | awk '{print $2}' | cut -c 1-24
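
A slightly more defensive take on that one-liner, as a sketch only. It assumes the MAC should reach xxd as 12 bare hex digits, so it strips ':' and '-' first. The function names normalize_mac and bmc_license_key are made up here, and the MAC in the usage comment is a placeholder. The hexkey is copied unchanged from the command quoted in this post, and xxd and openssl need to be installed.

```shell
#!/bin/sh
# Normalize a MAC address: drop ':' and '-' separators, lowercase it,
# and fail unless exactly 12 hex digits remain.
normalize_mac() {
    mac=$(printf '%s' "$1" | tr -d ':-' | tr 'A-F' 'a-f')
    case "$mac" in
        *[!0-9a-f]*) echo "not a hex MAC: $1" >&2; return 1 ;;
    esac
    [ "${#mac}" -eq 12 ] || { echo "expected 12 hex digits: $1" >&2; return 1; }
    printf '%s\n' "$mac"
}

# Same pipeline as quoted in the thread, fed with the normalized MAC.
bmc_license_key() {
    mac=$(normalize_mac "$1") || return 1
    printf '%s' "$mac" | xxd -r -p \
        | openssl dgst -sha1 -mac HMAC -macopt hexkey:8544E3B47ECA58F9583043F8 \
        | awk '{print $2}' | cut -c 1-24
}

# Usage (placeholder MAC, replace with your BMC's own):
#   bmc_license_key 'AC:1F:6B:00:11:22'
```

The validation step is there so a mistyped MAC fails loudly instead of quietly producing a key for the wrong address.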

I’m assuming you mean by ZFS raid5, you really mean RAIDZ1.

I’m assuming you have access to the console. If so, what is the output of gpart list?
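
For reference, a minimal sketch of the read-only checks that usually go along with that request on TrueNAS Core (FreeBSD). The helper names run_if_present and collect_disk_info are made up for this example. The commands themselves only list devices, partitions, and pools, and change nothing.

```shell
#!/bin/sh
# Run a command only if its binary exists, printing a header first so the
# combined output is easy to paste into a forum post.
run_if_present() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "== $* =="
        "$@" 2>&1 || echo "(exit status $?)"
    else
        echo "== $* == (skipped: $1 not found)"
    fi
}

# Collect what the OS currently knows about disks and pools.
collect_disk_info() {
    run_if_present camcontrol devlist   # devices seen by the CAM layer
    run_if_present geom disk list       # disks known to GEOM
    run_if_present gpart list           # partition tables on those disks
    run_if_present zpool import         # pools available to import (lists only)
    run_if_present zpool status         # pools already imported
}

collect_disk_info
```

If the SATA disks are being detected, camcontrol devlist and geom disk list should show entries beyond the da0 boot stick. If they show only da0, the disks are not reaching the OS at all, which points back at power, cabling, or the controller.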

To remind everyone, ZFS was specifically designed not to lose existing data on graceless power-offs (aka power outages). That said, UPSes are always helpful. Of course, hardware can fail during such an event.

Once you get your server working, you might want to address the SMR drives, (WD30EFAX).

A UPS is definitely a good thing for multiple reasons. One not mentioned here: low power or brownouts are often more devastating than a surge. Speaking as a former surge protector salesperson, cheaper surge protectors often don’t handle those well.

So far you were asked twice how the 5 drives are physically connected: do the data cables go to the motherboard or to an HBA? This is important.

Next, don’t update your BIOS. Why would you think it is corrupt? Update it after you fix the problem.

Second: you said you are a novice. Are you certain the BIOS is not recognizing the hard drives? And are you using IPMI to verify this? A screen capture/photo would go a long way.

Third: you said the drives are not spinning up. With the power off, unplug all the drive cables (data and power). Next, choose ONE hard drive and plug ONLY the power connector into it. Now power on the system. Does the drive spin up immediately when you turn on the power? If it does, good news. The drive may stop spinning after a few seconds or minutes; that is normal for some drives, and you have the dreaded SMR drives.

If the drive spins up, power off, reconnect the other drive power lines, repeat and verify if the other drives are good.

If all is still good, power off and connect your drives’ data cables directly to the motherboard. Power on. Check the BIOS again. I have no idea what you know but you may need to make configuration changes in your BIOS.
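
Once the system boots with the data cables reconnected, one quick way to see which disks the kernel picked up is to filter the boot messages. A small sketch, assuming FreeBSD-style device names (adaN for SATA disks on AHCI ports, daN for USB or SAS devices); list_detected_disks is a made-up helper name.

```shell
#!/bin/sh
# Print the unique disk device names (ada0, da0, ...) that appear at the
# start of kernel message lines read from standard input.
list_detected_disks() {
    grep -Eo '^(ada|da)[0-9]+' | sort -u
}

# Usage on the NAS console:
#   dmesg | list_detected_disks
```

With five data disks and the USB boot stick attached, the hope is to see five adaN names alongside da0.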

For the hell of it, remove your boot drive and leave it out. If you power on and you get a message stating something like “this is a data drive” then you know the hard drive is working. And if that happens, maybe your boot drive is corrupt.

Be very clear in the data you provide, and provide the data requested or helping you out will be a guessing game and gets frustrating for everyone. We actually want to help.

Good luck and i hope all your data is there.

I also asked for the output of gpart list. Never got a reply on that either.

I appreciate the responses but, as I said above, after some thought I realized I need to step back, stop panicking, and start troubleshooting at the physical layer. gpart list shows no HDDs. That and all the other evidence point to some hardware fault, and it seems unlikely that all 5 disks failed or were compromised at once.

I’m going to start with a new PSU and if that doesn’t help I’ll sort out whether I want to replace the board or move the disks to a Mini series enclosure directly from TrueNAS, which I’ve been thinking about for a bit.

Thanks.

That’s unfortunate. Good luck and hope it’s some HW failure other than the disks as you said.

Good luck, and it is good when a person takes it upon themselves to troubleshoot a problem.

Assuming the HDDs are all connected directly to the MB, the first thing you should be doing is verifying whether the disks are visible in the BIOS. If you don’t see them there, it may indeed be that the PSU or the motherboard is fried.