Gone for the weekend, returned to pool missing, possible drive failure, no idea what to do

This isn’t pleasant. The import seems to be failing over and over, tripping this warning of metadata corruption:

1751136633   ffff96c5dc180000 spa_misc.c:429:spa_load_note(): spa_load(Valkyrie, config trusted): spa_load_verify found 4 metadata errors and 1033 data errors
1751136633   ffff96c5dc180000 spa_misc.c:415:spa_load_failed(): spa_load(Valkyrie, config trusted): FAILED: spa_load_verify failed [error=5]

And it’s trying to rewind quite a bit, from the initial txg 3083521 all the way down to 3083483, where it runs out of uberblocks.

At this point, the next steps involve overriding the error detection: telling ZFS to “mount the pool anyway” and return what data it can. Obviously this will give undesirable results w.r.t. data integrity if you try to read a damaged file, and with metadata damage it still may not be able to complete - but we can try.

WARNING: Hazardous Tunables Ahead.

Running these commands will disable metadata and data verification on pool load, bypassing a core piece of ZFS’s native data integrity. They should only be used as a last resort for pool recovery. (Note that a plain sudo echo with a redirect won’t work, since the redirect runs as your unprivileged user; pipe through sudo tee instead.)

echo 0 | sudo tee /sys/module/zfs/parameters/spa_load_verify_metadata
echo 0 | sudo tee /sys/module/zfs/parameters/spa_load_verify_data

With verification disabled by these tunables, you can attempt to re-run the same

sudo zpool import -fFX -R /mnt Valkyrie -o readonly=on

as before, hopefully with the result of your pool being mounted in /mnt/Valkyrie as read-only. At this point you should attempt to recover by copying to an external drive/system or additional pool with redundancy.


Quick update. But first, thank you @etorix, @HoneyBadger, @neofusion, @Fleshmauler, @Protopia, @joeschmuck, @sfatula. That did work and I was able to access the pool. I have backed up the most critical stuff, and I am now waiting for more drives to arrive so that I have a place to copy the rest to. I’ll update again as soon as I get my drives and everything is backed up.


@HoneyBadger - curious; after OP successfully backs up needful files, would it be required to manually re-enable these tunables, or is this a change that only lasts until next reboot?

@Fleshmauler That is a good question.

I also would assume @msarles would be rebuilding his pool by destroying it and recreating it. Is that a fair assumption?

With what feels like so many metadata corruptions in recent months, these things are very good to know. I’m curious if what seems like an upward tick is really just due to many more users building with hardware that does not meet the minimum specifications. There are a lot of non-ECC RAM users out there these days.

Hey, is that a TrueNAS Tech Talk Topic?


The tunables will reset to their default of 1 on the next module load or reboot.
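If you want the defaults back immediately rather than waiting for a reboot or module reload, you can write them by hand the same way they were disabled. The sysfs paths are from earlier in the thread; the scratch-file demo below just shows the echo-through-tee mechanism, since a plain `sudo echo 1 > file` fails (the redirect runs as your unprivileged shell):

```shell
# On the real system (privileged write via tee):
#   echo 1 | sudo tee /sys/module/zfs/parameters/spa_load_verify_metadata
#   echo 1 | sudo tee /sys/module/zfs/parameters/spa_load_verify_data
#
# Same mechanism, demonstrated against a scratch file instead of sysfs:
param=$(mktemp)
echo 0 > "$param"                  # simulate the disabled state
echo 1 | tee "$param" >/dev/null   # restore the default of 1
cat "$param"
```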

I can’t lie, I’ve seen a couple more than I’d like of late. Multi-vdev systems can be more insulated from it due to the defaults of redundant_metadata=all in the pool config, which duplicates critical metadata across vdevs wherever possible, but many (most?) home users have a single Z2 vdev. The challenge is again finding a scenario where the pool is faulted and we can probe it while the user is okay with waiting to recover.

There does seem to be a fairly strong correlation with non-ECC, but I did see one instance of an Intel+ECC system that didn’t want to mount a pool.


This thread has brought up a good point that I have been ignorant of: potential metadata issues with RAIDZ, with respect to home users who typically use RAIDZ vdevs, like myself (and have been since FreeNAS 8.0). This is mainly so I understand the implications when I recommend a RAIDZ2 to someone, especially if they do not have ECC RAM.

There are a lot of mentions of metadata on the TrueNAS forum, and some material in Resources, but my head is still spinning. I might be able to digest and understand it, and I think I do to some extent; however, I wanted to just ask the simple question, and no need to make the answer lengthy.

Question: For an average home user who does not need ultra-fast access to the NAS data, where a few milliseconds make no difference at all, what would be a reasonable recommendation for a person who has four 24TB spinning-rust drives, is looking for a capacity of around 40TB, and does not have ECC RAM? I’m asking with metadata in mind.

RAIDZ? Two vdevs of 2-disk mirrors? Add a pair of 120GB SSDs as an sVDEV? Something else?
If the user needs to add one or two more drives to do it properly, that is good to know as well, so it could be recommended.

I’m curious what would be a better solution to recommend.

If this is already documented somewhere that I have not found yet, and you know the link…

That’s a good question: what does a baseline system look like for the average Joe? There doesn’t really seem to be one, though. You have equal numbers of maniacs who raid datacenters for drive sleds and “look at these spare parts I scraped together” type builds. We’d have to settle on a standard first, and your average home PC without fancy HBAs can handle mirrored SSD boot drives plus four more (and even that asks for 6 onboard SATA ports, which some boards can only dream of).

I think once we polled enough people and got enough opinions, it would be nice to document a basic home build and let new people have a model to start with. I think 2 vdevs of mirrors in a single pool is a great start. But this thread has been dramatic enough already. I’m just glad the OP got to save his bacon after it looked like it was done, time after time. Says a lot about ZFS, and about iX basically donating support time; they owe us nothing, along with the many volunteers.

Since iX collects anonymized data, they probably already know what an average joe system looks like, the more I think about it. It’s an opt-in, but I bet it gets traction.

Certainly, for a typical average user, there’s no need for an sVDEV.


T3 guys, you know what to do: let’s have some data-driven ECC vs. non-ECC discussion related to metadata. :slight_smile:

Well, it’s all backed up. Now I need to figure out what to do next. I saw @joeschmuck mention that I would likely rebuild my pool by destroying it and then recreating it. I would love advice on what I should do next.

Is destroying and rebuilding the best option or is there a different way?

Is there a good tutorial somewhere where I can learn what I should be doing better to protect myself from the situation I just was in?

What do you all do to back up your data? Should I set up a 2nd server and rsync it regularly?

I’ve learned a lot from you all and I’m grateful for you taking time to help me.


Server-grade hardware might help. LSI (Broadcom) HBAs need good cooling. Server chassis are usually set up well for that, but noisy.

Replicating your data pool to another ZFS machine, or some other backup, can help with this type of weird failure. Not sure if the root cause was ever figured out.
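To make the replication idea concrete, here is a rough sketch of a nightly push to a second ZFS box. The pool name Valkyrie is from this thread, but “backup-host” and “BackupPool” are made up, the `date -d` usage is GNU-style, and TrueNAS can configure all of this from the UI as a Replication Task; this only writes the script out and syntax-checks it:

```shell
cat > ./replicate.sh <<'EOF'
#!/bin/sh
# Take a dated recursive snapshot, then send only the changes since the
# previous day's snapshot over SSH to the receiving system.
TODAY=$(date +%Y%m%d)
PREV=$(date -d yesterday +%Y%m%d)
zfs snapshot -r "Valkyrie@auto-$TODAY"
zfs send -R -i "Valkyrie@auto-$PREV" "Valkyrie@auto-$TODAY" \
    | ssh backup-host zfs receive -F "BackupPool/Valkyrie"
EOF
chmod +x ./replicate.sh
sh -n ./replicate.sh   # parse-only check of the generated script
```

The first run would need a full (non-incremental) send to seed the target; after that, the incremental sends are small and fast.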

You can browse the articles below

Hardware guide can be downloaded as a PDF, see yellow download button on upper right


@msarles
I am not the authority on metadata issues. But if you search the forum, you will find people turning back to a previous time (rewinding), if they can, or rebuilding the pool.

If you have not already done so, test that RAM and test that CPU. Ensure your system is stable: RAM testing for several days if you can, CPU testing for 6 hours or longer. Yes, it will get hot; that is part of the goal. And address any cooling issues you may have.

I’m glad you were able to backup all your data.

Also, get some scripts that will regularly test your drives and scrub your pools on a schedule. Look at Joe’s first link there. Once configured, it will send you a health report each day via email. You will learn to recognize patterns there… for example, if drive 6 is always near 49°C it may not be panic time whilst the other drives are cooler. Some drives get sandwiched and starved of airflow.
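For the schedule itself, TrueNAS has built-in Scrub Tasks and S.M.A.R.T. test tasks in the UI; on a bare ZFS box, the equivalent cron entries might look something like this (pool name, device, paths, and times are all illustrative):

```
# m  h  dom mon dow  command
0    3  *   *   0    /sbin/zpool scrub Valkyrie             # weekly scrub, Sunday 03:00
0    2  *   *   1-6  /usr/sbin/smartctl -t short /dev/sda   # daily short SMART self-test
0    1  1   *   *    /usr/sbin/smartctl -t long /dev/sda    # monthly long self-test
```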

A daily health report ensures that things like this don’t just suddenly turn up. You will probably start seeing warning signs ahead of time, and know which drives need replacing. Also get a right-sized UPS and ensure that it’s communicating via network or USB. Even the small-ish CyberPower units will talk on USB and can trigger a soft shutdown if battery capacity gets too low; they also stream power off the top of the batteries, so the power is clean and the spikes and dips get smoothed. Runtime is however much you are willing to spend, but once you go UPS you will get familiar with changing batteries sooner or later. Not a big deal, just be aware you’re introducing maintenance.


I use the multi_report script from Joe and have been for a long time. It is fantastic in providing me with advanced, detailed knowledge of the system. It also does short and long tests on drives (including SSDs), can make backups of your configuration files, and sends detailed reports. I also have TrueNAS send me the standard alerts and notices.

Both of my racks, with associated equipment including (used) Supermicro commercial servers with redundant power supplies, are backed up by dual rack-mounted UPS systems; either UPS can power everything. Those are in turn backed up by a whole-house generator, so I never lose power and power gets filtered. Servers are set to shut down after 5 minutes if either UPS reports “on battery,” which would mean the generator did not start or transfer.

Servers are on 24/7/365 and all have ECC RAM, commercial SSDs, and enterprise spinning-rust drives. Pools consist of multiple Z2 vdevs and are no more than 8 wide. A lot of the drives are over 7 years old and run fine. Max CPU temp at 98-100% load is 58°C and max drive temp is 48°C; under average load, CPU temp is 34°C and drive temps are under 38°C.
