Please save my life!

I would suggest reversing this step.

A common mistake people make is to remove the wrong disk because they assume the device numbers are bay numbers. They are not, and they are effectively randomized on every reboot.

The only way to identify a disk is via its serial number, and even that is not necessarily reliable when the disk has failed, as the system may report the serial of another disk.
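
To make the serial-number approach concrete, here is a minimal sketch, assuming you wrote down which serial sits in which bay while all disks were healthy. The inventory, the placeholder serials, and the `missing_disks` helper are all hypothetical and only illustrate the idea.

```python
# Hypothetical inventory: serial number -> physical bay, recorded while all
# disks were healthy. The serials below are placeholders, not real drives.
BAY_BY_SERIAL = {
    "SERIAL-A": "bay 1",
    "SERIAL-B": "bay 2",
    "SERIAL-C": "bay 3",
}


def missing_disks(bay_by_serial, visible_serials):
    """Return {serial: bay} for recorded disks the system no longer reports."""
    # Normalize case and whitespace so a hand-copied list still matches.
    visible = {s.strip().upper() for s in visible_serials}
    return {
        serial: bay
        for serial, bay in bay_by_serial.items()
        if serial.upper() not in visible
    }


# Serials the system currently reports (for example, copied from the disk list).
print(missing_disks(BAY_BY_SERIAL, ["serial-a", "SERIAL-C"]))
```

The bay of the missing serial is the one to pull, regardless of which device name the disk had before it dropped out.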

ANYWAY, hopefully if you put the other disk back and it reappears, you can fix the server. If that's not the case, then things are not good.

That’s a lot of hours and not a lot of testing/maintenance.

And I assume you have no backup?

The result of the disk query is attached.
Anyway, thank you and everyone for the help!
disk_query.txt (5.6 KB)

I thought it was until yesterday. :frowning:

A backup is a separate copy. If losing one copy means you lose the data, then you didn't have a backup.

As I wrote above, it started when I had been getting an error message for a long time and noticed that instead of 8 disks, I only saw 7. I wrote down the serial numbers from the Storage/disks menu. I thought I would put the new hdd in the place of the one that was not listed and that would be it. Instead I saw “pool unknown”. Then I panicked. I put back the hdds that I had taken out. Nothing. I tried it with a disk I had taken out a while ago (in case I was wrong). Nothing. I tried a scrub on the non-existent pool. Obviously nothing. I browsed the Internet and tried to import the zpool. Then I got here in the evening after looking through the old freenas forums.
Now, however, I noticed that I get this error message when I put one of the hdds back. This hdd is the 3rd member of the damaged array. Can something be done with this? Is there at least some way to get it working until the resilver with the new winchester takes place?

Da2 is growing some (irreparable) low level failures. That is bad.
Where are the two missing drives? Back in the NAS, but not showing anywhere? If so, I would power the NAS off, check and reseat every power cable and every data cable, and reboot.

The bolded items are the serial numbers for the drives that are recognized. The missing two drives will be something else. I assume you know which drives you replaced; however, always track by serial number. I was hoping the midclt command would also provide the missing serial numbers, but it did not.

As the others have said, hopefully reinstalling the original drives is all that is needed to recover.

As for the da2 error message, I can’t stress this enough: you need to convert da2 into a serial number. You can do that by using smartctl -a /dev/da2, and then you will have the serial number of the drive, which should match one of the drives you just installed. Remember what was said before: drive names (IDs) can change, and they do, especially when you introduce a different drive.
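
If you want to script that lookup, here is a rough sketch, assuming smartctl is installed and prints a "Serial Number:" line (the capitalization can vary by drive type, so the match is case-insensitive). The helper names `parse_serial` and `read_serial` are made up for illustration, and the sample text is a one-line excerpt using the serial mentioned in this thread, not a full smartctl report.

```python
import re
import subprocess


def parse_serial(smartctl_output):
    """Extract the serial number from text printed by `smartctl -a`."""
    match = re.search(
        r"^Serial Number:\s*(\S+)",
        smartctl_output,
        re.MULTILINE | re.IGNORECASE,
    )
    return match.group(1) if match else None


def read_serial(device):
    """Run smartctl on a device such as /dev/da2 and return its serial.

    smartctl can exit non-zero on a failing disk while still printing the
    identity section, so the exit status is deliberately not checked here.
    """
    result = subprocess.run(
        ["smartctl", "-a", device],
        capture_output=True,
        text=True,
    )
    return parse_serial(result.stdout)


# Illustrative excerpt only; real output contains many more lines.
SAMPLE = "Serial Number:    Z1F4TXDC\n"
print(parse_serial(SAMPLE))
```

Calling `read_serial("/dev/da2")` after each reboot shows which physical drive currently sits behind that device name.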

When you added that drive and received those errors, did you see if the pool was back online? If not, could you import it? I say these things because you already know you have two drives that had errors so you should expect to see those. Also, 24 offline unrecoverable errors is not tragic. Your only goal here is to get the pool mounted and once it is, copy all your data off the machine to somewhere safe. Nothing more at this time.

If you have no place to copy the data then you need to “Replace” the drive using the proper procedure, with the failed drive still installed. You do not need to run a SCRUB, the checks will happen automatically while rebuilding the pool or reading the pool.

What is labeled da2 in this image is not the same drive as the one running as da2 (s/n z1f4txdc) in the error message above. Only the original drives are loaded now. Of the two missing ones, one does not show up at all, and the other is the da2 in the error message.
That is why I had the question, whether the da2 drive in the error message can somehow be made online. Even using some external application.
By the way, the drives are in hotswap drawers, so there was no need to touch the cabling during the replacement.

I thought wrong. :frowning:
But then I don’t understand.
This error message only appears when I connect the hdd with serial number z1f4txdc. Otherwise, there is no error message at all.

Just to be clear, do not hot swap the drives. Power down the system, then swap drives, then power up.

I may not be understanding you; when you installed drive serial number z1f4txdc, it generated the error message of the 24 offline unrecoverable errors, correct? And if true, that is okay.

Was this a Hot Swap? If yes, power down, wait 5 seconds or more, power on. And check the status of the pool. Did it come back online? Does the GUI show one more drive installed? da6 should come back.

What I currently don’t understand is why two drives are not being recognized by FreeNAS. If they are plugged in, they should show up. Not as part of a pool, but at a minimum as disks available for use. So I’m not sure why drives da6 and da7 are not showing. You may need to open the computer and examine it. Check the connectors on the drives and inside the back of the drive bay for damage, as well as on the HBA (assuming you have one). It can be easy to damage these electrical connectors, or one may just fall off if it was not installed well.

Maybe someone else has more advice, I’m almost out of options.

Can I cause a lot of trouble if I put two older drives in place of the two “invisible” drives?
I mentioned hotswap to indicate that you don’t have to dig inside the machine to replace the HDD. I always turned it off before that.
It doesn’t see the z1f4txdc serial number drive, but I only get the aforementioned error message if it’s plugged in.
I swapped two drives first to check the mounts and the mounts seem fine.
Do I understand correctly that faulty Winchesters should also appear among the disks?

You can NOT recover without bringing back at least one of the two missing drives, so do not put other drives in their place. You may try to swap drives around to see if some connectors/bays are defective. You may try to connect the defective drives directly to motherboard SATA ports, using known good cables. Do you hear the drives spin?
If you’re out of ports, you may remove one of the SLOG drives and/or one of the boot drives (not the one your BIOS is booting from).

If it seems like we are treating you like you don’t know anything, it is intentional: we are doing our best to communicate properly and make no assumptions. We definitely do not want you to lose your data if it can be recovered. Please do not take any offence at how we word things.

“Winchesters” LOL, that brought back some memories. I have worked with that type of connector for the past 44 years. They are likely not in your system these days, but I do appreciate the trip down memory lane. I just retired from working in the submarine community.

I didn’t want to keep changing the words disk and hdd. In Hungary, we still use the word “winchester”. Computer parts stores have not used this word since the ssd became widespread. :slight_smile:
In any case, I thank everyone for taking the time to help with my problem. I will try the solutions suggested above.

It seems that the hdd is physically broken. I just tried putting other hdds in the missing places, and freenas sees them. Of course, the pool won’t recover, but at least I now know what the problem is.
I have a disk of the same type, so I will try to transfer the controller board (PCB) from it.

That’s non-trivial, do you have experience doing that? Otherwise it might be better to leave it to a professional.

Or is this a YOLO situation where whatever happens - happens?

I thought so too.
By the way, yolo. From here, the road only leads up. :slight_smile:

This may work; however, it tends to work best if you have a board from the same lot of drives.

Also, what are your failed indications? Is the motor spinning up? If no, it could be the electronics or the motor. If it turns out to be the motor, you can move the platters from one drive to the other, however if you have multiple platters, there is a technique to doing it. I have done it once just to see if I could, and it does work. Patience and tape makes a win. Of course if the drive is helium filled, it is a no-go.

Swapping out the board is easy but due to manufacturing differences between different plants and different batches/lots of drives, the electronics timing may be different which is why it is always best to replace with a same lot board.

Best of luck to you.

If I’m not mistaken, the PCB is programmed to the specific HDD, so you may also have to transfer the original bios (it isn’t bios, but I’m going to use that term because I imagine it’d be similar to manually flashing bios chips) to the donor PCB.

Edit: turns out it is bios, and people just transfer the bios chip over physically to the donor pcb

I talked to the specialist and I had a question. If we start the bad hdd and make a copy of it (we don’t just copy its contents, but a complete copy), will it be accepted by freenas, or must the original disk be used anyway?