Errors in pool after replacing drive

Hi, I’ve been running a TrueNAS server at home for a few years now, with smaller and bigger issues from time to time, but generally everything has been fine. It’s built from an old PC.

Recently I started replacing the drives with bigger ones, and yesterday I added a WD Red 4TB. It’s a used drive, 3 years old with 1212 power-on hours. I checked the surface beforehand and everything was fine.
I took out the old drive, added the new one, and clicked Replace (I had to use “force” since I forgot to wipe it first). The resilver started, and that’s where the problems began.
I got uncorrectable errors and checksum errors. Before this everything was fine, and I was running a scrub once a week. I’m not sure where the errors came from, how to check that, or what to do next. I still have the old drive, so I was thinking of putting it back, hoping it will fix everything.

I will be thankful for any help or suggestions.

How are these drives connected?
A similar number of checksum errors on each disk is suspicious.

SATA, directly to the motherboard. Do you think it could be a cable issue? Would reconnecting everything and scrubbing again help?

I’d double-check your connections for sure. Depending on the motherboard, there may also be an option to switch the onboard SATA between AHCI mode and RAID mode; my board for some god-awful reason allows me to set RAID mode and then pass through each of my drives as a RAID0 device. Make sure it isn’t doing anything like that :stuck_out_tongue:

You can run zpool clear Main to clear the error and then run another scrub to see if it crops up again.
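
For reference, a minimal sketch of that sequence as shell commands, assuming the pool is named Main as above and that you run it as root on the NAS. The function name `recheck_pool` is made up for illustration. Note that `zpool clear` only resets the error counters and does not repair anything, so the scrub is what tells you whether the errors are still there.

```shell
# Pool name taken from this thread; adjust to your own pool.
POOL="Main"

# Hypothetical helper: show current errors, reset the counters, start a scrub.
recheck_pool() {
    # List the files currently flagged with permanent errors.
    zpool status -v "$POOL"
    # Reset the error counters (this does not repair any data).
    zpool clear "$POOL"
    # Start re-reading and verifying every block in the pool.
    zpool scrub "$POOL"
}

# Usage, as root:
#   recheck_pool
#   zpool status -v "$POOL"   # check scrub progress and results afterwards
```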

I will try that. I was a bit afraid of clearing the errors before fixing the issue, but you are right.
It will just clear the record that errors were reported, and find them again if they are still there.

OK, so I tried clearing the errors, and it still sees them. Also, when I try to copy the affected files I get an error, so the files are definitely corrupted.

What happens if I put the old drive back? I haven’t added any new files to the pool, but I did run a scrub etc. Will TrueNAS accept the old disk, or will it detect that it’s an older version of the pool and reject it?

WD Red is an SMR disk - and even WD state that it is NOT suitable for ZFS.

See Read and Write slow RAIDz1- truenas scale - #6 by Protopia for a detailed explanation I researched this morning.

SMR disks have bulk-write performance issues: they have c. 30GB of CMR space for caching writes, and the data is then staged to SMR over time. If you are bulk writing, e.g. during a resilver, the CMR cache fills up, writes get massively delayed, and the drive appears to be faulty to ZFS.


Great, and I just ordered a second one.
Mine is exactly WD4003FFBX.
I think I remember now that I read about that somewhere. There is a list of CMR drives somewhere, since manufacturers rarely state that, am I correct?

So my question stands: can I take the new drive out and put the old one back?

I suspect this was your mistake.

Ideally when replacing good drives, leave the original drive connected for redundancy. Otherwise you are reduced to zero redundancy which means any errors in other disks will be unrecoverable.

Which means a checksum error on one disk while trying to rebuild the new disk will result in checksum errors on the new disk.

If the old disk was online at the time, the checksum error could’ve been corrected.
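
To illustrate, this is roughly what an in-place replace looks like from the command line with both disks connected. It is only a sketch: on TrueNAS you would normally use the Replace button in the web UI, the pool name Main comes from this thread, and the function name and both device arguments are placeholders to fill in from your own `zpool status` output. While the old disk stays online, ZFS resilvers onto the new one and only detaches the old disk once the resilver completes.

```shell
# Pool name taken from this thread; adjust to your own pool.
POOL="Main"

# Hypothetical helper: replace one pool member while it is still attached.
#   $1 = old device, exactly as it is named in `zpool status`
#   $2 = new device (placeholder, e.g. a /dev/disk/by-id/... path)
replace_in_place() {
    old_dev="$1"
    new_dev="$2"
    # Start the replacement; the old device stays in the pool during the resilver.
    zpool replace "$POOL" "$old_dev" "$new_dev"
    # Show the resilver progress.
    zpool status "$POOL"
}

# Usage, as root (both arguments are placeholders):
#   replace_in_place <old-device> <new-device>
```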

And figure out if the WD Red is a CMR or SMR disk.
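
One way to check is to read the exact model number from the drive and look it up in the manufacturer's CMR/SMR listing. A small sketch, assuming `smartctl` (from smartmontools) is available and the disk is a SATA drive that reports a "Device Model:" line; `/dev/sdX` and both function names are placeholders.

```shell
# Extract the model string from `smartctl -i` output read on stdin.
parse_model() {
    awk -F': *' '/^Device Model:/ { print $2 }'
}

# Print the model of one disk; the device path is supplied by the caller.
disk_model() {
    smartctl -i "$1" | parse_model
}

# Usage, as root (placeholder device name):
#   disk_model /dev/sdX
```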

Now, maybe zfs would’ve put the same blocks in the same location on the new disk. I don’t know. And maybe if you dd the zfs partition from the old disk over the zfs partition on the new disk, then maybe it will recover it. I don’t know. But the gpt info must not be erased on the new disk.

Is it possible to connect the new drive via USB for resilvering? I’m not sure if I have any free SATA ports.

OK, so I deleted the corrupted files and added an additional SATA power connector, so now I can add an additional drive to the server.

I replaced the next drive without taking the old one out and everything worked fine. It looks like the issue was some copying errors during resilvering.

What is strange is that I lost one photo, and the fifth episode of 3 different TV series.

Yes, though I would personally plug the new drive into a SATA port and use a USB adapter for the (partially failing) drive to be replaced.