Hi, I’ve been running a TrueNAS server at home for a few years now, with smaller and bigger issues from time to time, but generally everything has been fine. It’s built from an old PC.
Recently I started replacing the drives with bigger ones, and yesterday I added a WD Red 4TB. It’s a used drive, 3 years old with 1212 power-on hours. I checked the surface beforehand and everything was fine.
I took out the old drive, added the new one, and clicked replace (I had to use “force” since I forgot to wipe it first). The resilver started, and that’s where the problems began.
I got uncorrectable errors and checksum errors. Before this everything was fine, and I was running a scrub once a week. I’m not sure where the errors came from, how to check that, or what to do next. I still have the old drive, so I was thinking of putting it back, hoping it will fix everything.
I’d double-check your connections for sure. Depending on the motherboard, there may also be an option to switch the onboard SATA between AHCI mode and RAID mode; my board, for some god-awful reason, allows me to set RAID mode and then pass through each of my drives as a RAID0 device. Make sure it isn’t doing anything like that.
You can run zpool clear Main to clear the errors and then run another scrub to see if they crop up again.
I will try that. I was a bit afraid of clearing the errors before fixing the issue, but you are right.
It will just clear the record that errors were reported; the next scrub will find them again if they are still there.
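For reference, the clear-and-recheck sequence suggested above could be written as a small dry-run helper. This is only a sketch: the pool name Main comes from the thread, the function name recheck_commands is made up for illustration, and the script only prints the commands unless it is started with --run (which needs root on the NAS).

```python
import subprocess
import sys


def recheck_commands(pool):
    """Return the zpool commands for a clear-and-recheck pass on `pool`."""
    return [
        ["zpool", "clear", pool],         # reset the pool's error counters
        ["zpool", "scrub", pool],         # start a scrub; it runs in the background
        ["zpool", "status", "-v", pool],  # show scrub progress and files with permanent errors
    ]


if __name__ == "__main__":
    run = "--run" in sys.argv[1:]
    for cmd in recheck_commands("Main"):
        print(" ".join(cmd))
        if run:
            # Only executed when --run is given explicitly.
            subprocess.run(cmd, check=True)
```

The status command has to be repeated once the scrub has finished; if the same files are listed again, the errors are real and not just stale counters.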
OK, so I tried clearing the errors, and it still sees them. Also, when I try to copy the affected files I get an error, so the files are definitely corrupted.
What happens if I put the old drive back? I haven’t added any new files to the pool, but I did run a scrub etc. Will TrueNAS accept the old disk, or will it see that it holds an older version of the pool and reject it?
SMR disks have bulk write performance issues: they have c. 30GB of CMR space for caching writes, and the data is then staged to SMR over time. If you are bulk writing, e.g. during a resilver, the CMR cache fills up, writes get massively delayed, and the drive appears to be faulty to ZFS.
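To put rough numbers on the cache-fill effect described above, here is a back-of-the-envelope sketch. The 30 GB cache size is the figure from the post; the incoming write rate and the destage rate are made-up example values, not measurements of any particular drive, and the function name is hypothetical.

```python
def seconds_until_cache_full(cache_gb, write_mb_s, destage_mb_s):
    """Seconds of sustained writing before the CMR cache is full.

    Returns float('inf') if the drive destages at least as fast as data arrives.
    """
    net_fill_mb_s = write_mb_s - destage_mb_s
    if net_fill_mb_s <= 0:
        return float("inf")
    # 1 GB is taken as 1000 MB here.
    return cache_gb * 1000 / net_fill_mb_s


if __name__ == "__main__":
    # 30 GB is the figure from the post; both rates are assumed examples.
    t = seconds_until_cache_full(cache_gb=30, write_mb_s=150, destage_mb_s=40)
    print(f"cache full after about {t / 60:.1f} minutes")
```

With those assumed rates the cache would be full in under five minutes, which is very short compared with a resilver that runs for hours. After that point the drive can only accept writes as fast as it destages them.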
Great, and I just ordered a second one.
Mine is exactly WD4003FFBX.
I think I remember now that I read about that somewhere. There is a list of CMR drives somewhere, since manufacturers rarely state that, am I correct?
So my question stands: can I take the new drive out and put the old one back?
Ideally, when replacing a good drive, leave the original drive connected for redundancy. Otherwise you are reduced to zero redundancy, which means any errors on the other disks will be unrecoverable.
Which means a checksum error on one disk while trying to rebuild the new disk will result in checksum errors on the new disk.
If the old disk was online at the time, the checksum error could’ve been corrected.
And figure out if the WD Red is a CMR or SMR disk.
Now, maybe ZFS would’ve put the same blocks in the same location on the new disk. I don’t know. And maybe if you dd the ZFS partition from the old disk over the ZFS partition on the new disk, then maybe it will recover it. I don’t know. But the GPT info must not be erased on the new disk.