Hi,
I recently started using a new NAS, upgrading from my old nas4free box.
I started fresh with four new drives.
It took me a while to set up everything the way I wanted in TrueNAS and to transfer all the data to the new box; the transfer alone took 4 days using rsync…
I forgot to burn in the new drives…
They are Seagate ST12000VN0008 12TB drives.
Now all of a sudden I am getting these errors for one of them:
I’m currently running it again, which takes some time.
But I suspect something is seriously wrong.
I’d like to do the burn-in tests described here: Hard Drive Burn-in Testing | TrueNAS Community
Considering how long it took me to transfer all my data and set up shares and permissions… Is there a way to recover at least that setup work?
I have my four drives in a 2x mirror setup, so could I offline one drive, do the burn-in tests, and then resilver it, repeating this for each drive?
Worst case I still have my old NAS, and the data didn’t change since I transferred it.
I’m not the most experienced in assessing SMART data, but to me it looks like the drive is indeed faulty.
You could try switching cables and check if everything is seated properly, but if that fails I’d RMA the drive.
Possible, but with 12 TB drives this will take a very long time, given that you can only burn in one drive at a time.
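To put a rough number on "a very long time", here is a back-of-the-envelope sketch. It assumes a destructive four-pattern write-and-verify test (eight full passes over the disk, as a default `badblocks -w` run does) and an average sequential throughput of about 200 MB/s. Both figures are assumptions rather than measurements from this drive, and the function name is just illustrative.

```python
def burn_in_hours(capacity_tb, mb_per_s, passes):
    """Hours needed to stream over the whole disk `passes` times."""
    capacity_bytes = capacity_tb * 1e12            # vendors quote decimal terabytes
    seconds_per_pass = capacity_bytes / (mb_per_s * 1e6)
    return passes * seconds_per_pass / 3600

# Assumed: 12 TB drive, ~200 MB/s average, 4 patterns x (write + read) = 8 passes
hours = burn_in_hours(12, 200, 8)
print(f"{hours:.0f} h, about {hours / 24:.1f} days per drive")
# prints "133 h, about 5.6 days per drive"
```

Done one drive at a time, that would be repeated four times, with a resilver after each drive, and real throughput drops toward the inner tracks, so treat this as an optimistic estimate.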
You’ll have no redundancy when you burn in the other drives and losing another disk will kill the whole pool. I’d advise you get your other NAS up to speed on the data right now.
Do you have another source of backup?
Then you can check the cabling; if the drive still throws errors, RMA it. Then burn in all drives simultaneously and recreate the pool from scratch.
CRC errors are usually my go-to indicator for bad cables; I wasn’t sure whether OP’s SMART data rules out a cabling problem for sure.
I can’t really remember, but I think I also got pending sectors with a bad cable on one of my SSDs.
You still need to act now, though; I take it that is what @Davvo meant by cooling down the other drives. Two mirrored vdevs with one lost drive are not safe at this point, especially with untested drives.
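For anyone who wants to apply that rule of thumb to their own drives, here is a minimal sketch that scans the attribute table printed by `smartctl -A` and flags non-zero raw values on the usual suspects. The sample table is made up for illustration (it is not OP's data), and the "media" versus "cable/link" labels are only a heuristic: as said above, a bad cable can apparently produce pending sectors too.

```python
# Attributes worth a look, with the cause they usually (not always) point to.
WATCHED = {
    "Reallocated_Sector_Ct": "media",
    "Current_Pending_Sector": "media",
    "Offline_Uncorrectable": "media",
    "UDMA_CRC_Error_Count": "cable/link",
}

def flag_attributes(smart_text):
    """Return (name, raw_value, likely_cause) for watched attributes with raw > 0."""
    findings = []
    for line in smart_text.splitlines():
        fields = line.split()
        # Attribute rows start with a numeric ID and have 10 columns.
        if len(fields) < 10 or not fields[0].isdigit():
            continue
        name, raw = fields[1], fields[9]
        if name in WATCHED and raw.isdigit() and int(raw) > 0:
            findings.append((name, int(raw), WATCHED[name]))
    return findings

# Made-up sample in the `smartctl -A` layout, for illustration only.
SAMPLE = """\
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  5 Reallocated_Sector_Ct   0x0033   100   100   010    Pre-fail  Always       -       0
197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       16
198 Offline_Uncorrectable   0x0010   100   100   000    Old_age   Offline      -       16
199 UDMA_CRC_Error_Count    0x003e   200   200   000    Old_age   Always       -       0
"""

for name, raw, cause in flag_attributes(SAMPLE):
    print(f"{name}: raw={raw} (usually {cause})")
```

In this invented sample the CRC counter is zero while the pending and uncorrectable counts are not, which is the pattern that would point away from the cable and toward the drive itself.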
Get your backups in order ASAP.
You can check the three remaining drives and already start the RMA process. Waiting for other drives to be burned in won’t speed up anything. If they all come back clean you’re already a few days ahead since you started the RMA process on the known faulty one.
Yeah, the temperatures are a bit too high, I agree. The NAS is a TerraMaster F4-424 Pro and I just confirmed that the fan is spinning. I need to check if I can increase the speed in the BIOS.
I still have my original NAS, and the only thing that was added since the migration is some snapshots from my Proxmox server.
That looks like a very handy script, although it’s read-only?
Will it test the full disk this way?
Just for my understanding: do I need to let this script run for a few passes and then run the SMART test again to check for errors?