So I have a TrueNAS system that’s been perfect for ages, and I love it. I wanted a little pool for some VMs, so I added a basic controller with a couple of SSDs.
Today I was doing my backup, so the array was being hammered, and da2 failed. It didn’t even appear in the list of disks, so obviously my pool was not happy. I opened up the case and unplugged the 2 SSDs from the controller, rebooted, and the disk is now up.
"The following alert has been cleared:
Pool TVShowsPool state is ONLINE: One or more devices has experienced an error
resulting in data corruption. Applications may be affected."
Anyone heard of this? Is the disk dying, and a reboot just brought it back?
Any thoughts? Thanks
Well, not necessarily exactly as you described, but I have fixed enough issues where either the disk, the cable, or even the backplane began running into problems under heavy (for the particular system) load, or just during a scrub and trim at the same time, and the disks were no longer accessible.
Usually those were fixed by a cold boot and you knew you had to do some maintenance in the near future.
Funny you say that. I unplugged the connectors to the problematic drive, as I was going to remove it, then plugged them back in and unplugged the SSDs so the SATA controller had nothing to talk to. Once the backup is complete I’ll remove the controller and SSDs permanently.
EDIT
Actually, I should ensure the connections are good and reconnect the SSDs to see if they were the problem. But the more I think about it, the connector wasn’t tight. I’m beginning to wonder if the connector wasn’t in properly and just worked its way out. The drive was only put in 5 days ago.
SO
Complete backup - Don’t touch anything it works!
Double check connectors
Connect SSDs
Close case and wait and see.
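For the "wait and see" part, one option is to keep an eye on the per-device error counters in `zpool status` rather than waiting for the next alert. Below is a minimal Python sketch, assuming the usual NAME / STATE / READ / WRITE / CKSUM column layout. The function name `find_suspect_devices` and the sample text are made up for illustration; the sample is not output from the system in this thread.

```python
# Minimal sketch: flag devices in `zpool status` text that are not ONLINE
# or that show non-zero READ/WRITE/CKSUM counters.
# Assumes the usual column layout; the helper name is hypothetical.

def find_suspect_devices(status_text):
    """Return a list of (name, state, (read, write, cksum)) for suspect rows."""
    suspects = []
    in_config = False
    for line in status_text.splitlines():
        fields = line.split()
        # The device table starts right after this header row.
        if fields[:5] == ["NAME", "STATE", "READ", "WRITE", "CKSUM"]:
            in_config = True
            continue
        if not in_config:
            continue
        # A blank line or the "errors:" line ends the device table.
        if not fields or fields[0].startswith("errors:"):
            in_config = False
            continue
        # Skip section labels such as "logs" or "cache" that have no counters.
        if len(fields) < 5:
            continue
        name, state = fields[0], fields[1]
        counters = tuple(fields[2:5])
        # Counters are compared as text because they may be abbreviated.
        if state != "ONLINE" or any(c != "0" for c in counters):
            suspects.append((name, state, counters))
    return suspects


# Illustrative sample only, NOT real output from the poster's pool.
SAMPLE = """\
  pool: TVShowsPool
 state: ONLINE
config:

    NAME           STATE     READ WRITE CKSUM
    TVShowsPool    ONLINE       0     0     0
      raidz1-0     ONLINE       0     0     0
        da0        ONLINE       0     0     0
        da1        ONLINE       0     0     0
        da2        ONLINE       0     0    12

errors: No known data errors
"""

for name, state, counters in find_suspect_devices(SAMPLE):
    print(name, state, counters)
```

To use it on a real system you would feed it the actual text instead of `SAMPLE`, for example by piping `zpool status` into the script and reading `sys.stdin.read()`. Counters that keep climbing on one device after reseating the cables would point at the drive or its cabling rather than at the add-on controller.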
Well, these SATA controller “add-on” cards are not recommended at all.
Some models work, but a lot are not very well built. Just search this forum or the old one; there is also a useful thread explaining why SAS HBAs are recommended (I think it is in the resources section).
The scenario you described is pretty typical: they work fine at first, but under load they start to fail.
The server is a Dell T340, using the internal controller that was flashed and set up a couple of years ago. It’s been working perfectly since then. A couple of days ago, I added the cheap controller for a couple of SSDs for a pool for some VMs. I wasn’t bothered about its reliability, as it was for VMs, but I suspect it’s causing a problem, so I’ll take it out.
I think the only PCI chipsets I’ve not had weird problems with are the ASM chips. I think the card you spec’d uses JMicron. Here in the US, the ASM1064 2 port cards are about the same price as the one you spec’d. No idea if chipset compatibility is the issue. Since Core is FreeBSD-based, I recommend looking at the FreeBSD compatibility data for hardware. Good luck.
John
Thanks for that, I’ve ordered a replacement card with the ASM1064 chipset.
I decided that, as there was a question mark over the drive, I’ll return it and just get a new one. It’s not worth the risk.