Early on it threw a lot of errors, like the counter on a one-armed bandit slot machine…
Pulled it out of the storage pool and replaced it.
Wiped it… then ran badblocks, and it failed, miserably… After that I just left it in the machine; figured leaving the machine alone was safer. Let's get stability first.
Three weeks later I ran a long test: no errors. Ran badblocks again: not a single error…
hmm, huh ???
Had another drive throw 8 block errors last night; haven't had problems on that drive before. Wondering if I'm sitting with something else in my system that's causing this.
I have 2-3-4 open SATA ports on the MB; maybe move these drive connections from the LSI controller onto the MB?
But anything is possible. It would be weird, as it only has 5 drives in it at the moment, whereas I previously had 8 HDDs, so power usage and cooling should be in a better place.
Very likely, but they are in their own cooling zone, and if you did not change anything, there is a big 140mm fan there.
However, the LSI HBAs are meant to be used in servers and need airflow over the heatsink. Probably the same is true for the 10Gb NIC. Both are sitting close together in a very nice, but not very big case.
The case supports two fans in the front, but out of the box there is just one in the top, which means it pushes air to the CPU cooler, but not over the PCI slots.
Even if it is not the root cause, it might be sensible to add a second fan at the front of the case, if you have not done this already. Or a small one over the LSI heatsink, but there won't be much space. I added a fan mounted on a slot bracket, but again there might not be enough space and/or the airflow will only reach the lower card.
Edit: Just saw that there are several internal temperature sensors in the area, so comparing their values could give an indication of which areas are hotter. And between the two PCIe slots there are two PCI slots, so there is space, which probably is a good thing.
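If it helps with comparing the sensors, here is a minimal sketch for listing readings hottest first. It ASSUMES a Linux-based system that exposes sensors through the kernel's hwmon sysfs tree (`/sys/class/hwmon`); board sensors that are only visible through the BMC/IPMI will not show up this way. The function names are made up for this example.

```python
from pathlib import Path


def read_hwmon_temps(root="/sys/class/hwmon"):
    """Return {label: degrees C} for every temp*_input file under root.

    ASSUMPTION: Linux hwmon layout, where each chip directory has a
    'name' file and temp*_input files holding millidegrees Celsius.
    """
    temps = {}
    for chip in sorted(Path(root).glob("hwmon*")):
        try:
            chip_name = (chip / "name").read_text().strip()
        except OSError:
            chip_name = "unknown"
        for sensor in sorted(chip.glob("temp*_input")):
            try:
                millideg = int(sensor.read_text().strip())
            except (OSError, ValueError):
                continue  # sensor file exists but cannot be read right now
            short = sensor.name[: -len("_input")]
            temps[f"{chip.name}:{chip_name}/{short}"] = millideg / 1000.0
    return temps


def hottest_first(temps):
    """Sort readings so the warmest sensor comes first."""
    return sorted(temps.items(), key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    for label, deg in hottest_first(read_hwmon_temps()):
        print(f"{deg:6.1f} C  {label}")
```

Running it once at idle and once during a scrub or a long test would show which readings climb the most under load.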
BTW, the second PCIe slot is just x4, even though it is full length, so one of the two cards will not be able to use the x8 it is capable of. Not sure though at what point this actually matters.
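As a rough back-of-the-envelope check on when x4 would matter, here is a small sketch. The per-lane figures follow from the PCIe 2.0 and 3.0 signalling rates and encodings; everything else is an ASSUMPTION and not from this thread: which PCIe generation the slot and cards actually run at, roughly 250 MB/s sequential per HDD, and a 10Gb NIC running at full line rate.

```python
# Approximate usable throughput per lane, per direction, in MB/s.
# PCIe 2.0: 5 GT/s with 8b/10b encoding. PCIe 3.0: 8 GT/s with 128b/130b.
PER_LANE_MB_S = {
    "2.0": 5000 * 8 / 10 / 8,     # 500 MB/s
    "3.0": 8000 * 128 / 130 / 8,  # about 985 MB/s
}

# ASSUMPTIONS for illustration only (not measured on this system).
HDD_MB_S = 250            # optimistic sequential rate of one HDD
TEN_GBE_MB_S = 10_000 / 8  # 10 Gbit/s expressed in MB/s


def link_mb_s(gen, lanes):
    """Approximate one-direction bandwidth of a PCIe link."""
    return PER_LANE_MB_S[gen] * lanes


def headroom(gen, lanes, demand_mb_s):
    """How many times the link bandwidth exceeds the demand."""
    return link_mb_s(gen, lanes) / demand_mb_s


if __name__ == "__main__":
    five_drives = 5 * HDD_MB_S
    for gen in ("2.0", "3.0"):
        print(f"PCIe {gen} x4: {link_mb_s(gen, 4):.0f} MB/s, "
              f"{headroom(gen, 4, five_drives):.1f}x five HDDs, "
              f"{headroom(gen, 4, TEN_GBE_MB_S):.1f}x one 10GbE port")
```

Under those assumptions even a PCIe 2.0 x4 link has more bandwidth than five spinning drives or a single 10Gb port can use, so the narrower slot would mostly start to matter with many more drives or with SSDs behind the HBA.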
Just to illustrate: that is the way it is shipped. There is not much airflow below the red line from front to back, and at least the HBA needs airflow.
If different drives give you errors on the same cable, chances are the cable is at fault. I had a very bad time diagnosing why 7 drives in a DELL server repeatedly started throwing errors and degrading the pool while all drives were passing SMART tests individually. I ended up replacing the SAS backplane cable and have had no errors ever since. Just food for thought.
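One way to apply that reasoning is to keep a note of which cable and controller each erroring drive was on, and then look for a component shared by more than one drive. A minimal sketch follows; the record layout and every name in it (`disk-A`, `cable-1`, `hba-0`, ...) are hypothetical placeholders, not data from this thread.

```python
from collections import defaultdict


def suspects(error_events, min_drives=2):
    """Return shared components on which several distinct drives errored.

    Each event is a dict with 'drive', 'cable' and 'controller' keys
    (a hypothetical layout, filled in by hand from your own notes).
    """
    drives_by_component = defaultdict(set)
    for event in error_events:
        for kind in ("cable", "controller"):
            drives_by_component[(kind, event[kind])].add(event["drive"])
    return {
        component: sorted(drives)
        for component, drives in drives_by_component.items()
        if len(drives) >= min_drives
    }


if __name__ == "__main__":
    # Hypothetical example records, for illustration only.
    events = [
        {"drive": "disk-A", "cable": "cable-1", "controller": "hba-0"},
        {"drive": "disk-B", "cable": "cable-1", "controller": "hba-0"},
        {"drive": "disk-C", "cable": "cable-2", "controller": "onboard"},
    ]
    for (kind, name), drives in suspects(events).items():
        print(f"{kind} {name}: errors on {', '.join(drives)}")
```

If a cable and its controller both show up, as in the sample, the notes alone cannot tell them apart; moving one of the affected drives to a different cable or to an onboard port and watching where the errors follow is what separates the two.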
Let me get home and see how I can maybe flash the extra LSI card I have with the new FW… and install that into the machine. Then I always have a back-out option if I make a mistake with the flashing…
Remember, I tried a while back and could not get it done…
G