ZPOOL issue: many drive faults when expanding

So this week I got another 18TB drive to install in my machine.
I tested the drive before committing it to the pool, and the SMART tests all came back okay.
So I added the drive, and now my entire system is failing.
It currently says the resilver process will take 726 years, and it just keeps going up!
It has been continuously scanning the same drive since Monday.
It’s reporting ridiculous numbers of errors across all the drives.
Has anyone experienced this?
I successfully added a drive shortly after Electric Eel was released, and it went smoothly, albeit slowly.
Could my HBA be failing?
That said, one drive is attached directly to the motherboard rather than the HBA, and it is reporting crazy numbers of faults too.
I need some help troubleshooting this.

Could it be my power supply?

I expanded my pool on 24.10 back in December too, and if I remember correctly, it first went through the expansion process and then through a resilver. The resilver failed for me, and I had to import the drive into the pool manually, with the force option selected.
What does the Shell show when you run sudo zpool status -v "name of your pool here, between quotes"?
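If the status output is too long to read comfortably, something like the rough sketch below might help pick out the rows that matter. It assumes the usual NAME / STATE / READ / WRITE / CKSUM column layout of zpool status; the function name zpool_error_rows is made up for illustration, and "tank" in the usage comment is only a placeholder pool name.

```shell
# Hypothetical helper: reads `zpool status` text on stdin and prints every
# config row whose READ, WRITE or CKSUM counter is not zero.
zpool_error_rows() {
  awk '
    # Config rows are assumed to look like: NAME STATE READ WRITE CKSUM [note]
    $3 ~ /^[0-9]/ && $4 ~ /^[0-9]/ && $5 ~ /^[0-9]/ &&
    ($3 != "0" || $4 != "0" || $5 != "0") {
      print $1 " read=" $3 " write=" $4 " cksum=" $5
    }
  '
}

# Usage (placeholder pool name):
#   sudo zpool status -v "tank" | zpool_error_rows
```

If nearly every drive shows up in the result, that would hint at something the drives share (HBA, cabling, power) rather than at one bad disk.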

It was the HBA controller card - it was cooked. I unplugged the SAS-to-SATA cables and plugged the drives directly into the motherboard. It took ages to boot, but it is up and running. ZFS is still resilvering, but at least it is reporting no errors.

Well, it didn’t stay fixed. The drives kept giving me errors, so I bought replacement drives. The originals were all from the same batch, so it is possible they failed together.
No sooner had the new drives been installed than I started getting a suspended pool.
I destroyed the pool, completely removed the old drives, and replaced them with the new ones, and now I just keep getting SMART errors.
Is it possible that TrueNAS is corrupted somehow?

Which kind of SMART errors?
CRC errors usually point to cabling and/or the HBA. Pending or reallocated sectors point to failing drives.

Is this a bare metal installation or virtualised?

It’s bare metal.
I am not sure how to check what kind of SMART errors they are.
Where can I locate that?

Run smartctl -a /dev/sda in the web shell or, better, in an SSH session, replacing ‘sda’ with the device name of each drive.
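To tie this back to the CRC versus pending/reallocated question above, here is a minimal sketch that filters the attribute table for just those counters. It assumes SATA drives, where smartctl -A prints the ATA attribute table with the raw value in the tenth column; SAS and NVMe drives report differently and would not match. The function name classify_smart is hypothetical, and the attribute IDs used are 199 (UDMA CRC errors), 5 (reallocated sectors) and 197 (pending sectors).

```shell
# Hypothetical helper: reads `smartctl -A` text on stdin and prints only
# the attributes of interest that have a non-zero raw value.
classify_smart() {
  awk '
    # 199 = UDMA_CRC_Error_Count: errors on the link (cable, backplane, HBA)
    $1 == 199 && $10 + 0 > 0 {
      print "link (cable/HBA): " $2 " raw=" $10
    }
    # 5 = Reallocated_Sector_Ct, 197 = Current_Pending_Sector: media problems
    ($1 == 5 || $1 == 197) && $10 + 0 > 0 {
      print "drive media: " $2 " raw=" $10
    }
  '
}

# Usage (replace sda as needed):
#   sudo smartctl -A /dev/sda | classify_smart
```

No output means none of those three counters is above zero on that drive. The CRC counter never resets, so an old count left over from a bad cable or controller stays visible even after the fault is fixed; what matters is whether it keeps climbing.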