ZPOOL Issue-Many drive faults when expanding

GlennS · January 15, 2025, 5:47am

So this week I got another 18TB drive to install in my machine.
I tested the drive first before committing to adding it to the Pool, and the SMART tests all came up okay.
So I added the drive, and now my entire system is failing.
It currently says the resilver process will take 726 years, and it just keeps going up!
It has been continuously scanning the same drive since Monday.
It’s reporting ridiculous numbers of errors across all the drives.
Has anyone experienced this?
I successfully added a drive shortly after Electric Eel was released and it went smoothly, albeit slowly.
Could my HBA be failing?
That said, one drive is not connected to the HBA (it is attached directly to the motherboard) and it is reporting crazy numbers of faults too.
I need some help troubleshooting this.

Could it be my Power supply?

ic2_Alpha · January 15, 2025, 7:43am

I expanded my pool on 24.10 back in December too. If I remember correctly, it first went through the expansion process and then through a resilver. The resilver failed for me, and I had to manually import the drive into the pool with the force option selected.
What does the Shell show when you run sudo zpool status -v "name of your pool here between quotes"?

GlennS · January 15, 2025, 8:11am

It was the HBA controller card - it was cooked. I unplugged the SAS-to-SATA cables and plugged the drives directly into the motherboard. It took ages to boot, but it is up and running. ZFS is still resilvering, but at least it is reporting no errors.

GlennS · January 20, 2025, 8:11am

Well, it didn’t stay fixed. The drives continued to give me errors, so I bought replacement drives. The originals were all from the same batch, so it is possible they failed together.
No sooner had the new drives been installed than I got a Suspended Pool.
I destroyed the Pool and completely removed the old drives, replacing them with the new drives, and now I just keep getting SMART errors.
Is it possible that TrueNAS is corrupted somehow?

etorix · January 20, 2025, 12:25pm

Which kind of SMART errors?
CRC could be an issue with cabling and/or HBA. Pending/reallocated sectors are drive failures.
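
To make that distinction concrete, here is a rough Python sketch that sorts the non-zero counters from a `smartctl -A` attribute table into the two groups. The function name `classify_smart` and the grouping are assumptions for illustration, following the rule of thumb above. It only covers the ATA attribute names smartctl normally prints (`UDMA_CRC_Error_Count`, `Reallocated_Sector_Ct`, `Current_Pending_Sector`); SAS drives report in a different format and are not handled.

```python
# Attribute names as smartctl prints them for ATA/SATA drives.
# The grouping is a rule of thumb, not an official classification.
LINK_ATTRS = {"UDMA_CRC_Error_Count"}
MEDIA_ATTRS = {"Reallocated_Sector_Ct", "Current_Pending_Sector"}


def classify_smart(report: str) -> dict:
    """Group non-zero raw SMART counters by their likely cause.

    `report` is the text printed by `smartctl -A /dev/sdX`.
    """
    findings = {"cabling_or_hba": {}, "drive_media": {}}
    for line in report.splitlines():
        fields = line.split()
        # Attribute rows start with a numeric ID and have at least 10 columns;
        # the 10th column is RAW_VALUE.
        if len(fields) < 10 or not fields[0].isdigit():
            continue
        name, raw = fields[1], fields[9]
        if not raw.isdigit() or int(raw) == 0:
            continue
        if name in LINK_ATTRS:
            findings["cabling_or_hba"][name] = int(raw)
        elif name in MEDIA_ATTRS:
            findings["drive_media"][name] = int(raw)
    return findings
```

Feed it the saved output of `smartctl -A` for one drive. Entries under `cabling_or_hba` point toward cables or the controller, while entries under `drive_media` point toward the drive itself.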

Is this a bare metal installation or virtualised?

GlennS · January 21, 2025, 8:22am

It’s bare metal.
I am not sure how to check what kind of SMART errors they are.
Where can I locate that?

etorix · January 21, 2025, 3:33pm

Run smartctl -a /dev/sda in the web terminal or, better, an SSH session, replacing ‘sda’ as needed.
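
With several drives to check, repeating that command by hand gets tedious. As a minimal sketch, assuming Python 3.9+ is available and the script is run as root, the following collects a `smartctl -a` report for every device that `smartctl --scan` lists. The helper names `parse_scan` and `smart_reports` are made up for this example.

```python
import subprocess


def parse_scan(scan_output: str) -> list[str]:
    """Extract device paths from the text printed by `smartctl --scan`."""
    devices = []
    for line in scan_output.splitlines():
        fields = line.split()
        # Each scan line begins with the device path, e.g. "/dev/sda -d scsi # ..."
        if fields and fields[0].startswith("/dev/"):
            devices.append(fields[0])
    return devices


def smart_reports() -> dict[str, str]:
    """Return {device path: full `smartctl -a` report} for all scanned devices."""
    scan = subprocess.run(
        ["smartctl", "--scan"], capture_output=True, text=True, check=True
    )
    reports = {}
    for dev in parse_scan(scan.stdout):
        # smartctl's exit status is a bit mask of warnings, so a non-zero
        # status is not treated as a failure here (no check=True).
        result = subprocess.run(
            ["smartctl", "-a", dev], capture_output=True, text=True
        )
        reports[dev] = result.stdout
    return reports
```

Calling `smart_reports()` and printing each entry gives one report per drive, which can then be pasted into the thread.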
