I've used TrueNAS Scale for 2 years. Now I'm setting up a new NAS server by creating a TrueNAS Scale VM on Proxmox and passing through an HBA card.
But each time I finish copying data from the old NAS, the pool ends up in a degraded state with 1-2 faulted disks (with hundreds of read errors) and 1 degraded disk.
The degraded disk passes both the short and long SMART tests with no read/write errors.
I've tried the solutions below, but none of them has solved the issue yet:
Replaced the faulted/degraded disks with good disks from the old NAS → the faulted/degraded disks worked well on the old NAS, but the good disks became faulted/degraded on the new NAS.
Swapped the faulted/degraded disks with good disks in the same pool by swapping cables → the good disks became faulted/degraded after resilvering.
Replaced the cables → did not solve the issue.
Ran Memtest → passed.
I've already tested that and it passed. I've also replaced the cables.
A scrub inside the TrueNAS VM gives me 1 degraded disk and 1-2 faulted disks with hundreds of read errors. But scrubbing that pool (3 times) on Proxmox gives me just 1 degraded disk and no faulted disks.
My old NAS system: HP Z210, 16GB ECC, 5 disks in a RAIDZ2 layout connected directly to SATA on the motherboard.
My new NAS system: A320 + Ryzen 2200G, 16GB non-ECC, 8 disks in a RAIDZ2 layout connected to an LSI 9211-8i (a Dell H200 in IT mode), passed through from the Proxmox host.
Proxmox on my new NAS system is installed on a ZFS RAID10 pool with all of its disks connected directly to SATA on the motherboard, and they are working very well with no errors. So I think the issue may come from the HBA card.
Do you have any ideas?
I mean, we’re still getting faults inside & out of the VM. Checking cooling & making sure the HBA is isolated from the hypervisor would be my next suggestions.
Replace thermal goop if you’re comfortable doing so, but either way, slap a fan on that sucker. Doesn’t have to be attached, but make sure you got some solid airflow on it. Lotta examples in the past resolved by checking the basics.
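On the isolation point above: one way to sanity-check it on the Proxmox host is to see which IOMMU group the HBA sits in and whether anything else shares that group. This is just a rough sketch that reads the standard Linux sysfs layout under /sys/kernel/iommu_groups. The PCI address 0000:01:00.0 is a made-up placeholder, so swap in whatever lspci reports for your card.

```python
from pathlib import Path


def iommu_groups(root="/sys/kernel/iommu_groups"):
    """Map each IOMMU group number to the PCI addresses it contains."""
    groups = {}
    base = Path(root)
    if not base.is_dir():
        # IOMMU is disabled or not exposed by this kernel.
        return groups
    for group_dir in base.iterdir():
        devices_dir = group_dir / "devices"
        if group_dir.name.isdigit() and devices_dir.is_dir():
            groups[int(group_dir.name)] = sorted(p.name for p in devices_dir.iterdir())
    return groups


def group_mates(groups, pci_address):
    """Return the other devices in the same group, or None if not found."""
    for members in groups.values():
        if pci_address in members:
            return [m for m in members if m != pci_address]
    return None


if __name__ == "__main__":
    hba = "0000:01:00.0"  # hypothetical address, replace with your HBA's
    mates = group_mates(iommu_groups(), hba)
    if mates is None:
        print(f"{hba} not found in any IOMMU group")
    elif mates:
        print(f"{hba} shares its IOMMU group with: {', '.join(mates)}")
    else:
        print(f"{hba} is alone in its IOMMU group")
```

If the HBA turns out to share a group with other devices, that would be worth looking into before blaming the card itself.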
If that fails, see if you can make a temporary boot environment (i.e. a USB stick with the TrueNAS OS on it; this should be ok as it is for short term testing); do issues persist if TrueNAS isn’t virtualized? If so, do issues persist if you connect drives directly to the motherboard without TrueNAS being virtualized? If they stop once the drives are off the HBA… Maybe you got a faulty HBA & need a replacement.
Those would be my next steps; slowly isolate possible issues until you can confirm if it is a hardware fault, or a complication with virtualization.
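To keep track of results while isolating things, it might help to save the `zpool status` output after each scrub (virtualized, bare metal, HBA, motherboard ports) and compare the per-disk error counters. Below is a minimal sketch of a parser for that. The sample text, pool name and disk names are invented for illustration, and it only matches plain integer counters, so abbreviated counts such as 1.2K would be skipped.

```python
import re

# One line of the config table: name, state, then READ / WRITE / CKSUM counters.
ROW = re.compile(
    r"^[ \t]*(\S+)[ \t]+(ONLINE|DEGRADED|FAULTED|OFFLINE|UNAVAIL|REMOVED)"
    r"[ \t]+(\d+)[ \t]+(\d+)[ \t]+(\d+)",
    re.MULTILINE,
)

# Invented sample output, not taken from the system discussed here.
SAMPLE = """\
  pool: tank
 state: DEGRADED
config:

    NAME        STATE     READ WRITE CKSUM
    tank        DEGRADED     0     0     0
      raidz2-0  DEGRADED     0     0     0
        sda     ONLINE       0     0     0
        sdb     FAULTED    312     0     0  too many errors
        sdc     DEGRADED     0     0    14  too many errors

errors: No known data errors
"""


def parse_error_counts(text):
    """Return {device: {"state", "read", "write", "cksum"}} from zpool status text."""
    counts = {}
    for name, state, read, write, cksum in ROW.findall(text):
        counts[name] = {
            "state": state,
            "read": int(read),
            "write": int(write),
            "cksum": int(cksum),
        }
    return counts


def devices_with_errors(counts):
    """Keep only the entries with at least one non-zero counter."""
    return {
        name: c
        for name, c in counts.items()
        if c["read"] or c["write"] or c["cksum"]
    }


if __name__ == "__main__":
    for name, c in devices_with_errors(parse_error_counts(SAMPLE)).items():
        print(name, c["state"], c["read"], c["write"], c["cksum"])
```

If the same physical disks keep showing errors in every setup, that points at the disks. If the errors follow the HBA ports or only show up inside the VM, that points at the card or the passthrough.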