I am experiencing issues with a ZFS pool (named “DISK”) running on a Proxmox VE server with TrueNAS as a VM. The pool consists of two 20 TB disks in a mirrored configuration (RAID 1). The main problem is that the pool reports data errors and shows permanent data loss for certain files. I have conducted multiple zpool scrub operations, and while they report no repaired errors, data errors persist. Attempts to import the pool with various flags (-f, -o readonly=on) often result in I/O errors and segmentation faults.
SMART tests have been run on the disks, showing no critical errors, yet TrueNAS encounters access issues with specific blocks, indicating underlying read problems. Additionally, zdb commands reveal block errors and leaked space.
I am seeking assistance with:
Understanding the root causes of these data errors.
Recommendations on potential repair options to recover data, given that I do not have a separate backup for this data.
Any insights or advice on how to proceed to minimize further data loss and potentially recover the data would be greatly appreciated.
In addition to what @Stux wrote, it always helps to use ZFS terminology. ZFS does not support RAID-1, but it does support something similar: mirroring. One difference is that a ZFS mirror only mirrors active data, so during a disk replacement ZFS can bring its mirror back faster than RAID-1, which has to copy the whole disk.
Using RAID-1 wording implies either hardware RAID (highly discouraged) or external software RAID (for example at the Proxmox level). If you are using either, the question is “Why use TrueNAS?”
@stainzor Sorry to hear that you’re having problems here. It sounds like you’re familiar with some of the deeper poking about with ZFS commands.
When you list the pool with zpool import, does it report your Proxmox host as the system that last imported it?
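One way to check this, sketched under the assumption that the disks are visible to the system you run it on: zpool import with no arguments lists the importable pools, and zdb -l prints the on-disk label, which records the hostname and hostid of the system that last imported the pool. The device path and the label_host helper below are placeholders for illustration, not anything taken from your setup.

```shell
#!/bin/sh
# Keep only the hostname/hostid lines from a ZFS label dump read on stdin.
label_host() {
    grep -E '^[[:space:]]*(hostname|hostid):'
}

# Usage (not run here; /dev/sdX1 is a placeholder for one mirror member):
#   zpool import
#   zdb -l /dev/sdX1 | label_host
```

If the hostname shown is the Proxmox host rather than the TrueNAS VM, that would support the idea that the host imported the pool at some point.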
If you have block errors and leaked space, I suspect that the pool was configured with individual drive passthrough in Proxmox, and that your host picked up and mounted the pool. You may be able to rescue the situation by attempting an import with the -fF flags (force plus rewind). I would also recommend combining this with the -o readonly=on and -R /mnt flags.
If that fails, you may need to force-rewind harder with -fFX (extreme rewind, which the zpool documentation describes as a last resort), but that may require disabling pool scrub/scan on import as well.
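The escalation described above could be laid out roughly as follows. This is a sketch, not a tested recovery procedure: it only prints the commands, so they can be reviewed and run by hand one at a time. The pool name DISK comes from the original post, the import_cmd helper is made up for illustration, and the -n dry-run step is my addition (with -F it reports whether a rewind would make the pool importable, without performing it).

```shell
#!/bin/sh
# Prints the import commands in order of increasing risk; executes nothing.
POOL="DISK"      # pool name from the original post
ALTROOT="/mnt"   # alternate root, as suggested above

# Build one read-only import command for the given flag set.
import_cmd() {
    printf 'zpool import %s -o readonly=on -R %s %s\n' "$1" "$ALTROOT" "$POOL"
}

import_cmd -fFn   # 1. dry run: would a rewind make the pool importable?
import_cmd -fF    # 2. forced import with rewind
import_cmd -fFX   # 3. extreme rewind, only if step 2 fails
```

After a successful read-only import, zpool status -v DISK lists the files affected by permanent errors, and the remaining data can be copied off before anything is attempted in read-write mode.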