Hi All,
Recently my TrueNAS 24.04 system started to show checksum errors on a pool.
$ sudo zpool status pool0 -v
  pool: pool0
 state: ONLINE
status: One or more devices has experienced an error resulting in data
        corruption. Applications may be affected.
action: Restore the file in question if possible. Otherwise restore the
        entire pool from backup.
   see: https://openzfs.github.io/openzfs-docs/msg/ZFS-8000-8A
  scan: scrub repaired 0B in 05:03:32 with 0 errors on Mon Dec 22 23:32:01 2025
config:

        NAME                                      STATE     READ WRITE CKSUM
        pool0                                     ONLINE       0     0     0
          raidz1-0                                ONLINE       0     0     0
            14fd5317-e933-4fbb-90e1-723242e4d98b  ONLINE       0     0 4.20K
            711697c4-77ef-4e4a-b97a-800e77418c93  ONLINE       0     0 4.20K
            12f20257-f9bd-400d-a4c7-5e7e5cb63a41  ONLINE       0     0 4.20K

errors: List of errors unavailable: no such pool or dataset
However, no file errors are listed. Instead, I get a “no such pool or dataset” error.
Symptoms and history:
- No visible issues at all. Everything seems to be working perfectly fine. I just have this ever-increasing number of checksum errors.
- The first error appeared during a resilvering after I replaced an old drive with a new one. During that resilvering I also copied a new file to one of the datasets, and this file was then marked as corrupted.
- After the resilvering finished, I tried deleting and restoring the file. This did not clear the error, so I ended up removing the entire dataset and restoring the files from backup. Since then, I have been getting the error shown above.
My fix attempts so far:
- I tried recreating the entire dataset that held the single file that was marked as corrupted.
- I replaced every SATA cable with a brand new one. (Since I always see the exact same number of errors on all of the drives, I did not expect much from this attempt.)
- Ran memtest for 48+ hours. 0 errors.
- Cleared the error count with zpool clear and ran multiple scrubs. (I read on the ZFS GitHub project support that the first scrub does not always clear checksum errors.)
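For reference, the clear-and-scrub cycle from the last item looked roughly like the sketch below. The `rescrub` helper and the `POOL` / `ZPOOL` variables are names made up for illustration, and the `-w` flag (wait for the scrub to finish) assumes a reasonably recent OpenZFS; without it, the scrub runs in the background and progress has to be checked with zpool status.

```shell
#!/bin/sh
# Sketch only: POOL defaults to the pool name shown in the status output above.
# ZPOOL is a hypothetical override so the sequence can be dry-run (e.g. ZPOOL=echo).
POOL="${POOL:-pool0}"
ZPOOL="${ZPOOL:-zpool}"

rescrub() {
    "$ZPOOL" clear "$POOL" &&      # reset the READ/WRITE/CKSUM counters
    "$ZPOOL" scrub -w "$POOL" &&   # start a scrub and block until it completes
    "$ZPOOL" status -v "$POOL"     # print the counters and any affected files
}
```

Calling `rescrub` as root runs one full cycle. Running it twice in a row covers the case where the first scrub does not clear the error list.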
I’m afraid I have a metadata problem and no other choice than destroying and recreating the pool. Since this would pretty much mean a complete NAS rebuild and I have a large number of apps, I expect the rebuild to take several days, if not weeks, of work. I have the following questions:
- Is there anything else that I could try to fix my pool before wiping it?
- If I have to wipe, what is the best way to do this so that it is as simple and painless as possible? Can you point me to documentation for a similar scenario?
- I am thinking about temporarily migrating my apps pool to the “backup” pool until I recreate and restore the main data pool. Should that work? Does TrueNAS copy the /mnt/.ix-apps/ folder and all app configurations when I set a different pool under Apps → Configuration → Choose pool?
- Which configuration settings will I lose when I destroy the pool? For example, will the export settings remain but with an error, or will they be automatically wiped as well? Same for the data protection settings: will all of that configuration be wiped automatically, or will it be preserved with errors?
- If nothing is preserved, what TrueNAS configuration can I back up before clicking wipe to make recovery and reconfiguration faster? For example, what does the backup file that is dumped before regular upgrades contain? Can I use it for a quicker reconfiguration?
- Is there a comprehensive restore guide that I could consult to learn as much as possible and plan my next move before I destroy anything?
Thank you for any kind of help.