Having issues with a pool in TrueNAS Scale running on bare metal.
System hardware:
- HDDs of pool in question: Western Digital Red Pro 16TB x4 (connected directly to SATA ports on motherboard)
- Motherboard: ASUS ProArt Z790-CREATOR WIFI
- Memory: Corsair Vengeance - 96GB (2x48), DDR5, 5600 MHz
- CPU: Intel Core i9-14900K
- Power: Thermaltake Toughpower GF3 (1350 Watt) on UPS
1. Dataset Error on TrueNAS Scale v25.04.2.1
I have a dataset that I rarely unlock because it contains old archive data. A few days ago, I unlocked this dataset. When I navigated to it in a file manager, it was empty. When I clicked on the dataset in the Datasets tab of the web UI, I encountered the error: CallError. [EFAULT] Failed retreiving GROUP quotas for [pool]/[dataset]. I also noticed my ACLs for this dataset were gone.
I’ve seen others report this error, which may have been caused by migrating from Core to Scale. I haven’t done this, as I’ve always run Scale, but I did have to restore Scale from backup in March 2025 because my boot drive failed at that time.
I then noticed this Critical error in the web UI notifications: SMB shares have path-related configuration issues that may impact service stability. I haven’t changed SMB settings in quite some time.
Possible causes I was considering:
- Restoring TrueNAS from backup, and not having unlocked this dataset since then
- There was a pending update for v25.04.2.4, which fixes some SMB issues
- There are pending ZFS upgrades for all of my pools, although this appears to only be related to Fast Deduplication
Stopping and restarting the SMB service appeared to fix this problem, as I was able to unlock the dataset in question, the ACLs were restored to their original state, and I could see and open its files.
2. I then updated TrueNAS to v25.04.2.4
After I rebooted, the above issue with the archive dataset was gone, but I now see two new issues:
- Critical: Pool <pool> state is ONLINE: One or more devices has experienced an error resulting in data corruption. Applications may be affected.
- I have a dataset that’s only used by the Syncthing app. After the first reboot from updating to v25.04.2.4, this was fine. On subsequent reboots, I now have to use the Force option to unlock this dataset, no clue why.
I rebooted a couple more times and ran a scrub on the pool in question, which reports 0 errors.
When I run sudo zpool status in CLI, it shows:
state: One or more devices has experienced an error resulting in data corruption. Applications may be affected.
errors: 1 data errors, use '-v' for a list
When I run sudo zpool status -xv and provide my admin password, it shows the same thing, except: errors: List of errors unavailable: permission denied (WHY?)
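To make it easier to compare output, here is a minimal sketch of a filter that keeps only the errors line from the status output. The pool name tank is a placeholder and the helper name zpool_errors_line is made up for illustration; it only filters text and does not change what ZFS reports.

```shell
#!/bin/sh
# Hypothetical helper: print only the "errors:" line(s) from
# `zpool status` output read on stdin.
zpool_errors_line() {
    grep -E '^[[:space:]]*errors:'
}

# Usage on a live system (pool name "tank" is a placeholder):
#   sudo zpool status -v tank | zpool_errors_line
```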
Then I ran a Long SMART test on each HDD (spinning) in the pool in question, via the web UI, which presented another issue: The web UI isn’t showing me that the tests are running. Was this always the case? I can only see that these are running via CLI. These are still running now, and I’ll report back with findings, but I have a suspicion they will show no errors.
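For anyone who wants to check test progress the same way, this is a rough sketch of pulling the self-test status out of smartctl's capabilities output. The device name /dev/sda is a placeholder and the helper name smart_selftest_status is hypothetical; it assumes the usual two-line "Self-test execution status" layout that smartctl prints for ATA drives.

```shell
#!/bin/sh
# Hypothetical helper: keep only the self-test status from `smartctl -c`
# output read on stdin (the status line plus the line that follows it,
# which normally carries the "% of test remaining" text).
smart_selftest_status() {
    grep -A1 'Self-test execution status'
}

# Usage on a live system (/dev/sda is a placeholder device name):
#   sudo smartctl -c /dev/sda | smart_selftest_status
```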
I'm not looking for an exact diagnosis at this time, and I acknowledge more info will be needed, but can anyone first help me get TrueNAS to list the 1 data error that the CLI refers to but won’t show me, even after I enter my admin password?
Is there any way to get TrueNAS to forget/re-evaluate what it believes is corrupted? I see specific data errors listed in other users’ output, but not mine, which makes me wonder whether there is actually an issue here.
I see no evidence of corruption, nor any issues with apps, VMs, replication, or Rsync backups.