So this system has been on Cobia for a while and would see occasional ACL corruption (e.g., an entry shows up as just a number in the ACL editor, with a warning icon on the field).
After updating to Dragonfish it mostly worked, but I lost my AD configuration entirely and had to set it back up exactly as before. Now, after it had been running that way for a day or so, the same ACL corruption is happening on all the shares, just as it was on Cobia.
AD is on Windows Server 2016 and 2022 (a VM); we tend to upgrade these stair-step every few years.
We have almost entirely Windows clients on Windows 10 and 11, plus a couple of servers. We use BackupChain and Azure cloud backups within TrueNAS itself; that configuration appears unaffected. The system itself is an AM4 motherboard + 5950X + 128GB of ECC UDIMMs, with a 2TB NVMe mirror and a 4-wide RAIDZ2 of SATA HDDs. The ACL corruption occurred on both vdevs and multiple pools.
That’s not ACL corruption. Seeing a raw UID or GID simply means that we could not resolve it into an AD SID. This may mean the user / group was deleted from AD. It may also mean there was a local user / group that has since been deleted.
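You can check how a specific ID resolves from the shell; for example (the UID and SID below are placeholders, substitute whatever numeric ID shows up in your ACL editor):

```
# Map a Unix UID to a SID; a failure here is what produces the numeric ACL entry
wbinfo --uid-to-sid=100513

# If a SID comes back, resolve it to a name to confirm the account still exists
wbinfo --sid-to-name=S-1-5-21-1111111111-2222222222-3333333333-1105
```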
There have been no local groups created.
And the AD SIDs were not deleted.
So yes, there is some kind of corruption going on… in the past I would go in, delete the “corrupted” entry, and remap it back to the user or group that was previously there. No changes on the AD side of things at all… it’s been running with only a handful of new groups added in the past decade and none removed.
No. It’s not corruption (the term has a specific and useful meaning that is not relevant here). As mentioned, it’s winbindd failing to resolve a Unix UID / GID to a SID. There are various reasons for this to happen; I gave two likely scenarios.
That doesn’t make any sense; the only thing that has occurred here is an upgrade from Cobia to Dragonfish, without any changes to any of the AD users or groups.
Why would winbind fail if everything is still valid? Both of the situations you described are DEFINITELY not the case here.
I’d have to see specific server configuration details (testparm -s output) and log messages to say precisely why those particular accounts didn’t resolve. If the connection to AD is flapping regularly, then we’ll get erratic results for non-cached entries.
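A few quick checks for the flapping scenario, assuming the standard winbind tooling is available on the NAS:

```
# Check the trust secure channel to the domain
wbinfo -t

# Ping the domain controller winbind is currently bound to
wbinfo --ping-dc

# Spot-check that AD users and groups enumerate at all
wbinfo -u | head
wbinfo -g | head
```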
If the actual mappings are changing, then that indicates either that the results are for an improperly configured trusted domain, or that our idmap settings have been changed since you first joined AD.
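For reference, the idmap settings appear in the testparm -s output as lines like these (the backends, the MYDOMAIN workgroup name, and the ranges below are illustrative placeholders, not necessarily what your install uses):

```
# Illustrative idmap section from `testparm -s`; actual backends and
# ranges depend on how the AD join was originally configured
idmap config * : backend = tdb
idmap config * : range = 90000001-100000000
idmap config MYDOMAIN : backend = rid
idmap config MYDOMAIN : range = 100000001-200000000
```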
I noticed something weird in my authentication logs… when I VPN from my home computer to my work computer, I start getting SMB authentication attempts from my work computer to the work NAS, using my home NAS credentials (which are not domain credentials)…
While this is weird, I am also not sure why it would break anything. So it appears that somehow my work laptop is trying to connect to shares on the work NAS with the username from my home computer’s NAS connections, which isn’t on the same domain at all.
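One thing I can check, if Windows Credential Manager on the work laptop is replaying a saved entry, is the stored credential list (run in cmd.exe; “worknas” is a placeholder for whatever target name shows up):

```
REM List credentials saved in Windows Credential Manager
cmdkey /list

REM Delete a stale saved entry for a given target, if one shows up
cmdkey /delete:worknas
```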
Here is what the audit logs look like; I removed any private info.