New Pool ACL Set as NFSv4: libvirt-qemu Has No Access

I created a new pool, 2 drives in a mirror, and it seems to have NFSv4 set as the ACL type. I'm not sure if I missed this when creating the pool, but it is not allowing the built-in user libvirt-qemu to access the dataset I created for a virtual machine.

I have looked for solutions but cannot find anything relevant to my exact situation. Would I be correct in assuming I need to strip the ACL from the pool in the CLI and set it as POSIX to match my first (original) pool?

You can’t edit the permissions of the root dataset; it will always be owned by root. You can only edit permissions of child datasets.
Create a child dataset on that new pool and add the built-in user so it has permissions.

That is the issue: the dataset already has the same permissions as the pool, but I continue to get the same error when creating a VM. If I try to add libvirt-qemu as a user to the ACL of the VM dataset, I get the following: [EPERM] Filesystem permissions on path /mnt/Fast Storage prevent access for user “libvirt-qemu” to the path /mnt/Fast Storage/VirtualM. This may be fixed by granting the aforementioned user execute permissions on the path: /mnt/Fast Storage.
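(For reference, a quick way to confirm the traverse problem the error describes is to test execute access as the service account itself. This check is not from the thread; it is a minimal sketch using the paths quoted in the error message:)

# Prints "denied" if libvirt-qemu cannot traverse the parent path
sudo -u libvirt-qemu test -x '/mnt/Fast Storage' && echo allowed || echo denied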
*Dataset ACL (screenshot)

*Pool ACL (screenshot)

That should not have been permitted by our validation. What is the full path of the dataset above?


/mnt/Fast Storage/VirtualM/ISO_Storage

What is the output of nfs4xdr_getfacl /mnt/Fast Storage/?

nfs4xdr_getfacl: Failed to get NFSv4 ACL

Not sure if the following information is relevant or not: both of the new drives for the pool are NVMe on a PCIe AOC.

Okay. What is the output of stat /mnt/Fast Storage?

File: /mnt/Fast Storage
Size: 4 Blocks: 1 IO Block: 1024 directory
Device: 0,134 Inode: 34 Links: 4
Access: (0770/drwxrwx---) Uid: ( 0/ root) Gid: ( 0/ root)
Access: 2024-01-19 19:19:30.351820429 -0500
Modify: 2024-10-14 03:24:23.928351206 -0400
Change: 2024-10-14 03:24:23.928351206 -0400
Birth: 2024-01-19 19:19:30.351820429 -0500

*I forgot I needed to quote 'Fast Storage' when I ran the last command you asked for. So I ran it again as nfs4xdr_getfacl /mnt/'Fast Storage'/ and got the following.

File: /mnt/Fast Storage/
owner: 0
group: 0
mode: 0o40770
trivial_acl: false
ACL flags: none
owner@:rwxpDdaARWcCos:fd-----:allow
group@:rwxpDdaARWc--s:fd-----:allow
group:builtin_users:rwxpDdaARWc--s:fd-----:allow
group:builtin_administrators:rwxpDdaARWcCos:fd-----:allow
user:apps:rwxpDdaARWc--s:fd-----:allow

Hmm… we have explicit validation to prevent users from setting ACLs on the root-level dataset. Did you set this via the shell, or do any dataset manipulation or bind mounts from the shell? Have you set the root-level dataset as storage for any apps?

No, I have not done anything via the shell. I set the pool up and did replication from my other pool to this new one for all my apps. All apps are running fine. Planka gave me some issues, stating that it could not access the DB and that user ‘planka’ could not authenticate (not an exact quote of the error). I tried moving it back to the other pool to see if that would resolve it, but I got the same error in the logs. I was not very invested in that app, so I deleted it.

All apps have their own host path datasets for config and “media”, except a few that I left on ix-applications.

Do I need to rebuild the pool from scratch? I would like to avoid that if possible but if that is the only option, so be it. Thank you for your assistance so far.

No. You don’t have to rebuild from scratch, but I want to understand how:

  1. the pool got created with the NFSv4 acltype on the root-level dataset (assuming you didn’t do this yourself)
  2. how the ACL got set there

Both of these shouldn’t be happening. Please PM me a debug (System->Advanced->Save Debug).

As far as removing the ACL and putting things back to appropriate defaults:

nfs4xdr_setfacl -b '/mnt/Fast Storage/'
chmod 755 '/mnt/Fast Storage/'
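
(A sanity check afterwards, not part of the original instructions, reusing the same path and the commands already shown in this thread:)

# Should now report 0755 and root:root ownership
stat '/mnt/Fast Storage/'
# Should now report a trivial ACL if the NFSv4 acltype is still in place
nfs4xdr_getfacl '/mnt/Fast Storage/'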

Note: this is a great example of why you shouldn’t put spaces in pool / dataset names 🙂


I would not intentionally make any changes in the shell; I try to stay out of there at all costs lol. I will get that debug to you shortly.

To your note: I know…and I regret it lol. I usually don’t have naming schemes like that. Is there any way I could change it to FastStorage or Fast_Storage without bricking the pool?
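(For what it's worth, and not something covered in this thread: ZFS pools can generally be renamed by exporting them and importing them under a new name. On TrueNAS this would need care, since anything referencing /mnt/Fast Storage, such as app host paths, shares, or VMs, would have to be updated afterwards, and the export/import is normally driven from the UI. A rough sketch of the underlying ZFS steps only, assuming the pool is idle:)

zpool export 'Fast Storage'
zpool import 'Fast Storage' FastStorage   # import under the new name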

The issue is resolved, but the exact reason for it is still being narrowed down. I just want to give a huge thank you to @awalkerix for your assistance and for taking the time to work with me.


The reason for this issue is, and I am quoting, “It looks like you performed a ZFS send/recv over the top of the dataset and of course wrote the ACL from the replication source on the target.” Not sure how I achieved this unintentionally, but it was done during replication. Thank you again for all the help.
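(A quick way to see whether replication carried the source's ACL configuration onto the new root dataset is to inspect the relevant dataset properties. A sketch using the pool name from this thread; the exact values will differ per system:)

sudo zfs get acltype,aclmode,xattr 'Fast Storage'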

Update: I am unable to delete a few snapshots: “Snapshots must not have dependent clones”. I have tried to research this and am now at a loss on how to fix it. There is no option to “promote” any of the datasets related to the clones, and zfs commands do not work in the shell, i.e. “zfs promote”.

Again, any help will be greatly appreciated.

EDIT: I was able to find the path in the shell. This makes me think the only way to resolve this is via the shell. I am not sure what to promote at this point if I did use zfs promote. (I did not use sudo when I previously stated “zfs commands do not work in the shell”)

/mnt/.ix-apps/app_mounts/jellyfin]$ ls
cache  cache-ix-applications-backup-system-update--2024-11-07_14:38:57-clone  config  transcodes  transcodes-ix-applications-backup-system-update--2024-11-07_14:38:57-clone

UPDATE: I was able to follow this, How to delete snapshots with dependent clones? | TrueNAS Community, which I referenced earlier, but, being an idiot, I had not been using sudo for the zfs commands. All clones have been promoted and the respective apps seem to be working fine. I didn’t break anything!
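(For anyone landing here from search, the procedure in that linked thread boils down to promoting each dependent clone and then removing the snapshot once nothing references it. A minimal sketch with placeholder names; substitute the real snapshot and clone paths from zfs list:)

# Which clones still depend on this snapshot?
sudo zfs get -H -o value clones pool/dataset@snapshot
# Make the clone independent; the origin snapshot migrates under the promoted dataset
sudo zfs promote pool/path/to/clone
# Confirm the snapshot's new location before destroying it
sudo zfs list -t snapshot -r pool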

I am guessing that because these were tied to ix-apps/ix-applications they were hidden and I was not able to find them. I used -a when listing the directories, but that did not show the hidden paths?

Lesson learned again I guess. Data structure is very important.

EDIT: I am still unable to delete the snapshots. I keep promoting back and forth, i.e. once I promote a dataset/clone, the snapshot falls back to saying the previous dataset is a clone. I can also see more than one clone for some of the snapshots. I am totally lost now.
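(A sketch that may help untangle a snapshot with several clones; names are placeholders. Promoting only transfers the snapshot to one clone, so it helps to first list every snapshot with its dependent clones and check each suspect dataset's origin:)

# Every snapshot on the pool and the clones that depend on it (an empty clones column means deletable)
sudo zfs list -t snapshot -o name,clones -r pool
# Where a given dataset was cloned from
sudo zfs get origin pool/suspected_clone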

Where are you up to?

Can you try to post some screenshots or listings to explain the situation clearly?

Yes, please see the following. This snapshot is tied to two clones, so firstly I am unable to determine which clone should be promoted, and when I promote one, the other is obviously still tied to the snapshot.

Snapshot: (screenshot)
Clones: (screenshot)

These are the snapshots I am unable to delete due to clones. A few others only had one clone, so I was able to promote it and delete the old snapshot.

Please let me know if you need any more information.