Trying to replicate from core to scale .... "cannot unmount" "permission denied"

I am trying to replicate a dataset containing a zvol from my TrueNAS Core system to my TrueNAS Scale system. Both are on the latest software versions.

My first idea was to pull the data from Core to Scale … I did not manage to get that working.

Then I tried to push the data from Core to Scale, which seems to be easier.

However, that is not working either (at least not in my case).

I defined the same user on Core and on Scale. The user has a public key and a private key on Core and is identified by its public key on Scale.
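(In other words a normal SSH key pair, for example one generated with "ssh-keygen -t rsa" for the user on Core, with the resulting id_rsa.pub added to that user's authorized keys on Scale; the exact command is just an illustration, any key type accepted by both systems will do.)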

On both systems the user has sufficient authorizations. On Scale, in fact, everything I could imagine short of being root himself.

Nevertheless I get the error messages shown below. I tried multiple things, but in the end the message is always "cannot unmount" … ???

The destination is "mnt/Olifant/BackUp-Panda-VMs/Graylog/GrayLog" (zvol), and even making the involved user owner of that dataset did not help :grimacing:

Louis

[2024/12/02 20:10:35] INFO [Thread-88] [zettarepl.paramiko.replication_task__task_4] Connected (version 2.0, client OpenSSH_9.2p1)
[2024/12/02 20:10:35] INFO [Thread-88] [zettarepl.paramiko.replication_task__task_4] Authentication (publickey) successful!
[2024/12/02 20:10:36] INFO [replication_task__task_4] [zettarepl.replication.pre_retention] Pre-retention destroying snapshots:
[2024/12/02 20:10:36] INFO [replication_task__task_4] [zettarepl.replication.run] For replication task 'task_4': doing push from 'SamsungSSD/GrayLog' to 'Olifant/BackUp-Panda-VMs/GrayLog' of snapshot='daily-2024-11-19_00-00' incremental_base=None receive_resume_token=None encryption=False
[2024/12/02 20:10:36] INFO [replication_task__task_4] [zettarepl.paramiko.replication_task__task_4.sftp] [chan 6] Opened sftp connection (server version 3)
[2024/12/02 20:10:36] INFO [replication_task__task_4] [zettarepl.transport.ssh_netcat] Automatically chose connect address '192.168.18.32'
[2024/12/02 20:10:36] ERROR [replication_task__task_4] [zettarepl.replication.run] For task 'task_4' unhandled replication error SshNetcatExecException(None, ExecException(1, "cannot unmount '/mnt/Olifant/BackUp-Panda-VMs/GrayLog': permission denied\n"))
Traceback (most recent call last):
  File "/usr/local/lib/python3.9/site-packages/zettarepl/replication/run.py", line 181, in run_replication_tasks
    retry_stuck_replication(
  File "/usr/local/lib/python3.9/site-packages/zettarepl/replication/stuck.py", line 18, in retry_stuck_replication
    return func()
  File "/usr/local/lib/python3.9/site-packages/zettarepl/replication/run.py", line 182, in <lambda>
    lambda: run_replication_task_part(replication_task, source_dataset, src_context, dst_context,
  File "/usr/local/lib/python3.9/site-packages/zettarepl/replication/run.py", line 279, in run_replication_task_part
    run_replication_steps(step_templates, observer)
  File "/usr/local/lib/python3.9/site-packages/zettarepl/replication/run.py", line 637, in run_replication_steps
    replicate_snapshots(step_template, incremental_base, snapshots, encryption, observer)
  File "/usr/local/lib/python3.9/site-packages/zettarepl/replication/run.py", line 720, in replicate_snapshots
    run_replication_step(step, observer)
  File "/usr/local/lib/python3.9/site-packages/zettarepl/replication/run.py", line 797, in run_replication_step
    ReplicationProcessRunner(process, monitor).run()
  File "/usr/local/lib/python3.9/site-packages/zettarepl/replication/process_runner.py", line 33, in run
    raise self.process_exception
  File "/usr/local/lib/python3.9/site-packages/zettarepl/replication/process_runner.py", line 37, in _wait_process
    self.replication_process.wait()
  File "/usr/local/lib/python3.9/site-packages/zettarepl/transport/ssh_netcat.py", line 212, in wait
    raise SshNetcatExecException(None, self.listen_exec_error)
zettarepl.transport.ssh_netcat.SshNetcatExecException: Active side: cannot unmount '/mnt/Olifant/BackUp-Panda-VMs/GrayLog': permission denied

The data you want to move is on Core, right?
So you need a PULL done on Scale, or a PUSH from Core.

Do you have the same error trying both replication types?

Yep, I did try both.
I first tried pulling from Scale, but that gave me significant connection problems. For that reason I switched to pushing from Core to Scale, with the effect described above.

Core is on TrueNAS-13.3-U1
Scale on ElectricEel-24.10.0.2

Note that the destination pool is a RAIDZ1 pool, updated to the latest ZFS version. Worth remarking: I performed that ZFS update during the tests, and there was a clear difference in behaviour before and after the update. After the update a bit more happened before the stop, but in the end the result was the same: "cannot unmount".

Yep, I understand, I'm facing a similar issue.
An SSH connection problem (related to TLS) if I try to connect from Core to pull data from Scale, although I did manage to push from Scale to Core after a lot of attempts.

If you want to give it a try: enable the root account on the Scale system and try to establish the connection from Core with that account, and see if anything changes.

Perhaps using the root account would work, but to do that I would have to violate a lot of security principles.

As an example, at this moment root on Scale and root on Core are different, i.e. they have different public and private keys. I absolutely do not want to give them the same keys.

And in general it is already absurd how much power I have given the "executing user" on the Scale system. I not only made the user owner of the destination dataset but also a member of builtin-admin, operators and wheel.
Far from OK of course, but I did that trying to solve the problem and for testing.

You're totally right.
I have just opened a thread about this problem; I have been struggling with these things for weeks.
It is possible that I made some mistakes with the ACLs, or inherited something broken from Core… At least I'm not alone anymore :laughing: Honestly I was concerned because it seemed nobody else was facing a similar issue, and something as important as replication, which used to be a matter of just starting a job, has been a pain from the moment I sidegraded.

It is a drama … I spent many hours trying to move data from Core to Scale. I tried four different methods:

  • rsync, pull from Scale and push from Core
  • replication, pull from Scale and push from Core

Nothing, really nothing, worked. That was:

  • using accounts which have, IMHO, more than appropriate permissions
  • using SSH keys, public and private (in the user's home dir)
  • using the (IMHO strange) separate key pairs

In the end I managed to get a working solution using an SSH key pair and SSH connection, with the keys of Core's root configured as the Scale key pair.

OK, it does work, but it goes against every security idea I have!!
Absolutely not the way I would like to do it!!

Can you check whether the home directories of those users have their ACLs set properly?
In my case the error was there… :pleading_face:
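For example (the path here is just a placeholder for wherever your home directories actually live), something like

ls -ld /mnt/yourpool/home/youruser
ls -ld /mnt/yourpool/home/youruser/.ssh

shows whether the replication user can actually traverse and read those directories; the ACL of the dataset itself can be inspected from the permissions editor in the GUI.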

I did fix a couple of unexpected issues related to file permissions, but even after those fixes … it did not work. I will mention them below nevertheless.

Some issues to mention:

  • oddly, TrueNAS does not use "/home" as the base for user home directories
  • also not OK, IMHO, is that the user home directories are not on the boot disk (perhaps I can change that; I am considering it)
  • the user home directory, wherever it is, needs a "sudo chmod 751" to be accessible without being insecure
  • the ssh directory (".ssh" in the user's home) should not be accessible to others. If access to the private key in that directory (id_rsa) is not restricted to "600", ssh will refuse to use the key. Note that the public key is also in that directory, under the name "authorized_keys", and the remote host IDs for SSH are in "known_hosts". See the example commands right after this list.
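To illustrate (the user name and path here are only examples, not my real ones), those fixes boil down to something like:

sudo chmod 751 /mnt/yourpool/home/youruser              # home dir reachable, but not wide open
sudo chmod 700 /mnt/yourpool/home/youruser/.ssh         # .ssh not accessible to others
sudo chmod 600 /mnt/yourpool/home/youruser/.ssh/id_rsa  # otherwise ssh refuses to use the key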

Whatever, I did fix those kinds of issues.

Note that IMHO the idea of "Backup Credentials", SSH connections and key pairs, is a bad one. No reason to have them, IMHO. It is just a user with the appropriate authorization to do things!

Edit / update:

  1. I noticed that /home is present on TrueNAS Scale and that the home directory of admin is there. I plan to migrate the home directories of my users there.
  2. I tried to migrate a user to /home … not allowed, the path must start with /mnt/ :frowning:


Hi, sorry for the delay, but I didn't notice your post edit.
I remember you were struggling like me, and I got a response on my ticket, so I thought it could be useful for you too: due to a bug, Core can't connect to another system via SSH with a user that is not root, at least in semi-automatic mode.

I can confirm that by watching the audit log on the Scale system.
Plus, obviously, we are talking about Core.

We will never see this fixed :melting_face: but at least for me, not even the manual connection works.
So for the moment I will stick to PUSH from Scale to Core, the only way I can connect the two systems.

For your problem with the home directory, I think it is just something "blinded"; you should probably move your data without a full dataset replica.

Oops, didn't search properly first. I blame my absolute exasperation with my new expensive system.

This has to be a joke? No?

Move the task from PUSH from Core to PULL on the Scale system. Or enable the root login on Scale (if it is a one-time operation).

Hi, thx for the response :slight_smile:

Yea, I tried going at it via pull from Scale

[EFAULT] Passive side: cannot unmount '/mnt/.ix-apps/docker': pool or dataset is busy Active side: cannot send 'NAS-main/mandie-home': I/O error.

The only real option I have is to do them one folder at a time via ssh

jack@TruenasCore ~ $ sudo zfs snapshot NAS-main/manjaro-home@populate
jack@TruenasCore ~ $ sudo zfs send NAS-main/manjaro-home@populate | ssh -i ~/.ssh/rep_key truenas_admin@192.168.1.117 sudo zfs receive -F basepool/manjaro-home
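(Once the initial full send is done, later updates could in principle be sent incrementally the same way, e.g. something like

jack@TruenasCore ~ $ sudo zfs snapshot NAS-main/manjaro-home@populate2
jack@TruenasCore ~ $ sudo zfs send -i @populate NAS-main/manjaro-home@populate2 | ssh -i ~/.ssh/rep_key truenas_admin@192.168.1.117 sudo zfs receive -F basepool/manjaro-home

with the snapshot names here only as an illustration. But that is exactly the kind of thing the replication task is supposed to automate.)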

But I'm supposed to be using this NAS?? If I have to use it via ssh all the time, then there is no point in having it.
And I can't allow login as root!

Does this happen even if you use the --no-perms flag on rsync?

Nevermind.

You'll have to hang on a while, I've started sending several TBs across. It's a little tied up and I dare not touch it just now.

I shall try asap

EDIT:

Oh, rsync, sorry, I have not tried it. Not sure what you mean? rsync in the shell, or is there a setting in the GUI to choose a method, or what exactly?

You’re right, I read this on my phone initially and managed to hallucinate that this was an rsync based copy that was failing. I see now that it was normal replication from the very start.

I would try it without the +NETCAT though, just to see if it still fails the same way.

Honestly, I can understand all your points.
I'm not an expert, but in my limited experience, after moving to Scale I struggled a lot trying to back up to Core, and I only achieved proper replication after a lot of attempts.
I have read your initial detailed post again. I don't understand whether this is a one-time moving operation, but if it is, despite you clearly saying that you want to avoid nesting datasets, I would still suggest that you:

  • pull from Scale (Core can't connect to Scale without root login anyway)
  • create a dataset in which to nest the replica
  • remove the "replication from scratch" flag
  • instead, flag full filesystem replication and recursive
  • after the data is transferred, move/replicate the datasets to the pool root (see the example after this list), then clear the original dataset
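(For that last step, moving a dataset out of the nesting dataset could be done with a rename, e.g. something along the lines of

sudo zfs rename basepool/replica-staging/manjaro-home basepool/manjaro-home

where the dataset names are only placeholders; a local replication task to the pool root would work as well.)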

Of course this can't be a periodic solution… In any case, try opening a support ticket, but if it is something broken on Core, don't wait for a resolution; you should achieve this some other way, like you are doing.

Well, it's a one-time operation in the sense that I've got to send all the NAS data over to the empty new Scale box somehow to start with, but after that I will want to do regular backup replications between the two. That is what I bought it for, as a secondary remote(ish) backup.

Can't pull from Scale (see my comment above: Trying to replicate from core to scale .... "cannot unmount" "permission denied" - #13 by jackdinn)

I don't know what I shall do. I suspect I will either figure out a way to get all the data onto Scale and then install Scale on the first server as well, and maybe that will work better (I have no clue whether Scale to Scale is any better). Or, since I CAN send from Scale to Core, I could use the Core box as the remote replication backup instead of the new Scale one.

Meh, head hurts.

I see your comment above, but do you get the same error trying the other things I suggested?

Yea, I've tried creating another dataset to send into, but it just said much the same thing, only slightly different: i.e. if I created a new dataset called extradataset and tried to send into that, it would say it can't unmount basepool/extradataset.

I tried "replication from scratch" on and off, and full filesystem replication on and off.