Help Please. Off-site Replication - Security

Hello all,

First time posting, please let me know if I’ve made an error or breached forum rules.

I have a TrueNAS Core 13.x system and am well on my way to deploying a second at a family member’s house. The intent is to back up remotely from my local NAS (A) to the remote NAS (B); however, I wish to do this as securely as possible.

Please note, NAS B will connect as an OpenVPN client to a site-to-site VPN hosted at NAS A’s location.

After reading quite extensively the last couple of days, I gather:

  • The best way to do this is with a PULL configuration, where B pulls from A, reducing the risk to NAS B in the event A is compromised somehow.
  • Replication tasks and Rsync are two possible ways of achieving this.
  • Replication is faster, supports existing snapshots and appears generally more recommended.
  • Replication, however, requires the use of an SSH key with root access privileges and (possibly) activating the SSH service on NAS A and/or B.
  • Rsync can possibly run as a standard user, with either SSH service enabled or just the Rsync service enabled.

(Please correct me if my understanding thus far is incorrect).

My primary consideration here is securing NAS A and the data on it as much as possible (e.g. from ransomware etc). To that extent I:

  • Have no additional services running on NAS A (plugins, jails etc all off).
  • Expose all SMB shares as Read Only, and use ACLs to enforce Read-Only privilege as well.
  • Have 2FA enabled for root account.
  • All unneeded services disabled (including SSH). The only services enabled are S.M.A.R.T, SMB, and UPS.

So now, for the first time, I arrive at the prospect of creating a replication task, which, as I understand it, requires not only enabling the SSH service on NAS A (something I have thus far avoided), but also generating an SSH key with root privilege and storing it on NAS B. Having had things so locked down until now, this feels like opening a rather large hole. Especially as, as far as I can tell, the only permissions needed on NAS A would be read permission for the dataset being backed up.

I understand that it might be possible to create a special root key on A with only the minimum permissions needed to perform the replication. I’m not sure how this is done though, having done little work with SSH keys in the past (aside from GitHub access).

The other option I saw mentioned (though now I can’t find the post) is that I could use rsync to back up (again, enabling the SSH service) but this time running as a non-root user. Is this any more secure really than using replication?

Basically, I’m looking for advice on the best way to perform this back-up in a way that will let my neurotic self sleep at night. I’m willing to go to the nth degree to lock things down. No suggestions are unwelcome. For instance, if the consensus is that one way or another the SSH service needs to be switched on, are there ways I can be paranoid about that? Suggestions I’ve seen so far: turn off password auth in the SSH service; or, in the advanced settings, restrict SSH to a certain network, have the Site A NAS connect to the site-to-site VPN as well (even though it’s hosted locally), and then restrict SSH connections to only those coming from the VPN network, not the local one.

I’ve even considered spinning up a Linux VM in Proxmox at Site A, mounting the data on NAS A using an account that only has read permission to the data, and then performing the back-up between NAS B and the VM (losing things like snapshots, but I would just set up a snapshot task on NAS B anyway). Thoughts on that as a work-around?

Please help the paranoid to survive. Any advice is welcome and hopefully with some answers I can write up a “Paranoia guide to TrueNAS backups” at the end to help the next person implement the most locked down site-to-site backup possible.

It’s a good question on ZFS replication without root access. The docs seem to indicate it’s required… even for SCALE. My guess is that this is because of the need to create a dataset to replicate to. If it is needed in all cases, it’s worth a feature request.

Syncthing and rsync would not need root access.

To protect against ransomware… snapshots are valuable.

Have you tried following the steps in this video? https://m.youtube.com/watch?v=htnUVRr6Jmg&t=42s&pp=ygUTVHJ1ZW5hcyByZXBsaWNhdGlvbg%3D%3D

Thanks for the response :slightly_smiling_face:

I could understand it possibly needing root access on the destination it’s being replicated to (NAS B) for the reasons you state. However, I’m not sure what purpose it serves on NAS A, where the data is being pulled from, especially if a snapshot task already exists.

Regardless, if it’s confirmed that root access is absolutely required on NAS A in order for its data to be pulled to NAS B, are there any best practices for securing the access token used? I’ve heard it’s possible to lock such a token down to only running certain commands, but I’m not sure exactly which commands need to be allowed and which can be blocked.
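For what it’s worth, OpenSSH’s authorized_keys format accepts options in front of the key, which is one way of locking a key down. Below is a minimal sketch, assuming a hypothetical VPN address of 10.8.0.2 for NAS B and a placeholder public key. The helper name is made up for illustration, and whether TrueNAS replication (or the SSH Public Key field in the user settings) tolerates these options is untested here.

```shell
#!/bin/sh
# Build an authorized_keys entry that limits where and how a key may be used.
# Hypothetical helper: the address and key passed below are placeholders.
restricted_key_line() {
    from_pattern=$1   # address or pattern the client must connect from
    public_key=$2     # full public key line as generated on the pulling NAS
    # "restrict" disables port, agent and X11 forwarding and PTY allocation;
    # from="..." rejects the key when it is presented from any other address.
    printf 'restrict,from="%s" %s\n' "$from_pattern" "$public_key"
}

restricted_key_line "10.8.0.2" "ssh-ed25519 AAAAC3Nza...example repusr@TestB"
```

There is also a command="..." option that pins a key to a single command, but I assume replication issues more than one zfs command over the connection, so that would need careful testing before relying on it.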

Snapshots are underway already :slight_smile:

Thanks again!

Thanks for sending that through! :slight_smile:

I admit I hadn’t watched it. I came across it but as it was for Scale and I’m on Core I just dismissed it.

Having just watched it though it looks pretty solid. For reasons I’d like to stay on Core for the moment but possibly the same can be achieved on Core. Tomorrow I’ll spin up a pair of Core VMs to test it out and see how I get on before trying it in my live environment.

A couple of questions/concerns, if you don’t mind? It appears it operates by running the replication task as a dedicated user on NAS A which is essentially what I want. However:

  • I notice that it still requires the admin credentials for the Web UI during setup. Is this just for the initial generation of the user’s key, with everything running as the dedicated user from then on?
  • I presume this still requires the activation of the SSH service on NAS A (I’m gathering there’s no way around this, as the data has to be communicated somehow). If SSH must be turned on, what are some ways of being paranoid there? Disable username and password auth, I presume? Are there steps I can take to lock down the SSH key generated by this setup wizard, or does the wizard do it for me? Etc.

I’m happy enough to play around with the process myself on some VMs tomorrow and report back with what I discover, but it’s always useful to know what you don’t know from those who do :slight_smile:

Thanks again!

Yes I believe so

Yes the ssh service needs to be running. As you say disabling password login and root login will increase your protection massively.

You could perhaps explore using something like Tailscale.

Thanks for the quick response.

Sounds manageable, a little googling suggests that the option to run as a dedicated user might not be available in Core. I’ll give it a test tomorrow to be sure though and report back for others. I’d rather avoid a move to Scale right at the moment (later down the line for sure).

The other thing I was considering and might try: as I already have a Push SYNC task configured on NAS A, it might be easier to set up a Pull SYNC task on NAS B from the same cloud provider and essentially use the cloud back-up as a middle man between them. No key exchange needed. Not quite site-to-site, but possibly feasible… Then NAS B just performs its own snapshots.

RE Tailscale, I have a VPN set up between the two sites. Networking isn’t the issue so much as access control and accounts on the NASes themselves :/

Will report back on all fronts. Thanks again for jumping in!

Depends on what your desired outcome is. Replication is brilliant and I use it within an Enterprise environment to back up data from one DC to another. However, for small to medium-sized businesses, backing up to a cloud provider can often be a better bet. Have you explored something like Storj? It’s much cheaper than your average cloud provider and may well give you everything you need without the extra hassle. Plus it integrates perfectly with TrueNAS. I have used it in my home environment for quite a while now.

I’ll check it out. I need to move on from my current cloud provider anyway, as I’m looking for something cheaper. The other one I was considering was Backblaze B2, which seems quite well regarded. Free egress of 3x stored capacity is tempting too if I end up using it to middleman NAS A and B.

Thanks again! You’ve been super quick with your responses and very helpful :slight_smile: appreciate it!

Okay, spent the morning testing and reporting back. I was successful, with very minimal work-arounds. I’m not entirely sure if I would advise this, as it did require going slightly off the beaten path and when it comes to back-ups… well, I don’t want to take the risk that my work-around wasn’t ENTIRELY successful and only find out when it’s too late. If you’re interested in hearing how I did it (long process of trial and error, all of it documented) read below. Otherwise, feel free to ignore.

Having spun up a pair of TrueNAS VMs in Proxmox and spent the morning testing out, it doesn’t seem like this is easily possible, and while I did eventually get it working, I’m not sure I’ve gained much in terms of security. I’ll document what I tried, as maybe it will help others.

Context

The two NAS VMs are TestA and TestB. Each has a 32GB virtual OS drive and a 100GB virtual drive for “Storage”.

TestA has PoolA. Which in turn contains two datasets: ToReplicate and ToNotReplicate.

TestA is given a user named repusr. This user is provided read permissions on ToReplicate. This user is otherwise fairly locked down. Under Accounts > Users they are not a member of any group other than their own. They have Disable Password turned on. They have no Samba Authentication etc etc.

TestB has PoolB. This pool has a single dataset called RepDest.

TestB is given no additional users.

Snapshotting

Back on TestA, I created a snapshot task under Tasks > Periodic Snapshots. I set it to:

  • Operate on PoolA/ToReplicate.

  • Run every minute (so I have lots to test with) and retain for a week.

  • I added a -2w suffix to it, as is my policy. Thus its naming convention is: auto-%Y-%m-%d_%H-%M-2w.

The first snapshot was created successfully, a minute later.
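The naming schema is a strftime-style pattern, so date(1) can show what a given snapshot will be called. A small sketch of that, using the schema from this task; the function name is made up for illustration:

```shell
#!/bin/sh
# Expand the snapshot naming schema for the current time using date(1).
# The literal "-2w" suffix contains no % sequences, so it passes through as-is.
snapshot_name() {
    date "+auto-%Y-%m-%d_%H-%M-2w"
}

# Print the full snapshot name as it would appear for the test dataset.
echo "PoolA/ToReplicate@$(snapshot_name)"
```

This is the same schema that has to be entered on the replication side later, so it helps to know exactly what it expands to.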

Turns out SSH really is non-negotiable

Initially I started with the SSH service disabled on TestA (just to give it a shot). It quickly became clear that there was no way to get any of this working without it. So I enabled it, and within its configuration, also checked the box for Allow Password Authentication (starting easy on myself here).

Testing semi-automatic replication SSH set up with dedicated user (failure)

With SSH now enabled on TestA I went to TestB and created a replication task. The first attempt involved Semi-automatic setup, and was unsuccessful, settings listed here for completeness though:

  • Source: On a Different System

  • SSH Connections: Create New

In dialog that appears:

  • Name: TestA-TestB.

  • Setup-Method: Semi-automatic

  • TrueNAS URL: http://[IP-of-TestA]

  • Username: repusr (the user I made earlier)

  • Password: Replicate123 (Password for that user)

  • Private Key: Generate New

  • Cipher: Standard (Secure)

As mentioned, the connection failed with an error indicating an incorrect password. I reset the repusr password to be sure, and it still failed. I tried adding repusr to the builtin_administrators group, but to no avail. I was able to SSH into TestA with PuTTY using the username/password, but TrueNAS refused to create the replication task’s SSH connection with it.

Testing manual replication SSH set up with dedicated user (partial success)

As the issue appeared to be that TestB was unable to perform the initial username/password auth with TestA using the repusr credentials, the next step was to form the SSH connection manually and then try to perform a replication with that.

At this point I went back to the SSH service settings and disabled username/password auth (we don’t need it any more, I suppose).

Manual SSH Key and Connection

My first step was in TestB where I visited System > SSH Connections and System > SSH Keypairs where I deleted anything that might have been auto-generated during my initial attempt. With the slate wiped clean and still on TestB inside the SSH Keypairs panel I clicked Add, gave it the name TestA-RepUsr, clicked Generate Keypair, copied the PUBLIC key, and clicked Submit.

Next, still on TestB, I went to System > SSH Connections and clicked Add. Configuration as follows:

  • Name: TestA-RepUsr-Conn

  • Setup Method: Manual

  • Host: [IP-of-TestA] (this time without the http prefix)

  • Port: 22

  • Username: repusr

  • Password: Replicate123

  • Private Key: Selected the one made previously from the drop down

  • Cipher: Standard

  • Connect Timeout: 10

I then clicked the Discover Remote Host Key button to allow it to populate the remaining field and was successful.

At this point an SSH connection will still fail. To finalize the configuration, I returned to TestA and, under the user settings for repusr, pasted the public key into the SSH Public Key field. I note that it has root@truenas.localhost at the end, but that doesn’t seem to imply it has root access; I theorize this is the result of it being generated on TestB while signed in to the GUI as the root user.

Replication with Manual Key/Connection

With that done, we can now see if our manual connection is sufficient for replication. Back on TestB we go again to the Replication screen and create a new one.

In here we again select On a Different System as the source, but now when we click SSH Connections we can select the one we made earlier from the drop down. With that done, I was excited to see that the connection appears to work! We can open the Source tree and see PoolA from TestA and the two datasets which live below it. So far, so good. After selecting ToReplicate as the source dataset, updating the snapshot naming schema to include my custom -2w suffix and (optionally) enabling Recursive, the snapshot taken earlier even shows up (meaning we go from 0 snapshots found to 1 snapshot found).

Select the local dataset RepDest under PoolB as the destination and save everything.

Sadness

Sadly, attempting to actually execute this replication task results in an error which ends as follows:


warning: cannot send 'PoolA/ToReplicate@auto-2025-01-19_13-50-2w': permission denied

cannot receive: failed to read from stream.

It appears that while our non-root connection is enough to browse the structure of the source NAS, it isn’t enough to actually replicate the data over. Updating the ACL for the source dataset to Full Control has no effect. Googling the error returns results of others attempting a similar thing, similarly failing, and resigning themselves to setting up replication as the root user.

The “Success” half of “Partial Success”

During my googling, I did come across a blog post (which I can’t link here due to rules) but googling for FreeNAS Replication with a Dedicated User + "mikebellerue" (including quotes) will bring it up. Mike seems to have had a similar issue (on an older version of FreeNAS) and was able to resolve it by granting additional ZFS permissions to the dedicated user, and adding a Tunable to the system which appears to allow the user to mount volumes or something. I’ve come across this approach before and dismissed it as being too invasive for my tastes, but decided to attempt it all the same.

The command I executed in the shell was:


zfs allow repusr create,destroy,diff,mount,readonly,receive,release,send,userprop PoolA/ToReplicate

(Edit: only a subset of these are needed, this can be reduced to:)


zfs allow repusr send PoolA/ToReplicate

and then under System > Tunables added a variable with name vfs.usermount, value of 1 and type of sysctl. No reboot was needed, and after that a re-run of the task was successful, with the PoolA snapshots being replicated over to TestB.

Have I Actually Gained Anything?

While replication does now function, and it does so in a PULL config using a non-root user, this was achieved by:

  • Enabling SSH service (which I would have liked to avoid but also accept the data has to be transmitted SOMEHOW, and at least root SSH login and username/password login can be left turned off, so really I’m happy with that).

  • More importantly, issuing a litany of ZFS allowed privileges to my non-root user. Create, Destroy etc. All permissions that, in the interests of keeping TestA as locked down as possible, I would prefer not to issue to a non-root user. Possibly not all of these permissions are needed, and I intend to trim them down to see what’s ACTUALLY required.

  • I also had to create a system tunable which, as far as I can tell, allows users to mount the filesystem. I’m not sure what the security implication is there, or if it’s no big deal. But it’s still something that seems “smelly”.

Removing Unnecessary ZFS Permissions

It appears that only a subset of the ZFS allow permissions are actually required. Issuing the below command removes the ones not needed for replication, leaving only the send delegation.


zfs unallow repusr create,destroy,diff,mount,readonly,receive,release,userprop PoolA/ToReplicate

Rebooting TestA and re-running the replication job, it appears to succeed still, even with only the send ZFS delegation. Which raises the question, if we don’t need the mount delegation, do we still need the tunable which appeared to relate to user mounting of things? To test, I went back and deleted it. Rebooted TestA again, just to be sure and re-ran the replication task.

Success! It appears that tunable wasn’t needed after all :open_mouth:

Let’s validate all this… Back on TestA:

  • Ensure that the ZFS delegations are as I believe

zfs allow PoolA/ToReplicate

Output:


root@truenas[~]# zfs allow PoolA/ToReplicate

---- Permissions on PoolA/ToReplicate --------------------------------

Local+Descendent permissions:

user repusr send

root@truenas[~]#

Confirming that only that single ZFS delegation is present.

  • Double checking, the tunable has also been deleted.

  • In SSH options, Log in as Root with Password and Allow Password Authentication are both off. I presume having them on won’t break this, but I prefer the security of leaving them off.

  • In user settings for repusr:

  • It has its own home folder.

  • Its primary group is repusr and it is not part of any other, higher-permission groups.

  • Remove its SSH key.

  • Disable Password is enabled.

  • Samba Authentication is disabled.

One more reboot of TestA to be sure.
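The delegation check from the first step above could also be scripted, so it can be re-run after any change. A minimal sketch, assuming the output layout shown earlier; the function name is made up for illustration, and the sample text is copied from my output so the sketch runs without a pool:

```shell
#!/bin/sh
# Print the comma-separated permissions delegated to one user, reading
# `zfs allow <dataset>` output on stdin.
delegated_perms() {
    awk -v u="$1" '$1 == "user" && $2 == u { print $3 }'
}

# Sample output copied from the zfs allow run on TestA.
sample='---- Permissions on PoolA/ToReplicate --------------------------------
Local+Descendent permissions:
    user repusr send'

perms=$(printf '%s\n' "$sample" | delegated_perms repusr)
if [ "$perms" = "send" ]; then
    echo "OK: repusr holds only the send delegation"
else
    echo "WARNING: unexpected delegations: $perms"
fi
```

On the real system the input would come from zfs allow PoolA/ToReplicate | delegated_perms repusr rather than the sample text.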

Over on TestB:

  • Heading to Storage > Snapshots delete everything.

  • In Tasks > Replication delete all.

  • In System > SSH Connections and SSH Keypairs delete all.

  • Reboot to be sure.

Setting up again, manually create an SSH keypair as we did earlier for repusr. Create an SSH connection based off that keypair. On TestA paste the public key into the repusr SSH key box. Create a new Replication task using that SSH connection. Run it.

Success.

Conclusion

So. It’s possible to do Replication without needing to use root to authenticate and it actually isn’t that complicated and doesn’t require (much) CLI trickery. Really only one command.

Is it secure? Well, I believe so, but you can make your own call.

TL;DR

When setting up remote replication between a pair of TrueNAS boxes, be aware that:

  • Semi-automatic connection setup doesn’t work with a dedicated non-root user. You need to make the SSH keypair manually on the NAS that initiates the connection (the one pulling), then paste the public key into the dedicated replication user on the NAS being pulled from. This is actually VERY easy. Instructions are above.

  • On the NAS being pulled from, you’ll need to issue a single CLI command which grants the send permission to the dedicated user. Initially it seemed like additional permissions and even a system tunable were needed, but that doesn’t appear to be the case.

  • The SSH service does have to be enabled on the NAS being pulled from, but not on the one initiating the connection. Password auth can be left disabled, as well as root SSH login. So that’s good.
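The SSH settings in the last point can be double-checked from a shell as well as in the GUI: sshd -T prints the effective configuration with lower-case keys. A small sketch that filters that output down to the two settings discussed here. The function name and the sample input are made up for illustration and are not captured from a real system:

```shell
#!/bin/sh
# Read `sshd -T` output on stdin and print the password and root login
# settings as key=value pairs, sorted by key.
ssh_auth_settings() {
    awk '$1 == "passwordauthentication" || $1 == "permitrootlogin" { print $1 "=" $2 }' | sort
}

# Illustrative input only, in the format sshd -T uses.
sample='port 22
permitrootlogin no
passwordauthentication no
pubkeyauthentication yes'

printf '%s\n' "$sample" | ssh_auth_settings
```

On the NAS itself this would be sshd -T | ssh_auth_settings, run as root. I have not checked which value TrueNAS writes for permitrootlogin when Log in as Root with Password is off, so treat anything other than an explicit no as something to look into rather than as a failure.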

Disclaimer

I’m not entirely sure whether there are hidden gotchas with this work-around. As far as I can tell, any files and folders I make on TestA/PoolA/ToReplicate make their way into TestB/PoolB/RepDest, but given we aren’t running as root, what might happen if there are files or folders on ToReplicate which repusr doesn’t have permission to read? Will it fail silently? Or will it copy the structure but no content? I really don’t know, and there are too many edge cases to test for.
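As far as I understand, zfs send streams the snapshot as a whole rather than reading individual files as repusr, so file-level permissions should not matter, but that is worth checking rather than assuming. One cheap check is to compare the snapshot names on both sides. A minimal sketch, assuming each list has been saved to a file with one snapshot name per line; the function name and the sample lists are made up so it runs without a pool:

```shell
#!/bin/sh
# List snapshot names found in the source list but not in the destination list.
# Both arguments are files holding one snapshot name (the part after "@") per line.
missing_on_destination() {
    src_sorted=$(mktemp) || return 1
    dst_sorted=$(mktemp) || return 1
    sort "$1" > "$src_sorted"
    sort "$2" > "$dst_sorted"
    comm -23 "$src_sorted" "$dst_sorted"   # lines unique to the source list
    rm -f "$src_sorted" "$dst_sorted"
}

# Made-up sample lists: the destination is one snapshot behind the source.
src_list=$(mktemp)
dst_list=$(mktemp)
printf '%s\n' auto-2025-01-19_13-50-2w auto-2025-01-19_13-51-2w auto-2025-01-19_13-52-2w > "$src_list"
printf '%s\n' auto-2025-01-19_13-50-2w auto-2025-01-19_13-51-2w > "$dst_list"

missing_on_destination "$src_list" "$dst_list"
```

The real lists could be produced with zfs list -H -t snapshot -o name -r PoolA/ToReplicate | cut -d@ -f2 on TestA and the equivalent for PoolB/RepDest on TestB. This only confirms that snapshots arrived, not that their contents match, so spot-checking a few files on the destination is still worthwhile.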

Personally, I’m just going to continue with my plan-B of using a cloud provider to back up the current version of NAS A and then have NAS B configured to pull a day later from that same provider and snapshot what it receives itself. No direct link between NAS devices, and no possible issues with permissions running as non-root.

Wow that’s a brilliant write up and great testing. Many thanks for reporting back with so much detail. I’ve no doubt this will help so many future users. Great work.

PS: if you have the energy and/or inclination, perhaps raise a feature request to make this process easier, as clearly there are some broken processes along the way. You could link to this post with your findings. Having a dedicated user do zfs send with limited permissions makes lots of sense, and I know security is at the forefront of the team’s mind.
