Is there a way to set up a task that rsyncs a snapshot after it is made? It would be similar to replication, but using rsync as the mechanism.
How would you go about rsync'ing a snapshot? Snapshots are not mounted; by default you can only rsync the state of the live dataset.
Of course, with a custom script that takes the snapshot, mounts it, and then calls rsync, that is entirely possible. It is just not available in the UI or with standard TrueNAS behaviour.
Sorry, didn't think that through.
rsync works at the file layer and is completely oblivious to the underlying file system. It doesn't matter whether it's ZFS, UFS, EXT4 … as long as a minimal set of POSIX semantics is implemented.
ZFS snapshots are an internal ZFS data structure. To send them somewhere else you need to use zfs send.
Thanks for your reply.
I understand that the best way to back up a ZFS dataset is to zfs send a snapshot. But I don't have another machine that runs ZFS and has sufficient storage space for the dataset I would like to back up. It is not ideal to rsync live data, especially while there are databases running on it.
On the Data Protection pane of the web UI, a user can create a schedule to take snapshots and a schedule to run an rsync task. I naively thought they could be linked to each other to automate the backup task.
I guess I better create a feature request on this.
In the meantime, I'll try my luck with shell-script-fu.
The feature request won't be helpful, because rsync and ZFS snapshots operate at two different layers of the stack.
You can turn a snapshot into a file that you can copy to a remote location, though.
zfs send <pool>/path/to/dataset@snapshot | gzip -c > /mnt/some/path/to/a/directory/mysnapshot.gz
That will work to a remote machine over SSH, of course:
zfs send <pool>/path/to/dataset@snapshot | gzip -c | ssh user@remote.machine "cat >/mnt/some/path/to/a/directory/mysnapshot.gz"
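For completeness: a file produced this way can only be turned back into a dataset by another ZFS system. A minimal restore sketch, with hypothetical pool and path names, might look like:

```shell
# Decompress the saved stream and receive it into a new dataset
# (pool/path names here are placeholders, not from the thread):
gzip -dc /mnt/some/path/to/a/directory/mysnapshot.gz | zfs receive tank/restored
```

Until it is received somewhere, the `.gz` file is opaque; you cannot browse individual files inside it the way you can with an rsync'ed copy.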
I suppose you can grab the latest snapshot name and use that as the 'source root' for the rsync command.
An example in Core / FreeBSD:
LATESTSNAP=$(zfs list -H -t snap -o name -s creation mypool/mydata | tail -n 1 | sed 's/.*@//')
Then later in the script, you use this variable as the 'source root' of your rsync command:
rsync -options /mnt/mypool/mydata/.zfs/snapshot/$LATESTSNAP/ backup.ip.address:/path/to/sync/
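The name-extraction step in the example above can be checked without touching ZFS at all, by feeding the pipeline mock `zfs list` output (hypothetical pool and snapshot names):

```shell
# Mock output of `zfs list -H -t snap -o name -s creation`,
# which lists snapshots oldest-first:
snaps='mypool/mydata@auto-20240420.0000-6m
mypool/mydata@auto-20240504.0000-6m
mypool/mydata@auto-20240511.0000-6m'
# Take the last (most recent) line and strip everything up to the '@':
LATESTSNAP=$(printf '%s\n' "$snaps" | tail -n 1 | sed 's/.*@//')
echo "$LATESTSNAP"   # auto-20240511.0000-6m
```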
UPDATE: If you want 'seamless automation', then you'd also have to include snapshot creation in your script, since I don't believe the GUI can communicate with a custom script.
Yeah, I have just written a shell script to accomplish this.
I just find the latest snapshot and rsync it every day. And I added a twist by keeping 2 copies alternately, in case a problem arises while rsync'ing.
Do you mind sharing the script?
You can filter/censor any identifiable names or strings.
EDIT: I personally would have no need for it, but I can definitely see its usefulness for non-ZFS / non-TrueNAS destinations.
(I'm familiar with being hit by the error 'Files have changed!' during an rsync run.)
Sure.
Here is my script.
(Edited: please skip this. A better version is listed below.)
#!/bin/bash
# rsync the latest snapshot via rsync daemon
# settings
#---------
# the directory to be backed up
dataset='/mnt/tank/example'
# the destination protocol
dest_protocol='rsync:'
# the destination user
dest_user='example'
# the destination host
dest_host='192.168.1.44'
# the destination port
dest_port='873'
# the module and path on the host
dest_path='rocker_backup/docker'
# the password file
dest_pw_file='/path/to/dest.rsync-pw'
# the log directory
log_dir='/path/to/log/rsync-docker'
# the log file name
log_file_name="dsm-rsync"
# number of days to keep log files
logs_keep_for_days='30'
# the recipient of the email in case of error
recipient='example@example.com'
# the process
#------------
# the prefix to snapshots
target_parent_dir="${dataset}/.zfs/snapshot"
# the protocol, user host and port of the destination
dest_prefix="${dest_protocol}//${dest_user}@${dest_host}:${dest_port}"
# current date
ds=$(/bin/date +'%Y-%m-%d')
# current time
ts=$(/bin/date +'%H:%M:%S%:z')
# the log file name
log_file="${log_dir}/${log_file_name}-${ds}.log"
# we keep 2 copies by appending a '0' or '1' to the destination alternatively every day.
d=$(date +%s)
postfix=$(((${d}/86400)%2))
dest="${dest_path}${postfix}"
# clean up old logs
/bin/find "$log_dir" -type f -mtime +$logs_keep_for_days -delete
# log the following output to $log_file
{
echo "===="
echo "destination: '$dest'"
echo "$ts rsync starts."
echo
# find the latest snapshot
eval "files=($(/bin/ls -t --quoting-style=shell-always $target_parent_dir))"
if ((${#files[@]} <= 0)) ; then
# no snapshot found
/bin/echo "Error: There is no snapshot in '$target_parent_dir'."
# send an email to notify the recipient
/bin/printf "Backup '${dataset}' failed. No snapshot is found." | /bin/mail -s "Backup '${dataset}' failed for no snapshot found" "$recipient"
exit 1
fi
# latest snapshot found
target="${target_parent_dir}/${files[0]}"
# rsync
/bin/rsync -rlptDvx --delete --password-file="$dest_pw_file" "${target}/" "${dest_prefix}/${dest}"
# capture the exit code
code=$?
# current time
ts=$(/bin/date +'%H:%M:%S%:z')
/bin/echo
if [ $code -eq 0 ] ; then
# backup succeeded without errors
/bin/echo "$ts rsync done."
exit 0
fi
# backup failed
/bin/echo "$ts rsync failed with code $code."
# send an email to notify the recipient
/bin/printf "Backup '${dataset}' failed. Please consult '${log_file}' for details." | /bin/mail -s "Backup '${dataset}' failed" "$recipient"
# log to $log_file
} >> "$log_file" 2>&1
Prerequisites for running the script:
- An rsync daemon has to be set up on the destination host, because the rsync versions on the two hosts mismatched. Instructions for enabling rsyncd on Synology; for general Linux: How to Set Up an Rsync Daemon on Your Linux Server
- mail has to be set up on TrueNAS SCALE/Core in order to send notification emails; ssmtp is a good choice for web email providers. Here is a tutorial: How to Use SSMTP to Send an Email from Linux Terminal
A few notes for fellows using this script:
- Adjust the setting variables from line 8 to 31 to fit your particular requirements.
- Fill the password file assigned to the variable dest_pw_file on line 21 with the password of the destination user assigned to the variable dest_user on line 13. It is the password of the user defined on the rsyncd daemon of the destination host, not the password of a system user of the destination host.
- It keeps an exact copy of the target, which means extra files and directories are deleted on the destination.
- It cannot keep the owners and groups of the files and directories on the destination; they belong to the user connected to the destination host instead. In order to keep the users and groups of the backup, dest_user has to be root and the rsync options on line 81 have to be changed from '-rlptDvx' to '-avx' or '-rlptgoDvx'. For security reasons, I don't recommend running as root across network devices.
- The compress option of rsync fails frequently; that's why I don't enable it.
- As rsync daemon traffic is unencrypted, only run this script on a trusted LAN or over a VPN.
- It does not back up any linked external device. Remove 'x' from the options above (note 4) in order to back up external devices mounted on the target.
- As it has to access the snapshot directory, the script has to be executed as root.
- Because this script backs up 2 copies on alternating days, you need to create the destination directories for the variable dest_path on line 19: one with the character '0' appended and another with '1' appended. I.e. for the setting dest_path='rocker_backup/docker', there should be 2 directories, 'docker0' and 'docker1', under the rocker_backup module directory.
- The directory assigned to the variable log_dir on line 24 has to exist before running the script.
- In regions with a daylight saving policy, the cron job executing this script should run once a day, at 1 am or later (or after whatever daylight saving advancement applies in your region). Otherwise you may end up backing up over the latest copy again on the day daylight saving takes effect or standard time is restored. If anything then goes wrong while backing up, the other intact copy is 2 days old.
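The alternating 0/1 suffix described above is just the day number since the epoch, modulo 2. Wrapped in a hypothetical helper for illustration, the arithmetic from the script behaves like this:

```shell
# Same arithmetic as the script's postfix computation:
# day number since the Unix epoch, mod 2.
# suffix_for_epoch is a hypothetical helper, not part of the script.
suffix_for_epoch() {
    echo $(( ($1 / 86400) % 2 ))
}
suffix_for_epoch 1716336000   # day 19865 -> prints 1
suffix_for_epoch 1716422400   # day 19866 -> prints 0
```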
There appears to be a 'mistake' in the script at this section:
# find the latest snapshot
eval "files=($(/bin/ls -t --quoting-style=shell-always $target_parent_dir))"
if ((${#files[@]} <= 0)) ; then
# no snapshot found
/bin/echo "Error: There is no snapshot in '$target_parent_dir'."
# send an email to notify the recipient
/bin/printf "Backup '${dataset}' failed. No snapshot is found." | /bin/mail -s "Backup '${dataset}' failed for no snapshot found" "$recipient"
exit 1
fi
# latest snapshot found
target="${target_parent_dir}/${files[0]}"
From my understanding, you're using ls to list the contents of the directory /mnt/<poolname>/<dataset>/.zfs/snapshot
This will not yield the latest snapshot if different naming schemes are involved, for example a mixture of snapshot name prefixes such as 'auto-', 'manual-', 'temporary-', and 'backup-'.
The safest bet is to use the zfs command itself, since by default it sorts snapshots by their creation time. (This can also be invoked explicitly, or even in reverse order. Left alone, it presents the order from oldest to most recent.)
That's why I suggested the zfs list example above, which will yield output such as:
auto-20240511.0000-6m
EDIT: Using the -t flag does not always get around it. For example, here's a sample of one of my dataset's snapshots. Notice the modification date compared to the snapshot's name, and hence the snapshot's 'creation time'?
Look at this sample. Notice the collisions?
Apr 28 14:23 auto-20240504.0000-6m
Apr 28 14:23 auto-20240511.0000-6m
Apr 16 19:55 auto-20240420.0000-6m
Apr 16 19:55 auto-20240427.0000-6m
Using your script, it would believe the 'latest' snapshot is from May 4, 2024, when in reality the latest snapshot is from May 11, 2024.
Yes, the snapshots are different, and there have been changes / new files between May 4 and May 11. It's just that the root folder might not have had new files written to it directly, while subfolders do indeed have new or modified files.
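The collision can be reproduced without ZFS, using ordinary files whose names and modification times match the sample listing above (the names stand in for snapshot directory names):

```shell
# Recreate the sample listing in a temporary directory.
tmp=$(mktemp -d)
touch -t 202404161955 "$tmp/auto-20240420.0000-6m" "$tmp/auto-20240427.0000-6m"
touch -t 202404281423 "$tmp/auto-20240504.0000-6m" "$tmp/auto-20240511.0000-6m"
# 'ls -t' sorts by modification time, so it cannot tell the two
# May snapshots apart; the "latest" entry is just one of the tied pair,
# not necessarily the snapshot that was actually created last.
first=$(ls -t "$tmp" | head -n 1)
echo "$first"
rm -rf "$tmp"
```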
-t    sort by time, newest first; see --time
Read my post again. I explained (in a post-edit addition) why using -t does not fix the underlying problem.
Oh.
Yes. The time in the listing is that of the snapshotted directory, not the time the snapshot was made.
Bingo. That's why the 'proper' way to grab the latest snapshot (and to always be 100% correct) is to use zfs (against the dataset) instead of ls (against the hidden snapshot directory).
Thank you. I have made some adjustments based on your suggestion.
(Edit: The following script has been updated on May 23, 2024.)
Change log
May 24, 2024
- changed to encrypted transmission via ssh instead of rsyncd daemon mode
May 23, 2024
- options for snapshot creation, removal and recursion added
- the dataset is now looked up via the zfs command; setting it manually is no longer required
- more robust snapshot finding
- the destination path will be created if it does not exist
- the snapshot is logged
May 22, 2024
- more accurate finding of the latest snapshot
- the dataset setting is required
- the dataset is logged
#!/bin/bash
# rsync the latest snapshot over ssh
# updated on 24 May 2024
# settings
#---------
# the rsync on destination
rsync_path='/bin/rsync'
# the mount point of the dataset to be rsync'ed
target_mount_point='/path/to/target/dataset'
# the destination ssh user
dest_user='someone'
# dest ssh identity file. The default $HOME/.ssh/id files will be used if omitted or empty
dest_ssh_id_file='/path/to/ssh/id/file'
# the destination host
dest_host='192.168.x.x'
# the module and path on the host
dest_path='path/to/destination'
# the log directory
log_dir='/path/to/log/rsync'
# the log file name
log_file_name="rsync-dataset"
# number of days to keep log files
logs_keep_for_days='30'
# the recipient of the email in case of error
recipient='someone@example.com'
# snapshot options
# 0 = use the latest snapshot without taking one
# 1 = take a snapshot before rsync
# 2 = remove the snapshot afterwards
snapshot_option=0
# recursive snapshot?
# 0 = non recursive snapshot
# 1 = recursive snapshot
snapshot_recursive=1
# the process
#------------
# the prefix to snapshots
target_parent_dir="${target_mount_point}/.zfs/snapshot"
# current date
ds=$(date +'%Y-%m-%d')
# current time
ts=$(date +'%H:%M:%S%:z')
# the log file name
log_file="${log_dir}/${log_file_name}-${ds}.log"
# seconds from epoch
d=$(date +%s)
# convert to days mod 2
postfix=$(((${d}/86400)%2))
# we keep 2 copies by appending a '0' or a '1' to the destination, alternating every day.
dest="${dest_path}${postfix}"
# clean up old logs
find "$log_dir" -type f -mtime +$logs_keep_for_days -delete
# log the following output to $log_file
{
echo "===="
# find the dataset
dataset=$(zfs list -H -o name "$target_mount_point")
if ((${#dataset} <= 0)) ; then
echo "Error: '$target_mount_point' is not a dataset mount point."
# send an email to notify the recipient
printf "Backup '${target_mount_point}' failed. It is not a dataset mount point." | mail -s "Backup '${target_mount_point}' failed. It is not a dataset mount point." "$recipient"
exit 5
fi
ls "${target_mount_point}/.zfs" > /dev/null 2>&1
code=$?
if (( $code != 0)) ; then
echo "Error: '$target_mount_point' is not a dataset mount point."
# send an email to notify the recipient
printf "Backup '${target_mount_point}' failed. It is not a dataset mount point." | mail -s "Backup '${target_mount_point}' failed. It is not a dataset mount point." "$recipient"
exit 5
fi
echo "dataset: '$dataset'"
echo "destination: '$dest'"
# get the snapshot
if (( $snapshot_option >= 1 )) ; then
# take a snapshot
latest_snapshot="rsync-$(date +'%Y-%m-%dT%H-%M-%S_%Z')"
(( $snapshot_recursive == 0 )) && rec= || rec='-r'
echo "snapshot '${dataset}@${latest_snapshot}' creating..."
zfs snapshot $rec "${dataset}@${latest_snapshot}"
code=$?
if (($code != 0)); then
echo "snapshot '${dataset}@${latest_snapshot}' creation failed with code: $code."
exit 3
fi
echo "snapshot '${dataset}@${latest_snapshot}' created."
else
# use the latest snapshot
latest_snapshot=$(zfs list -H -t snap -o name -s creation "$dataset" | tail -n 1 | sed 's#.*@\([^/]*\)$#\1#')
if ((${#latest_snapshot} <= 0)) ; then
# no snapshot found
echo "Error: There is no snapshot of the dataset found."
# send an email to notify the recipient
printf "Backup '${dataset}' failed. No snapshot was found." | mail -s "Backup '${dataset}' failed for no snapshot found" "$recipient"
exit 1
fi
fi
# latest snapshot found
target="${target_parent_dir}/${latest_snapshot}"
# the ssh id argument
(( ${#dest_ssh_id_file} == 0 )) || id_arg="-i '$dest_ssh_id_file'"
echo "target: '$target'"
echo "$ts rsync starts."
echo
# rsync
rsync -rlptDvx --delete -e "ssh $id_arg" --rsync-path="$rsync_path" "${target}/" "${dest_user}@${dest_host}:${dest}"
# capture the exit code
code=$?
# current time
ts=$(date +'%H:%M:%S%:z')
echo
if (( $code == 0 )) ; then
# backup succeeded without errors
if [ $snapshot_option -ge 2 ] ; then
# remove the snapshot created
echo "snapshot '${dataset}@${latest_snapshot}' removal..."
zfs destroy "${dataset}@${latest_snapshot}"
code=$?
if (($code != 0)) ; then
echo "snapshot '${dataset}@${latest_snapshot}' removal failed with code: $code."
# send an email to notify the recipient
printf "snapshot '${dataset}@${latest_snapshot}' removal failed with code: $code." | mail -s "Backup '${dataset}' snapshot removal failed" "$recipient"
else
echo "snapshot '${dataset}@${latest_snapshot}' removed."
fi
fi
echo "$ts rsync done."
exit 0
fi
# backup failed
echo "$ts rsync failed with code $code."
if [ $snapshot_option -ge 2 ] ; then
# remove the snapshot created
echo "snapshot '${dataset}@${latest_snapshot}' removal..."
zfs destroy "${dataset}@${latest_snapshot}"
code=$?
if (($code != 0)) ; then
echo "snapshot '${dataset}@${latest_snapshot}' removal failed with code: $code."
else
echo "snapshot '${dataset}@${latest_snapshot}' removed."
fi
fi
# send an email to notify the recipient
printf "Backup '${dataset}' failed. Please consult '${log_file}' for details." | mail -s "Backup '${dataset}' failed" "$recipient"
exit 2
# log to $log_file
} 2>&1 | tee -a "$log_file"
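To schedule it along the lines suggested in the notes (daily, after any daylight saving shift), a crontab entry could look like the following; the script path is hypothetical:

```
# m h dom mon dow  command
30 1 *   *   *     /root/scripts/rsync-latest-snapshot.sh
```

On TrueNAS the same schedule can be configured as a Cron Job in the web UI rather than editing root's crontab directly.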
Let us know if the script works!
I can definitely see some good use for this.
(Maybe after a few 'successful' runs, you can be fairly confident it will continue to work as intended?)