Replication to second TrueNAS system: 4 questions

I have a second, cloud-based TrueNAS system that I provisioned at ServaRica ($25/mo for 4TB of storage). I have 4 questions about full backups that I’d like to clarify.

  1. My snapshot task was taking zero-sized snapshots up until just before my first full backup. I then changed it to skip the zero-sized snapshots. This leads to a complete failure to back up, with an error message like: “cannot send main@auto-2024-09-23_02-00 recursively: snapshot main/time-machine/stk@auto-2024-09-23_02-00 does not exist
    warning: cannot send ‘main@auto-2024-09-23_02-00’: backup failed
    cannot receive: failed to read from stream.” The reason is obvious: the dataset didn’t change, so no snapshot was taken. MUST I turn on “take zero sized snaps”, or is there a way to do full recursive backups without it? I’m guessing I have to take the zero-sized snapshots or it won’t work. If so, the help text on this checkbox really should be updated.
  2. main/.system: Should I have the .system in my main pool or boot pool or does it even matter? What would you do? I’d imagine it’s better for .system to be in the pool since the configs would get backed up that way automatically.
  3. mnt/.ix-apps: this gets put inside my backup of my main pool as main/.ix-apps even though it’s outside the pool. I’m guessing I should do nothing and it works, but it’s weird since the snapshot says it’s in the pool but the shell says it is in /mnt.
  4. If I have a snapshot task called auto-… then when I create a replication task, it refers to it as “main-auto-” since you have to refer to the replication task to start automatically. But it appears that this has the effect of creating more snapshots with the name main-auto, which is completely unexpected. Here is a screenshot:

Looking forward to being educated on these four things.

In particular, I turned the creation of zero-sized snapshots back on and re-snapped in the periodic replication task, but the automatic backup still fails: it’s still complaining about the earlier snapshot that was supposed to be there and wasn’t, because I had zero-sized snapshots turned off for the previous run. So now I have to replicate from scratch to re-sync everything, and ALWAYS LEAVE “take zero sized snapshots” turned ON when doing recursive snaps, right??? Damn, I wish they had mentioned this in the help text.

Yes, you must allow zero sized snaps if you’re replicating.
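A recursive send needs the named snapshot to exist on every child dataset, so one way to spot the gaps is to compare the dataset list against the snapshot list. Here is a minimal sketch, assuming the pool is called main as in the question. The function name and the two list files are made up for illustration and are not part of TrueNAS.

```shell
# Print every dataset that lacks the snapshot named SNAP.
# On a live system the two input files could be produced with:
#   zfs list -H -o name -r main > datasets.txt
#   zfs list -H -o name -t snapshot -r main > snapshots.txt
# Usage: missing_snapshot SNAP datasets.txt snapshots.txt
missing_snapshot() {
    snap="$1"; datasets="$2"; snapshots="$3"
    while IFS= read -r ds; do
        # -F fixed string, -x match the whole line, -q no output
        if ! grep -Fxq "${ds}@${snap}" "$snapshots"; then
            printf '%s\n' "$ds"
        fi
    done < "$datasets"
}
```

Any dataset it prints would make a recursive send of that snapshot fail with the “does not exist” error quoted above.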

The pool/ix-apps dataset is hidden (not sure this is the right decision), but mounted at /mnt/.ix-apps

I don’t think there is much point replicating the system dataset. It’d be very hard to restore.

Instead, set up a periodic backup of your config with the multireport script.

You may have seen this, but I did mention the empty snapshots thing for replications in my Tiered Snapshots video

I do the above, and replicate the hourly and longer snapshots offsite, and would recommend that too.


At 70 ms and 1 Gbit/s, you have a bandwidth-delay product (BDP) of about 9MB.

You probably need a max TCP window size of at least 16MB.

You should check what it’s set to.

But to be honest, I haven’t checked how this is configured on SCALE.
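The BDP figure above is just link rate times round-trip time. A small sketch of the arithmetic follows; the function name is made up, and the inputs are the 1 Gbit/s and 70 ms quoted in this thread.

```shell
# Bandwidth-delay product in bytes.
# (rate_mbit * 1,000,000 / 8) bytes/s * (rtt_ms / 1000) s
# simplifies to rate_mbit * rtt_ms * 125.
bdp_bytes() {
    echo $(( $1 * $2 * 125 ))
}

bdp_bytes 1000 70   # prints 8750000, i.e. about 8.75 MB
```

Doubling that leaves some headroom, which is roughly where the 16MB suggestion lands.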

THANKS! So I have to trash my backup and start over due to the mistake I made to not create the zero snapshots? There is no recovery other than starting over???

Did you see my DM to you?

I’ll look into the window size. GREAT SUGGESTION!!!

YOU WERE RIGHT: Not even close to the mark. Will change and report back.

root@truenas[/mnt/main/user/stk]# sysctl net.ipv4.tcp_rmem
net.ipv4.tcp_rmem = 4096        131072  6291456
root@truenas[/mnt/main/user/stk]# sysctl net.ipv4.tcp_wmem
net.ipv4.tcp_wmem = 4096        16384   4194304
root@truenas[/mnt/main/user/stk]#      
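The third field of each triple is the maximum buffer size in bytes, and that is the number to compare against the suggested 16MB. A minimal sketch of the comparison, assuming the 16777216-byte target; the helper name is made up for illustration.

```shell
# Succeeds (exit 0) when the max (third) field of a "min default max"
# triple is below TARGET bytes.
# On a live system: max_below "$(sysctl -n net.ipv4.tcp_rmem)" 16777216
max_below() {
    triple="$1"; target="$2"
    set -- $triple          # split the triple on spaces or tabs
    [ "$3" -lt "$target" ]
}

# Values copied from the output above.
if max_below "4096 131072 6291456" 16777216; then
    echo "tcp_rmem max is below 16 MiB"
fi
if max_below "4096 16384 4194304" 16777216; then
    echo "tcp_wmem max is below 16 MiB"
fi
```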

Delete the range of snapshots taken while there were missing snapshots.

Then there won’t be missing snapshots, and the replication will be able to run. Since it hadn’t run since it hit an empty snapshot, it should then continue from where you were.
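From the shell, a range of snapshots can be named as first%last in zfs destroy. The sketch below only prints a dry-run command rather than running anything. The end snapshot name is hypothetical, and with children that are missing some snapshots the range may not resolve the same way on every dataset, so read the dry-run output carefully before dropping the -n.

```shell
# Print (do not run) a dry-run command covering every snapshot from
# FIRST through LAST on DATASET and its children.
# zfs destroy flags: -r recurse into children, -n dry run, -v verbose.
# Usage: destroy_range_cmd DATASET FIRST LAST
destroy_range_cmd() {
    printf 'zfs destroy -rnv %s@%s%%%s\n' "$1" "$2" "$3"
}

# The second snapshot name here is a made-up example.
destroy_range_cmd main auto-2024-09-23_02-00 auto-2024-09-24_02-00
```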


Oh, thanks. You just saved me 2 days of backup. That makes total sense. Easy peasy. THANK YOU!!!


Wow. That made a difference. I was able to get 1 second at 800 Mb/sec!!!

Still playing with the numbers.

root@truenas[/mnt/main/user/stk]# iperf3 -c backup.skirsch.com
Connecting to host backup.skirsch.com, port 5201
[  5] local 192.168.1.115 port 40586 connected to 162.250.191.179 port 5201
[ ID] Interval           Transfer     Bitrate         Retr  Cwnd
[  5]   0.00-1.00   sec  48.6 MBytes   408 Mbits/sec    1   6.75 MBytes
[  5]   1.00-2.00   sec  95.0 MBytes   797 Mbits/sec    1   5.01 MBytes
[  5]   2.00-3.00   sec  41.2 MBytes   346 Mbits/sec  1798   2.57 MBytes
[  5]   3.00-4.00   sec  33.8 MBytes   283 Mbits/sec  155   1.90 MBytes
[  5]   4.00-5.00   sec  27.5 MBytes   231 Mbits/sec    0   2.01 MBytes
[  5]   5.00-6.00   sec  22.5 MBytes   189 Mbits/sec    1   1.50 MBytes
[  5]   6.00-7.00   sec  17.5 MBytes   147 Mbits/sec  264   1.12 MBytes
[  5]   7.00-8.00   sec  15.0 MBytes   126 Mbits/sec    0   1.19 MBytes
[  5]   8.00-9.00   sec  17.5 MBytes   147 Mbits/sec    0   1.24 MBytes
[  5]   9.00-10.00  sec  17.5 MBytes   147 Mbits/sec    0   1.27 MBytes

sudo sysctl -w net.ipv4.tcp_rmem="4096 131072 16777216"
sudo sysctl -w net.ipv4.tcp_wmem="4096 87380 16777216"

Second attempt:

sudo sysctl -w net.ipv4.tcp_wmem="4096 87380 18750000"
net.ipv4.tcp_rmem = 4096 87380 18750000
net.ipv4.tcp_wmem = 4096 87380 18750000
root@truenas[/mnt/main/user/stk]# iperf3 -c backup.skirsch.com
Connecting to host backup.skirsch.com, port 5201
[  5] local 192.168.1.115 port 48792 connected to 162.250.191.179 port 5201
[ ID] Interval           Transfer     Bitrate         Retr  Cwnd
[  5]   0.00-1.00   sec  57.8 MBytes   484 Mbits/sec   36   14.5 MBytes
[  5]   1.00-2.00   sec   109 MBytes   912 Mbits/sec    2   7.34 MBytes
[  5]   2.00-3.00   sec  61.2 MBytes   514 Mbits/sec   51   2.63 MBytes
[  5]   3.00-4.00   sec  31.2 MBytes   262 Mbits/sec  200   1.94 MBytes
[  5]   4.00-5.00   sec  28.8 MBytes   241 Mbits/sec    0   2.05 MBytes
[  5]   5.00-6.00   sec  30.0 MBytes   252 Mbits/sec    0   2.13 MBytes
[  5]   6.00-7.00   sec  22.5 MBytes   189 Mbits/sec    1   1.59 MBytes
[  5]   7.00-8.00   sec  22.5 MBytes   189 Mbits/sec    1   1.17 MBytes
[  5]   8.00-9.00   sec  16.2 MBytes   136 Mbits/sec    0   1.25 MBytes
[  5]   9.00-10.00  sec  17.5 MBytes   147 Mbits/sec    0   1.31 MBytes
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bitrate         Retr
[  5]   0.00-10.00  sec   396 MBytes   333 Mbits/sec  291             sender
[  5]   0.00-10.07  sec   386 MBytes   322 Mbits/sec                  receiver

Basically, you have an Elephant Flow through a Long Fat Pipe :wink:

This may help.


Thanks. I’ll take a look. Here are the best settings so far. The buffer size for 1Gbps is 8.1MB

sudo sysctl -w net.ipv4.tcp_rmem="4096        131072  8750000"
sudo sysctl -w net.ipv4.tcp_wmem="4096        16384   8750000"
net.ipv4.tcp_rmem = 4096        131072  8750000
net.ipv4.tcp_wmem = 4096        16384   8750000
root@truenas[/mnt/main/user/stk]# iperf3 -c backup.skirsch.com
Connecting to host backup.skirsch.com, port 5201
[  5] local 192.168.1.115 port 34834 connected to 162.250.191.179 port 5201
[ ID] Interval           Transfer     Bitrate         Retr  Cwnd
[  5]   0.00-1.00   sec  30.6 MBytes   257 Mbits/sec    0   9.64 MBytes
[  5]   1.00-2.00   sec  61.2 MBytes   514 Mbits/sec  118   4.62 MBytes
[  5]   2.00-3.00   sec  60.0 MBytes   503 Mbits/sec    0   4.84 MBytes
[  5]   3.00-4.00   sec  66.2 MBytes   556 Mbits/sec    0   4.84 MBytes
[  5]   4.00-5.00   sec  66.2 MBytes   556 Mbits/sec    0   4.84 MBytes
[  5]   5.00-6.00   sec  58.8 MBytes   493 Mbits/sec    2   3.40 MBytes
[  5]   6.00-7.00   sec  48.8 MBytes   409 Mbits/sec    0   3.58 MBytes
[  5]   7.00-8.00   sec  48.8 MBytes   409 Mbits/sec    2   2.61 MBytes
[  5]   8.00-9.00   sec  30.0 MBytes   252 Mbits/sec    1   1.94 MBytes
[  5]   9.00-10.00  sec  26.2 MBytes   220 Mbits/sec    0   2.04 MBytes
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bitrate         Retr
[  5]   0.00-10.00  sec   497 MBytes   417 Mbits/sec  123             sender
[  5]   0.00-10.07  sec   492 MBytes   410 Mbits/sec                  receiver

iperf Done.

The Cloudflare recommendations made things worse. But I did find a way to make things run 3X faster! Check this out. This is not a fluke:

root@truenas:/mnt/main/user/stk# iperf3 -c backup.skirsch.com
Connecting to host backup.skirsch.com, port 5201
[  5] local 192.168.1.115 port 44864 connected to 162.250.191.179 port 5201
[ ID] Interval           Transfer     Bitrate         Retr  Cwnd
[  5]   0.00-1.00   sec  40.8 MBytes   343 Mbits/sec    0   14.5 MBytes
[  5]   1.00-2.00   sec  81.8 MBytes   686 Mbits/sec   13   15.2 MBytes
[  5]   2.00-3.00   sec  80.1 MBytes   672 Mbits/sec    2   14.5 MBytes
[  5]   3.00-4.00   sec  73.5 MBytes   617 Mbits/sec    3   15.2 MBytes
[  5]   4.00-5.00   sec  80.8 MBytes   678 Mbits/sec    2   14.2 MBytes
[  5]   5.00-6.00   sec  79.4 MBytes   666 Mbits/sec    2   14.2 MBytes
[  5]   6.00-7.00   sec  71.2 MBytes   597 Mbits/sec  591   15.4 MBytes
[  5]   7.00-8.00   sec  81.2 MBytes   681 Mbits/sec    2   15.3 MBytes
[  5]   8.00-9.00   sec  81.5 MBytes   683 Mbits/sec   13   14.8 MBytes
[  5]   9.00-10.00  sec  74.7 MBytes   627 Mbits/sec    3   14.3 MBytes
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bitrate         Retr
[  5]   0.00-10.00  sec   745 MBytes   625 Mbits/sec  631             sender
[  5]   0.00-10.07  sec   745 MBytes   620 Mbits/sec                  receiver

iperf Done.
root@truenas:/mnt/main/user/stk# iperf3 -c backup.skirsch.com
Connecting to host backup.skirsch.com, port 5201
[  5] local 192.168.1.115 port 45012 connected to 162.250.191.179 port 5201
[ ID] Interval           Transfer     Bitrate         Retr  Cwnd
[  5]   0.00-1.00   sec  35.0 MBytes   294 Mbits/sec    1   20.0 MBytes
[  5]   1.00-2.00   sec  86.4 MBytes   725 Mbits/sec    1   14.3 MBytes
[  5]   2.00-3.00   sec  61.8 MBytes   519 Mbits/sec    7   16.1 MBytes
[  5]   3.00-4.00   sec  82.1 MBytes   689 Mbits/sec    2   14.6 MBytes
[  5]   4.00-5.00   sec  91.1 MBytes   764 Mbits/sec    1   14.3 MBytes
[  5]   5.00-6.00   sec  71.2 MBytes   597 Mbits/sec    5   15.2 MBytes
[  5]   6.00-7.00   sec  78.2 MBytes   656 Mbits/sec    2   6.81 MBytes
[  5]   7.00-8.00   sec  80.5 MBytes   676 Mbits/sec  611   6.79 MBytes
[  5]   8.00-9.00   sec  71.5 MBytes   599 Mbits/sec   14   14.8 MBytes
[  5]   9.00-10.00  sec  83.0 MBytes   696 Mbits/sec    2   14.9 MBytes
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bitrate         Retr
[  5]   0.00-10.00  sec   741 MBytes   621 Mbits/sec  646             sender
[  5]   0.00-10.07  sec   741 MBytes   617 Mbits/sec                  receiver

iperf Done.
root@truenas:/mnt/main/user/stk#
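To compare runs like the ones above without eyeballing the tables, the sender summary line can be pulled out of the iperf3 text output. This is a minimal sketch: the function name is made up, and it assumes the rate is reported in Mbits/sec (iperf3 accepts -f m to force that unit).

```shell
# Read iperf3 text output on stdin and print the sender-side average
# bitrate, i.e. the number just before "Mbits/sec" on the summary line
# that ends in "sender".
# On a live system: iperf3 -c backup.skirsch.com -f m | sender_mbits
sender_mbits() {
    awk '$NF == "sender" {
        for (i = 2; i <= NF; i++)
            if ($i == "Mbits/sec") print $(i - 1)
    }'
}
```

Fed the two summary blocks from the last pair of runs, it prints 625 and 621.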