I’m replicating from my TrueNAS to a TrueNAS server running at ServaRica in Canada.
I’m averaging well under 20MB/sec of throughput because the Replication task uses a single stream, so it takes over a day per TB to back up, even though I’m paying for many times that bandwidth. There doesn’t appear to be any option to increase the number of streams.
This seems like a huge lost opportunity for some impressive performance gains, especially if you have to restore from an offsite backup quickly: you REALLY want to get a lot of streams going in parallel.
Why is there no option to increase the number of streams or specify a bandwidth cap? I’m backing up many datasets so there should be no reason that the restore can’t be one stream per dataset, for example.
Is this just a priority feature issue, or am I missing something?
I didn’t see a control for # of streams in the configuration.
Also, there isn’t a control for conflict resolution either. If I didn’t set the destination to read-only and I modify the backup dataset, does replication keep the change, overwrite it, or ask? There doesn’t seem to be an option to set the behavior to keep, overwrite, or ask. So what does it do?
It overwrites it. The result of replication is that the destination dataset is an exact duplicate of the source dataset as of the time of the snapshot that was replicated. It doesn’t merge, and it doesn’t ask.
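If you want to make sure you never hit that case, you can mark the destination read-only yourself at the ZFS level. A minimal sketch (dataset and host names are placeholders):

```
# Guard against accidental edits on the backup side:
zfs set readonly=on tank/backup/mydata

# A forced receive behaves the same way: any local changes on the
# destination are rolled back to the last common snapshot before the
# new incremental snapshot is applied.
zfs send -i tank/mydata@snap1 tank/mydata@snap2 | \
    ssh backup-host zfs receive -F tank/backup/mydata
```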
As to your main question, I don’t believe it’s possible to send ZFS replication over multiple streams simultaneously. That isn’t a limit of TrueNAS; it’s a limit of ZFS. As I understand it, at least.
Yeah. Long shot, but it is an easy thing to test, and it certainly makes a difference on my backup system, which doesn’t support AES-NI (though the VPN does): the difference between 20MB/s and gigabit.
I find SSH tends to saturate a TCP connection given enough data and decent AES support.
The iperf tests would be the most interesting thing.
Should be plenty of data for the TCP stream to open its window and reach good speed.
I use iperf3 on both my TrueNAS servers (main and backup).
The limitation is simply how fast you can push bits down a single TCP/IP stream between remote sites. That per-stream limit is around 20MB/sec if the sites are far apart (like in another country), so the number of streams becomes the limiting factor for total throughput. iperf3 proves that. So does wget. So does scp. All show the same speeds between sites.
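For anyone who wants to repeat that measurement, the single-stream test is trivial to run (hostname is a placeholder):

```
# On the remote box, start the server:
iperf3 -s

# From the local box, run a single-stream test for 30 seconds:
iperf3 -c backup.example.com -t 30
```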
So when you do a ZFS replication to another site, isn’t each snapshot sent independently, in ANY order? Each snap could then go in a separate stream, which would speed things up by a factor of 5 or more. That’s quite huge when you are restoring data and need it fast.
My question is: was there some reason they use ONE stream and send the snapshots serially vs. sending all the snaps in parallel?
It just seems iX left a lot on the table here in terms of potential speed.
Here are the iperf3 results between my sites for one stream vs. five streams. Using -R gave the same numbers. The speed is quite variable from second to second.
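For reference, the multi-stream and reverse runs look something like this (hostname is a placeholder):

```
# Five parallel streams for 30 seconds:
iperf3 -c backup.example.com -P 5 -t 30

# Same test in reverse (server sends to client):
iperf3 -c backup.example.com -P 5 -t 30 -R
```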
Thanks for the overwrite clarification. That’s what I thought; the AI chatbots got it wrong.
As for the ZFS limitation of a single stream, that makes no sense to me. Each dataset is independent and can be restored independently. Is there some sort of “global lock” preventing more than one dataset from being modified at a time? That seems really counter-intuitive: it would imply you can only write to one dataset at a time.
So I’m skeptical it is a ZFS limitation. Am I missing something?
I believe with ZFS you can send separate streams (to/from separate datasets).
However, if you’re issuing a recursive replication, then it’s treated as a single stream, even though multiple datasets are involved. (A single “task”.)
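If you wanted to parallelize by hand, one stream per dataset, a rough sketch looks like this (pool, dataset, host, and snapshot names are placeholders; error handling omitted):

```
#!/bin/sh
# Send each dataset in its own stream, all in parallel.
# Assumes matching @base snapshots already exist on the destination.
for ds in data1 data2 data3; do
    zfs send -i "tank/${ds}@base" "tank/${ds}@today" | \
        ssh backup-host zfs receive -F "tank/backup/${ds}" &
done
wait   # block until every background send finishes
```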
I use ServaRica and mine sends much faster (I don’t have an exact number, but that’s slow!); my nightly replication runs about 4 minutes. Not sure about other countries, though. I would check the RTT. The TCP window size could be adjusted if need be to improve on that speed.
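A quick back-of-the-envelope shows why the window matters; the 100 ms RTT here is just an assumed figure for a long-haul link:

```
# Measure the actual RTT first (hostname is a placeholder):
ping -c 10 backup.example.com

# Bandwidth-delay product = bandwidth x RTT
# 1 Gbps = 125 MB/s; at 100 ms RTT:
#   125 MB/s x 0.1 s = 12.5 MB of data must be in flight
# If the TCP window tops out at ~2 MB, a single stream is capped near
#   2 MB / 0.1 s = 20 MB/s  -- which matches the numbers in this thread.
```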
It is a ZFS limitation. You’d need to talk to the ZFS folks if you want to understand the details and why. That said, I don’t think it’s your issue here. That speed is slow, too slow.
Both have 1Gbps or higher internet connections and are not loaded. I think you’ll rarely get above 20 to 30 MB/sec and it will vary from second to second.
Also FWIW, variability between different iperf3 streams when running them in parallel is not unexpected, particularly when there is a network bottleneck of some kind.
Not necessarily. Egressing and going through the internet and all of its magical pipes is its own separate thing. I wanted you to do that to test for bufferbloat, not for absolute ping times.
Depending on where the bottleneck is, that simple test can reveal a bottleneck that manifests under congestion: as load increases, ping times can grow much higher. You don’t have this problem.
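To be concrete, the bufferbloat check is just watching latency while the link is saturated (hostname is a placeholder):

```
# Terminal 1: saturate the link for 60 seconds
iperf3 -c backup.example.com -t 60

# Terminal 2: watch RTT at the same time; if ping times climb sharply
# under load compared to an idle link, the path is buffering excessively
ping backup.example.com
```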
3.3X faster now after tweaking lots of TCP networking parameters on Linux; no other change to anything else. I guess there is a lot of room for improvement in this area.
You’re not wrong, but you really are just solving a software bottleneck that’s being held up by single-core performance. If you don’t mind my asking, which kernel parameters did you tune in the TCP stack?
I will have to go back and look; I just kept reading articles and trying things. But I’m nearly saturating the link now, which shows the Linux kernel defaults have huge room for improvement.
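The usual suspects for a long fat pipe look something like this (illustrative values only; I’d have to dig up my exact list):

```
# /etc/sysctl.d/99-tcp-tuning.conf -- typical long-fat-pipe tunables
# (illustrative values; not confirmed to be the exact set changed here)
net.core.rmem_max = 67108864
net.core.wmem_max = 67108864
net.ipv4.tcp_rmem = 4096 87380 67108864
net.ipv4.tcp_wmem = 4096 65536 67108864
net.ipv4.tcp_congestion_control = bbr

# Apply with:
#   sysctl --system
```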