My weekly rsync with 1 Gb interfaces:
When transferring my newly created tar.gz archives (which always change), I see roughly 700 Mb/s.
When transferring my "4TB nfs disk", where not much has changed, it takes about 45 minutes, usually at transfer rates in the kB/s range. Lots of small files (Linux kernel sources etc.). That's because once the initial transfer was done, not much changes and little needs to be transferred, but rsync still has to check whether each file has changed, and that takes time.
What rsync arguments are you using? Picking the right ones can make things go much faster. If you are checksumming the files, it will take as long as it takes the host and the target system to read every file and compute its checksum, which means reading 21 TiB on each side before anything happens.
Are you running rsyncd on the target side or are you doing this over a mounted filesystem?
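For reference, a rough sketch of the difference (paths and host are made up): the default quick check only compares size and modification time, while --checksum reads and hashes every file on both sides.
# Default quick check: only size and mtime are compared, unchanged files are skipped cheaply.
rsync -a /mnt/tank/data/ backuphost::backup/
# --checksum: every file is read and hashed on BOTH ends before rsync decides anything,
# which is why it can take as long as reading the whole dataset.
rsync -a --checksum /mnt/tank/data/ backuphost::backup/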
Here is a script I use to sync my music collection from my Mac to my NAS. In this case the Mac is the source of truth, so I just want the NAS to match. I am running the rsyncd app on the TrueNAS side. Also note that if you remove "--progress --stats" it runs about 15% faster. File-list generation time on 96K files is 0.053 seconds.
Please do not run this blind. You can run it with "./script.sh test" and it will do a dry run and tell you what it will do.
#!/bin/bash
# PUT: push the local music library to the NAS (the NAS is made to match the Mac).
SRC=/Volumes/data1/Media/Music/
DST=rsync://barrel:30026/music
EXCLUDE_FILE=rsync-exclude

# "./script.sh test" does a dry run instead of a real transfer.
if [ "$1" == "test" ]; then
    RSYNC_START="/opt/homebrew/bin/rsync --dry-run"
else
    RSYNC_START="/opt/homebrew/bin/rsync"
fi

if [ ! -d "$SRC" ]; then
    echo "Local directory is not there. Please fix and try again"
else
    $RSYNC_START \
        --iconv=utf-8-mac,utf-8 \
        --force \
        --size-only \
        --no-perms \
        --no-owner \
        --no-group \
        --omit-dir-times \
        --delete \
        --progress \
        --stats \
        --recursive \
        --exclude-from="$EXCLUDE_FILE" \
        "$SRC" \
        "$DST"
fi
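For completeness, the rsync-exclude file referenced above is just a list of patterns, one per line; something like this (the patterns are only examples):
# rsync-exclude: one pattern per line, relative to the source directory
.DS_Store
._*
*.tmp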
I googled to see if there was anything I could do to make it run faster and couldn't find anything, e.g. forcing parallelisation, similar to downloading a file via multiple connections.
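(Roughly the kind of thing I was imagining, with made-up paths: one rsync per top-level directory, several running at once. It presumably only helps when per-file latency rather than bandwidth is the bottleneck.)
# Run up to 4 rsync processes in parallel, one per top-level directory.
cd /mnt/tank/data || exit 1
printf '%s\0' */ | xargs -0 -P4 -I{} rsync -a {} rsync://backuphost/backup/{}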
Irrespective of the size of the files, and whether anything has changed, rsync needs to check each and every file. One hour for 2.1M files? Not too bad…
You may speed it up with a special vdev or a metadata-only L2ARC, so that rsync has faster access to all the metadata it needs while crawling the tree.
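Rough sketch with hypothetical pool and device names:
# Special vdev for metadata (add it as a mirror: losing it loses the pool).
# Note it only holds metadata written after the vdev is added.
zpool add tank special mirror /dev/ada4 /dev/ada5
# Or an L2ARC device restricted to metadata:
zpool add tank cache /dev/nvd0
zfs set secondarycache=metadata tank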
But moving the backup to ZFS and using replication will be faster.
Rsync needs to pull all the file metadata from the drives, in order to compare the directory listing to the target.
At minimum, it needs to know “does this file exist?” In order to do that, it needs to crawl through the entire directory tree.
Most rsync setups need more than that, though. They also need to compare any difference in sizes and modification timestamps.
No matter what, the entire directory tree needs to be crawled and read from the storage drives.
The only way to speed up this process is to bypass the drives and do this all in RAM. How can this all be done in RAM? By keeping the relevant metadata in the ARC.
How to keep all this metadata in the ARC? By having enough total RAM to safely house it, and adjusting a ZFS parameter to mitigate “pressure” against metadata in the ARC.
In summary: You’ll need to increase RAM and adjust the ZFS parameter to greatly favor metadata over data in the ARC.
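Rough sketch (Linux/SCALE paths; the exact tunable name depends on your OpenZFS version, so treat this as an assumption to verify):
# Recent OpenZFS exposes zfs_arc_meta_balance (higher values favor metadata in the ARC);
# older releases used zfs_arc_meta_limit / zfs_arc_meta_limit_percent instead.
echo 2000 > /sys/module/zfs/parameters/zfs_arc_meta_balance
# See which metadata-related counters your version reports:
grep meta /proc/spl/kstat/zfs/arcstats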
As @awalkerix said: ZFS replication is way better. It already knows what to transfer, without having to crawl an entire directory tree as rsync does.
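Minimal example with made-up dataset, snapshot and host names; an incremental send transfers only the blocks that changed between the two snapshots:
zfs snapshot tank/data@week24
zfs send -i tank/data@week23 tank/data@week24 | ssh backuphost zfs receive -F backup/data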