SFTP backup speed is slow and gets slower over the course of a backup

Hello. First of all, I want to thank the Duplicacy developers and everyone who helps people on this forum.

I’m testing Duplicacy against about 1 TB of MySQL database files. The first backup took 2-3 hours, which was fine since it’s an initial backup. But incremental backups also take 2-3 hours, while restic, for example, takes 40-50 minutes.

Details: for the first ten minutes files are backed up at 200-210 MB/s, but over the next hour the speed slowly drops to 130-140 MB/s, and I can’t tell where the bottleneck is. It’s not the CPU: there are plenty of cores available, and running the backup with multiple threads gives the same results. It’s not the RAM: Duplicacy uses about 700 MB while there are hundreds of gigabytes free. It’s not the disks: the server has 2 SATA SSDs in RAID-1 that can deliver about 1 GB/s combined. It’s not the network: Duplicacy barely uses it and never even approached 700 Mbps. And it’s not the server where sftp is running either; it also has plenty of spare resources for Duplicacy.

I’m not sure what the root cause is in my case, and I haven’t found anything quite like it on this forum (there are two posts and one issue about similar problems, but those involve B2 and the numbers are quite different).

There’s also this post about SFTP issues, but again with different numbers.

I also found another issue about backup performance; I’ll try those repository settings later.

Can you help me with this? I really want to see Duplicacy shine; something is preventing it, and I don’t understand what exactly.

Duplicacy version: 2.4.0
Storage backend: sftp
Backup threads: 8
Repository settings: encrypted
Environment variables: DUPLICACY_PASSWORD and DUPLICACY_SSH_KEY_FILE
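
For reference, the backup is invoked roughly like this (a sketch; the paths are placeholders):

$ export DUPLICACY_PASSWORD='<storage password>'
$ export DUPLICACY_SSH_KEY_FILE=/path/to/ssh_key     # placeholder path to the SSH key
$ cd /path/to/repository                             # placeholder repository root
$ duplicacy backup -threads 8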

You can try the benchmark command:

I tried it just now. One of the cores was loaded at 99-100%. Here’s the whole output:

$ duplicacy benchmark -upload-threads 8 -download-threads 8 -file-size 2048
Repository set to <redacted>
Storage set to sftp://<redacted>
Generating 2048.00M byte random data in memory
Writing random data to local disk
Wrote 2048.00M bytes in 0.79s: 2601.28M/s
Reading the random data from local disk
Read 2048.00M bytes in 0.28s: 7190.55M/s
Split 2048.00M bytes into 427 chunks without compression/encryption in 9.57s: 214.08M/s
Split 2048.00M bytes into 427 chunks with compression but without encryption in 12.40s: 165.21M/s
Split 2048.00M bytes into 427 chunks with compression and encryption in 12.67s: 161.68M/s
Generating 64 chunks
Uploaded 256.00M bytes in 4.12s: 62.09M/s
Downloaded 256.00M bytes in 2.82s: 90.90M/s
Deleted 64 temporary files from the storage

This shows the sftp upload is the bottleneck:

Uploaded 256.00M bytes in 4.12s: 62.09M/s
Downloaded 256.00M bytes in 2.82s: 90.90M/s

Maybe you can try 4 threads instead?
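
Something like this would show whether fewer sftp connections do better (same benchmark flags as in your output, just with a lower thread count):

$ duplicacy benchmark -upload-threads 4 -download-threads 4 -file-size 2048
$ duplicacy backup -threads 4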

For subsequent backups where there isn’t much data to upload, file processing becomes the bottleneck:

Split 2048.00M bytes into 427 chunks with compression but without encryption in 12.40s: 165.21M/s
Split 2048.00M bytes into 427 chunks with compression and encryption in 12.67s: 161.68M/s

So an overall speed of 130-140 MB/s is in line with these numbers. Restic could be faster because it doesn’t do compression and can use multiple threads to scan files.

But one thing that can definitely help is to use a fixed chunk size as explained in Chunk size details.
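
For example, when initializing a new storage it could look like this (just a sketch: the 16M size is only an illustration, and the snapshot id and storage URL are placeholders; chunk parameters are set at init time, so this requires a fresh storage):

$ duplicacy init -e -chunk-size 16M -max-chunk-size 16M -min-chunk-size 16M <snapshot-id> sftp://<user>@<host>/<path>

Setting the minimum and maximum chunk sizes equal to the average disables variable-size chunking, which should also skip the rolling-hash work when splitting large files.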


I tried creating a few backups in a repository with a fixed chunk size today: roughly the same results. I also tried an S3 (minio) storage instead of sftp: same. With 4 threads the speed is lower than with 16.

I don’t understand why the storage would be the bottleneck; there are enough resources on both servers and enough network bandwidth. But anyway, thank you for your response.