Restore single-threaded?


Is it possible that “restore” is limited to a single download thread and the -threads parameter has no effect?

Here’s why I think this:

I have a repository with a fixed chunk size of 4MB for better deduplication (with dynamic chunk sizes, Duplicacy packs data from additional files into a chunk, so a new chunk is created whenever a file is moved or renamed, which I want to avoid).
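For reference, a fixed chunk size like this is configured at init time by setting the average, minimum, and maximum chunk sizes to the same value; the repository ID and SFTP path below are placeholders, not my actual setup:

```
# Fixed 4MB chunks: average (-c), minimum (-min), and maximum (-max)
# are all set to the same value, so the chunker never varies the size.
duplicacy init -e -c 4M -min 4M -max 4M my-repo-id sftp://user@example.invalid/path/to/storage
```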

The backed-up data includes my user profile, which contains a Thunderbird profile with over 500,000 mostly <1MB EML files (but just 15GB on disk). Most files in my user profile are small — in total, there are 1,067,860 files that are 374,494MB in size, divided into 1,141,977 chunks.

I am using SFTP as the backend.

I am now trying to restore this backup. While the backup with -threads 100 was completed in just under an hour, fully utilizing my 1Gbps uplink (which was not the case without -threads 100), the restoration is taking about 16 hours — the 1Gbps uplink connection is hardly utilized (and yes, the server can deliver 1Gbps as verified with ssh user@example.invalid 'dd if=/dev/urandom bs=1k count=1024000' | pv | dd of=/dev/null bs=1k count=1024000).

Another indicator:
During the upload, the debug output showed that the files were not uploaded sequentially. The log looked like this, for example:

Uploaded chunk 64800
Uploaded chunk 64798
Uploaded chunk 64801
Uploaded chunk 64802
Uploaded chunk 64799
Uploaded chunk 64805
Uploaded chunk 64803
Uploaded chunk 64804
Uploaded chunk 64806

Now, during the restoration, the chunks are being downloaded sequentially:

Downloaded chunk 64800
Downloaded chunk 64801
Downloaded chunk 64802
Downloaded chunk 64803
Downloaded chunk 64804
Downloaded chunk 64805
Downloaded chunk 64806
Downloaded chunk 64807
Downloaded chunk 64808

The repository uses RSA encryption. I am using Duplicacy in version 3.2.3 (254953).

I suspect that the restore operation is limited to a single thread, despite the -threads parameter, which affects the restoration speed significantly. Is that correct?

A couple of things I would try:

Check the changelog for recent versions and see if anything is mentioned about restore threading.

Test with debug mode: run the restore command with debug logging enabled to gather more information about the threading behavior:

duplicacy -d restore -r revision -threads 100

Have you tried testing with much larger chunks to see if the same behavior occurs?

Maybe the bottleneck only shows up with smaller chunks.

Thank you for confirming that the restoration is not parallelized.

Unfortunately, the suggested workaround of manually downloading the data beforehand is not an option for me.

In defense of Duplicacy, I must point out that, for example, Kopia also does not load data in parallel. However, the difference with Kopia is:

With Kopia, the individual chunks are packed into 21MB pack files and are still individually addressable. This means there is no performance loss due to the multitude of small chunks that occur with Duplicacy when using fixed chunk sizes. Additionally, you still benefit from complete deduplication, even when files are copied or moved.

I used a fixed chunk size in Duplicacy because, otherwise, deduplication does not work when copying or moving files smaller than the average chunk size. Duplicacy tries to combine these until the average chunk size is reached. In this regard, Duplicacy behaves similarly to archivers with solid archives — if the sorting changes due to adding, removing, or moving files, a new chunk is created that only matches in that specific combination.

What a pity.

I would not take that recommendation literally. You can mount the remote with e.g. rclone mount and restore from the mount. Data transfer will then be handled by rclone, where you can control concurrency and prefetching separately.
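A minimal sketch of that approach; the rclone remote name, storage path, and mount point are assumptions, and the rclone remote is assumed to have been set up beforehand with rclone config:

```
# Mount the SFTP storage locally with caching and read-ahead, so
# Duplicacy's sequential chunk reads are served from rclone's
# prefetch cache while rclone fetches concurrently in the background.
rclone mount mysftp:path/to/storage /mnt/storage \
    --vfs-cache-mode full \
    --vfs-read-ahead 1G \
    --transfers 16
```

Duplicacy can then be pointed at /mnt/storage as a local storage (for example by adjusting the storage URL in .duplicacy/preferences), and the restore reads through rclone's tunable transfer pipeline instead of Duplicacy's own SFTP client.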

This is expected, regardless of whether chunk sizes are fixed or not.
It's a tradeoff you can control: by adjusting the chunk size, you can choose your own balance between performance and deduplication.