Preserve Duplicacy cache across restore runs

Hi there.

I notice that Duplicacy creates a cache of file chunks and snapshots in .duplicacy/cache while it runs a restore operation. Presumably it fetches the data there before reassembling the files?

The cache doesn’t seem to persist: after the restore operation completes, only a handful of files remain, amounting to a few hundred kilobytes. I was wondering whether there is a way to keep all the downloaded chunks, so that any subsequent restore could fetch chunks locally if they are present?

As an example where this could be useful, imagine backing up a large binary file (e.g. 1GB) which changes regularly with small deltas (say 1MB). Now if you wanted to restore different versions of that file, you’d have to redownload the whole 1GB file, but if the chunks had been cached, then you might only need to download one or two chunks when restoring a different version.

Duplicacy already does incremental restore. When the file to be restored exists locally, Duplicacy will split the existing file into chunks first and then compare them with those to be restored. Only changed chunks will be downloaded.

The file available locally is the best cache.
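To make the idea concrete, here is a minimal sketch of that comparison. It is not Duplicacy's actual code: the fixed `CHUNK_SIZE`, the SHA-256 digest, and the function names are all assumptions made for illustration (Duplicacy's real chunk boundaries and hashing differ).

```python
import hashlib

CHUNK_SIZE = 4  # hypothetical, tiny fixed size so the example is easy to follow


def split(data: bytes, size: int = CHUNK_SIZE) -> list[bytes]:
    """Split data into consecutive fixed-size chunks (illustration only)."""
    return [data[i:i + size] for i in range(0, len(data), size)]


def digest(chunk: bytes) -> str:
    """Identify a chunk by the hash of its content."""
    return hashlib.sha256(chunk).hexdigest()


def chunks_to_download(local: bytes, target_hashes: list[str]) -> list[int]:
    """Return indexes of target chunks that the local file does not already hold
    at the same position."""
    local_hashes = [digest(c) for c in split(local)]
    needed = []
    for i, wanted in enumerate(target_hashes):
        if i >= len(local_hashes) or local_hashes[i] != wanted:
            needed.append(i)
    return needed


old = b"AAAABBBBCCCCDDDD"   # file currently on disk
new = b"AAAABBXBCCCCDDDD"   # revision to restore: one byte replaced in place
target = [digest(c) for c in split(new)]

print(chunks_to_download(old, target))  # [1]
```

With an in-place change, only the chunk containing the changed byte differs, so only that one would need to be fetched.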

Thanks, cool. I didn’t realise that it did that, that’s great. 😀

I tried it with a simple one byte replacement and it worked, downloading just one chunk.

However, I then tried inserting a single byte at the start of the file. The backup worked perfectly, uploading a single chunk:

Backup for E:\duplicacy\largefiletest\source at revision 7 completed
Files: 1 total, 341,603K bytes; 1 new, 341,603K bytes
File chunks: 81 total, 341,603K bytes; 1 new, 16,384K bytes, 16,446K bytes uploaded
Metadata chunks: 3 total, 6K bytes; 2 new, 5K bytes, 5K bytes uploaded
All chunks: 84 total, 341,609K bytes; 3 new, 16,389K bytes, 16,451K bytes uploaded
Total running time: 00:00:02

Restoring, however, re-downloads all the chunks (I had revision 6 restored locally and then restored revision 7):

Restoring E:\duplicacy\largefiletest\restore to revision 7
Downloaded chunk 1 size 16777216, 16.00MB/s 00:00:20 4.7%
....
Downloaded chunk 81 size 681016, 166.80MB/s 00:00:01 100.0%
Downloaded weapons.pak (349802215)
Restored E:\duplicacy\largefiletest\restore to revision 7
Files: 1 total, 333.60M bytes
Downloaded 1 file, 333.60M bytes, 81 chunks
Total running time: 00:00:02

This is what made me think it might be useful to have an option on the restore command to cache the file chunks.

That is right. During restore, Duplicacy can’t reuse chunks that have moved from one location in the file to another. Handling such chunks would add a great deal of complexity to the implementation.
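A rough sketch of why a one-byte insertion defeats a position-based comparison follows. It is an illustration under assumptions, not Duplicacy's implementation: the chunk contents, the SHA-256 digest, and both function names are hypothetical, and revision 7's chunk boundaries are assumed to line up with revision 6's again after the first chunk, which is consistent with the backup log above reporting only 1 new file chunk.

```python
import hashlib


def digest(chunk: bytes) -> str:
    """Identify a chunk by the hash of its content."""
    return hashlib.sha256(chunk).hexdigest()


def positional_misses(local: bytes, target_chunks: list[bytes]) -> int:
    """Count target chunks whose bytes are NOT found at the same offset locally."""
    misses, offset = 0, 0
    for chunk in target_chunks:
        if digest(local[offset:offset + len(chunk)]) != digest(chunk):
            misses += 1
        offset += len(chunk)
    return misses


def indexed_misses(local_chunks: list[bytes], target_chunks: list[bytes]) -> int:
    """Count target chunks that are absent locally at ANY position (hash lookup)."""
    known = {digest(c) for c in local_chunks}
    return sum(1 for c in target_chunks if digest(c) not in known)


rev6 = [b"AAAA", b"BBBB", b"CCCC", b"DDDD"]    # chunks of the file on disk
rev7 = [b"xAAAA", b"BBBB", b"CCCC", b"DDDD"]   # one byte inserted at the start
local_file = b"".join(rev6)

print(positional_misses(local_file, rev7))  # 4
print(indexed_misses(rev6, rev7))           # 1
```

Every later chunk is unchanged in content but sits one byte further along, so an offset-by-offset comparison sees all of them as different. The hash lookup shows what reuse could save in principle, but it leaves out the hard part: when restoring in place, a chunk would have to be read from its old offset before that region of the file is overwritten.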

Still, I’m not sure if leaving file chunks in the local cache is a good idea. Currently the cache is used by metadata chunks only, mostly for fast file listing. There are already a number of complaints about large cache sizes, and adding file chunks will only make it worse.
