Understanding Cache Size

Hey folks,

I’m just wondering about the size of the cache relative to the backup. I have a device that is backed up; if I add that storage to another device and restore a single file of about 150MB, the cache ends up at 1.4GB for a 70GB repository with only 1 revision.

I have another backup of ~2TB with 47 revisions, and the cache on that is only just hitting 1.4GB.

Any reason why the Duplicacy cache might be so much larger, proportionally? I would like to switch to using Duplicacy for everything, but if the cache grows at this rate, it will unfortunately be too large for me to use :frowning:

Only metadata chunks are stored in the cache, and the number of metadata chunks is proportional to the number of files in a backup, not to the total file size.

Moreover, if you run only backup jobs from a repository, the cache stores only the metadata chunks of the latest backup. But if you also run prune jobs, then metadata chunks from all existing backups will remain in the cache.

Okay, so in this instance the Duplicacy repository contains 485,496 files, whereas the Borg repo contains only 200,105. So, without knowing how the Borg repo design handles metadata/caching, it would seem Duplicacy is perhaps better at this than Borg?
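As a quick sanity check on that comparison, here is a back-of-envelope calculation in Python using the file counts and cache sizes reported above. Note the assumptions: both "1.4GB" figures are taken at face value as 1.4×10⁹ bytes, and the real per-file metadata overhead will vary with path lengths, file attributes, and so on, so this is only a rough estimate:

```python
# Rough per-file cache overhead, using the numbers from this thread.
# Assumption: "1.4GB" means 1.4e9 bytes for both caches.
CACHE_BYTES = 1.4e9

duplicacy_files = 485_496  # files in the Duplicacy repository
borg_files = 200_105       # files in the Borg repository

duplicacy_per_file_kb = CACHE_BYTES / duplicacy_files / 1000
borg_per_file_kb = CACHE_BYTES / borg_files / 1000

print(f"Duplicacy: ~{duplicacy_per_file_kb:.1f} KB of cache per file")
print(f"Borg:      ~{borg_per_file_kb:.1f} KB of cache per file")

# If the per-file overhead stays roughly constant, a crude estimate
# for a hypothetical repository with a given file count would be:
def estimate_cache_bytes(num_files, per_file=CACHE_BYTES / duplicacy_files):
    return num_files * per_file
```

On these numbers, Duplicacy works out to roughly 2.9 KB of cache per file versus Borg's roughly 7 KB per file, which supports the impression that Duplicacy's metadata caching is the more compact of the two here.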

Follow-up question: is the metadata chunking a fixed size per file? I.e. can I calculate the cache size from the number of files, or is there some variability in this?