Understanding repository implementation

After reading the benchmarking results, I am considering switching from CrashPlan (and my trial of Restic) to Duplicacy. The single most important feature I require is sets. I have many tiers of data, each with a priority, where the highest-priority tier needs to complete its upload to B2 before any other tier can start. If tier 1 takes longer to upload than the expected window in which another tier would have started, that secondary tier loses its opportunity to upload.

So, as far as I understand it, I can emulate sets with repositories, where each repo represents a set.

It seems like far more work to maintain file system objects (a repo with symlinks to the paths of interest). How does this scale in practice for users with several repos? I would normally be used to maintaining configuration files (or, in the case of CrashPlan, an abysmal UI).
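For concreteness, here is roughly what I imagine such a repo would look like, a directory containing only first-level symlinks to the real data (all paths and names below are made up for illustration; the duplicacy commands at the end are commented out since they need a real B2 bucket):

```shell
#!/bin/sh
# Sketch: one repository directory per "set", containing only symlinks
# to the real data. Uses a temp dir as a stand-in for e.g. /backup/tier1.
set -e

root=$(mktemp -d)
mkdir -p "$root/data/db" "$root/data/mail"   # stand-ins for client data

# The "tier 1" repo is just a directory of links to the paths of interest.
mkdir "$root/tier1"
cd "$root/tier1"
ln -s "$root/data/db"   db
ln -s "$root/data/mail" mail

ls -l    # shows the two symlinks

# A real setup would then initialize and back up against B2
# (storage and bucket names are placeholders):
#   duplicacy init tier1 b2://my-bucket
#   duplicacy backup -stats
```

Maintaining a set would then just mean adding or removing a symlink in the repo directory, which seems comparable in effort to editing a configuration file.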

Lastly, the cache concerns me: the single host I plan to run Duplicacy on is a staging host, which is itself already a copy of all the data from each client.

As I have only just begun to read up on Duplicacy and use it in a test environment, I am curious about the opinions of more experienced users on these aspects.

> I can emulate sets with repositories where each repo indicates a set.

Yes, that is right. To prioritize repositories, I think you could write a script so that, when a backup for a top-priority repository runs at its scheduled time, it checks whether any Duplicacy processes from lower-priority repositories are still running, and kills them if there are.
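A minimal sketch of that idea, assuming the process is named `duplicacy` and using `pgrep`/`pkill` (the repo path is a placeholder, and the actual backup command is commented out; wire this into cron or whatever scheduler you use):

```shell
#!/bin/sh
# Hypothetical sketch: the tier-1 schedule entry stops any lower-priority
# duplicacy backups before starting its own.

stop_lower_priority() {
    # Kill still-running processes with the given exact name, if any.
    if pgrep -x "$1" >/dev/null 2>&1; then
        pkill -x "$1"
        sleep 5   # give them a moment to exit before tier 1 starts
    fi
}

stop_lower_priority duplicacy
# cd /path/to/tier1-repo && duplicacy backup -stats
```

Note that killing a running backup mid-upload is safe in the sense that the next run simply resumes from the chunks already uploaded, but you may want a gentler signal first if you prefer a clean shutdown.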

> the cache concerns me

What cache are you talking about here?

I was talking about the .cache folder, but I have since realized it is relatively insignificant.

Thanks for the guidance.