I have been testing Duplicacy for almost exactly 1 month now (I had to buy a license yesterday), using the Web UI.
I am using it to back up around 10 TB of data from my HDDs to Google Cloud Storage with the “Archive” class. I’d think that’s the ideal class for backup storage: very cheap for storing data, and only expensive when you actually retrieve anything, which ideally you never plan to do.
Google Cloud Storage Archive costs $1.20 USD per TB per month, so my 10 TB would be around $12 USD per month. Additionally, every “operation” on the cloud storage costs $0.50 USD per 10,000 operations (an operation is an action that makes changes to, or retrieves information about, buckets and objects in Cloud Storage).
But Duplicacy seems to cause far too many individual file operations, which are very expensive. Duplicacy also generates a surprisingly large amount of download traffic from the storage, which is also very expensive ($160 USD per TB).
Before using Duplicacy, I was using Arq 6 with the exact same storage backend. Arq caused far fewer operations and no downloads at all, so Arq was very cheap on the storage backend.
Here’s some data:
Duplicacy has only finished around one third of the initial backup so far, around 3.3 TB.
In just this month (December), during which I have only used Duplicacy, there have been 1.4 million “operations” on the cloud storage, costing $72.20 USD. Additionally, Duplicacy seems to have downloaded a total of 282 GB from the storage, which cost $50 USD. On top of that, around $5 USD for storage itself, which is fine and cheap.
But the 1.4 million operations, combined with the 282 GB of downloads, cost a total of $122 USD! There is no way I can afford this going forward, and this is just 1 month.
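As a sanity check, here is the back-of-envelope math using the prices quoted above (treating 282 GB as roughly 0.282 TB). The results land slightly below the billed amounts, which I assume is down to different per-operation classes and retrieval fees on the Archive class:

```shell
# Rough check of the December numbers, using the prices quoted above.
# Billed amounts were $72.20 and $50; the gap is presumably operation
# classes and retrieval fees not captured by these two flat rates.
awk 'BEGIN {
  ops_cost = 1400000 / 10000 * 0.50   # $0.50 per 10,000 operations -> $70.00
  dl_cost  = 282 / 1000 * 160         # $160 per TB downloaded      -> $45.12
  printf "operations: $%.2f  download: $%.2f\n", ops_cost, dl_cost
}'
```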
In comparison, here are the stats from the months before, when I used Arq 6. Also consider that my Duplicacy backup has only uploaded around a third of what my Arq 6 backup uploaded, and the Arq backup fully finished in that time!
From May through November, Arq 6 caused 423,000 “operations”, costing a total of $22 USD. Additionally, Arq 6 used a total of 0.49 GB of download, costing $0.02 USD.
I did start a new backup with Arq 6 in that timeframe, same as I did with Duplicacy now, so it’s a very fair comparison between the two. I have never tried to restore any data, neither with Arq 6 nor with Duplicacy. Arq 6 cost me $22 USD in file operations over the course of 6 months, while Duplicacy has cost me $122 USD within just 1 month.
Duplicacy is obviously completely unsustainable in this case. Is there anything that can be done to improve that?
I guess I could try increasing the chunk size? I am currently using the defaults. I do not care about deduplication at all; my data is fairly unique anyway. I just want Duplicacy to cause fewer file operations, and to not download anything unless I actively restore data. Is that possible? What do I have to change?
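If larger chunks are indeed the answer: as far as I understand the CLI documentation, the chunk size can only be set when the storage is initialized, so I would have to re-init and re-upload. Something like the following is my guess (the specific sizes are placeholders, not recommendations, and sizes must be powers of 2; please correct me if the flags are wrong):

```shell
# Hypothetical re-initialization with larger chunks (default average is 4M).
# -c sets the average chunk size; -min/-max bound the variable-size chunker.
# Bucket name and path are placeholders.
duplicacy init -c 64M -min 16M -max 256M mybackup gcs://my-bucket/backups
```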