Not everybody wants to put all their eggs in one basket (cloud-only), and most will wisely choose a straightforward, proper 3-2-1 strategy that includes 2 or more local copies situated on finite disk space. 20%, or however much it is for each person, isn’t a trivial amount, especially when multiplied by the number of backup copies.
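To put rough numbers on that multiplication, here is a minimal sketch. The function name, the 4 TB figure and the 3 copies are made up for illustration; only the 20% comes from the discussion, and your own fraction will differ.

```python
def extra_space_tb(data_tb, extra_fraction, copies):
    """Extra space used across all backup copies, in TB.

    data_tb        -- size of one backup copy (hypothetical figure)
    extra_fraction -- share of that copy taken up by data you could
                      otherwise prune (0.20 = the 20% mentioned above)
    copies         -- number of backup copies you keep
    """
    return data_tb * extra_fraction * copies


# Hypothetical example: 4 TB per copy, 20% extra, 3 copies.
total = extra_space_tb(4, 0.20, 3)
print(f"{total:.1f} TB of extra space across all copies")
```

The point is only that the overhead scales linearly with every additional copy, so a percentage that looks small on one storage adds up on local disks of fixed size.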
As I pointed out, all anyone has to do is run a check and see for themselves a good approximation of how much of their data is stored for any snapshot. In my experience, it always has to be pruned at some point, or the wallet has to be regularly taken out, just to keep expanding. That’s not an option for most people.
I manage dozens of systems with varying backup technologies - some with de-duplication (e.g. Veeam Agent, or Win Server Essentials client backup, the latter of which actually has very high de-duplication down to the sector level) - and when I look at the numbers, they all have to be pruned to some degree, or waste enormous amounts of disk space. This doesn’t have anything to do with de-duplication efficiency - it’s simply data growth, and the fact these, and Duplicacy, use snapshots in time.
If you do F.A with your data, it won’t grow, and neither will your backups. If you actually do stuff, it’ll grow regardless - proportional to how much you use your data, as will the backups. So sure, if you want to turn your backup plan into an infinite archive, knock yourself out. Most people won’t, and they can check for themselves and decide on their own requirements. Backup methodologies have always had retention options (daily, weekly, monthly) for this very reason.
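For anyone unfamiliar with what daily/weekly/monthly retention does to the snapshot count, here is a rough sketch. It is not how Duplicacy or any other tool implements pruning; the function name `thin`, the 7-day and 30-day cut-offs, and the year of daily snapshots are all assumptions chosen for illustration.

```python
from datetime import date, timedelta


def thin(snapshots, today, daily_days=7, weekly_days=30):
    """Return the snapshot dates kept under a simple retention policy.

    Assumed policy (illustrative only):
      - up to `daily_days` old: keep one per day
      - up to `weekly_days` old: keep one per ISO week
      - older than that: keep one per calendar month
    The newest snapshot in each bucket is the one kept.
    """
    kept = []
    seen = set()
    for snap in sorted(snapshots, reverse=True):  # newest first
        age = (today - snap).days
        if age <= daily_days:
            bucket = ("day", snap.toordinal())
        elif age <= weekly_days:
            iso = snap.isocalendar()
            bucket = ("week", iso[0], iso[1])
        else:
            bucket = ("month", snap.year, snap.month)
        if bucket not in seen:
            seen.add(bucket)
            kept.append(snap)
    return sorted(kept)


# Hypothetical history: one snapshot per day for a year.
today = date(2024, 6, 30)
snapshots = [today - timedelta(days=n) for n in range(365)]
kept = thin(snapshots, today)
print(len(snapshots), "snapshots before,", len(kept), "after thinning")
```

Recent history stays dense and older history gets sparse, which is what keeps the storage from growing without bound while still leaving a long tail to restore from.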
As for archival storage, IMHO that’s mostly folly that won’t save on cost for anyone seriously using it for backup purposes. If it were an order of magnitude cheaper instead of merely “4x”, it might make sense. Shipping only older snapshots off for long-term archive might make sense too. But for backups, where conducting test restores should be part of the backup plan, that’s just not feasible with archival tier storage, where the cost of restoring is an order of magnitude steeper. Unless you want to make assumptions about equating general cloud reliability with the ability of the client to reliably implement its backup logic… backups shall be tested.
I’m all for Duplicacy supporting separated chunk metadata (since it can be useful to segment that on local storage too), but not wasting time and effort beyond that just so a handful of people can save not very much at all. I would much rather development effort be put into reliability: snapshot fossils, server-side checksum verification on all backends that support it (which would actually help with archival tier storage), better detection and recovery from missing/truncated chunks etc., and yes, separated chunk metadata.