I can only speak from my experience on what I think is a relatively small backup set, around 3TB, but read/write has been reasonable and restores responsive. This is check -chunks on an ARM Odroid HC2 w/2GB RAM running lighttpd+mod-webdav. (Moving average so some smoothing.)
The software developer is. Any choice or option passed to the user is one the developer did not/could not/would not make. Shall we let users pick their encryption codec and scheme? Chunk sizing? Compression codec? Datastore layout? Snapshot format? What else? Have them write their own backup software? The target backend is just another part of the “backup solution”; its choice must be made by the vendor, based on intricate knowledge and profiling of the engine design.
Easy. Another positive experience means nothing. One negative — disqualifies the whole thing. The solution shall work for everyone, not just you.
Not at all. I’m against using GoogleDrive (OneDrive, DropBox) etc as S3 replacements, in the context of this discussion, yes.
I’m also saying that all Google’s free user-facing services (photos, mail, docs) are parasites sucking up your data to benefit their ad services revenue, and you should not use them.
I have nothing against Google Workspace, GCS, Domains, and multiple other excellent services and products that came out of Google. Including Go.
Let’s not lump everything into the same bucket.
Are you saying downloading entire backup history periodically is normal? Why? You don’t trust your storage provider? Why do you keep using it?
And you can totally download 100GB/month from Glacier absolutely free, if you really need to. But I fail to see the reason why. Do you also second-guess your RAM and CPU correctness? I trust AWS and Google. I don’t trust Duplicacy with prune; hence, check (with no arguments) is sufficient. Egressing data is misguided.
On one hand, are you expecting every user to do that testing for themselves? It’s a waste of everyone’s time. On the other, you don’t need to taste spoiled food to know it’s toxic.
You can deduce from the design goals whether it will be a good match. The fact that it works short term is irrelevant and inconclusive.
That’s because you were given that choice and you are reluctant to let it go. There are a lot more design choices that were made without consulting you, and you are perfectly content with accepting them as is.
A suitable storage backend should have been just another silent choice made as part of the design.
Correct. It’s the wrong solution. It can be wrong and still work for a while at the same time; there is no contradiction.
While walking back I thought of another analogy that might help explain why egressing from the data provider for “checking” is unsubstantiated and ill-advised.
Imagine you back up to a local NAS. The NAS is a ZFS array with monthly scrubbing. Scrubbing checksummed, redundant storage guarantees data correctness right after the scrub. Would you also run duplicacy check -chunks on top?
If yes, please explain why?
If no - then you should not egress from commercial cloud storage either, for the exact same reason. You pay them to keep your data consistent.
As I see it, broadly there are 2 opportunities for error: transport and storage.
Even with local storage, transport is the more complex part, so there’s a greater risk of error there, which check will detect. Once the data is transported, you rely on the storage (ZFS in your example) to preserve integrity.
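To make the transport-versus-storage distinction concrete, here is a minimal sketch of the idea behind verifying a chunk after it has been transported. It is an illustration only, not Duplicacy's actual check implementation: the function names and the use of plain SHA-256 over the raw bytes are assumptions made for the example.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Return the hex SHA-256 digest of a byte string."""
    return hashlib.sha256(data).hexdigest()


def verify_chunk(expected_hash: str, stored_bytes: bytes) -> bool:
    """Re-hash what the storage returned and compare it with the hash
    recorded before upload (hypothetical helper, for illustration)."""
    return sha256_hex(stored_bytes) == expected_hash


# Hash recorded on the client before the chunk is sent.
original = b"example chunk contents"
expected = sha256_hex(original)

# An intact round trip verifies; a truncated transfer does not.
truncated = original[:-1]
print(verify_chunk(expected, original))
print(verify_chunk(expected, truncated))
```

A check of this kind catches a bad transfer at the moment it happens; whether it needs repeating afterwards is the storage-integrity question discussed in this thread.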
No system’s perfect, so you mitigate risk. Where the risk is highest, and the “cost” of addressing it, determine priorities and will be system-specific. There are best practices but no universal right answer.
Getting back to the OP, were you able to connect after manually editing the settings file?
Right, every part of the process needs to provide guarantees. Perhaps unencrypted FTP is a bad idea, but SFTP, or S3 provide those guarantees: transmission either succeeds, and the data is intact, or fails.
Very well said. The probability of AWS (or Google Cloud, or insert your favorite hyperscaler) losing data due to either flawed transport or storage is non-zero. But it’s small enough that “full egress always because you can’t trust them” is not even remotely a proportional response.
OneDrive had a bug a few years back where they would happily save truncated files. That is another reason to avoid an extra layer of complexity on top of bare storage.
B2 had a bug a few years back where the API would return bad data. This is another reason to avoid small vendors like B2, iDisk, pCloud, etc., who compete on price only.
In other words, as you said, you want to optimize cost and risk.
One way is to use shoddy cheap storage and egress every day, hoping that data will survive until the next check. The other way is to use storage that is inherently more reliable, if nothing else due to the sheer number of customers and volume of data stored, and the associated test coverage; and to rely on a combination of guarantees afforded by the protocols and their competent implementation, not on hopes. And the alignment of incentives just adds polish and performance.
From the very beginning, Duplicacy has ticked most of the boxes for most users - a choice between local and cloud storage, or combination of both - and succeeds, precisely because it offers users flexibility. This forum, for example, wouldn’t exist otherwise. Numerous features have already been implemented because users asked for them. Duplicacy is a better product.
By your twisted logic, support for archival storage should never happen.
This is arrogant and delusional. Your failure to get GCD working, while plenty of others are able, is no reason to strip the feature you no longer use, and others do.
Same goes for WebDAV. Perfectly suitable protocol for use over WireGuard. You want the developer to yank it due to your bad personal experience, for what… to ‘protect’ users from themselves? LOL
What’s ‘free’ got to do with anything? It’s not free if we pay a bloody subscription!
Yes, it is normal - as part of a proper backup strategy.
You test it because nobody should ever simply ‘trust’ storage of any kind. Regardless of tool, regardless of storage type or reputation. You ‘trust but verify’, and employ well-proven strategies, such as 3-2-1, to mitigate against bad assumptions.
Yes. Not just -chunks, but occasional full restores too.
The storage medium isn’t the only point of failure. Software bugs are another, with ‘non-zero’ chance. If your policy is to only ‘trust’ but never verify the software, either, then more fool you.
These are old arguments and I’m not here to convince you. Everyone else here hopefully understands 3-2-1, and verifying backups, is best practice despite @saspus saying the opposite.
Thank you guys for supporting me about this topic.
I’m not good at English, so I decided not to reply after saspus said several times that it’s not recommended and asked for more details days ago. Apparently he cannot help me solve the problem.
Using SFTP means I need to enable SSH and share the root password/key somewhere else, or create a new user and configure permissions and so on. It’s not safe or convenient for a newbie like me.
With dufs I can serve WebDAV in one command, without worrying about other things like access control, just pure file sharing.
I do know I can change the URL in the JSON file to webdav-http to achieve this; I saw this solution in another topic.
My problem is that, to generate a ‘dummy config’, I need a WebDAV server with SSL that shares the same password (since the storage is encrypted), and then I have to change its URL in the config file.
Unfortunately I can’t add this ‘dummy config’ in the web edition without a working WebDAV server with SSL. When adding a storage, it needs to connect and write some files.
I would have to enable SSL for WebDAV temporarily and disable it afterwards. I don’t think that’s a ‘solution’.
First you init (or add) the storage from the command line with http.
Then you manually add/edit the storage block in the json file.
None of this is done with the web interface.
Duplicacy supports http WebDAV but (for some reason) the web interface enforces an unnecessary https requirement. The goal is to set it up with the command line and manual editing, bypassing the web interface.
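The manual edit can also be scripted. The following is a rough sketch only: it assumes the storage URL is stored in the JSON as a plain string beginning with `webdav://`, and that `webdav-http://` is the scheme to switch to, as mentioned earlier in this thread. The sample structure and the `switch_webdav_to_http` helper are hypothetical, so inspect your own settings file and back it up before changing anything.

```python
import json


def switch_webdav_to_http(node):
    """Recursively replace any string value that starts with webdav://
    with the same URL using the webdav-http:// scheme.

    Walks the whole JSON tree so it makes no assumption about which
    key holds the storage URL.
    """
    if isinstance(node, dict):
        return {key: switch_webdav_to_http(value) for key, value in node.items()}
    if isinstance(node, list):
        return [switch_webdav_to_http(value) for value in node]
    if isinstance(node, str) and node.startswith("webdav://"):
        return "webdav-http://" + node[len("webdav://"):]
    return node


# Hypothetical sample; the real settings file may be laid out differently.
sample_text = """
{
  "storages": [
    {"name": "nas", "url": "webdav://user@192.168.1.10:5000/backup"}
  ]
}
"""

config = json.loads(sample_text)
updated = switch_webdav_to_http(config)
print(json.dumps(updated, indent=2))
```

To apply it to a real file, load the file with `json.load`, pass the result through the function, and write it back with `json.dump`, ideally while Duplicacy is not running.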
Your storage is encrypted? Mine isn’t, maybe that’s the difference.
I believe (someone correct me if I’m wrong) that Duplicacy can be used entirely from the command line. If so, encryption shouldn’t prevent you from following the steps in my post above.