Initially seed my offsite storage

I want to create an off-site mirror of my 3TB local home backup using AWS S3 or Wasabi. Unfortunately I have a slow internet connection. I do have all the same files from my backup synced to an AWS server, which naturally has very fast upload speeds. My question: can I create a backup from this AWS instance to serve as the beginnings of an off-site backup, which, once uploaded, I could continue to sync to from home to add any differences?

On a related note, is there any advantage to using a copy job to make the offsite backup rather than any other method of syncing the files?

Yes, you should be able to do this. The first backup from your AWS server should upload most of the chunks, and the second backup from your home will be very fast.
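
For example, something along these lines (an untested sketch; the snapshot ids, bucket and paths are placeholders for your own values):

```
# On the AWS instance, which already holds a copy of the files:
cd /path/to/files
duplicacy init -e aws-seed s3://us-east-1@amazon.com/mybucket/backups
duplicacy backup -stats

# Later, at home. A different snapshot id is fine here - chunks are
# deduplicated across snapshot ids on the same storage, so only the
# differences get uploaded over the slow connection:
cd /path/to/files
duplicacy init -e home s3://us-east-1@amazon.com/mybucket/backups
duplicacy backup -stats
```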

The advantage of copy vs other methods is explained in this guide:

The recommended way is to use the copy command to copy from the default storage to the additional storage (offsite_storage). This way, you’ll always get identical backups on both storage providers.

Of course, you may be able to use third-party tools, such as rsync or rclone, to copy the content of one storage to another (:grey_exclamation: in this case don’t forget to use -bit-identical as explained here). But compared with rsync/rclone, the copy command can be used to copy only a selected set of revisions instead of everything. Moreover, if two storages are set up differently (such as when one is encrypted and the other is not), the copy command is your only choice.
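
In practice that looks something like this (a sketch; the storage names, snapshot id and revision numbers are placeholders):

```
# Copy all snapshots from the default storage to the offsite one:
duplicacy copy -from default -to offsite_storage

# Or copy only a selected set of revisions of one snapshot id:
duplicacy copy -from default -to offsite_storage -id mybackups -r 5 -r 6
```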

Thanks. How do I make the remote backup copy-compatible with the home one? Should I copy the existing config up there?

See the example in this #how-to: Add command details, section “Bit Identical”.
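
Presumably something along these lines (an untested sketch; the storage name, snapshot id and Wasabi URL are placeholders):

```
# Add a second storage that is copy-compatible AND bit-identical
# to the existing "default" storage:
duplicacy add -e -copy default -bit-identical offsite mybackups \
    wasabi://us-east-1@s3.wasabisys.com/mybucket/backups
```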

I would honestly urge people not to use the -bit-identical option when making copy-compatible storages - unless absolutely necessary, i.e. when you’re using an alternative method to copy the storages, such as rclone/rsync.

Otherwise, if you’re using Duplicacy to copy the backups between storages - and this is recommended anyway, since you can then copy a subset of snapshots - just initialise the second storage using the -copy option with the add command (not forgetting to add -encrypt if you want to maintain encryption).
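
That is, something like this (again a sketch with placeholder names; note the absence of -bit-identical):

```
# Copy-compatible, but with its own encryption keys and master password:
duplicacy add -e -copy default offsite mybackups \
    wasabi://us-east-1@s3.wasabisys.com/mybucket/backups

# Then let Duplicacy do the copying:
duplicacy copy -from default -to offsite
```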

While the likelihood is very small, using -bit-identical will slightly weaken your security, in that you are basically re-using encryption keys.

If someone were to learn the master password protecting the config in one storage, they could then bypass the encryption in all copy-compatible storages, regardless of whether you change all your master passwords now or in the future. Without -bit-identical, each storage can have a different set of encryption keys (stored in the config), each protected by a different master password.


I understand your point about security, but I personally prefer to use -bit-identical. It’s a trade-off between the risk of a password leak and the convenience of handling backups with external tools - and the latter is, I hope, far more common than the former.

Also, my backup passwords are so “deep” in my security layers (Windows :nauseated_face:, Veracrypt, KeePass, 2FA, in-memory protection, auto-type obfuscation) that if an attacker has access to my backup passwords (different for each repository) they already have access to my files.

But that’s just my use case, of course.


In this case I’m with @towerbr: convenience > just a lil’ bit of extra security.

Fair enough, but I think it’s important to make users aware of the actual purpose and downsides of -bit-identical, and point them to the more relevant docs (the -copy option) so they can read up on it…

One of the recurring issues with Duplicacy’s design is that, invariably, new users come a cropper when they want to add copy-compatibility to an existing setup and discover they should’ve used add -copy instead of another init on the second storage.

Likewise, once you create a copy-compatible storage with -bit-identical, you can’t really undo that choice. IMO, it’s exactly the same problem as password re-use - nobody can truly predict how it will impact security, but it has a real chance of biting you in the ass. Already, I can imagine various ways a dedicated hacker could abuse that choice, despite otherwise good security practices.

Ideally, it would be nice if one day Duplicacy could copy between any storages without prior preparation. I believe it’d be technically feasible, if a little complicated - even with different chunk sizes, variable or fixed.

Anyway, I don’t particularly think -bit-identical and third-party copying add that much convenience, and they may introduce problems of their own. An added advantage of letting Duplicacy do the copying - in addition to being able to specify a subset of snapshots - is that it actively decrypts and re-encrypts chunks. This effectively validates the integrity of the data, as it would assuredly error out when encountering corruption.


Interesting point, I hadn’t thought of that… copy-as-a-chunk-integrity-check…

“chunk” or “file”? :thinking:

I’ll evaluate some possibilities regarding this … :wink: