In a previous thread, unrelated to this topic, someone mentioned that I shouldn’t really use copy offsite. However, I noticed that the user guide recommends using copy offsite instead of having separate jobs for backing up to each destination. What is the right answer here? My big concern with copy offsite is that if something corrupts the local backup, it will corrupt my offsite as well unless I can catch it before the copy job.
Copy doesn’t work like this. It only copies new chunks that have appeared since the last copy. It won’t delete chunks (hence you need to run synced prunes separately if you want that), nor will it re-copy existing chunks, even if they got corrupted on the source. The only way corruption will propagate is if your chunks get corrupted before the copy happens.
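To make the behaviour above concrete, here is a minimal Python sketch. It is a toy model, not Duplicacy’s actual code: storages are plain dicts keyed by chunk ID, and `copy_new_chunks` is a hypothetical helper invented for illustration.

```python
def copy_new_chunks(source, destination):
    """Toy model of copy: add chunks the destination lacks, touch nothing else."""
    copied = []
    for chunk_id, data in source.items():
        if chunk_id not in destination:
            destination[chunk_id] = data
            copied.append(chunk_id)
    return copied


# Hypothetical storages: chunk ID -> chunk content.
local = {"a1": b"alpha", "b2": b"beta"}
offsite = {}

first_run = copy_new_chunks(local, offsite)   # both chunks are new offsite

# After the first copy, things go wrong (and right) on the local storage.
local["a1"] = b"CORRUPTED"   # an already-copied chunk is damaged locally
del local["b2"]              # an already-copied chunk disappears locally
local["c3"] = b"gamma"       # a genuinely new chunk appears

second_run = copy_new_chunks(local, offsite)  # only "c3" is copied
```

In this model the offsite copies of `a1` and `b2` survive untouched after the second run, because copy neither overwrites nor deletes existing destination chunks. A chunk that was already damaged before its first copy is the case this sketch does not protect against.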
Basically, with two separate backups you get redundancy for the situation above, at the cost of having snapshots that are out of sync. Out of sync means that you may not be able to repair your storages in case of missing/corrupted chunks as these may be subtly different (you’d still be able to restore all files from the healthy storage). Copied storages, however, need to be pruned in sync, as mentioned above, and are susceptible to corruption at the source. You pick your poison.
So if I’m following correctly, something corrupted wouldn’t have any effect on already copied chunks, correct? I’d still be able to access those previously copied snapshots just fine, it would just be the newly copied corrupted ones and I’d just need to repair those on the local and eventually the fixes would get copied?
I am not sure what scenario you’re describing, especially “fixes would get copied”. But now that I think about it, even corruption at the source shouldn’t propagate on copy. AFAIK, copy decrypts and re-encrypts chunks, so presumably it will detect corrupted source chunks and won’t copy them.
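As a rough illustration of why reading and checking each chunk during copy would stop corruption from propagating, here is a toy Python sketch. It assumes a chunk’s ID is the SHA-256 hash of its content; that is a simplification for the example, not a description of Duplicacy’s real chunk format, and `verified_copy` is a hypothetical helper.

```python
import hashlib


def chunk_id(data: bytes) -> str:
    """Assumed for this sketch: a chunk's ID is the SHA-256 of its content."""
    return hashlib.sha256(data).hexdigest()


def verified_copy(source, destination):
    """Copy chunks missing from the destination, skipping any that fail the check."""
    copied, skipped = [], []
    for cid, data in source.items():
        if cid in destination:
            continue                      # already there, never re-copied
        if chunk_id(data) != cid:
            skipped.append(cid)           # content no longer matches its ID
            continue
        destination[cid] = data
        copied.append(cid)
    return copied, skipped


good = b"healthy chunk"
original = b"chunk before bit rot"

# The second chunk keeps its old ID but its stored content has changed.
source = {chunk_id(good): good, chunk_id(original): b"chunk after bit rot"}
destination = {}

copied, skipped = verified_copy(source, destination)
```

Here the damaged chunk ends up in `skipped` and never reaches the destination, while the healthy one is copied as usual.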
Well that’s just it. I don’t really know how copy works under the hood so I’m trying to make sure that I’m not setting myself up for a true mess if something bad happened in the source. It sounds like it wouldn’t really be an issue though from what you’re describing.
Yes, copy works exactly this way.
Is it possible to get my two storages in sync at this point? Probably not since I have similar snapshot IDs at both locations.
I think I’ll need to set up a new storage and copy snapshots from the two current storages. Then, when I’m ready, I’ll copy the new one offsite and keep doing that going forward.
If your two storages weren’t already copy-compatible then yes you’ll have to recreate one of them - making sure to initialise it with the copy-compatible options.
Once you have 2 copy-compatible storages, just make sure they’re pruned with the same parameters on the same day as each other, and they’ll remain in ‘sync’. To do a one-time sync (if they fell out of sync), copying snapshots both ways should get you back to a good state.
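A toy Python model of that advice: copying both ways brings two storages to the same set of revisions, and pruning both with the same rule keeps them identical. Storages here are dicts keyed by (snapshot ID, revision). The helpers and the "keep the newest N revisions" rule are made up for the sketch and are not Duplicacy’s real prune policy. The model also assumes that the same key always refers to the same revision content, which would not hold for two storages that were backed up to independently.

```python
def copy_revisions(src, dst):
    """Copy revisions missing from dst; existing ones are never overwritten."""
    added = []
    for key in sorted(src):
        if key not in dst:
            dst[key] = src[key]
            added.append(key)
    return added


def prune_keep_last(storage, keep_last):
    """Hypothetical rule: keep only the newest `keep_last` revisions per snapshot ID."""
    if keep_last < 1:
        raise ValueError("keep_last must be at least 1")
    by_id = {}
    for snapshot_id, revision in storage:
        by_id.setdefault(snapshot_id, []).append(revision)
    for snapshot_id, revisions in by_id.items():
        for revision in sorted(revisions)[:-keep_last]:
            del storage[(snapshot_id, revision)]


# Two storages that have drifted apart.
local = {("laptop", 1): "rev1", ("laptop", 2): "rev2", ("laptop", 4): "rev4"}
offsite = {("laptop", 1): "rev1", ("laptop", 3): "rev3"}

to_offsite = copy_revisions(local, offsite)   # one-time sync, first direction
to_local = copy_revisions(offsite, local)     # one-time sync, other direction
in_sync_after_copy = local == offsite

# Same prune parameters on both sides keeps them in step.
prune_keep_last(local, 2)
prune_keep_last(offsite, 2)
in_sync_after_prune = local == offsite
```

Running prune with different parameters on each side (say, 2 on one and 3 on the other) would leave the two dicts different again, which is the point of pruning both with the same settings.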
What are the disadvantages of making all storages copy-compatible?
And if I make a new storage copy-compatible with an existing one, are they copy-compatible in both directions?
There’s no disadvantage, really. It just involves extra effort to copy the main settings from one config to another. Once you have two copy-compatible storages, you can use Duplicacy to copy in either direction.
And what’s the benefit of doing a copy over a backup?
Identical version history across all storages. When you make a new backup, you create a snapshot at a different point in time, and some files may have changed in between.
In addition to what @saspus said, you can also copy backup storages without access to the source data. For example, a NAS can replicate the data to the cloud, all while using fewer resources.