Backup to multiple locations with different filters

I am currently using fairly aggressive filters to back up only selected parts of a large local drive to offsite storage (Google Drive), using the following preferences file:

[
    {
        "name": "default",
        "id": "lahmaz97_r",
        "repository": "",
        "storage": "gcd://backup/duplicacy/default",
        "encrypted": true,
        "no_backup": false,
        "no_restore": false,
        "no_save_password": false,
        "nobackup_file": "",
        "keys": null,
        "filters": "R:/PortableApps/dload-cloud_mgmt_syncing/Duplicacy/filters/r/offsite-gd/filters",
        "exclude_by_attribute": false
    }
]

What I want to change: Since I am currently only backing up part of this large drive to cloud storage, I want to back up the entire drive locally to another local drive (while still performing the partial backup to cloud storage). This new, local backup will use much less aggressive filters than the cloud storage (essentially, I’ll be backing up the whole drive except for temp/cache files and such).

After doing quite a lot of research on the forum, I think I have a general idea of how to go about this but I still have a couple of concerns that I’m not quite sure how to resolve. I would greatly appreciate any help offered from those of you who are much more knowledgeable than myself in regard to Duplicacy. Thanks!

A Few Notes:

  • I’m using the CLI version of Duplicacy and my setup right now is pretty stock-standard: I have a ‘.duplicacy’ folder in the root of my drive that contains my preferences file, cache, and such. The only deviation from a stock setup is that I store my filters file elsewhere (as you can see in the preferences file at the beginning of the post).

  • The new local storage will back up the same files as the cloud storage plus many more (the local storage will be a superset of the cloud storage).

  • Both backups will be encrypted and use the same parameters (except for using separate filters) and thus should be copy compatible and bit-identical.

  • The cloud storage will have a MUCH more lax pruning policy than my local storage since I have unlimited storage on Google Drive. My local storage is really only intended for keeping a full backup/mirror of my drive locally… not for keeping multiple revisions of it.

  • Regardless of how I ultimately end up accomplishing this, before I perform my 1st backup to local storage, I intend to use the copy command to copy the latest revision of my cloud backup to my local storage, as I assume this will significantly reduce the time the 1st backup to local storage takes (I have 400Mbps download speeds).

  • I’ve read the document at Back up to multiple storages but this article references a situation which is kind of the reverse of mine… In that situation they are starting with a setup which backs up to a local storage device and adding a 2nd storage which backs up to the cloud. Whereas, in my situation, my default storage backs up to the cloud and I am adding a 2nd storage which backs up to a local drive. Additionally, my cloud backup and local backup will not be the same since they’re using different filters.

  • Other relevant posts that were helpful:
    Backup a directory to 2 storage locations with GUI - #29 by Christoph
    Am i doing this right? (2 Storage Locations) - #20 by cyrond

Options:

The reason I only perform a partial backup of my local drive to cloud storage is because of bandwidth limitations (I only have 20Mbps upload speeds). My primary source of confusion is in figuring out whether I need to perform each backup entirely separately (wasting disk/CPU resources doing many of the same things twice) or if I can somehow use the copy command to do things more efficiently. Here are a couple of examples of how I could go about things and the conundrums I run into:

  1. If, for example, I perform my cloud backup 1st and then copy that new revision to my local storage before running the backup command against my local storage, the copy command is going to have to download files over the internet from Google Drive (which will obviously be slow).

  2. However, if instead, I perform my local backup 1st, I can’t copy that new revision to my cloud storage because it will be huge and contain many more files than what I intend to store on my cloud storage.

  3. I suppose I could 1st run a local backup using the cloud-storage-filters-file, copy that new revision to my cloud storage, and then run a 2nd backup to my local storage using the local-storage-filters-file. Although, I’m not really sure if this is more efficient than just backing up to each destination entirely separately, and I also don’t know how this would work in regard to colliding revision numbers and such.

  4. Before performing the 1st backup to local storage, I could just copy the latest revision of my cloud storage backup to my new local storage and for all subsequent backups, I could always just perform each backup entirely separately… For some reason, this just doesn’t feel like the optimal way to go about things…

If option #4 is the recommended way to go about things, here is how I assume I would do it. Is this correct? If it is, would my local backup revision IDs increase accordingly from the last revision copied from my cloud storage (if I copy revision 1016 from cloud storage before beginning local backups, would my next local backup be revision 1017)?

  1. Add an additional storage (local) to my existing repository (with ‘S:/Backup/duplicacy/default’ being where my new local backup will be stored):

    duplicacy add -e -copy default -bit-identical local lahmaz97_r S:/Backup/duplicacy/default

  2. “Set” the filters file for my new local storage:

    duplicacy set -storage local -filters R:/PortableApps/dload-cloud_mgmt_syncing/Duplicacy/filters/r/local/filters

  3. Copy the latest revision from my cloud storage to my new local storage (with ‘1016’ being the latest revision on my cloud storage):

    duplicacy copy -id lahmaz97_r -r 1016 -from default -to local

  4. Backup to my new local storage:

    duplicacy backup -t 1stUniqueBackupToLocal -stats -threads 10 -vss -vss-timeout 400 -storage local

  5. Backup to cloud storage as I usually do:

    duplicacy backup -stats -threads 10 -vss -vss-timeout 400 -storage default

  6. Prune my cloud storage and local storage separately using different retention rules (not going to list all of this since it isn’t really relevant).
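As a sanity check for steps 1 and 2 above, this is roughly the second entry I would expect those two commands to add to the ‘.duplicacy/preferences’ file, alongside the existing “default” entry from the top of the post (field values are inferred from the commands, so the exact contents may differ slightly):

```json
{
    "name": "local",
    "id": "lahmaz97_r",
    "repository": "",
    "storage": "S:/Backup/duplicacy/default",
    "encrypted": true,
    "no_backup": false,
    "no_restore": false,
    "no_save_password": false,
    "nobackup_file": "",
    "keys": null,
    "filters": "R:/PortableApps/dload-cloud_mgmt_syncing/Duplicacy/filters/r/local/filters",
    "exclude_by_attribute": false
}
```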

Sorry for writing such a ridiculous novel of a post… I really didn’t intend to do so starting out. I appreciate anyone who takes the time to read it and offer their opinions. Thanks!

I don’t see the benefit in copying the backup from the cloud.

Everything else seems ok, and it can be summarized as “create two storage locations, two backup tasks, with separate filter files”.

Your concern about “wasting disk/CPU resources doing many of the same things twice” is not really a concern, because as you said:

This new, local backup will use much less aggressive filters than the cloud storage (essentially, I’ll be backing up the whole drive except for temp/cache files and such).

I.e. you will be doubling work only on a small subset of files (the part that also goes to the cloud).

Furthermore, you will only be saving upload bandwidth, which is irrelevant for a local target. Packing and compressing will be the same.

I would recommend running duplicacy under cpulimit to avoid power spikes.


I would recommend running duplicacy under cpulimit to avoid power spikes

Can you expound on this a bit? Is this simply to limit how much CPU the process uses? Considering I’m running a 5900X with 12 cores, is this something I really even need to worry about? Thanks a lot for your reply and suggestions; I really appreciate it. Also, could you tell me: if I copy the latest revision from my cloud storage and use it as the starting point for my local backups, would my local backup revision IDs continue from that revision (if I copy revision 1016 from cloud storage before beginning local backups, would my next local backup be revision 1017)? Thanks again.

Yes. For people who run it on personal machines/laptops, it’s usually a bad user experience when the fans flare up every hour because duplicacy rushes to complete a backup in 50 seconds, when it would have been absolutely fine for it to take the entire hour and avoid those power spikes.

If that’s a server connected to wall power, probably not :). But still: homogeneous power consumption is good, spikes are bad :)
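For reference, a sketch of what running duplicacy under cpulimit could look like (assuming the common cpulimit tool, where -l takes a percentage and 100 = one full core; the limit value here is just an example, not a recommendation):

```shell
# Cap duplicacy at roughly two cores' worth of CPU on a 12-core 5900X.
# -l 200 means 200% of a single core; tune to taste.
cpulimit -l 200 -- duplicacy backup -stats -threads 10 -vss -vss-timeout 400 -storage local
```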

Not sure about that, but you can check which revisions are in the target by looking at the contents of the “snapshots” folder in the storage.
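To expand on that: duplicacy stores each revision as a file named by its revision number under snapshots/&lt;snapshot-id&gt; in the storage. A quick simulation of that layout (the /tmp paths are made up for the demo) shows what to look for after copying revision 1016:

```shell
# Simulate the storage layout: one file per revision, named by number,
# under snapshots/<snapshot-id>/ in the storage root.
mkdir -p /tmp/demo-storage/snapshots/lahmaz97_r
touch /tmp/demo-storage/snapshots/lahmaz97_r/1016

# Listing that folder shows which revisions the target already holds.
ls /tmp/demo-storage/snapshots/lahmaz97_r
```

So after the copy, listing S:/Backup/duplicacy/default/snapshots/lahmaz97_r on the real local storage should show a file named 1016.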