Suggestion: add "-include" to copy command

Summary

I am backing up to a local store and using the copy command to send the snapshots offsite as described in Backup to multiple storages. However, this approach sends more snapshots offsite than needed. I suggest adding an -include parameter to the copy command.

Background

Previously, I ran two backups, one local and one offsite. The local backup created snapshots hourly, primarily to help with the occasional “oops” by a user. The offsite backup created snapshots daily for disaster recovery. This dual-backup approach provided sufficient risk mitigation and disaster recovery for my needs. Switching to the backup-copy approach increased bandwidth utilization and storage costs without a corresponding benefit, because all the hourly snapshots are now sent offsite.

Suggestion Details

I suggest adding something similar to the prune command’s -keep parameter to the copy command. Instead of determining which snapshots to prune, it would determine which snapshots to include. For example, -include 1,0 would copy just 1 snapshot per day.
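To make the intended selection concrete, here is a minimal sketch in Python. It is not Duplicacy code: the function name select_daily, the (revision, created) tuples, and the rule "take the earliest snapshot of each calendar day" are all assumptions made for illustration.

```python
from datetime import datetime


def select_daily(snapshots):
    """Return one (revision, created) pair per calendar day.

    Assumption for this sketch: the earliest snapshot of each day is chosen.
    """
    chosen = {}
    # Walk the snapshots in time order; the first one seen for a day wins.
    for revision, created in sorted(snapshots, key=lambda s: s[1]):
        chosen.setdefault(created.date(), (revision, created))
    return [chosen[day] for day in sorted(chosen)]


# Hypothetical local snapshots: two per day over two days.
hourly = [
    (1, datetime(2024, 1, 1, 9)),
    (2, datetime(2024, 1, 1, 10)),
    (3, datetime(2024, 1, 2, 9)),
    (4, datetime(2024, 1, 2, 10)),
]

print([rev for rev, _ in select_daily(hourly)])  # [1, 3]
```

Only revisions 1 and 3 would go offsite; the other hourly snapshots stay local.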

The effect achieved will be the state described in bullet 4 in Backup to multiple storages under “Pruning” without the manual effort or creation of a (possibly buggy) bespoke script.

Thanks,
Scott Ainsworth

This is an interesting idea, although personally I would just prefer -latest - to copy only the most recent snapshot, on a daily schedule.

However, I did wonder about some kind of copy -sync flag, which would effectively do a normal copy but prune snapshots on the destination which are no longer on the source.

The main reason is that even if you wanted a complete set of snapshots in two different locations, you would still have to prune both storages with the same retention periods and on the same day to avoid yo-yo’ing prune / re-copy cycles, which can happen if they get out of sync.
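Roughly, the -sync idea boils down to two set differences. A minimal sketch in Python, assuming revisions can be compared by number alone; plan_sync is a made-up name, not anything Duplicacy provides:

```python
def plan_sync(source_revisions, destination_revisions):
    """Work out what a hypothetical 'copy -sync' would copy and prune."""
    source = set(source_revisions)
    destination = set(destination_revisions)
    to_copy = sorted(source - destination)    # on source, missing on destination
    to_prune = sorted(destination - source)   # on destination, gone from source
    return to_copy, to_prune


# Source has already pruned revisions 1 and 2 and added 5 and 6.
to_copy, to_prune = plan_sync([3, 4, 5, 6], [1, 2, 3, 4])
print(to_copy, to_prune)  # [5, 6] [1, 2]
```

Because the destination always follows the source, a revision pruned on the source is never re-copied, which is the yo-yo case described above.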

Your idea, combined with this, might be a better feature. And, TBH, it may as well re-use the -keep flag name and usage - i.e. copy -keep 1:1.

If, one day, Duplicacy would store its retention settings on the storage side, and all you had to do was a simple prune command without having to specify -keep, it could look at both storages when doing a copy, and automatically pick the right snapshots (although I’m not sure what it’d do if those retention periods are different; maybe pick the destination, optionally delete snapshots on destination if not on source, and copy only snapshots that fit within the defined retention periods).

(I’d still like a -latest flag, though - for simpler setups.)

Anyway, just brainstorming…

@Droolio, Thank you for the feedback!

I definitely modeled -include after -keep. In fact, when I started writing the suggestion, I used -keep, only switching to -include as I neared completion of the post. I switched terms because I conceptualize the two operations differently. During prune, -keep indicates which revisions remain (i.e., are not touched). In contrast, -include indicates which revisions are touched (i.e., copied).

-latest is simpler than -include. But what does -latest mean if the copy does not run for a few days (assuming the local backups do run)? This question brings out a subtle difference between the two: the meaning of -latest depends on when it is run, while -include defines a desired end-state. -include is idempotent (within a 24-hour window): when copy is run multiple times on the same input with the same -include arguments, the same result is produced. (This is very useful for testing, among other things.) Copying with -latest, by contrast, has the potential to produce different results when run multiple times in a 24-hour period.
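A small simulation shows the difference. This is only a sketch: copy_include and copy_latest are hypothetical stand-ins for the two proposed flags, storages are modeled as plain sets of revision numbers, and the "earliest snapshot of each day" rule is my assumption.

```python
from datetime import datetime


def daily_set(source):
    """Earliest revision of each calendar day (assumed -include rule)."""
    chosen = {}
    for revision, created in sorted(source.items(), key=lambda s: s[1]):
        chosen.setdefault(created.date(), revision)
    return set(chosen.values())


def copy_include(source, destination):
    """Hypothetical 'copy -include': converge on one revision per day."""
    destination |= daily_set(source)


def copy_latest(source, destination):
    """Hypothetical 'copy -latest': copy only the newest revision right now."""
    destination.add(max(source, key=source.get))


# Local backups ran on day 1, but the copy job did not.
source = {
    1: datetime(2024, 1, 1, 9),
    2: datetime(2024, 1, 1, 17),
    3: datetime(2024, 1, 2, 9),
}
via_include, via_latest = set(), set()

# First copy run, on day 2.
copy_include(source, via_include)
copy_latest(source, via_latest)

# Another local backup, then a second copy run on the same day.
source[4] = datetime(2024, 1, 2, 17)
copy_include(source, via_include)
copy_latest(source, via_latest)

print(sorted(via_include))  # [1, 3]
print(sorted(via_latest))   # [3, 4]
```

With -include, the offsite storage ends up with one revision per day no matter when or how often copy runs. With -latest, day 1 is missed entirely and day 2 gets two revisions.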

Thanks!
Scott