Prune should keep the last backup in the specified period

I have been using duplicacy for a week now… It is absolutely amazing.
However, there is one thing I think should be handled differently: the behaviour of prune -keep.
Say I run three backups a day and once every week I prune old backups with the following:

duplicacy prune -keep 1:2 -all

Duplicacy will keep one revision per day for revisions older than 2 days, fine. The problem is that the first backup for each day is kept and the later ones discarded.
I don’t understand why this is the default behaviour.
Wouldn’t it be better to keep the last one for each day?
It seems more reasonable to me, as it represents a more recent state of the repository.
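
To make the difference concrete, here is a rough Python sketch of the two choices. It is only an assumed model of a single -keep n:m rule (walk the revisions in order and keep one only if it is at least n days away from the last kept one), not Duplicacy's actual code. The function name, the dates, and the 08:00/14:00/20:00 schedule are made up for illustration.

```python
from datetime import datetime, timedelta

def prune_keep(times, every_days, older_than_days, now, from_newest=False):
    """Return the backup times that survive one -keep n:m style rule.

    Assumed model, not Duplicacy's source: revisions younger than
    `older_than_days` are left alone; older ones are walked in order and
    kept only if they are at least `every_days` from the last kept one.
    """
    interval = timedelta(days=every_days)
    cutoff = now - timedelta(days=older_than_days)
    kept, last_kept = [], None
    for t in sorted(times, reverse=from_newest):
        if t > cutoff:
            kept.append(t)          # too young for the rule to apply
            continue
        if last_kept is not None and abs(t - last_kept) < interval:
            continue                # too close to the last kept revision
        kept.append(t)
        last_kept = t
    return sorted(kept)

# Hypothetical schedule: three backups a day for three days, pruned with 1:2.
now = datetime(2025, 1, 6)
backups = [datetime(2025, 1, d, h) for d in (1, 2, 3) for h in (8, 14, 20)]

oldest_first = prune_keep(backups, 1, 2, now)
newest_first = prune_keep(backups, 1, 2, now, from_newest=True)
print([t.strftime("%d %H:%M") for t in oldest_first])  # the 08:00 backup of each day
print([t.strftime("%d %H:%M") for t in newest_first])  # the 20:00 backup of each day
```

Walking from the oldest revision keeps the first backup of each day, which matches what I see; walking from the newest would keep the last one instead.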

Think of it the other way around: I would argue that the first backup of a day captures more changes than the last backup of the previous day.

But the point is, if you have to choose backups to prune, there is no absolute reason why some backups are more important than others, at least not in general cases.

Although I agree with you in principle, for me, when I want to go back to, say, Jan 2’s work, having to remember to go back to the first non-Jan 2 backup instead is a slight surprise.

Well, in theory one backup should not be more important than another, but in reality I think they are.
If I run a second backup in a day, it is probably because I made some changes which I would like to “commit”.
So I guess it is more likely that I will want the latest change in a repo.
Using a long enough period with -keep surely helps though (say 1:7).

Proposal: prune --preserve-newest
I propose a new option for the prune command to count backwards from the newest revision.

Desired Behavior
Run a prune with any -keep value, and guarantee that the newest revision of any repository is kept.

Why I Need This
The newest revision is the most important. Most of my snapshots are kept for the purpose of restoring files in case the local storage device is lost while traveling, not primarily for undoing changes.

Unexpected Deletion

  • I ran a backup 7 days ago (SD card A) and again 6 days ago (SD card B) from the same repository after copying over the SD cards.
  • I ran prune for the whole storage which contains multiple snapshots from different repositories.
  • I used multiple -keep options, the smallest being -keep 5:5.
  • The newest revision containing the data from SD card B was deleted.
  • The interval between the revisions was less than 5 days, so the newest revision was deleted because prune counts forward from the oldest revision.

Impractical Workarounds

  • To prevent this, I must manually check snapshot revision ages against keep rules for all snapshots in a storage.
  • But some repositories are backed up less frequently or consistently than others, so there may not be a -keep rule that guarantees the newest revision is kept for all snapshots.

The Only Guaranteed Method

  • The newest revision is only guaranteed to be kept if it is younger than the smallest “m” in -keep “n:m”,
  • or if the number of days between the newest revision and the previous revision is greater than the largest “n” in -keep “n:m”.
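
As a rough sketch, the two conditions above can be written as a small Python check. This only encodes my reasoning in this post, not anything Duplicacy provides; the function name is hypothetical, and the extra test for a 0:m rule is my own assumption (such a rule deletes everything older than m, so no gap can save the revision).

```python
def newest_revision_is_safe(newest_age_days, gap_to_previous_days, keep_rules):
    """Apply the two conditions to a list of (n, m) pairs from -keep n:m."""
    smallest_m = min(m for _, m in keep_rules)
    largest_n = max(n for n, _ in keep_rules)
    # Assumption: a rule with n == 0 removes every revision older than m.
    if any(n == 0 and newest_age_days > m for n, m in keep_rules):
        return False
    return newest_age_days < smallest_m or gap_to_previous_days > largest_n

# Only the smallest rule from my prune is known here, so only it is used.
rules = [(5, 5)]
print(newest_revision_is_safe(6, 1, rules))   # False: 6 days old, 1 day after the previous one
print(newest_revision_is_safe(3, 1, rules))   # True: younger than the smallest m
print(newest_revision_is_safe(6, 10, rules))  # True: gap larger than the largest n
```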

Workaround Not Always Usable

  • The smallest possible “m” is 1 day, so to make sure the newest revision of every repository in a storage is younger than the smallest “m”, a backup of every repository must be made within the 24 hours (or “m” days) before running the prune.
  • However, some repositories may be on computers that are located far away and are currently inaccessible.

Side Question

  • If you use -keep 0:365 -keep 90:90 -keep 30:7 and the newest revision is older than 365 days, will the entire snapshot be deleted?
  • Use 100000:365 instead to prevent that? (edit: still not safe if oldest and newest are older than 365, only way is to always have a revision younger than 365 days)

Duplicacy always keeps the latest revision, unless you use the -exclusive flag.

You must’ve run -exclusive in addition to -keep in the same run. That’s the only reason why the newest revision was deleted. Normally, you shouldn’t ever run -exclusive. So Duplicacy doesn’t need to count backwards to solve this (in fact, it’d probably introduce a whole new set of problems).

You can run -exclusive separately from -keep (say, one after the other), but you should never run them together. TBH, there should probably be a big red warning about this in the docs.

https://forum.duplicacy.com/search?q=prune%20exclusive%20keep%20together%20@Droolio

I’ll try a test later without the -exclusive flag, once 24 hours have passed since the newest revision.

The OP in this thread says:

Say I run three backups a day and once every week I prune old backups with the following:
duplicacy prune -keep 1:2 -all
The problem is that the first backup for each day is kept and the later ones discarded.

So does that mean prune always counts forward from the oldest revision, whether or not -exclusive is used?

For example I ran the prune with -exclusive, and it counted forward.
If it had counted backward, then 8 would have been deleted instead of 10.

So if I had run prune without -exclusive, then 10, 13, 14 would still have been deleted, but 15 would remain?

  8 | @ 2024-11-30 19:07
 10 | @ 2025-03-19 20:40
 12 | @ 2025-07-12 18:36
 13 | @ 2025-07-12 18:38
 14 | @ 2025-07-14 21:05
 15 | @ 2025-07-14 22:06 (latest, newest)

prune -keep 0:365 -keep 120:30 -keep 30:1 -a -exclusive

Deleting snapshot at revision 10
Deleting snapshot at revision 13
Deleting snapshot at revision 14
Deleting snapshot at revision 15

Yep, it always counts forward from the oldest snapshot.

Revision 8 isn’t deleted because it’s not older than a year. It’s ‘kept’ as the first (oldest) revision encountered, then any revisions older than 30 days but within 120 days of revision 8 get deleted (that is the -keep 120:30 rule, which is why revision 10 goes).

This is regardless of -exclusive.

The most important thing to note is using -exclusive removes the default behaviour of protecting the most recent revision - thus you’re not meant to use -exclusive during the normal course of use, but especially not with -keep. (That protection is removed because you might want to target specific revisions with -r - including the last.)
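
If it helps, here is a small Python sketch that reproduces the log above under the rules as described in this thread. Treat it as a simplified model and not Duplicacy's real code: the function name is made up, and the prune time (2025-07-16 12:00) is an assumption, since the post only implies it was more than a day after revision 15.

```python
from datetime import datetime, timedelta

def simulate_prune(revisions, keep_rules, now, exclusive=False):
    """Return the revision numbers a forward-counting prune would delete.

    revisions:  list of (number, datetime) pairs
    keep_rules: list of (n, m) pairs as in -keep n:m
    Assumed model only; the latest revision is protected unless exclusive.
    """
    revisions = sorted(revisions, key=lambda r: r[1])              # oldest first
    rules = sorted(keep_rules, key=lambda r: r[1], reverse=True)   # largest m first
    deleted, last_kept = [], None
    for index, (number, when) in enumerate(revisions):
        if index == len(revisions) - 1 and not exclusive:
            break                       # latest revision is never touched
        age = now - when
        # The rule with the largest m that this revision is older than applies.
        rule = next(((n, m) for n, m in rules if age > timedelta(days=m)), None)
        if rule is None:
            continue                    # younger than every m: left alone
        n = rule[0]
        if n == 0 or (last_kept is not None and when - last_kept < timedelta(days=n)):
            deleted.append(number)
        else:
            last_kept = when            # kept; later revisions are measured from here
    return deleted

revisions = [
    (8, datetime(2024, 11, 30, 19, 7)),
    (10, datetime(2025, 3, 19, 20, 40)),
    (12, datetime(2025, 7, 12, 18, 36)),
    (13, datetime(2025, 7, 12, 18, 38)),
    (14, datetime(2025, 7, 14, 21, 5)),
    (15, datetime(2025, 7, 14, 22, 6)),
]
rules = [(0, 365), (120, 30), (30, 1)]
now = datetime(2025, 7, 16, 12, 0)      # assumed prune time

print(simulate_prune(revisions, rules, now, exclusive=True))   # [10, 13, 14, 15]
print(simulate_prune(revisions, rules, now))                   # [10, 13, 14]
```

So in this model, running the same prune without -exclusive would still delete 10, 13 and 14, and 15 would remain.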