Help with Prune/Retention Settings

Tez · 24 May 2025 12:03

Environment:

I have several computers in my network that all, with various programs, back up to my Synology NAS drive. The NAS runs Duplicacy via Docker and uploads everything to a Backblaze B2 bucket.

All backups work absolutely fine, backup as needed, and I have both a Check schedule running and semi-regularly check my backups are valid by restoring randomly selected files.

Everything is running smoothly.

The Question:

I have a scheduled task inside Duplicacy that runs housekeeping. Its main job is to Prune the backups in storage at Backblaze. The setting I have for this is currently:

-keep 0:360 -keep 30:180 -keep 7:30 -keep 1:7

Now I’ve noticed I’m starting to pay Backblaze more each month than I’d like, or than I think is necessary, so I’d like to scale this down so Backblaze is only ever storing the last month’s worth of backups and nothing beyond that.

The reason for this thread is I’m a little confused as to how the Prune schedule works and exactly what command I’d need to achieve this.
I know my -keep 0:360 means no revisions older than 360 days (basically, a year), so am I correct in thinking my new revised command should be:

-keep 0:360 -keep 0:180 -keep 1:30 -keep 1:7

That would give me only 1 copy at a month old, I think? But what does the 1:7 mean? The help documents say it’s “1 revision for revisions older than 7 days”. Why older than 7 days? I want 1 revision each day, which is when the backup happens, and then nothing over a month; i.e. I only ever want 30 rolling days backups. Does the above command satisfy that?

Apologies if this is a simple question, but I’m having a hard time following how these retention commands work.

fronesis · 24 May 2025 14:48

You want this:
-keep 0:180 -keep 1:30

That will not keep any snapshots older than 6 months; it will delete them all. It will then keep 1 snapshot every day for any snapshots older than 1 month. For any snapshots made within a month, it will just keep every single snapshot you create. (Technically, you don’t even need that second -keep because you are only backing up once a day, so in practice it’s not really going to delete anything.)

The help documents say it’s “1 revision for revisions older than 7 days”. Why older than 7 days?

Because the command you are running is prune, which is a command to DELETE snapshots. -keep is just a flag. It’s saying, “while you are in the process of deleting a bunch of stuff, don’t delete this stuff.”

If you want 1 snapshot for each day up to 30, then just make sure you run a backup at least every day. On the other hand, if you were backing up every hour, then you might want to add more flags to decide how many of those to keep.

Tez · 24 May 2025 20:51

That makes sense; thank you.

One more thing: most of my backups are daily, but I also have two jobs scheduled on the NAS that back up weekly on a specific day. These are backups that aren’t subject to much change, so they don’t need such frequent snapshotting.

In the example you gave (-keep 0:180 -keep 1:30), can I assume, then, that the daily backups will be one a day and not go over 30 days (i.e. there will be 30 copies of them) and the weekly ones will have four historical examples?

fronesis · 24 May 2025 23:01

Here’s what we get if we run duplicacy prune -h

-keep <n:m> [+] keep 1 snapshot every n days for snapshots older than m days

So just plug in our -keep 1:30 and it will tell us what happens:
keep 1 snapshot every 1 day for snapshots older than 30 days.

This means:

There is no “30 copies” anywhere here
For any snapshot that is less than 30 days old, duplicacy won’t delete it. How many of those snapshots there are just depends on how many you made.
For snapshots older than 30 days, duplicacy will keep 1 per day and delete the rest.
If the number of snapshots Duplicacy encounters is smaller than the prune command is instructed to keep, then it will just keep them all.
For example, if you run prune with the keep flags described above, -keep 0:180 -keep 1:30, duplicacy will DELETE all snapshots older than 180 days and NOT TOUCH any snapshots less than 30 days old. It will look at snapshots between 30 and 180 days old. If it finds only one per week or two per week, then it won’t delete anything. If it finds 10 or 20 per week, then it will delete 3 or 13 per week (that is, enough to get to 1 per day).
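
To make the rules above concrete, here is a minimal Python sketch that models them. It is only an illustration of the behaviour described in this thread, not Duplicacy's actual code: the function names (simulate_prune, history), the fixed "now" and the 200-day history are all assumptions made for the example, and real prune runs may differ in edge cases such as time tolerances.

```python
from datetime import datetime, timedelta


def simulate_prune(snapshots, rules, now):
    """Return the snapshots that survive a simplified '-keep n:m' policy.

    rules is a list of (n, m) pairs; the rule with the largest m that a
    snapshot has aged past is the one that governs it (assumed model).
    """
    rules = sorted(rules, key=lambda rule: rule[1], reverse=True)
    kept = []
    last_kept = {}  # rule index -> time of the last snapshot kept under it
    for snap in sorted(snapshots):  # oldest first
        age = now - snap
        tier = next(
            (i for i, (_, m) in enumerate(rules) if age >= timedelta(days=m)),
            None,
        )
        if tier is None:
            kept.append(snap)  # younger than every m: never touched
            continue
        n = rules[tier][0]
        if n == 0:
            continue  # '-keep 0:m' deletes everything older than m days
        prev = last_kept.get(tier)
        if prev is None or snap - prev >= timedelta(days=n):
            kept.append(snap)  # far enough from the last kept snapshot
            last_kept[tier] = snap
    return kept


def history(now, step_hours, days=200):
    """Hypothetical backup history: one snapshot every step_hours hours."""
    count = days * 24 // step_hours
    return [now - timedelta(hours=step_hours * k) for k in range(count)]


NOW = datetime(2025, 5, 24, 12, 0)
RULES = [(0, 180), (1, 30)]  # -keep 0:180 -keep 1:30

for label, step in [("daily", 24), ("weekly", 168), ("twice daily", 12)]:
    snaps = history(NOW, step)
    kept = simulate_prune(snaps, RULES, NOW)
    print(f"{label}: {len(snaps)} snapshots before, {len(kept)} after")
```

In this model the daily and weekly jobs lose only the snapshots older than 180 days, while the twice-daily job is also thinned to one per day between 30 and 180 days, which matches the description above.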

Please also keep in mind that you can run prune against all snapshot IDs (-a) or against a specific snapshot ID (-id <snapshot id>), so when it comes to the different jobs you have scheduled, it will depend on how you run the backups (same snapshot ID, or different IDs) and how you run the prunes.

Tez · 25 May 2025 09:53

Hokay, I think I get how Duplicacy is handling these backups now, thank you.

Honestly, I still don’t get why it doesn’t just do the same as most backup software packages and clearly state “keep X copies” and then rollover at that value. That would be so much easier to understand and account for!

saspus · 27 May 2025 19:22

You want to keep a cadence of backups in time, regardless of the number of backups taken at any specific point in time, so “keep X copies” is meaningless, because you can’t know whether these X copies will span one day or one year.

For example, if you take hourly backups, but one day your machine slept for 8 hours and another day for 16 hours, you will end up with backup histories of different lengths. This is bad and inconsistent. Pruning by the time each backup was taken does the right thing there.
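
The arithmetic behind that point can be sketched in a few lines of Python. The figure of 24 retained copies and the helper name history_span_days are assumptions chosen purely for illustration; the backups are taken hourly, but only while the machine is awake.

```python
def history_span_days(copies_kept, backups_per_day):
    """Days of history covered when only the newest copies_kept backups survive."""
    return copies_kept / backups_per_day


COPIES_KEPT = 24  # hypothetical "keep X copies" setting

for hours_asleep in (8, 16):
    backups_per_day = 24 - hours_asleep  # one backup per waking hour
    span = history_span_days(COPIES_KEPT, backups_per_day)
    print(f"asleep {hours_asleep}h/day: {COPIES_KEPT} copies cover {span} days")
```

The same count-based setting covers twice as much history on the machine that sleeps more, whereas a rule expressed in days covers the same window on both.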

Pretty much all backup software operates that way: you specify the time intervals in the past at which snapshots must be kept.

Duplicacy’s way to specify that cadence is different, but it’s no better and no worse than any other approach. It provides flexibility at the expense of extra complexity. Web UI however has a preset for most common configuration (GFS) to hide the complexity from those that don’t need it.

Tez · 4 July 2025 10:19

Hello! Apologies for resurrecting an old thread, but I have a new question related to it and doing it this way seemed a lot better than starting a new thread.

As a reminder, this is Duplicacy running on my Synology NAS in Docker, and it is set up to upload backups from my NAS to my Backblaze B2 bucket.

Since I last updated everything I’ve been keeping an eye on my Backblaze bucket and, alarmingly, the size of that bucket just keeps growing and growing. It’s currently about 7.5TB, which is way more than it should be. Obviously, I’m paying Backblaze for all this storage.

I have all my backups scheduled in Duplicacy and there is a single scheduled job set up to prune everything. It is set as -keep 0:180 -keep 0:30 -keep 1:7.
The idea is I want no more than a week’s worth of any of my backup jobs.
The prune job runs once a week and always shows as Completed. Its last output shows as:

Running prune command from /tmp/duplicacy/repositories/localhost/all
Options: [-log prune -storage tez-nas -keep 0:180 -keep 0:30 -keep 1:7]
2025-06-29 11:00:10.511 INFO STORAGE_SET Storage set to b2://tez-nas-drive-backup
2025-06-29 11:00:11.081 INFO BACKBLAZE_URL Download URL is: https://f002.backblazeb2.com
2025-06-29 11:00:12.621 INFO RETENTION_POLICY Keep no snapshots older than 180 days
2025-06-29 11:00:12.622 INFO RETENTION_POLICY Keep no snapshots older than 30 days
2025-06-29 11:00:12.622 INFO RETENTION_POLICY Keep 1 snapshot every 1 day(s) if older than 7 day(s)
2025-06-29 11:02:03.244 INFO SNAPSHOT_NONE No snapshot to delete

Now, not only is my B2 bucket growing, but if I go to the Restore section in Duplicacy and pick any of my backup IDs I get returned a list of revisions going back far further than I’d have expected.
For example, the time of writing this post is 4/7/25. My revisions available for restoring currently go back to 10/11/24; so almost 8 months. This is just a shade beyond the 7 days I’m wanting to store!

I’m very nervous about setting anything at Backblaze’s end to do with retention as I think it’ll very likely screw up the chunking method Duplicacy is using; I’d prefer Duplicacy handles everything by itself.

So what is going wrong in my scenario? If I let this continue for another few months, my bill to Backblaze looks like it will just continue to rise and rise. Are there any diagnostic commands I can run to find out what’s going on here?

Thank you in advance of any help!

saspus · 4 July 2025 16:52

Your prune invocation does nothing because it’s missing which snapshot ID to operate on. You need to either add -a to work on all snapshot IDs or -id <snapshot ID> to prune just the specified one.

Then you need to do

-keep 0:7 -a

What you wrote keeps every revision younger than a week and one revision per day up to a month old, and even that only once you add the -a or -id flag.

Tez · 4 July 2025 17:01

Aah, thank you so much for the missing flag. That makes perfect sense as to why nothing was actually pruning and my Backblaze storage - and bill - was just increasing constantly.

I’ve now adjusted the prune job to include the -a flag.

Droolio · 5 July 2025 15:05

Just to clarify on what saspus said, this: -keep 0:180 -keep 0:30 is contradictory - the first rule gets overwritten by the second, because they’re both 0:m, and the smallest takes precedence. You should only have one of these, and 0:7 is all you need (if you only want a week).
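
A small Python sketch can show why the 0:180 rule adds nothing once 0:30 is present. It assumes the reading of -keep n:m used in this thread (a snapshot is governed by the rule with the largest m it has aged past); the helper names are hypothetical and this is a model, not Duplicacy's implementation.

```python
def applicable_rule(age_days, rules):
    """Return the (n, m) rule governing a snapshot of this age, or None."""
    for n, m in sorted(rules, key=lambda rule: rule[1], reverse=True):
        if age_days >= m:
            return (n, m)
    return None  # younger than every m: prune leaves it alone


def is_deleted_outright(age_days, rules):
    """True when the governing rule is of the form 0:m."""
    rule = applicable_rule(age_days, rules)
    return rule is not None and rule[0] == 0


with_both = [(0, 180), (0, 30), (1, 7)]  # -keep 0:180 -keep 0:30 -keep 1:7
only_one = [(0, 30), (1, 7)]             # -keep 0:30 -keep 1:7
week_only = [(0, 7)]                     # -keep 0:7

# Compare the outcome for every whole-day age up to 400 days.
same = all(
    is_deleted_outright(age, with_both) == is_deleted_outright(age, only_one)
    for age in range(400)
)
print("0:180 changes nothing:", same)
print("10-day-old snapshot under 0:7 deleted:", is_deleted_outright(10, week_only))
```

Under this model every snapshot past 30 days is deleted with or without the 0:180 rule, so only the smallest 0:m rule has any effect.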

I’d strongly urge you to keep more revisions, though…

You’d be surprised at how efficient Duplicacy’s de-duplication is, and how much a well-written prune rule helps; together they would probably let you keep more than 8 months while keeping the space down. I use -keep 30:365 -keep 7:90 -keep 1:14 (you could always change the first rule to, say, 0:365), and I bet it’d use far less than what your B2 was using 'til now.
