I don't think my retention policy is working correctly

I have the following purge schedule set to run daily:

-keep 0:365 -keep 30:30 -keep 7:7 -keep 1:1

As far as I understand, this should:

Keep a daily backup for up to 7 days,
Keep 1 snapshot every 7 days for anything older than 7 days,
Keep 1 snapshot every 30 days for anything older than 30 days,
Keep nothing once it’s over 365 days…
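For reference, here is how I read those -keep n:m rules as a quick sketch (keep one snapshot every n days for snapshots older than m days, most restrictive rule first; this is a simplified model of the documented semantics, not Duplicacy's actual algorithm):

```python
# Simplified model of Duplicacy's "-keep n:m" rules: keep one snapshot
# every n days for snapshots older than m days; n == 0 means keep none.
# This is an illustrative sketch only, not the real prune algorithm.

def apply_retention(ages, rules):
    """ages: snapshot ages in days; rules: (n, m) pairs, largest m first."""
    kept = []
    last_kept = None  # age of the last snapshot we decided to keep
    for age in sorted(ages, reverse=True):  # walk from oldest to newest
        # the first rule whose threshold m this snapshot exceeds applies
        rule = next(((n, m) for n, m in rules if age > m), None)
        if rule is None:
            kept.append(age)  # newer than every threshold: always kept
            continue
        n, m = rule
        if n == 0:
            continue          # "-keep 0:365": drop everything this old
        if last_kept is None or last_kept - age >= n:
            kept.append(age)  # far enough from the previously kept one
            last_kept = age
    return kept

rules = [(0, 365), (30, 30), (7, 7), (1, 1)]
daily = list(range(400))  # one backup per day for 400 days
print(apply_retention(daily, rules))
```

With one backup per day, this keeps every recent daily, thins anything older than 7 days to weekly, older than 30 days to monthly, and drops everything past 365 days.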

But… when I go to restore, I seem to have options going back daily for much longer than 7 days:


What am i doing wrong here?

The log for the job shows the below… which as far as I can tell is just plain wrong:

2022-01-09 08:00:36.821 INFO RETENTION_POLICY Keep no snapshots older than 365 days
2022-01-09 08:00:36.822 INFO RETENTION_POLICY Keep 1 snapshot every 30 day(s) if older than 30 day(s)
2022-01-09 08:00:36.822 INFO RETENTION_POLICY Keep 1 snapshot every 7 day(s) if older than 7 day(s)
2022-01-09 08:00:36.822 INFO RETENTION_POLICY Keep 1 snapshot every 1 day(s) if older than 1 day(s)
2022-01-09 08:00:37.394 INFO SNAPSHOT_NONE No snapshot to delete

What is your actual prune command line? Post the beginning of the prune log to confirm that you are pruning the same storage. You can also add -id to ensure it will actually prune what you need.

I’m running the web GUI… is this what you’re after?

Running prune command from /root/.duplicacy-web/repositories/localhost/all
Options: [-log prune -storage GOOGLE -keep 0:365 -keep 30:30 -keep 7:7 -keep 1:1]
2022-01-09 08:00:01.506 INFO STORAGE_SET Storage set to gcd://DUPLICACY
2022-01-09 08:00:04.896 INFO RETENTION_POLICY Keep no snapshots older than 365 days
2022-01-09 08:00:04.896 INFO RETENTION_POLICY Keep 1 snapshot every 30 day(s) if older than 30 day(s)
2022-01-09 08:00:04.896 INFO RETENTION_POLICY Keep 1 snapshot every 7 day(s) if older than 7 day(s)
2022-01-09 08:00:04.896 INFO RETENTION_POLICY Keep 1 snapshot every 1 day(s) if older than 1 day(s)
2022-01-09 08:00:36.600 INFO SNAPSHOT_NONE No snapshot to delete

I have the same command running on 2 storages, both acting the same way…

Here’s the other log:

Running prune command from /root/.duplicacy-web/repositories/localhost/all
Options: [-log prune -storage NAS -keep 0:365 -keep 30:30 -keep 7:7 -keep 1:1]
2022-01-09 08:00:36.617 INFO STORAGE_SET Storage set to sftp://admin@192.168.2.107:3122/sftp/sftp/DUPLICACY
2022-01-09 08:00:36.821 INFO RETENTION_POLICY Keep no snapshots older than 365 days
2022-01-09 08:00:36.822 INFO RETENTION_POLICY Keep 1 snapshot every 30 day(s) if older than 30 day(s)
2022-01-09 08:00:36.822 INFO RETENTION_POLICY Keep 1 snapshot every 7 day(s) if older than 7 day(s)
2022-01-09 08:00:36.822 INFO RETENTION_POLICY Keep 1 snapshot every 1 day(s) if older than 1 day(s)
2022-01-09 08:00:37.394 INFO SNAPSHOT_NONE No snapshot to delete
  • if you add -id <snapshot_id> option explicitly, does it then work?
  • what if you add -a flag?

I’m confused about what you mean by adding -id <snapshot_id>… you want me to add a specific snapshot ID to the prune job?

I’m just running the jobs with -a and it seems to be deleting things… I’ll check once it’s done and see what the restore points look like now…

Yes, either pass a specific -id or -a. By default -a should have been there, but looking at your log it seems it was removed.

If it works with -a, then there’s no need to test each specific ID. The only reason I asked about that is that I once hit a corner case where it did not work with -a but did when specifying a specific ID.
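For what it’s worth, the equivalent CLI invocation with -a spelled out would look something like this (the GOOGLE storage name and retention rules are taken from the logs above):

```shell
# Prune across all snapshot IDs in the GOOGLE storage (-a), with the
# same retention rules as the scheduled job.
duplicacy prune -a -storage GOOGLE \
    -keep 0:365 -keep 30:30 -keep 7:7 -keep 1:1
```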


Man, this prune is taking its sweet time… an hour and 10 minutes already… is this to be expected?

When I tail the prune log, it’s been stuck on deleting a revision for over an hour…

With Google Drive? Yes. It’s a very high-latency remote. I have seen prune take over 3 days.

If you interrupt it, it will leave half-dead snapshots in the datastore, which will then fail check. It’s an often-reported bug. (To recover, you can run check -fossils -resurrect and then delete the remaining bad snapshots manually, followed by a prune -exhaustive to remove orphaned chunks. That can also take its sweet time.)
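That recovery sequence as commands (the storage name GOOGLE is just an example taken from the logs in this thread):

```shell
# 1. Turn fossils that are still referenced back into chunks:
duplicacy check -fossils -resurrect -storage GOOGLE

# 2. Manually delete the snapshot files that still fail check, then
#    sweep any chunks no longer referenced by any snapshot:
duplicacy prune -exhaustive -storage GOOGLE
```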

As a result, when I was backing up to Google Drive I wasn’t pruning at all from the machines (I pruned weekly from a separate cloud instance instead).


Ok, well if it’s to be expected then I’ll leave it running and see where we are in the morning…
Thanks for your help! Hopefully once this is complete the resulting prunes will be quicker :slight_smile:

I guess I could use something like an Oracle or Google free tier to run the prune separately?

FWIW I use Oracle; they offer the most generous free instance :slight_smile:

I have a VM.Standard.E2.1.Micro running, but I have a feeling I can get a better-specced VM out of the free tier.

Well, it took a lot less than expected… just under 2 hours, but we’re looking good… thank you!


That’s what I use, but I have x64 software to run. I’m wondering if the “Ampere A1 Compute instances (Arm processor)” wouldn’t fit this use case better: 4 total cores and a massive amount of RAM.

it took a lot less than expected

Relevant ask:

Also, the fact that prune completes successfully without either -a or -id present, while doing nothing, is a bug IMO. There is no legitimate scenario where this behavior is desirable.

@gchen?

I think this is an artefact of clumping two operations together in one command: collecting chunks and deleting chunks.

Both -a and -id seem to operate on the collect stage, so if you’re not running with -collect-only or -delete-only, there’s still a delete phase, which may or may not run depending on whether fossils were collected on previous runs.

Hence, I don’t think it’s much of a problem, but perhaps Duplicacy should say something in the log like “No -id or -all specified; -delete-only implied”… but still allow it to run? You can, after all, run -delete-only without -id or -a making any sense in this context.
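The two-phase behavior can be sketched as a toy model (my own simplification; the real prune also waits until new backups exist for every snapshot ID before deleting fossils, which this ignores):

```python
# Toy model of prune's two phases. "Collect" renames unreferenced chunks
# to fossils; "delete" removes fossils collected on a *previous* run,
# resurrecting any that became referenced again in the meantime.
# Illustrative only: the real algorithm has extra safety checks.

def prune(store, referenced, collect_only=False, delete_only=False):
    """store: {'chunks': set, 'fossils': set}; referenced: chunk names in use."""
    if not collect_only:
        # delete phase: runs even when no -a/-id limits the collect phase
        store["chunks"] |= store["fossils"] & referenced  # resurrect
        store["fossils"] = set()                          # delete the rest
    if not delete_only:
        # collect phase: demote unreferenced chunks to fossils
        doomed = store["chunks"] - referenced
        store["fossils"] |= doomed
        store["chunks"] -= doomed
    return store

store = {"chunks": {"a", "b", "c"}, "fossils": set()}
prune(store, referenced={"a", "b"})   # run 1: "c" becomes a fossil
prune(store, referenced={"a", "b"})   # run 2: fossil "c" is deleted
```

Note that the delete phase on the second run fires regardless of which IDs the collect phase was scoped to, which is why a run without -a or -id can still delete something.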


If neither -a nor -id is specified, the default ID is retrieved from the preferences. In this case the prune job runs with a dummy ID, duplicacyweb, which is never used to create backups.

This makes sense.

Then there are two bugs:

  1. Duplicacy CLI shall not “successfully” prune a nonexistent snapshot.
  2. Duplicacy Web shall validate input to the CLI and require either -id or -a, because otherwise it’s meaningless.

Not meaningless with the -delete-only flag (and with -exclusive and -exhaustive too, I believe): -a or -id isn’t strictly required there.