Need advice if I should prune or just start over

Hey,

I’ve had Duplicacy running on an Unraid server using the web UI for around 2-3 years I think. I have it backing up to B2 every night, which I’ve noticed has gotten a little high in cost lately. I realized that I’ve been backing up a /Videos directory, and that folder doesn’t really need to be backed up anymore. It is now excluded.

The issue is, I’ve not had any rules put into place for pruning this entire time. So there’s ~790 revisions (I know, I know), and when I try to run:

-id theid -keep 0:1 -exhaustive

to prune everything but the last backup without the videos folder, it’s taking an extremely long time.

  1. Is that normal for a very large amount of pruning, or did it possibly freeze and I should just give it another go?
  2. Should I just nuke it and start over?

I plan on adding more rules to all of this afterwards, but first I need to clean this mess up.

Thanks

Since you are willing to nuke the whole backup history, the question really boils down to what’s faster and/or cheaper.

Prune can take a long time and a lot of API calls (download metadata snapshots, enumerate chunks, enumerate other snapshots, figure out which chunks can be fossilized and then deleted). B2 charges for API calls, do they not?

Nuking everything will require you to re-upload all data again.

So I guess the best course of action would depend on how much data you have and how fast your upstream is.
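
To put a rough number on the re-upload side, here is a minimal sketch. The function name and the 500 GB / 20 Mbit/s figures are made up for illustration, and it assumes the link is the only bottleneck, so plug in your own values and treat the result as a lower bound.

```python
def reupload_hours(data_gb: float, upload_mbps: float) -> float:
    """Hours needed to upload data_gb gigabytes over an upload_mbps megabit/s link.

    Assumes decimal units (1 GB = 8000 megabits) and a fully saturated link.
    """
    if upload_mbps <= 0:
        raise ValueError("upload_mbps must be positive")
    megabits = data_gb * 8 * 1000
    seconds = megabits / upload_mbps
    return seconds / 3600


# Hypothetical example: 500 GB of data over a 20 Mbit/s upstream.
print(round(reupload_hours(500, 20), 1))  # → 55.6
```

If that comes out to a couple of days and you don't mind waiting, starting over is probably the simpler option.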

I’m not sure if -exhaustive is needed here, or whether it can be contributing to the bad performance without any benefit in return.

Thanks for the info! I wasn’t aware that B2 might charge for the calls. I don’t mind re-uploading, so I think that’s the best route if I want to fix this sooner.

Just curious, what options do you recommend for a simple backup schedule? Currently I just have it set to -threads 6 and let it run for a long time.

Thanks again.

Keep increasing the number of threads until you no longer see improvement. Depending on how far you are from their datacenter, the bandwidth you get per thread can vary:

https://www.reddit.com/r/backblaze/comments/bg7gt1/comment/em0l2bt/

And you may be encountering other bottlenecks before that.
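
If you want to be a bit more systematic about "until you no longer see improvement", something like this sketch works once you have timed a few runs. The `pick_threads` helper, the 5% cutoff, and the sample throughputs are all hypothetical, not anything Duplicacy provides; measure your own numbers.

```python
def pick_threads(measurements: dict[int, float], min_gain: float = 0.05) -> int:
    """Return the last thread count that still improved throughput by at least min_gain.

    measurements maps thread count -> measured throughput (any consistent unit).
    """
    if not measurements:
        raise ValueError("need at least one measurement")
    if any(v <= 0 for v in measurements.values()):
        raise ValueError("throughput values must be positive")

    counts = sorted(measurements)
    best = counts[0]
    for n in counts[1:]:
        # Relative gain over the best thread count accepted so far.
        gain = (measurements[n] - measurements[best]) / measurements[best]
        if gain < min_gain:
            break  # further threads no longer help noticeably
        best = n
    return best


# Hypothetical upload throughput in MB/s for each -threads value tried.
samples = {1: 8.0, 2: 15.0, 4: 27.0, 8: 30.0, 16: 30.5}
print(pick_threads(samples))  # → 8
```

In this made-up example, going from 8 to 16 threads gains under 2%, so 8 is where I would stop.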
