Few newbie questions - prune, OneDrive etc

I am quite new to duplicacy and still trying to set it up right.

In my case, I had a lot of different directories spread out among different drives, computers, old backups, etc. So the first thing I did was make one large copy of all the files into a single Backup directory. Luckily I had enough space for this.

Then I ran duplicacy to see how much deduplication would help. As predicted, from more than 13TB of data, the final backup takes less than 6TB… What a mess I had on those drives… Dedupe was a great idea.

Then I started to add/delete some files to make the source directory (BACKUP) a little cleaner, running a new duplicacy job each time I made some changes. That means the snapshots keep growing, and my destination space now exceeds 6TB.

Next week I should be almost ready with my cleanup so here are my questions:

  • should I totally delete my backup and start a new job, treating it as a clean base (it takes a long time to create, as I use a Celeron NAS for this);
  • or can I run prune with settings that keep only the last snapshot? Will this release some space, so the data in the duplicacy destination drops below 5TB (my prediction)?

How can I force duplicacy to delete files that are no longer in the source? I think there is no way other than prune with the right settings, yes?

And my last question for my 321 backup plan:
Is there a way to divide the destination into 5 different spaces?

I have a family Office 365 subscription with OneDrive 5x 1TB and could use this to copy my backup. An ideal secure plan for me, but can it be done somehow in duplicacy (one job, 5 different credentials and OneDrives), or should I manually divide the directories and copy them to 5 accounts?

Again, noticed this hadn’t been answered after 11 days…

Personally, I would not go with option 1 - not having a backup copy while this takes place isn’t a great idea.

My first thought is you might want to do a round of backup -hash jobs, to ensure your storage is efficiently packed after moving files around a bit. Even if you deleted all but the latest revisions, there could be a lot of wasted space, with since-deleted data packed into partially-referenced chunks.

If you’re lacking space, I’d first reduce your prune policy to something like -keep 0:7, then run new backup jobs with the -hash flag. This may add a bit more weight to your storage, so you may need to wait for the initial cull before actual disk space is released. (Remember: Duplicacy does pruning in two steps.)
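Sketched as CLI commands (run from the repository directory, assuming the default storage), that sequence might look like:

```shell
# Re-read and re-chunk all files so the storage is packed efficiently
# (slow on a Celeron NAS, but only needed for this one round)
duplicacy backup -hash

# Step one: remove revisions older than 7 days; chunks they referenced
# are only marked as fossils, not deleted yet
duplicacy prune -keep 0:7

# Step two: a later prune run (after at least one newer backup exists)
# deletes the fossilized chunks and actually frees disk space
duplicacy prune
```

The two prune runs have to be separated by a backup, which is why the space doesn’t come back immediately.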

After a week or so, your storage should be packed efficiently and the old history removed. Then put your prune policy back to what it was previously. (You can, ofc, reduce 7 to less and do it within a day, so long as you’re sure the two-step pruning is occurring.)

You can do an additional cleanup with -exhaustive, but I’d encourage you to run this as a one-off, and not in conjunction with -exclusive, and especially not with -keep. It’ll remove unreferenced chunks after the second step.
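As a one-off, that would be something like (again just the bare flag, nothing else combined with it):

```shell
# One-off: scan the entire storage and fossilize any chunks that no
# snapshot references; a later plain prune run deletes the fossils
duplicacy prune -exhaustive
```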

To answer your question about splitting a storage across several cloud accounts: so long as this is a third, just-in-case copy… :wink:

This is feasible… whether I’d personally go down this route, hmm, dunno. :slight_smile: You could use Rclone union + Rclone serve sftp, or Rclone mount + mergerfs, or a variety of other methods. I’ll leave you to do the research on that. :stuck_out_tongue: You might find it works out well; ya never know till you try…
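Just to illustrate the Rclone union idea (remote names here are made up; each odN would be a separately authorized OneDrive remote you set up via rclone config):

```ini
; rclone.conf fragment (sketch, not a tested setup)
; od1 .. od5: one remote per OneDrive account
[od1]
type = onedrive
; ... od2 through od5 configured the same way, different credentials

; merge all five into one logical remote
[backup-union]
type = union
upstreams = od1: od2: od3: od4: od5:
```

You’d then expose it with something like `rclone serve sftp backup-union: --addr :2022` and point Duplicacy at that SFTP endpoint as a single storage. Note the union places each file wholly on one upstream; since Duplicacy stores data as lots of small chunk files, they’d spread across the five accounts naturally.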