I have a 500GB backup going to my Google Drive and I am running out of space. I set up GCP storage in an archive tier and have that data backed up to the new location. What is the best way to now remove the Google Drive backup data to free up that space? Thank you.
To delete an existing backup from Google Drive, just delete the duplicacy root (the folder that contains chunks, snapshots, etc. subfolders).
Be very careful with archival storage. It may not be suitable for backup (note the 365-day minimum storage duration, and roughly 5x API cost and 5x retrieval fees compared to the Nearline tier from the same vendor).
Thanks. I have multiple backup jobs using my Google Drive as the target, and I only want to switch one of them away from Drive to Archive. It is data that almost never changes, but I want an offsite copy. It looks like everything is intermingled in the “duplicacy” folder under chunks and such. Is there a way to tell one job from another within the folders?
Ah, good point. Don’t touch duplicacy folder then!
Go under the snapshots folder and delete the subfolder corresponding to the backup you want to purge. Clear the local cache on the client. Then run
prune -a -exhaustive
This will remove chunks not referenced by any backup, thus removing data that was only used by the deleted snapshot. (It’s a really bad name, but that’s what duplicacy calls it.)
Note that prune, especially with -exhaustive, on Google Drive will take a massive amount of time: days, if not weeks. Just let it continue. If you want to see some progress being made, you can add the -d global flag.
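The steps above could look something like this. This is a sketch, not an exact recipe: the storage name (gdrive) and snapshot ID (my-backup) are placeholders for your own setup, and step 1 is done on the storage side rather than from the shell.

```shell
# Assumptions: "gdrive" is the storage name configured in duplicacy,
# "my-backup" is the snapshot ID being purged, and the current
# directory is the duplicacy repository. Adjust names to your setup.

# 1. On the storage side (e.g. the Google Drive web UI), delete:
#    <duplicacy root>/snapshots/my-backup/

# 2. Clear the local duplicacy cache on the client:
rm -rf .duplicacy/cache

# 3. Remove chunks no longer referenced by any backup; the -d
#    global flag turns on debug logging so you can watch progress:
duplicacy -d prune -a -exhaustive -storage gdrive
```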
Great, thank you! I can issue this command from the web interface as a separate job then?
Yes. For these odd jobs, I usually have another schedule with all days of the week disabled, so it’s never actually auto-scheduled.
I ran the job and it took a couple of days. It is done now, but my Drive storage wasn’t recovered.
I am using the WebUI. By default it entered some other retention settings:
-keep 0:1800 -keep 7:30 -keep 1:7 -a -exhaustive
I added the -a -exhaustive. Should I remove the -keep options and run with just -a -exhaustive on a job by itself?
Sounds about right; that’s due to the high latency of the Google Drive API.
How recently was the last backup done under the snapshot ID you wanted purged? If less than seven days ago, prune may not clear the chunks right away. Review the prune log to confirm; there would be a mention of that (not sure if that logging is on by default, though). (And I assume those files don’t share most of their content with the other backups; those shared bits won’t be freed, of course.)
You have two options:
- Safe one, slow: wait at least 7 days and run
prune -exhaustive -a
again. It should clear it this time.
- Unsafe, immediate: if you want to free up the storage ASAP, stop all backup schedules on all machines that back up to that storage. Make absolutely, 100% sure that no other client can touch the datastore for the duration of this next prune, neither from any other machine nor from the same one, and run
prune -a -exhaustive -exclusive
The -exclusive flag will bypass a bunch of safety nets and purge the unused chunks right away. But do make absolutely sure that this instance will be the only one touching the storage. Otherwise you may (will) lose data.
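A minimal sketch of the unsafe path, assuming every other client is stopped; the storage name gdrive is a placeholder, not something from your configuration:

```shell
# DANGER: run only while no other client (backup, check, or prune)
# can touch this storage, from this machine or any other.
# "gdrive" is a placeholder storage name.
duplicacy prune -a -exhaustive -exclusive -storage gdrive

# Afterwards, confirm the remaining snapshots are still intact:
duplicacy check -a -storage gdrive
```

The check at the end is optional but cheap insurance: -exclusive skips the two-step fossil-collection safety net, so verifying that every remaining snapshot still references valid chunks is worth the extra run.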
I don’t think that matters, but for clarity I would remove them.