I have a 500GB backup going to my Google Drive and I am running out of space. I set up GCP storage in an archive tier and have that data backed up to the new location. What is the best way to now remove the Google Drive backup data to free up that space? Thank you.
To delete an existing backup from Google Drive, just delete the duplicacy root (the folder that contains chunks, snapshots, etc. subfolders).
Be very careful with archival storage. It may not be suitable for backup (note the 365-day minimum storage duration, and roughly 5x API cost and 5x retrieval fees compared to the Nearline tier from the same vendor).
Thanks. I have multiple backup jobs using my Google Drive as the target, and I only want to switch one of them away from Drive to Archive. It is data that almost never changes, but I want an offsite copy. It looks like everything is intermingled in the “duplicacy” folder under chunks and such. Is there a way to tell one job from another within the folders?
Ah, good point. Don’t touch duplicacy folder then!
Go under the snapshots folder and delete the subfolder corresponding to the backup you want to purge. Clear the local cache on the client. Then run
prune -a -exhaustive
This will remove chunks not referenced by any backup, thus removing data that was only used by the deleted snapshot. (It’s a really bad name, but that’s what duplicacy calls it.)
Note that prune, especially with -exhaustive, on Google Drive will take a massive amount of time: days, if not weeks. Just let it continue. If you want to see some progress being made, you can add the -d global flag.
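The steps above could look something like this. This is a sketch, not an exact recipe: the storage name (gdrive) and snapshot ID (my-backup) are placeholders for your own setup, and step 1 is done on the storage side rather than from the shell.

```shell
# Assumptions: "gdrive" is the storage name configured in duplicacy,
# "my-backup" is the snapshot ID being purged, and the current
# directory is the duplicacy repository. Adjust names to your setup.

# 1. On the storage side (e.g. the Google Drive web UI), delete:
#    <duplicacy root>/snapshots/my-backup/

# 2. Clear the local duplicacy cache on the client:
rm -rf .duplicacy/cache

# 3. Remove chunks no longer referenced by any backup; the -d
#    global flag turns on debug logging so you can watch progress:
duplicacy -d prune -a -exhaustive -storage gdrive
```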
Great, thank you! I can issue this command from the web interface as a separate job then?
Yes. For these odd jobs, I usually have another schedule with all days of the week disabled, so it’s never actually auto-scheduled.
I ran the job and it took a couple of days. It is done now, but my Drive storage wasn’t recovered.
I am using the WebUI. By default it entered some other retention settings:
-keep 0:1800 -keep 7:30 -keep 1:7 -a -exhaustive
I added the -a -exhaustive. Should I remove the -keep options and run with just -a -exhaustive on a job by itself?
Sounds about right; that’s due to the high latency of the Google Drive API.
How recently was the last backup done under the snapshot ID you wanted purged? If less than seven days ago, prune may not clear the chunks right away. Review the prune log to confirm; there would be a mention of that (not sure if that logging is on by default, though). (And I assume those files don’t share most of their content with the other backups; those shared bits won’t be freed, of course.)
You have two options:
- Safe one, slow: wait at least 7 days and run
prune -exhaustive -a
again. It should clear it this time.
- Unsafe, immediate: if you want to free up the storage ASAP, stop all backup schedules on all machines that back up to that storage. Make absolutely, 100% sure that no other client can touch the datastore for the duration of this next prune, neither from any other machine nor from the same one, and run
prune -a -exhaustive -exclusive
The -exclusive flag will bypass a bunch of safety nets and purge the unused chunks right away. But do make absolutely sure that this instance will be the only one touching the storage. Otherwise you may (will) lose data.
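A minimal sketch of the unsafe path, assuming every other client is stopped; the storage name gdrive is a placeholder, not something from your configuration:

```shell
# DANGER: run only while no other client (backup, check, or prune)
# can touch this storage, from this machine or any other.
# "gdrive" is a placeholder storage name.
duplicacy prune -a -exhaustive -exclusive -storage gdrive

# Afterwards, confirm the remaining snapshots are still intact:
duplicacy check -a -storage gdrive
```

The check at the end is optional but cheap insurance: -exclusive skips the two-step fossil-collection safety net, so verifying that every remaining snapshot still references valid chunks is worth the extra run.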
I don’t think that matters, but for clarity I would remove them.