Deduplicate based on what's already at destination?

Destination is Google Drive: I have a bunch of virtual servers that I’ve backed up to Google Drive already. The nature of VPS is that they are not renewed after a while, so they are no longer around (most aren’t). I have two machines that I want to start backing up to the same GDrive account. These two computers contain roughly 90% of what’s already on the destination. Will Duplicacy be able to de-dupe based on what’s already on the Google Drive destination (i.e. data that’s not tied to any computer being backed up)? I would just need to do this dedupe for the initial backup.

If you have not deleted the backups of the defunct (virtual) computers, the datastore does not know or care that those computers no longer exist. The snapshots still reference chunks from those machines, so the chunks are still present. And yes, it will not re-upload the same chunks again: cross-machine deduplication is the core feature of Duplicacy.

Duplicacy didn’t create those initial backups, hence the question. They were dumped in via Resilio Sync.

Duplicacy stores its data and metadata in a special file/folder structure. It can’t deduplicate against arbitrary files stored outside of this structure.
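
To make the distinction concrete, here is a minimal toy sketch of content-addressed chunk storage. It is not Duplicacy's actual code: the `ChunkStore` class, the 4-byte fixed chunk size, and the file name are all hypothetical and chosen for illustration (Duplicacy itself uses variable-size chunks). The point is only that deduplication looks at chunks already in the store, not at raw files sitting elsewhere on the same drive.

```python
import hashlib

CHUNK_SIZE = 4  # tiny fixed size for illustration only (an assumption, not Duplicacy's real chunking)


class ChunkStore:
    """Toy content-addressed storage: each chunk is keyed by its SHA-256 hash."""

    def __init__(self):
        self.chunks = {}  # hash -> chunk bytes

    def backup(self, data):
        """Store data and return (snapshot as a list of chunk hashes, chunks uploaded)."""
        snapshot, uploaded = [], 0
        for i in range(0, len(data), CHUNK_SIZE):
            chunk = data[i:i + CHUNK_SIZE]
            key = hashlib.sha256(chunk).hexdigest()
            if key not in self.chunks:  # only chunks the store has never seen are uploaded
                self.chunks[key] = chunk
                uploaded += 1
            snapshot.append(key)
        return snapshot, uploaded


# Raw files synced to the same drive by another tool. The store never reads them,
# so they cannot contribute to deduplication.
raw_files_on_drive = {"vps1.img": b"AAAABBBBCCCC"}

store = ChunkStore()
_, first = store.backup(b"AAAABBBBCCCC")   # same bytes as the raw file, still fully uploaded
_, second = store.backup(b"AAAABBBBDDDD")  # second machine shares two chunks with the first

print(first, second)  # 3 1
```

The first backup uploads all three chunks even though identical bytes already exist as a raw file. The second backup uploads only the one chunk the store has not seen, which is the cross-machine deduplication described earlier in the thread.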

I would also suggest not mixing the Duplicacy storage with any other folders.
Create one folder such as GoogleDrive/Backups/Duplicacy/ and use that as the storage, and do not copy anything else into that folder.

With this simple arrangement, you can also have GoogleDrive/Backups/Resilio Sync VPS/ holding your existing files, which have nothing at all to do with Duplicacy.