Does Duplicacy recognize moved files and folders?

So that it doesn't have to re-upload tons of data again?

Yes, Duplicacy is able to deduplicate moved files and folders, but it won't be able to achieve 'perfect' deduplication, because it takes a pack-and-split approach: files are packed together first (on the fly) and then split into chunks, so a moved file may result in a few new chunks being created, since the files before and after it have changed.
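
To make the pack-and-split idea concrete, here is a minimal, self-contained Go sketch. This is not Duplicacy's actual code: the window size, boundary mask, and additive rolling hash are simplified stand-ins for the real rolling-hash chunker. It packs two synthetic files in one order, re-packs them in the other order to simulate a move, and counts how many chunks survive unchanged:

```go
package main

import (
	"crypto/sha256"
	"fmt"
)

const (
	window  = 48    // rolling-hash window (illustrative, not Duplicacy's value)
	mask    = 0x3FF // cut when hash&mask == mask: ~1 KiB average chunks
	minSize = 256   // minimum chunk size, to avoid degenerate tiny chunks
)

// split cuts one packed stream into chunks at positions chosen by a simple
// additive rolling hash over the last `window` bytes. Because a boundary
// depends only on nearby bytes, boundaries resynchronize shortly after any
// local change, such as a neighboring file moving away.
func split(stream []byte) [][]byte {
	var chunks [][]byte
	start := 0
	var hash uint32
	for i := 0; i < len(stream); i++ {
		hash += uint32(stream[i]) // byte entering the window
		if i >= window {
			hash -= uint32(stream[i-window]) // byte leaving the window
		}
		if i-start+1 >= minSize && hash&mask == mask {
			chunks = append(chunks, stream[start:i+1])
			start = i + 1
		}
	}
	if start < len(stream) {
		chunks = append(chunks, stream[start:])
	}
	return chunks
}

func main() {
	// Two synthetic "files" with deterministic pseudo-random content.
	fileA, fileB := make([]byte, 200_000), make([]byte, 200_000)
	seed := uint32(1)
	fill := func(buf []byte) {
		for i := range buf {
			seed = seed*1664525 + 1013904223 // LCG, for repeatable demo data
			buf[i] = byte(seed >> 24)
		}
	}
	fill(fileA)
	fill(fileB)

	// Pack A then B, then simulate a "move" by packing B then A.
	before := split(append(append([]byte{}, fileA...), fileB...))
	after := split(append(append([]byte{}, fileB...), fileA...))

	seen := map[[32]byte]bool{}
	for _, c := range before {
		seen[sha256.Sum256(c)] = true
	}
	reused := 0
	for _, c := range after {
		if seen[sha256.Sum256(c)] {
			reused++
		}
	}
	fmt.Printf("chunks before: %d, after: %d, reused: %d\n",
		len(before), len(after), reused)
}
```

Because each boundary depends only on the bytes in a small sliding window, the boundaries resynchronize shortly after the junction between the two files, so all but a handful of chunks come out byte-identical and deduplicate.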

Could you please explain a little bit more?
If I move a folder containing 2 TB of data, will Duplicacy recognize that and not upload it all again? How much of that has to be re-created?

You can imagine this 2 TB folder as a 2 TB tar file. Only a few new chunks will be created when scanning the beginning and end of this tar file, because the files before and after it will be different. The majority of the chunks within the tar file will remain the same and thus do not need to be uploaded again.
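
As a rough back-of-envelope figure (assuming the default average chunk size of 4 MiB; the chunk size is configurable, so treat this as an estimate): a 2 TB folder corresponds to roughly 500,000 chunks, and only the few chunks spanning the start and end of the moved data would be re-created, so the re-upload would be on the order of megabytes rather than terabytes.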

Is this scoped to a storage backend, or per backup job? If I have 4 folders being backed up to GDrive with 4 backup jobs, will Duplicacy notice moving a file from one location (e.g. Media1) to another (Media2)?

Yes, Duplicacy can deduplicate between backups on the same storage. Technically it won't notice a file having been moved, but when it scans the moved file it will be able to figure out that most of the chunks already exist in the storage.
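
To illustrate why deduplication naturally crosses backup jobs, here is a hedged Go sketch. The Storage map and uploadChunk function are hypothetical stand-ins, not Duplicacy's API; the point is only that a chunk's ID is derived from its content, so any backup targeting the same storage can check for an existing chunk before uploading it:

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// Storage is a hypothetical stand-in for a real backend such as Google
// Drive, keyed by chunk ID.
type Storage map[string][]byte

// uploadChunk derives the chunk ID from the chunk's content and uploads
// the chunk only if no chunk with that ID is already in the storage.
func uploadChunk(s Storage, chunk []byte) (id string, uploaded bool) {
	sum := sha256.Sum256(chunk)
	id = hex.EncodeToString(sum[:])
	if _, exists := s[id]; exists {
		return id, false // identical chunk already uploaded by some backup
	}
	s[id] = chunk
	return id, true
}

func main() {
	storage := Storage{}
	chunk := []byte("chunk contents produced while backing up Media1")

	// The first backup job uploads the chunk.
	_, uploaded := uploadChunk(storage, chunk)
	fmt.Println("Media1 backup uploaded chunk:", uploaded) // true

	// A different backup job (say Media2, after the file moved there)
	// produces an identical chunk and finds it already present.
	_, uploaded = uploadChunk(storage, chunk)
	fmt.Println("Media2 backup uploaded chunk:", uploaded) // false
}
```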

I seem to remember, though, that the bundle-and-chunk algorithm (which is not yet implemented) solves an issue with insufficient deduplication of moved files. But maybe that concerned small files only?

That algorithm should handle both small and large files, afaicr. Small files will be bundled together, whereas large files will not be bundled with anything else. The first and last chunks of large files would be standalone (see the sketch below).

Disclaimer: this is only as far as I can remember.
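
For what it's worth, here is a rough, runnable Go sketch of the bundling decision as described above. Since the algorithm was not implemented at the time of this thread, everything here (the plan function, the 1 MiB largeFileThreshold) is an assumption for illustration, not Duplicacy's code:

```go
package main

import "fmt"

// largeFileThreshold is a hypothetical cutoff; a real implementation's
// threshold may differ.
const largeFileThreshold = 1 << 20 // 1 MiB

type File struct {
	Name string
	Size int
}

// plan groups files into units that would each be chunked independently:
// consecutive small files share a bundle, while every large file forms a
// unit of its own, so its chunks no longer depend on neighboring files.
func plan(files []File) [][]string {
	var units [][]string
	var bundle []string
	flush := func() {
		if len(bundle) > 0 {
			units = append(units, bundle)
			bundle = nil
		}
	}
	for _, f := range files {
		if f.Size >= largeFileThreshold {
			flush()
			units = append(units, []string{f.Name}) // large file stands alone
		} else {
			bundle = append(bundle, f.Name)
		}
	}
	flush()
	return units
}

func main() {
	files := []File{
		{"a.txt", 100},
		{"b.txt", 300},
		{"video.mkv", 5 << 20},
		{"c.txt", 50},
	}
	for i, unit := range plan(files) {
		fmt.Printf("unit %d: %v\n", i, unit)
	}
}
```

The output groups a.txt and b.txt into one bundle, leaves video.mkv as its own unit, and starts a new bundle for c.txt. Since a large file is chunked on its own, moving it would leave all of its chunks, including the first and last, unchanged.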

Yes, but afaicr it only makes a difference when small files are moved.

Ditto.