Please describe what you are doing to trigger the bug:
I’ve been using Duplicacy with Google Drive as the backend. When I tried to use rclone to copy the remote repository to another place, it complained about duplicate file names. This is when I found out about the problem.
Please describe what you expect to happen (but doesn’t):
Every chunk / revision has a unique file name, with no duplicates.
Please describe what actually happens (the wrong behaviour):
First, duplicate files are not created on every backup, so this might be related to network instability. Because the problem only occurs occasionally, I had already deleted the cron job log files from the runs in which it occurred.
Some duplicate files have identical MD5 hashes, so I can use rclone dedupe to remove the (identical) duplicates. Others have identical names but different MD5 hashes. This is what confuses me and makes it impossible to move the remote storage with confidence (i.e., I am not sure whether I should keep the older or the newer files; keeping the newer files would make more sense if, say, the duplicates are due to Duplicacy retrying a failed operation).
$ rclone dedupe GoogleDrive1:duplicacy Remote2:duplicacy
2020/11/14 15:56:26 NOTICE: chunks/64/83e08e3ed30ac6ec83ad385957b7ef5e4b6af5733116a8eda66b0d9c3793dd: Found 2 duplicates - deleting identical copies
chunks/64/83e08e3ed30ac6ec83ad385957b7ef5e4b6af5733116a8eda66b0d9c3793dd: 2 duplicates remain
1: 3399462 bytes, 2020-06-27 15:17:49.693000000, MD5 072f233da8842048739f2d98923ec841
2: 3399462 bytes, 2020-06-27 15:17:49.334000000, MD5 ffda4e78df88b4a3de1312413692d1f6
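
To see how many paths fall into each of the two cases above before deleting anything, here is a rough sketch that groups a listing by path and compares MD5 hashes. It assumes that `rclone lsjson -R --hash` reports each duplicate as its own entry with `Path` and `Hashes` fields. The helper names `classify_duplicates` and `list_remote` are made up for illustration and are not part of rclone or Duplicacy.

```python
import json
import subprocess
from collections import defaultdict


def classify_duplicates(entries):
    """Report paths that appear more than once in a listing.

    Returns {path: "identical" | "conflicting"}. "identical" means every
    copy has the same MD5; "conflicting" means at least two copies differ
    (or a hash is missing, so equality cannot be confirmed).
    """
    md5s_by_path = defaultdict(list)
    for entry in entries:
        if entry.get("IsDir"):
            continue
        # The casing of the hash key is treated as unknown, so normalise it.
        hashes = {k.lower(): v for k, v in (entry.get("Hashes") or {}).items()}
        md5s_by_path[entry["Path"]].append(hashes.get("md5"))

    report = {}
    for path, md5s in md5s_by_path.items():
        if len(md5s) > 1:
            same = None not in md5s and len(set(md5s)) == 1
            report[path] = "identical" if same else "conflicting"
    return report


def list_remote(remote):
    """Assumed invocation: recursive JSON listing of `remote` with hashes."""
    result = subprocess.run(
        ["rclone", "lsjson", "-R", "--hash", remote],
        check=True,
        capture_output=True,
        text=True,
    )
    return json.loads(result.stdout)
```

Calling `classify_duplicates(list_remote("GoogleDrive1:duplicacy"))` should give a read-only overview: paths marked "identical" are the ones rclone dedupe already handles, while "conflicting" ones, like the chunk in the output above, need a decision about which copy to keep.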