Exhaustive prune on idrive e2 deletes needed chunks

Please describe what you are doing to trigger the bug:
Running duplicacy prune -exhaustive -d with the S3 storage driver targeting an iDrive e2 bucket.

Please describe what you expect to happen (but doesn’t):
The command completes without doing anything.

Please describe what actually happens (the wrong behaviour):
The message Found redundant chunk ... is printed floor(# of chunks / 1000) times. Unless a dry run is selected, those chunks are deleted. This seems to happen because the S3 driver requests chunk listings in batches of 1000. The iDrive S3 implementation appears to have a bug that makes each listing start with the marker object itself rather than with the entry after the marker, as documented by Amazon. The prune code treats the repeated entry as a redundant chunk and deletes it, when in reality it was the only copy.
This is a bug in iDrive, but my understanding is that they are somewhat slow to respond. Perhaps a workaround should be added to Duplicacy, since this causes data loss.
It may be enough to use the ListObjectsV2 API call instead, though I have not tested this yet.
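
For anyone who wants to see the effect in isolation, here is a minimal Go sketch of the paging behaviour described above. It is a simulation, not Duplicacy's code: listPage, listAll, countRepeats and makeKeys are made-up names, the store is an in-memory sorted slice, and a page size of 3 stands in for the real batch size of 1000.

```go
package main

import (
	"fmt"
	"sort"
)

// makeKeys builds n sorted, hypothetical chunk names.
func makeKeys(n int) []string {
	keys := make([]string, n)
	for i := range keys {
		keys[i] = fmt.Sprintf("chunk%04d", i)
	}
	return keys
}

// listPage mimics one marker-based listing call over sorted keys.
// With inclusive set, the page starts AT the marker (the behaviour
// reported in this thread) instead of after it (the documented one).
func listPage(keys []string, marker string, limit int, inclusive bool) ([]string, bool) {
	start := 0
	if marker != "" {
		if inclusive {
			start = sort.SearchStrings(keys, marker) // first key >= marker
		} else {
			start = sort.Search(len(keys), func(i int) bool { return keys[i] > marker })
		}
	}
	end := start + limit
	if end >= len(keys) {
		return keys[start:], false // last page, not truncated
	}
	return keys[start:end], true
}

// listAll pages through the store, using the last key of each page as
// the next marker. limit must be at least 2, otherwise the inclusive
// mode would return the same single key forever.
func listAll(keys []string, limit int, inclusive bool) []string {
	var all []string
	marker := ""
	for {
		page, truncated := listPage(keys, marker, limit, inclusive)
		all = append(all, page...)
		if !truncated || len(page) == 0 {
			return all
		}
		marker = page[len(page)-1]
	}
}

// countRepeats counts entries identical to the one just before them.
func countRepeats(entries []string) int {
	n := 0
	for i := 1; i < len(entries); i++ {
		if entries[i] == entries[i-1] {
			n++
		}
	}
	return n
}

func main() {
	keys := makeKeys(10)
	for _, inclusive := range []bool{false, true} {
		all := listAll(keys, 3, inclusive)
		fmt.Printf("inclusive=%v: %d entries, %d repeats\n", inclusive, len(all), countRepeats(all))
	}
}
```

With 10 keys and a page size of 3, the compliant mode lists 10 entries with no repeats, while the inclusive mode lists 14 entries with 4 repeats, one at every page boundary.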

Edit: After initial testing, ListObjectsV2 has the same problem (even though the parameter is called StartAfter). Skipping the first element on subsequent ListObjects calls does appear to avoid the problem.
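
A rough sketch of that workaround in Go, assuming the page and the marker are available as plain strings. dropMarker is a hypothetical helper, not a function in Duplicacy or the AWS SDK. Rather than skipping the first element unconditionally, it only drops it when it equals the marker that was sent, so a compliant server loses nothing.

```go
package main

import "fmt"

// dropMarker removes the leading entry of a page when it is the same
// key that was sent as the marker for that request. Pages from a
// server that already starts after the marker pass through unchanged.
func dropMarker(page []string, marker string) []string {
	if marker != "" && len(page) > 0 && page[0] == marker {
		return page[1:]
	}
	return page
}

func main() {
	marker := "chunk02"

	// Page as returned by a server that includes the marker.
	inclusive := []string{"chunk02", "chunk03", "chunk04"}
	// Page as returned by a server that follows the documented behaviour.
	compliant := []string{"chunk03", "chunk04", "chunk05"}

	fmt.Println(dropMarker(inclusive, marker)) // [chunk03 chunk04]
	fmt.Println(dropMarker(compliant, marker)) // [chunk03 chunk04 chunk05]
}
```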


Examining with other tools, it seems that ListObjectsV2 works correctly when continuation tokens are used as documented in the S3 API, which is also what the AWS Go SDK does if you use the ListObjectsV2Pages interface.
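
To illustrate why continuation tokens sidestep the issue, here is a small stdlib-only Go sketch of token-based paging. It does not call the AWS SDK: the pager interface, fakeStore and listAll are hypothetical names, and the token is simply an encoded offset. The point is that the client never derives the next position from a key; it only hands back whatever opaque token the server returned.

```go
package main

import (
	"fmt"
	"strconv"
)

// pager is a hypothetical stand-in for a ListObjectsV2-style API: each
// call returns a page plus an opaque token for the next page, or an
// empty token when the listing is complete.
type pager interface {
	ListPage(token string) (keys []string, next string, err error)
}

// fakeStore is an in-memory pager. limit must be positive.
type fakeStore struct {
	keys  []string
	limit int
}

func (s fakeStore) ListPage(token string) ([]string, string, error) {
	start := 0
	if token != "" {
		n, err := strconv.Atoi(token)
		if err != nil || n < 0 || n > len(s.keys) {
			return nil, "", fmt.Errorf("invalid continuation token %q", token)
		}
		start = n
	}
	end := start + s.limit
	if end >= len(s.keys) {
		return s.keys[start:], "", nil // last page, no further token
	}
	return s.keys[start:end], strconv.Itoa(end), nil
}

// listAll follows continuation tokens until the server stops issuing them.
func listAll(p pager) ([]string, error) {
	var all []string
	token := ""
	for {
		keys, next, err := p.ListPage(token)
		if err != nil {
			return nil, err
		}
		all = append(all, keys...)
		if next == "" {
			return all, nil
		}
		token = next
	}
}

func main() {
	store := fakeStore{keys: []string{"a", "b", "c", "d", "e", "f", "g"}, limit: 3}
	all, err := listAll(store)
	fmt.Println(all, err) // [a b c d e f g] <nil>
}
```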

I just pushed this commit: Skip identical entries when listing chunks · gilbertchen/duplicacy@d92b173 · GitHub

This is a more general fix. I’m hesitant to update the s3 backend just for iDrive.
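
For readers following along, the idea of skipping identical entries can be sketched in Go as below. This is not the code from the commit, which I have not reproduced here; skipIdentical is a hypothetical name, and the sketch assumes the listing is sorted so that a repeated entry always sits right after its twin.

```go
package main

import "fmt"

// skipIdentical returns entries with adjacent duplicates removed.
// It relies on the input being sorted, as S3 listings are, so that
// any repeated key appears immediately after the first occurrence.
func skipIdentical(entries []string) []string {
	var out []string
	for i, e := range entries {
		if i > 0 && e == entries[i-1] {
			continue // same key as the previous entry, skip it
		}
		out = append(out, e)
	}
	return out
}

func main() {
	// A listing with a repeat at each page boundary.
	listed := []string{"c0", "c1", "c2", "c2", "c3", "c4", "c4", "c5"}
	fmt.Println(skipIdentical(listed)) // [c0 c1 c2 c3 c4 c5]
}
```

Because it only compares neighbouring entries, a filter like this does not depend on which backend produced the listing, which fits the goal of not special-casing one provider.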

The concern about not breaking existing S3 consumers is fair. The V2 interface is notably faster, so it might still warrant consideration as an option on that account. Though I assume only exhaustive prune does a full chunk listing?

What is also interesting is that, after I reported this issue to iDrive, they got back to me and said that the behavior with StartAfter was an intentional default configuration, due to issues with some unnamed client applications. They said they could change the behavior for my account, and they did. I just retested the current release of Duplicacy and saw that the duplicate reporting had ceased.
