Help restore my faith in Duplicacy

Hope you all can help. I had Duplicacy backing up with what seemed to be no issues or problems. I just attempted a restore, and everything went fine except for one file. I keep getting the following error: “Failed to decrypt the chunk : cipher: message authentication failed; retrying”

Help me restore my faith in Duplicacy. Does anyone know why this would happen? Luckily I have another copy, so this is mostly about figuring out what went wrong. I’m willing to troubleshoot to find the root cause; I just don’t know where to start. Thanks!


Looks like a corrupted chunk.

What is your storage? Does it guarantee data consistency? Did you enable erasure coding in Duplicacy if it does not? (Not a guarantee, but better than nothing as a temporary measure until you move to a storage that provides such guarantees.)
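
For context, erasure coding is a property of the storage and is chosen when the storage is initialized. A minimal sketch of what that looks like; the snapshot ID, storage URL, and the 5 data / 2 parity shard split here are placeholders, not a recommendation:

  # Sketch only: create a new encrypted storage with erasure coding enabled.
  # "my-docs" and <storage-url> are placeholders for your snapshot ID and backend.
  duplicacy init -e -erasure-coding 5:2 my-docs <storage-url>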

Another possibility is that the chunk was corrupted before it got uploaded; I would check the filesystem on your host machine and the RAM (with memtest86). If that’s the same machine, of course.

Looks like it works as expected: it detected and reported that the chunk is corrupted, instead of restoring garbage data.

Thank you for the reply, saspus. OneDrive is the target. I didn’t enable EC but can do so in the future.

Host machine was fine.

I guess I’m just a bit confused. I have 13 snapshots. The file appeared in snapshot 8 and remained unchanged in the following snapshots; however, that file is corrupt in every one of those snapshots.

Why didn’t Duplicacy warn me while backing up, or retry the upload on snapshot 9? Would enabling EC have changed this behavior? I’m just not getting the warm and fuzzies from Duplicacy.

OneDrive is generally fine; this leaves two possibilities:

  • you uploaded that problematic chunk while OneDrive had an issue (documented here on the forum) where it kept a partially uploaded file instead of deleting it. They fixed that eventually, but it was a problem for some period of time. EC would not have helped here anyway.
  • issues on your machine caused the chunk to get corrupted after it was created but before it was uploaded.

Btw, was it OneDrive for Business or personal?

That’s deduplication working. If the file (or parts of it) was unchanged, there is no reason to re-upload it; instead, the snapshot references the existing chunk, which happens to be corrupted.

How was it supposed to know that the chunk file got corrupted? Duplicacy trusts the storage. If the storage says “yep, I got the file”, it assumes it indeed got the file.

Since you don’t pay for egress from OneDrive, you can schedule check -chunks to run after each backup. This will cause Duplicacy to download and verify chunks (each chunk is only verified once). For flaky remotes like OneDrive that may be lying about data integrity, or for cases where a chunk got corrupted after it was created but before it was uploaded, this is not a bad idea.
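
As a rough sketch, a scheduled job could pair the two; the repository path is a placeholder, and the job is assumed to run from an already-initialized repository:

  # Hypothetical nightly job: back up, then download and verify chunks.
  cd /path/to/repository        # placeholder path
  duplicacy backup -stats
  duplicacy check -chunks       # chunks verified on a previous run are skipped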

You should not base your data integrity policy on warm and fuzzy feelings :). Trust, but verify.

A backup that is not tested is to be assumed broken.

Duplicacy provides some facilities to help in those scenarios, but ultimately using flaky storage is the culprit. The line has to be drawn somewhere on what to verify. Shall Duplicacy trust RAM? The CPU? By default, Duplicacy assumes that storage does not damage or lose files.

If that is a concern, you can run duplicacy check -a. This downloads the snapshot files and verifies that all chunks referenced by the snapshots are still present. This results in a relatively small download, and for remotes with paid egress it is a cheap way to detect the remote losing files or prune bugs.

If you add the -chunks flag, it will also download the chunk files and check their integrity. If you have to do this, something went very wrong: either your machine or the remote is faulty.

If you add -files, it will also try to reconstruct all files from their chunks.
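
Put together, the three levels of verification look roughly like this; each one downloads progressively more data:

  duplicacy check -a            # verify that all referenced chunks still exist (small download)
  duplicacy check -a -chunks    # also download chunks and verify their integrity
  duplicacy check -a -files     # also reconstruct every file from its chunks (heaviest)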

Anyway, what shall you do now?

  1. Run a filesystem checker and a disk check to ensure your storage hardware is not dying.
  2. Run memtest86 several times to ensure the RAM on your machine is not faulty.
  3. Delete the bad chunk from the storage manually.
  4. Create a new snapshot ID and make one full backup with the exact same settings and filters as your broken one (see the sketch after this list). The vast majority of chunks will already be on the storage, so very little extra upload will occur, but this will allow the missing chunk to get re-created and re-uploaded, thus repairing your datastore.
  5. Delete the newly created snapshot ID; we don’t need it anymore.
  6. If you continue using OneDrive as a backup destination, schedule check -chunks after every backup job, or at least periodically.
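
A sketch of steps 3 to 5 with the CLI; the temporary snapshot ID “repair-temp” is a placeholder, and the exact chunk file name comes from the error message or the check output:

  # 3. Delete the corrupted chunk file from the storage by hand; it lives under
  #    chunks/ on the remote, named after the chunk hash that was reported.
  # 4. Back up the same repository with the same filters under a temporary snapshot
  #    ID, e.g. by changing the "id" field in .duplicacy/preferences to "repair-temp"
  #    (or by configuring a second backup to the same storage under that ID), then:
  duplicacy backup
  duplicacy check -a -chunks    # confirm the re-created chunk is present and valid
  # 5. Switch the snapshot ID back to the original; the temporary snapshot can be
  #    pruned later (e.g. duplicacy prune -id repair-temp -r 1) or simply left in place.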

Saspus - I really appreciate the thorough explanation. Sounds like the API call to OneDrive failed, which puts the blame on them and not Duplicacy. Faith pretty much restored, thank you! Check -chunks will probably go in the workflow.

I know it’s not my local server since I’ve copied that file around and MD5 summed it. All good locally.

Since I have your ear, regarding steps 4 and 5 you listed: why would I delete the new snapshot ID rather than keep it?

Again, appreciate the knowledge transfer and help!

That’s a temporary one, needed only to trick Duplicacy into re-creating all chunks in order to replace the missing one. It’s no longer needed after that, and you just continue your backup history under the old snapshot ID.

This might not be applicable in your case, but I recently had a similar issue when restoring from a Backblaze (S3) storage. All chunks succeeded except one, and I only later found out that it was due to hitting the daily Backblaze download cap (which was left at its default). Doh! I thought that the backup was corrupted, so I deleted everything in the bucket and started from scratch, but in hindsight this was of course completely unnecessary.

It would, however, have been nice if duplicacy had notified me of being rate limited (if this is possible to detect), so that I wouldn’t have made the wrong assumption and resorted to deleting everything.