Hello,
I am trying to understand how Duplicacy works. I have read the wiki page and the PDF with the detailed explanation of how pruning works, but I still cannot fully understand how it works and how it manages to be lock-free.
The explanation given is that prune deletes chunks in two steps: first it renames unreferenced chunks to fossils, then it actually removes them only if there is a “new snapshot” and that snapshot does not reference any fossils. This works because every snapshot is seen by either the “fossil collection” step or the “fossil deletion” step. I can see this being true if we run one backup and one prune at the same time, in any possible interleaving, with the “policy” rules in place.
But I do not see it working with 2+ backups + prune running at the same time.
Let’s imagine that we start “fossil collection” and 2 backups at roughly the same time:
- Fossil collection starts just before the backups, so it doesn’t “see” them
- Fossil collection sees that nothing references chunk A, so it marks it as a fossil (old snapshots were removed, so nothing references chunk A anymore)
- Backup 1 doesn’t reference chunk A
- Backup 2 does reference chunk A, and it checks for the chunk right before fossil collection marks it as a fossil, so it doesn’t re-upload the chunk
- Fossil collection finishes
- Backup 1 finishes, but Backup 2 doesn’t finish yet
- We start fossil deletion. There is a new snapshot, created by Backup 1, that fossil collection didn’t see, and it finished after fossil collection finished. So both checks in “Policy 3” are satisfied.
- So we delete chunk/fossil A completely
- But wait… it is still referenced by Backup 2 that didn’t finish yet… oops?
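
To make the timeline above concrete, here is a toy Python model of it. This is only a sketch of the two-step prune as read in this post, not Duplicacy's actual code: `Storage`, `fossil_collection`, `fossil_deletion`, `run_scenario`, and the chunk and snapshot names are all made up for illustration, and the deletion check deliberately encodes the assumption that a single new finished snapshot is enough.

```python
from dataclasses import dataclass, field


@dataclass
class Storage:
    """Hypothetical in-memory stand-in for the backup storage."""
    chunks: set = field(default_factory=set)       # live chunks
    fossils: set = field(default_factory=set)      # chunks renamed to fossils
    snapshots: dict = field(default_factory=dict)  # snapshot name -> referenced chunks


def fossil_collection(storage, seen_snapshots):
    """Step 1: turn every chunk that no seen snapshot references into a fossil."""
    referenced = set()
    for name in seen_snapshots:
        referenced |= storage.snapshots[name]
    collected = storage.chunks - referenced
    storage.chunks -= collected
    storage.fossils |= collected
    return collected


def fossil_deletion(storage, collected, seen_snapshots):
    """Step 2: delete collected fossils that no new snapshot references.

    Assumption under test: one new finished snapshot is enough to proceed.
    """
    new_snapshots = [n for n in storage.snapshots if n not in seen_snapshots]
    if not new_snapshots:
        return set()  # no new snapshot yet, so nothing may be deleted
    still_needed = set()
    for name in new_snapshots:
        still_needed |= storage.snapshots[name]
    deleted = collected - still_needed  # referenced fossils are kept
    storage.fossils -= deleted
    return deleted


def run_scenario():
    storage = Storage(chunks={"A", "B"},
                      snapshots={"old": {"A", "B"}, "kept": {"B"}})

    # Prune removes the old snapshot, so nothing references chunk A anymore.
    del storage.snapshots["old"]
    seen = set(storage.snapshots)  # the backups have not produced snapshots yet

    # Backup 2 finds chunk A still present, so it decides not to re-upload it.
    assert "A" in storage.chunks
    backup2_chunks = {"A", "B"}

    collected = fossil_collection(storage, seen)  # A becomes a fossil

    # Backup 1 finishes after fossil collection and does not reference A.
    storage.chunks.add("C")
    storage.snapshots["backup1"] = {"B", "C"}

    deleted = fossil_deletion(storage, collected, seen)  # A is deleted

    # Backup 2 finishes only now and writes a snapshot that references A.
    storage.snapshots["backup2"] = backup2_chunks
    missing = storage.snapshots["backup2"] - storage.chunks
    return deleted, missing


if __name__ == "__main__":
    deleted, missing = run_scenario()
    print("deleted fossils:", deleted)
    print("chunks missing for backup2:", missing)
```

In this model both results come out as `{'A'}`: the fossil is deleted, and the snapshot written by Backup 2 then points at a chunk that no longer exists. Whether the real Policy 3 has this gap depends on exactly which new snapshots it requires, which is what the question below is about.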
What am I missing? As far as I can see, this case means that we must not run multiple backups with the same “snapshot id” at the same time, or we risk corrupting our backup. But isn’t the whole point of Duplicacy that we can run multiple backups at the same time?