How to keep two storages in sync?

I have two bit-identical storages (A and B) which I’d like to keep synchronized.

My current method:

  1. Back up only to A
  2. Regularly copy all snapshots/revisions from A to B
  3. Run prune on A and B with the same parameters

The problem comes if (for any reason) a scheduled prune fails to run on one of the storages.

Take this example:

  • A and B are synchronized, same ids and revisions
  • At the same time each week I run the following prune on each storage: prune -keep 1:7 -keep 7:30
  • An internet outage prevents connection to storage B so storage B isn’t pruned

When prune runs the next week it’s running against a different set of revisions on A and B, so the same command (prune -keep 1:7 -keep 7:30) might select different revisions.

Assume A had selected revision 100 as its latest “7:30”
Assume B now selects revision 101 as its latest “7:30”
Assume subsequent prunes on storage B delete the chunks associated with r100

Won’t copy now re-copy r100 and its associated chunks from A to B, only for prune to delete them again when it selects r101 instead, and then repeat that cycle every time I run copy, forever after?

And if so is there a better way to keep them in sync, maybe leveraging 2-step deletion – e.g. on storage B: prune ALL revisions, then run copy, then the actual prune with check -resurrect somewhere in there?

Assume A had selected revision 100 as its latest “7:30”
Assume B now selects revision 101 as its latest “7:30”

This won’t happen. The prune algorithm always starts from the earliest revision (which is never pruned) and then checks subsequent ones in order.


That’s not to say the problem of mismatched revisions can’t cause a yo-yo effect, where Duplicacy prunes one storage, creating holes in the early revision numbers (e.g. the oldest revision), and then re-copies previously pruned snapshots…

I’ve seen this happen before, where the mismatches are so bad that Duplicacy constantly prunes and re-copies the same snapshots over and over. (The fix is to copy both ways, A to B and then B to A, before resuming a prune schedule, making sure to prune both on the same day.)
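To see why copying both ways closes the gaps, here is a minimal sketch that treats each storage as a plain set of revision numbers. It is only a model, not Duplicacy code: the revision numbers are made up, and `copy_missing` is a hypothetical stand-in for what copy does at the revision level (it adds what is missing and never deletes).

```python
def copy_missing(src, dst):
    """Model of a copy: dst gains every revision src has; nothing is deleted."""
    return dst | src

# Hypothetical state after a missed prune: the two storages have drifted apart.
storage_a = {1, 100, 130, 131}
storage_b = {1, 101, 130, 131}

storage_b = copy_missing(storage_a, storage_b)  # copy A -> B
storage_a = copy_missing(storage_b, storage_a)  # copy B -> A

print(sorted(storage_a))       # [1, 100, 101, 130, 131]
print(storage_a == storage_b)  # True
```

Once both sets are identical, the same prune command run on the same day starts from the same input on each side, which is the point of the fix.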

Luckily I rarely run into a situation where one storage fails to get pruned, and because the two storages mostly match, a single failure is unlikely to lead to this, but it can happen.

IMO it’d be kinda nice to have maybe a copy -sync flag that performs a special prune at the end, where it deletes only snapshots on B that don’t exist on A. Then, you wouldn’t have to specify a precise retention policy for storage B, or even have to run prunes that regularly or even on the same day as A.
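Roughly what I have in mind, as a sketch only. No such `-sync` flag exists as far as I know; `sync_plan` and the revision numbers below are hypothetical, and this only works out which revisions would be touched, not how chunks get removed.

```python
def sync_plan(revisions_a, revisions_b):
    """Return (to_copy, to_delete) so that B ends up with exactly A's revisions."""
    a, b = set(revisions_a), set(revisions_b)
    to_copy = sorted(a - b)    # on A but missing from B
    to_delete = sorted(b - a)  # on B but no longer on A
    return to_copy, to_delete

# Hypothetical example: A was pruned on schedule, B missed a prune and a copy.
to_copy, to_delete = sync_plan([1, 100, 130, 131], [1, 101, 130])

print(to_copy)    # [100, 131]
print(to_delete)  # [101]
```

The actual removal would presumably still have to go through prune, since chunks are shared between revisions.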

(Although if the entire prune experience can be fixed, such that revision files are removed/fossilised first - before chunk files are removed/fossilised - aborted prunes would be much less problematic.)


Same.

Right, I think timing is the culprit. gchen said:

but if run on different days, -keep 0:X can result in different “earliest revisions”.
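A small sketch of what I mean, modelling only the -keep 0:X rule (delete everything older than X days) on its own. This is a simplified model and not Duplicacy’s implementation: ages are counted in whole days, a revision exactly X days old is assumed to be kept, and the dates and the helper `survivors_keep_zero` are made up for illustration.

```python
from datetime import date, timedelta

def survivors_keep_zero(revisions, run_day, max_age_days):
    """Model of -keep 0:X alone: drop every revision older than X days on run_day."""
    return {rev: day for rev, day in revisions.items()
            if (run_day - day).days <= max_age_days}

# Hypothetical daily backups: revision n is created on 2024-01-01 plus (n - 1) days.
revisions = {n: date(2024, 1, 1) + timedelta(days=n - 1) for n in range(1, 41)}

a = survivors_keep_zero(revisions, date(2024, 2, 9), 30)   # A pruned on schedule
b = survivors_keep_zero(revisions, date(2024, 2, 11), 30)  # B pruned two days later

print(min(a), min(b))  # 10 12
```

In this model A still holds revisions 10 and 11 after B has dropped them, so the next copy from A to B would bring them back.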

Either way I think your solution – copy A to B then B to A then prune – handles all the exceptions.

Just to add, this is a one-time ‘fix’ for sorting out snapshot holes. Normally I just copy A to B and haven’t had a significant problem since I discovered the issue.

Right.

My thinking is that the added time of copying B to A is negligible, and I only copy every other day, so why not do it as a matter of course? One less problem requiring manual intervention.