Check is *very* unhappy after adjusting prune, all revisions now have chunks that don't exist

My check command used to be happy; the last check of my storage succeeded 6 days ago.

After that, I decided I wanted to significantly reduce the number of revisions I keep, to speed up the slow parts of Duplicacy that scale with the number of revisions, so I adjusted my prune schedule from

-keep -keep 2:60 -keep 1:7 -a -threads 5

to

-keep 14:360 -keep 4:60 -keep 2:30 -keep 1:7 -a -threads 5 -exclusive

and disabled my backup schedules until it finished. It took a while and cleaned up a lot of revisions. The prune has now finished, but when I ran a check afterwards, it was very unhappy.
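For readers who don't have the retention syntax memorized: in Duplicacy, -keep n:m means "for revisions older than m days, keep one revision every n days" (n = 0 would delete them entirely), and the options must be listed from the largest m down to the smallest. As a sketch, the new schedule above in full CLI form (the duplicacy prune prefix is my assumption; the post only shows the option string):

# -keep <n>:<m>  =>  for revisions older than <m> days, keep one revision every <n> days
duplicacy prune -keep 14:360 -keep 4:60 -keep 2:30 -keep 1:7 -a -threads 5 -exclusive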

As far as I can see in the log, check now reports chunks that “do not exist” for every single revision in every snapshot. For some revisions there are ~200 such messages in the log, for others ~1000, but every single revision now seems to have chunks that don’t exist.

Why? How can that be fixed?

Are you sure nothing else was touching the datastore? (Why this fascination with the -exclusive flag? Anyway…)

What is your target?

One possibility is that prune was unable to delete the snapshot files (e.g. it was interrupted, or failed for some other reason) and now your datastore is full of ghost snapshots.

You would need to run check with -persist, collect a list of bad snapshots, delete them manually, and then run an exhaustive prune to finish cleaning up.
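A rough sketch of that sequence with the CLI, assuming it is run from the backed-up repository and that nothing else touches the storage during step 3 (snapshot and revision names are placeholders):

# 1. Keep checking past the first error so every bad revision gets reported
duplicacy check -a -persist

# 2. In the storage, manually delete the revision files of the bad snapshots,
#    i.e. snapshots/<snapshot-id>/<revision-number>  (placeholder path)

# 3. Drop the chunks no longer referenced by any remaining revision
duplicacy prune -exhaustive -exclusive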

Are you sure nothing else was touching the datastore? (Why this fascination with the -exclusive flag? Anyway…)

Yes, I’m sure nothing else was touching the data. And I tried it without the -exclusive flag first, but it was very slow; it just fossilized stuff forever. So then I switched to -exclusive, which was a bit faster (I assume -exclusive needs only half as many API calls, so it should be about twice as fast with a “slow” backend).
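For context, and as my own summary rather than something stated in this thread: without -exclusive, prune uses a two-step fossil-collection scheme so that concurrent backups stay safe, while -exclusive skips that protection:

# default (safe):   chunk -> renamed to chunk.fsl (fossil) -> permanently deleted by a
#                   later prune, once new backups confirm nothing still references it
# with -exclusive:  chunk -> deleted immediately (assumes nothing else uses the storage)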

The prune certainly was interrupted many times, since I need to restart my PC quite often, which means prune often doesn’t finish in one run.

What are “ghost snapshots”?

Target is Dropbox.

collect a list of bad snapshots, delete them manually

All snapshots and revisions are bad according to check; there isn’t a single one it’s happy with, as far as I can see.

That’s the problem.

Prune first deletes chunks, and then, at the very end, snapshots. If it is interrupted, the snapshots remain, but now they point to missing chunks. They are ghosts: supposed to be removed, but were not.
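In storage terms, a sketch assuming Duplicacy’s usual layout (the snapshot name and revision number here are just examples): each revision is a small file under snapshots/, and a ghost is such a file whose chunks under chunks/ are already gone:

snapshots/Mybox/496    <- revision file left behind by the interrupted prune (a “ghost”)
chunks/65/f9d142...    <- chunks it references were already deleted or fossilized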

I doubt that. Even if you had accidentally pruned all revisions, there would still have been one left. Oh wait, you’ve used the -exclusive flag. Then I’m not sure; all safety is off.

But if you pruned as intended, only the snapshots that needed to be deleted are affected. Delete them manually and run an exhaustive prune.

I would not use more than 1 thread with Dropbox. They throttle aggressively, so more threads only increase processing time. Dropbox is also one of the slowest remotes among the *drive services. You’ll have to wait quite a long time, especially during the exhaustive prune, which also should not be interrupted. You may want to run it on a cloud instance instead of your local machine if the latter is prone to frequent restarts.
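Concretely, the cleanup run might look like this, as a sketch of that advice rather than an exact prescription:

# single thread to stay under Dropbox's throttling; -exclusive only because
# nothing else is touching the storage; avoid interrupting this run
duplicacy prune -exhaustive -exclusive -threads 1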


The solution for avoiding this apparent state corruption after an interrupted prune has been suggested a few times, including here: Zero size chunks: how to solve it once and for all? - #2 by saspus

Now we wait for someone to donate their time implementing it.


This -keep at the beginning without parameters was just a ctrl-c ctrl-v error here in the post, right?

Prune first deletes chunks, and then, at the very end, snapshots. If it is interrupted, the snapshots remain, but now they point to missing chunks. They are ghosts: supposed to be removed, but were not.

But the prune eventually succeeded, so why would those snapshots not have been removed once it finished?

I doubt that. Even if you had accidentally pruned all revisions, there would still have been one left. Oh wait, you’ve used the -exclusive flag. Then I’m not sure; all safety is off.

I checked again, and it seems you are right: every few revisions there does seem to be one that check does not complain about. So it does indeed just seem to be “ghost snapshots”, as you called them.

But it’s something like 1000 “ghost snapshots”. How can I efficiently fix (remove) those?


Yeah, that was just a copy-paste error.

Unfortunately, you may have a horrid time of it:

You could try check -persist but you’ll probably run into the same issues I had.

My advice is to look through your prune-*.log files and find any and all errors similar to the ones below.

INFO SNAPSHOT_DELETE Deleting snapshot Mybox at revision 496
INFO SNAPSHOT_DELETE Deleting snapshot Myserv at revision 2665
ERROR CHUNK_DELETE Failed to fossilize the chunk 65f9d142567fb71a3e6da7fb6af68b16424ea00c1e58fb66a2e86142374b967d: rename /duplicacy/chunks/65/f9d142567fb71a3e6da7fb6af68b16424ea00c1e58fb66a2e86142374b967d /duplicacy/chunks/65/f9d142567fb71a3e6da7fb6af68b16424ea00c1e58fb66a2e86142374b967d.fsl: no space left on device

This is an example of what happens if you run out of disk space (other interruptions to prune may have different errors - it matters not).

Notice that revision 496 of Mybox and revision 2665 of Myserv still remain in /duplicacy/snapshots/ - these have to be deleted manually. Ignore the chunk IDs - you can clean those up later with -exhaustive.

Since check -persist may not give you a full list, your prune logs should hopefully show which ghost snapshots need to go.
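A sketch of how to pull that list out of the logs, assuming standard grep/sort and that the logs sit in the current directory (the SNAPSHOT_DELETE line format is taken from the excerpt above):

# every revision the prune runs claimed to be deleting
grep -h "SNAPSHOT_DELETE Deleting snapshot" prune-*.log | sort -u

# any revision from that list whose file still exists in the storage under
# snapshots/<snapshot-id>/<revision-number> is a ghost and can be deleted manually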
