-exhaustive implies -exclusive? Corrupted snapshots, can't delete with "No snapshot to delete"

I have hourly backups scheduled.

I started duplicacy prune -exhaustive -a on the same storage the backups go to; it ran for 15 hours nonstop and got rid of over 20 thousand chunks.

After that I ran check. It turned out that all 15 recent backups had missing chunks.

Does the -exhaustive flag imply exclusivity as well?

Edit. There seem to be a lot more revisions affected… check is still running and complaining as it goes.

Edit2. Something is horribly wrong here.

I’m trying to delete the snapshots that fail check:

-log -d prune -storage Rabbit -r 64 -r 74 -r 82 -r 91 -r 480-598 -r 720-723 -r 735-740

Duplicacy enumerates the snapshots, and those do exist in the target storage:

2021-08-06 22:24:41.750 TRACE SNAPSHOT_LIST_IDS Listing all snapshot ids
2021-08-06 22:24:42.119 TRACE SNAPSHOT_LIST_REVISIONS Listing revisions for snapshot obsidian-users
2021-08-06 22:24:44.088 DEBUG DOWNLOAD_FILE Downloaded file snapshots/obsidian-users/1
2021-08-06 22:24:45.251 DEBUG DOWNLOAD_FILE Downloaded file snapshots/obsidian-users/3
2021-08-06 22:24:46.446 DEBUG DOWNLOAD_FILE Downloaded file snapshots/obsidian-users/64
2021-08-06 22:24:47.774 DEBUG DOWNLOAD_FILE Downloaded file snapshots/obsidian-users/68
2021-08-06 22:24:49.208 DEBUG DOWNLOAD_FILE Downloaded file snapshots/obsidian-users/74
2021-08-06 22:24:50.380 DEBUG DOWNLOAD_FILE Downloaded file snapshots/obsidian-users/82
2021-08-06 22:24:51.562 DEBUG DOWNLOAD_FILE Downloaded file snapshots/obsidian-users/91
...
2021-08-06 22:30:47.920 DEBUG DOWNLOAD_FILE Downloaded file snapshots/obsidian-users/745
2021-08-06 22:30:47.921 INFO SNAPSHOT_NONE No snapshot to delete

And that’s it. It does nothing.
What happened here? (I did delete the local cache, no difference)

It is not recommended to run duplicacy prune -exhaustive while there are backups in progress, because new chunks uploaded by these backups may be marked as unreferenced and thus turned into fossils. A check run after that will report missing chunks because it doesn’t look into fossils by default.

This is from the doc for the check command:

Not sure why prune by revisions didn’t delete the specified revisions – can you check the storage name and snapshot id in the .duplicacy/preferences?
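If you want to confirm that those “missing” chunks are actually fossils, you can re-run check with the -fossils option; a rough sketch, assuming the Rabbit storage name from your prune command:

    # Check all snapshots, but also search the fossil collection for any
    # chunk that can't be found instead of reporting it as missing.
    duplicacy -log check -storage Rabbit -a -fossils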

How can I get those fossils resurrected?

This is what ./localhost/0/.duplicacy/preferences looks like:

[
    {
        "name": "Rabbit",
        "id": "obsidian-users",
        "repository": "/Users",
        "storage": "gcd://Backups@Duplicacy",
        "encrypted": true,
        "no_backup": false,
        "no_restore": true,
        "no_save_password": false,
        "nobackup_file": "",
        "keys": null,
        "exclude_by_attribute": true
    }
]

Seems to be correct. However, the prune log indicates that it is running the prune command from the all folder:

Running prune command from /Library/Caches/Duplicacy/localhost/all
Options: [-log -d prune -storage Rabbit -r 64 -r 74 -r 82 -r 91 -r 480-598 -r 720-723 -r 735-740]
2021-08-08 21:08:52.883 INFO STORAGE_SET Storage set to gcd://Backups@Duplicacy
...

And ./localhost/all/.duplicacy/preferences looks like this:

[
    {
        "name": "Rabbit",
        "id": "duplicacyweb",
        "repository": "",
        "storage": "gcd://Backups@Duplicacy",
        "encrypted": true,
        "no_backup": false,
        "no_restore": false,
        "no_save_password": false,
        "nobackup_file": "",
        "keys": null,
        "exclude_by_attribute": false
    }
]

No idea why the id is set to duplicacyweb. The snapshot ID should be obsidian-users.

So, what do we do here?

Reading the manual, I found this:

   -fossils                       search fossils if a chunk can't be found
   -resurrect                     turn referenced fossils back into chunks

I’ll attempt this now. Is this something that should probably be the default? Or maybe there should be a separate command like “heal” or something along those lines to untwist a slight datastore concussion… Maybe not… just a thought.
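For reference, the command I’m about to run looks roughly like this (storage name from my setup, -a to cover every snapshot id):

    # Search fossils for any chunk reported missing, and turn referenced
    # fossils back into normal chunks.
    duplicacy -log check -storage Rabbit -a -fossils -resurrect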

Update 1hr later: it’s resurrecting a metric ton (over 1k and counting) of fossils.

@gchen, I don’t think this is what happened here – I can understand this happening to backups that were made after prune started – but it does not explain why it would manage to corrupt existing old snapshots; that is an issue.

-fossils isn’t the default because it may incur extra lookups for some storages. However, in the case of missing chunks this overhead is probably negligible, so it may make sense to check the fossils as well…

I can only think of two possibilities for how duplicacy prune -exhaustive could affect existing old snapshots: 1) prune doesn’t see the snapshot id (that is, for some reason the listFiles call doesn’t return the subdirectory corresponding to the snapshot id), or 2) the local cache stores different versions of the snapshot files.
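One way to rule both out is to list the snapshots straight from the storage and to wipe the local cache before the next prune; a sketch, assuming the cache sits under .duplicacy/cache in the web UI folder you posted:

    # 1) Confirm the storage itself still lists the obsidian-users revisions
    duplicacy -d list -storage Rabbit -a

    # 2) Remove the local cache so prune can't pick up stale snapshot files
    rm -rf /Library/Caches/Duplicacy/localhost/all/.duplicacy/cache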

Do you still have the log output from duplicacy prune -exhaustive?

Unfortunately no.

I have now noticed that every time I run prune it ends up breaking the datastore, which can later be fixed with check -fossils -resurrect. Maybe it’s related to that strange id in the preferences file in the all folder?

For now I “fixed” the datastore by adding the -id obsidian-users argument to prune instead of -a (it’s the only snapshot id there anyway); when this succeeded, I ran check and it succeeded too.
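Roughly, the sequence that worked; a sketch with the same revision list as before:

    # Prune only the obsidian-users id instead of using -a,
    # then verify the whole storage afterwards.
    duplicacy -log prune -storage Rabbit -id obsidian-users -r 64 -r 74 -r 82 -r 91 -r 480-598 -r 720-723 -r 735-740
    duplicacy -log check -storage Rabbit -a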

So, my datastore is now OK.

Next step: I’m going to run prune -exhaustive again, and if it screws up the datastore again (hopefully not; nothing is worse than on-and-off mystery failures), I will share the log.

OK. It appears that when I explicitly specify -id, the prune succeeds.

What I think happened: prune was failing to delete snapshots after fossilizing their chunks, which made them appear as broken snapshots. After providing the snapshot id explicitly, prune succeeded and I don’t see this issue anymore. (I’ll see how another prune does in a week, after enough snapshots eligible for pruning accumulate.) But I guess that is it.


I have also been running exhaustive prunes for some time, and it seems possible that my current issues may be related to that in a similar way. So I’m wondering: what conclusion are you drawing from the problems you were encountering? I have definitely turned off exhaustive for my weekly prunes, but there seems to have been more to it here, hasn’t there?

Likely not.

The issues I had were mainly due to a combination of three things:

  1. When backing up to Google Drive with one Google account, into a folder shared from another Google account, delete won’t work (though rename will) unless the user is assigned full permissions. This allowed orphan snapshots to be created that referred to fossilized or deleted chunks, on which subsequent prunes and checks would stumble and fail.
  2. When Duplicacy logs that it deleted a snapshot, it’s lying: in reality it just adds it to a list of things to delete later. If it is interrupted, that “later” never happens and all those files remain, causing a great deal of confusion when comparing what the log says with what is actually present in the storage.
  3. Duplicacy cannot move on to the next snapshot after check or prune fails on the current one. This made removing those orphan snapshots a tedious process: run check, see which snapshot it dies on, delete that snapshot manually from storage, repeat. I gave up after 5 or so and rage-deleted the whole block of snapshots (100 or so). Then I ran prune in exclusive exhaustive mode to get rid of the unreferenced chunks (see the sketch after this list); this freed up about 40% of the space at the destination.
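For reference, a rough sketch of that final cleanup (as I understand the flags, -exclusive is only safe while nothing else is reading or writing the storage):

    # Run only while no backups or other prunes are touching the storage:
    # -exclusive deletes unreferenced chunks immediately instead of fossilizing
    # them, and -exhaustive scans every chunk in the storage rather than only
    # those referenced by the deleted snapshots.
    duplicacy -log prune -storage Rabbit -a -exclusive -exhaustive

    # Verify the storage afterwards
    duplicacy -log check -storage Rabbit -a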

Because the permission issue that created the problem with the first snapshot was corrected by giving the user full access (Content Manager or Owner or Admin, something confusingly named along those lines), this never happened again.

To emphasize: the problem was caused by a prune failure; no important snapshots that were supposed to stay were affected. So no data loss occurred (minus those snapshots I nuked manually after running out of patience during cleanup).

A two-step snapshot deletion process would have prevented this from happening in the first place.
