The ol' Missing Chunks post

I’m sure you get this a lot; judging by my Google searches it seems that way, but I cannot clear this issue.

I read the support post that said to delete the cache first and foremost; that worked, and all chunks passed the next check. But as soon as the next backup ran, then the next check, the problem was back.

This time I deleted all the referenced revisions and deleted the cache, and again the next check was happy. Then the next backup, then a check, and the issue was back.

How do I overcome this? It’s becoming frustrating.

What is your backend?

Are you saying you

  • clean cache
  • run check: OK
  • run backup
  • run check: not OK?

Questions:

  • Is there only a backup in the third step, or also a prune?
  • What happens if you add -fossils to your check command? Does the check pass then? (See the command sketch after this list.)
  • Is this the only machine that backs up to the same snapshot ID?
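
For reference, a rough CLI equivalent of that sequence is sketched below. The repository path is a placeholder, the cache path is the CLI location (the Web UI keeps its cache under its own .duplicacy-web folder), and the flags are illustrative rather than exact instructions.

    cd /path/to/repository           # placeholder repository path
    rm -rf .duplicacy/cache          # clean the local cache (on Windows, just delete the folder in Explorer)
    duplicacy check -a               # first check: passes
    duplicacy backup -stats          # run a backup
    duplicacy check -a               # second check: reports missing chunks
    duplicacy check -a -fossils      # same check, but also search fossils left behind by prune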

Hello,

So my backend is Windows 10 Pro, using the Web UI version.
You have the steps exactly correct, and in the third step there is only a backup, no prune; I have prune turned off while this issue is unresolved. I did manually run prune once during one of my many fix attempts, to see if removing any orphans would help, but alas it did not.
I will add the -fossils flag, try again, and let you know.
Lastly, yes, this is the only machine that backs up to that snapshot ID. I back up 2 machines to Google Drive using Duplicacy; the other machine has no issues and no missing chunks.

Actually, I want to add a step to the process you listed:

  • remove bad revisions (see the CLI sketch after this list)
  • clean cache
  • run check: OK
  • run backup
  • run check: missing chunks
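
(For anyone reproducing this from the CLI, the first step would look roughly like the sketch below; the snapshot ID and revision number are placeholders, and -exclusive assumes nothing else is writing to the storage while it runs.)

    duplicacy prune -id example-id -r 2 -exclusive   # remove the bad revision and its now-unreferenced chunks
    rm -rf .duplicacy/cache                          # clean the local cache
    duplicacy check -a                               # re-check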

So adding -fossils didn’t work; I have some screenshots for you. It says missing chunks again on the schedule page (and I got a corresponding email telling me the same), but the storage page says all chunks are there. Is this discrepancy normal?

https://tinyurl.com/yjgkddt4

https://tinyurl.com/yg9u27nr

Now I just want to add (reiterate) that my other machine never complains about missing chunks, so as much as I wish this were a false positive, I cannot live with that. There is certainly an issue to be fixed…

Can you elaborate on this one? Do you back up to a Windows 10 Pro machine? Via SMB? FTP? SFTP? Some other storage server?

In this screenshot the destination says Cloud; what is that cloud?

Does it back up to the same destination?

Google Drive, to answer most of your questions; Google Drive on both PCs. I even found another post that mentioned it could be the machine’s hard drive if there is a drive issue, so I moved the temp folder to another drive, deleted the revisions again, re-checked, re-uploaded, and I just keep getting the same result.

So both machines back up to the same Google Drive account.

  • Do they both use the same datastore?
  • If so, can you confirm that the snapshot IDs they are using are different?
  • When the check fails, for which revisions does it fail? Newly created ones or older ones?

Regardless of the last two questions: if multiple machines are backing up to the same Google Drive destination, you may be hitting this issue: Repeatable Chunk check failure with google drive (the method to fix it is in the same thread).

By data store do you mean folder?

I created duplicacy-machine1
and duplicacy-machine2

Each machine backs up to its own folder

The snapshot IDs would be in different locations, no?

It’s always the newer revisions; I keep deleting them, re-uploading, and getting the same result. I have (as you saw) 4 backup jobs; I’m backing up 4 different drives into their own folders on Google Drive. Each backup job has its own revision error. Drive M, for example, always has an issue with revision 6, the most recent one it tries to make.

Whereas drive P only has 2 revisions, and revision 2 is always the bad one: delete it, delete the cache, re-upload, and it’s corrupted again.

Right. These four jobs upload data to a single duplicacy datastore, named Cloud.

In other words, there is a duplicacy data folder somewhere in your My Drive that contains the following folder structure:

chunks/
    some stuff here
snapshots/
    one/
    two/
    three/
    four/
config

The four subfolders under snapshots correspond to your backup jobs/snapshot IDs. Right? If so, then the statement that each of these backups goes to its own folder is misleading: you are backing up everything into a single “folder” (aka a duplicacy datastore). Chunks from all jobs end up in the chunks/ folder there.

And if you do that from multiple computers then you absolutely can hit that concurrency artifact from my previous comment.

Connect your Google account to rclone, run rclone dedupe, and then try to reproduce the issue again. If it starts working normally, then that’s what the problem was.
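
Roughly along these lines; the remote name (gdrive) and the datastore path are placeholders, since I don’t know what your actual folder is called:

    rclone config                             # connect Google Drive as a remote, e.g. named "gdrive"
    rclone lsd gdrive:                        # sanity check: list the top-level folders on the drive
    rclone dedupe gdrive:path/to/datastore    # find and resolve duplicated names (interactive by default)
    # non-interactive alternative: keep the newest copy of each duplicate
    # rclone dedupe --dedupe-mode newest gdrive:path/to/datastore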

(Another possibility is that you are keeping your duplicacy datastore in a Google shared folder, which has limits on the number of files; but seeing the sheer number of chunks reported, you are way past that limit, which means your datastore is in your My Drive folder and not a shared folder, and therefore this is not a concern.)


In addition to what @saspus said, there may be another concurrency issue…

Because you have a relatively large backup storage, with a lot of chunks, doing a check operation while backups are running (or vice versa) may lead to a false positive.

The first thing check does is list all chunks in the storage, and if this step takes a long while, a backup could still be uploading new chunks, and eventually a snapshot, after that listing was taken; the check then reports those freshly uploaded chunks as missing even though they are there.

I’ve run into this issue a few times before, with a large Vertical Backup on a sluggish SFTP server, where the listing of chunks takes half an hour or more.

Something to consider…

Edit: If this is the case, this just means the check operation is inconsistent and a fresh run - without backups running - may show all chunks present.
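
In other words, something along these lines, run while no backup or prune job is active (the path is a placeholder):

    cd /path/to/repository
    duplicacy check -a -fossils    # if the earlier failure was only a listing race, this should now report all chunks present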


As someone who has dealt with (and is still dealing with) this:

Since there is sadly no one-click fix, I tend to purge the storage from both ends and set it up again, or create a new backup entirely.

Not a fix for everyone, but I have 2 servers this tends to creep up on, and I just gave up fixing it in the end. Nothing wrong with Duplicacy, but I just can’t deal with the 10 steps for it to “maybe” resolve it.