Fix missing chunks

snairolf · 29 July 2020 15:06

As per the response in “FATAL DOWNLOAD_CHUNK Chunk (w/ Wasabi)”, the guide should point out that anyone who deletes snapshots should also delete the .duplicacy/cache directory to make sure the fix works.
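A minimal sketch of that cleanup step, assuming the cache sits at `.duplicacy/cache` under the repository root as described here. The helper name `clear_duplicacy_cache` is hypothetical and not part of Duplicacy.

```python
import shutil
from pathlib import Path


def clear_duplicacy_cache(repo_root: str) -> bool:
    """Delete <repo_root>/.duplicacy/cache if it exists.

    Returns True if a cache directory was removed, False otherwise.
    The function name is hypothetical; only the cache path comes
    from the thread.
    """
    cache_dir = Path(repo_root) / ".duplicacy" / "cache"
    if cache_dir.is_dir():
        # Remove the cache only; preferences and other files
        # under .duplicacy are left untouched.
        shutil.rmtree(cache_dir)
        return True
    return False
```

Calling `clear_duplicacy_cache("/path/to/repository")` after deleting snapshot files from the storage would remove the local copies as well.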

gchen · 29 July 2020 20:27

I’ve updated the guide. Thanks for pointing it out.

saspus · 1 August 2020 00:45

Can this please be automatic? I.e., if I run duplicacy prune -r 1000-1003, I’d expect Duplicacy to manage the cache accordingly and keep it up to date. Or simply nuke it for me.

I knew about this and yet wasted a few minutes today with this issue again… I would not expect users to go and read the documentation; they will panic and go create a new support topic…

gchen · 1 August 2020 23:01

In the current implementation, the prune command does delete the copy from the cache when deleting a snapshot file, but it only does so in the cache under the current repository. It can’t do it for other repositories on the same computer or on a different one.

saspus · 1 August 2020 23:57

Oh, you are absolutely right. Now that I think about it, that’s exactly what happened. Maybe Duplicacy should annotate the storage with which client last performed a prune, and clients would distrust the cache if it wasn’t them?

gchen · 2 August 2020 00:47

I think the solution is to compare the timestamp of the cached copy with that of the file in the storage. However, due to an oversight in the design, the backend API doesn’t return the modification times when listing files in the storage (although most storages should support it).
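Purely as an illustration of that idea, the comparison could look like the sketch below. It assumes a storage modification time is available, which the current backend API does not provide, and the function name `cached_copy_is_stale` is hypothetical.

```python
from datetime import datetime
from typing import Optional


def cached_copy_is_stale(cached_mtime: datetime,
                         storage_mtime: Optional[datetime]) -> bool:
    """Decide whether a cached snapshot file should be distrusted.

    ASSUMPTION: storage_mtime is the modification time reported by the
    storage, or None when the file no longer exists there (for example
    because another client pruned it). Duplicacy does not expose this
    today; the sketch only illustrates the comparison.
    """
    if storage_mtime is None:
        # The file is gone from the storage, so the cached copy is obsolete.
        return True
    # A newer file in the storage means the cache was written before it.
    return storage_mtime > cached_mtime
```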

Christoph · 30 September 2020 19:44

Where do I find that file in a duplicacy-web install (on linux)?

gchen · 30 September 2020 20:35

Those preferences files are auto-generated in the web GUI so it is not recommended to modify them. If you want to change the repository id (which is called a backup id in the web GUI), just create a new backup with a new backup id.

Christoph · 30 September 2020 20:38

Is there a way to duplicate and modify an existing backup? In order to follow the above instructions I obviously also need to use the same filters…

gchen · 30 September 2020 20:43

You can edit ~/.duplicacy-web/duplicacy.json directly – find the backup in computers -> repositories and then change the id.

Christoph · 30 September 2020 20:51

Changing that doesn’t change the backup ID in the UI. Will it still work?

gchen · 30 September 2020 20:59

Forgot to mention that you’ll need to restart the web GUI for the changes in duplicacy.json to take effect. Better yet, edit duplicacy.json while the web GUI is not running; otherwise your changes may be overwritten.
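A hedged sketch of that edit, to be run while the web GUI is stopped. Only the `computers -> repositories -> id` path comes from this thread; the assumption that both levels are JSON lists of objects is a guess, and the helper names are hypothetical. It saves a `.bak` copy before overwriting the file.

```python
import json
from pathlib import Path


def rename_backup_id(config: dict, old_id: str, new_id: str) -> int:
    """Rename a backup id inside an already-parsed duplicacy.json.

    ASSUMPTION: "computers" is a list of objects, each with a
    "repositories" list whose entries carry an "id" field.
    Returns the number of entries changed.
    """
    changed = 0
    for computer in config.get("computers", []):
        for repo in computer.get("repositories", []):
            if repo.get("id") == old_id:
                repo["id"] = new_id
                changed += 1
    return changed


def rename_backup_id_in_file(path: str, old_id: str, new_id: str) -> int:
    """Load the JSON file, rename the id, and write it back if it changed."""
    config_path = Path(path).expanduser()
    original = config_path.read_text(encoding="utf-8")
    config = json.loads(original)
    changed = rename_backup_id(config, old_id, new_id)
    if changed:
        # Keep the untouched original next to the edited file.
        backup_path = config_path.with_name(config_path.name + ".bak")
        backup_path.write_text(original, encoding="utf-8")
        config_path.write_text(json.dumps(config, indent=4), encoding="utf-8")
    return changed
```

For example, `rename_backup_id_in_file("~/.duplicacy-web/duplicacy.json", "old-id", "new-id")` returns 0 and leaves the file alone if no entry matches.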

Christoph · 30 September 2020 21:01

Can I restart the web-ui while a backup job is running?

gchen · 1 October 2020 00:47

Here is my reply from the other thread earlier today:

The CLI can be terminated any time and it shouldn’t leave any half-uploaded files on the cloud storage server, if the server behaves properly, because the content length is always set and the server should never store an incomplete chunk file shorter than the content length. OneDrive for Business is an exception but we’ve fixed that in the latest CLI release by using a different upload API.

For non-cloud storages like SFTP and local disk, the CLI uploads to a temporary file first and then renames the temporary file once the upload completes. Aborting should not cause any partial upload.
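The temporary-file-then-rename pattern can be sketched for a local disk as follows. This is a generic illustration in Python, not Duplicacy's actual code (the CLI is written in Go), and the name `atomic_write` is hypothetical.

```python
import os
import tempfile


def atomic_write(path: str, data: bytes) -> None:
    """Write data to path so readers never observe a partial file.

    The data goes to a temporary file in the same directory first;
    only a fully written file is renamed onto the final name.
    """
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
            tmp.flush()
            os.fsync(tmp.fileno())  # push the bytes to disk before renaming
        # os.replace is atomic when source and target share a filesystem.
        os.replace(tmp_path, path)
    except BaseException:
        # On any failure or interruption, drop the temporary file so the
        # final name never points at incomplete content.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise
```

If the process is killed before the rename, the worst case is a leftover temporary file; the final name either does not exist yet or still holds the previous complete content.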

Christoph · 1 October 2020 06:15

So are you saying that restarting the web-ui will stop the CLI but it doesn’t matter?

BTW: you can quote text from other topics/threads. That will create links between those topics.

Christoph · 1 October 2020 18:01

So I just waited for the backup to finish and then edited duplicacy.json, then restarted the web-ui. The new backup-ID showed up in the ui, and the backup went through without problems. But I don’t think it worked as intended because it uploaded tons of files that were supposed to be excluded (and which were excluded before renaming the ID). Might it be that renaming the repo in the .json file results in duplicacy displaying filters in the web-ui but not actually applying them?

JarnoP · 3 December 2020 09:10

There is also a case with B2 where the chunk exists, but there are multiple versions of it and the latest one is zero size. I did not find instructions on what to do in this case.

SOLVED:
The solution seems to be to log in to B2, locate the “missing” chunks and delete the zero-sized versions. I have no idea why those were created in the first place, but I suspect an interrupted backup. I am just not sure whether this method guarantees that the chunk content is still valid…
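To spot such files in a version listing, something like the sketch below might help. It works on plain records rather than calling B2; the `FileVersion` type and its field names are hypothetical and would have to be filled from whatever listing you export.

```python
from collections import defaultdict
from typing import Dict, List, NamedTuple


class FileVersion(NamedTuple):
    name: str         # file name in the bucket (hypothetical field)
    size: int         # size in bytes
    uploaded_at: int  # any increasing upload timestamp


def zero_size_latest(versions: List[FileVersion]) -> Dict[str, FileVersion]:
    """Return the newest version per file name when that version is empty
    and at least one older, non-empty version exists."""
    by_name = defaultdict(list)
    for version in versions:
        by_name[version.name].append(version)

    suspects = {}
    for name, group in by_name.items():
        group.sort(key=lambda v: v.uploaded_at)
        latest = group[-1]
        # Flag only files where deleting the empty version would
        # uncover real content underneath.
        if latest.size == 0 and any(v.size > 0 for v in group[:-1]):
            suspects[name] = latest
    return suspects
```

This only identifies candidates; whether the older version's content is valid still has to be verified separately.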

noah.e.miller · 8 February 2021 04:50

I have been using Duplicacy for a couple years to back up several repositories to the same storage. All the repositories are on my computer and nobody else backs up to the storage. A couple days ago, I started getting an error when I run the check command (normally I don’t include -fossils but you’ll see why I’m including it in this case):

$ duplicacy check -a -fossils
Repository set to /Users/me
Storage set to b2://bucket-name
download URL is: https://f002.backblazeb2.com
Listing all chunks
17 snapshots and 1252 revisions
Total chunk size is 477,541M in 121572 chunks
All chunks referenced by snapshot usr-local at revision 1 exist
All chunks referenced by snapshot usr-local at revision 32 exist
...
All chunks referenced by snapshot Documents-other at revision 158 exist
All chunks referenced by snapshot Documents-other at revision 194 exist
Chunk aafaf71f51fa153647ad4266668c63c808439e3162b8a1d4888a93201549f425 can't be found

I checked the storage; the chunk is not in the “aa” directory of the “chunks” directory of the storage. Grepping for the chunk id in all the repositories’ log directories, I find:

Marked fossil aafaf71f51fa153647ad4266668c63c808439e3162b8a1d4888a93201549f425
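For anyone repeating this search, here is a rough sketch of the two steps: deriving the storage subdirectory from the chunk id and scanning log directories for it. The helper names are hypothetical, and the two-character directory prefix is inferred from the “aa” directory mentioned above.

```python
from pathlib import Path
from typing import Iterable, List, Tuple


def chunk_directory(chunk_id: str) -> str:
    """Storage subdirectory expected to hold the chunk.

    ASSUMPTION: chunks are grouped under "chunks" by the first two
    characters of their id, as in the "aa" example above.
    """
    return f"chunks/{chunk_id[:2]}"


def find_chunk_mentions(log_dirs: Iterable[str],
                        chunk_id: str) -> List[Tuple[str, int, str]]:
    """Return (file, line number, line) for every log line naming the chunk."""
    hits = []
    for log_dir in log_dirs:
        for log_file in sorted(Path(log_dir).rglob("*")):
            if not log_file.is_file():
                continue
            with log_file.open("r", encoding="utf-8", errors="replace") as handle:
                for line_no, line in enumerate(handle, start=1):
                    if chunk_id in line:
                        hits.append((str(log_file), line_no, line.rstrip("\n")))
    return hits
```

Passing each repository's log directory to `find_chunk_mentions` gives the same information as grepping, with the file and line number attached.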

The explanation on this page says

This is because another ongoing backup that was seen by the prune command may reference any of these chunks. To be safe, the prune command will turn them into fossils, which can be either permanently removed if no such backup exists, or turned back into normal chunks otherwise.

However, I don’t see a corresponding log entry saying the chunk was permanently removed. (In contrast, the logs mention other chunks that have been permanently removed.) So I have two issues:

1. If the logs say the chunk was marked as a fossil, but they don’t say it has been removed, shouldn’t it still exist?
2. How can I determine which revision the missing chunk belongs to, so I can delete the snapshot as described above? The error message, as I have shown, does not give me a revision number.

I’m running CLI version 2.6.1 (ACEF01) on Mac OS 10.14.6 Mojave. Thanks in advance for your attention.

gchen · 8 February 2021 16:22

What is your B2 lifecycle setting? See “Should I disable Backblaze B2 Cloud Lifecycle Settings?”

If it is not set to keep all versions, fossils may be deleted automatically by B2.

noah.e.miller · 8 February 2021 17:04

Thank you. My B2 bucket lifecycle setting was not keep all versions, and I corrected that. Just so I understand how this is relevant to fossils: Duplicacy marks a chunk as a fossil by moving/renaming it, but B2 doesn’t support move/rename, so a workaround is used that may lead to errors if B2 is not told to keep all versions – is that correct?

My other question is, what’s the easiest way to tell which revision(s) have missing chunks so I can delete them? The check command isn’t telling me.