File not found: how do I start from scratch?

When I list my current revisions, I can see new revisions are still being added every day. When I try to prune, to remove some older ones, I get an error:

[...]
Deleting snapshot secretslug at revision 100
Deleting snapshot secretslug at revision 107
Deleting snapshot secretslug at revision 158
Chunk 39fc2e096db01751a4f8e67848bf3d41b53448890d5fdf2a44bf54a6bd54bb03 is a fossil
Failed to download the chunk 39fc2e096db01751a4f8e67848bf3d41b53448890d5fdf2a44bf54a6bd54bb03: URL request 'https://f123.backblazeb2.com/file/AnotherSecretSlug/chunks/39/fc2e096db01751a4f8e67848bf3d41b53448890d5fdf2a44bf54a6bd54bb03.fsl' returned 404 File with such name does not exist.

Iā€™ve checked on the Backblaze server and the referenced file looks different from the others. Itā€™s marked with (3)*, and when expanded I see 3 files with the same name but different dates. The most recent one is 0 bytes. None of these files actually has the .fsl extension referenced above, though.

Iā€™ve followed some guides and forum posts here, trying to resolve it. But no matter what I do, it keeps coming back with that 404. Iā€™ve wasted enough time, and am now of the mind to just clear out the B2 bucket and start afresh.

But Iā€™m not sure how to do that. Delete files from the bucket, delete the cache directory, and run duplicacy backup again?

There should not be any (*) files.

Is this bucket being synced by any other sync software, by any chance?

The prune may be failing due to prior interrupted prunes that deleted the chunks but left behind now-ghost snapshots in the storage.

This is what I would do:

  • ensure no sync software is touching that bucket
  • delete all objects that have extra symbols like ( or ) in the names
  • run check -persist, and get the list of all bad snapshots (those that should have been deleted but were not)
  • manually delete the snapshot files corresponding to those bad snapshots
  • run prune with the -exhaustive flag to delete orphaned chunks (see the sketch below)
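
Roughly, that sequence looks like this (a sketch only; the snapshot id and revision number are placeholders taken from your log, substitute whatever check actually reports):

  # 1. Identify bad snapshots; -persist makes check keep going past errors
  duplicacy check -persist

  # 2. Delete the snapshot file for each bad revision directly in the storage,
  #    e.g. revision 158 of snapshot "secretslug" is the object snapshots/secretslug/158

  # 3. Remove chunks that no longer belong to any snapshot
  duplicacy prune -exhaustive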

Starting over is a bad idea: unless you fix the root cause, how many times are you going to be starting over?

Thanks for the comment!

No other sync software is touching the bucket. I vaguely recall that at one point Duplicacy was interrupted mid-backup by a power outage, which would also explain why this is happening in two buckets. Perhaps that is what caused this situation?

Do I delete all objects with extra symbols on the storage destination, or in the local (what I guess could be called a) cache? Or both?

Upon closer inspection, the characters arenā€™t in the file name but rather in the Backblaze interface. Iā€™ve found another one.

[screenshot of the Backblaze web interface]

Am I meant to be opening every folder to find these files? šŸ™‚

Oh. This is something else. This looks like some sort of concurrency issue.

No no, they are in the name. After a space. But the B2 UI is just as idiotic as their API.

You can verify with some other tool, like Cyberduck.

See, the object names are the same (sans suffix), but one is invalid ā€” zero size.

Backblaze should never have retained a zero-sized object: an objectā€™s size is made known before it is uploaded. It looks like something went wrong there on their side.

This should not matter. Backblaze must either retain a fully uploaded chunk or delete a partially uploaded one. Ideally. If everything works correctly there. Which sometimes it doesnā€™t. See above.

Iā€™m not sure about this. Perhaps the better approach here is to delete the zero-sized files and rename the wonky-named ones to take their place (e.g. with Cyberduck or rclone; see the sketch below). But where is the guarantee that the wonky one is actually correct?
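
If rclone is an option you would not even need a mount; something along these lines could do it (a sketch only, assuming an rclone remote named b2, the bucket name from the log above, and that the zero-sized copy is always the bad one):

  # List, then delete, only zero-sized objects under chunks/
  # (sizes need the B suffix; bare numbers are treated as KiB by rclone's filters)
  rclone lsl --max-size 0B b2:AnotherSecretSlug/chunks
  rclone delete --max-size 0B b2:AnotherSecretSlug/chunks

  # Server-side rename of a wonky-named chunk to the name duplicacy expects
  # (both object names here are made up for illustration)
  rclone moveto "b2:AnotherSecretSlug/chunks/39/badname" "b2:AnotherSecretSlug/chunks/39/goodname"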

Iā€™m wondering if you are backing up from multiple machines/multiple threads to the same bucket and B2 API canā€™t handle concurrency properly?

Iā€™m not sure what the best approach here is, but since something is definitely screwed up, I would do the following.

To triage:

  • ask Backblaze what happened with that object
  • look in the duplicacy logs to see what was supposed to be happening with that object

To fix:

  • delete all zero-sized chunks and wonky-named chunks
  • run check -persist and collect the bad revisions
  • delete those revision files manually (under the snapshots/snapshot-id folders; example below)
  • make sure check now passes
  • run prune -exhaustive to get rid of chunks orphaned by the snapshots deleted two steps ago
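
For the manual deletion step, continuing with the same hypothetical rclone remote and the names from your log, that would be something like:

  # Drop revision 158 of snapshot id "secretslug" from the storage
  rclone deletefile b2:AnotherSecretSlug/snapshots/secretslug/158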

For the future:

  • consider switching to another storage provider, or at least to using B2 via the S3 API. S3 is the de facto industry standard and has a higher chance of working properly, since many more people use Backblaze through it. (Or maybe not; I donā€™t know. Itā€™s also possible that B2 bugs and S3 bugs stack together and it would be even worse.) What a mess. Itā€™s not the first time B2 has screwed up.
  • and if you are already using S3 per the above, maybe switch to B2 on the opposite logic: Backblaze invented the B2 API for their own infrastructure and has been running it longer than their S3-compatible API, so perhaps it works better?

Ok, so there are no files that have odd characters in their name. Itā€™s just the Backblaze web interface that marks a filename when there are multiple files with the same name.

Nope, that does not work. With the Backblaze B2 protocol selected, the Cyberduck interface does not show multiple instances of the same file; only one is shown.

Iā€™ve now inspected ā€˜chunk foldersā€™ 00-4F. Super tedious, I might add.

Wherever the Backblaze web interface showed multiple files with the same name and one of them was 0 bytes, I deleted the 0-byte file. After a while I started paying attention to the dates; they were all from 3-Dec-2023.

In addition, Iā€™ve found 4 instances where Backblaze showed multiple files with the same name that also had the same size, date, and timestamp. Iā€™ve noted down those folders, but have not yet removed either copy. Three of those were dated 18-Oct-2023, one was dated 21-Oct-2023.

Nope. Every day at the same time, my fileserver kicks off a cron job that triggers the duplicacy backup command. This is the only place this data exists, and Iā€™ve never backed it up from any other machine.

Sharing all of the above with you mid-process to see if anything triggers further thoughts that may cut down on the effort I need to put into this. Although Iā€™d be happy starting with a clean slate, Iā€™m now invested in trying to figure this out. Really appreciate your help in trying to get this back to a working state.

Hmm. Maybe this is how the Backblaze UI shows different versions of an object? Maybe there is a setting somewhere like ā€œhide versionsā€ or something like that? Every time a chunk is renamed, a new ā€œversionā€ is created; maybe thatā€™s what it shows? Very strange.

Letā€™s ignore the Backblaze UI for now; Duplicacy sees what Cyberduck sees, so if Cyberduck sees just one version of each file and none of them are zero-sized, perhaps it was a red herring.

We should then be able to just run check -persist to identify the affected revisions, and clean them up manually per the ā€œto fix:ā€ section in my previous comment.

I figured what Iā€™d do is keep running duplicacy check -persist and have it find every file it wasnā€™t happy with, then go in and delete that file if there were multiple versions of it, and start again.

I did that once, and after that it just ran through.

Iā€™ve then run duplicacy prune -keep 0:360 -keep 30:180 -keep 7:30 -keep 1:7 to get the archive cleaned up, which worked. I subsequently ran duplicacy prune -exclusive, which deleted a whole bunch of stuff.
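
For anyone reading along: each -keep n:m option means ā€œkeep one revision every n days for revisions older than m daysā€, an n of 0 deletes them all, and the options must be ordered by decreasing m. So that command reads as:

  # delete everything older than 360 days, keep one revision per 30 days
  # beyond 180 days, one per 7 days beyond 30 days, one per day beyond 7 days
  duplicacy prune -keep 0:360 -keep 30:180 -keep 7:30 -keep 1:7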

And duplicacy list | wc -l now results in the same # of lines as all my other backups. So Iā€™m happy and accepting that this issue is now solved!

Thanks a lot for your help. I would not have done this had it not been for the guidance!


I meant for you to let it finish and only delete the snapshot files that had at least one bad chunk, then follow with an exhaustive prune.

But what you did is better, albeit more labor intensive (there could potentially be many of those).

If you deleted the bad file, the subsequent check should have failed. Since it succeeded, perhaps you renamed the good one? Iā€™m not sure what happened there, but Iā€™m glad it works now!

This is a bug that was fixed in CLI 2.3.0: https://github.com/gilbertchen/duplicacy/commit/1f9ad0e35c3fb24118d746612d4fad33626228be

This is correct. The chunk in question was marked as a fossil to be deleted later. Our B2 backend does that by creating a zero-byte file to ā€˜hideā€™ the original version, so there was nothing wrong on Backblazeā€™s side.
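
You can see the mechanism with the b2 command line tool, something like this (an illustration; the bucket and path are the placeholders from the log above):

  # Each hidden chunk shows a zero-byte "hide" action next to the original "upload" version
  b2 ls --long --versions AnotherSecretSlug chunks/39/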


Thanks for the comment, @gchen.

Iā€™m running version 3.1.0 (27FF3E) though?

Sorry, I meant CLI 3.2.0: Release Duplicacy Command Line Version 3.2.0 Ā· gilbertchen/duplicacy Ā· GitHub


Thank you for clarifying, @gchen.

I guess I should have made sure I have the latest version running! Sorry @saspus for causing all that work!

Now, to find out how I installed Duplicacy so I can figure out how to get and keep it updated in the future!

So ā€¦ I think I must have just downloaded a compiled executable from GitHub in the past, as itā€™s in /opt/local/bin, where afaik no package manager puts its stuff by default.

Please pardon my ignorance and this off-topic question, but is there a way to easily update to the latest version? Replacing the version number with ā€˜latestā€™ in the path, or in the path and filename, results in a 404, so maybe a script that runs wget every once in a while wonā€™t work?
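
Not with a fixed URL as far as I know, but GitHubā€™s API exposes a stable ā€œlatest releaseā€ endpoint you can build the download URL from. A rough sketch (untested, and it assumes Linux x64 and that the release assets keep the duplicacy_linux_x64_<version> naming):

  # Resolve the latest release tag (e.g. v3.2.0), then fetch the matching binary
  tag=$(curl -s https://api.github.com/repos/gilbertchen/duplicacy/releases/latest | grep -oP '"tag_name":\s*"\K[^"]+')
  wget -O /opt/local/bin/duplicacy \
    "https://github.com/gilbertchen/duplicacy/releases/download/${tag}/duplicacy_linux_x64_${tag#v}"
  chmod +x /opt/local/bin/duplicacy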
