Verified_chunks?

Background: my setup is that I back up to a local storage (a drive in the machine, on NTFS unfortunately), which I then copy to Wasabi. I decided to do a ‘full’ check using the check command with

-log check -storage local -chunks -a -tabular

and it failed after only 441 chunks. I then ran it with -persist and it found 10 chunks with errors out of about 100k. I can see the verified_chunks file, but it was only updated during that first failed run; it was never updated on the subsequent one. Shouldn't it have been updated again? Running check again now is going to take another 2 hours, and that file still isn't being updated.

What is the purpose of this file, and how is it supposed to work?

As a follow-up, a third -persist run has finished, also reporting 10 corrupted chunks, but it doesn't say which ones?

Verified chunks will not be verified again. You can delete that file to have everything re-verified.
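If you want a full re-verification, something like this should do it on the CLI side. The cache path here is my assumption based on the usual CLI layout; the Web UI keeps its repository caches under ~/.duplicacy-web/repositories instead, so the exact location may differ for you:

rm .duplicacy/cache/local/verified_chunks
duplicacy -log check -storage local -chunks -a -tabular -persist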

As a side note, if you have to store backups on a local disk, at least enable erasure coding while you work on setting up a proper storage appliance with btrfs or zfs or, better yet, prepare to move to the cloud.
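Note that erasure coding is set when the storage is created; it can't be switched on for an existing storage. A minimal CLI sketch with 5 data shards plus 2 parity shards (the snapshot id and path are placeholders):

duplicacy init -erasure-coding 5:2 my-documents /mnt/backup/duplicacy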

And definitely don't copy from a local single-drive NTFS volume to Wasabi; instead, back up to Wasabi directly as a second destination. If anything, it's better to copy from Wasabi to the local drive, since Wasabi guarantees data integrity and NTFS does not.
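Roughly, the CLI equivalent is adding Wasabi as a second, copy-compatible storage and then running a separate backup against each one. Everything below (storage name, snapshot id, bucket, region) is a placeholder, and the wasabi:// URL is only illustrative:

duplicacy add -e -copy local wasabi my-documents wasabi://us-east-1@s3.wasabisys.com/my-bucket/duplicacy
duplicacy backup -storage local
duplicacy backup -storage wasabi

Keep in mind the two direct backups are independent snapshots rather than exact replicas; if you want identical revisions on both storages, backup-then-copy is what gives you that.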


I couldn't work out how to set this up (backing up to both local and Wasabi). Is it possible to set it up so the end result is the same as backing up to local and then copying to Wasabi (or vice versa)? That is what I wanted to do initially, but I couldn't figure out how.

Can I not recover the corrupted chunks from Wasabi, which is supposed to be a copy of the local storage?

Do you use the command-line duplicacy or the Web UI?

If the storage was created as bit-identical you can just copy those chunk files from there.

Otherwise, you can try deleting these chunks, then creating another snapshot id on the same local storage and making an initial backup. As part of that initial backup there is a chance that the files will be processed in the same way, producing the same chunks, which will then be detected as missing and re-uploaded. Afterwards you delete this second snapshot id and run prune -exhaustive on the datastore to get rid of any new orphans. A rough sketch of that sequence is below.
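In CLI terms it would look something like this, assuming the local storage is named "local", the storage path and chunk file name are placeholders, and "recover-temp" is the hypothetical temporary snapshot id (in the Web UI this is just a second backup of the same folder under a new backup id):

rm /mnt/backup/duplicacy/chunks/4a/1b2c3d...           # corrupted chunk paths come from the check log
duplicacy backup -storage local -stats                 # initial backup under the new, temporary snapshot id
duplicacy prune -storage local -id recover-temp -r 1 -exclusive
duplicacy prune -storage local -exhaustive -exclusive

-exclusive assumes nothing else is accessing the storage while prune runs.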

Alternatively, after deleting the corrupted chunks you can run check and delete the snapshots that rely on those chunks, hoping that there won't be many of them.
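For that route it would look roughly like this; the snapshot id and revision range are hypothetical and would come from the check output:

duplicacy -log check -storage local -a -persist
duplicacy prune -storage local -id my-documents -r 118-120 -exclusive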

I'm using the Web UI. This post says what I'm doing is the recommended approach:

I didn't make it bit-identical. It'll be a while before I can build an on-site NAS, so I guess I'd better use erasure coding. Can I 'replace' the existing local storage by creating another copy-compatible storage with erasure coding enabled and then copying the existing local storage into it?

Yep, you can do that - the option is in the Web UI when you create the new copy-compatible storage.
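For reference, the CLI equivalent of that Web UI flow would be something like the following, assuming the new storage is called "local-ec", the path is a placeholder, and your CLI version accepts -erasure-coding on the add command:

duplicacy add -copy local -erasure-coding 5:2 local-ec my-documents /mnt/backup/duplicacy-ec
duplicacy copy -from local -to local-ec

Once the copy has finished you can point your backup and copy jobs at the new storage and retire the old one.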

I was able to find the corrupted chunks in the log. I can restore a lot of them by deleting them and then running a new backup against a new snapshot id. Is another option to delete a revision that references such a chunk from the local storage, and then copy that revision back from the cloud storage? Will it detect that the chunks are missing and transfer them across?

You don't need to delete the revision. You just need to delete the corrupted chunks and then figure out which revisions reference them (by running a check job). Then you can copy those revisions over from the other storage (if the storages are copy-compatible).
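For example (the snapshot id, revision number, and the name of the cloud storage are placeholders; check tells you the real ones):

duplicacy -log check -storage local -a
duplicacy copy -from wasabi -to local -id my-documents -r 118

copy only transfers chunks that are missing at the destination, so this re-downloads just the chunks you deleted.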

OK, thanks. As it happens, I was able to recreate the chunks by creating a new snapshot id (i.e. a new backup) and running the initial backup. If it happens again (hopefully not, now that I've converted the local storage to use erasure coding), I'll try copying from the cloud storage. FWIW, the cloud storage on Wasabi verified as fine (though it took 24 hours!).