Failed to decrypt the file snapshots


#1

Receiving the following error from CLI on both a prune and a copy job:

Failed to decrypt the file snapshots/J742845-W10-J742845-J742845/3448: No enough encrypted data (0 bytes) provided

Both jobs appear to fail with that error.

I’ve not been watching the jobs as this appears to have been happening for a while.

Thoughts?


#2

Note: I removed the zero-byte snapshot file, and that seems to have addressed the issue. Hopefully that won’t cause other problems.

What could have caused the zero-byte snapshot? Surprised to find that a seemingly minor error like that would crash the CLI.


#3

Which backend are you using?


#4

Local storage on an Ubuntu server.


#5

I don’t know how zero-byte files are possible. The local storage and sftp backends always upload to a temporary file first and rename it after the upload has completed. I remember there was another user who had the same issue and suspected a kernel panic caused it, but I couldn’t find the post.
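The write-to-temp-then-rename pattern described above can be sketched as follows. This is a minimal illustration of the general technique, not Duplicacy’s actual code (Duplicacy is written in Go); the function name is hypothetical. Note the fsync step, which matters on filesystems with delayed allocation:

```python
import os
import tempfile

def atomic_write(path, data):
    """Write data to a temporary file in the destination directory, then
    rename it into place. The rename is atomic on POSIX filesystems, so a
    reader never observes a partially written file under `path`."""
    directory = os.path.dirname(path) or "."
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(data)
            f.flush()
            # Force the data to disk; without this, delayed allocation
            # (e.g. on ext4) can still leave a zero-byte file after a
            # crash, even though the rename itself succeeded.
            os.fsync(f.fileno())
        os.replace(tmp_path, path)
    except BaseException:
        os.remove(tmp_path)
        raise
```

The rename guarantees no *partial* file appears under the final name, but a crash between rename and the data reaching disk can still produce an empty file, which is consistent with the kernel-panic reports in this thread.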


#6

Thanks. A kernel crash is a possibility. I also noticed logs with errors from locked read-only chunk files; that appears to have resolved itself over time.

Any thoughts on tweaking code so that anomalies like this don’t abend the program?


#7

I’ve encountered zero sized chunks likely due to a kernel panic. This was the main motivation for PR 500 which adds verification of chunk lengths.

@gchen It may be PR 500 that you’re referring to?


#8

Yes, that is the right one. But was there another issue that you submitted before the PR to report the zero-byte files?


#9

I’d skipped creating an issue since I felt the zero-byte files were more my fault, due to several kernel panics. (I think there’s a power issue with my NAS when the CPUs are heavily loaded and it’s spun up all the HDDs.) I was more focused on how to detect/recover from this scenario…

Hmmm I did also briefly mention it in this thread: Feature Suggestion: Possibility of verifying hash of chunks files using external tools

@Raindogtoo you might want to try a duplicacy check -files to ensure that your backups are intact. Just that those zero byte files can pass the quick do-the-chunks-exist check…


#10

Thanks. Running that now.

I had already removed all the zero-byte files - which included chunks, temp files, and a couple of snapshots - and have run an initial -check on both local and cloud storages. I’ll see if the more granular check job finds anything else.

Related side issue: I’ve noted that once Duplicacy comes across this situation in -copy, -check and -backup jobs, the program reports the error and ends. Initially, I was having to restart the -check jobs ad nauseam just to locate each impacted snapshot. It seems like it would be better behavior for Duplicacy to note the error and attempt to continue with the job. Or maybe a switch that allows more tolerance for snapshot/chunk/temp file corruption?


#11

Agreed that there are further improvements to be made to general robustness. While check will continue checking if there are missing data chunks, I assume this isn’t the case for corrupted snapshots?

I do like the idea of a switch to collect and defer errors, especially for potentially long unattended operations. In some cases I’d want to know immediately if there’s a problem with the command I just issued, while in others I’d like it to copy as many chunks as possible until I have time to investigate why some chunks are corrupted.

And yes, I had to go through the same pain of locating chunks until I used a variant of find . -size 0 to do a manual cleanup.
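For anyone doing the same manual cleanup, the scan amounts to the following. This is a sketch equivalent to `find <root> -type f -size 0`; the function name is made up for illustration:

```python
import os

def find_zero_byte_files(root):
    """Walk a storage tree and return the paths of all zero-byte files,
    the same set that `find <root> -type f -size 0` would report."""
    empty = []
    for dirpath, _, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.getsize(path) == 0:
                empty.append(path)
    return sorted(empty)
```

Review the list before deleting anything: zero-byte snapshot files mean the corresponding revisions must be removed as well, as noted earlier in the thread.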


#12

I just came across the 0 byte chunk problem myself today. It was entirely my fault - I ran out of disk space. :slight_smile:

But cleaning up the mess is a bit of a pain because the check command tells you that all is well (because the chunks exist) but not that they’re the wrong size.

I like the idea of some consistency checking, though with hashes rather than remembered chunk sizes, which I think would be a bit of a waste since many chunks will be the same size (min or max) anyway. But a simple logic test during the check command for any 0-byte chunks would be quite valuable here.


#13

I totally agree with adding a check for zero-byte files, but I’m not sure the chunk lengths should be included in the snapshot files. My concern is that it would considerably increase the sizes of snapshot files. Maybe it should be implemented as an option to the backup command, so users can decide whether to enable it or not.


#14

I’d say the snapshot space overhead is minimal. Keep in mind we’re talking about storing one value per referenced chunk.

My latest snapshot references 26,831 chunks, with all of the upload lengths stored in a single chunk of 167,372 bytes, which works out to ~6 bytes per referenced chunk.

Lengths are currently stored as JSON, in the same way as the other snapshot values. If space is really a concern, they could be stored in a more compression-friendly binary format. Assuming chunk lengths are roughly similar, there are some good compression opportunities we’re missing out on by using JSON.
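To put numbers on that, here is a rough comparison of the two encodings. Both functions are illustrative assumptions, not Duplicacy’s actual serialization; the sample lengths are taken from the chunk listing later in this thread:

```python
import json

def encode_lengths_json(lengths):
    # Roughly how lengths are stored today: decimal integers in JSON.
    return json.dumps(lengths, separators=(",", ":")).encode()

def encode_lengths_varint(lengths):
    # Hypothetical alternative: LEB128-style varints, 7 payload bits per
    # byte, so any length under 256 MiB (2**28) fits in 4 bytes.
    out = bytearray()
    for n in lengths:
        while True:
            byte = n & 0x7F
            n >>= 7
            if n:
                out.append(byte | 0x80)
            else:
                out.append(byte)
                break
    return bytes(out)

lengths = [756576, 3631754, 2259090, 5455306, 5936194]
```

A typical multi-megabyte length costs about 8 bytes in JSON (7 digits plus a comma) but only 4 bytes as a varint, and the varint stream should also compress better.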

I’ve also done a little bit more searching on how these zero-byte files came to be. From what I understand, ext4 introduced a delayed allocation feature. This defers the allocation of file system blocks until after a file is closed, and can result in an empty file in the event of a power failure. So while a non-zero check would catch this particular case, I’m concerned that there are other cases where incomplete (non-zero) files could exist.

Anyway, I see this length comparison as a stepping stone to including the uploaded chunk hash for comparison with the backend. Ideally I’d like to be able to regularly verify remote storage integrity without having to transfer chunk contents, leveraging the remote hashing facilities available on the majority of backend storage services.


#15

@Droolio Duplicacy uses a variable-length chunking algorithm and then compresses those chunks. This produces a wide range of final chunk sizes, which I believe makes them an excellent candidate for a basic integrity check.

As an example; here’s a snippet of file sizes from one of my chunk stores:

  756576  d9db...
 3631754  d9f5...
 2259090  da6d...
 5455306  db15...
 5936194  dc1f...
 4830804  dd3c...
 1519776  de84...
 8762482  dfd5...
 3775090  e16d...
 2090114  e2fe...
 2054842  e30a...
 2159206  e4c8...
 2895832  e541...
 3721536  e710...
 7326982  e7d7...
 5705460  e7e0...
 2639474  e874...

To be clear, I do 100% agree that content hashes would be better, but not every backend is going to support them. So I’m proposing a file-length integrity check as a first step.
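The proposed length check could look something like this sketch. The `expected` mapping of chunk name to recorded length is an assumed layout for illustration; Duplicacy’s real metadata differs:

```python
import os

def verify_chunk_lengths(storage_dir, expected):
    """Compare recorded upload lengths against what's actually on disk.
    `expected` maps chunk name -> length recorded at backup time.
    Returns a dict of problem chunks with a reason for each."""
    problems = {}
    for name, want in expected.items():
        path = os.path.join(storage_dir, name)
        if not os.path.exists(path):
            problems[name] = "missing"
            continue
        got = os.path.getsize(path)
        if got == 0:
            problems[name] = "zero bytes"
        elif got != want:
            problems[name] = "size %d, expected %d" % (got, want)
    return problems
```

This catches zero-byte and truncated chunks with a cheap metadata query per file, without downloading any chunk contents.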


#16

Yes, that may be fine for variable-size chunks (although you may still have an awful lot of max-size chunks that are exactly the same size), but not necessarily for fixed-size chunks…

The latest incident I encountered with 0-byte files was with Vertical Backup (a special version of Duplicacy for ESXi) with 1MB chunks. Admittedly, much of that is compressed, so they don’t all turn out to be 1MB exactly.

Incidentally, I had about seven 0-byte chunks, which I deleted manually along with the last two snapshots. After a new backup, a check -files verified that no other chunks were corrupted. So no half-written chunks, despite running out of space. This was on ext4.


#17

OK how about this idea…

I kinda agree that the snapshot files are not the best place to store hashes (or chunk sizes). Each snapshot file would duplicate the same information, and certain operations require loading all the snapshots into memory. I’m thinking about multi-TB Vertical Backup storage here.

Why not store the raw chunk checksum (hash) as part of the filename?

How much memory overhead would this incur from processing a few extra characters at the end of the usual chunk ID? Would we need a full SHA-1 or MD5? Maybe just keeping the last 4 characters would be enough?
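The idea above can be sketched like this. Everything here is hypothetical (the function names, the use of SHA-256, the `.` separator); it only illustrates embedding a truncated checksum of the raw stored file in its name:

```python
import hashlib

def name_with_checksum(chunk_id, data, suffix_len=8):
    """Append a short prefix of the stored file's SHA-256 to its name so
    corruption is detectable from the name alone, with no extra metadata."""
    digest = hashlib.sha256(data).hexdigest()
    return "%s.%s" % (chunk_id, digest[:suffix_len])

def verify_named_chunk(filename, data):
    """Re-hash the file contents and compare against the name's suffix."""
    _, _, suffix = filename.rpartition(".")
    return hashlib.sha256(data).hexdigest()[:len(suffix)] == suffix
```

On the truncation question: 4 hex characters carry 16 bits, so random corruption would slip past the check with probability 1 in 65,536 per chunk; 8 characters push that to roughly 1 in 4.3 billion. Either way the per-file name overhead is a handful of bytes.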