So one corrupted chunk can destroy all my backups?

I recently discovered that nine chunks in my storage are empty (0 byte size). I deleted these and followed the instructions for fixing missing chunks, i.e. I changed the repository-ID and ran an “initial” backup, hoping that the missing chunks would be uploaded again. Unfortunately, they weren’t.

From what I understand, the only thing I can do now is delete all snapshots referring to one of the missing chunks, right? The problem is that one of the chunks is included in every single snapshot for that repository. From 1 to 721.

So, I’m wondering the following:

  1. Is there not a design flaw when a single 0-byte chunk can destroy almost two years of backup history? Especially since that chunk might belong to a totally insignificant file, in which case I would still be able to restore all important files.
  2. How can it be possible that this missing chunk is missing even in the very latest snapshot, i.e. the one created after I went through the fix-missing-chunks procedure? If the chunk is still needed in the current snapshot, doesn’t that mean the file still exists in the repository? Why doesn’t duplicacy upload it then?

In case someone is wondering why I didn’t run the check command earlier, the answer is: I have been trying for a long time but kept failing (see here)

You’ll first need to figure out why the new initial backup with a different backup id uploaded files that were supposed to be excluded – these files may cause files to be packed differently which will create new chunks.

Even if you can’t recreate those missing chunks, since you only have 9 missing chunks you should be able to restore most of the files. CLI 2.7.0 now supports the -persist option for the restore and check commands. It can also tell you which files can’t be restored at the end.

To restore I would suggest running the CLI directly in a command line window. Create a new empty directory rather than working on the repository directory, initialize the directory and run restore with -persist.
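On the command line, the suggestion above could look roughly like the following. This is only a sketch: `restore_to_empty_dir` and the `DUPLICACY` variable are hypothetical names, and the directory, storage URL and revision are placeholders to be replaced with real values. The snapshot ID given to init should be the ID of the backup being restored, and an encrypted storage would presumably also need init's -e option.

```shell
# Sketch: restore one revision into a fresh directory, continuing past
# chunks that cannot be downloaded (-persist, CLI 2.7.0 or later).
# DUPLICACY, the directory and the storage URL are placeholders.
DUPLICACY="${DUPLICACY:-duplicacy}"

restore_to_empty_dir() {
    target_dir="$1"    # new, empty directory (not the live repository)
    snapshot_id="$2"   # e.g. NAS
    storage_url="$3"   # placeholder, e.g. webdav://user@host/path
    revision="$4"      # e.g. 721

    mkdir -p "$target_dir" || return 1
    (
        cd "$target_dir" || exit 1
        # Bind the empty directory to the existing storage under the same ID.
        "$DUPLICACY" init "$snapshot_id" "$storage_url" || exit 1
        # Restore the chosen revision; -persist keeps going on bad chunks.
        "$DUPLICACY" restore -r "$revision" -persist
    )
}
```

It would be called as, for example, `restore_to_empty_dir /mnt/scratch/restore NAS <storage url> 721`, where the target path is made up for illustration.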

That’s a good point. I didn’t think about that. But how do I figure that out? All I did was this:

Well, just to be precise, it was /.duplicacy-web/duplicacy.json that I edited, in case it matters.

And the change I made is from this

     "computers": [
             "name": "localhost",
             "repositories": [
                     "index": 1,
                     "id": "NAS",
                     "path": "/zfs/NAS",
                     "storage": "pcloud",
                     "global_options": "-v",

to this

     "computers": [
             "name": "localhost",
             "repositories": [
                     "index": 1,
                     "id": "NAS_tmp",
                     "path": "/zfs/NAS",
                     "storage": "pcloud",
                     "global_options": "-v",

After restarting the web-ui, I ran the NAS_tmp backup. So I have no idea how to figure out why the filters no longer applied or how to make sure that they will be applied.

That’s good to know (it’s not documented, though, is it?), but as long as I have missing chunks duplicacy apparently won’t produce any storage statistics (see here: Storage size not showing despite repeated checks)… And I will keep seeing this when running a check:


so that I won’t know whether there are any new missing chunks.

You can add the -d global option to the backup job so it will print out the pattern list and every excluded file.

As for the storage statistics, I would not worry about them before the missing chunks got fixed.

On the bright side, you should be able to continue backing up using the new repository IDs and all backups after that should be recoverable. As you probably noticed, de-duplication avoided having to re-upload most of the chunks again. Thus if I were you, I’d do a check -files -r <latest>* on the new backup just to make sure no chunks are corrupted (as opposed to missing or 0 bytes).

If you happen to identify which files those 9 missing chunks relate to, it still might be possible to re-upload them through an initial backup of the data as it was at the time of the original backup, using the filters you used back then. That is, restore the oldest revisions that first reference these missing chunks to a temporary location using the new -persist flag. Depending on which files are missing, you might be able to restore them manually from elsewhere (another backup method perhaps?) and then run an initial backup on that temp repo.

Edit: *Oops, meant chunk = check.

OK, trying this. However, the web-ui seems to add -a no matter what.

I’m running these options:

But the log starts like this:

Running check command from /tmp/.duplicacy-web/repositories/localhost/all
Options: [-log -v check -storage pcloud -id NAS -files -r 721 -a -tabular]

Will -a and -tabular trump -r 721 or the other way around?

Yea, I’ve only ever used check -files ad hoc with the CLI version, and only when I’ve had issues with sftp storage (running out of disk space for example). On my various Windows systems, I run standard check or nothing at all.

I’m betting -a will trump -r, since all implies all IDs as well as all revisions. So indeed, you may have to do that stage from CLI…

The duplicacy web-ui is starting to drive me nuts.

So I’m trying to do

/.duplicacy-web/bin/duplicacy_linux_x64_2.7.0 -log -v check -storage pcloud -id NAS -files -r 721

on the CLI but it just gives me

2020-10-03 01:39:58.789 ERROR REPOSITORY_PATH Repository has not been initialized

even though I am in the root directory of a repository. But since there is no .duplicacy file or folder there, duplicacy won’t know about it. So I guess I need to use the -repository or the -pref-dir option or something, but it’s not documented for the check command. And even if it were, I wouldn’t know which path to give it.

Why can’t the web-ui just let me specify -r 721??

Unless it’s a brand new install and you haven’t run any check or prune commands, you probably should have a .duplicacy directory in the .duplicacy-web/repositories/all location at the very least - which is where I’d run such check commands anyway. Not in the numbered directories. (If you don’t see it there btw, do a ls -al, as dot-files are normally hidden.)
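As a rough sketch of that advice, the check can be started from the web-ui's own working directory after confirming that a .duplicacy entry exists there. The function name `check_from_webui_dir` is made up for illustration, and the directory and binary path are the ones that appear in the logs in this thread, so they will differ on other systems.

```shell
# Sketch: run a CLI check from the directory the web-ui itself uses.
# The binary path is the one quoted in this thread; adjust as needed.
DUPLICACY="${DUPLICACY:-/.duplicacy-web/bin/duplicacy_linux_x64_2.7.0}"

check_from_webui_dir() {
    work_dir="$1"   # e.g. /tmp/.duplicacy-web/repositories/localhost/all
    shift           # remaining arguments are passed to the check command

    # The CLI needs a .duplicacy file or directory to find its preferences.
    if [ ! -e "$work_dir/.duplicacy" ]; then
        echo "no .duplicacy in $work_dir (try: ls -al $work_dir)" >&2
        return 1
    fi
    ( cd "$work_dir" && "$DUPLICACY" -log -v check "$@" )
}
```

For example: `check_from_webui_dir /tmp/.duplicacy-web/repositories/localhost/all -storage pcloud -id NAS -files -r 721`.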

Thanks, Droolio, for helping me with this. I managed to kick off /.duplicacy-web/bin/duplicacy_linux_x64_2.7.0 -log -v check -storage pcloud -id NAS -files -r 721

(For some reason, I had to manually enter the webDAV and storage passwords, though)

Listing the chunks went fine, I guess:

2020-10-04 03:10:11.806 INFO SNAPSHOT_CHECK 1 snapshots and 1 revisions
2020-10-04 03:10:11.870 INFO SNAPSHOT_CHECK Total chunk size is 3722G in 3302176 chunks
2020-10-04 03:10:32.805 TRACE SNAPSHOT_VERIFY christoph/.identity
2020-10-04 03:11:00.852 TRACE WEBDAV_ERROR URL request 'PROPFIND chunks/f3/4b0d266bb39d9b870d01c8be4858f9f45f4b3375589f9faf2ecade9de67341' returned an error (Propfind EOF)
2020-10-04 03:11:24.079 TRACE SNAPSHOT_VERIFY christoph/20161021_084341.mp4

Checking the files went fine for a while but then errors started pouring in:

2020-10-04 03:30:15.199 TRACE SNAPSHOT_VERIFY christoph/android backup/BCDAS-20120415-1309/nandroid.md5
2020-10-04 03:34:28.762 TRACE SNAPSHOT_VERIFY christoph/android backup/BCDAS-20120415-1309/system.img
2020-10-04 03:35:31.685 TRACE WEBDAV_ERROR URL request 'GET chunks/7d/d274e5bf88bdfdd0e0f089b9f0dacf47b2d5ec22584e13e3e177b5cb3f46f2' returned an error (Get EOF)
2020-10-04 03:38:45.678 TRACE WEBDAV_ERROR URL request 'GET chunks/da/c1886b7c878723dd866f8fdd77854c7daf218239687b0eb046b9a05b345795' returned an error (Get EOF)
2020-10-04 03:40:07.030 TRACE WEBDAV_ERROR URL request 'GET chunks/f6/bad4e3a2d417092df81819b9986dd886e08efb0d1ac742d31f7eaf110aaa9b' returned an error (Get EOF)
2020-10-04 03:42:47.744 TRACE WEBDAV_ERROR URL request 'GET chunks/1e/61178362132f44bc99e0e54336fd364c1ae359b40a69f839e328b44cc72ced' returned an error (Get EOF)
2020-10-04 03:51:18.611 TRACE WEBDAV_ERROR URL request 'GET chunks/4e/78a5775244f3eebc2921e6ff2e316b56415c3cdcd5b1ce32ba58d47c5dc569' returned an error (Get EOF)
2020-10-04 03:52:39.345 TRACE WEBDAV_ERROR URL request 'GET chunks/ca/35fee822a1f7466d45b42cb09d65edd520351912789c6893f5919ced574b57' returned an error (Get EOF)
2020-10-04 03:53:01.641 TRACE WEBDAV_ERROR URL request 'GET chunks/ca/35fee822a1f7466d45b42cb09d65edd520351912789c6893f5919ced574b57' returned an error (Get EOF)
2020-10-04 03:53:24.063 INFO WEBDAV_RETRY URL request 'GET chunks/ca/35fee822a1f7466d45b42cb09d65edd520351912789c6893f5919ced574b57' returned status code 403
2020-10-04 03:58:19.922 TRACE WEBDAV_ERROR URL request 'PROPFIND chunks/14/97a547984e7771dfcf0e64daddcb52e8c90e0b4aeec3daaddc8cfbb1a6d8cd' returned an error (Propfind EOF)
2020-10-04 03:59:00.852 TRACE WEBDAV_ERROR URL request 'PROPFIND chunks/56/1f82e07c7a1e8f993ec053654958c2c8a17151e707784fbb281f0cc2750dc5' returned an error (Propfind EOF)
2020-10-04 04:05:55.963 TRACE WEBDAV_ERROR URL request 'PROPFIND chunks/56/0f1c6e0cec5db5fc99455f9f8ceaf73b28bf5f5746db1fa5965692f261557a' returned an error (Propfind EOF)
2020-10-04 04:10:54.396 TRACE WEBDAV_ERROR URL request 'PROPFIND chunks/41/183a105c031f1c9ed2f891a940d2157cef22e2b4c5b4448c1b2fa010329711' returned an error (Propfind EOF)
2020-10-04 04:12:55.458 TRACE WEBDAV_ERROR URL request 'GET chunks/34/2a7a0436036f338c7bfe930d192880a89aa8dbcc56606ec91e03888c4238b9' returned an error (Get EOF)
2020-10-04 04:15:05.613 TRACE WEBDAV_ERROR URL request 'PROPFIND chunks/93/2096afe9d943c9ca458a42a6667fc03d9450c23553bd9453280b62e010808a' returned an error (Propfind EOF)

Over the course of about 7 hours, the output continues like that but in total it is only about 1-200 errors. But there are no positive reports about files having been checked either. So I’m not sure whether this means duplicacy is slowing down because of backoff or what.

The last couple of output lines are these:

2020-10-04 12:09:14.514 TRACE WEBDAV_ERROR URL request 'PROPFIND chunks/27/e3d4f5208fc0546685b7ea3819b6bf27ba2cd0f03ffc7bfa8810d8a21c7c5a' returned an error (Propfind EOF)
2020-10-04 12:10:46.558 INFO WEBDAV_RETRY URL request 'GET chunks/b6/717cc55e12002476b0014d7d5823ad5ae24c2d2b21972083dfbdbc345dc142' returned status code 403
2020-10-04 12:12:55.755 TRACE WEBDAV_ERROR URL request 'GET chunks/67/6f86190b93f27e2accba81de1a6f25d56b9b7f9a9f3251d4053a859bbe0d15' returned an error (Get EOF)
2020-10-04 12:13:03.177 TRACE WEBDAV_ERROR URL request 'PROPFIND chunks/4d/bfcd761fe3261502627c66fee6df4ff43292717f4498c42cbe8e407a0a6e36' returned an error (Propfind EOF)
2020-10-04 12:15:51.295 TRACE WEBDAV_ERROR URL request 'GET chunks/b6/79a17118b5c201935da2fd7470f570e31743dd05ff2d3018edd3057f42992c' returned an error (Get EOF)

What do I do with this?
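One way to get an overview of a long log like this is to count the lines per level and message ID. The sketch below assumes the CLI output was saved to a file (the name `check.log` is hypothetical) and that every line starts with date, time, level and ID, as in the excerpts above; `summarize_log` is an illustrative name, not part of duplicacy.

```shell
# Sketch: count log lines per level and message ID (fields 3 and 4),
# most frequent first. Assumes the CLI output was saved to a file.
summarize_log() {
    awk 'NF >= 4 { count[$3 " " $4]++ }
         END { for (key in count) print count[key], key }' "$1" |
        sort -k1,1nr -k2
}
```

Running `summarize_log check.log` prints one line per combination, such as a count followed by `TRACE WEBDAV_ERROR`, which makes it easier to see whether SNAPSHOT_VERIFY lines are still appearing between the errors.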

@gchen, regarding the 403 errors I found this:

In WebDAV, the 403 Forbidden response will be returned by the server if the client issued a PROPFIND request but did not also issue the required Depth header or issued a Depth header of infinity

I don’t know whether the pcloud webdav interface adheres to the RFC, but it might be worth checking whether duplicacy might sometimes issue a faulty “Depth header” (whatever that is)…

Job completed (?) after about 40 hours. But is it supposed to look like this? @gchen

2020-10-05 16:44:49.292 TRACE SNAPSHOT_VERIFY media/Music/c/Glenn Gould/Glenn Gould - Bach - Concertos for Keyboard & Strings/cd1/folder.jpg
2020-10-05 16:45:14.586 WARN DOWNLOAD_RETRY Failed to download the chunk 58cb89f994146c730c22a932d32918839ee02ebb26aad74e7170f7d7ee3c9f17: unexpected EOF; retrying
2020-10-05 16:45:48.239 WARN DOWNLOAD_RETRY Failed to download the chunk 0ba2903dd61c58a1230d934ac510d43d6f5caad12d54c3ca7e16e18185a1982e: unexpected EOF; retrying
2020-10-05 16:46:31.821 INFO WEBDAV_RETRY URL request 'GET chunks/0b/a2903dd61c58a1230d934ac510d43d6f5caad12d54c3ca7e16e18185a1982e' returned status code 403
2020-10-05 16:47:35.077 WARN DOWNLOAD_RETRY Failed to download the chunk 0ba2903dd61c58a1230d934ac510d43d6f5caad12d54c3ca7e16e18185a1982e: unexpected EOF; retrying
2020-10-05 16:47:59.806 WARN DOWNLOAD_RETRY Failed to download the chunk 0ba2903dd61c58a1230d934ac510d43d6f5caad12d54c3ca7e16e18185a1982e: unexpected EOF; retrying
2020-10-05 16:48:33.752 ERROR DOWNLOAD_CHUNK Failed to download the chunk 0ba2903dd61c58a1230d934ac510d43d6f5caad12d54c3ca7e16e18185a1982e: unexpected EOF

I was expecting some kind of summary at the end, telling me the overall results…

The job failed with this error:

2020-10-05 16:48:33.752 ERROR DOWNLOAD_CHUNK Failed to download the chunk 0ba2903dd61c58a1230d934ac510d43d6f5caad12d54c3ca7e16e18185a1982e: unexpected EOF

The log showed a few attempts to download the same chunk before this error was thrown. This means the maximum number of retries (12) had been reached and Duplicacy had to bail out.
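To pick out the chunks that actually ended in an ERROR, as opposed to those that were only retried, a saved log can be filtered along these lines. This is a sketch that assumes the line layout shown in the excerpts above (chunk hash in the tenth field, followed by a colon); `failed_chunks` is a hypothetical helper, not a duplicacy command.

```shell
# Sketch: list each chunk that ended in ERROR DOWNLOAD_CHUNK together
# with the number of WARN DOWNLOAD_RETRY lines logged for it.
failed_chunks() {
    awk '
        $4 == "DOWNLOAD_RETRY" || $4 == "DOWNLOAD_CHUNK" {
            id = $10            # chunk hash, followed by a colon in the log
            sub(/:$/, "", id)
        }
        $3 == "WARN"  && $4 == "DOWNLOAD_RETRY" { retries[id]++ }
        $3 == "ERROR" && $4 == "DOWNLOAD_CHUNK" { failed[id] = 1 }
        END { for (id in failed) print id, retries[id] + 0 }
    ' "$1"
}
```

Chunks that were retried but eventually downloaded do not appear in the output, so only the ones that stopped the job are listed.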

I would suggest using the -chunks option instead of -files and also using at least 4 threads:

web/bin/duplicacy_linux_x64_2.7.0 -log -v check -storage pcloud -id NAS -chunks -threads 4 -r 721

So the ERROR part is crucial, I see. (And the absence of “retrying” at the end, but that difference doesn’t seem to be implemented consistently, e.g. the log here doesn’t have any retrying even though duplicacy was retrying.)

Why? Will it still tell me which files can’t be restored due to missing chunks?

Sorry I forgot about that. But I would then suggest restoring all files to a local disk with enough space if possible, because that way if it fails again you’ll skip all restored files quickly on a second retry. Besides, the restore command supports multithreading.

I think the repository is too big to restore to my existing hard drive space. I could perhaps spin up some external hard drives, but I think duplicacy needs all the free space on one drive.

So I kicked off /.duplicacy-web/bin/duplicacy_linux_x64_2.7.0 -log -v check -storage pcloud -id NAS -files -threads 4 -r 721

At the beginning, I had almost no error messages for a couple of hours, but now I’m seeing nothing but this:

2020-10-07 23:57:39.560 TRACE WEBDAV_ERROR URL request 'PROPFIND chunks/b4/8a667a3d6ec3503bf289e0589e5a7f1d0e916302a94a4abb7a5f11672c2d8a' returned an error (Propfind EOF)
2020-10-07 23:58:50.099 INFO WEBDAV_RETRY URL request 'GET chunks/27/e820ff7797c21701333c82a8b8b30f84f2af723f7aa1bc8d49d1226a0cb7b9' returned status code 403
2020-10-07 23:59:16.294 WARN DOWNLOAD_RETRY Failed to download the chunk 27e820ff7797c21701333c82a8b8b30f84f2af723f7aa1bc8d49d1226a0cb7b9: unexpected EOF; retrying
2020-10-08 00:02:15.189 TRACE WEBDAV_ERROR URL request 'PROPFIND chunks/4d/d4b798206214d90150a45230e872bce48c9f7bf0416effa0e057dd41b00b15' returned an error (Propfind EOF)
2020-10-08 00:05:25.632 TRACE WEBDAV_ERROR URL request 'GET chunks/1f/5616233b2f730aac172b5a9c487783d81423e8ef9688c71765f555b375beae' returned an error (Get EOF)
2020-10-08 00:08:17.640 TRACE WEBDAV_ERROR URL request 'PROPFIND chunks/f5/4f2f9981b00536b40b91a4cd23b39487db4b6f6d50e23b0627085949e77e30' returned an error (Propfind EOF)
2020-10-08 00:09:08.062 TRACE WEBDAV_ERROR URL request 'GET chunks/13/ac1c54283e0c0eb47d0af296e30cefaa0ff8490f5d62bcd8dcde76ea59dce5' returned an error (Get EOF)
2020-10-08 00:09:24.956 TRACE WEBDAV_ERROR URL request 'PROPFIND chunks/05/12c1cb1d37c105e2e877be684e38ccf3ddbf72fc1e08b61149934ef5f6877c' returned an error (Propfind EOF)
2020-10-08 00:10:06.582 TRACE WEBDAV_ERROR URL request 'PROPFIND chunks/f2/e5d5d34b259917cc3abe6c5c4ab5606055cdbba676383849c76ee5765a7237' returned an error (Propfind EOF)
2020-10-08 00:12:25.434 TRACE WEBDAV_ERROR URL request 'GET chunks/f5/f61594819fa163c9f4a46f45cca07f42da48d1b9f6f9ebe977201d4655d8bb' returned an error (Get EOF)
2020-10-08 00:14:13.252 TRACE WEBDAV_ERROR URL request 'GET chunks/4d/3c536714f53687e9ba7fd95b50a1212afe10833f4618279761747b8b935daa' returned an error (Get EOF)

What does this mean?

Previously it was listing files, but for the past 24 hours I have not seen a single one. Does that mean that duplicacy didn’t manage to verify even one file during that time? Or is duplicacy quite successful in the background and only letting me know about the odd error here and there?

I’m thinking I should try contacting pCloud support once again. But I need to understand what kind of requests duplicacy is sending and what their server is answering. Is it correct if I say something like this: “I’m sending propfind requests for specific files and the server keeps responding EOF”?

Also, could you confirm that duplicacy’s depth header is correct (see below)?

I just saw this:

@gchen Does duplicacy still have that setting? Wouldn’t that explain the 403 errors?

No, I only tested that change and it didn’t make it into the code, because a depth of infinity isn’t supported by some WebDAV providers.

The depth sent by the WebDAV backend is either 0 or 1.
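To see how the server answers such a request independently of Duplicacy, a PROPFIND with an explicit Depth header can be sent by hand, which may also give pCloud support something concrete to look at. The sketch below uses curl; the `propfind` function name, the URL and the user name are placeholders, and this is not necessarily the exact request Duplicacy sends (only the Depth values 0 and 1 are confirmed above).

```shell
# Sketch: send a WebDAV PROPFIND with an explicit Depth header and show
# the response status and headers. URL and user are placeholders; curl
# prompts for the password when -u is given only a user name.
propfind() {
    url="$1"          # full WebDAV URL of a chunk file or chunk directory
    user="$2"
    depth="${3:-0}"   # 0 = the resource itself, 1 = plus its direct children
    curl -sS -i -X PROPFIND -H "Depth: $depth" -u "$user" "$url"
}
```

Calling it repeatedly against a chunk path that produced `Propfind EOF` in the log would show whether the server closes the connection or returns a 403 outside of Duplicacy as well.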