How is it possible that a missing chunk suddenly reappears?

A couple of days ago, I ran a check job and was informed that a chunk is missing in my storage:

2020-02-12 23:05:30.421 INFO SNAPSHOT_CHECK All chunks referenced by snapshot SERVER at revision 412 exist
2020-02-12 23:05:30.573 INFO SNAPSHOT_CHECK All chunks referenced by snapshot SERVER at revision 413 exist
2020-02-12 23:05:30.689 INFO SNAPSHOT_CHECK All chunks referenced by snapshot SERVER at revision 414 exist
2020-02-12 23:05:31.433 WARN SNAPSHOT_VALIDATE Chunk 188056b64444b232ea25a073c874f1f36614e88caba80eb780dca8c2d4268fdb referenced by snapshot SERVER at revision 415 does not exist
2020-02-12 23:05:31.441 WARN SNAPSHOT_CHECK Some chunks referenced by snapshot SERVER at revision 415 are missing
2020-02-12 23:05:32.132 WARN SNAPSHOT_VALIDATE Chunk 188056b64444b232ea25a073c874f1f36614e88caba80eb780dca8c2d4268fdb referenced by snapshot SERVER at revision 416 does not exist
2020-02-12 23:05:32.147 WARN SNAPSHOT_CHECK Some chunks referenced by snapshot SERVER at revision 416 are missing
2020-02-12 23:05:35.135 WARN SNAPSHOT_VALIDATE Chunk 188056b64444b232ea25a073c874f1f36614e88caba80eb780dca8c2d4268fdb referenced by snapshot SERVER at revision 417 does not exist
2020-02-12 23:05:35.139 WARN SNAPSHOT_CHECK Some chunks referenced by snapshot SERVER at revision 417 are missing
2020-02-12 23:05:38.958 WARN SNAPSHOT_VALIDATE Chunk 188056b64444b232ea25a073c874f1f36614e88caba80eb780dca8c2d4268fdb referenced by snapshot SERVER at revision 418 does not exist
2020-02-12 23:05:38.965 WARN SNAPSHOT_CHECK Some chunks referenced by snapshot SERVER at revision 418 are missing
2020-02-12 23:05:38.965 ERROR SNAPSHOT_CHECK Some chunks referenced by some snapshots do not exist in the storage
Some chunks referenced by some snapshots do not exist in the storage

I looked in /chunks/18/ for 8056b64444b232ea25a073c874f1f36614e88caba80eb780dca8c2d4268fdb and it really was not there. So I thought: OK, we have a problem. But I didn’t have time to deal with it.
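To see at a glance which chunks and revisions a check run complains about, the WARN lines can be grouped by chunk ID. This is only a minimal sketch in Python: it assumes the log format is exactly as pasted above, and the name `missing_chunks` is made up for illustration.

```python
import re
from collections import defaultdict

# Pattern inferred from the WARN lines in this thread only (an assumption,
# not taken from any documented log format).
MISSING = re.compile(
    r"SNAPSHOT_VALIDATE Chunk (?P<chunk>[0-9a-f]+) referenced by snapshot "
    r"(?P<snapshot>\S+) at revision (?P<rev>\d+) does not exist"
)


def missing_chunks(log_lines):
    """Map each missing chunk ID to the sorted revisions that reference it."""
    found = defaultdict(set)
    for line in log_lines:
        match = MISSING.search(line)
        if match:
            found[match.group("chunk")].add(int(match.group("rev")))
    return {chunk: sorted(revs) for chunk, revs in found.items()}


CHUNK = "188056b64444b232ea25a073c874f1f36614e88caba80eb780dca8c2d4268fdb"
sample = [
    "2020-02-12 23:05:30.689 INFO SNAPSHOT_CHECK All chunks referenced by snapshot SERVER at revision 414 exist",
    f"2020-02-12 23:05:31.433 WARN SNAPSHOT_VALIDATE Chunk {CHUNK} referenced by snapshot SERVER at revision 415 does not exist",
    "2020-02-12 23:05:31.441 WARN SNAPSHOT_CHECK Some chunks referenced by snapshot SERVER at revision 415 are missing",
    f"2020-02-12 23:05:32.132 WARN SNAPSHOT_VALIDATE Chunk {CHUNK} referenced by snapshot SERVER at revision 416 does not exist",
]

for chunk, revisions in missing_chunks(sample).items():
    print(chunk, revisions)
```

Fed the full log above, this reports a single chunk, referenced by revisions 415 to 418.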

A few days later, the check completes without errors:

2020-02-15 21:07:45.725 INFO SNAPSHOT_CHECK All chunks referenced by snapshot SERVER at revision 412 exist
2020-02-15 21:07:45.837 INFO SNAPSHOT_CHECK All chunks referenced by snapshot SERVER at revision 413 exist
2020-02-15 21:07:45.948 INFO SNAPSHOT_CHECK All chunks referenced by snapshot SERVER at revision 414 exist
2020-02-15 21:07:46.163 INFO SNAPSHOT_CHECK All chunks referenced by snapshot SERVER at revision 415 exist
2020-02-15 21:07:46.372 INFO SNAPSHOT_CHECK All chunks referenced by snapshot SERVER at revision 416 exist
2020-02-15 21:07:46.526 INFO SNAPSHOT_CHECK All chunks referenced by snapshot SERVER at revision 417 exist
2020-02-15 21:07:46.657 INFO SNAPSHOT_CHECK All chunks referenced by snapshot SERVER at revision 418 exist
2020-02-15 21:07:50.171 INFO SNAPSHOT_VALIDATE Chunk c1278ed19c6e2cccdb5ece93cbdd876c565d70be252370b0e13e4d153cf53e79 is confirmed to exist
2020-02-15 21:07:50.376 INFO SNAPSHOT_VALIDATE Chunk 1e48d09c136dc4e7240466a5dc6b0807c38a230be602649a2978791df8d28c14 is confirmed to exist
2020-02-15 21:07:50.607 INFO SNAPSHOT_VALIDATE Chunk ff54d18a2666e81695ba57270060ce804e94ba54488918c8d62512b50b989da3 is confirmed to exist
2020-02-15 21:07:50.815 INFO SNAPSHOT_VALIDATE Chunk 3abe57775f37e95a95584d1cc474fbdd2586d5886df1f86508d32b74ec260a0d is confirmed to exist
2020-02-15 21:07:51.100 INFO SNAPSHOT_VALIDATE Chunk d07ddaf9ee13a44f335d8981a517833dabf55a4e4852143a8600a3dda5612395 is confirmed to exist
2020-02-15 21:07:51.492 INFO SNAPSHOT_VALIDATE Chunk 3538bd69bc76dc23fe1288de2abea980d3f82d2249aca08b49aead45ef9b4fae is confirmed to exist
2020-02-15 21:07:51.726 INFO SNAPSHOT_VALIDATE Chunk 539f196612f92e77e8108ebf8d0e0e920db35c99c2b3cfcc347544884fa04590 is confirmed to exist
2020-02-15 21:07:51.940 INFO SNAPSHOT_VALIDATE Chunk 650785b2c9955d8bed2c7855483feb9d142f5b9e4f57541fe9f8fe278084f829 is confirmed to exist
2020-02-15 21:07:52.622 INFO SNAPSHOT_VALIDATE Chunk ac593ca683a109dad251863c2a75e9ed39aeec78e87e872e9cad55621f39555f is confirmed to exist
[... snip .... many more chunks confirmed]
2020-02-15 21:08:30.089 INFO SNAPSHOT_CHECK All chunks referenced by snapshot SERVER at revision 419 exist

Between these two checks, the only thing that happened was a couple of daily backup jobs. Definitely no prune.

How is it possible that a missing chunk suddenly reappears? Or - since I do understand that it was uploaded by one of the backup jobs (the chunk’s creation date on the storage is 14/02/2020, 04:08:08) - maybe I should ask: how could it be reported missing in the first place?

A pCloud issue?


This may not answer your question but I’ve had a few instances of a check reporting missing chunks only for it to later be OK…

TBH I haven’t researched it heavily, but I suspect it’s because the checks run in the middle of a lengthy copy operation, which happens over the internet via sftp for the best part of each weekend. This is a 1TB+ storage of fixed-size chunks (1MB) for Vertical Backup VMs, so quite a lot of chunks. It normally only seems to report the latest revisions as having a missing chunk, though, and I don’t have to do anything to fix it.

Another thing to note is that check, by default, won’t see fossilised chunks. Is it possible the chunk was fossilised and later resurrected? Did you look for that specific chunk name, or did you try, say, ls 8056b64444* to see if there were any chunks with the .fsl file extension?
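If the storage is reachable as a local or mounted directory, that lookup can be scripted. The following is a rough sketch, not anything Duplicacy ships: it assumes the layout seen in this thread (the first two hex characters of the chunk ID are a subdirectory under chunks/, the rest is the file name) and that a fossil is the same file name with .fsl appended. The function names `chunk_paths` and `chunk_state` are hypothetical.

```python
from pathlib import Path


def chunk_paths(storage_root, chunk_id):
    """Return (chunk path, fossil path) for a chunk ID.

    Assumes one level of nesting, as in /chunks/18/8056b6... from this
    thread; other storages may be laid out differently.
    """
    chunk = Path(storage_root) / "chunks" / chunk_id[:2] / chunk_id[2:]
    fossil = chunk.with_name(chunk.name + ".fsl")
    return chunk, fossil


def chunk_state(storage_root, chunk_id):
    """Classify a chunk as 'present', 'fossil' or 'missing'."""
    chunk, fossil = chunk_paths(storage_root, chunk_id)
    if chunk.exists():
        return "present"
    if fossil.exists():
        return "fossil"
    return "missing"


if __name__ == "__main__":
    # "/path/to/storage" is a placeholder for the mounted storage root.
    chunk_id = "188056b64444b232ea25a073c874f1f36614e88caba80eb780dca8c2d4268fdb"
    print(chunk_state("/path/to/storage", chunk_id))
```

A result of "fossil" would point towards the fossilised-then-resurrected explanation; "missing" would not rule out a listing problem on the backend.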

The "is confirmed to exist" message is printed here:

The commit message explains why we need an extra check:

A chunk not in the chunk list may actually exists in two scenarios:

  • the chunk may be a special snapshot chunk that contains the chunk sequence,
    so it may be resurrected by the chunk downloader if it had been turned into
    a fossil before
  • if the API to list all chunks doesn’t return the complete list due to some
    bug

This additional lookup avoid reporting the missing chunk prematurely.
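The idea of that extra lookup can be illustrated with a toy model. This is not Duplicacy's code, just a sketch of the logic described in the commit message: a chunk is only reported missing if it is absent from the listing and a direct lookup also fails. All names here (`find_missing`, `referenced`, `listed`, `exists`) are made up for illustration.

```python
def find_missing(referenced, listed, exists):
    """Return the chunk IDs that are really missing.

    referenced: chunk IDs the snapshots need
    listed:     chunk IDs returned by listing the storage
    exists:     callable that looks up one chunk ID directly
    """
    missing = []
    for chunk_id in sorted(referenced):
        if chunk_id in listed:
            continue  # the listing already confirms it
        if exists(chunk_id):
            continue  # listing was incomplete, but the chunk is there
        missing.append(chunk_id)
    return missing


if __name__ == "__main__":
    referenced = {"aaa", "bbb", "ccc"}
    listed = {"aaa"}                 # pretend the listing dropped "bbb"
    on_storage = {"aaa", "bbb"}      # what a direct lookup would find
    print(find_missing(referenced, listed, on_storage.__contains__))
```

In this model, a backend whose listing occasionally omits entries causes no false alarm, as long as the direct lookup is reliable.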

I don’t think it is the first case here, because two checks didn’t give the same results. So it must be case 2. Which storage backend is this?