Chunk can't be found

Droolio · 10 March 2020 03:17

It’s still not clear whether the missing chunks in your case are from historic revisions or are missing from the last backup. Your logs should say which revisions each missing chunk pertains to, so hopefully you only have to delete the revisions that are affected.

Also, if a chunk isn’t missing from the last backup, you only need to delete the revisions it’s missing from. If it is missing from the last backup, you need to ‘fix’ the missing chunk or delete enough revisions until you have an intact most-recent snapshot as your last revision.

I wouldn’t use prune to delete these revisions as it may abort when it discovers a chunk is missing, and cause more chunks to be renamed to fossils and not properly collected.

To delete the revisions, you simply delete the numbered files under the storage in the \snapshots\<repository_id> folder…

As far as fixing it upon the next upload, it’s not quite as simple as that. To enable incremental backup, Duplicacy assumes that all the chunks referenced in the last revision actually exist. It doesn’t check whether they do, so it will skip them regardless.

But you can often force these missing chunks to be re-uploaded (if the data is still part of your existing repository and not old data that’s long been deleted). One way to do this is to create a new temporary backup ID pointing at the same repository location. Running such a backup will check which chunks exist on the storage, rescan the whole repository, skip the chunks that are already there, and upload only the missing chunk(s). This should fix things, but there’s a small chance the chunk doesn’t get recreated from the current state of the repository.

Oh and if you delete any revisions, keep a copy or put them in the recycle bin. If you’re able to restore the missing chunk(s), those revisions will be good again. Put them back and your storage should be good again.
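The advice above (remove the affected revision files by hand, but keep a copy) could be sketched roughly as follows. This is only an illustration, assuming the storage is a local or mounted directory laid out as `snapshots/<repository_id>/<revision>`; the function name `quarantine_revisions` and the quarantine folder are hypothetical, not part of Duplicacy.

```python
import shutil
from pathlib import Path


def quarantine_revisions(storage_root, snapshot_id, revisions, quarantine_dir):
    """Move the given revision files out of the storage and keep them for later.

    Assumes a locally reachable storage laid out as
    <storage_root>/snapshots/<snapshot_id>/<revision number>.
    Returns the list of revisions that were actually moved.
    """
    source_dir = Path(storage_root) / "snapshots" / snapshot_id
    target_dir = Path(quarantine_dir) / snapshot_id
    target_dir.mkdir(parents=True, exist_ok=True)

    moved = []
    for revision in revisions:
        source = source_dir / str(revision)
        if not source.is_file():
            # Skip revisions that are not present rather than failing halfway.
            continue
        shutil.move(str(source), str(target_dir / source.name))
        moved.append(revision)
    return moved
```

If the missing chunk is restored later, moving the files back from the quarantine folder into `snapshots/<repository_id>` makes those revisions visible again.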

kevinvinv · 10 March 2020 03:41

Thanks much!

One question if I may… what logs do you think will tell me what revisions need the missing chunk?

I am not seeing that info in the check log… is there a different log I should look at?

Thanks again.

Droolio · 10 March 2020 04:50

Yes, it should be in the log for the check job. On the Dashboard at the bottom under Activities, hover your mouse over the red line - a ‘check’ should show up which you can click and that particular log opens.

For reference, an example line should look like this:

2019-04-14 04:21:42.950 WARN SNAPSHOT_VALIDATE Chunk 605dcc6c289af06ebe14e8978028b7eea03b5e1fcc8332198c3950c0ff122221 referenced by snapshot Redacted at revision 231 does not exist

Edit: Oh and you may have several of those lines, followed by a summary for each revision:

2019-04-14 04:21:44.980 WARN SNAPSHOT_CHECK Some chunks referenced by snapshot Redacted at revision 231 are missing
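If the check log does contain lines like the two above, a small script can collect the affected revisions for you. A minimal sketch, assuming the `SNAPSHOT_VALIDATE` wording stays exactly as in the example line; the function name is made up for illustration.

```python
import re
from collections import defaultdict

# Matches the per-chunk warning shown in the example log line above.
MISSING_CHUNK = re.compile(
    r"SNAPSHOT_VALIDATE Chunk (?P<chunk>[0-9a-f]+) referenced by snapshot "
    r"(?P<snapshot>\S+) at revision (?P<revision>\d+) does not exist"
)


def revisions_with_missing_chunks(log_lines):
    """Map each snapshot ID to the sorted revisions that reference a missing chunk."""
    affected = defaultdict(set)
    for line in log_lines:
        match = MISSING_CHUNK.search(line)
        if match:
            affected[match["snapshot"]].add(int(match["revision"]))
    return {snapshot: sorted(revs) for snapshot, revs in affected.items()}
```

Feeding it the lines of a check log (for example `open(path).readlines()`) gives the revisions you would need to delete or repair, per snapshot ID.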

kevinvinv · 12 March 2020 02:12

Wow, I don’t see anything like that… perhaps I need to upgrade? Here is all I see:

Running check command from /Users/kevinvdel/.duplicacy-web/repositories/localhost/all
Options: [-log check -storage storage_timberwolf -a -tabular]
2020-03-09 00:09:39.707 INFO STORAGE_SET Storage set to sftp://backups@my_machine//share/homes/backups/KevinV
2020-03-09 00:09:40.085 INFO SNAPSHOT_CHECK Listing all chunks
2020-03-09 00:10:20.963 INFO SNAPSHOT_CHECK 1 snapshots and 63 revisions
2020-03-09 00:10:20.965 INFO SNAPSHOT_CHECK Total chunk size is 311,203M in 79497 chunks
2020-03-09 00:10:20.969 FATAL DOWNLOAD_CHUNK Chunk db537514f0d6d66e4662622438b843cbab8f042f7a4ecaef0e6de78f3d7628cd can't be found
Chunk db537514f0d6d66e4662622438b843cbab8f042f7a4ecaef0e6de78f3d7628cd can't be found

Do you think it is a version thing?

Droolio · 12 March 2020 19:08

It shouldn’t be - Duplicacy Web should automatically download the latest CLI engine, although I think it only does that on startup. Double-check that your Web client is up-to-date, though.

Maybe the chunk in question is a metadata chunk, and therefore may not highlight which snapshot revision it’s failing to load the data for?

Otherwise, you should see a line for each revision like so:

2020-03-12 08:46:00.545 INFO SNAPSHOT_CHECK All chunks referenced by snapshot dolores-c_users_droolio at revision 1 exist
2020-03-12 08:46:00.556 INFO SNAPSHOT_CHECK All chunks referenced by snapshot dolores-c_users_droolio at revision 34 exist
2020-03-12 08:46:00.571 INFO SNAPSHOT_CHECK All chunks referenced by snapshot dolores-c_users_droolio at revision 67 exist
2020-03-12 08:46:00.588 INFO SNAPSHOT_CHECK All chunks referenced by snapshot dolores-c_users_droolio at revision 240 exist
...
2020-03-12 08:46:09.532 INFO SNAPSHOT_CHECK All chunks referenced by snapshot dolores-c_users_droolio at revision 5750 exist

etc.

kevinvinv · 29 March 2020 20:23

Reviving this… I backed off all the way to revision 1 and a check ran clean… then less than a few days later I am getting this again from check.

I am getting fairly desperate here… any other ideas?

Running check command from /Users/kevinvsdel/.duplicacy-web/repositories/localhost/all
Options: [-log check -storage storage_timberwolf -a -tabular]
2020-03-29 05:57:46.411 INFO STORAGE_SET Storage set to sftp://backups@kv.dns.org:2222//share/homes/backups/KevinV
2020-03-29 05:57:46.928 INFO SNAPSHOT_CHECK Listing all chunks
2020-03-29 08:32:46.975 WARN SFTP_RETRY Encountered an error (failed to send packet: EOF); retry after 1 second(s)
2020-03-29 08:32:55.394 INFO SNAPSHOT_CHECK 1 snapshots and 12 revisions
2020-03-29 08:32:55.395 INFO SNAPSHOT_CHECK Total chunk size is 323,570M in 83835 chunks
2020-03-29 08:32:55.400 FATAL DOWNLOAD_CHUNK Chunk db537514f0d6d66e4662622438b843cbab8f042f7a4ecaef0e6de78f3d7628cd can't be found
Chunk db537514f0d6d66e4662622438b843cbab8f042f7a4ecaef0e6de78f3d7628cd can't be found

gchen · 30 March 2020 02:17

Does this chunk exist in the storage?

If not, run the check command on one revision at a time (duplicacy check -r n) to see which revision this chunk is in. Then go through all the check logs (~/.duplicacy-web/check*.log) to find out when the chunk first went missing, and also all the prune logs (~/.duplicacy-web/prune*.log) to see if any prune command removed this chunk.
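Going through the check and prune logs by hand gets tedious, so here is a rough sketch of how the search could be automated. It assumes the logs are plain text files whose lines start with a `YYYY-MM-DD HH:MM:SS.mmm` timestamp, as in the excerpts in this thread; the function name and the default glob pattern are illustrative.

```python
import re
from pathlib import Path

# Log lines in this thread start with a timestamp like "2020-03-29 08:32:55.400 ".
TIMESTAMP = re.compile(r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3}) ")


def mentions_of_chunk(log_dir, chunk_hash, pattern="check*.log"):
    """Return (timestamp, file name, line) for every timestamped line naming the chunk."""
    hits = []
    for log_file in Path(log_dir).glob(pattern):
        with log_file.open(encoding="utf-8", errors="replace") as handle:
            for line in handle:
                stamp = TIMESTAMP.match(line)
                if stamp and chunk_hash in line:
                    hits.append((stamp.group(1), log_file.name, line.rstrip("\n")))
    # This timestamp format sorts chronologically as plain text.
    return sorted(hits)
```

The first entry of the result is the earliest log line that names the chunk; calling it again with `pattern="prune*.log"` covers the prune logs.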

kevinvinv · 31 March 2020 01:03

OK so this is weird… thanks @gchen for your info.

I did this on the command line:

duplicacy -log init -e kevin_salvage1 sftp://backups@kv.dns.org:2222//share/hoxdrmes/dataps/KevinV

duplicacy -log check

And got this:
2020-03-30 18:57:06.098 INFO SNAPSHOT_CHECK Listing all chunks
2020-03-30 18:57:12.875 INFO SNAPSHOT_CHECK All chunks referenced by snapshot kevin_salvage1 at revision 1 exist
2020-03-30 18:57:13.158 INFO SNAPSHOT_CHECK All chunks referenced by snapshot kevin_salvage1 at revision 2 exist
2020-03-30 18:57:13.430 INFO SNAPSHOT_CHECK All chunks referenced by snapshot kevin_salvage1 at revision 3 exist
2020-03-30 18:57:13.702 INFO SNAPSHOT_CHECK All chunks referenced by snapshot kevin_salvage1 at revision 4 exist
2020-03-30 18:57:13.995 INFO SNAPSHOT_CHECK All chunks referenced by snapshot kevin_salvage1 at revision 5 exist
2020-03-30 18:57:14.279 INFO SNAPSHOT_CHECK All chunks referenced by snapshot kevin_salvage1 at revision 6 exist
2020-03-30 18:57:14.551 INFO SNAPSHOT_CHECK All chunks referenced by snapshot kevin_salvage1 at revision 7 exist
2020-03-30 18:57:14.823 INFO SNAPSHOT_CHECK All chunks referenced by snapshot kevin_salvage1 at revision 8 exist
2020-03-30 18:57:15.098 INFO SNAPSHOT_CHECK All chunks referenced by snapshot kevin_salvage1 at revision 9 exist
2020-03-30 18:57:15.375 INFO SNAPSHOT_CHECK All chunks referenced by snapshot kevin_salvage1 at revision 10 exist
2020-03-30 18:57:15.716 INFO SNAPSHOT_CHECK All chunks referenced by snapshot kevin_salvage1 at revision 11 exist
2020-03-30 18:57:16.031 INFO SNAPSHOT_CHECK All chunks referenced by snapshot kevin_salvage1 at revision 12 exist
2020-03-30 18:57:16.314 INFO SNAPSHOT_CHECK All chunks referenced by snapshot kevin_salvage1 at revision 13 exist

So it looks like everything is good from this point of view

But if I look at the GUI logs I see this:

and if I click on “Missing Chunks” I see these messages

Running check command from /Users/kevinvsdel/.duplicacy-web/repositories/localhost/all
Options: [-log check -storage storage_timberwolf -a -tabular]
2020-03-29 05:57:46.411 INFO STORAGE_SET Storage set to sftp://backups@kv.ds.org:2222//share/hXXXXX/ups/KevinV
2020-03-29 05:57:46.928 INFO SNAPSHOT_CHECK Listing all chunks
2020-03-29 08:32:46.975 WARN SFTP_RETRY Encountered an error (failed to send packet: EOF); retry after 1 second(s)
2020-03-29 08:32:55.394 INFO SNAPSHOT_CHECK 1 snapshots and 12 revisions
2020-03-29 08:32:55.395 INFO SNAPSHOT_CHECK Total chunk size is 323,570M in 83835 chunks
2020-03-29 08:32:55.400 FATAL DOWNLOAD_CHUNK Chunk db537514f0d6d66e4662622438b843cbab8f042f7a4ecaef0e6de78f3d7628cd can't be found
Chunk db537514f0d6d66e4662622438b843cbab8f042f7a4ecaef0e6de78f3d7628cd can't be found

Can you pls advise what I should try next?

Is it bad that the Check is the first thing in my schedule?

gchen · 31 March 2020 20:25

I suspect that you’re running an early version of Duplicacy CLI (and the old GUI) that uses 2 nesting levels for chunks, whereas recent versions of Duplicacy all use a nesting level of 1. If the chunk db5375... is stored as chunks/db/53/75... then the nesting level is 2.

If that is the case then the fix is to write a script that moves all chunks up one level.

kevinvinv · 31 March 2020 22:00

OK can you clarify? The screen shot in my post above I think shows that I am using the web gui…

It is possible that when I ran the command line… I used an older version of the CLI… but that said, all chunks were present.

I am awful confused.

gchen · 1 April 2020 03:02

All chunks are there, but they are under subdirectories that are 2 levels down, while the web GUI expects them to be under subdirectories that are only 1 level down.

Can you first check the chunks directory on the storage to confirm that this is the case?

kevinvinv · 1 April 2020 10:14

My chunks dir looks like this

and inside one of those sub dirs I see this:

Thoughts?

gchen · 1 April 2020 14:36

So the nesting level is 1 and this is not the issue.

I do notice that you ran CLI without the -a option (duplicacy -log check), which means it only checks snapshots with the id kevin_salvage1. So there must be another snapshot id that has missing chunks.

Two questions:

Does the chunk db537514f0d6d66e4662622438b843cbab8f042f7a4ecaef0e6de78f3d7628cd exist in the storage?
Is there another subdirectory under snapshots besides kevin_salvage1?

kevinvinv · 2 April 2020 01:58

Regarding other directories under snapshots: it looks like there are none.

I do not see that chunk anywhere in the storage

???

kevinvinv · 2 April 2020 02:03

I thought I was on to something but I was wrong… so … the question still is here… why does the CLI say no missing chunks and the gui says missing chunks?

Can I pay for some on-site support from someone? I need to get this resolved asap- seriously. Thanks!!

kevinvinv · 2 April 2020 02:22

I also note that this is happening on two completely separate storages I think.

Should I somehow totally uninstall Duplicacy and start over? Maybe there is something messed up from the old gui days??

gchen · 2 April 2020 14:50

This chunk should be found under the subdirectory db with the name 537514f0d6d66e4662622438b843cbab8f042f7a4ecaef0e6de78f3d7628cd.
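To make the expected location concrete, here is a small sketch of the mapping from a chunk ID to its path for a given nesting level, following the description earlier in this thread (level 1 gives `chunks/db/5375...`, level 2 gives `chunks/db/53/75...`). The function names are hypothetical, and `chunk_exists` assumes a local or mounted copy of the storage.

```python
from pathlib import Path


def chunk_relative_path(chunk_hash, nesting_level=1):
    """Turn the leading hex pairs of the chunk ID into directories, one per level."""
    parts = ["chunks"]
    for level in range(nesting_level):
        parts.append(chunk_hash[2 * level : 2 * level + 2])
    # Whatever is left after the directory prefixes is the file name.
    parts.append(chunk_hash[2 * nesting_level :])
    return "/".join(parts)


def chunk_exists(storage_root, chunk_hash, nesting_level=1):
    """Check a locally reachable copy of the storage for the chunk file."""
    return (Path(storage_root) / chunk_relative_path(chunk_hash, nesting_level)).is_file()
```

Checking both `nesting_level=1` and `nesting_level=2` tells you whether the chunk is really gone or just stored one directory deeper than expected.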

I would suggest running the CLI from the directory ~/.duplicacy-web/bin and checking with the -a option (check -a). Also make sure that the CLI is working with the same storage as the web GUI. In your posts they do not look the same (sftp://backups@kv.dns.org:2222//share/hoxdrmes/dataps/KevinV vs sftp://backups@kv.ds.org:2222//share/hXXXXX/ups/KevinV), even though I understand the latter is redacted.

kevinvinv · 4 April 2020 01:40

This chunk does not exist in the db directory or any directory that I can find…

Will try your next suggestion next.

kevinvinv · 4 April 2020 02:02

OK- in the webgui / bin directory there are three executables. I ran the latest one with check -a and see this: (No missing chunks)

Salvage1:bin kevinvannorsdel$ ./duplicacy_osx_x64_2.4.1 check -a

Storage set to sftp://backups@kv.redact.org:2222//share/homes/backups/KevinV

Listing all chunks

1 snapshots and 17 revisions
Total chunk size is 328,017M in 85224 chunks
All chunks referenced by snapshot kevin_salvage1 at revision 1 exist
All chunks referenced by snapshot kevin_salvage1 at revision 2 exist
All chunks referenced by snapshot kevin_salvage1 at revision 3 exist
All chunks referenced by snapshot kevin_salvage1 at revision 4 exist
All chunks referenced by snapshot kevin_salvage1 at revision 5 exist
All chunks referenced by snapshot kevin_salvage1 at revision 6 exist
All chunks referenced by snapshot kevin_salvage1 at revision 7 exist
All chunks referenced by snapshot kevin_salvage1 at revision 8 exist
All chunks referenced by snapshot kevin_salvage1 at revision 9 exist
All chunks referenced by snapshot kevin_salvage1 at revision 10 exist
All chunks referenced by snapshot kevin_salvage1 at revision 11 exist
All chunks referenced by snapshot kevin_salvage1 at revision 12 exist
All chunks referenced by snapshot kevin_salvage1 at revision 13 exist
All chunks referenced by snapshot kevin_salvage1 at revision 14 exist
All chunks referenced by snapshot kevin_salvage1 at revision 15 exist
All chunks referenced by snapshot kevin_salvage1 at revision 16 exist
All chunks referenced by snapshot kevin_salvage1 at revision 17 exist

but the GUI check still shows missing…

Can I somehow see the exact command line that the GUI is running?

gchen · 4 April 2020 02:45

The server here (kv.redact.org) is different from what you showed earlier (kv.ds.org):

The log you already posted above tells you the directory the check command ran from and the arguments:

So this is what you need to do to replicate the check run:

cd /Users/kevinvsdel/.duplicacy-web/repositories/localhost/all
/Users/kevinvsdel/.duplicacy-web/bin/duplicacy_osx_x64_2.4.1 check -storage storage_timberwolf -a -tabular
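The same two pieces of information (working directory and arguments) can be pulled out of any web GUI job log automatically. A minimal sketch, assuming the log starts with the `Running ... command from` and `Options: [...]` header lines seen in this thread; the function name is made up, the CLI path is whatever binary you find under ~/.duplicacy-web/bin, and splitting on whitespace would mangle an option value that itself contains spaces.

```python
import re


def replay_command(log_lines, cli_path):
    """Return (working directory, argv) reconstructed from a job log header."""
    cwd = None
    options = None
    for line in log_lines:
        text = line.strip()
        ran_from = re.match(r"Running \w+ command from (.+)$", text)
        if ran_from:
            cwd = ran_from.group(1)
        opts = re.match(r"Options: \[(.*)\]$", text)
        if opts:
            # Options are logged space-separated inside square brackets.
            options = opts.group(1).split()
    if cwd is None or options is None:
        raise ValueError("log header not found")
    return cwd, [cli_path] + options
```

The result can be passed straight to `subprocess.run(argv, cwd=cwd)`. Note that the options are kept exactly as logged, so the leading `-log` from the header is included.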