Possible to restore an incomplete backup?

DANLSN · 18 June 2020 02:39

Hi there,

I’ve searched for an answer and suspect that it’s not possible but here goes.

I’m working through a really messy backup problem that has compounded over the years. Over that time my systems have changed, and the backups have become fragmented and duplicated all over the place. Under lockdown I’ve been trying to fix it.

The problem I have is that I have a snapshot on a spinning disk that does not have a complete revision but I believe it has ~3TB of chunks from an incomplete backup. Unfortunately I don’t have the original drive where the backup originated so I can’t just resume the backup.

Is it possible to restore an incomplete backup, even if it’s partially corrupted or incomplete?

TIA!

Dan Lawson

gchen · 18 June 2020 15:00

The metadata for an incomplete backup is stored only locally, in .duplicacy/incomplete. If you don’t have this file then it is almost impossible to recover files that have been uploaded.

DANLSN · 20 June 2020 00:02

Hi Gilbert, thanks for taking the time to respond.

I should have probably added in my original post that the initial backups were done using the Web GUI, so I believe I could still have the .duplicacy/incomplete, or is it very difficult even then?

TIA

gchen · 20 June 2020 19:59

It is possible to convert .duplicacy/incomplete into a regular snapshot file but that requires some hacking into the source code.

light2089 · 27 October 2020 03:50

I have a few very important files that were still being backed up when my server array (~10TB) crashed. Most of the important stuff (~8TB) got backed up though.
Duplicacy was running in Docker on separate cache drives, so all of its appdata survived.
How can I restore from the incomplete backup? Any help is much appreciated!
I can confirm I have the incomplete file, which is ~140MB.

I am happy to buy a commercial license if required.

gchen · 27 October 2020 15:56

The .duplicacy/incomplete is a plain json file. You can open it in a text editor to see if needed files are there. If a file isn’t included in the file list it will be very hard to recover.

Note that file paths are encoded in base64, so when you look for a file you’ll need to convert its relative path to base64 first.
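The lookup described above can be sketched in Python. This is only an illustration: it assumes the standard base64 alphabet with padding (which matches the "==" ending seen in names from the incomplete file), and the example path is hypothetical. The whole relative path has to be encoded, because the base64 form of a path fragment will generally not appear inside the base64 form of the full path.

```python
import base64


def encode_relative_path(path: str) -> str:
    """Return the base64 form of a relative path, for searching the incomplete file.

    Assumption: names are UTF-8 text encoded with the standard base64 alphabet.
    """
    return base64.b64encode(path.encode("utf-8")).decode("ascii")


# Hypothetical path, used only for illustration.
needle = encode_relative_path("photos/2019/img_0001.jpg")
print(needle)  # search for this string in a copy of .duplicacy/incomplete
```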

light2089 · 27 October 2020 17:50

Thanks for the response Gilbert. I did open the incomplete file but I am still not sure how to restore a file based on the information there.
The incomplete file has a list of files and a list of chunks. A typical file entry looks like this:
{
  "content": "52:613473:55:1100314",
  "gid": 100,
  "hash": "37f5d62078f35b84526d0d86f44c9bd562a01bcd7ea01b820310ce9a655c1d8f",
  "mode": 438,
  "name": "Rmxhc2ggTW9iIGJ5IFNlY3Rpb24gRCAyMDEyLTE0IEJhdGNoLmZsdg==",
  "size": 4998608,
  "time": 1378413379,
  "uid": 99
},
I am able to decode the name. Thanks for letting me know how it is encoded.
If you could let me know the command I would run to restore the file in that excerpt, I can replicate it to restore all my files from the incomplete file.
Additionally, where is the directory structure information stored?
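For reference, the entry quoted in this post can be read in Python roughly as follows. It is a sketch, not a description of Duplicacy internals: treating "time" as Unix seconds and "mode" as Unix permission bits are assumptions based on the values shown, and the "content" field is left uninterpreted.

```python
import base64
from datetime import datetime, timezone

# The file entry quoted above, copied as a Python dict.
entry = {
    "content": "52:613473:55:1100314",
    "gid": 100,
    "hash": "37f5d62078f35b84526d0d86f44c9bd562a01bcd7ea01b820310ce9a655c1d8f",
    "mode": 438,
    "name": "Rmxhc2ggTW9iIGJ5IFNlY3Rpb24gRCAyMDEyLTE0IEJhdGNoLmZsdg==",
    "size": 4998608,
    "time": 1378413379,
    "uid": 99,
}

# The name is the base64-encoded relative path.
path = base64.b64decode(entry["name"]).decode("utf-8")

# Assumption: "time" is a modification time in seconds since the Unix epoch.
modified = datetime.fromtimestamp(entry["time"], tz=timezone.utc)

# Assumption: the low nine bits of "mode" are Unix permission bits.
permissions = oct(entry["mode"] & 0o777)

print(path)
print(entry["size"], "bytes, modified", modified.isoformat(), "mode", permissions)
```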

gchen · 28 October 2020 03:09

Is the storage encrypted? The CLI doesn’t support this kind of recovery out of the box. Additional code is needed. Or maybe I can get you a separate tool just for this purpose.

light2089 · 28 October 2020 03:16

Yes the storage is encrypted.

That would be great! Let me know how to proceed.

Thinking about other ways - Is there a way I could generate a snapshot file corresponding to the files in the incomplete list, upload it to my cloud backup folder and make Duplicacy see it as a snapshot and restore using the regular procedure?

gchen · 28 October 2020 04:08

This is a great idea. Yes, I can confirm it works!

First, manually create ‘fake’ copies of the files that you want to restore. You can fill them with zeros or random data, but they must have exactly the same sizes and timestamps as the originals (which you can look up in the incomplete file).

Then run duplicacy backup. I think you should run from a new repository with only those fake files. The backup command will read the incomplete file, skip fake files, and remove non-existent files from the file list. It should complete quickly and you’ll have a valid revision.

Now you can remove all fake files and run a restore to recover the original copies.

Please make a copy of the incomplete file and save it somewhere else. This is very important: the backup command will delete this file on completion.
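The safety copy can be made by hand, or with a few lines of Python like the sketch below. The function name, the "incomplete.bak" file name, and both directory arguments are hypothetical; only the .duplicacy/incomplete location comes from this thread.

```python
import shutil
from pathlib import Path


def back_up_incomplete(repository: Path, destination_dir: Path) -> Path:
    """Copy <repository>/.duplicacy/incomplete to a safe location and return the copy's path."""
    source = repository / ".duplicacy" / "incomplete"
    destination_dir.mkdir(parents=True, exist_ok=True)
    target = destination_dir / "incomplete.bak"  # hypothetical name for the copy
    shutil.copy2(source, target)  # copy2 also preserves the file's timestamps
    return target


# Example call (paths are placeholders):
# back_up_incomplete(Path("/path/to/repository"), Path("/path/to/safe/place"))
```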

light2089 · 28 October 2020 05:26

I could possibly write a script to create the fake files using the incomplete list as an input, but the problem is I would need to create 8TB+ of data!
If that wasn’t the case, I could’ve done that. Is creating those fake files the only way to create the snapshot file?
In that case, a tool to restore using the incomplete file list might be more feasible.
What do you think?

gchen · 28 October 2020 14:19

You can create fake files as sparse files, i.e., open a new file for writing, seek to the supposed size, and then close. This won’t take too much space.
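A minimal Python sketch of this idea follows. Note one substitution: in Python, seeking past the end and closing does not by itself extend a file, so the sketch uses truncate() to set the length instead. Whether the result is actually sparse depends on the filesystem. The function name is hypothetical, and the size and time arguments are meant to be the "size" and "time" values from the incomplete file, with "time" assumed to be Unix seconds.

```python
import os
from pathlib import Path


def create_placeholder(path: Path, size: int, mtime: int) -> None:
    """Create a placeholder file with the given length and modification time."""
    path.parent.mkdir(parents=True, exist_ok=True)
    with open(path, "wb") as handle:
        # Extend the empty file to `size` bytes without writing any data.
        # On filesystems that support sparse files this allocates almost no space.
        handle.truncate(size)
    # Set access and modification times to the recorded timestamp.
    os.utime(path, (mtime, mtime))


# Example call with the values from the entry quoted earlier in the thread
# (the target directory is a placeholder):
# create_placeholder(Path("fake_repo/Flash Mob by Section D 2012-14 Batch.flv"),
#                    4998608, 1378413379)
```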

light2089 · 29 October 2020 05:50

Thanks for the suggestion and help.

I converted the JSON to a CSV and decoded the file names to analyze which files were successfully backed up. Unfortunately, the files that were backed up were not the important ones, so I am pausing this endeavor for now.
Instead I will try to retrieve the data from the dead HDDs - hopefully it is just a PCB issue and the data can still be retrieved.
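For anyone repeating the analysis step described in this post, a JSON-to-CSV conversion could look roughly like the sketch below. It is not the script used here. The function takes the list of file entries directly, because the top-level layout of the incomplete file is not shown in this thread; the "files" key in the usage comment is a guess, and the column choice is arbitrary.

```python
import base64
import csv
import io


def entries_to_csv(entries, out) -> None:
    """Write one CSV row per file entry, with the base64 name decoded to a readable path."""
    writer = csv.writer(out)
    writer.writerow(["path", "size", "time"])
    for entry in entries:
        path = base64.b64decode(entry["name"]).decode("utf-8")
        writer.writerow([path, entry["size"], entry["time"]])


# Usage sketch. Assumption: the file list sits under a top-level "files" key;
# check a copy of the incomplete file before relying on this.
# import json
# with open("incomplete.bak") as src, open("files.csv", "w", newline="") as dst:
#     entries_to_csv(json.load(src)["files"], dst)
```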