Problems restoring from ext4 drive

Does that file exist under N:\Duplicacy\chunks? The full path should be N:\Duplicacy\chunks\43\4548aa… if it exists (the first two hex characters of the chunk ID become the subdirectory name).

If the file is there, it could be an issue with Paragon’s ext4 disk driver (especially considering that Duplicacy can’t even list N:\Duplicacy\snapshots properly with the standard API call). Can you try restoring the file from Ubuntu, just as you did with the list command?

No luck either, the chunk is not there…

timo@timo-VirtualBox:/media/timo/Volume/Restore$ ~/Downloads/duplicacy_linux_x64_2.0.10 restore -overwrite -r 74
Storage set to /media/timo/Sicherung I/Duplicacy
Restoring /media/timo/Volume/Restore to revision 74
Chunk 434548aa7855f2bb3e36d6355acc901918f077e602fdd45e29652317a84c6f89 can't be found
timo@timo-VirtualBox:/media/timo/Volume/Restore$

The check command can report which chunks are missing:

~/Downloads/duplicacy_linux_x64_2.0.10 check -r 74

If the missing chunks don’t exist on the storage, it is likely that a prune command deleted them with the -exclusive option while there were ongoing backups. The prune logs are kept under the .duplicacy/logs directory, so you’ll be able to find out when they were deleted.
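
For example, you can search the prune logs for the ID of the missing chunk from the repository root (only the .duplicacy/logs location is taken from above; the exact log file names vary by prune run):

grep -r 434548aa .duplicacy/logs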

If you never ran a prune command before, then I have no clue why they are missing.

This one has me a bit worried. Can I ask a question about this?

Does Duplicacy somehow verify that all chunks are present on the storage when a backup completes? Should I be running a check command after backup and prune to be sure?

Next question:

If a chunk somehow disappears from the storage, will it get re-uploaded at the next backup?

Final question to the OP:

Can you go back a couple of snapshots and get a complete restore?

No, the backup command doesn’t verify that all chunks are present before it completes. A prune command with the -exclusive option may remove some chunks before the snapshot file is uploaded as the last step of the backup command. Therefore, if you can’t absolutely rule out the possibility of a prune -exclusive command, you should run the check command.
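
As a minimal sketch of that habit (assuming the duplicacy binary is on your PATH, the repository is initialized, and the -a flag of check, which checks snapshots with any id, works as in the current CLI):

duplicacy backup -stats
duplicacy check -a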

If a chunk somehow disappears from the storage, will it get re-uploaded at the next backup?

Only if you run backup -hash; otherwise Duplicacy will always assume that all chunks referenced by the last snapshot exist.

3 chunks are missing in total, starting from my 2nd oldest revision (22) up to the latest revision (74). Revision 1 is fine; trying to recover it now.
I had the “Prune snapshots after first backup” checkbox enabled in the GUI, though…

Can you please implement the -hash option in the GUI and make Duplicacy more fail-proof? Finding out the hard way is quite unsatisfactory, as you can imagine… :frowning:

I don’t care much about speed, I care about my data.

A couple more questions:

A prune command with the -exclusive option may remove some chunks before the snapshot file is uploaded as the last step of the backup command.

This would only happen if multiple concurrent operations are happening to the same storage… is that correct? If I am the only user backing up to a particular storage, would this ever happen?

And let’s say we run the check command to be sure there are no missing chunks… If I find a missing chunk, what do we do then to get those chunks re-established? I guess you probably answered this… the only way is to run backup -hash? Is that right?

I feel a little concerned about this. Can you help me become a little less so? :slight_smile:

This would only happen if multiple concurrent operations are happening to the same storage… is that correct? If I am the only user backing up to a particular storage, would this ever happen?

This would only happen if multiple concurrent operations are happening to the same storage AND you run the prune command with the -exclusive option. You can safely run the prune command without the -exclusive option even while there are concurrent operations on the same storage.

The key here is the -exclusive option. It assumes exclusive access to the storage, so by definition you can’t have another backup in progress.
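
For example, a retention-style prune that is safe to run alongside concurrent backups; the -keep n:m arguments (keep one revision per n days for revisions older than m days) are only an illustration:

duplicacy prune -keep 0:360 -keep 30:180 -keep 7:30 -keep 1:7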

If you find a missing chunk, first make sure it doesn’t exist in the storage; if it does exist, then it is a different issue. Then check the prune logs under the .duplicacy/logs directory to see when the chunk was removed; this is to confirm it isn’t a Duplicacy bug. If it was deleted accidentally (due to the use of the -exclusive option), then run duplicacy backup -hash, hoping it may be able to recreate the same chunk. However, if the on-disk files have already changed, you may not be able to recreate it.
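
A concrete sketch of the first and last steps, using the chunk ID from this thread (the storage path is a placeholder, and note that a later reply in this thread corrects the advice about -hash):

# Confirm the chunk really is gone; the first two hex characters form the subdirectory:
ls /path/to/storage/chunks/43/4548aa*

# Attempt to recreate it; this can only work if the source files are unchanged:
duplicacy backup -hash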

Thank you very much. This makes sense.

But let’s say I want to be sure that my most recent backup is good… regardless of any corruption that may have happened on the storage previously… what do I have to do? Run the backup with -hash?

Also, reading the documentation on -hash, I don’t quite understand how that accomplishes what we want. Can you explain just a bit, please?

THANK YOU!

Running duplicacy check -r revision is probably enough under most circumstances, but if you are worried about the reliability of the storage, run duplicacy check -r revision -files, which downloads every file to verify that it has the right hash, and thus may take a long time.
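
For example, to verify the latest revision mentioned in this thread (this downloads every file, so expect it to be slow):

duplicacy check -r 74 -files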

The -hash option rescans all the files in the repository without checking whether they have changed. In doing so, it will blindly attempt to upload every chunk it creates, but skip the upload if the chunk already exists. Therefore it may have a chance to recreate a missing chunk.

ok thanks.

I guess I don’t really understand what to do about this.

I went to my storage and deleted one chunk file deliberately.

Then I ran the check command and verified that Duplicacy reports a missing chunk.

Now I want to “fix” the problem… so I tried a backup command with the -hash option… but it seems the newly created snapshot is still missing the chunk… so -hash didn’t really solve the problem.

I am thinking of periodically running the check command, maybe with -files… BUT I don’t know what to do if I find a missing chunk… somehow there needs to be a way to “heal” the backup snapshot.

Thoughts?

Yes, my thoughts are that I can’t rely on Duplicacy, unfortunately. And when it comes to backups, I need something 100% bulletproof. Duplicacy should consistently check its chunks. I never ran a prune -exclusive command… maybe some chunks got lost in the upload process to my server, or due to faulty sectors on the HDD, or whatever. But this should be detected and repaired by an automatic backup; otherwise I could just copy the files by hand, as Duplicacy’s repair functionality apparently has to be carried out manually by the user and, as I read your discussion, may be unsuccessful anyway.

Long story short: thank you very much for your support, @gchen, but may I ask for a (partial) refund for my two licenses, as I’m not going to use Duplicacy any longer… Thanks!

Hi Timo,

I have struggled with this one and have sort of convinced myself that it is a medium-level problem in some sense. For example, it seems to me that while some backup solutions do continually monitor backup integrity (CrashPlan and others), many don’t (Time Machine, Acronis, most personal-level tools)… do you think I am right about this?

It further indicates the need for multiple backup solutions.

I, for one, don’t want to use the duplicacy copy function for this sort of reason; rather, I am using Duplicacy to back up to two storages completely independently (rather than backing up the backup)…

And I’ll still run time machine

What I am worried about, though, are some of my family members who now rely on me for backups (Duplicacy) and don’t seem to be as obsessed with this sort of thing as I am… until they need the data… and then I’ll feel bad if there is a problem and they didn’t follow my direction to use a 2nd backup as well :slight_smile: so I am still a little stressed about this… and not sure what to do :slight_smile:

Timo, I issued a full refund for your two licenses on stripe.com, and you should see it on your credit card statement in a couple of days. If you don’t, please let me know.

Sorry to lose you as a customer, but I should clarify that it is possible to check chunks automatically after each backup; we just don’t make this the default. You can create a post-backup script that runs the check command to make sure that every chunk just uploaded actually exists. This will catch any missing-chunk error much earlier.
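
A minimal sketch of such a script, assuming Duplicacy’s convention of running scripts placed under .duplicacy/scripts (double-check the exact file name and location against the documentation for your version):

#!/bin/sh
# Save as .duplicacy/scripts/post-backup and make it executable (chmod +x).
# Verifies that every chunk referenced by the snapshots actually exists;
# a non-zero exit code makes the failure visible to whatever scheduled the backup.
exec duplicacy check -a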

In most cases where users reported missing chunks, the cause was the misuse of the -exclusive option (for example, here and here). However, in your case I tend to believe it is an issue with the Paragon ext4 driver, mostly because of the weird listing bug. If so, it would have been caught by the post-backup check script.

On how to recover from missing chunks: sorry, I was wrong about the use of the -hash option. I forgot that the -hash option still assumes that the chunks referenced by the last snapshot all exist. The correct way is to run the backup command with a different repository id (by editing the .duplicacy/preferences file). After the repairing backup command, you can change the repository id back to the original one. Then you can remove the snapshot file snapshots/temporary_id/1 from the storage. If no files changed before this repairing backup command, then no unreferenced chunks will be generated by this operation.
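
A sketch of that repair sequence; the repository ids “my-repo” and “temporary_id” are placeholders, and the storage path is the one from earlier in this thread:

# 1. Edit .duplicacy/preferences and change the "id" field, e.g. "my-repo" to "temporary_id".
# 2. Run the repairing backup; with a fresh id it rescans everything and re-uploads any missing chunks:
duplicacy backup

# 3. Change the "id" field in .duplicacy/preferences back to "my-repo".
# 4. Remove the temporary snapshot file from the storage:
rm "/media/timo/Sicherung I/Duplicacy/snapshots/temporary_id/1"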

It should be noted that repairing missing chunks isn’t guaranteed to work every time: if some files have changed, it will be impossible to regenerate the missing chunks. To be able to recover from missing chunks under any circumstances would require error-correction techniques, which incur significant overhead on storage systems that can otherwise be reasonably trusted. If you’re really worried about this issue, you should back up to multiple storages.

Kevinvinv, I disagree with you about the reason the copy function should be avoided. If you back up to two storages independently and some files change between the two backups, you will not get the same set of chunks on the two storages, so when a chunk is missing from one storage you can’t grab it from the other. Another advantage of the copy function is that you can use third-party tools like rsync/rclone to copy chunks between storages.
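
For example, assuming a second, copy-compatible storage was added under the name “offsite” with duplicacy add (the storage names are placeholders):

duplicacy copy -from default -to offsite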

Hi gchen, thanks for this thorough response.

I had a question: can one determine whether all the chunks are available (like the check command does) without using a password? In other words, is the storage password really required to verify the presence of all the chunks?

I would like to do something server-side for all my users but don’t want to have their passwords on file.

Unfortunately you need the encryption keys to derive the file names of chunks, so the answer is no.

Thanks.

Also, to me it seems that an acceptable recovery method when one finds a missing chunk is to re-upload the entire backup. I agree that replacing a missing chunk is pretty difficult, since the source files will no doubt have changed.

Detecting the situation reliably is my main concern at the moment. Recovering from it can be dealt with… but knowing that one needs to recover is the first step! :slight_smile: