Does Duplicacy ensure the integrity of restored files?

What happens if a bit gets flipped in a chunk file? Does the restore fail when it gets to restoring that chunk?
In other words, if the restore completes, can you be sure that all the bits in the restored files are correct?

If there is a bit flip in a chunk, you won’t even be able to download the chunk, since its hash will no longer match.

On top of that, after a file is restored, its hash is compared with the known hash stored in the backup. The restore will fail if the two hashes are different.
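
To illustrate the chunk check described above, here is a minimal sketch. It assumes a chunk is identified by the plain SHA-256 of its contents; that is a simplification for illustration, not Duplicacy's actual chunk format, and the helper names (`chunk_id`, `verify_chunk`, `flip_bit`) are hypothetical.

```python
import hashlib


def chunk_id(data: bytes) -> str:
    # Assumption: the chunk's identifier is the SHA-256 of its contents.
    # Duplicacy's real hashing scheme differs in detail.
    return hashlib.sha256(data).hexdigest()


def verify_chunk(expected_id: str, data: bytes) -> bool:
    # A chunk is accepted only if its recomputed hash matches its identifier.
    return chunk_id(data) == expected_id


def flip_bit(data: bytes, bit_index: int) -> bytes:
    # Simulate bit rot by inverting a single bit.
    corrupted = bytearray(data)
    corrupted[bit_index // 8] ^= 1 << (bit_index % 8)
    return bytes(corrupted)


chunk = b"example chunk contents"
stored_id = chunk_id(chunk)

print(verify_chunk(stored_id, chunk))               # True
print(verify_chunk(stored_id, flip_bit(chunk, 5)))  # False
```

A single inverted bit changes the hash, so the damaged chunk is rejected instead of being written into a restored file.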

As @gchen explained, Duplicacy depends on the file system to provide long-term reliable storage. In practice that means that cheap onsite storage like external HDDs is not suitable as a backup storage medium for any important data.

You need a file system like ZFS or BTRFS that has built-in checksums to detect bit rot, and you need disk redundancy through RAID, so that the file system can repair detected checksum errors by reading the data from the redundant media.

For most non-business users with (presumably) a limited amount of important data, using an off-site backup might be cheaper than a suitable onsite solution, say a NAS in the appropriate configuration.

Good answers. Thanks.

The restore will fail if two hashes are different.

I am a new Duplicacy user and was researching my options. Reading the above had me worried, as I was planning to use an external HDD for backing up my NAS and was concerned that a single bit flip could render the “entire” backup/restore useless.

Then I realized the restore command actually has a -persist option that would ignore the corrupted chunk and continue to restore the unaffected files! I tested it out by temporarily renaming a saved chunk.

While I understand that using something with parity and redundancy is a better option (as @tangofan has mentioned), I figure I can live with the risk of using an external HDD as a backup of a backup, since only “some” of the files will be affected should an unfortunate bit flip occur. I suppose running check -chunks frequently would also help to identify the problem early, while the source files may still be there.
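
To show why only “some” files are affected, here is a rough model of a restore that keeps going past bad chunks. It is a sketch under stated assumptions, not Duplicacy's implementation: each file is treated as an ordered list of chunk IDs, each ID is assumed to be the SHA-256 of the chunk contents, and `restore_persist` is a hypothetical helper.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


def restore_persist(files, storage):
    """Restore every file whose chunks all verify; report the rest.

    files:   file name -> ordered list of chunk IDs (assumed layout)
    storage: chunk ID  -> chunk bytes
    """
    restored, failed = {}, []
    for name, chunk_ids in files.items():
        parts = []
        for cid in chunk_ids:
            data = storage.get(cid)
            # A missing chunk or a hash mismatch spoils only this file.
            if data is None or sha256_hex(data) != cid:
                failed.append(name)
                break
            parts.append(data)
        else:
            # Runs only when no chunk of this file failed.
            restored[name] = b"".join(parts)
    return restored, failed


chunks = [b"alpha", b"bravo", b"charlie"]
ids = [sha256_hex(c) for c in chunks]
storage = dict(zip(ids, chunks))
files = {"a.txt": [ids[0], ids[1]], "b.txt": [ids[1]], "c.txt": [ids[2]]}

# Simulate a lost chunk, much like renaming a chunk file in the storage.
del storage[ids[2]]

restored, failed = restore_persist(files, storage)
print(sorted(restored))  # ['a.txt', 'b.txt']
print(failed)            # ['c.txt']
```

In this model a damaged chunk affects every file that references it, so the impact depends on how many files share that chunk.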

Today Duplicacy supports erasure coding, so if you choose to initialize the storage this way, it can handle unreliable media, within reason.
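
As a rough idea of how parity data lets damaged data be rebuilt, here is a minimal sketch using a single XOR parity shard. This is a deliberately simpler stand-in, not the scheme Duplicacy uses: it can rebuild exactly one missing shard of equal length, and the function names are hypothetical.

```python
from functools import reduce


def xor_bytes(a: bytes, b: bytes) -> bytes:
    # Byte-wise XOR of two equal-length shards.
    return bytes(x ^ y for x, y in zip(a, b))


def make_parity(shards):
    # The parity shard is the XOR of all data shards.
    return reduce(xor_bytes, shards)


def recover_missing(shards, parity):
    # Assumes exactly one shard is None; XOR-ing the parity with the
    # surviving shards yields the missing one.
    present = [s for s in shards if s is not None]
    return reduce(xor_bytes, present, parity)


data_shards = [b"AAAA", b"BBBB", b"CCCC"]
parity = make_parity(data_shards)

damaged = [b"AAAA", None, b"CCCC"]
print(recover_missing(damaged, parity))  # b'BBBB'
```

With two or more shards lost, this simple scheme cannot recover anything, which is one way to picture the “within reason” limit.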

That said, it is not a guarantee; it’s merely an improvement in the outlook. A better approach would be to back up to reliable storage that guarantees correctness and consistency. Otherwise, according to Murphy’s law, the corruption will be right in the middle of the exact file of the exact version you wanted to restore.

Wow, fascinating. Thanks for your Reddit posts (and your blog posts on Duplicacy tips for macOS and Synology), which introduced me to this wonderful tool.