Deduplication and check command

I initialized a repo with the defaults and backed up a copy of an ESXi virtual machine, a Linux guest without any applications or databases on it, expecting few file system changes between subsequent copies of the vmdk. After running check, I get the following check output.

Am I interpreting this correctly, in that hardly any deduplication has occurred? For the first backup, chunks/bytes equals new/bytes, which is expected. For the second backup, however, new bytes looks like 61% of actual bytes. I would have thought it would be much better.

The recommended setting for backing up virtual machines (which is also used by our ESXi backup tool, Vertical Backup) is fixed-size chunking with the chunk size set to 1MB. This is done by running the init command with the following options:

duplicacy init -c 1M -max 1M -min 1M repository_id storage_url

You’ve already initialized the storage and run two backups, so you can add a secondary storage with the settings for virtual machines, then ‘replay’ the backups using the following commands:

duplicacy add -c 1M -max 1M -min 1M fixed-size repository_id storage_url
duplicacy restore -r 1                         # Restore the first vmdk file from the default storage
duplicacy backup -storage fixed-size -stats    # Backup the first vmdk file to the new storage 
duplicacy restore -r 2                         # Restore the second one 
duplicacy backup -storage fixed-size -stats    # Backup the second one to the new storage
duplicacy check -storage fixed-size -tabular   # Now check if the new settings make any differences

I should have mentioned that I found the advice about the init parameters in a GitHub issue, and I did use those values for the repo in question.

I don’t think the storage was initialized with -c 1M -max 1M -min 1M. The backup at revision 1 is 8194M bytes but only has 674 chunks. If the chunk size were fixed at 1M, the number of chunks would have been 8194 (or about this value if there are extra files).
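
As a rough sanity check of that arithmetic, here is a small sketch in plain shell integer math. It uses only the two figures quoted above (8194M and 674 chunks); the variable names are made up for illustration and are not anything Duplicacy itself defines.

```shell
# Figures quoted from the check output for revision 1
size_mb=8194      # total size of the revision, in MB
chunks=674        # number of chunks actually reported

# With the chunk size fixed at 1M, roughly one chunk per MB would be expected
fixed_chunk_mb=1
expected_chunks=$((size_mb / fixed_chunk_mb))

# Average chunk size implied by the reported chunk count (integer division)
avg_chunk_mb=$((size_mb / chunks))

echo "expected chunks at 1M: $expected_chunks"
echo "implied average chunk size: about ${avg_chunk_mb}M"
```

An implied average of about 12M per chunk is a long way from 1M, which is why the reported chunk count does not look like fixed 1M chunking.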

You can run duplicacy -d list to check the storage parameters.