Only back up changes

Hi all,

I’ve just come from CrashPlan, where I had my backups set up to upload only the changes made to the folders I wanted backed up.

One of the folders I back up contains all of my photos. If I delete files from it, add files to it, or move files within it, then I need those changes reflected in my backup.

I don’t want to have to upload the entire folder again: A) because of its size, and B) because of my cruddy internet speed.

Being new to Duplicacy, I’m wondering: is this already the way Duplicacy works? If not, is there a way I can make it work this way (using the web interface)?

Cheers

Absolutely. Duplicacy always does incrementals after the initial backup.

In fact, these incrementals are unlike most backups in that they also function as full snapshots, so there’s no point in doing full backups. (There is a -hash option to force checking file contents instead of relying on timestamps etc., but even that doesn’t waste upload bandwidth or storage space.)
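If a sketch helps, here’s a toy model of the idea in Python. Everything in it (the remote set, the upload() function, the chunk contents) is invented for illustration; it’s not Duplicacy’s actual code:

```python
import hashlib

remote = set()  # stands in for the set of chunk hashes already on the storage

def upload(chunk: bytes) -> None:
    h = hashlib.sha256(chunk).hexdigest()
    if h not in remote:  # only chunks the storage has never seen cost bandwidth
        remote.add(h)

# Initial backup: every chunk is new.
for chunk in [b"photos-part-1", b"photos-part-2", b"photos-part-3"]:
    upload(chunk)

# A later backup (even one forced to re-read everything, as with -hash)
# re-hashes all chunks but only transfers the one that actually changed.
for chunk in [b"photos-part-1", b"photos-part-2-edited", b"photos-part-3"]:
    upload(chunk)

print(len(remote))  # 4 chunks stored in total, not 6
```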

I would encourage you to read the CLI readme and some of the docs there, including the bit about Lock Free Deduplication, to see how the underlying engine works.

I would have phrased it slightly differently: in Duplicacy, the distinction between full and incremental backups doesn’t make sense. All backups are full backups, in the sense that they allow you to restore all files in that backup even if you delete (prune) all other backups. And they are all incremental, in the sense that data that has previously been uploaded to the backup storage will not be uploaded a second time. Right?
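To make that concrete, here’s a toy sketch of that mental model. The store dict, the put() helper, and the revision names are all invented for illustration; this isn’t real Duplicacy code. A snapshot is modelled as nothing more than a list of chunk hashes, so any single snapshot restores on its own (“full”), while each unique chunk lives in storage exactly once (“incremental”):

```python
import hashlib

store = {}  # pretend backup storage: chunk hash -> chunk bytes

def put(chunk: bytes) -> str:
    h = hashlib.sha256(chunk).hexdigest()
    store.setdefault(h, chunk)  # each unique chunk is uploaded at most once
    return h

snapshots = {
    "rev1": [put(b"photo-1"), put(b"photo-2")],
    "rev2": [put(b"photo-1"), put(b"photo-2-edited")],
}

del snapshots["rev1"]  # prune every other backup...
restored = [store[h] for h in snapshots["rev2"]]  # ...rev2 still restores fully
# (a real prune would also garbage-collect chunks no longer referenced by any snapshot)
```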

I would further clarify that while this is correct, there are two modes:

  1. Mode 1, where Duplicacy traverses your whole file tree and uploads all “chunks” that don’t already exist. If it’s a repeat backup with only minor changes, only the new and different chunks will be uploaded. However, it still takes time for Duplicacy to “go through the motions” of checking each individual chunk to see whether it already exists.
  2. Mode 2, where Duplicacy keeps an internal record of all files previously backed up and visits only the files that have changed. It then breaks those into chunks and, again, uploads only the chunks that don’t already exist. This is useful if, say, you have a movie file whose metadata you changed but whose data you didn’t. Any backup program, including Duplicacy, will see this file as “changed.” However, with Duplicacy, if the original movie file was already backed up, only the new/different “chunk(s)” containing the changed metadata will be uploaded, potentially saving lots of bandwidth.

In any case, Mode 2 is faster than Mode 1 because it doesn’t have to do nearly as much work, but in the end both result in the same space taken on the backup destination. The destination is the same; only the journey is expedited (in Mode 2). So, on the first pass, Duplicacy does #1 above, then on the following passes it does #2 (no, we’re not referring to potty training :wink: )
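If it helps, the difference between the two modes boils down to which files get visited at all. A rough sketch (the previous dict and files_to_chunk() are invented names, not Duplicacy’s actual implementation):

```python
import os

previous = {}  # path -> (size, mtime) as recorded by the last backup

def files_to_chunk(paths: list[str], full_scan: bool) -> list[str]:
    """Mode 1 (full_scan=True) visits every file; Mode 2 skips files whose
    size and modification time match the previous backup's record."""
    selected = []
    for path in paths:
        st = os.stat(path)
        signature = (st.st_size, st.st_mtime)
        if full_scan or previous.get(path) != signature:
            selected.append(path)  # these get split into chunks next; only
                                   # chunks missing from storage are uploaded
        previous[path] = signature
    return selected
```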

I have a RAID with 22TB of data on it, but with Duplicacy this gets down to about 8TB for the backups, and incrementals add only a tiny bit to that.

Morgan

Wow, that made my head hurt. But the Lock Free Deduplication side of things really lost me.

Thanks for the info, guys. Sounds like it’s doing exactly what I need it to.

Now I just need to work out how to get it to back up to my secondary local drive and also to the cloud, without adding separate backup jobs for each folder/storage location. I’ve read a few threads here and it seems this is something I can do through scheduling?

How do you set your backup to use Mode 2?

Duplicacy by default runs the initial backup in Mode 1 and any subsequent backup in Mode 2. The -hash option would force Mode 1 for any backup.
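In other words (an illustrative summary only; backup_mode() is an invented helper, not Duplicacy source):

```python
def backup_mode(is_initial_backup: bool, hash_option: bool) -> int:
    """Mode 1: full file-tree scan; Mode 2: visit only changed files."""
    if is_initial_backup or hash_option:  # -hash forces the full rescan
        return 1
    return 2
```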