Resuming interrupted backup takes hours (incomplete snapshots should be saved)

It’s one of the v2.1.0 betas. Not sure if it’s the latest one though. Does it matter which beta?


Sorry, I forgot the partial snapshot file is saved for incomplete initial backups only. After the initial backup is done, it always uses the last complete backup as the base to determine which files are new. So if you add a lot of new files after the initial backup, you won’t be able to fast-resume after the backup is interrupted.
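
To illustrate the difference, here is a small Go sketch with made-up function names; it is only a picture of the behaviour described above, not Duplicacy’s actual code:

    package main

    import "fmt"

    // Made-up stand-ins for Duplicacy's internals; the real code differs.
    func loadIncompleteSnapshot() {
        fmt.Println("fast resume: skip the chunks recorded in .duplicacy/incomplete")
    }

    func listAllChunks() {
        fmt.Println("slow resume: re-chunk every file and check each chunk against the storage")
    }

    // resumeBase sketches the behaviour described above: the incomplete
    // snapshot file is only consulted when no complete backup exists yet.
    func resumeBase(lastCompleteRevision int) {
        if lastCompleteRevision == 0 {
            loadIncompleteSnapshot() // interrupted initial backup
        } else {
            listAllChunks() // interrupted subsequent backup
        }
    }

    func main() {
        resumeBase(0) // no complete backup yet: fast resume
        resumeBase(3) // revision 3 already exists: chunks must be rediscovered
    }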


Oh. Well, that is bad news. Why are incomplete snapshots not saved for subsequent backups?

I have about 1 TB of data to back up and, for various reasons, that won’t go through without interruptions. Without incomplete snapshots, I lose several hours every time during which duplicacy just skips chunks. And the further the backup progresses, the longer that waiting time becomes.

May I suggest that this limitation on incomplete snapshots be removed in v2.1.0? Or, if there are serious downsides to that, at least make it available as an option, -save-incomplete or something?


This is really becoming a major issue for me.

What if I delete the first two snapshots as a workaround? Will duplicacy then save the ongoing incomplete snapshot again? If so, would duplicacy prune -r 1-2 -storage <storage url> be the correct way of doing this? Or should I just delete the two files in the snapshot folder? (I obviously don’t want the chunks to be deleted, only the snapshots.)


You should manually delete the two files 1 and 2 under snapshots/snapshot_id in the storage.


OK, I have done that, but assuming that duplicacy checks for the existence of previous snapshots at the beginning rather than at the end, this will only become effective the next time I start the backup command, right?

Just to give you an idea of how serious a problem this behaviour of duplicacy is: it currently takes around 24 hours for the backup to resume actual uploads, i.e. it is skipping chunks for a full day. I really don’t see the point of limiting incomplete snapshots to initial backups only.


Is this problem fixed in 2.1.1? Did I get it right that resuming is only possible for initial backups?

No, resuming is always possible. But with initial backups duplicacy knows which chunks it can skip, so the actual upload resumes much faster than when subsequent backups are resumed (because duplicacy has to “rediscover” every time which chunks have already been uploaded).

For smaller backups this doesn’t really matter, but with larger ones, like mine, it can take a full day or longer until the actual upload resumes (i.e. duplicacy spends 24 hours just skipping chunks).


Oh, wait. I now see you were probably referring to this:

And now I’m not sure anymore. My understanding was that while Windows may produce additional reasons for the incomplete snapshot file to go missing, duplicacy by design only attempts to create it for the first snapshot. Is this correct, @gchen? If so, could this behaviour be changed?


Does this carry over to the Web GUI version as well? Having an API within the CLI for the GUI to communicate with would fix the problem, right? The GUI could send a request for the CLI to stop gracefully.
Are signals not supported for killing processes on Windows? I mean a graceful terminate that allows the process to close on its own. If I press the stop button in the Web GUI in version 0.2.7 and then restart, it ends up scanning what seems like the drive from the beginning and takes hours on the initial backup before it starts uploading again. Someone with terabytes to back up would have to leave their computer on for days to avoid a long rescan before uploading resumes.
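
For reference, a Go program can catch a console interrupt (Ctrl+C / Ctrl+Break) on Windows too, so a graceful stop seems technically possible. A minimal sketch of such a handler, not Duplicacy’s actual shutdown path, and the GUI would still need some way to deliver the event (e.g. a console Ctrl event or another IPC channel):

    package main

    import (
        "fmt"
        "os"
        "os/signal"
    )

    func main() {
        // os.Interrupt corresponds to Ctrl+C, on Windows as well.
        stop := make(chan os.Signal, 1)
        signal.Notify(stop, os.Interrupt)

        go func() {
            <-stop
            // Hypothetical checkpoint step before exiting.
            fmt.Println("saving incomplete snapshot before exit...")
            os.Exit(1)
        }()

        select {} // stand-in for the long-running backup work
    }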


I’ve recently added a bunch of data to my existing backup and it’s now taking about a month to catch up. In the meantime, every reboot or power failure results in it spending hours scanning files to figure out where it left off.

Is there any way the “incomplete” behavior could be extended beyond the initial backup? Say, write an incomplete file every hour and delete it after a successful backup.
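
A minimal Go sketch of what I mean, with an entirely made-up checkpoint format (nothing here is Duplicacy’s actual code):

    package main

    import (
        "encoding/json"
        "os"
        "time"
    )

    // checkpoint is a made-up record of resume state; Duplicacy's real
    // incomplete-snapshot format is not specified here.
    type checkpoint struct {
        UploadedChunks []string `json:"uploaded_chunks"`
    }

    // saveEvery writes the checkpoint on a timer until done is closed,
    // then removes the file: write an incomplete file periodically,
    // delete it once the backup succeeds.
    func saveEvery(interval time.Duration, path string, state func() checkpoint, done <-chan struct{}) {
        ticker := time.NewTicker(interval)
        defer ticker.Stop()
        for {
            select {
            case <-ticker.C:
                data, _ := json.Marshal(state())
                os.WriteFile(path, data, 0o644)
            case <-done:
                os.Remove(path) // backup finished; the checkpoint is obsolete
                return
            }
        }
    }

    func main() {
        done := make(chan struct{})
        state := func() checkpoint { return checkpoint{UploadedChunks: []string{"abc123"}} }
        go saveEvery(time.Hour, "incomplete.json", state, done)
        // ... backup work would run here ...
        close(done)                        // backup succeeded
        time.Sleep(100 * time.Millisecond) // let the goroutine clean up (sketch only)
    }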


Adding my experience to this thread, which I assume stems from the same issue in how Duplicacy handles incomplete backups…

Same issue here: it still hasn’t been fixed in the latest CLI version (2.4).

I did a first test backup excluding large folders with filters. Once it completed, I removed the exclusions (hundreds of GB), but the subsequent backup is taking hours just to skip chunks that Duplicacy should already know have been uploaded. Since Duplicacy also does not seem to have any retry logic, the backup process is back to square one each time there is a network disconnection or I put my computer to sleep.

There is really no reason not to write the incomplete file at every opportunity (e.g. after each chunk), so that the backup process can be resumed efficiently. This is low-hanging fruit; I hope it gets implemented ASAP (but I am not very optimistic, considering that the issue was raised in 2018 ☹️).

This isn’t accurate. It doesn’t retry for any and all possible errors, but it does retry with exponential backoff for errors that can safely be considered non-fatal.


It definitely doesn’t retry on non-fatal Dropbox errors (e.g. network related), from what I can tell.

I don’t think network connectivity errors currently fall into the bucket of typical rate-limiting errors that duplicacy would retry, so I wouldn’t be surprised.

Depending on the error, it might be possible to add retry behavior, though it may not be as simple as retrying the last operation: if the connection is being lost, it would probably also need to reinitialize the connection to the storage.
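
A generic Go sketch of what that could look like; this is just the shape of the idea, not Duplicacy’s actual retry path:

    package main

    import (
        "errors"
        "fmt"
        "time"
    )

    // withRetry retries op with exponential backoff, reinitializing the
    // storage connection before each retry.
    func withRetry(maxAttempts int, reconnect func() error, op func() error) error {
        backoff := time.Second
        var err error
        for attempt := 1; attempt <= maxAttempts; attempt++ {
            if attempt > 1 {
                time.Sleep(backoff)
                backoff *= 2 // exponential backoff between attempts
                if err = reconnect(); err != nil {
                    continue // storage still unreachable; wait and retry
                }
            }
            if err = op(); err == nil {
                return nil
            }
        }
        return fmt.Errorf("giving up after %d attempts: %w", maxAttempts, err)
    }

    func main() {
        attempts := 0
        err := withRetry(5,
            func() error { return nil }, // reinitialize the storage connection
            func() error { // an upload that fails transiently twice
                attempts++
                if attempts < 3 {
                    return errors.New("network error")
                }
                return nil
            })
        fmt.Printf("err=%v after %d attempts\n", err, attempts)
    }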

It seems like Duplicacy is re-scanning uploaded files when I resume an interrupted backup.

I interrupted my first backup using Ctrl-C on my Mac. When I resumed the backup, duplicacy loaded the incomplete snapshot but said 0 files were skipped and started to pack, from the very beginning, all the files that had already been uploaded.

This re-packing takes a very long time. According to what you wrote, the expected behavior when resuming an incomplete backup is to skip files that have already been uploaded without re-scanning them.

    ^CIncomplete snapshot saved to /Volumes/Ext HD/.duplicacy/incomplete
    indoracer@Diamond Ext HD % duplicacy backup -threads 12 -vss
    Storage set to gcd://Backups/Duplicacy
    No previous backup found
    VSS not supported for non-local repository path: /Volumes/Ext HD
    Indexing /Volumes/Ext HD
    Parsing filter file /Volumes/Ext HD/.duplicacy/filters
    Loaded 14 include/exclude pattern(s)
    Incomplete snapshot loaded from /Volumes/Ext HD/.duplicacy/incomplete
    Listing all chunks
    Skipped 0 files from previous incomplete backup
    Packed xxxxx
    Packed xxxxx
    ...

This can happen if the first file is big and has not been completely uploaded.

The incomplete file /Volumes/Ext HD/.duplicacy/incomplete is just a JSON file. You can open it to see its content.
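
For example, to pretty-print it without assuming anything about its schema (the path is taken from the log above):

    package main

    import (
        "encoding/json"
        "fmt"
        "os"
    )

    func main() {
        // Pretty-print the incomplete snapshot file to inspect it.
        data, err := os.ReadFile("/Volumes/Ext HD/.duplicacy/incomplete")
        if err != nil {
            fmt.Fprintln(os.Stderr, err)
            os.Exit(1)
        }
        var v interface{}
        if err := json.Unmarshal(data, &v); err != nil {
            fmt.Fprintln(os.Stderr, err)
            os.Exit(1)
        }
        pretty, _ := json.MarshalIndent(v, "", "  ")
        fmt.Println(string(pretty))
    }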

I’m pretty sure it had uploaded the first file, as I interrupted the backup after 24 hours; it was well beyond the first file at that time. When I resumed the backup, it went through and gave a pack message for each file that had already been uploaded, but it didn’t re-upload them. It must have taken at least an hour to reach the point where it had left off, and then it continued to upload the remaining files. I also checked the online storage before I resumed the upload and saw that it had uploaded gigs of data.

Next time you interrupt the backup, resume it with -d and there should be more information:

    duplicacy -d backup -threads 12 -vss