Resuming interrupted backup takes hours (incomplete snapshots should be saved)

It’s one of the v2.1.0 betas. Not sure if it’s the latest one though. Does it matter which beta?


Sorry, I forgot the partial snapshot file is saved for incomplete initial backups only. After the initial backup is done, it always uses the last complete backup as the base to determine which files are new. So if you add a lot of new files after the initial backup, you won’t be able to fast-resume after the backup is interrupted.
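
To illustrate the difference, here is a small Go sketch with made-up function names; it is only a picture of the behaviour described above, not Duplicacy’s actual code:

    package main

    import "fmt"

    // Made-up stand-ins for Duplicacy's internals; the real code differs.
    func loadIncompleteSnapshot() {
        fmt.Println("fast resume: skip the chunks recorded in .duplicacy/incomplete")
    }

    func listAllChunks() {
        fmt.Println("slow resume: re-chunk every file and check each chunk against the storage")
    }

    // resumeBase sketches the behaviour described above: the incomplete
    // snapshot file is only consulted when no complete backup exists yet.
    func resumeBase(lastCompleteRevision int) {
        if lastCompleteRevision == 0 {
            loadIncompleteSnapshot() // interrupted initial backup
        } else {
            listAllChunks() // interrupted subsequent backup
        }
    }

    func main() {
        resumeBase(0) // no complete backup yet: fast resume
        resumeBase(3) // revision 3 already exists: chunks must be rediscovered
    }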


Oh. Well, that is bad news. Why are incomplete snapshots not saved for subsequent backups?

I have about 1 TB of data to back up and, for various reasons, that won’t go through without interruptions. Without incomplete snapshots, I lose several hours every time during which duplicacy just skips chunks. And the further the backup progresses, the longer that waiting time becomes.

May I suggest that this limitation on incomplete snapshots be removed in v2.1.0? Or, if there are serious downsides to that, at least make it available as an option, -save-incomplete or something?


This is really becoming a major issue for me.

What if I delete the first two snapshots as a workaround? Will duplicacy then save the ongoing incomplete snapshot again? If so, would duplicacy prune -r 1-2 -storage <storage url> be the correct way of doing this? Or should I just delete the two files in the snapshot folder? (I obviously don’t want the chunks to be deleted, only the snapshots.)


You should manually delete the two files 1 and 2 under snapshots/snapshot_id in the storage.


OK, I have done that, but assuming that duplicacy checks for the existence of previous snapshots at the beginning rather than at the end, this will only become effective the next time I start the backup command, right?

Just to give you an idea of how serious a problem this behaviour of duplicacy is: it currently takes around 24 hours for the backup to resume actual uploads, i.e. it is skipping chunks for a full day. I really don’t see the point of limiting incomplete snapshots to initial backups only.


Is this problem fixed in 2.1.1? Did I get it right that resuming is only possible for initial backups?

No, resuming is always possible. But with initial backups duplicacy knows which chunks it can skip, so the actual upload resumes much faster than when subsequent backups are resumed (because duplicacy has to “rediscover” every time which chunks have already been uploaded).

For smaller backups this doesn’t really matter, but with larger ones, like mine, it can take a full day or longer until the actual upload resumes (i.e. duplicacy spends 24 hours just skipping chunks).


Oh, wait. I now see you were probably referring to this:

And now I’m not sure anymore. My understanding was that while Windows may produce additional reasons for the incomplete snapshot file to go missing, duplicacy by design only attempts to create it for the first snapshot. Is this correct, @gchen? If so, could this behaviour be changed?


Does this carry over to the Web GUI version as well? Having an API within the CLI for the GUI to communicate with would fix the problem, right? The GUI could send a request for the CLI to stop gracefully.
Are signals not supported for killing processes on Windows? I mean a graceful terminate that allows the process to close on its own. If I press the stop button in the Web GUI in version 0.2.7 and then restart, it ends up scanning what seems like the drive from the beginning and takes hours on the initial backup before it starts uploading again. Someone with terabytes to back up would have to leave their computer on for days to avoid a long rescan before uploading resumes.
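
For reference, a Go program can catch a console interrupt (Ctrl+C / Ctrl+Break) on Windows too, so a graceful stop seems technically possible. A minimal sketch of such a handler, not Duplicacy’s actual shutdown path, and the GUI would still need some way to deliver the event (e.g. a console Ctrl event or another IPC channel):

    package main

    import (
        "fmt"
        "os"
        "os/signal"
    )

    func main() {
        // os.Interrupt corresponds to Ctrl+C, on Windows as well.
        stop := make(chan os.Signal, 1)
        signal.Notify(stop, os.Interrupt)

        go func() {
            <-stop
            // Hypothetical checkpoint step before exiting.
            fmt.Println("saving incomplete snapshot before exit...")
            os.Exit(1)
        }()

        select {} // stand-in for the long-running backup work
    }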


I’ve recently added a bunch of data to my existing backup and it’s now taking about a month to catch up. In the meantime, every reboot or power failure results in it spending hours scanning files to figure out where it left off.

Is there any way the “incomplete” behavior could be extended beyond the initial backup? Say, write an incomplete file every hour and delete it after a successful backup.
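
A minimal Go sketch of what I mean, with an entirely made-up checkpoint format (nothing here is Duplicacy’s actual code):

    package main

    import (
        "encoding/json"
        "os"
        "time"
    )

    // checkpoint is a made-up record of resume state; Duplicacy's real
    // incomplete-snapshot format is not specified here.
    type checkpoint struct {
        UploadedChunks []string `json:"uploaded_chunks"`
    }

    // saveEvery writes the checkpoint on a timer until done is closed,
    // then removes the file: write an incomplete file periodically,
    // delete it once the backup succeeds.
    func saveEvery(interval time.Duration, path string, state func() checkpoint, done <-chan struct{}) {
        ticker := time.NewTicker(interval)
        defer ticker.Stop()
        for {
            select {
            case <-ticker.C:
                data, _ := json.Marshal(state())
                os.WriteFile(path, data, 0o644)
            case <-done:
                os.Remove(path) // backup finished; the checkpoint is obsolete
                return
            }
        }
    }

    func main() {
        done := make(chan struct{})
        state := func() checkpoint { return checkpoint{UploadedChunks: []string{"abc123"}} }
        go saveEvery(time.Hour, "incomplete.json", state, done)
        // ... backup work would run here ...
        close(done)                        // backup succeeded
        time.Sleep(100 * time.Millisecond) // let the goroutine clean up (sketch only)
    }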


Adding my experience to this thread, which I assume stems from the same issue in how Duplicacy handles incomplete backups…

Same issue here: it still hasn’t been fixed in the latest CLI version (2.4).

I did a first test backup excluding large folders with filters. Once it completed, I removed the exclusions (hundreds of GB), but the subsequent backup is taking hours just to skip chunks that Duplicacy should already know have been uploaded. Since Duplicacy also does not seem to have any retry logic, the backup process is back to square one each time there is a network disconnection or I put my computer to sleep.

There is really no reason not to write the incomplete file at every opportunity (e.g. after each chunk), so that the backup process can be resumed efficiently. This is low-hanging fruit; I hope it gets implemented ASAP (but I am not very optimistic, considering that the issue was raised in 2018 ☹️).

This isn’t accurate. It doesn’t retry for any and all possible errors, but it does retry with exponential backoff for errors that can safely be considered non-fatal.


It definitely doesn’t retry on non-fatal Dropbox errors (e.g. network related), from what I can tell.

I don’t think network connectivity errors currently fall into the bucket of typical rate-limiting errors that duplicacy would retry, so I wouldn’t be surprised.

Depending on the error, it might be possible to add retry behavior, though it may not be as simple as retrying the last operation: if the connection is being lost, it would probably also need to reinitialize the connection to the storage.
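
A generic Go sketch of what that could look like; this is just the shape of the idea, not Duplicacy’s actual retry path:

    package main

    import (
        "errors"
        "fmt"
        "time"
    )

    // withRetry retries op with exponential backoff, reinitializing the
    // storage connection before each retry.
    func withRetry(maxAttempts int, reconnect func() error, op func() error) error {
        backoff := time.Second
        var err error
        for attempt := 1; attempt <= maxAttempts; attempt++ {
            if attempt > 1 {
                time.Sleep(backoff)
                backoff *= 2 // exponential backoff between attempts
                if err = reconnect(); err != nil {
                    continue // storage still unreachable; wait and retry
                }
            }
            if err = op(); err == nil {
                return nil
            }
        }
        return fmt.Errorf("giving up after %d attempts: %w", maxAttempts, err)
    }

    func main() {
        attempts := 0
        err := withRetry(5,
            func() error { return nil }, // reinitialize the storage connection
            func() error { // an upload that fails transiently twice
                attempts++
                if attempts < 3 {
                    return errors.New("network error")
                }
                return nil
            })
        fmt.Printf("err=%v after %d attempts\n", err, attempts)
    }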

It seems like Duplicacy is re-scanning uploaded files when I resume an interrupted backup.

I interrupted my first backup using Ctrl-C on my Mac. When I resumed the backup, duplicacy loaded the incomplete snapshot but said 0 files were skipped and started to pack, from the very beginning, all the files that had already been uploaded.

This re-packing takes a very long time. According to what you wrote, the expected behavior when resuming an incomplete backup is to skip files that have already been uploaded without re-scanning them.

    ^CIncomplete snapshot saved to /Volumes/Ext HD/.duplicacy/incomplete
    indoracer@Diamond Ext HD % duplicacy backup -threads 12 -vss
    Storage set to gcd://Backups/Duplicacy
    No previous backup found
    VSS not supported for non-local repository path: /Volumes/Ext HD
    Indexing /Volumes/Ext HD
    Parsing filter file /Volumes/Ext HD/.duplicacy/filters
    Loaded 14 include/exclude pattern(s)
    Incomplete snapshot loaded from /Volumes/Ext HD/.duplicacy/incomplete
    Listing all chunks
    Skipped 0 files from previous incomplete backup
    Packed xxxxx
    Packed xxxxx
    ...

This can happen if the first file is big and has not been completely uploaded.

The incomplete file /Volumes/Ext HD/.duplicacy/incomplete is just a JSON file. You can open it to see its content.
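
For example, to pretty-print it without assuming anything about its schema (the path is taken from the log above):

    package main

    import (
        "encoding/json"
        "fmt"
        "os"
    )

    func main() {
        // Pretty-print the incomplete snapshot file to inspect it.
        data, err := os.ReadFile("/Volumes/Ext HD/.duplicacy/incomplete")
        if err != nil {
            fmt.Fprintln(os.Stderr, err)
            os.Exit(1)
        }
        var v interface{}
        if err := json.Unmarshal(data, &v); err != nil {
            fmt.Fprintln(os.Stderr, err)
            os.Exit(1)
        }
        pretty, _ := json.MarshalIndent(v, "", "  ")
        fmt.Println(string(pretty))
    }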

I’m pretty sure it had uploaded the first file, as I interrupted the backup after 24 hours; it was well beyond the first file at that time. When I resumed the backup, it went through and gave a pack message for each file that had already been uploaded, but it didn’t re-upload them. It must have taken at least an hour to reach the point where it had left off, and then it continued to upload the remaining files. I also checked the online storage before I resumed the upload and saw that it had uploaded gigs of data.

Next time you interrupt the backup, resume it with -d and there should be more information:

    duplicacy -d backup -threads 12 -vss