Resuming interrupted backup takes hours (incomplete snapshots should be saved)

You should manually delete the two files, 1 and 2, under snapshots/snapshot_id in the storage.

1 Like

OK, I have done that, but assuming that duplicacy checks for the existence of previous snapshots at the beginning rather than at the end, this will only become effective the next time I start the backup command, right?

Just to give you an idea how serious a problem this behaviour of duplicacy is: It currently takes around 24 hours for the backup to resume actual uploads, i.e. it is skipping chunks for a full day. I really don't see what the point is of limiting incomplete backups to initial backups only.

2 Likes

Is this problem fixed in 2.1.1? Did I get it right that resuming is only possible for initial backups?

No, resuming is always possible. But with initial backups duplicacy knows which chunks it can skip, so the actual upload resumes much faster than when subsequent backups are resumed (because duplicacy has to "rediscover" every time which chunks have already been uploaded).

For smaller backups, this doesn't really matter, but with larger ones, like I had, it can take a full day or longer until the actual upload resumes (i.e. duplicacy spends 24 hours just skipping chunks).
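To make the cost concrete, here is roughly what that rediscovery pass amounts to. This is a sketch of the idea only, not duplicacy's actual code; the names (listRemoteChunks, splitIntoChunks, the plain SHA-256 chunk ID) are all made up for illustration:

// Sketch: resuming without an incomplete snapshot means every file is
// re-read, re-chunked and re-hashed just to decide a chunk can be skipped.
package main

import (
    "crypto/sha256"
    "encoding/hex"
    "fmt"
)

// listRemoteChunks stands in for the "Listing all chunks" step: the IDs of
// all chunks already present in the storage.
func listRemoteChunks() map[string]bool {
    return map[string]bool{}
}

// splitIntoChunks stands in for the variable-size chunker.
func splitIntoChunks(path string) [][]byte {
    return nil
}

func upload(id string, data []byte) {
    fmt.Println("uploading", id)
}

func resumeBackup(files []string) {
    existing := listRemoteChunks()
    for _, f := range files {
        // This loop is the expensive part: chunks uploaded in the interrupted
        // run still have to be re-read and re-hashed before they can be skipped.
        for _, chunk := range splitIntoChunks(f) {
            sum := sha256.Sum256(chunk) // illustrative; the real chunk ID is derived differently
            id := hex.EncodeToString(sum[:])
            if existing[id] {
                continue // "skipping chunks", for hours on a large repository
            }
            upload(id, chunk)
        }
    }
}

func main() {
    resumeBackup([]string{"/path/to/file"})
}

The skip itself is cheap; what costs a full day is having to re-read and re-hash all of the local data just to find out that most of it can be skipped.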

1 Like

Oh, wait. I now see you were probably referring to this:

And now I'm not sure anymore. My understanding was that while Windows may produce additional reasons for the incomplete snapshot file to go missing, duplicacy by design only attempts to create it for the first snapshot. Is this correct @gchen? If so, could this behaviour be changed?

2 Likes

Does this carry over to the Web GUI version as well? Having an API within the CLI for the GUI to communicate with would fix the problem, right? The GUI could send a request for the CLI to stop gracefully.
Are signals not supported for killing processes on Windows?
Something like a graceful terminate that allows the process to close on its own. If I press the stop button in the Web GUI in version 0.2.7 and restart, it ends up scanning what seems like the entire drive from the beginning and takes hours on the initial backup before it starts uploading again. For someone with terabytes to back up, it would require them to leave their computer on for days to avoid a long rescan before uploading resumes.
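For what it's worth, on the CLI side a graceful stop is essentially just an interrupt handler. A minimal sketch (not duplicacy's actual code; uploadChunk and saveIncompleteSnapshot are placeholders) of what a stop request would need to trigger:

// Minimal sketch of a graceful stop: finish the current chunk, save the
// incomplete snapshot, then exit.
package main

import (
    "fmt"
    "os"
    "os/signal"
)

func uploadChunk(n int) {
    fmt.Println("uploaded chunk", n)
}

func saveIncompleteSnapshot() {
    fmt.Println("incomplete snapshot saved")
}

func main() {
    stop := make(chan os.Signal, 1)
    // os.Interrupt covers Ctrl+C in a console on both Unix and Windows; a
    // stop request from a GUI could feed the same channel.
    signal.Notify(stop, os.Interrupt)

    for chunk := 0; chunk < 1000000; chunk++ {
        select {
        case <-stop:
            saveIncompleteSnapshot() // so the next run can resume quickly
            return
        default:
            uploadChunk(chunk)
        }
    }
}

On Windows there is no general way to send SIGINT to another process, so a GUI-to-CLI stop would likely need a console Ctrl+C event or an explicit request over a local API rather than a plain kill.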

1 Like

I've recently added a bunch of data to my existing backup and it's now taking about a month to catch up. In the meantime, any reboot or power failure results in it spending hours scanning files to see where it is.

Is there any way the "incomplete" behavior could be extended outside the initial-backup period? Say, write an incomplete file every hour, and delete the incomplete file with a successful backup.
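In code terms the idea is just this; a sketch of the proposal only, not duplicacy code, and the helper names are made up:

// Sketch of the proposal: persist the incomplete snapshot periodically
// (and/or after every chunk), and remove it once the backup completes.
package main

import (
    "fmt"
    "time"
)

func uploadChunk(c string) {
    fmt.Println("uploaded", c)
}

func saveIncompleteSnapshot() {
    fmt.Println("incomplete snapshot written")
}

func deleteIncompleteSnapshot() {
    fmt.Println("incomplete snapshot removed")
}

func runBackup(chunks []string) {
    ticker := time.NewTicker(time.Hour) // "write an incomplete file every hour"
    defer ticker.Stop()

    for _, c := range chunks {
        uploadChunk(c)
        select {
        case <-ticker.C:
            saveIncompleteSnapshot() // cheap compared to a day of re-scanning
        default:
        }
    }

    deleteIncompleteSnapshot() // a successful backup cleans it up
}

func main() {
    runBackup([]string{"c1", "c2", "c3"})
}

Either trigger (a timer or a per-chunk save) would keep the cost of an interruption bounded instead of proportional to the size of the whole repository.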

5 Likes

Adding my experiences to this thread, which I assume stem from the same issues in Duplicacy's handling of incomplete backups…

Same issue here; it still hasn't been fixed in the latest CLI version (2.4).

I did a first test backup excluding large folders with filters. Once that completed, I removed the exclusions (hundreds of GB), but running a subsequent backup takes hours just to skip chunks that Duplicacy should already know have been uploaded. Since Duplicacy also does not seem to have any retry logic, the backup process is back to square one each time there is a network disconnection or I put my computer to sleep.

There is really no reason not to write the incomplete file at every opportunity (e.g. after each chunk), so the backup process can be resumed efficiently. This is low-hanging fruit; I hope it gets implemented ASAP (but I am not very optimistic, considering that the issue was raised in 2018 :frowning:).

This isn't accurate. It doesn't retry for any and all possible errors, but it does retry with exponential backoff for errors that can safely be considered non-fatal.

1 Like

It definitely doesn't retry on non-fatal Dropbox errors (e.g. network-related), from what I can tell.

I don't think network connectivity errors currently fall into the bucket of typical rate-limiting-type errors that duplicacy would retry, so I wouldn't be surprised.

Depending on the error, it might be possible to add retry behavior, though it may not be as simple as retrying the last operation: if the connection is being lost, it would probably also need to try to reinitialize the connection to the storage.
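Roughly, I would picture each storage operation going through a wrapper like this (a sketch only, not duplicacy's actual retry code; the error classification and the reconnect hook are placeholders):

// Sketch of retrying a storage operation with exponential backoff,
// re-initializing the connection when it looks like it was dropped.
package main

import (
    "errors"
    "fmt"
    "time"
)

var errNetwork = errors.New("network error") // stand-in for a dropped connection

func withRetry(op func() error, reconnect func() error, maxRetries int) error {
    delay := time.Second
    for attempt := 0; ; attempt++ {
        err := op()
        if err == nil {
            return nil
        }
        if attempt >= maxRetries {
            return err
        }
        if errors.Is(err, errNetwork) {
            // Retrying the same call is pointless if the connection is gone.
            if rerr := reconnect(); rerr != nil {
                return rerr
            }
        }
        time.Sleep(delay)
        delay *= 2 // exponential backoff
    }
}

func main() {
    err := withRetry(
        func() error { fmt.Println("uploading chunk"); return nil },
        func() error { fmt.Println("re-initializing storage connection"); return nil },
        5,
    )
    fmt.Println("result:", err)
}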

It seems like Duplicacy is re-scanning uploaded files when I resume an interrupted backup.

I interrupted my first backup using Ctrl-C on my Mac. When I resumed the backup, duplicacy loads the incomplete snapshot but says 0 files skipped and starts to pack all the files that were already uploaded, from the very first one.

This process of re-packing takes a very long time. According to what you wrote, it seems like the expected behavior for resuming an incomplete backup is to skip files that have already been uploaded without re-scanning them.

^CIncomplete snapshot saved to /Volumes/Ext HD/.duplicacy/incomplete
indoracer@Diamond Ext HD % duplicacy backup -threads 12 -vss
Storage set to gcd://Backups/Duplicacy
No previous backup found
VSS not supported for non-local repository path: /Volumes/Ext HD
Indexing /Volumes/Ext HD
Parsing filter file /Volumes/Ext HD/.duplicacy/filters
Loaded 14 include/exclude pattern(s)
Incomplete snapshot loaded from /Volumes/Ext HD/.duplicacy/incomplete
Listing all chunks
Skipped 0 files from previous incomplete backup
Packed xxxxx
Packed xxxxx
...

This can happen if the first file is a big file and has not been completely uploaded.

The incomplete file /Volumes/Ext HD/.duplicacy/incomplete is just a json file. You can open it to see its content.
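If you just want a quick look at it, something like this pretty-prints it without assuming anything about its layout (a throwaway example, not part of duplicacy; the path is the one from your log):

// Pretty-print the incomplete snapshot file without assuming its schema.
package main

import (
    "encoding/json"
    "fmt"
    "os"
)

func main() {
    data, err := os.ReadFile("/Volumes/Ext HD/.duplicacy/incomplete")
    if err != nil {
        fmt.Fprintln(os.Stderr, err)
        os.Exit(1)
    }
    var v interface{}
    if err := json.Unmarshal(data, &v); err != nil {
        fmt.Fprintln(os.Stderr, err)
        os.Exit(1)
    }
    pretty, _ := json.MarshalIndent(v, "", "  ")
    fmt.Println(string(pretty))
}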

I'm pretty sure it had uploaded the first file, as I interrupted the backup after 24 hours. It was well beyond the first file at that time. When I resumed the backup, it went through and gave a pack message for each file that had already been uploaded, but it didn't re-upload them. It must have taken at least an hour for it to reach the point where it had left off, and then it continued to upload the remaining files. I also checked the online storage before I resumed the upload and saw that it had uploaded gigs of data.

Next time you interrupt the backup, resume it with -d and there should be more information:

 duplicacy -d backup -threads 12 -vss
1 Like