Resuming interrupted backup takes hours (incomplete snapshots should be saved)

gchen · 24 January 2018 21:03

So you’re saying that those Skipped chunk 1 size 8427437… messages are only produced when duplicacy has the incomplete file to check which files to skip? In other words: when duplicacy attempts to upload a file that turns out to already exist (and hence skips it) then this is not reported as Skipped chunk…? Fine.

No, it is the opposite: if there is an incomplete file Duplicacy will know which files to skip so it won’t produce many Skipped chunk messages. If that file is absent, it doesn’t know which files had been uploaded in the previous backup (nor does it know there was a previous backup), so it will start from the first file and attempt to upload every chunk. But a lot of chunks had indeed been uploaded and that was why you saw so many Skipped chunk messages.
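
To make the difference concrete, here is a simplified Python sketch, not Duplicacy's actual code. It assumes each file maps to a fixed list of chunks and that the storage is just a set of chunk hashes; `chunk_id`, `resume_backup`, and the `incomplete` set are hypothetical names used only for illustration.

```python
import hashlib


def chunk_id(data: bytes) -> str:
    """Name a chunk by its content hash (assumption: SHA-256)."""
    return hashlib.sha256(data).hexdigest()


def resume_backup(files, storage, incomplete=None):
    """Simulate resuming an interrupted backup.

    files:      dict mapping path -> list of chunk payloads (bytes)
    storage:    set of chunk ids already present in the storage
    incomplete: set of paths recorded as uploaded by the interrupted run,
                or None if no incomplete file was saved
    """
    stats = {"files_skipped": 0, "chunks_skipped": 0, "chunks_uploaded": 0}
    already_done = incomplete or set()
    for path in sorted(files):
        if path in already_done:
            # Known from the incomplete file: the file is never chunked again.
            stats["files_skipped"] += 1
            continue
        for data in files[path]:
            cid = chunk_id(data)
            if cid in storage:
                # Chunk was uploaded earlier: this is the "Skipped chunk" case.
                stats["chunks_skipped"] += 1
            else:
                storage.add(cid)
                stats["chunks_uploaded"] += 1
    return stats
```

Without the `incomplete` set, every file is read and hashed again and each existing chunk is reported as skipped. With it, whole files are passed over before any chunking happens, which is why far fewer Skipped chunk messages appear.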

Christoph · 24 January 2018 21:46

Okay. Fine. What about the issue that duplicacy did not log anything until the PowerShell was closed?

gchen · 25 January 2018 03:57

Duplicacy only writes to stdout. It is normal for Windows to create a buffer to take the output from a running process and only when the buffer is full will the content of the buffer be flushed to the file on disk.
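
The general buffering effect can be reproduced in a few lines of Python. This is only an illustration of output sitting in a buffer until it is flushed, not a diagnosis of how Windows or PowerShell handles Duplicacy's output; `sizes_on_disk` is a hypothetical helper.

```python
import os
import tempfile


def sizes_on_disk(lines, flush_each_line):
    """Write lines to a temporary log file and return the file size
    visible on disk after each write."""
    fd, path = tempfile.mkstemp(suffix=".log")
    os.close(fd)
    sizes = []
    try:
        # An 8 KiB buffer: short lines stay in memory until a flush or close.
        with open(path, "w", buffering=8192) as log:
            for line in lines:
                log.write(line + "\n")
                if flush_each_line:
                    log.flush()
                sizes.append(os.path.getsize(path))
    finally:
        os.remove(path)
    return sizes
```

With `flush_each_line=False` the file stays empty while the writer is running, and the content only reaches the disk when the file is closed. With `flush_each_line=True` the file grows after every line.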

Christoph · 25 January 2018 19:20

Ah, thanks for explaining. But I’m afraid this doesn’t explain much of what I’m seeing because the first six lines appeared within a few seconds of starting the backup. But the following 236000 (sic!) appeared only when I stopped the task. So if there is a buffer, it is max 6 lines long. Which leaves the question why the other 236000 lines were kept in the buffer (or wherever they were kept)?

I understand that this is probably not a bug in duplicacy, but I suspect that there may be some room for improvement in how it interacts with windows.

Or maybe there is something wrong with my script? Here it is:

# Basic setup (edit here!)
$backupID = "PC_D_christoph"
$repositorypath = "D:\christoph"
# Construct logfile name for the day
$logfiledate = get-date -format yyyy-MM-dd
$logfilename = "backup_$backupID-$logfiledate.log" 
# Go to repository 
Set-Location -Path $repositorypath >> "C:\duplicacy\logs\$logfilename"
$(get-date -Format "yyyy-MM-dd HH:mm:ss") + " *** Starting new backup of " + $(convert-path $(get-location).Path) + " ***" >> "C:\duplicacy\logs\$logfilename"
# Start backup (edit here!)
& "c:\Program Files (x86)\Duplicacy\duplicacy.exe" backup -vss -stats -limit-rate 4000 >> "C:\duplicacy\logs\$logfilename"
$(get-date -Format "yyyy-MM-dd HH:mm:ss") + " *** Backup of " + $(convert-path $(get-location).Path) + " stopped ***" >> "C:\duplicacy\logs\$logfilename"

gchen · 26 January 2018 02:26

Your script looks ok to me although I must admit that I have little experience with PowerShell.

Christoph · 1 July 2018 00:03

12 posts were split to a new topic: Automating duplicacy with PowerShell scripts

Christoph · 15 February 2018 20:53

if the Duplicacy process gets killed it won’t get the chance to save the incomplete file. The CLI version doesn’t have this issue when you press Ctrl-C.

if there is an incomplete file Duplicacy will know which files to skip so it won’t produce many Skipped chunk messages. If that file is absent, it doesn’t know which files had been uploaded in the previous backup (nor does it know there was a previous backup), so it will start from the first file and attempt to upload every chunk.

I have to come back to this because I cannot confirm the above. I’ve had the suspicion for a while but never was entirely sure. Now I am: when I stop an ongoing backup with Ctrl + C in PowerShell and then restart the same backup, I will see Skipped chunk messages for hours (literally) before it resumes the actual upload.

gchen · 16 February 2018 03:21

If you don’t see a log message “Incomplete snapshot saved to”, then it means Duplicacy isn’t getting the Ctrl-C signal. I’m not sure if PowerShell would pass the signal to Duplicacy when it is running inside a script, but if you run Duplicacy from the command line it should get the signal.
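
As a rough illustration of why the signal matters, here is a minimal Python sketch of a backup loop that saves its progress when it receives Ctrl-C. It is not Duplicacy's implementation; `run_backup`, the `upload` callback, and the JSON state file are assumptions made for the example. Only the log message text is taken from this thread.

```python
import json


def run_backup(paths, upload, state_path):
    """Upload each path in order. On Ctrl-C, record what has finished so
    that a later run could skip it. Returns True if the backup completed."""
    finished = []
    try:
        for path in paths:
            upload(path)
            finished.append(path)
    except KeyboardInterrupt:
        # Python raises KeyboardInterrupt in the main thread when the
        # process receives Ctrl-C. If the process is killed outright, or
        # the signal never reaches it, this branch does not run and
        # nothing is saved.
        with open(state_path, "w") as state:
            json.dump({"uploaded": finished}, state)
        print(f"Incomplete snapshot saved to {state_path}")
        return False
    return True
```

The point of the sketch is that the save step lives entirely inside the interrupt handler, so a missing log message means the handler never ran.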

Christoph · 16 February 2018 07:01

I’m not running it as a script. I’m running it directly from the powershell command line.

Christoph · 19 February 2018 22:50

Today the backup failed/stopped because the destination drive became unavailable. I’d assume that in this situation duplicacy had all the time in the world to write the incomplete snapshot, but it didn’t:

Now it will have to skip chunks again for hours. Is there any way I can get it to save the incomplete snapshots?

gchen · 20 February 2018 04:28

Any chance you’re not running the latest version?

Christoph · 20 February 2018 08:42

It’s one of the v2.1.0 betas. Not sure if it’s the latest one though. Does it matter which beta?

gchen · 20 February 2018 19:01

Sorry, I forgot the partial snapshot file is saved for incomplete initial backups only. After the initial backup is done, it always uses the last complete backup as the base to determine which files are new. So if you add a lot of new files after the initial backup, you won’t be able to fast-resume after the backup is interrupted.
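
The base-selection rule can be sketched in Python as follows. This is a simplified model, not Duplicacy's code: it assumes files are compared by size and modification time, and `files_needing_upload`, `last_complete`, and `incomplete` are hypothetical names.

```python
def files_needing_upload(local_files, last_complete=None, incomplete=None):
    """Return the paths that have to be chunked again.

    Every argument maps path -> (size, mtime). The incomplete snapshot is
    consulted only when no complete snapshot exists, i.e. during an
    interrupted initial backup.
    """
    if last_complete is not None:
        base = last_complete      # later backups: incomplete state is ignored
    elif incomplete is not None:
        base = incomplete         # interrupted initial backup: fast resume
    else:
        base = {}                 # nothing known: start from the first file
    return [path for path, meta in sorted(local_files.items())
            if base.get(path) != meta]
```

For example, if `new1.txt` was uploaded during an interrupted second backup, it is still returned when `last_complete` is set, because it is not part of the last complete snapshot. Its chunks are then found in the storage and skipped one by one.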

Christoph · 20 February 2018 19:11

Oh. Well, that is bad news. Why are incomplete snapshots not saved for subsequent backups?

I have about 1 TB of data to back up and for various reasons, that won’t go through without interruptions. Without incomplete snapshots, I lose several hours every time during which duplicacy just skips chunks. And as the backup progresses, the longer that waiting time becomes.

May I suggest that in v2.1.0 this limitation to incomplete snapshots is taken away? Or if there are some serious downsides of that, to at least make it available as an option, -save-incomplete or something?

Christoph · 26 February 2018 09:11

This is really becoming a major issue for me.

What if I delete the first two snapshots? As a workaround, will duplicacy save the ongoing incomplete snapshot again? If so, would duplicacy prune -r 1-2 -storage <storage url> be the correct way of doing this? Or should I just delete the two files in the snapshot folder? (I obviously don’t want the chunks to be deleted, only the snapshots.)

gchen · 26 February 2018 18:14

You should manually delete the two files 1 and 2 under snapshots/snapshot_id in the storage.

Christoph · 26 February 2018 18:39

OK, I have done that, but assuming that duplicacy checks for the existence of previous snapshots at the beginning rather than at the end, this will only become effective the next time I start the backup command, right?

Just to give you an idea how serious a problem this behaviour of duplicacy is: It currently takes around 24 hours for the backup to resume actual uploads, i.e. it is skipping chunks for a full day. I really don’t see what the point is of limiting incomplete backups to initial backups only.

Usefulvid · 23 August 2018 06:50

Is this problem fixed in 2.1.1? Did I get it right that resuming is only possible for initial backups?

Christoph · 23 August 2018 10:19

No, resuming is always possible. But with initial backups duplicacy knows which chunks it can skip and the actual upload resumes much faster than when subsequent backups are resumed (because duplicacy has to “rediscover” every time which chunks have already been uploaded).

For smaller backups, this doesn’t really matter, but with larger ones, like I had, it can take a full day or longer until the actual upload resumes (i.e. duplicacy spends 24 hours just skipping chunks).

Christoph · 24 August 2018 05:43

Oh, wait. I now see you were probably referring to this:

And now I’m not sure anymore. My understanding was that while windows may produce additional reasons for the incomplete snapshot file missing, duplicacy by design only attempts to create it for the first snapshot. Is this correct @gchen? If so, could this behaviour be changed?