Resuming interrupted backup takes hours (incomplete snapshots should be saved)

So you’re saying that those Skipped chunk 1 size 8427437… messages are only produced when duplicacy has the incomplete file to check which files to skip? In other words: when duplicacy attempts to upload a file that turns out to already exist (and hence skips it) then this is not reported as Skipped chunk…? Fine.

No, it is the opposite: if there is an incomplete file Duplicacy will know which files to skip so it won’t produce many Skipped chunk messages. If that file is absent, it doesn’t know which files had been uploaded in the previous backup (nor does it know there was a previous backup), so it will start from the first file and attempt to upload every chunk. But a lot of chunks had indeed been uploaded and that was why you saw so many Skipped chunk messages.
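
To make the difference concrete, here is a toy model in Python. It is not Duplicacy's actual code: the function `run_backup`, the data layout, and the idea that the incomplete file is simply a list of finished files are all assumptions made for illustration. It only counts how many "Skipped chunk" messages a resumed run would produce with and without that file.

```python
def run_backup(files, storage, finished_files=None):
    """Toy backup loop; returns the number of 'Skipped chunk' messages.

    files: dict mapping file name -> list of chunk hashes (hypothetical layout)
    storage: set of chunk hashes already present in the storage
    finished_files: file names recorded by an incomplete snapshot, or None
    """
    finished = set(finished_files or [])
    skipped = 0
    for name, chunks in files.items():
        if name in finished:
            # Known from the incomplete snapshot: the file is not revisited.
            continue
        for chunk in chunks:
            if chunk in storage:
                # Chunk was uploaded before the interruption: one skip message.
                skipped += 1
            else:
                storage.add(chunk)  # stands in for the real upload
    return skipped


files = {"a.bin": ["c1", "c2"], "b.bin": ["c3", "c4"], "c.bin": ["c5"]}
uploaded = {"c1", "c2", "c3", "c4"}  # storage state after the interrupted run

with_file = run_backup(files, set(uploaded), finished_files=["a.bin", "b.bin"])
without_file = run_backup(files, set(uploaded))
print(with_file, without_file)
```

In this model the run that has the incomplete file goes straight to `c.bin`, while the run without it walks through every chunk of `a.bin` and `b.bin` again and reports each one as skipped.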

Okay. Fine. What about the issue that duplicacy did not log anything until the PowerShell was closed?

Duplicacy only writes to stdout. It is normal for Windows to create a buffer to take the output from a running process and only when the buffer is full will the content of the buffer be flushed to the file on disk.
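
The general effect of a write buffer can be shown with a short Python sketch. This is only an illustration of buffering as such: it does not reproduce PowerShell's `>>` redirection, and the helper name `size_before_and_after_flush` and the 8192-byte buffer size are assumptions chosen for the example.

```python
import os
import tempfile


def size_before_and_after_flush(lines, buffer_size=8192):
    """Write lines through a buffered file and report the on-disk size
    before and after an explicit flush."""
    fd, path = tempfile.mkstemp()
    os.close(fd)
    try:
        with open(path, "w", buffering=buffer_size) as log:
            for line in lines:
                log.write(line + "\n")
            before = os.path.getsize(path)  # data is still held in the buffer
            log.flush()
            after = os.path.getsize(path)   # data has now reached the file
        return before, after
    finally:
        os.remove(path)


before, after = size_before_and_after_flush(["Skipped chunk 1", "Skipped chunk 2"])
print(before, after)
```

As long as the total output stays below the buffer size, nothing shows up in the file until the buffer is flushed or the file is closed.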

Ah, thanks for explaining. But I’m afraid this doesn’t explain much of what I’m seeing because the first six lines appeared within a few seconds of starting the backup. But the following 236000 (sic!) appeared only when I stopped the task. So if there is a buffer, it is max 6 lines long. Which leaves the question why the other 236000 lines were kept in the buffer (or wherever they were kept)?

I understand that this is probably not a bug in duplicacy, but I suspect that there may be some room for improvement in how it interacts with Windows.

Or maybe there is something wrong with my script? Here it is:

# Basic setup (edit here!)
$backupID = "PC_D_christoph"
$repositorypath = "D:\christoph"
# Construct logfile name for the day
$logfiledate = get-date -format yyyy-MM-dd
$logfilename = "backup_$backupID-$logfiledate.log" 
# Go to repository 
Set-Location -Path $repositorypath >> "C:\duplicacy\logs\$logfilename"
$(get-date -Format "yyyy-MM-dd HH:mm:ss") + " *** Starting new backup of " + $(convert-path $(get-location).Path) + " ***" >> "C:\duplicacy\logs\$logfilename"
# Start backup (edit here!)
& "c:\Program Files (x86)\Duplicacy\duplicacy.exe" backup -vss -stats -limit-rate 4000 >> "C:\duplicacy\logs\$logfilename"
$(get-date -Format "yyyy-MM-dd HH:mm:ss") + " *** Backup of " + $(convert-path $(get-location).Path) + " stopped ***" >> "C:\duplicacy\logs\$logfilename"

Your script looks ok to me although I must admit that I have little experience with PowerShell.

12 posts were split to a new topic: Automating duplicacy with PowerShell scripts

if the Duplicacy process gets killed it won’t get the chance to save the incomplete file. The CLI version doesn’t have this issue when you press Ctrl-C.

if there is an incomplete file Duplicacy will know which files to skip so it won’t produce many Skipped chunk messages. If that file is absent, it doesn’t know which files had been uploaded in the previous backup (nor does it know there was a previous backup), so it will start from the first file and attempt to upload every chunk.

I have to come back to this because I cannot confirm the above. I’ve had the suspicion for a while but was never entirely sure. Now I am: when I stop an ongoing backup with Ctrl + C in PowerShell and then restart the same backup, I see Skipped chunk messages for hours (literally) before it resumes the actual upload.

If you don’t see a log message “Incomplete snapshot saved to”, then it means Duplicacy isn’t getting the Ctrl-C signal. I’m not sure if PowerShell would pass the signal to Duplicacy when it is running inside a script, but if you run Duplicacy from the command line it should get the signal.

I’m not running it as a script. I’m running it directly from the powershell command line.

Today the backup failed/stopped because the destination drive became unavailable. I’d assume that in this situation duplicacy had all the time in the world to write the incomplete snapshot, but it didn’t:

Now it will have to skip chunks again for hours. Is there any way I can get it to save the incomplete snapshots?

Any chance you’re not running the latest version?

It’s one of the v2.1.0 betas. Not sure if it’s the latest one though. Does it matter which beta?

Sorry, I forgot the partial snapshot file is saved for incomplete initial backups only. After the initial backup is done, it always uses the last complete backup as the base to determine which files are new. So if you add a lot of new files after the initial backup, you won’t be able to fast-resume after the backup is interrupted.
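
The rule described here can be sketched as a small Python function. It is a simplified model under stated assumptions, not Duplicacy's implementation: the name `files_to_process` is hypothetical, files are compared by name only, and modified files are ignored.

```python
def files_to_process(local_files, last_complete=None, incomplete=None):
    """Return the files a (hypothetical) backup run still has to process.

    local_files: set of file names in the repository
    last_complete: set of file names in the last complete snapshot, or None
    incomplete: set of file names saved by an interrupted initial backup
    """
    if last_complete is not None:
        # Subsequent backup: only the last complete snapshot serves as base,
        # so progress from an interrupted run is not taken into account.
        base = last_complete
    else:
        # Initial backup: fall back to the partial snapshot, if there is one.
        base = incomplete or set()
    return sorted(local_files - base)


local = {"old.txt", "new1.bin", "new2.bin"}

# Interrupted initial backup that got as far as new1.bin:
initial = files_to_process(local, incomplete={"old.txt", "new1.bin"})

# Interrupted subsequent backup: new1.bin was uploaded, but that is not recorded.
subsequent = files_to_process(local, last_complete={"old.txt"})
print(initial, subsequent)
```

In the second case `new1.bin` is processed again even though its chunks are already in the storage, which is where the long run of Skipped chunk messages comes from.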

Oh. Well, that is bad news. Why are incomplete snapshots not saved for subsequent backups?

I have about 1 TB of data to back up and for various reasons, that won’t go through without interruptions. Without incomplete snapshots, I lose several hours every time during which duplicacy just skips chunks. And as the backup progresses, the longer that waiting time becomes.

May I suggest that in v2.1.0 this limitation on incomplete snapshots be removed? Or, if that has serious downsides, could it at least be made available as an option, -save-incomplete or something?

This is really becoming a major issue for me.

What if I delete the first two snapshots? As a work around, will duplicacy save the ongoing incomplete snapshot again? If so, would duplicacy prune -r 1-2 -storage <storage url> be the correct way of doing this? Or should I just delete the two files in the snapshot folder? (I obviously don’t want the chunks to be deleted, only the snapshots.)

You should manually delete the two files 1 and 2 under snapshots/snapshot_id in the storage.

OK, I have done that, but assuming that duplicacy checks for the existence of previous snapshots at the beginning rather than at the end, this will only become effective the next time I start the backup command, right?

Just to give you an idea how serious a problem this behaviour of duplicacy is: It currently takes around 24 hours for the backup to resume actual uploads, i.e. it is skipping chunks for a full day. I really don’t see what the point is of limiting incomplete backups to initial backups only.

Is this problem fixed in 2.1.1? Did I get it right that resuming is only possible for initial backups?

No, resuming is always possible. But with initial backups duplicacy knows which chunks it can skip, so the actual upload resumes much faster than when subsequent backups are resumed (because duplicacy has to “rediscover” every time which chunks have already been uploaded).

For smaller backups, this doesn’t really matter, but with larger ones, like I had, it can take a full day or longer until the actual upload resumes (i.e. duplicacy spends 24 hours just skipping chunks).

Oh, wait. I now see you were probably referring to this:

And now I’m not sure anymore. My understanding was that while windows may produce additional reasons for the incomplete snapshot file missing, duplicacy by design only attempts to create it for the first snapshot. Is this correct @gchen? If so, could this behaviour be changed?
