Graceful Handling of Failed and Incomplete Snapshots

I think by ‘storage arrays’ he’s referring to the repository source disks here - not the destination storage.

Oh, I see… In that case it’s quite possible to get an empty but valid backup…
What would help here is a pre-backup script that can fail the entire execution if the source is not available…

Duplicacy should indeed fail immediately if either the source OR the destination volume is not present. Is that the case? (I have not tried it yet, but this seems like essential basic behavior.)

1 Like

No. I think that for this to happen, someone would have to manually implement a way for the CLI to check mount status for any given backup path on each and every supported platform, and make it work for any conceivable mount type.

From what I can tell, there is no cross-platform method to do this out of the box in Go.

It’s possible to make a pre-backup script that checks that there are more than 0 files in the backup path, though I haven’t actually tested it myself.
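
For what it’s worth, the core of such a check is tiny. Here is a minimal sketch as a standalone Go program you could compile and invoke from the pre-backup script, assuming (as suggested above) that a failing pre-backup script stops the backup; the path is just a placeholder:

    package main

    import (
        "fmt"
        "os"
    )

    func main() {
        // Placeholder path: substitute your own repository root.
        root := `Y:\ServerFolders\eYe Archive`
        entries, err := os.ReadDir(root)
        if err != nil {
            fmt.Fprintf(os.Stderr, "cannot list %s: %v\n", root, err)
            os.Exit(1) // unreadable source: a non-zero exit aborts the backup
        }
        if len(entries) == 0 {
            fmt.Fprintf(os.Stderr, "%s is empty; refusing to back up\n", root)
            os.Exit(1) // empty source: probably an unmounted volume
        }
    }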

I think this is a further-reaching issue than is being suggested, and at least in my experience it happens when it clearly shouldn’t. I also don’t think such elaborate means should be needed to fix it (though I say this with a bit of arrogance). I suspect many people suffer from this issue and don’t even know it, because they don’t back up obscene amounts of data (like I do) and they aren’t sitting there monitoring their backups. If a backup takes an hour instead of five minutes because something triggered a -hash, most people simply wouldn’t notice or care.

To get back to my original issue, I actually had this bite me again just yesterday. I brought my storage arrays online (for the sake of this discussion, just consider it plugging in an external USB drive). I then manually ran a backup in the WebUI. It failed, with the log saying it didn’t have access to the volume. Worst part of all (and the central reason for my issues, I believe): it went on with the backup(!), reporting 0 files at the source, 0 file chunks uploaded, and 3 metadata chunks, which I now think probably flagged the backup as empty and triggered a complete re-scan on future backups.

Running backup command from C:\Users\Administrator/.duplicacy-web/repositories/localhost/3 to back up Y:/ServerFolders/eYe Archive
Options: [-log backup -storage EYE_STORAGE_01 -threads 30 -stats]
2020-03-14 23:20:30.892 INFO REPOSITORY_SET Repository set to Y:/ServerFolders/eYe Archive
2020-03-14 23:20:30.892 INFO STORAGE_SET Storage set to gcd://(e)/DUPLICACY_EYE_01
2020-03-14 23:20:43.421 INFO BACKUP_START Last backup at revision 34 found
2020-03-14 23:20:43.421 INFO BACKUP_INDEXING Indexing Y:\ServerFolders\eYe Archive
2020-03-14 23:20:43.421 INFO SNAPSHOT_FILTER Parsing filter file \\?\C:\Users\Administrator\.duplicacy-web\repositories\localhost\3\.duplicacy\filters
2020-03-14 23:20:43.422 INFO SNAPSHOT_FILTER Loaded 28 include/exclude pattern(s)
2020-03-14 23:20:43.422 WARN LIST_FAILURE Failed to list subdirectory: open \\?\Y:\ServerFolders\eYe Archive: Access is denied.
2020-03-14 23:20:51.666 WARN SKIP_DIRECTORY Subdirectory  cannot be listed
2020-03-14 23:20:51.776 INFO BACKUP_END Backup for Y:\ServerFolders\eYe Archive at revision 35 completed
2020-03-14 23:20:51.776 INFO BACKUP_STATS Files: 0 total, 0 bytes; 0 new, 0 bytes
2020-03-14 23:20:51.776 INFO BACKUP_STATS File chunks: 0 total, 0 bytes; 0 new, 0 bytes, 0 bytes uploaded
2020-03-14 23:20:51.776 INFO BACKUP_STATS Metadata chunks: 3 total, 8 bytes; 3 new, 8 bytes, 882 bytes uploaded
2020-03-14 23:20:51.776 INFO BACKUP_STATS All chunks: 3 total, 8 bytes; 3 new, 8 bytes, 882 bytes uploaded
2020-03-14 23:20:51.776 INFO BACKUP_STATS Total running time: 00:00:18
2020-03-14 23:20:51.776 WARN BACKUP_SKIPPED 1 directory was not included due to access errors

I double-checked permissions and whatnot, ran it again, and got the same error. I restarted :d: and the backup started right away, showing that it needed to upload all 9TB of data to my Google Drive account. I am watching it as we speak as :d: reads every…single…file in my local repository, with the WebUI reporting a transfer rate of 50MB/s, which works out to about 48 hours of thrashing my disks. Of course, no actual data is being uploaded, because every single bit of it is already in my Google Drive storage. So even though :d: is currently reporting it has uploaded over 2TB, my network traffic indicates around 600MB has come and gone through the duplicacy service from checking (not uploading) every damn chunk.

As a semi-“power user” I understand enough to get myself in trouble, but I also have “normal human” expectations for how an application like this should function, and I believe I represent the majority of users in that sense. From my standpoint, this is what I’m seeing:

First, :d: shouldn’t freak out if it doesn’t complete a backup, no matter the reason for the interruption. I thought the lock-free system would be perfect for this: it should know which chunks were successfully uploaded, which weren’t, and which are incomplete and need to be tossed and re-uploaded.

Second, if it does fail, it should not require a complete re-scan of every single bit of data in the repository vs the destination storage.

Third, if it fails specifically because it can’t read data in the repository when it begins the backup, no matter the reason, it should fail gracefully. In fact, I would argue that it should not have even started in the first place, since that check should already be part of preparing the backup operation: check the source, check the destination.

Whether it be from the volume not being mounted, or not existing, or being write-protected, or reporting 0B of data… these are all red flags that should stop a backup from happening and be reported to the user, just as bad “options” already are (they just report “invalid options” and stop). They do not “fail”; they stop the operation before it starts.

I hope this gets addressed sooner rather than later, or at least that users get a way to bypass the issue.

Sidenote: during this current fiasco, I was thinking that if I do catch it performing a bad backup like this, I can delete that revision and try again, and maybe it will behave more gracefully. That doesn’t fix a single problem, but it may help me get out of a jam and work around the issue while it is hopefully being properly addressed.

1 Like

Yes, do that next time. Delete the snapshot with 0 files and it’ll be fine. It’s re-hashing because your next backup has nothing to compare against.

I wholeheartedly agree the Web UI should eventually have a source check before backup, but you need to understand that it’s not an easy task to cover all circumstances. Building this into the CLI would be quite problematic, so it could be a GUI task. On Linux, for example, a mount point that isn’t currently mounted is just an empty directory, and all programs (backup or otherwise) just see an empty directory.

In the meantime, for such external mounts, you may have to write a pre-script that checks for the existence of a specific file in the source directory before running the backup job. Both the CLI and the GUI can run pre-scripts.
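
The logic of such a pre-script is trivial. A rough sketch in Go (the sentinel path is hypothetical; it’s a marker file you would create once on the mounted volume):

    package main

    import (
        "fmt"
        "os"
    )

    func main() {
        // Hypothetical marker file that only exists when the volume is mounted.
        sentinel := "/mnt/backup-source/.mounted"
        if _, err := os.Stat(sentinel); err != nil {
            fmt.Fprintf(os.Stderr, "sentinel not found, volume likely unmounted: %v\n", err)
            os.Exit(1) // failing the pre-script keeps the backup from running
        }
    }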

1 Like

I agree that if there is an error accessing the root directory to be backed up, Duplicacy shouldn’t create an empty backup. I vaguely remember there was a discussion about this issue a while ago and we decided not to change the current behavior for some reason?

1 Like

FWIW I can’t think of a reason not to treat an error while accessing the root directory as a fatal error.

This is the only other thread I could find in the time I had to search, but there might be others.

2 Likes

I fully endorse that possibility :slight_smile:

I just can’t help but think there is an easier way to get around what appears to be such a basic problem, at least for the issues that I am currently dealing with, which are…

  1. I can’t create scheduled backups for source repositories that aren’t always on and always connected, since the absence of the volume will report 0B (no data) and trigger a complete re-scan on the next backup.
  2. I have to judiciously monitor my larger backups since (as demonstrated by my current issue) having the volume offline isn’t the only reason a backup might fail or return 0B.

Neither of these, in my mind, has to do with Duplicacy’s inability to recognize when a volume isn’t attached. It has to do with data being inaccessible (for whatever reason) and how :d: responds to that situation. At a minimum, and as @gchen described, Duplicacy should never create an empty backup. I can’t imagine a negative side effect from this, but my ignorance might be showing there. If no one else can either, what is the next step towards getting this implemented?

I think there’s an important distinction between this and what @gchen wrote. The comment quoted below says that Duplicacy shouldn’t create an empty backup under a specific scenario (an error accessing the root directory), not “never”. While it may not be true of your setup, some directories just look empty when unmounted rather than throwing errors.

I think that has been suggested elsewhere, but maybe there should be two changes.

  1. Make errors accessing the root directory fatal, so the backup fails before uploading an empty snapshot.
  2. For mounts that only look empty when unmounted, add an option to the backup command, something like -no-empty-root, that causes an empty root directory to also throw a fatal error (see the sketch below).
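
To make the proposal concrete, here is a rough sketch of what such a pre-flight check might look like in Go; the function name, the flag plumbing, and the messages are hypothetical, not Duplicacy’s actual code:

    package main

    import (
        "fmt"
        "os"
    )

    // checkRepositoryRoot is a hypothetical pre-flight check run before indexing.
    func checkRepositoryRoot(root string, noEmptyRoot bool) error {
        entries, err := os.ReadDir(root)
        if err != nil {
            // Change 1: an unreadable root becomes a fatal error instead of a warning.
            return fmt.Errorf("cannot access repository root %s: %w", root, err)
        }
        if noEmptyRoot && len(entries) == 0 {
            // Change 2: with -no-empty-root, an empty root is also fatal,
            // covering mounts that merely look empty when unmounted.
            return fmt.Errorf("repository root %s is empty; refusing to create an empty backup", root)
        }
        return nil
    }

    func main() {
        if err := checkRepositoryRoot("/mnt/source", true); err != nil {
            fmt.Fprintln(os.Stderr, err)
            os.Exit(2) // abort before any snapshot is uploaded
        }
    }
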
3 Likes

In the context of the rest of my post, I believe we are saying the same thing, but I understand the distinction you are making. Regardless, it seems everyone agrees that :d:’s response to being unable to access a repository should not be to create an empty backup. The situations that cause the issue, and the problems it creates, have now been well documented and agreed upon (and reported in many places on these forums), so I think we can focus on solutions.

These make sense to me, but I’m curious why the most popular suggested workaround, creating a pre-backup script, cannot also be the solution; that is, baking the same “check” on the repository into the default behavior of the backup command itself. I have prodded at this from various angles, but for this solution in particular, what are the downsides? Maybe I’m missing something that could really cause issues for people?

Here’s a thread that specifically talks about pre-backup scripting in the context of working around this issue (linking directly to a post on my experience setting this up on Windows)

From what I understood, the problem is that your pre-backup script works because it is written for your specific setup/environment. To bake such a script into the backup command itself would require it to work in all setups/environments, which is very difficult (or impossible?) for reasons that others can explain better.

1 Like

I suggested this method specifically because it works on all platforms, at least Windows and Linux-based ones (maybe there are lots of users on platforms where this isn’t an option?). The syntax is slightly different, of course, but the commands accomplish the same goal: asking a folder, file, mount point, etc., “are you there? can I access you?” and getting a yes/no response which Duplicacy can then act on.

My lack of knowledge on how something like this would be implemented may be showing though. I’m assuming there would be a way to detect the platform and issue the relevant command, or at least get creative…

Regarding pre-backup scripts in particular, you can ship both Windows and Linux scripts with no downside that I can see: on Windows, if I create a script called pre-backup.bat, it will run if it’s in the scripts folder. If it’s called pre-backup (no executable file extension), it is totally ignored on Windows, but on Linux it would have run, correct?

These commands appear to be very simple too; Duplicacy would only need to inject the root path(s) of the source/“repository”.

And if all of this is complete nonsense to the folks much more knowledgeable than myself, the option of simply skipping a backup when 0 files are detected would seem to get the job done and be platform-agnostic, no? This would also prevent 0B/bad backups from being factored into prune calculations, where some pretty common retention options would leave you with dramatically fewer revisions containing actual data.

If you’re referring to this post, where the script only checks whether a directory exists, this won’t work for all mount types. For example, on one of my Linux VMs I have several NFS mounts that just look like empty directories when they aren’t mounted.

So a pre-backup script that just checks if the path exists would say that it’s okay to run the backup, even if the NFS share isn’t mounted.
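
For what it’s worth, Unix systems do expose a way to detect this case: a directory with something mounted on it sits on a different device than its parent directory. A sketch of that check in Go follows (the mount path is hypothetical; note the Unix-only syscalls, which is exactly why this can’t be done portably):

    //go:build unix

    package main

    import (
        "fmt"
        "os"
        "path/filepath"
        "syscall"
    )

    // isMountPoint reports whether path sits on a different device than its
    // parent directory, which on Unix means something is mounted there.
    func isMountPoint(path string) (bool, error) {
        var st, parent syscall.Stat_t
        if err := syscall.Stat(path, &st); err != nil {
            return false, err
        }
        if err := syscall.Stat(filepath.Dir(path), &parent); err != nil {
            return false, err
        }
        return st.Dev != parent.Dev, nil
    }

    func main() {
        mounted, err := isMountPoint("/mnt/nfs-share") // hypothetical mount point
        if err != nil {
            fmt.Fprintln(os.Stderr, err)
            os.Exit(1)
        }
        if !mounted {
            fmt.Fprintln(os.Stderr, "NFS share not mounted; skipping backup")
            os.Exit(1)
        }
    }

This is the same trick the mountpoint(1) utility uses, and it would catch an unmounted NFS share. But it needs platform-specific code per OS, which is exactly the maintenance burden mentioned earlier in the thread.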

It’s a maintenance headache. You now have two scripts to maintain and keep in sync with any improvements (one for each platform), multiplied by the number of backups you want to have pre-backup scripts check before running.

Agreed. Is there an edge case you think these two suggested changes from earlier in the thread wouldn’t cover?

Thanks for the follow-up.

My script checks whether the source exists because, at least on Windows, this appears to be all I need (it worked with symlinks pointing to offline volumes too). I was assuming (possibly incorrectly) that a viable Linux script would be similarly easy to write; same logic, different syntax.

To put some context around the scripts, I look at these as inspiration for how something so simple seems to address the problem. I was/am assuming the same tactics could be built into Duplicacy natively using the tools and language Duplicacy is built on. Not knowing what tools are actually available to address this, I speak with willful ignorance and the intent of inspiring someone who actually knows more than I do and can make the leap to an actual solution.

@leerspace’s suggestions make perfect sense to me. As I see it, my suggestions above are exploring how to implement #1:

Looking at my log of a “successful” backup of an offline volume under Windows, I see three (overlapping) things Duplicacy is claiming knowledge of that should cause the backup to abort rather than complete “successfully”: (1) it can’t find the path, (2) it can’t list the contents, and (3) there were errors accessing a directory (in this case the root of the repository):

#1: WARN LIST_FAILURE Failed to list subdirectory: open \\?\I:\FiTB Archive: The system cannot find the path specified.
#2: WARN SKIP_DIRECTORY Subdirectory  cannot be listed
#3: WARN BACKUP_SKIPPED 1 directory was not included due to access errors

I’m pasting the entire log at the bottom for context.

For item #2 of @leerspace’s solution:

I see no possible side effects, and it has my vote and 100% support. Depending on how this is implemented, I wonder how :d: would respond to a situation like mine, where it can’t find the directory. I’m curious whether its logic distinguishes between “empty” and “inaccessible”, since the resulting behavior is currently the same under both circumstances.

I would even like to take things a step further and provide another option, like -skip-empty-backup, where if there is 0B to back up for any reason, the backup is skipped instead of completing successfully. I would never recommend this as a default behavior, though. I just like that it feels super easy to implement and would be an interesting alternative solution for my particular backup needs.

…and here’s the full log of my empty yet “successful” backup of an offline volume:

2020-03-29 13:23:52.652 INFO REPOSITORY_SET Repository set to I:/FiTB Archive
2020-03-29 13:23:52.653 INFO STORAGE_SET Storage set to gcd://(e)/DUPLICACY_FITB_01
2020-03-29 13:23:56.403 INFO BACKUP_START Last backup at revision 27 found
2020-03-29 13:23:56.403 INFO BACKUP_INDEXING Indexing I:\FiTB Archive
2020-03-29 13:23:56.403 INFO SNAPSHOT_FILTER Parsing filter file \\?\C:\Users\Administrator\.duplicacy-web\repositories\localhost\0\.duplicacy\filters
2020-03-29 13:23:56.403 INFO SNAPSHOT_FILTER Loaded 26 include/exclude pattern(s)
2020-03-29 13:23:56.403 WARN LIST_FAILURE Failed to list subdirectory: open \\?\I:\FiTB Archive: The system cannot find the path specified.
2020-03-29 13:23:58.325 WARN SKIP_DIRECTORY Subdirectory  cannot be listed
2020-03-29 13:23:58.336 INFO BACKUP_END Backup for I:\FiTB Archive at revision 28 completed
2020-03-29 13:23:58.336 INFO BACKUP_STATS Files: 0 total, 0 bytes; 0 new, 0 bytes
2020-03-29 13:23:58.336 INFO BACKUP_STATS File chunks: 0 total, 0 bytes; 0 new, 0 bytes, 0 bytes uploaded
2020-03-29 13:23:58.336 INFO BACKUP_STATS Metadata chunks: 3 total, 8 bytes; 0 new, 0 bytes, 0 bytes uploaded
2020-03-29 13:23:58.336 INFO BACKUP_STATS All chunks: 3 total, 8 bytes; 0 new, 0 bytes, 0 bytes uploaded
2020-03-29 13:23:58.336 INFO BACKUP_STATS Total running time: 00:00:03
2020-03-29 13:23:58.336 WARN BACKUP_SKIPPED 1 directory was not included due to access errors

With this commit, Duplicacy will error out if the repository root isn’t accessible, or if there are no files under the repository to be backed up.

6 Likes

woah. major thank you @gchen. I can’t wait to test this!

When is the next planned release that would include this change?

A new CLI release is planned for this week, followed by a new web GUI release next week.

4 Likes

This topic was automatically closed 10 days after the last reply. New replies are no longer allowed.