Using VSS when backing up files (Windows)

I agree, no doubt about that. It would be extremely valuable if duplicacy maintained a list of known default exclusions for each OS (just like CrashPlan, which you mentioned, and pretty much every other backup program (Arq, Kopia, even Backblaze Personal) has evolved to do).

On macOS this is already accomplished by the Time Machine exclusion extended attribute; there it is indeed working as you describe: set it to back up everything and it will do the right thing. I’m using it on macOS and don’t bother with filters, other than Logs/Caches and .Trash, which still must be excluded manually…

On Windows and other OSes, unfortunately, there is no way to do that other than via a centralized exclusion list or by peppering your filesystem with .nobackup markers.

Anyway, this is a candidate for another feature request.

If a file cannot be opened, Duplicacy will skip it. However, if a file can be opened but can’t be read, then Duplicacy will simply quit. In this case it is caused by a poorly implemented VSS writer, but it can also indicate a bigger problem such as disk corruption. So I think it is better to just give up on the backup.

Respectfully, I disagree, at least in this case. There’s no disk corruption: it’s a binlog file, presumably used by Telegram’s database, so it must be a case that isn’t handled properly by VSS, by Duplicacy, or by something else. Telegram is one of the most popular messengers on the planet, so it would make sense not to abort the entire backup here, as this will be happening to many users.

Furthermore, I think any sort of read issues should be treated as skips with warnings, just like they’re already treated for all the other cases listed. It should be on the user to deal with real potential corruption, not Duplicacy. Duplicacy should skip and move on.

Why can’t Duplicacy just skip the problematic file (regardless of the underlying problem), log the error/warning, and continue backing up the rest of the repository that can be read and is in need of backing up?

Fully agree here with @fisowiw784 and @archon810.

It could also be, as the message seems to indicate, that a portion of the file is locked (e.g. memory-mapped) as part of a perfectly legitimate operation.

Let’s think about it this way: There are only two answers to the question “Can duplicacy read this file atomically and successfully in its entirety?”:

  • YES: Pack it up, move to the next file.
  • NO: Skip it, make a note, move to the next file.

The exact reasons why it couldn’t read the file are irrelevant. Maybe the disk has a bad sector. Maybe the file is open in exclusive mode by another program. Maybe something unexpected is going on that ultimately results in the read failing. Duplicacy shall make a note of the incident and continue.
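
For illustration, a minimal sketch of that skip-and-note loop (hypothetical names and structure, not Duplicacy’s actual code):

    package main

    import (
        "io"
        "log"
        "os"
    )

    // packFile reads the file in its entirety; any open or read error is
    // returned to the caller instead of aborting the whole process.
    func packFile(path string) error {
        f, err := os.Open(path)
        if err != nil {
            return err
        }
        defer f.Close()
        _, err = io.Copy(io.Discard, f) // stand-in for chunking and uploading
        return err
    }

    // backupFiles tries every file; a failure is logged and skipped,
    // never escalated into a fatal error for the remaining files.
    func backupFiles(paths []string) (skipped []string) {
        for _, path := range paths {
            if err := packFile(path); err != nil {
                log.Printf("WARN: skipping %s: %v", path, err)
                skipped = append(skipped, path) // NO: make a note, move on
                continue
            }
            // YES: packed up, move to the next file
        }
        return skipped
    }

    func main() {
        skipped := backupFiles(os.Args[1:])
        log.Printf("%d files skipped due to read errors", len(skipped))
    }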

Having a bad file planted among my important documents shall never preclude the backup of said documents.

On the flakiness of VSS: it happens all the time. On some Windows machines the filesystem gets corrupted routinely. It does not matter: duplicacy must do its best to suck in as much of the user’s data as possible.

I’m still not convinced that read errors should be skipped. Do you report an error, or merely a warning, at the end of the backup? If you report an error, then you still need to exclude these files anyway in order to silence the error.

Just a warning, e.g.:

3882 new and changed files backed up
12 files skipped due to read errors.

Skipping a few files does not make the backup a failure. In fact, it’s almost expected when running without VSS.

In other words, it’s hard to imagine a user deciding that, just because some files failed, they no longer want their other files backed up.

In yet other words: backup software must do its best to protect as much data as possible. This implies not giving up on minor failures. Lose a few fights but win the war.

What happens when VSS is enabled and then fails? Should Duplicacy plough on, potentially encounter more failures, and create an empty snapshot, making the next incremental take eons? That behaviour has already been reported by users, even recently.

And then it keeps happening because there’s a borked VSS writer or stuck service (often resolved by a reboot, in my experience), and the user is none the wiser because all their backups complete and yet are empty. :grimacing:

I’d rather it aborted and retried at the next scheduled run…

No difference.

Yes, and it should keep skipping failures and saving successes.

If everything fails, then yes. That’s the state of the filesystem now. The backup program does not get to decide what and when to back up. It shall keep trying.

That would not be the case. Duplicacy does not make incremental backups based on the previous one; each backup is incremental with respect to the entire dataset.

Duplicacy could warn the user when the number of files picked up or failed changes drastically between backup sets.

Why would this be any different? If VSS is screwed, it’s screwed. The user needs to repair the filesystem. And the next backup will be attempted anyway when the time comes; this does not justify giving up on the current one.

This can’t always be known, because Duplicacy may not see the files due to the error.

The SnapRAID helper script I use to schedule syncs and scrubs has a protection measure where a customisable threshold (e.g. 300) can be configured to detect, during the pre-diff run, whether that many files were deleted, and to abort if the threshold is exceeded.

Often I have to manually bump it up when curating the array, but it’s a pita to deal with in a backup program. Much much simpler to abort and retry later. As a user, I want to know as soon as major errors occur, not after my backup program glosses over them.

That’s not true.

Duplicacy indexes the fs and iterates through it, comparing against the list of files made during the last backup to determine what should be hashed and chunked. Normally this is new or modified files but the whole lot can be forced with -hash.

An empty previous snapshot would be the same as adding all the same data again, in a different directory or whatever. Or the same as a backup with -hash. Because all that data is ‘new’ according to Duplicacy.
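
For what it’s worth, a rough sketch of that compare-against-the-previous-list idea, with hypothetical types and names rather than Duplicacy’s internals:

    package sketch

    // fileEntry is a hypothetical stand-in for one row of the file list
    // stored alongside each snapshot.
    type fileEntry struct {
        Path    string
        Size    int64
        ModTime int64
    }

    // selectForHashing compares the current listing against the previous
    // snapshot's listing: only new or modified entries (or everything,
    // when forceHash mimics -hash) are re-hashed and chunked.
    func selectForHashing(current []fileEntry, previous map[string]fileEntry, forceHash bool) []fileEntry {
        var toHash []fileEntry
        for _, f := range current {
            old, seen := previous[f.Path]
            unchanged := seen && old.Size == f.Size && old.ModTime == f.ModTime
            if unchanged && !forceHash {
                continue // unchanged: reuse the chunks the previous snapshot references
            }
            toHash = append(toHash, f) // "new" data as far as this backup is concerned
        }
        return toHash
    }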

And that’s great. If it is only possible to read 0 files from the filesystem now, then that’s what this filesystem is now. It must be backed up as a filesystem with zero files. That’s accurate. Do you also expect the backup program to fix the filesystem, re-flow the RAM, and order you a new UPS when the old one can’t switch over reliably?

Duplicacy may, as a courtesy, report to the user when the number of files in the dataset changes drastically, the way OneDrive notifies users when it notices that many files are being deleted in bulk. But it’s just that: a courtesy notification.
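
A courtesy check like that would be cheap; a minimal sketch, with a made-up threshold and function name:

    package sketch

    import "log"

    // warnOnDrasticChange notifies (never aborts) when the number of files
    // picked up drops sharply compared with the previous backup.
    func warnOnDrasticChange(previousCount, currentCount int, maxDropRatio float64) {
        if previousCount == 0 {
            return // nothing to compare against
        }
        drop := float64(previousCount-currentCount) / float64(previousCount)
        if drop >= maxDropRatio {
            log.Printf("NOTE: file count dropped from %d to %d (%.0f%%); check that the source filesystem is mounted and readable",
                previousCount, currentCount, drop*100)
        }
    }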

Incremental, as in “no data will be transferred to the target when files re-appear”.

This is correct. Data went away and now some data reappeared. Maybe old, maybe new. Duplicacy can’t know. It shall re-hash. This is correct behavior.

After a filesystem crash, trying to optimize rehashing time is the least of users’ worries. They need to salvage data and replace the drive, and they need their backup to be up to date. It would be infuriating to find that duplicacy did not pick up the document they had been editing for the past couple of hours because the disk sector holding some temporary trash from the browser cache rotted and duplicacy “gave up”.

Edit:
Duplicacy also may, if VSS fails, attempt to back up the file(s) directly without VSS, also with a warning. Knowing what an unstable piece of turd VSS on Windows is, this would be a prudent, customer-friendly approach.
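
Something along these lines, purely as a sketch; the path handling and names are hypothetical, not how Duplicacy actually implements VSS:

    package sketch

    import (
        "log"
        "os"
        "strings"
    )

    // readWithVSSFallback first tries the file through the shadow copy path;
    // if that read fails, it retries the live file directly and records a warning.
    func readWithVSSFallback(livePath, sourceRoot, shadowRoot string) ([]byte, error) {
        shadowPath := strings.Replace(livePath, sourceRoot, shadowRoot, 1)
        data, err := os.ReadFile(shadowPath)
        if err == nil {
            return data, nil
        }
        log.Printf("WARN: VSS read failed for %s (%v); retrying without VSS", livePath, err)
        return os.ReadFile(livePath) // best-effort direct read of the live file
    }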

Duplicacy already aborts a backup, with an error, when 0 files are backed up. I’m quite happy with this arrangement, and no, I don’t expect it to fix anything. Just report the error, because it is one.

What about when 10% is excluded, or 90%? And Duplicacy happily creates partial backups - potentially forever - without generating an error!

Not relevant to what you said:

Which is technically incorrect, since Duplicacy has to rehash all the data again. Untouched data that was already in a previous backup, but is now disregarded because of an intermediate fs hiccup. That’s simply not correct behaviour.

This is a bug that needs to be addressed. I’ve reported it before, years ago.

Yes, because there is no error. All files that were possible to back up were correctly backed up. It is not possible to back up files that are unreadable. Kind of obvious.

What we are actually discussing here is whether there is some threshold of failures that should warrant notifying the user.

Not aborting the backup, just notifying the user: that courtesy note I mentioned above, which duplicacy could issue when the count of files in the dataset changes drastically. Or maybe even when at least one file was skipped: that way, to prevent constant nagging, users would ultimately have to write an exclusion pattern for those files, thereby making any message from duplicacy meaningful (a zero-warning policy).

For most users, transferring data takes time and costs money. Compute resources, by comparison, are practically free and abundant.

You are trading the mild inconvenience of spending extra CPU cycles rehashing content for the risk of skipping important, readable files. That makes zero sense to me.

The correct behavior is to protect as much of the user’s data as possible. That’s the goal. If that means re-hashing everything next time around, so what? CPU time is worthless; the user’s data is priceless.

Not a bug; in fact, a quick-and-dirty workaround to address the exact problem we’re talking about.

The rehashing is just undesirable behaviour. Not the first time it’s come up. Either way, I wouldn’t call a 30-fold increase in backup time acceptable behaviour, given that it can safely abort and retry on the next schedule.

The dangerous behaviour I’m more concerned about is a proposal to make Duplicacy NEVER fail when encountering a disk error, i.e. silently ignoring read errors. I chose Duplicacy because of its reliability, not to do silly stuff like that. Perhaps if you want to do that, the backup command could have a -persist option too?

I agree with your idea that Duplicacy could revert to using normal file access if VSS fails, although there’s no evidence that’d help; archon will probably have to exclude the binlog file anyway. Not such a big deal for a new repository.

Not silently. Notifying the users, as suggested above.

I guess we’re arguing about two different things:

  • You say duplicacy shall not ignore failures. I agree. It shall report them, and keep nagging the user until he or she configures exclusions such that no exceptions are reported during a normal backup run.
  • I say it shall nevertheless continue attempting to back up whatever else it can, and not abort on the first error, simply because further down the queue there could be a document the user spent hours editing, and it shall be given a chance to be backed up. If this comes at the cost of a 30x longer next backup, well, you’ve just salvaged the user’s data from a disintegrating system. That’s a very small price to pay. The machine is dead anyway.

This is backwards. Backing up an empty dataset is not an error. What should be detected as an error instead is a VSS failure, a network failure, or a missing mount point. If duplicacy can’t distinguish between these failures, that’s a bug; checking for a zero file count is a dirty hack of a workaround that needs to be undone once correct error handling is implemented (i.e. reporting a message such as “selected mount point is missing”, “VSS timed out”, or whatever else applies).

This problem is not unique to duplicacy, and the approach to solving it can be borrowed from other tools that do exactly what’s suggested: report that the mount point is missing and carry on with backing up the rest.

How is the user notified something urgent needs to be fixed with their backup jobs if there can never be such a thing as a failed backup when disk read errors occur??

I see small numbers of WARNs in my logs all the time, for junction points and locked files (with VSS). It’s a reasonable balance. Harder errors like disk read failures will just get lost amongst a sea of green marks in the UI. Nothing appears amiss if you don’t look at the logs, unless you fail the backup and don’t save the snapshot.

Not saying things can’t be improved here, e.g. try non-VSS when VSS fails; exceed 100 skipped files? Error out.

In fact it’d be nice if Duplicacy could adopt the incomplete snapshot concept for incrementals as it does for an initial backup. (And make the incomplete data accessible.) That’d solve everyone’s needs.

Well I guess if you want that behaviour, Duplicacy could always offer a -persist option.

Via the same mechanism used for notifications about a failed backup. Or perhaps I don’t understand your question.

In the email sent, there would be three sections:

  1. Fatal errors: Could not connect to the target, could not transfer data, could not upload snapshot, etc.
  2. Warnings: Unexpected failures to read a few files: bad sectors, VSS crap, SIP, other stuff that should work but did not, etc.; list those files.
  3. Notes: Expected failures to read a few files: reparse points, symlinks, other stuff that is known to be unsupported but was included, etc.; list them too.

A successful backup means no fatal errors. The user is supposed to review the warnings and notes and fix them. I would also not send an email at all if there are no warnings, no notes, and no failures. All is good, so don’t say anything. There is enough spam as it is.
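
In code terms, the report could be as simple as this hypothetical shape:

    package sketch

    // BackupReport groups everything worth telling the user after a run.
    type BackupReport struct {
        FatalErrors []string // could not connect, transfer data, or upload the snapshot
        Warnings    []string // unexpected read failures: bad sectors, VSS, SIP, ...
        Notes       []string // expected skips: reparse points, symlinks, ...
    }

    // Succeeded: the backup is successful as long as there were no fatal errors.
    func (r BackupReport) Succeeded() bool {
        return len(r.FatalErrors) == 0
    }

    // ShouldNotify: stay silent when there is nothing to report at all.
    func (r BackupReport) ShouldNotify() bool {
        return len(r.FatalErrors)+len(r.Warnings)+len(r.Notes) > 0
    }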

(A separate discussion is whether backup tool developers should maintain exclusion lists for mainstream OSes rather than put the burden on the user.)

Also, none of what’s listed in Warnings or Notes should preclude completing the rest of the backup. Only the user knows which failures and which files are important. Duplicacy cannot take it upon itself to abort backing up one set of files because another set of files failed. I keep repeating this because it is critical: the safety of user data is paramount here; CPU usage is not.

Fix it. Add exclusions. See next comment.

It’s not, because, just as you said, critical warnings will drown in a sea of unimportant ones. Empty logs are happy logs.

How will you know the backup failed? By receiving a notification.
That same notification can simply list warnings instead, without failing the backup and without missing files that are perfectly readable.

Why 100? Why not 10? Or 1? Any magic number here will be wrong. What if my external filesystem got mounted with the wrong permissions? All 30,000 files are now unreadable. Shall the backup halt and skip backing up the document I just edited in my home folder? Answer: heck no!

Oh, yes! Another related killer feature would be to continue taking local filesystem snapshots at the scheduled cadence, even if a backup cannot be started or completed for other reasons (such as no network, or running on battery), and then, when connectivity is restored, process and transfer the data for all of those snapshots at once. I don’t think anyone besides Time Machine does this today. This would be a huge selling point for folks like myself.
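
Conceptually it is just a queue of locally taken snapshots that gets drained once the target is reachable again; a hypothetical sketch:

    package sketch

    import "time"

    // pendingSnapshot records a local filesystem snapshot taken on schedule
    // while the backup itself could not run (no network, on battery, ...).
    type pendingSnapshot struct {
        TakenAt    time.Time
        SnapshotID string // identifier of the local snapshot (APFS, VSS, ...)
    }

    // drainPending uploads every queued snapshot once conditions allow,
    // oldest first, so no scheduled point in time is lost.
    func drainPending(queue []pendingSnapshot, canUpload func() bool, upload func(pendingSnapshot) error) []pendingSnapshot {
        for len(queue) > 0 && canUpload() {
            if err := upload(queue[0]); err != nil {
                break // try again at the next opportunity
            }
            queue = queue[1:]
        }
        return queue
    }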

That’s a copout. Offering a multitude of configuration options only moves the burden of making a choice (one the developer could not make!) onto the user. The developer needs to decide on the right approach and stick to it, and, optionally, teach the user the correct way.

Herein lies my problem with your logic. As said previously, WARNs can happen on almost every single backup. Perfectly normal behaviour; VSS isn’t perfect. Now you expect to flag disk read errors as WARNs and let the user sift through all that? Never an error.

What you’re advocating for is ‘silent failure’, however you want to put it.

Not possible. Locked files. These are random.

Quite funny, really. You expect me to add exceptions for junction points, but not for others to do the same with troublesome binlogs and other .lock files that VSS can’t deal with.

Couldn’t agree more, which is why I think your proposal is unsafe. The underlying issue may mean a complete snapshot is never created again, the last known good backup is no longer protected from pruning, and the user is not properly informed that anything is wrong.

Apparently, he has, and agrees that it should fail on such disk read errors.

I’d never use a -persist option either, but I definitely don’t want the default behaviour to change. I very much favour incomplete snapshots for incrementals as the better compromise, although I recognise the implementation could take a fair bit of effort. :slight_smile:

I’m the type of person who always, without fail, checks every file that isn’t backed up. If a file fails to back up, for whatever reason, I still expect my other files to be backed up. Give me those warnings and errors; I will deal with them. I absolutely do not want any ‘silent failures’ in any way!

Back up everything, but tell me what failed. Every backup program I’ve used follows this behaviour, even IDrive.

I also think incomplete should definitely be implemented for all backups, not just the initial backup, if only to make resuming aborted/cancelled backups faster.