The snapshot contains an error: The entry X/ appears before the entry Y/


#1

I have an issue backing up with Duplicacy on Windows: the backup fails when the OneDrive folder is symbolically linked into a directory that I back up with Duplicacy. I have had this sort of error before and got around it by changing the capitalisation of the symlinks, which somehow seemed to work. But I have added new directories and the problem has happened again, and I am not sure how to get around it.

See output below :-

C:\duplicacy-backup>dir
 Volume in drive C is OS
 Volume Serial Number is FE00-CF4F

 Directory of C:\duplicacy-backup

05/11/2018  04:56 PM    <DIR>          .
05/11/2018  04:56 PM    <DIR>          ..
05/11/2018  04:55 PM    <DIR>          .duplicacy
05/11/2018  03:37 PM    <JUNCTION>     Documents [c:\Users\xxxx\Documents]
02/11/2018  11:00 PM    <JUNCTION>     google-drive [c:\Users\xxxx\Google Drive]
05/11/2018  04:56 PM    <JUNCTION>     OneDrive [c:\Users\xxxx\OneDrive]
05/11/2018  03:37 PM    <JUNCTION>     Pictures [c:\Users\xxxx\Pictures]
               0 File(s)              0 bytes
               7 Dir(s)  140,822,466,560 bytes free

C:\duplicacy-backup>

C:\duplicacy-backup>duplicacy -log backup -storage wasabi-msduplicacydocuments-direct -vss -stats -threads 6
2018-11-05 16:57:36.485 INFO STORAGE_SET Storage set to minios://us-west-1@s3.us-west-1.wasabisys.com/msduplicacy/documents
2018-11-05 16:57:39.828 INFO BACKUP_START Last backup at revision 91 found
2018-11-05 16:57:39.833 INFO VSS_CREATE Creating a shadow copy for C:\
2018-11-05 16:57:51.748 INFO VSS_DONE Shadow copy {4ACC9E82-3C89-4A04-BE5D-7956138B0B7F} created
2018-11-05 16:57:51.752 INFO BACKUP_INDEXING Indexing C:\duplicacy-backup
2018-11-05 16:57:51.753 INFO SNAPSHOT_FILTER Loaded 0 include/exclude pattern(s)
2018-11-05 16:57:52.504 INFO BACKUP_THREADS Use 6 uploading threads
2018-11-05 16:57:53.116 INFO UPLOAD_PROGRESS Uploaded chunk 1 size 338, 338B/s 00:00:01 100.0%
2018-11-05 16:57:53.215 ERROR SNAPSHOT_CHECK The snapshot contains an error: The entry OneDrive/ appears before the entry Documents/
2018-11-05 16:57:53.234 INFO VSS_DELETE The shadow copy has been successfully deleted

Removing the OneDrive symlink of course fixes it, but I do want to back that folder up. The previous time it happened was also with the OneDrive folder, so I am not sure whether there is something particular about that directory or whether that is just coincidence.

Any assistance would be greatly appreciated.


#2

Further to the issue above, it appears that renaming the symbolic link for Documents to lower case fixes the issue. E.g. the following seemed to work :-

C:\duplicacy-backup>dir
 Volume in drive C is OS
 Volume Serial Number is FE00-CF4F

 Directory of C:\duplicacy-backup

05/11/2018  05:16 PM    <DIR>          .
05/11/2018  05:16 PM    <DIR>          ..
05/11/2018  05:25 PM    <DIR>          .duplicacy
05/11/2018  05:16 PM    <JUNCTION>     documents [c:\Users\xxxx\Documents]
02/11/2018  11:00 PM    <JUNCTION>     google-drive [c:\Users\xxxx\Google Drive]
05/11/2018  04:56 PM    <JUNCTION>     OneDrive [c:\Users\xxxx\OneDrive]
05/11/2018  03:37 PM    <JUNCTION>     Pictures [c:\Users\xxxx\Pictures]
               0 File(s)              0 bytes
               7 Dir(s)  139,608,571,904 bytes free

But I assume this is some sort of bug, and it would be good to understand the issue a little better to avoid the problem in the future.


#3

I can’t figure out what was wrong here. Can you run duplicacy -d backup ... to see which directory gets listed first?


#4

I have run with debug, and interestingly the very first time the word OneDrive appears in the log is the error line :-
2018-11-09 15:19:23.548 ERROR SNAPSHOT_CHECK The snapshot contains an error: The entry OneDrive/ appears before the entry Documents/

This happens after 100% of the other chunks are uploaded. The directories are listed in the order Documents, Pictures and then google-drive. When I change Documents to documents to get it working again, the directory order is Pictures, documents and then google-drive. But again OneDrive is missing from the logs, and in this case completely missing because the error line is not there either.
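To illustrate why the capitalisation might matter, here is a minimal sketch of an ordering check. It assumes (this is not confirmed anywhere in the thread) that the snapshot check compares entry names byte by byte, and that OneDrive/ ends up at the front of the entry list however the other links are named. The function names and the two lists are made up for the illustration; only the wording of the message is taken from the log.

```python
def first_order_violation(entries):
    """Return the first adjacent (earlier, later) pair that is out of
    byte-wise order, or None if the whole list is sorted."""
    for earlier, later in zip(entries, entries[1:]):
        if earlier.encode("utf-8") > later.encode("utf-8"):
            return earlier, later
    return None


def check(entries):
    """Mimic the wording of the SNAPSHOT_CHECK message from the log."""
    bad = first_order_violation(entries)
    if bad is None:
        return "ok"
    return f"The entry {bad[0]} appears before the entry {bad[1]}"


# Assumed list order: OneDrive/ first, then the order seen in the debug log.
# 'O' (0x4F) sorts after 'D' (0x44), so the check trips.
capitalised = ["OneDrive/", "Documents/", "Pictures/", "google-drive/"]

# With the link renamed, 'd' (0x64) sorts after 'P' (0x50), so OneDrive/
# at the front happens to be in a valid position and nothing is reported.
lowercased = ["OneDrive/", "Pictures/", "documents/", "google-drive/"]

print(check(capitalised))
print(check(lowercased))
```

If that assumption holds, renaming the link does not fix anything. It only moves the other entries so that the misplaced OneDrive/ entry no longer trips the check.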

So the above got me curious, so I checked the backups, and sure enough OneDrive is missing from them. So in a way the error was good, because it has helped me identify that OneDrive is completely missing from the backups, which is a bigger problem (I thought that by renaming the link I had worked around it).

So there seems to be something special about the OneDrive folder which is causing the issue, but I have no idea what. If I look at the directory in its original location, there is nothing special about it that I can see, other than that the folder icon has changed to show a cloud icon. And obviously I assume the folder was created by OneDrive. But I have no idea why Duplicacy does not seem to like it.


#5

Could you please run duplicacy -d backup -dry-run >alloutout.txt and attach that file here, to see if anything else appears?


#6

I have sent the requested files privately.


#7

I don’t use OneDrive myself, but I’m aware that, since Windows 8.1, it uses a Windows feature whereby files within OneDrive can be smart, offline or online-only, i.e. placeholders that point to the data, which is only downloaded when you try to open the file. Are any of your OneDrive files ‘online only’?

This CrashPlan KB (scroll down to the Important info box) suggests it isn’t possible to back up such files and that you should exclude the folder from backup.
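One way to answer that question is to look at the Windows file attribute bits. The sketch below decodes the bits that usually mark reparse points and placeholder files. The numeric values are the Win32 constants; the helper names are made up for this example, and whether a particular OneDrive version actually sets these bits on your files is an assumption you would need to verify on your own machine.

```python
import os

# Windows file attribute bits (values from the Win32 headers).
FILE_ATTRIBUTE_REPARSE_POINT = 0x00000400
FILE_ATTRIBUTE_OFFLINE = 0x00001000
FILE_ATTRIBUTE_RECALL_ON_OPEN = 0x00040000
FILE_ATTRIBUTE_RECALL_ON_DATA_ACCESS = 0x00400000

FLAGS = {
    FILE_ATTRIBUTE_REPARSE_POINT: "reparse point",
    FILE_ATTRIBUTE_OFFLINE: "offline",
    FILE_ATTRIBUTE_RECALL_ON_OPEN: "recall on open",
    FILE_ATTRIBUTE_RECALL_ON_DATA_ACCESS: "recall on data access",
}


def describe_attributes(attrs):
    """Return the names of the placeholder-related bits set in attrs."""
    return [name for bit, name in FLAGS.items() if attrs & bit]


def find_flagged(root):
    """Yield (path, flags) for every entry under root with one of the bits set.

    st_file_attributes only exists on Windows; elsewhere this yields nothing.
    """
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            path = os.path.join(dirpath, name)
            try:
                attrs = getattr(os.lstat(path), "st_file_attributes", 0)
            except OSError:
                continue  # unreadable entry: skip it rather than abort the scan
            flags = describe_attributes(attrs)
            if flags:
                yield path, flags
```

Running `for path, flags in find_flagged(r"c:\Users\xxxx\OneDrive"): print(path, flags)` on the real folder would then show which entries, if any, are not plain local files.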


#8

This most definitely has to do with the OneDrive folder being a reparse point.

When added directly to the backup set, Duplicacy knows to skip it. This is the only logical way to handle it: only some files are present, so the backup would not be deterministic, and it is therefore best to skip it rather than pick up an unknown number of who knows which files.

However, in OP’s case it’s a symlink to a reparse point. I have no idea how that should behave, but the sort of undefined behavior OP is seeing is not unexpected.

Perhaps the correct way to address this is for Duplicacy to actively skip first-level symlinks if they point to a reparse point (edit: or its children!), and at least warn the user with a log message.

If OneDrive is to be backed up, it should be synced in its entirety, in which case it becomes just a regular folder.


#9

All my OneDrive files are downloaded, so they should be able to be backed up. The CrashPlan KB “important note” only refers to issues with files that are not downloaded locally, which obviously can’t be backed up, so it does not apply to me. The rest of the KB seems to confirm that it should be doable.

I have noticed that, for some reason, Duplicacy only follows symbolic links at the first level. So backing up a repository that has symbolic links is fine. But if those symbolic links point to a folder that also contains symbolic links, then it does not follow those. Is that on purpose, or is that a bug? If it is on purpose, I think that is dangerous, especially if it does not trigger some sort of error, as it is common for people to only check their backups when they need them, and this implementation might lead to the assumption that something was backed up when it was not. I also see no reason for not backing up all the symbolic links, no matter where in the tree they appear. This is especially true for a powerful tool like Duplicacy that does deduplication, because no extra space will be used to back up the same file in different backup jobs.

Anyway, I did try backing up the OneDrive folder directly (i.e. in c:\user\xxxx\OneDrive) rather than via a symbolic link, and that indeed works just fine. So I suspect my issue might be similar to the reason Duplicacy does not like backing up a symbolic link to a symbolic link? This is even though I have EVERYTHING synced locally in OneDrive and, as far as I am aware, all the files are local.
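Until that is settled, a small script can at least warn about links that would be silently skipped. This is only a sketch: the function name is made up, it relies on the first-level-only behaviour described above, and `os.path.islink` does not report Windows junctions (newer Python versions have `os.path.isjunction` for those), so on a setup like mine it would need adapting.

```python
import os


def nested_symlinks(repository):
    """Return, sorted, every symlink found below the first level of repository.

    First-level entries are followed even when they are links themselves,
    matching the behaviour described above; anything deeper that is a link
    is reported instead of being followed.
    """
    found = []
    for entry in os.scandir(repository):
        if not entry.is_dir():  # is_dir() follows a first-level link
            continue
        for dirpath, dirnames, filenames in os.walk(entry.path, followlinks=False):
            for name in dirnames + filenames:
                path = os.path.join(dirpath, name)
                if os.path.islink(path):
                    found.append(path)
    return sorted(found)


if __name__ == "__main__":
    for path in nested_symlinks(r"C:\duplicacy-backup"):
        print("WARNING: nested link would not be followed:", path)
```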


#10

I don’t use symlinks much, but you can find some information in this topic:


#11

Thanks for pointing that out. Though I am not sure I totally agree with the point of view presented by TheBestPessimist. Obviously, it is not my software, so my point of view matters little. But I think it is dangerous for backup software not to default to a position of “if in doubt, back it up or at least clearly warn”. All too often, the only time a user properly checks their backup is when they really need it. So quietly missing directories that end users might assume are backed up, because the job ran without errors, is really dangerous. In a lot of cases, end users might not even be particularly aware of symlinks and the subtle differences in outcome due to them.

I appreciate the concern about circular symlinks putting the backup software into a loop, or other unintended consequences of symlinks pointing back to root directories etc. But I would not be too worried about this. If a link was circular, I assume the backup job would pretty quickly error out with a path too long or something similar, and you could then investigate and address it. If symlinks pointed to other root directories, again, you would probably have a clue because your backup jobs were bigger than expected and taking more space than expected. In either case, I think one of two things will happen, neither of which would be as traumatic as one day discovering that the files you need to restore do not exist in the backup. Option 1) is that you back up extra stuff and don’t realise (which does not sound like a big problem to me), or 2) you notice a problem with the backups, and can then investigate and fix it to your satisfaction (i.e. exclude the symbolic link that causes the problem etc).

Because of the “Repository”-centric view Duplicacy takes, for some users that can present a scalability issue, i.e. a lot of users will need lots of “backups” of lots of repositories to get their backups done, and at some point this becomes more difficult and confusing to manage. I strongly believe backups should as far as possible adopt the “KISS” principle AND default to backing up more rather than less, or you increase the risk that what you need will not be there when you need it. So symbolic links are a nice way to consolidate lots of backups into a simple single job, if that is what suits the end user best. But as things stand, symbolic links below the first level are not followed, hence the importance of allowing symbolic links to be followed all the way down the tree.
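For what it is worth, a loop does not have to end in a path-too-long error. Below is a minimal sketch of a walk that follows links all the way down and stops only when a link leads back to one of its own ancestors. It is not how Duplicacy works; the function name is made up, and it assumes the operating system can resolve each link to a real path.

```python
import os


def walk_following_links(root):
    """Return (files, loops): every file reachable from root with directory
    links followed, plus the links skipped because they point back to one
    of their own ancestor directories."""
    files, loops = [], []

    def visit(path, ancestors):
        real = os.path.realpath(path)
        if real in ancestors:
            loops.append(path)  # following this link would never terminate
            return
        ancestors = ancestors | {real}
        try:
            entries = sorted(os.scandir(path), key=lambda e: e.name)
        except OSError:
            return  # unreadable directory: skip it in this sketch
        for entry in entries:
            if entry.is_dir():  # True for links to directories as well
                visit(entry.path, ancestors)
            else:
                files.append(entry.path)

    visit(root, frozenset())
    return files, loops
```

Only the ancestors of the current directory are tracked, so a folder reached through two different links is walked twice. With deduplication that costs little extra storage, and the `loops` list gives the tool something concrete to warn about.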


Follow symlinks in subfolders?