Proposal: following symlinks by pattern

This is to address issues raised in recent posts such as "Expanded Symlink Support" and "Excluded symlinks are still included in backup".

My proposal is to add a new field to the preferences file:

   ...
   "symlink_follow": ["^dir1/link.*", "^dir2/[^/]+$"]
   ...

The value is a list of regex patterns. If a symlink matches any regex pattern in the list, then the symlink will be followed. If the target is a file, the target file will be backed up. If the target is a directory, the target directory will be listed.

If a symlink doesn’t match any pattern in the list, the symlink is backed up as a symlink (if it is not excluded by the filters file).
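
To make the matching rule concrete, here is a minimal sketch in Python (Duplicacy itself is written in Go, so this is only an illustration, not the actual implementation). It assumes symlink paths are relative to the repository root with no leading slash, and that a pattern matches if it is found anywhere in the path; `should_follow` is a made-up name.

```python
import re

# Example patterns from the proposal above.
SYMLINK_FOLLOW = ["^dir1/link.*", "^dir2/[^/]+$"]


def should_follow(symlink_path, patterns):
    """Return True if the symlink path matches any pattern in the list.

    Assumption: symlink_path is relative to the repository root,
    e.g. "dir1/link_to_docs", with no leading or trailing slash.
    """
    return any(re.search(pattern, symlink_path) for pattern in patterns)


for path in ["dir1/link_to_docs", "dir2/photos", "dir2/sub/link", "dir3/link"]:
    action = "follow" if should_follow(path, SYMLINK_FOLLOW) else "back up as symlink"
    print(path, "->", action)
```

With these two patterns, dir1/link_to_docs and dir2/photos would be followed, while dir2/sub/link and dir3/link would be backed up as symlinks.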

Another change to be made is, when a symlink is to be followed and the target turns out to be a directory, the source directory (basically the path of the symlink with a trailing / added) is checked against include/exclude patterns in the filters file to determine if the source directory is excluded. Note that it is the source directory, not the target directory, that will be checked.
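
A rough sketch of that check, in Python for illustration only. The filters-file logic is not modeled here: `is_excluded` is a hypothetical callback standing in for it, and `resolve_followed_directory` is a made-up name. The only point is which path gets handed to the filter check.

```python
def resolve_followed_directory(symlink_path, target_path, is_excluded):
    """Decide whether a followed directory symlink gets listed.

    is_excluded is a hypothetical stand-in for the filters-file check.
    It receives the source directory (the symlink path plus a trailing
    slash), never the target directory.
    """
    source_dir = symlink_path.rstrip("/") + "/"
    if is_excluded(source_dir):
        return None  # excluded by its path in the repository
    return source_dir, target_path


# Example filter: exclude anything under dir1/cache/ in the repository.
def excluded(path):
    return path.startswith("dir1/cache/")


# Excluded: the source directory dir1/cache/ matches the filter.
print(resolve_followed_directory("dir1/cache", "/mnt/big/data", excluded))
# Listed: the target path contains "cache", but only the source is checked.
print(resolve_followed_directory("dir1/data", "/mnt/big/cache", excluded))
```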

If symlink_follow is null or an empty list, then a default pattern ^[^/]+$ is used. This pattern means only the symlinks in the root of the repository, not any subdirectories, will be followed, which is the current behavior.
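
The fallback could look roughly like this (again a Python sketch with made-up names, assuming repository-relative paths without a leading slash):

```python
import re

# Default from the proposal: only symlinks directly in the repository root.
DEFAULT_SYMLINK_FOLLOW = ["^[^/]+$"]


def effective_patterns(symlink_follow):
    """Return the configured patterns, or the default if null or empty."""
    if not symlink_follow:
        return list(DEFAULT_SYMLINK_FOLLOW)
    return list(symlink_follow)


def follows(symlink_path, symlink_follow=None):
    """Check a symlink path against the effective pattern list."""
    patterns = effective_patterns(symlink_follow)
    return any(re.search(pattern, symlink_path) for pattern in patterns)


print(follows("link_in_root"))       # root-level symlink
print(follows("dir1/nested_link"))   # symlink inside a subdirectory
```

With no patterns configured, the root-level symlink matches the default and the nested one does not.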

I don’t want to add symlink_follow as an option to the backup command, because that would be error-prone. For instance, you may normally run the backup only from a script with carefully selected symlink_follow patterns, but one day you accidentally type duplicacy backup by hand, and your backup will be screwed up.

These symlink_follow patterns will be stored in the snapshot files so you’ll be able to go back and check what patterns were used in the past.

I have mixed feelings about this feature request.

Another change to be made is, when a symlink is to be followed and the target turns out to be a directory, the source directory (basically the path of the symlink with a trailing / added) is checked against include/exclude patterns in the filters file to determine if the source directory is excluded. Note that it is the source directory, not the target directory, that will be checked. (emphasis @TheBestPessimist)

Yes, I think this could be the solution for "Excluded symlinks are still included in backup" and all the related issues.

To me this sounds a lot like the existing include/exclude patterns in the filters file, which just need more tweaking. Duplicacy already lists everything and matches it against the filters file, choosing what to include and what to exclude.

By adding the symlink_follow feature, it seems like we are duplicating a lot of what the include patterns are supposed to do.

A different suggestion I’m making is to tweak/empower the include patterns (i:regex or +filepath) so that symlinks nested more deeply than the repository root are followed, but only if the path currently being checked (file, folder, or symlink to anything) matches an include pattern.

I thought about that, by adding actions after the patterns. For instance, +dir1/*.lnk: follow means to follow any symlinks with the suffix .lnk under dir1. This would serve as the basis for other extensions, but I didn’t go this route because 1) it is a much bigger change and 2) backward compatibility (with the current behavior of following first-level symlinks) is hard to enforce.
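
For what it’s worth, parsing such a line might look like the sketch below (Python, illustration only). The ": follow" suffix is the hypothetical syntax from the paragraph above, not something that exists today, and `parse_filter_line` and `KNOWN_ACTIONS` are made-up names. Only a known action after the last colon is treated as an action, so regex patterns such as i:... that contain colons are left alone.

```python
# Hypothetical set of actions; only "follow" is discussed in this thread.
KNOWN_ACTIONS = {"follow"}


def parse_filter_line(line):
    """Split a filter line into (pattern, action).

    The action is None when the line has no recognized action suffix,
    in which case the whole line is returned as the pattern.
    """
    pattern, separator, tail = line.rpartition(":")
    action = tail.strip()
    if separator and action in KNOWN_ACTIONS:
        return pattern.rstrip(), action
    return line, None


print(parse_filter_line("+dir1/*.lnk: follow"))
print(parse_filter_line("+dir1/*"))
print(parse_filter_line("i:^dir1/"))
```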

What I proposed is simpler than this (i.e. than adding a ": follow", which implies the user already knows that this is a symlink to a file or folder).
If the user types

+dir1/*

then nothing really changes: no symlinks are followed if they are not in the repo root (which in this case they aren’t).
However, if there is a symlink

dir1/some/path/to/a/symlink_folder

(since symlinks currently appear without a / at the end in Duplicacy, I have also written it like that) and the user has the following lines in filters

  • option 1
+dir1/some/path/to/a/symlink_folder/

(here I have written it with a / at the end, since on both Windows and macOS I see folder symlinks as normal folders! There’s no indication this is a symlink unless I test it in the CLI)

  • option 2
+dir1/*
+dir1/some/path/to/a/symlink_folder/

then the symlink is followed since the folder is explicitly included.

The same option exists for a symlink to a file: we add the following to filters:

+dir1/some/path/to/a/symlink_file

[later edit]: In case the include pattern is written like a file (as in the example just above), a folder should also be included, since

+dir1/some/path/to/a/a_folder

and

+dir1/some/path/to/a/a_folder/

currently behave similarly (afaik; please correct me if I’m wrong here, as I have only used regexp filters so far).


Basically, the way I imagine this feature is that users write the path as they see it in their file explorer: symlinks to files or folders appear just like regular files/folders to the user, so Duplicacy should work things out the same way the user sees them.
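
Here is one possible reading of this suggestion as a Python sketch (illustration only, all names made up). It assumes that in wildcard include patterns * and ? do not match the path separator, and that a trailing / on the pattern is optional when comparing against a symlink path. Exclude patterns, regex patterns and pattern ordering are ignored to keep it short.

```python
import re


def wildcard_to_regex(pattern):
    """Translate a wildcard pattern into an anchored regex.

    Assumption: * and ? never match the path separator.
    """
    parts = []
    for ch in pattern:
        if ch == "*":
            parts.append("[^/]*")
        elif ch == "?":
            parts.append("[^/]")
        else:
            parts.append(re.escape(ch))
    return "^" + "".join(parts) + "$"


def should_follow_symlink(symlink_path, filter_lines):
    """Follow root-level symlinks as today; follow nested ones only if
    some +pattern matches the symlink path (trailing / optional)."""
    if "/" not in symlink_path:
        return True  # current behavior for symlinks in the repo root
    for line in filter_lines:
        if not line.startswith("+"):
            continue  # only wildcard include patterns in this sketch
        pattern = line[1:].rstrip("/")
        if re.match(wildcard_to_regex(pattern), symlink_path):
            return True
    return False


link = "dir1/some/path/to/a/symlink_folder"
print(should_follow_symlink(link, ["+dir1/*"]))
print(should_follow_symlink(link, ["+dir1/*", "+dir1/some/path/to/a/symlink_folder/"]))
```

With only +dir1/* the deeply nested symlink is not followed; adding the explicit include line makes it followed. Note that under this reading a symlink sitting directly in dir1 would match +dir1/* and be followed too; whether that is desirable would still need to be decided.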
