Filters/Include exclude patterns

#1

For the backup command, the include/exclude patterns are read from a file named filters under the .duplicacy directory. For the restore command, the include/exclude patterns are given as command line arguments.

Duplicacy offers two methods for providing include/exclude filters: wildcard matching and regular expression matching. You may use one method exclusively or combine them as needed.

All paths are relative to the repository (the folder you execute duplicacy from), without a leading “/”. As the topmost folder on Windows is a drive, this means drive letters are not part of the path of a pattern. The path separator is always “/”, even on Windows. Paths are case sensitive.
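
As a rough illustration of the path rules above, the sketch below turns an absolute path into the repository-relative form that a pattern is compared against. It uses Python's pathlib purely for demonstration; the helper name to_pattern_path and the example paths are made up, and this is not how Duplicacy itself is implemented.

```python
from pathlib import PurePosixPath, PureWindowsPath


def to_pattern_path(absolute, repository, windows=False):
    """Return `absolute` relative to `repository`, using "/" separators.

    Hypothetical helper, for illustration only.
    """
    flavour = PureWindowsPath if windows else PurePosixPath
    # relative_to() drops the repository prefix (and with it any drive
    # letter); as_posix() renders the remainder with "/" separators.
    return flavour(absolute).relative_to(flavour(repository)).as_posix()


print(to_pattern_path("/home/joe/repo/docs/notes.txt", "/home/joe/repo"))
print(to_pattern_path(r"C:\Users\joe\repo\docs\notes.txt",
                      r"C:\Users\joe\repo", windows=True))
```

Both calls print docs/notes.txt: no leading “/”, no drive letter, and “/” as the separator.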

1. Wildcard Matching

An include pattern starts with “+”, and an exclude pattern starts with “-”. Patterns may contain the wildcard characters “*”, which matches a path string of any length, and “?”, which matches a single character. Note that both “*” and “?” will match any character, including the path separator “/”.

When matching a path against a list of patterns, the path is compared with the part after “+” or “-”, one pattern at a time. Therefore, the order of the patterns is significant. If a match with an include pattern is found, the path is said to be included without further comparisons. If a match with an exclude pattern is found, the path is said to be excluded without further comparisons. If a match is not found, the path will be excluded if all patterns are include patterns, but included otherwise.
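
The decision rule described above can be sketched in a few lines of Python. This is only a model of the documented behaviour, not Duplicacy's code: the function names are invented, “*” is assumed to match zero or more characters, and an empty pattern list is assumed to include everything.

```python
import re


def wildcard_to_regex(body):
    """Translate the part of a pattern after "+" or "-" into a regex."""
    parts = []
    for ch in body:
        if ch == "*":
            parts.append(".*")          # any run of characters, "/" included
        elif ch == "?":
            parts.append(".")           # exactly one character, "/" included
        else:
            parts.append(re.escape(ch))
    return re.compile("".join(parts), re.DOTALL)


def is_included(path, patterns):
    """First matching pattern decides; otherwise fall back to the default."""
    if not patterns:
        return True                     # assumption: no filters, nothing excluded
    for pattern in patterns:
        sign, body = pattern[0], pattern[1:]
        if wildcard_to_regex(body).fullmatch(path):
            return sign == "+"
    # No match: excluded only if every pattern is an include pattern.
    return not all(p.startswith("+") for p in patterns)


print(is_included("docs/a.txt", ["+docs/*"]))                 # True
print(is_included("src/a.c", ["+docs/*"]))                    # False
print(is_included("src/a.c", ["-docs/*"]))                    # True
print(is_included("docs/a.txt", ["-docs/a.txt", "+docs/*"]))  # False
```

The last call shows why order matters: the exclude pattern is listed first, so it wins even though the later include pattern also matches.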

Patterns ending with a “/” apply to directories only, and patterns not ending with a “/” apply to files only. Patterns ending with “*” or “?”, however, apply to both directories and files. When a directory is excluded, all files and subdirectories under it will also be excluded. Therefore, to include a subdirectory, all parent directories must be explicitly included.
For instance, the following pattern list doesn't do what is intended, because the foo directory is excluded and so foo/bar is never visited:

+foo/bar/*
-*

The correct way is to include foo as well:

+foo/bar/*
+foo/
-*
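
To make the difference between the two lists concrete, here is a small Python simulation of a directory walk. It is a sketch under stated assumptions, not Duplicacy's implementation: directories are compared with a trailing “/”, “*” may match an empty string, and the tree and function names are invented for the example.

```python
import re


def matches(body, path):
    # "*" -> any run of characters, "?" -> one character; both may match "/".
    regex = "".join(".*" if c == "*" else "." if c == "?" else re.escape(c)
                    for c in body)
    return re.fullmatch(regex, path, re.DOTALL) is not None


def is_included(path, patterns):
    for p in patterns:
        if matches(p[1:], path):
            return p[0] == "+"
    return not all(p.startswith("+") for p in patterns)


def backed_up(tree, patterns, prefix=""):
    """Walk a nested dict (dict = directory, None = file); list included paths."""
    result = []
    for name, child in sorted(tree.items()):
        if child is None:                       # a file
            path = prefix + name
            if is_included(path, patterns):
                result.append(path)
        else:                                   # a directory
            path = prefix + name + "/"
            if is_included(path, patterns):     # excluded directories are never entered
                result.append(path)
                result.extend(backed_up(child, patterns, path))
    return result


tree = {"foo": {"bar": {"a.txt": None}, "b.txt": None}, "c.txt": None}
print(backed_up(tree, ["+foo/bar/*", "-*"]))
# []
print(backed_up(tree, ["+foo/bar/*", "+foo/", "-*"]))
# ['foo/', 'foo/bar/', 'foo/bar/a.txt']
```

With the first list, foo/ only matches -* and is skipped, so nothing below it is ever examined. Adding +foo/ lets the walk descend far enough for +foo/bar/* to take effect.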

The following pattern list includes only files under the directory foo/ but not files under the subdirectory foo/bar:

-foo/bar/
+foo/*
-*

To include a directory while excluding all files under that directory, use these patterns:

+cache/
-cache/?*

2. Regular Expression Matching

An include pattern starts with “i:”, and an exclude pattern starts with “e:”. The part of the filter after the include/exclude prefix must be a valid regular expression. The regular expression syntax is the same general syntax used by Perl, Python, and other languages. Full details of the supported syntax and features are in the documentation of Go's regexp/syntax package, which Duplicacy's regular expression matching is built on.

When matching a path against a list of patterns, the path is compared with the part after “i:” or “e:”, one pattern at a time. Therefore, the order of the patterns is significant. If a match with an include pattern is found, the path is said to be included without further comparisons. If a match with an exclude pattern is found, the path is said to be excluded without further comparisons. If a match is not found, the path will be excluded if all patterns are include patterns, but included otherwise.
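
The same first-match-wins rule can be modelled for regular expression filters. The sketch below uses Python's re module as a stand-in for the Go engine that Duplicacy uses, so details of the syntax may differ; the function name and the sample paths are made up.

```python
import re


def is_included(path, filters):
    """First-match-wins over "i:<regex>" / "e:<regex>" filters."""
    for f in filters:
        kind, expression = f[:2], f[2:]
        # re.search succeeds if the expression matches anywhere in the path.
        if re.search(expression, path):
            return kind == "i:"
    # No match: excluded only if every filter is an include filter.
    return not all(f.startswith("i:") for f in filters)


filters = [r"i:\.sqlite$", r"e:\.sqlite-.*$", r"e:(?i)\.(bak|tmp)$"]

print(is_included("db/app.sqlite", filters))      # True
print(is_included("db/app.sqlite-wal", filters))  # False
print(is_included("notes/old.BAK", filters))      # False
print(is_included("notes/readme.txt", filters))   # True
```

The last path matches none of the filters; because the list contains exclude filters, the default is to include it.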

Some examples of regular expression filters are shown below:

# always include sqlite databases
i:\.sqlite$
# exclude sqlite temp files
e:\.sqlite-.*$
# exclude temporary file names
e:.*/?~.*$
# exclude common file types (case insensitive)
e:(?i)\.(bak|mp4|mkv|o|obj|old|tmp)$
# exclude lotus notes full text directories
e:\.ft/.*$
# exclude any cache files/directories with cache in the name (case insensitive)
e:(?i).*cache.*
# exclude lightroom previews
e:(?i).* Previews\.lrdata/.*$
# exclude Qt source
e:(?i)develop/qt[0-9]/.*$
# exclude any git stuff
e:\.git/.*$
# exclude cisco anyconnect log files: matches .cisco/log/* or .cisco/vpn/log/*, etc
e:\.cisco/.*/?log/.*$
# exclude trash bin stuff
e:\.Trash/.*$
# exclude old firefox stuff
e:Old Firefox Data/.*$
# exclude dirx stuff: excludes Documents/dir0/*, Documents/dir1/*, ...
e:Documents/dir[0-9]*/.*$
# exclude downloads
e:Downloads/.*$
# exclude duplicacy test stuff
e:DUPLICACY_TEST_ZONE/.*$
# exclude lotus notes stuff
e:Library/Application Support/IBM Notes Data/.*$
# exclude mobile backup stuff
e:Library/Application Support/MobileSync/Backup/.*$
# exclude movies
e:Movies/.*$
# exclude itunes stuff
e:Music/iTunes/iTunes Media/.*$
# include everything else
i:.*

# include Firefox profile but nothing else from Mozilla
i:(?i)/AppData/[^/]+/Mozilla/$
i:(?i)/AppData/[^/]+/Mozilla/Firefox/
e:(?i)/AppData/[^/]+/Mozilla/

Explanation of the rules above:

  • /[^/]+/ ensures that there is exactly one folder between AppData and Mozilla.
  • The rules must:
    • include the Mozilla folder itself, but nothing it contains (hence the $);
    • include the Firefox folder and everything it contains;
    • exclude everything else in the Mozilla folder that the rules above did not match.
  • Important: every folder on the way down to the one you want in full needs its own include rule ending in $. Such a rule includes the folder itself, so that it is visited, without including the rest of its contents. The folder you want in full then gets an include rule without $, followed by an exclude rule for everything else. The Google Chrome example below has two such intermediate folders (Google and Chrome).

# include Google Chrome profile but nothing else from Google
# note that we include the whole profile, because we are unsure how many "users" are added beside the "Default" profile
i:(?i)/AppData/[^/]+/Google/$
i:(?i)/AppData/[^/]+/Google/Chrome/$
i:(?i)/AppData/[^/]+/Google/Chrome/User Data/
e:(?i)/AppData/[^/]+/Google/
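
A quick way to check rules like these is to run a few sample paths through them. The Python sketch below does that for the Chrome rules; it assumes directories are compared with a trailing “/”, uses Python's re module in place of the Go engine, and the sample paths and the decide function are invented for the illustration.

```python
import re

RULES = [
    r"i:(?i)/AppData/[^/]+/Google/$",
    r"i:(?i)/AppData/[^/]+/Google/Chrome/$",
    r"i:(?i)/AppData/[^/]+/Google/Chrome/User Data/",
    r"e:(?i)/AppData/[^/]+/Google/",
]


def decide(path, rules=RULES):
    """Return "include" or "exclude" for `path`; the first matching rule wins."""
    for rule in rules:
        if re.search(rule[2:], path):
            return "include" if rule.startswith("i:") else "exclude"
    return "include"  # RULES contains an exclude rule, so unmatched paths are kept


for sample in [
    "Users/joe/AppData/Local/Google/",
    "Users/joe/AppData/Local/Google/Chrome/",
    "Users/joe/AppData/Local/Google/Chrome/User Data/Default/Bookmarks",
    "Users/joe/AppData/Local/Google/Chrome/Application/",
    "Users/joe/AppData/Local/Google/Drive/",
]:
    print(decide(sample), sample)
```

The first three paths are included (the two intermediate folders through their $ rules, the profile through the unanchored rule), and the last two fall through to the final exclude rule.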

As seen in the examples above, you may add comments to your filters file by putting a “#” as the first character of a line. The entire comment line is ignored, so comments can be used to document the meaning of your wildcard and regular expression filters. Completely blank lines are also ignored and may be used to make your filters list more readable. Note that a “#” anywhere other than at the beginning of a line is interpreted as part of the pattern, not as a comment.
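
The line-level rules above (comments, blank lines, and the four prefixes) can be summarised in a short Python sketch. The function name and the kind labels are made up, and raising an error for a line with no known prefix is this sketch's own choice, not documented Duplicacy behaviour.

```python
PREFIXES = {
    "+": "include-wildcard",
    "-": "exclude-wildcard",
    "i:": "include-regex",
    "e:": "exclude-regex",
}


def parse_filters(text):
    """Split filters-file text into (kind, body) pairs, in file order."""
    rules = []
    for line in text.splitlines():
        if line == "" or line.startswith("#"):
            continue  # blank lines and lines starting with "#" are ignored
        for prefix, kind in PREFIXES.items():
            if line.startswith(prefix):
                rules.append((kind, line[len(prefix):]))
                break
        else:
            # Sketch-only behaviour for lines that carry no known prefix.
            raise ValueError(f"unrecognised filter line: {line!r}")
    return rules


sample = "# keep databases\n\ni:\\.sqlite$\n+foo/\ne:cache#1\n-*\n"
for kind, body in parse_filters(sample):
    print(kind, body)
```

Note that the “#” in e:cache#1 stays part of the pattern, since it is not the first character of the line.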


#2

Do I understand correctly that with Wildcard Matching, the entire path must be specified? For example:

/home/joe/foo/bar the directory I want to exclude
-/bar/ wrong
-*/bar/ right

But in Regular Expression Matching, the expression only needs to match part of the path? For example:

/home/joe/foo/bar the directory I want to exclude
e:/bar/$ right
e:^.*/bar/$ also right

(also, is /$ at the end of the regular expression the correct way to exclude a directory?)


#3

A post was split to a new topic: Patterns for exclusion are confusing


#4

I did some testing, and I believe that the answer is yes: with Wildcard Matching, the entire path must be specified, but in Regular Expression Matching, the expression only needs to match part of the path. If you want a regular expression to only match the end of a path, put $ at the end of the regular expression (or /$ in the case of directories).


#5

This is correct, and mostly because the only regex matching function in Go, regexp.Match, behaves more like Search in other languages (I had been confused by this for a while). So if you need to match the entire path then the anchors ^ and $ are required.

And yes, /$ at the end matches directories only.
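
For anyone who wants to see the difference between search-style and whole-path matching, here is a small illustration. It uses Python's re.search as a stand-in for Go's regexp.Match (both succeed when the expression matches anywhere in the input); the path is made up and written repository-relative, with a trailing “/” assumed for directories.

```python
import re


def found(expression, path):
    """True if `expression` matches anywhere in `path` (search semantics)."""
    return re.search(expression, path) is not None


path = "home/joe/foo/bar/"

print(found(r"/bar/$", path))               # True: only the end is anchored
print(found(r"^.*/bar/$", path))            # True: anchored at both ends
print(found(r"^/bar/$", path))              # False: would have to be the whole path
print(found(r"/bar/$", path + "baz.txt"))   # False: "$" rejects anything after bar/
print(found(r"/bar/", path + "baz.txt"))    # True: without "$" deeper paths match too
```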