Filter pattern to include only one type of file in several subfolders

filters

#1

I’m taking a beating from an include / exclude pattern …

I have a repository with multiple subfolders, with various file types, and need to back up a specific file type (let’s say .txt) that is present in multiple subfolders.

If I try with this format:

i:(?i).*.txt*$
e:.*

nothing is backed up, because the folders are ignored by the pattern e:.*

If I try with this format only:

i:(?i).*.txt*$

also nothing is backed up, because the folder names (obviously) don’t match this “txt” pattern.

What would be the correct format in this case?


#2

Thanks for this little puzzle. I’m using it to practice my very low regex and duplicacy filter skills. So what follows is not authoritative until confirmed by someone else…

No, I don’t think the folders are ignored because of e:.* You can exclude the whole world at the end of your filters, as long as you’ve included everything you want before it.

So I think the problem is that you have not included those subfolders. Try something like

i:(?i).*\/foldername\/.*\/
i:(?i).*.txt*$
e:.*

Where “foldername” is the parent folder of those subfolders you’re targeting.

Edit:
No, I don’t think the above will quite work yet (unless your parent folder is the repository root folder). If it’s not, you also need to include the parent folder’s parent folder, that folder’s parent folder, and so on.


#3

Let me describe some more details.

These folders and files are generated automatically on a monthly basis:

repository/
├── 2018-06/
│   ├── file1.ex1
│   ├── file2.ex2
│   ├── file3.txt  <=
│   ├── file4.ex1
│   ├──...
├── 2018-07/
│   ├── file5.ex1
│   ├── file6.ex2
│   ├── file7.txt  <=
│   ├── file8.ex1
│   ├──...
...
├── 2019-06/
│   ├── file9.ex1
│   ├──...

(hundreds of files are generated)

I just need to back up the txt files.

I also thought that excluding everything at the end would be fine, but:

...
INFO SNAPSHOT_FILTER Loaded 2 include/exclude pattern(s)
TRACE SNAPSHOT_PATTERN Pattern: i:(?i).*.txt*$
TRACE SNAPSHOT_PATTERN Pattern: e:.*
...
DEBUG PATTERN_EXCLUDE 2018-06/ is excluded by pattern e:.*
DEBUG PATTERN_EXCLUDE 2018-07/ is excluded by pattern e:.*
DEBUG PATTERN_EXCLUDE 2018-08/ is excluded by pattern e:.*
...

And using only the inclusion parameter:

INFO SNAPSHOT_FILTER Loaded 1 include/exclude pattern(s)
TRACE SNAPSHOT_PATTERN Pattern: i:(?i).*.txt*$
...
DEBUG PATTERN_EXCLUDE 2018-06/ is excluded
DEBUG PATTERN_EXCLUDE 2018-07/ is excluded
DEBUG PATTERN_EXCLUDE 2018-08/ is excluded
...

I can’t include the folders by name, because they will continue to be generated, and I would have to update the filters file every month.

So, we can summarize it this way: “how to back up a specific file type, regardless of which subfolder it is located in?”


#4

Folks, you overcomplicate things :slight_smile:

Consider this:

+*/
+*.txt
-*

Example:

alexmbp:source alex$ find .
.
./.duplicacy
./.duplicacy/filters
./.duplicacy/preferences
./a.bin
./d2
./d2/b.bin
./d2/d3
./d2/d3/c.bin
./d2/d3/3.txt
./d2/2.txt
./1.txt

And here we go:

alexmbp:source alex$ duplicacy backup -dry-run
Storage set to /tmp/target/
No previous backup found
Indexing /tmp/source
Loaded 3 include/exclude pattern(s)
Packed 1.txt (0)
Packed d2/2.txt (0)
Packed d2/d3/3.txt (0)
Backup for /tmp/source at revision 1 completed

Hint: * matches everything, including the path separator. A pattern ending with / only matches directory names :slight_smile:


#5

Worked perfectly, thank you! :ok_hand:

An additional question: what is the logic behind the need for this line? -> +*/

Is this to force the inclusion of subfolders (due to the slash at the end)?


The curious thing is that every time I try to use the regex format :nauseated_face: I end up going back to the wildcards…


#6

Yes, this means “include all directories”. Otherwise only files in the root will be seen: if a subfolder is not matched for inclusion, its content won’t be parsed at all.
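
To make that traversal rule concrete, here is a minimal Python sketch of the behaviour described in this thread. It is a simplified model and not Duplicacy's actual implementation: the helper names (wildcard_to_regex, load_filters, is_included, walk) are made up for illustration, patterns are tried in order with the first match deciding, and a directory that is not included is never descended into. The fallback for paths that match no pattern is my reading of the Duplicacy filter documentation, so treat it as an assumption. The tree mirrors the find listing from #4, without the .duplicacy folder.

```python
import re

def wildcard_to_regex(pattern):
    # '*' matches any run of characters, including '/'.
    # '?' matches a single character other than '/'.
    parts = []
    for ch in pattern:
        if ch == "*":
            parts.append(".*")
        elif ch == "?":
            parts.append("[^/]")
        else:
            parts.append(re.escape(ch))
    return re.compile("".join(parts))

def load_filters(lines):
    # Each line is '+pattern' (include) or '-pattern' (exclude).
    return [(line[0] == "+", wildcard_to_regex(line[1:])) for line in lines]

def is_included(path, rules):
    # The first matching pattern decides.
    for include, regex in rules:
        if regex.fullmatch(path):
            return include
    # Assumption: with no match, a path is excluded only when
    # every pattern is an include pattern.
    return not all(include for include, _ in rules)

def walk(tree, rules, prefix=""):
    # tree maps a name to a dict (directory) or None (file).
    kept = []
    for name in sorted(tree):
        child = tree[name]
        if child is None:
            path = prefix + name
            if is_included(path, rules):
                kept.append(path)
        else:
            path = prefix + name + "/"  # directories end with '/'
            if is_included(path, rules):
                # Only included directories are descended into.
                kept.extend(walk(child, rules, path))
    return kept

TREE = {
    "a.bin": None,
    "1.txt": None,
    "d2": {"b.bin": None, "2.txt": None,
           "d3": {"c.bin": None, "3.txt": None}},
}

print(walk(TREE, load_filters(["+*/", "+*.txt", "-*"])))
# ['1.txt', 'd2/2.txt', 'd2/d3/3.txt']
print(walk(TREE, load_filters(["+*.txt", "-*"])))
# ['1.txt']
```

Without the +*/ line, d2/ falls through to -* and is skipped, so the .txt files below it are never even tested against +*.txt.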

You can do the same with regex, with something like i:.*\/$ for the first line, etc., but wildcards are more readable, and as long as they get the job done there is no need to involve more powerful but uglier tools :smiley:
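
For anyone who still wants the regex route, here is a small Python check of what such patterns would match. It is only a sketch: Duplicacy uses Go's regexp package rather than Python's re, although these simple patterns should behave the same in both. The "strict" pattern and the sample paths are my own assumptions and not something confirmed in this thread. It also shows that the pattern from the original question is looser than intended, because the dot before txt is unescaped and the trailing * applies to the last t.

```python
import re

# Pattern from the original question (unescaped dot, trailing 't*').
loose = re.compile(r"(?i).*.txt*$")
# Hypothetical stricter variant: a literal dot followed by exactly 'txt'.
strict = re.compile(r"(?i).*\.txt$")
# Regex counterpart of the wildcard +*/ (any path ending in '/').
dirs = re.compile(r".*/$")

# Made-up sample paths in the style of the tree from #3.
samples = [
    "2018-06/",
    "2018-06/file3.txt",
    "2018-06/FILE3.TXT",
    "2018-06/notes.atx",
    "2018-06/file3.txtt",
    "2018-06/file1.ex1",
]

def matches(regex, paths):
    # Unanchored search, so each pattern supplies its own anchors.
    return [p for p in paths if regex.search(p)]

print(matches(dirs, samples))    # only the directory entry
print(matches(loose, samples))   # also catches notes.atx and file3.txtt
print(matches(strict, samples))  # only the two real .txt files
```

Under these assumptions, a regex filters file equivalent to the wildcard one would list the directory pattern first, then the strict .txt pattern, then e:.* last.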


#7

In addition to saspus’ explanation you can check out this topic where I’m giving Gilbert a hard time explaining the logic behind it:


#8

Yes, I remember this topic. :wink: