Exclude/include Filters Question

Hello,

I'm new to Duplicacy and trying to figure out includes and excludes.
I am using Duplicacy on macOS and have set up a backup for the entire hard disk, but want to exclude certain partitions, directories and also some patterns. I read https://forum.duplicacy.com/t/filters-include-exclude-patterns/1089, but it is still not fully clear to me.

Can I mix regular expressions and wildcards in one exclude file?

My current exclude file looks something like this:

-SHARED/*
-.timemachine/*
-.DocumentRevisions-V100
-.MobileBackups
-.MobileBackups.trash
-.Spotlight-V100
-.TemporaryItems
-.Trash
-.Trashes
-.dbfseventsd
-.dropbox
-.dropbox.cache
-.fseventsd
-.hotfiles.btree
-.vol
-Backups.backupdb
-Cache
-Caches
-/Library/Metadata/CoreSpotlight
-DerivedData
-node_modules
-Logs
-*/iTunes/iTunes Media/Downloads
-*/iTunes/iTunes Media/Podcasts
-*/iTunes/Album Artwork
-*/iTunes/Previous iTunes Libraries
-*/Library/Application Support/CrashReporter
-*/.DS_Store
-*/.VolumeIcon.icns
-*/.fseventsd
-*/.vol
-*/.file

I am however not sure if I would not have to do e:\ for all the

-*/abc

Basically I want to exclude the SHARED Partition completely and then certain things like .DS_Store files, Trashes, iTunes, TimeMachine Things.

For example for the SHARED partition should it be

-SHARED/*

or

-/Volumes/SHARED

or something

-/Volumes/SHARED/*

Lastly is there an option to exclude directories if for example a hidden

.nobackup

File is present.

Thank you so much for your help.

Yes. Every path is evaluated against rules, and the first matching rule is used, the rest are ignored.

For what you need, regex would be overkill. You can do everything with the glob syntax.

Note that * matches everything, including the path separator. If a pattern ends with /, it matches directories; otherwise it matches files.

I would not recommend excluding .DS_Store.

The patterns are relative to the root of the repository. Where did you initialize Duplicacy? If you initialized it in /Users, then to exclude /Users/Shared you would write -Shared/.
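
To make the rules above concrete, here is a minimal Python sketch of first-match evaluation. It is a simplified model of the behaviour described in this post, not Duplicacy's actual matcher: it only handles +/- wildcard rules, it uses Python's fnmatch (whose * also crosses /), it writes directories with a trailing /, and it assumes that a path matching no rule is kept. The function name is_included and the sample rules are made up for illustration.

```python
from fnmatch import fnmatchcase


def is_included(path, rules):
    """Return True if `path` (relative to the repository root) would be kept.

    Directories are written with a trailing "/", files without.
    Rules are checked top to bottom and the first match decides.
    """
    for rule in rules:
        sign, pattern = rule[0], rule[1:]
        if fnmatchcase(path, pattern):
            return sign == "+"
    # Assumption of this sketch: nothing matched, so the path is kept.
    return True


rules = ["-.Trash/", "-Library/Caches/", "-*/node_modules/"]
print(is_included(".Trash/", rules))                 # False
print(is_included("Documents/report.pdf", rules))    # True
print(is_included("code/app/node_modules/", rules))  # False
```

Because the first match wins, order matters: putting "+a/keep.txt" before "-a/*" keeps that one file, while the reverse order excludes it.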

On macOS I highly recommend not bothering with Duplicacy exclusions. Duplicacy already honors Time Machine exclusions (by default if you use the web UI; with the CLI you need to enable it). You can therefore use tmutil to set exclusions, which has a number of benefits:

  • no need to maintain complex exclusions for every backup program: tmutil uses extended attribute to mark objects for exclusion; this means your metadata about what to exclude lives with the data
  • most [good] software vendors already mark what to exclude properly
  • if they don’t — you can mark it yourself with tmutil addexclusion <path>

Most of the stuff from your list would have been excluded automatically. You can verify if something is excluded with tmutil isexcluded <path>
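
If you want to check many paths at once, a small wrapper around tmutil isexcluded can help. The sketch below assumes the command prints one line per path in the form "[Excluded]    <path>" or "[Included]    <path>"; the helper names parse_isexcluded and is_excluded are made up for illustration, and is_excluded only works on macOS.

```python
import subprocess


def parse_isexcluded(line):
    """Parse one line of `tmutil isexcluded` output.

    Assumed format: "[Excluded]    /path" or "[Included]    /path".
    Returns a tuple (excluded, path).
    """
    status, _, path = line.strip().partition("]")
    return status.lstrip("[") == "Excluded", path.strip()


def is_excluded(path):
    """Ask Time Machine whether `path` is excluded (macOS only)."""
    out = subprocess.check_output(["tmutil", "isexcluded", path], text=True)
    return parse_isexcluded(out.splitlines()[0])[0]
```

On a Mac, is_excluded("/some/path") returns True when Time Machine reports that path as excluded.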

The exclude_by_attribute feature thread: Honoring com_apple_backup_excludeItem on MacOS - #27 by gchen

This was super helpful, thank you so much!!!

I take it this is the Option I was looking for at the end of my post?

nobackup-file —> Directories containing a file with this name will not be backed up

No, I believe this is a different one, where you can plop a special “no backup” file into a folder to skip that folder. I’m not sure; I’ve never used it myself.

The exclusion by attribute though is specific to macOS and works on both individual files and folders, and has a benefit of having most of the junk/caches/temps marked as such, courtesy of Time Machine api.

So would be this, correct?

I understand that you are talking about exclusions by tmutil marker. But this is the other option I was looking for: I have used it before, so my structure is already set up this way, with this file present in the directories.

Ah, yes, makes sense! Enable both, since you already have your directory structure infused with no backup files :slight_smile:

you can specify which file to use as a no-backup file:

Perfect, thank you so much!

I do have another question. I went to the .duplicacy-web/repositories/localhost/all folder and ran /Users/Antergosgeek/.duplicacy-web/bin/duplicacy_osx_x64_3.1.0 set -nobackup-file .nobackup

It replied new options for mystorage have been set.

When I look at the duplicacy.json it looks like this…

          {
                "index": 1,
                "id": "MBP-Intel",
                "path": "/Users",
                "storage": "mystorage",
                "global_options": "",
                "options": "-threads 4",
                "report_url_enabled": false,
                "report_url": "",
                "report_url_on_failure": false,
                "nobackup_file": "",
                "exclude_by_attribute": true,
                "filters": true,
                "job_log": "",
                "job_code": "",
                "job_note": "",
                "job_time": "0001-01-01T00:00:00Z"
            }

Should it not list something here:

"nobackup_file": "",

like

"nobackup_file": ".nobackup",

for example?

Any idea?

Don’t modify anything under .duplicacy-web/repositories/localhost/ – those preference files get created every time Web UI launches CLI to do the job. All your changes will be lost.

This just set the flag in the /Users/Antergosgeek/.duplicacy-web/repositories/localhost/all/.duplicacy/preferences that will be overwritten next time web ui launches the job.

If you want to make these changes – make them in duplicacy.json, then webui will incorporate the changes into those temp repositories.

Yep, it appears it needs to be set, default is an empty string, and does nothing: https://github.com/search?q=repo%3Agilbertchen%2Fduplicacy%20NobackupFile&type=code
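
If you would rather script the change than edit by hand, here is a minimal sketch. It walks the whole parsed JSON tree, so it makes no assumption about where the backup entries sit inside duplicacy.json; the "backups" key in the example is hypothetical, and the function name set_nobackup_file is made up for illustration. Keep a copy of the file before writing anything back.

```python
import json


def set_nobackup_file(node, backup_id, filename):
    """Set "nobackup_file" on every entry whose "id" equals backup_id.

    Recurses through nested dicts and lists and returns the number
    of entries changed.
    """
    changed = 0
    if isinstance(node, dict):
        if node.get("id") == backup_id and "nobackup_file" in node:
            node["nobackup_file"] = filename
            changed += 1
        for value in node.values():
            changed += set_nobackup_file(value, backup_id, filename)
    elif isinstance(node, list):
        for item in node:
            changed += set_nobackup_file(item, backup_id, filename)
    return changed


# In-memory example shaped like the entry shown above.
# The enclosing "backups" key is hypothetical.
config = json.loads('{"backups": [{"id": "MBP-Intel", "nobackup_file": ""}]}')
set_nobackup_file(config, "MBP-Intel", ".nobackup")
print(json.dumps(config, indent=4))
```

For the real file you would load it with json.load, call the function, and write the result back with json.dump.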

Got you! So will go to the duplicacy.json and set

Thank you!

Not sure that my excludes are working. I did a -dry-run and the log says this…

Running backup command from /Users/Antergosgeek/.duplicacy-web/repositories/localhost/1 to back up /Users/Antergosgeek
Options: [-log backup -storage mystorage -threads 1 -dry-run -stats]
2023-08-14 08:41:28.798 INFO REPOSITORY_SET Repository set to /Users/Antergosgeek
2023-08-14 08:41:28.799 INFO STORAGE_SET Storage set to mystorage
2023-08-14 08:41:33.614 INFO BACKUP_EXCLUDE Exclude files with no-backup attributes
2023-08-14 08:41:34.284 INFO BACKUP_START No previous backup found
2023-08-14 08:41:34.285 INFO BACKUP_LIST Listing all chunks
2023-08-14 08:41:34.808 INFO BACKUP_INDEXING Indexing /Users/Antergosgeek
2023-08-14 08:41:34.809 INFO SNAPSHOT_FILTER Parsing filter file /Users/Antergosgeek/.duplicacy-web/repositories/localhost/1/.duplicacy/filters
2023-08-14 08:41:34.809 INFO SNAPSHOT_FILTER Ignoring duplicate pattern: -/Library/Metadata/CoreSpotlight …
2023-08-14 08:41:34.809 INFO SNAPSHOT_FILTER Ignoring duplicate pattern: -*/.fseventsd …
2023-08-14 08:41:34.809 INFO SNAPSHOT_FILTER Ignoring duplicate pattern: -*/.vol …
2023-08-14 08:41:34.809 INFO SNAPSHOT_FILTER Loaded 59 include/exclude pattern(s)

2023-08-14 08:55:37.679 INFO UPLOAD_FILE Uploaded .CFUserTextEncoding (7)
2023-08-14 08:55:37.679 INFO UPLOAD_FILE Uploaded .DS_Store (67588)
2023-08-14 08:55:37.679 INFO UPLOAD_FILE Uploaded .autorestic.yml (1)
2023-08-14 08:55:37.679 INFO UPLOAD_FILE Uploaded .gitconfig (187)
2023-08-14 08:55:37.679 INFO UPLOAD_FILE Uploaded .p10k.zsh (91444)
2023-08-14 08:55:37.679 INFO UPLOAD_FILE Uploaded .zcompdump (49375)
2023-08-14 08:55:37.679 INFO UPLOAD_FILE Uploaded .zcompdump-MacBook Pro (Intel)-5.9 (50008)
2023-08-14 08:55:37.679 INFO UPLOAD_FILE Uploaded .zcompdump-MacBook Pro (Intel)-5.9.zwc (117872)
2023-08-14 08:55:37.679 INFO UPLOAD_FILE Uploaded .zprofile (199)
2023-08-14 08:55:37.679 INFO UPLOAD_FILE Uploaded .zsh_history (29428)
2023-08-14 08:55:37.679 INFO UPLOAD_FILE Uploaded .zshrc (4452)

2023-08-14 08:55:37.680 INFO UPLOAD_FILE Uploaded .Trash/07012023_Audi BKK.pdf (328383)
2023-08-14 08:55:37.680 INFO UPLOAD_FILE Uploaded .Trash/07012023_Audi BKK.pdf 12-19-37-860.pdf (389375)
2023-08-14 08:55:37.680 INFO UPLOAD_FILE Uploaded .Trash/07242023_ING.pdf (380194)
2023-08-14 08:55:37.680 INFO UPLOAD_FILE Uploaded .Trash/07242023_ING.pdf 12-22-13-126.pdf (563365)

Should it not exclude .Trash?
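
For a long dry-run log, a quick way to spot this kind of thing is to scan the UPLOAD_FILE lines for paths you expected to be skipped. A minimal sketch, assuming the log format shown above ("... INFO UPLOAD_FILE Uploaded <path> (<size>)"); the function name uploaded_under and the prefix list are made up for illustration.

```python
import re

# Matches lines such as:
# 2023-08-14 08:55:37.680 INFO UPLOAD_FILE Uploaded .Trash/file.pdf (328383)
UPLOAD_RE = re.compile(r"INFO UPLOAD_FILE Uploaded (.+) \((\d+)\)\s*$")


def uploaded_under(log_lines, prefixes):
    """Return uploaded paths that start with any of the given prefixes."""
    hits = []
    for line in log_lines:
        match = UPLOAD_RE.search(line)
        if match and match.group(1).startswith(tuple(prefixes)):
            hits.append(match.group(1))
    return hits
```

Calling uploaded_under(open("backup.log"), [".Trash/", "Library/Caches/"]) (the log file name is a placeholder) returns every uploaded path under those directories; an empty list means nothing slipped through.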

I don’t think it makes sense to start pattern with / – they are always relative to the repo directory.

Trash may be special – probably Time Machine excludes it by rule, rather than extended attribute.

See:

# Create new folder
% mkdir test

# By default it's included
% tmutil isexcluded test
[Included]    /System/Volumes/Data/Users/alex/Downloads/test

# and has no extended attributes
% xattr test

# Let's add it to the exclusions and check the attributes:
% tmutil addexclusion test
% tmutil isexcluded test
[Excluded]    /System/Volumes/Data/Users/alex/Downloads/test
% xattr test
com.apple.metadata:com_apple_backup_excludeItem


# Clear attributes, now it will appear as included again:

% xattr -c test
% tmutil isexcluded test
[Included]    /System/Volumes/Data/Users/alex/Downloads/test

# What about Trash: is it excluded? Yes:
% tmutil isexcluded ~/.Trash
[Excluded]    /System/Volumes/Data/Users/alex/.Trash

# Does it have an attribute? Nope:
% xattr ~/.Trash
%

There are a few files and folders like these, and you will need to add them to the exclusion list, or add the attribute manually (e.g. with tmutil addexclusion).
I wrote a script some time back to find all of them, and there are just a few:

~/Library/Logs                      <--- Logs and Caches, also standard locations.
~/Library/Metadata/CoreSpotlight    <--- That’s interesting!
~/Library/Caches                    <--- not unexpected
~/.Trash                            <--- lol

here is the script if you want to run it on your system:

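#!/usr/bin/env python3
# Lists directories under $HOME that Time Machine excludes by rule,
# i.e. without the com_apple_backup_excludeItem extended attribute.
# Needs the xattr module, which is not in the Python 3 standard library.
import os
import subprocess

import xattr

EXCLUDE_ATTR = "com.apple.metadata:com_apple_backup_excludeItem"
maxdepth = 5  # stop descending once the path has this many separators

for root, dirs, files in os.walk(os.environ["HOME"]):

    continue_traversing = True

    if EXCLUDE_ATTR in xattr.listxattr(root):
        # Excluded via the attribute, so nothing to report.
        continue_traversing = False
    else:
        # text=True returns str instead of bytes, so the "in" test works on Python 3.
        if "[Excluded]" in subprocess.check_output(["tmutil", "isexcluded", root], text=True):
            print(root)
            continue_traversing = False

    if continue_traversing and root.count(os.sep) >= maxdepth:
        continue_traversing = False

    if not continue_traversing and len(dirs):
        del dirs[:]  # prune in place so os.walk does not descend further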

Quite a bit of homework :joy:.

I will check it out and report back. Thank you!

Ah! I did not mean it to be homework :slight_smile: you can just add these four to the exclusion list

-Library/Logs/
-Library/Metadata/CoreSpotlight/
-Library/Caches/
-.Trash/

or run tmutil addexclusion on them to set the attribute.
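
If the script turns up more directories on your system, they need to be rewritten relative to the repository root before they go into the filters file. A minimal sketch of that conversion; the function name to_exclude_patterns and the example paths are made up for illustration, and it only produces plain directory excludes with a trailing /.

```python
import os


def to_exclude_patterns(repo_root, directories):
    """Turn absolute directory paths into exclude lines relative to repo_root."""
    patterns = []
    for directory in directories:
        rel = os.path.relpath(directory, repo_root)
        outside = rel == os.pardir or rel.startswith(os.pardir + os.sep)
        if rel == os.curdir or outside:
            continue  # skip the root itself and anything outside the repository
        patterns.append("-" + rel.replace(os.sep, "/") + "/")
    return patterns


found = ["/Users/alex/Library/Logs", "/Users/alex/.Trash"]
for line in to_exclude_patterns("/Users/alex", found):
    print(line)
```

For the two example paths this prints -Library/Logs/ and -.Trash/, matching the form of the list above.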