Honoring com_apple_backup_excludeItem on macOS

Hi Saspus,

That command is not working for me:

❯ duplicacy set -exclude-by-attribute=true
Incorrect Usage.

NAME:
   duplicacy set - Change the options for the default or specified storage

USAGE:
   duplicacy set [command options]

OPTIONS:
   -encrypt, e[=true]		encrypt the storage with a password
   -no-backup[=true]		backup to this storage is prohibited
   -no-restore[=true]		restore from this storage is prohibited
   -no-save-password[=true]	don't save password or access keys to keychain/keyring
   -nobackup-file <file name> 	Directories containing a file with this name will not be backed up
   -key  			add a key/password whose value is supplied by the -value option
   -value  			the value of the key/password
   -storage <storage name> 	use the specified storage instead of the default one

I installed Duplicacy via the WebGUI download. Do I perhaps need to separately update the command line version from GitHub?

In the Web Version > Setting > Command Line Version, it does in fact show Current Version as 2.7.2, which is the latest version on GitHub.

This is what I see:

% duplicacy set help
The set command takes no arguments.

NAME:
   duplicacy set - Change the options for the default or specified storage

USAGE:
   duplicacy set [command options]

OPTIONS:
   -encrypt, e[=true]           encrypt the storage with a password
   -no-backup[=true]            backup to this storage is prohibited
   -no-restore[=true]           restore from this storage is prohibited
   -no-save-password[=true]     don't save password or access keys to keychain/keyring
   -nobackup-file <file name>   Directories containing a file with this name will not be backed up
   -exclude-by-attribute[=true] Exclude files based on file attributes. (macOS only, com_apple_backup_excludeItem)
   -key                         add a key/password whose value is supplied by the -value option
   -value                       the value of the key/password
   -storage <storage name>      use the specified storage instead of the default one
   -filters <file path>         specify the path of the filters file containing include/exclude patterns

(Built from source last week. Maybe the change is not in 2.7.2?)

Okay I’m talking to myself at this point, but I didn’t want to bother you with chasing something down that I’ve figured out – I had previously installed Duplicacy via homebrew cask, and the command line version was actually older, even though the web version was saying something else. So I’m trying to get that sorted out.

As far as the actual issue in this post is concerned, I think I’m good.

Thanks again.
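
For anyone hitting the same stale-binary problem, here is a minimal sketch (not an official tool) for checking whether a given duplicacy binary knows about the flag. It relies only on the `set help` output shown in this thread; the function names are made up for illustration, and it assumes Python 3.7 or later.

```python
import subprocess

FLAG = "-exclude-by-attribute"


def help_mentions_flag(help_text, flag=FLAG):
    """Return True if any line of the help text starts with the flag."""
    for line in help_text.splitlines():
        if line.strip().startswith(flag):
            return True
    return False


def binary_supports_flag(binary="duplicacy"):
    """Run `<binary> set help` and look for the flag in its output.

    Raises FileNotFoundError if the binary cannot be found.
    """
    result = subprocess.run(
        [binary, "set", "help"],
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # the help may go to either stream
        text=True,
        check=False,  # the exit status is ignored; only the text matters
    )
    return help_mentions_flag(result.stdout)
```

Calling `binary_supports_flag()` checks whichever `duplicacy` comes first on your PATH; passing the full path of the binary under `~/.duplicacy-web/bin/` checks the one the web version downloaded. If the two disagree, you have the same mismatch described above.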

Confirmed this works (I had to separately install the command line client).

My initial backup time dropped from 4 days to 17 hours after setting this. To confirm, I unset it and it went back to 4 days.

Though for the life of me I can’t figure out what these gigantic files are that Time Machine is excluding and that I hadn’t already excluded.

This change is in 2.7.2:

$ ~/.duplicacy-web/bin/duplicacy_osx_x64_2.7.2 set help
The set command takes no arguments.

NAME:
   duplicacy set - Change the options for the default or specified storage

USAGE:
   duplicacy set [command options]  

OPTIONS:
   -encrypt, e[=true]		encrypt the storage with a password
   -no-backup[=true]		backup to this storage is prohibited
   -no-restore[=true]		restore from this storage is prohibited
   -no-save-password[=true]	don't save password or access keys to keychain/keyring
   -nobackup-file <file name> 	Directories containing a file with this name will not be backed up
   -exclude-by-attribute[=true]	Exclude files based on file attributes. (macOS only, com_apple_backup_excludeItem)
   -key  			add a key/password whose value is supplied by the -value option
   -value  			the value of the key/password
   -storage <storage name> 	use the specified storage instead of the default one
   -filters <file path> 	specify the path of the filters file containing include/exclude patterns

To see everything that will get excluded run this from the root of your Duplicacy repository:

find . -xattrname "com.apple.metadata:com_apple_backup_excludeItem" -exec ls -dp {} \;

Sorry if this should be obvious, but can this be set for the Web UI/scheduled jobs to use? I’m familiar with usage at the command line, but I just started exploring the Web version and am not sure how to hook it in. When just running the duplicacy set ... command, where does that get persisted?

Update: I just looked through the docs again, and after running the set command from the storage location, it looks like it gets saved in the storage preferences, .duplicacy/preferences (answering my own question). Still, posting this anyway in case others have the same question.
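
If you want to confirm that the setting was persisted for a CLI repository, something like the sketch below may help. It assumes that .duplicacy/preferences is a JSON array with one object per storage, each carrying a "name" and (when set) an "exclude_by_attribute" key; that layout is my assumption and is not spelled out in this thread, so check it against your own file. The function names are hypothetical.

```python
import json
from pathlib import Path


def exclude_by_attribute_settings(preferences_text):
    """Map each storage name to its exclude_by_attribute value (False if absent).

    Assumes the text is a JSON array of per-storage objects.
    """
    entries = json.loads(preferences_text)
    return {
        entry.get("name", "<unnamed>"): bool(entry.get("exclude_by_attribute", False))
        for entry in entries
    }


def read_repository_settings(repository_root="."):
    """Read .duplicacy/preferences under the given repository root."""
    path = Path(repository_root) / ".duplicacy" / "preferences"
    return exclude_by_attribute_settings(path.read_text())
```

Running `read_repository_settings()` from the repository root returns a dictionary keyed by storage name, so you can see at a glance which storages have the option turned on.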

You would run duplicacy set under ~/.duplicacy-web/repositories/localhost/<n>

Unfortunately this would not work as the web GUI will recreate the preferences file every time it runs a backup job (or other jobs). In the coming web GUI version this option will be enabled by default.

Hi folks,

This was a couple months ago but I want to come back to it because I don’t actually think it’s working.

I am able to run:

duplicacy set -exclude-by-attribute=true

without issue.

But when I run:

find . -xattrname "com.apple.metadata:com_apple_backup_excludeItem" -exec ls -dp {} \;

I see no output.

One thing I am confused about is what the actual root of my repository is. I am using the web version of Duplicacy. In ~/.duplicacy-web/repositories/localhost, I have: 0, all, and restore. I’m not sure which of 0 or all is my root repository. They both seem recently updated. Anyhow, I ran both of the above commands in both 0 and all, and in both cases I was able to set the setting but received nothing back from the find command.

What brought me back to this task is that every day when the backup runs, it runs for a long time again, far longer than anything I’ve changed locally would explain. So I think macOS system files are still being backed up.

This won’t matter because

From which directory do you run this?

This would be what you have configured in the web GUI. Often it’s the user’s home.

You can see what has been backed up in the log files.

The . near the start of that command refers to the current directory; wherever your shell is when you run the command. You could replace that . with whatever directory you want to search. You could, for example, use ~ to indicate your home directory or / to search your whole hard drive. You probably want to replace it with the root of your repository.
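
To make the role of the starting directory concrete, here is a small sketch that builds the same find command for any root you choose. Only the find invocation itself comes from this thread; the function name and the idea of wrapping it in Python are just for illustration.

```python
import os

ATTRIBUTE = "com.apple.metadata:com_apple_backup_excludeItem"


def find_excluded_command(root="."):
    """Build the argv for the find command from this thread, rooted at `root`.

    A leading ~ is expanded here because no shell is involved when the
    list is passed to subprocess.
    """
    return [
        "find", os.path.expanduser(root),
        "-xattrname", ATTRIBUTE,
        "-exec", "ls", "-dp", "{}", ";",
    ]
```

On macOS you could then run it with `subprocess.run(find_excluded_command("~"), check=False)` to search your home directory, or pass the directory you configured as the backup source in the web GUI.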

I’m confused by this thread. Could someone who understands explain how I apply the -exclude-by-attribute=true setting on MacOS with the web-ui?

You run it as a duplicacy command:
duplicacy set -exclude-by-attribute=true

This won’t work, as explained here:

Currently there is no way to set that flag when using web ui.

Ah right. I missed the reference to the web ui. Sorry about that.

This is from the 1.5.0 release thread:

This is done by setting the preference key exclude_by_attribute to true in new backups created in 1.5.0, which means existing backups will not skip these files by default. If you want to change this behavior for existing backups, manually modify the preference key exclude_by_attribute in ~/.duplicacy-web/duplicacy.json to true .
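
For existing backups, the manual edit could be scripted along these lines. This is only a sketch: the release note names the key and the file, but not where in the file the key lives, so the code walks the whole JSON document and flips only keys that already exist. It writes a .bak copy first. The function names are hypothetical, and it is probably safest to run it while the web GUI is stopped.

```python
import json
import shutil
from pathlib import Path

KEY = "exclude_by_attribute"


def enable_key(node, key=KEY):
    """Set every existing `key` in a nested JSON structure to True.

    Returns the number of values that were changed.
    """
    changed = 0
    if isinstance(node, dict):
        for name, value in node.items():
            if name == key:
                if value is not True:
                    node[name] = True
                    changed += 1
            else:
                changed += enable_key(value, key)
    elif isinstance(node, list):
        for item in node:
            changed += enable_key(item, key)
    return changed


def enable_in_file(path="~/.duplicacy-web/duplicacy.json"):
    """Back up the config file, then rewrite it with the key enabled."""
    config_path = Path(path).expanduser()
    config = json.loads(config_path.read_text())
    changed = enable_key(config)
    if changed:
        backup_path = config_path.with_name(config_path.name + ".bak")
        shutil.copy2(config_path, backup_path)
        config_path.write_text(json.dumps(config, indent=4))
    return changed
```

`enable_in_file()` returns how many entries it changed; 0 means the key was either already true everywhere or not present at all, in which case the file is left untouched. Note that rewriting the file with `json.dumps` may change its whitespace.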

Thanks a lot for the clarifications.

So since 1.5.0 is the first version I ever installed on my Mac, I take it that this is enabled by default in my case. So this means that Duplicacy will apply the same exclusions as Time Machine, without me doing anything, right?

PSA: Not all data that is excluded by Time Machine has that attribute set.

In other words, if you want to replicate Time Machine’s behavior concerning exclusions, you would need to add a few locations manually to the filters file.

Example folders that don’t have that attribute and yet are skipped by Time Machine:

~/Library/Logs                    <-- Logs, standard locations.
~/Library/Caches                  <-- not unexpected either
~/Library/Metadata/CoreSpotlight  <-- That’s interesting!
~/.Trash

Your duplicacy filters file therefore should have at least

-Library/Logs/
-Library/Caches/
-Library/Metadata/CoreSpotlight/
-.Trash/

or

-*/Library/Logs/
-*/Library/Caches/
-*/Library/Metadata/CoreSpotlight/
-*/.Trash/

depending on whether the repository root is at the user’s home or elsewhere, respectively.
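
If you would rather generate those lines than type them, a tiny sketch like this reproduces both variants from the same folder list. The folder list and the two prefixes are exactly the ones given above; the function name is made up.

```python
# Folders named above as skipped by Time Machine without carrying the attribute.
TIME_MACHINE_SKIPPED = [
    "Library/Logs",
    "Library/Caches",
    "Library/Metadata/CoreSpotlight",
    ".Trash",
]


def filter_lines(root_is_home):
    """Return exclude patterns for the filters file.

    With the repository root at the user's home the paths are used as-is;
    otherwise they are prefixed with */ as in the second list above.
    """
    prefix = "-" if root_is_home else "-*/"
    return [f"{prefix}{folder}/" for folder in TIME_MACHINE_SKIPPED]
```

`print("\n".join(filter_lines(root_is_home=True)))` gives the first list and `root_is_home=False` gives the second, ready to paste into the filters file.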

There are probably more instances of excluded stuff; those can be discovered with a script similar to the one below. (I did not run it on the whole system, only on my home folder, so you may want to tweak it accordingly.)

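#!/usr/bin/env python3
# Requires the third-party "xattr" package (pip install xattr) and macOS's tmutil.
import os
import subprocess

import xattr

ATTRIBUTE = "com.apple.metadata:com_apple_backup_excludeItem"
maxdepth = 5  # stop descending once the absolute path has this many separators

for root, dirs, files in os.walk(os.environ["HOME"]):

    continue_traversing = True

    if ATTRIBUTE in xattr.listxattr(root):
        # Already covered by -exclude-by-attribute; skip the whole subtree.
        continue_traversing = False
    else:
        # text=True makes check_output return str, so the substring test works on Python 3.
        output = subprocess.check_output(["tmutil", "isexcluded", root], text=True)
        if "Excluded" in output:
            print(root)
            continue_traversing = False

    if continue_traversing and root.count(os.sep) >= maxdepth:
        continue_traversing = False

    if not continue_traversing:
        del dirs[:]  # prune in place so os.walk does not descend further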

The output is a list of folders that are excluded by Time Machine but don’t have that attribute set.