Some newbie questions - backup organisation

Hello guys,

I am coming from a joyful painful love-hate relationship with Duplicati. Lots of speed- and time-related reasons to look for something else. At least for the huge (TByte || many files) backups.

My question is about backup organisation.

I want to back up music (a few folders), documents (folders spread all over the place), pictures (some folders)… So in Duplicati I created a backup set and defined which folders are part of it. Let’s say they all go to Google Drive and I want to have each new “backup set” in a distinct folder on GDrive, in order to be able to delete them on the Google side.

What would I do in Duplicacy? Create 3 storages, “music”, “documents” and “pictures”? For every folder I define one backup and assign it to the related storage? That would mean maaaany backup definitions. What is the concept of “ID” in the backups? Is it supposed to be a descriptive name?

It looks to me a bit like Duplicacy is meant to back up just a few top-level folders and do the filtering… well, by filtering or regex filtering (haven’t looked at it yet).

I am using the web UI as a start, later I will do cmd scripting (Windows here). So scripting plus the Windows built-in Task Scheduler (its cron equivalent) is the way to go here?

Thx guys! Best regards
Michael

I am really wondering, if I created backups with different IDs in the same storage, how can I physically remove the backup data from just one ID?

Welcome to the forum @Tortuosit!

Me too. In my case just hate, no love. :laughing:

First of all, take a look at this topic to see if it helps with some of your questions:

I read it. Need more reading and testing. Uff, difficult stuff.
A lot of it is about ONE repo having MULTIPLE IDs.
I am asking: Multiple repositories, ONE ID. “ID” as an organisational tag as mentioned above. ID “Music” may include 5 music repositories.
Bad idea?

And if there is just one storage - how do I get rid of files which belong to a specific ID (and obviously just to that ID)?

It cannot be done on filesystem level.

Nope. 3 repos: documents on John’s computer and documents and databases on Mary’s computer.

You will have to set one snapshot-id per repository. The same snapshot-id should not be used in different repositories.

I suggest you set the repository to a root folder that contains the 5 folders and select the 5 folders using filters.

This is the easy part and it can be done in different ways.

  1. You can use prune command with -id <snapshot id> option;
  2. You can directly access the storage and delete the desired/specific snapshot folder (which contains the files that reference that snapshot-id’s chunks) and then execute the prune command with -all option.
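
As a rough illustration of option 1, here is a minimal Python sketch that only assembles the command line. The snapshot ID `music` and the revision range `1-20` are made-up examples, and the helper name `build_prune_command` is hypothetical. It assumes the `-id`, `-r` and `-dry-run` options of the Duplicacy CLI `prune` command; check `duplicacy prune -help` for your version before relying on it.

```python
def build_prune_command(snapshot_id, revisions, dry_run=True):
    """Build the argument list for pruning revisions of one snapshot ID.

    snapshot_id and revisions are placeholders chosen by the caller,
    e.g. "music" and "1-20"; nothing here is specific to a real storage.
    """
    cmd = ["duplicacy", "prune", "-id", snapshot_id, "-r", revisions]
    if dry_run:
        # Ask Duplicacy to report what it would delete without deleting it.
        cmd.append("-dry-run")
    return cmd


# Print the command instead of running it, so it can be reviewed first.
print(" ".join(build_prune_command("music", "1-20")))
```

The printed command would be run from inside the repository folder (for example via `subprocess.run`). Starting with the dry run makes it easier to confirm that only the intended ID is affected.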

I was expecting this is the way to go in Duplicacy.

So does that mean that as soon as I include (i.e. whitelist) a subfolder via a filter, all other subfolders are no longer included?
Or do I always have to explicitly include/whitelist subfolders?

Looks like I need to read a bit more :smiley:

Let’s say your folder structure looks something like:

folder/
├── music1/
├── folder-foo/
│   ├── file-a.txt
│   ├── file-b.txt
│   └── file-c.txt
├── music2/
├── folder-bar/
...

The filters file of this repository could be:

+music1/*
+music2/*
-*

The last line will EXclude everything else.

Take a look:


Ah yeah, I also saw that the web UI is quite helpful. But that is an unexpected order! I’d expect it to read top down. I.e. without any knowledge, I’d use:

-* (Exclude all at first)
+include1/*
+include2/*

Your example:
+music1/*
+music2/*
-*

reads like: Include music1/music2, then exclude everything. I.e., do nothing.

And that’s the way it works!

This way you propose:

the first line would always match first, so the other lines would never be read.

I thought I now understood its way of thinking, but I was proven wrong, because it does nothing. I thought this was the right way of thinking:

Duplicacy goes through all files and folders. I have to think from each individual file/folder perspective. Say “C:\folder1\Iamafile.txt”. Now it looks into filter line 1. Say “+blah”. It does not match. Next filter line. “-*”. Match. File is excluded, no further processing. Next file.

Wrong way of thinking?

So now my situation is this:

M:\ M:\Folder1\ M:\Folder2\ M:\FolderFoo\

I want to include everything. The only thing I want to exclude is M:\FolderFoo. I added via the web UI:

-FolderFoo/ +*

Does nothing. I am disappointed. My whole thinking was wrong. Also I am too stupid to add line breaks here in code tags.

I hope my thinking was right. After failed attempts, I always deleted the repositories via the web UI, also deleted the storages, and there was a mess of filter files and other files left in .duplicacy-web\repositories\localhost…
That may have been the cause of my struggling.

I’ve cleaned that up, restarted httpd, now it is working with my expected filters. Will see at restore time, if it included the folders I want.

In restore: are backups only listed after they have completely finished? It says “No previous backups for this backup ID”.

See Scripts and utilities index.

I find myself making more use of NTFS links (symlinks) in order to include stuff that is not on the same file system level. I’m OK with this. The average user could not do this. A more flexible, organisation-based include system would be nice.

I think I was similarly confused. Did you see this topic:

It’s because @Tortuosit was reading like this:

While the correct logic / reading is:

I (Duplicacy :grinning:) am backing up a file whose path matches music1/*. OK! Stop reading the filtering pattern and back up.
I’m backing up a file whose path matches music2/*. OK! Stop reading the filtering pattern and back up.
I’m backing up a file whose path does not match either music1/* or music2/*. Ah! But there is a third filter that says “all” (*), and it says to exclude, so I won’t do anything. Next file!
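
The per-file reading above can be sketched in a few lines of Python. This is a simplified model, not Duplicacy's actual code: it uses Python's `fnmatchcase` as a stand-in for the wildcard matching, ignores regex patterns and the special handling of directories, and the function name `is_included` is made up for the illustration. The fallback for paths that match no pattern follows my reading of the Duplicacy documentation, so treat it as an assumption.

```python
from fnmatch import fnmatchcase


def is_included(path, patterns):
    """Decide a single path: the first matching pattern wins."""
    for pattern in patterns:
        sign, glob = pattern[0], pattern[1:]
        if fnmatchcase(path, glob):
            # Stop reading the filter list at the first match.
            return sign == "+"
    # No pattern matched (assumed rule): excluded if every pattern is an
    # include pattern, included otherwise.
    return any(p.startswith("-") for p in patterns)


filters = ["+music1/*", "+music2/*", "-*"]
print(is_included("music1/song.mp3", filters))        # True
print(is_included("folder-foo/file-a.txt", filters))  # False

# With "-*" on the first line, every path matches it immediately,
# so the include lines are never reached.
print(is_included("music1/song.mp3", ["-*", "+music1/*", "+music2/*"]))  # False
```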

And there is an important detail that I took a while to understand when I started using Duplicacy, but it is essential to consider when you are creating your filtering logic:

This is a good example:


Sorry, I hadn’t noticed this question before.

Use four spaces

https://www.markdownguide.org/basic-syntax/#code-blocks

or fenced code blocks

https://www.markdownguide.org/extended-syntax/#fenced-code-blocks


Thanks for the clarification. Yes, I have meanwhile understood that I need to think from the individual file's perspective, exactly as you wrote. The filtering order makes sense then. I think it is unusual, but easy as well.

I came from a result list based thinking. Like:

  • At first exclude/blacklist all -> So we have an empty list as a start
  • Then do whitelisting, i.e., add all matching files to the list
  • Then we have the finished list: Start backing up what is in that list
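
For contrast, the result-list way of thinking described in these three steps can be sketched as below. To be clear, this models the mental picture from the list above and is not how Duplicacy evaluates filters. The function name `result_list_model` and the example paths are invented for the illustration, and `fnmatchcase` stands in for the wildcard matching.

```python
from fnmatch import fnmatchcase


def result_list_model(all_paths, patterns):
    """Apply every pattern in order to one growing or shrinking result set."""
    selected = set()
    for pattern in patterns:
        sign, glob = pattern[0], pattern[1:]
        matches = {p for p in all_paths if fnmatchcase(p, glob)}
        if sign == "+":
            selected |= matches   # whitelist: add matches to the list
        else:
            selected -= matches   # blacklist: remove matches from the list
    return selected


paths = ["include1/a.txt", "include2/b.txt", "other/c.txt"]

# "Exclude all first, then whitelist" leaves the two included files.
print(sorted(result_list_model(paths, ["-*", "+include1/*", "+include2/*"])))

# In this model a trailing "-*" wipes the list, which is why the
# include-first order looked like "do nothing".
print(sorted(result_list_model(paths, ["+include1/*", "+include2/*", "-*"])))
```

Under this model the pattern order has the opposite effect from the first-match-wins reading, which explains the initial confusion about where `-*` belongs.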

Will read it, thx. Meanwhile I have understood the way filters work here.

Ok, just remember that you will have to place the include/whitelisting patterns before the “exclude/blacklist all” pattern you created in the previous step you mentioned.