Some newbie questions - backup organisation

Tortuosit · 27 August 2019 12:53

Hello guys,

I am coming from a joyful, painful love-hate relationship with Duplicati. Lots of speed- and time-related reasons to look for something else, at least for the huge (TByte || many files) backups.

My question is about backup organisation.

I want to back up music (a few folders), documents (folders spread all over the place), pictures (some folders)… So in Duplicati I created a backup set and defined which folders are part of it. Let’s say they all go to Google Drive and I want to have the new “backup sets” in distinct folders on GDrive, in order to be able to delete them on the Google side.

What would I do in Duplicacy? Create 3 storages, “music”, “documents” and “pictures”? For every folder I define one backup and assign it to the related storage? That would mean maaaany backup definitions. What is the concept of “ID” in the backups? Is it supposed to be a descriptive name?

It looks to me a bit like Duplicacy is meant to back up just a few top-level folders and do the selection… well, by filtering or regex filtering (haven’t looked at it yet).

I am using the web UI as a start, later I will do cmd scripting (Windows here). So scripting + windows inbuilt cron is the way to go here?

Thx guys! Best regards
Michael

Tortuosit · 27 August 2019 14:45

I am really wondering, if I created backups with different IDs in the same storage, how can I physically remove the backup data from just one ID?

towerbr · 27 August 2019 15:14

Welcome to the forum @Tortuosit!

Me too. In my case just hate, no love.

First of all, take a look at this topic to see if it helps with some of your questions:

Tortuosit · 27 August 2019 16:17

I read it. Need more reading and testing. Uff, difficult stuff.
A lot there is about ONE repo having MULTIPLE IDs.
I am asking: Multiple repositories, ONE ID. “ID” as an organisational tag as mentioned above. ID “Music” may include 5 music repositories.
Bad idea?

And if there is just one storage - how do I get rid of files which belong to a specific ID (and obviously just to that ID)?

It cannot be done on filesystem level.

towerbr · 27 August 2019 16:45

Nope. 3 repos: documents on John’s computer and documents and databases on Mary’s computer.

You will have to set one snapshot-id per repository. The same snapshot-id should not be used in different repositories.

I suggest you set the repository to a root folder that contains the 5 folders and select the 5 folders using filters.

This is the easy part and it can be done in different ways.

You can use the prune command with the -id <snapshot id> option;
Or you can directly access the storage and delete the specific snapshot folder (which contains the files that reference that snapshot-id’s chunks) and then execute the prune command with the -all option.

Tortuosit · 28 August 2019 11:54

I was expecting this is the way to go in Duplicacy.

So does that mean that, the moment I include (i.e. whitelist) a subfolder via a filter, all other subfolders are no longer included?
Or do I always have to explicitly include/whitelist subfolders?

Looks like I need to read a bit more.

towerbr · 28 August 2019 12:03

Let’s say your folder structure looks something like:

folder/
├── music1/
├── folder-foo/
│   ├── file-a.txt
│   ├── file-b.txt
│   └── file-c.txt
├── music2/
├── folder-bar/
...

The filters file of this repository could be:

+music1/*
+music2/*
-*

The last line will EXclude everything else.

Take a look:

Tortuosit · 28 August 2019 12:08

Ah yeah, also saw, that web-UI is quite helpful. But that is an unexpected order! I’d expect it to read top down. I.e. without any knowledge, I’d use:

-* (Exclude all at first)
+include1/*
+include2/*
…

Your example:
+music1/*
+music2/*
-*

reads like: Include music1/music2, then exclude everything. I.e., do nothing.

towerbr · 28 August 2019 12:30

And that’s the way it works!

This way you propose:

the first line (-*) would always match first and the other lines would never be read.

Tortuosit · 28 August 2019 12:49

I thought I now understood its way of thinking, but I was proven wrong, because it does nothing. I thought this was the right way of thinking:

Duplicacy goes through all files and folders. I have to think from each individual file/folder perspective. Say “C:\folder1\Iamafile.txt”. Now it looks into filter line 1. Say “+blah”. It does not match. Next filter line. “-*”. Match. File is excluded, no further processing. Next file.

Wrong way of thinking?

So now my situation is this:

M:\ M:\Folder1\ M:\Folder2\ M:\FolderFoo\

I want to include everything. The only thing I want to exclude is M:\FolderFoo. I added via the web UI:

-FolderFoo/ +*

Does nothing. I am disappointed. My whole thinking was wrong. Also I am too stupid to add line breaks here in code tags.

Tortuosit · 28 August 2019 13:02

I hope my thinking was right. After failed attempts, I always deleted repositories via the web UI, and also deleted storages, and there was a mess of filter files and other files left in .duplicacy-web\repositories\localhost…
That may have been the cause of my struggles.

I’ve cleaned that up, restarted httpd, now it is working with my expected filters. Will see at restore time, if it included the folders I want.

Tortuosit · 28 August 2019 13:05

In restore: are backups only listed after they have completely finished? It says “No previous backups for this backup ID”.

TheBestPessimist · 29 August 2019 11:45

See Scripts and utilities index.

Tortuosit · 1 September 2019 09:36

I find myself making more use of NTFS links (symlinks) in order to include stuff that is not on the same file system level. I’m OK with this, but the average user could not do this. A more flexible, organisation-based include system would be nice.

Christoph · 2 September 2019 22:18

I think I was similarly confused. Did you see this topic:

towerbr · 3 September 2019 12:06

It’s because @Tortuosit was reading like this:

While the correct logic / reading is:

I (Duplicacy) am backing up a file whose path matches music1/*. OK! Stop reading the filtering pattern and back up.
I’m backing up a file whose path matches music2/*. OK! Stop reading the filtering pattern and back up.
I’m backing up a file whose path does not match either music1/* or music2/*. Ah! But there is a third filter that says “all” (*), and it says to exclude, so I won’t do anything. Next file!
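
The reading above can be sketched in a few lines of Python. This is only a simplified model of the first-match logic, not Duplicacy's actual code: the function name `decide` and the sample paths are made up for illustration, and it relies on Python's `fnmatchcase`, on the assumption that `*` also matches the `/` separator.

```python
from fnmatch import fnmatchcase


def decide(path, patterns):
    """Return (included, deciding_pattern) using first-match-wins."""
    for pattern in patterns:
        sign, glob = pattern[0], pattern[1:]
        if fnmatchcase(path, glob):
            # The first matching pattern decides; later lines are never read.
            return sign == "+", pattern
    # Placeholder for "nothing matched". It is never reached below because
    # both pattern lists contain the catch-all "*". The real tool's default
    # may differ.
    return False, None


# Hypothetical sample paths, relative to the repository root.
paths = ["music1/song.mp3", "music2/album/track.flac", "folder-foo/file-a.txt"]

includes_first = ["+music1/*", "+music2/*", "-*"]
exclude_first = ["-*", "+music1/*", "+music2/*"]

for p in paths:
    print(p, decide(p, includes_first), decide(p, exclude_first))
```

With the includes first, each music file is decided by its own `+` line and only the remaining file falls through to `-*`. With `-*` on top, every path is decided by that first line.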

And there is an important detail that I took a while to understand when I started using Duplicacy, but it is essential to consider when you are creating your filtering logic:

This is a good example:

Improving the instructions for include/exclude patterns (filters)

Assuming foo/bar/file.txt is the only file you want to back up, the correct patterns are:
+foo/bar/file.txt
+foo/bar/
+foo/
-*
Duplicacy lists the root of the repository. The foo/ folder matches the third pattern, while all others will match the fourth one and thus get excluded. In the next step Duplicacy lists the foo/ folder and only foo/bar/ will be included. Finally, Duplicacy lists foo/bar/ and finds foo/bar/file.txt, which is the only file with a matching include pattern, while all others will be excluded by -*. Overall, Duplicacy only needs to list 3 directories to locate the only file to be included, without iterating through the entire directory tree.
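
The walk described in that quote can be simulated roughly as follows. This is a sketch under stated assumptions, not Duplicacy's implementation: the in-memory `TREE`, the extra files in it, and the helpers `included` and `walk` are all hypothetical, directories are written with a trailing `/`, and matching again uses Python's `fnmatchcase`.

```python
from fnmatch import fnmatchcase

PATTERNS = ["+foo/bar/file.txt", "+foo/bar/", "+foo/", "-*"]

# Hypothetical repository: a dict is a directory, None is a file.
TREE = {
    "foo": {
        "bar": {"file.txt": None, "other.txt": None},
        "baz": {"deep": {"x.txt": None}},
        "notes.txt": None,
    },
    "music": {"a.mp3": None},
    "readme.txt": None,
}


def included(path, patterns):
    """First matching pattern decides whether the path is included."""
    for pattern in patterns:
        if fnmatchcase(path, pattern[1:]):
            return pattern[0] == "+"
    return False  # placeholder; not reached when the list ends with "-*"


def walk(tree, patterns, prefix="", listed=None, files=None):
    """List a directory, descending only into included subdirectories."""
    listed = [] if listed is None else listed
    files = [] if files is None else files
    listed.append(prefix or "(root)")
    for name in sorted(tree):
        child = tree[name]
        is_dir = isinstance(child, dict)
        path = prefix + name + ("/" if is_dir else "")
        if not included(path, patterns):
            continue  # an excluded directory is never listed at all
        if is_dir:
            walk(child, patterns, path, listed, files)
        else:
            files.append(path)
    return listed, files


listed, files = walk(TREE, PATTERNS)
print(listed)  # ['(root)', 'foo/', 'foo/bar/']
print(files)   # ['foo/bar/file.txt']

# Without the parent-directory patterns the file is never reached.
print(walk(TREE, ["+foo/bar/file.txt", "-*"]))  # (['(root)'], [])
```

The last line shows why the parent folders need their own include patterns: once foo/ is excluded at the root, nothing beneath it is ever looked at.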

towerbr · 3 September 2019 12:18

Sorry, I hadn’t noticed this question before.

Use four spaces

https://www.markdownguide.org/basic-syntax/#code-blocks

or fenced code blocks

https://www.markdownguide.org/extended-syntax/#fenced-code-blocks

Tortuosit · 3 September 2019 15:29

Thanks for the clarification. Yes, I have meanwhile understood that I need to think from the individual file’s perspective, exactly as you wrote. The filtering order makes sense then. I think it is unusual, but easy as well.

I came from result-list-based thinking. Like:

At first exclude/blacklist all -> So we have an empty list as a start
Then do whitelisting, i.e., add all matching files to the list
Then we have the finished list: Start backing up what is in that list
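
For comparison, here is a minimal sketch of that result-list model. It is purely hypothetical and is not how Duplicacy evaluates filters; the function name `list_based_selection` and the sample paths are invented for illustration.

```python
from fnmatch import fnmatchcase


def list_based_selection(paths, patterns):
    """Apply every pattern in order to a growing/shrinking result list."""
    selected = set()
    for pattern in patterns:
        sign, glob = pattern[0], pattern[1:]
        matching = {p for p in paths if fnmatchcase(p, glob)}
        if sign == "+":
            selected |= matching  # whitelist: add to the list
        else:
            selected -= matching  # blacklist: remove from the list
    return sorted(selected)


# Hypothetical sample paths.
paths = ["music1/song.mp3", "music2/track.flac", "folder-foo/file-a.txt"]

print(list_based_selection(paths, ["-*", "+music1/*", "+music2/*"]))
# ['music1/song.mp3', 'music2/track.flac']
print(list_based_selection(paths, ["+music1/*", "+music2/*", "-*"]))
# []
```

Under this model the order `-*` first works and the order with `-*` last selects nothing, which is exactly the opposite of the per-file, first-match reading described earlier in the thread.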

Tortuosit · 3 September 2019 15:32

Will read it, thx. Meanwhile I have understood the way filters work here.

towerbr · 4 September 2019 22:22

Ok, just remember that you will have to place the include/whitelisting patterns before the “exclude/blacklist all” pattern you created in the previous step you mentioned.