Q about Duplicacy best practices

Hi,

I’m a new user of duplicacy-web (running it in a Docker container on my NAS) and I have a few questions to make sure I’m getting off on the right foot:

  1. How should I group my storage (with a given cloud provider)? Say I use Google Drive. Should I back up everything (even from multiple devices) into the same storage on Google Drive? When would it make sense to have separate storages, say in different directories of Google Drive?

  2. Is it possible to rename the backup id (perhaps with the CLI version)? Say I back up my Lenovo laptop and use the backup id “Lenovo laptop”. Then I switch to a new laptop, say an HP, transfer all my files, and want to rename the backup id to “HP laptop”, since I’m now backing up from the new laptop. Is that possible? Or would I just create a new backup id, back up into the same storage, and let deduplication take care of the rest?

  3. If I only want to back up certain subdirectories of my file tree, do I need to (or should I) use separate backup ids, or can/should I use one backup id with filters?
    Example: Let’s take the (partial) tree:

/backuproot
--Media
----NAS
------Backup
--------My Books
--------My Music
----NASCrypt
------Backup
--------Personal
------Temp

If I want to back up “Personal” (and its subfolders) and “Media” (and its subfolders), should/can I use separate backup ids?

Thanks in advance for any help.

It depends on whether there is cross-device deduplication. That is, if there are many identical files/directories on different devices, they should go into the same storage. Otherwise, just use separate storages.
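For example (just a sketch; the ids and the gcd:// paths below are placeholders):

```
# one shared storage for every device (each device runs init with its own id)
# -- maximizes cross-device deduplication:
duplicacy init nas-docs gcd://Duplicacy/shared       # on the NAS
duplicacy init laptop-docs gcd://Duplicacy/shared    # on the laptop

# separate storages in different Google Drive directories
# -- fully independent, but no deduplication between them:
duplicacy init nas-docs gcd://Duplicacy/nas          # on the NAS
duplicacy init laptop-docs gcd://Duplicacy/laptop    # on the laptop
```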

Yes, you can use a new backup id. The new backup will start from revision 1, but it should be fairly fast because most chunks are already in the storage.
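Roughly like this on the new machine (a sketch; “hp-laptop” and the storage URL are placeholders):

```
# in the repository root on the HP laptop, point a new backup id
# at the existing storage
duplicacy init hp-laptop gcd://Duplicacy/shared

# the first run is revision 1 for this id, but unchanged files
# deduplicate against chunks the old "Lenovo laptop" id already uploaded
duplicacy backup -stats
```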

Either way should work.
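If you go with a single backup id, the patterns go into the filters file under the repository’s .duplicacy directory. A rough sketch for your tree, assuming you also want to skip Temp (the first matching pattern wins, parent directories need to be included for their children to be reachable, and it’s worth double-checking the wildcard rules in the include/exclude pattern documentation):

```
# paths are relative to the repository root (/backuproot)
-Media/NASCrypt/Temp/
+Media/
+Media/*
# everything not matched above is skipped
-*
```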


Hi @gchen,

Thanks so much for the response. That helps me quite a bit. I hope you don’t mind another question re. improving deduplication:

In this thread about system design and performance issues, you mention that it might benefit the deduplication rate if files larger than the average chunk size went into their own chunks.
Has this feature ever been implemented?

Also, can I still expect improved deduplication if I create my storage (with the CLI tool) using a 1M chunk size instead of the 4M default used by the web UI?
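For reference, I was planning to create that storage with something like this (just a sketch, assuming I’m reading the init options correctly; the id and gcd:// path are placeholders):

```
# encrypted storage with a 1M average chunk size
# (min/max chunk sizes default to 1/4 and 4x the average unless overridden)
duplicacy init -e -c 1M nas-docs gcd://Duplicacy/nas
```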

An advantage of using different backup ids is that you get more granular control over when each backup runs; some folders need more frequent backups than others. I use it like this (several ids), but I think it’s basically a matter of personal taste and how your files are organized.

You’ll find some interesting information here:

Test #9: Test of wide range chunk setup
