Duplicacy backup is slow when using filters

When using this filter:

-.Recycle.Bin/*
-Plex-Media-Server/Library/Application Support/Plex Media Server/Media/localhost/*.bif
-kiwix-serve/*
-Plex-tmp/*
-Plex-var-tmp/*

Duplicacy will spend 4 minutes at this stage:

Loaded 5 include/exclude pattern(s)

Is that normal? It’s kind of weird because the backup itself only takes a couple of seconds. Kind of frustrating that the filtering stage takes 90% of the total amount of time.

I am not entirely sure if I wrote the filters correctly, though. :thinking: What I want is just to exclude the top-level folders .Recycle.Bin, kiwix-serve, Plex-tmp and Plex-var-tmp. The only exception is the Plex-Media-Server/Library/Application Support/Plex Media Server/Media/localhost/ folder, where I want to exclude all *.bif files recursively.

The command I am using is duplicacy -log -verbose -stack backup -stats -threads 12

I don’t think rules have anything to do with the slowdown; it’s filesystem traversal itself.

Questions:

  1. What happens if you run it twice in a row, without changing anything? Is the second time faster?
  2. How much free RAM do you have on the server?
  3. What is the OS?
  4. What is the filesystem?
  5. How is it mounted?

That said, your filters are inefficient and can be improved.

  1. Use a trailing / for directory exclusion; the /* suffix is unnecessary and slows down matching.
  2. Recursion is automatic with *, since it also matches /.

So, use this:

-.Recycle.Bin/
-kiwix-serve/
-Plex-tmp/
-Plex-var-tmp/
-Plex-Media-Server/Library/Application Support/Plex Media Server/Media/localhost/*.bif

The next step would be getting a spindump or flamegraph (or whatever analogue Go or your target OS provides) and actually seeing what it is spending time on during those four minutes.
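For example, on Linux a rough picture can be captured with perf while the backup sits in that stage (just a sketch; it assumes perf is available and that the process is literally named duplicacy):

# sample the newest duplicacy process for ~30 seconds, recording call stacks (-n picks the newest match)
perf record -g -p $(pgrep -x -n duplicacy) -- sleep 30
# then inspect where the time went
perf report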

Thanks for assisting again @saspus

Same thing, even if I omit the * in the filters as you suggested.

A lot — 107 GB.

Unraid 7.1.4

I am backing up from a ZFS NVMe to a ZFS NVMe on the same server.

I have no clue. I am not that Linux savvy, but this is what ChatGPT says:

"In Unraid, the appdata share is not mounted like a normal partition. Instead, it is provided through Unraid’s User Share system, which uses the shfs FUSE filesystem."

Oh, Unraid, of course…

I don’t have much experience with Unraid, but I think your robot friend is right – all user mounts on Unraid go via FUSE, and there is no way around it.

You can confirm by looking at the output of the mount command – see if there is any mention of fuse on the user mounts. You would likely see /mnt/user mounted as such – and that’s the root of your problem: it makes walking the filesystem very slow.
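For example (the paths assume the standard Unraid layout; adjust to your setup):

# list any FUSE mounts
mount | grep -i fuse
# show which filesystem actually backs the appdata path
findmnt -T /mnt/user/appdata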

To confirm, cd to the root folder of your duplicacy repository (if duplicacy runs in a container, open a shell into the container and do it there) and run

time find . -type f > /dev/null

This measures how long it takes to enumerate all files starting from the current directory (throwing away the results) and then prints the elapsed time. If you see the same 4 minutes, duplicacy cannot do anything about it.

I don’t think there is a way around this on Unraid, other than moving Plex’s app data folder outside of the user mount to a location that can be mounted directly, thus avoiding FUSE. Or maybe it’s already accessible directly from another mount point? Or can it be mounted directly somewhere outside of /mnt/user?

If find . -type f is fast but duplicacy is slow, that would be very unexpected. We could then collect an strace from the duplicacy process during that time and look at the system calls to potentially uncover another bottleneck.
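Something along these lines would do it (again assuming the process is named duplicacy; adjust the pgrep pattern otherwise):

# attach to the running backup and tally time spent in each system call
strace -c -f -p $(pgrep -x -n duplicacy)
# press Ctrl-C once the slow stage is over to print the summary table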

What I’m sure it is not is duplicacy’s regex/matching engine. It’s very fast; you should not notice any performance impact from enabling filters.

Thanks!

I have learned something new today :nerd_face::blush:

Seems like you were right. Unraid might be the problem.

If I switch out /mnt/user/appdata for /mnt/cache/appdata/ (thereby bypassing the FUSE layer and the shfs overhead), the total backup time drops from 6 minutes to 1 minute and 10 seconds. :partying_face:
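The same find test from above makes the difference easy to see (these are the paths in my setup; yours may differ):

time find /mnt/user/appdata -type f > /dev/null     # through the shfs FUSE layer
time find /mnt/cache/appdata -type f > /dev/null    # directly on the cache pool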

This is good news and bad news.

The good news is that the appdata folder is almost always stored on a single disk, so this solution is effective and doesn’t have any considerable drawback.

The bad news is that other shares, such as, say, a folder called TV SHOWS, can be split across multiple drives, in which case this solution is not really possible. Well, technically it is, but it would make the backup directories confusing. If the user at some point switches out the drives (the main selling point of Unraid), the backup repository will have to be updated accordingly.

Seems like the best way to go is to use the FUSE layer for most backups on Unraid to avoid complications down the road, but since the appdata folder specifically stays on the same drive, we can make an exception for it.

5 posts were split to a new topic: Unraid architecture vs storage landscape today