Duplicacy copy vs rclone sync

Hi!

I just need a sanity check here.
I’m backing up my devices to a Synology NAS over SFTP. Each device runs scheduled scripts to do the backup.
For offsite backup I’m using Wasabi, and I normally just run duplicacy copy for that, also from scheduled scripts on the client device. But recently I realized this is not optimal: I’m essentially downloading the chunks from the Synology to the client device only to upload them to Wasabi, so I started looking at improvements.
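
For context, the scheduled script on each client currently does roughly the following (the storage names are placeholders for whatever I configured), which is why the chunks take a round trip through the client:

  # nightly job on the client: back up to the NAS, then copy the NAS storage to Wasabi
  cd ~/repo
  duplicacy backup -storage nas -threads 4
  duplicacy copy -from nas -to wasabi -threads 4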

So I’m going to schedule something on the Synology itself. What should I do: run the same duplicacy copy, or use rclone sync? The Wasabi storage has been created as “bit-identical”.
I’m inclined to use rclone as it seems more straightforward to set up. What do I lose in functionality, apart from the ability to copy only specific revisions?
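
For reference, the selective per-revision copy I’d be giving up with rclone looks roughly like this (the snapshot id and revision number are made up):

  # copy only revision 123 of the "laptop" snapshot id to the Wasabi storage
  duplicacy copy -from nas -to wasabi -id laptop -r 123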

I think you are correct on both counts: you can use rclone, and you won’t be able to sync selectively. But if you are replicating the whole datastore, sure. It will likely be more efficient too. After all, this is the whole point of the -bit-identical flag:

  -bit-identical                       (when using -copy) make the new storage bit-identical to also allow rsync etc.
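
A minimal sketch of what that could look like directly on the Synology, assuming an S3-compatible rclone remote named wasabi and the storage directory living under /volume1/backups/storage (both names are made up):

  rclone config                                                   # one-time: create the "wasabi" remote
  rclone sync /volume1/backups/storage wasabi:my-bucket/storage --transfers 8
  # note: sync mirrors the directory, so deletions from prune propagate too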

Everything mentioned above is correct. There is, however, an additional point: when you use duplicacy copy, you are actually decrypting chunks (from the NAS) and re-encrypting them (to upload to Wasabi). It ends up being an indirect “check”. And it’s extremely fast. I personally use duplicacy copy, even on my buckets that are bit-identical.

This also means that unlike rclone, duplicacy copy will never benefit from server-side copy even if conditions are met.

But chunks are by definition unique, and their location in the hierarchy is fixed and defined by the content. What conditions do you have in mind?

This could actually be another reason not to use duplicacy copy in this scenario:

  • checking snapshot consistency is cheap: metadata is cached in memory (see also the last bullet)
  • checking chunk consistency is unnecessary: each file is checksummed by btrfs when it is read. And since btrfs is the only reason to tolerate Synology’s otherwise horrible software, I assume that’s the filesystem in use (see the scrub sketch after this list).
  • running duplicacy, on the other hand, probably requires a nontrivial amount of RAM, and probably more than rsync/rclone. The most important job of RAM on a NAS is to hold the filesystem cache; anything else that uses RAM does so by evicting part of that cache, which has a direct and measurable effect on performance and responsiveness. So not only does duplicacy copy validate integrity twice (once with btrfs, a second time with duplicacy), it also causes caches to get evicted and NAS performance to degrade.
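
If you do want an explicit periodic integrity pass on the NAS side, a scheduled btrfs scrub covers the chunk files without involving duplicacy at all (the volume path is illustrative, and it needs root):

  btrfs scrub start -B /volume1   # -B: run in the foreground so the task's exit status is meaningful
  btrfs scrub status /volume1     # or check progress/results later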

Not that any of that has any significant effect, but since we’re penny pinching let’s analyze the big picture.

If you’re copying between different storages on the same provider (e.g. Google Drive to Google Drive), rclone can bypass downloading completely and execute server-side copy. Not all backends support that, but some do, in which case it is much more efficient.
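
To illustrate the difference (remote names are made up):

  # same remote on both sides: rclone asks the backend to copy, nothing is downloaded
  rclone copy gdrive:backups/storage gdrive:mirror/storage
  # different backends (the local-to-Wasabi case here): data flows through the machine running rclone
  rclone sync /volume1/backups/storage wasabi:my-bucket/storage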

Ah yes, of course, but this is not part of the OP’s use case (local-to-cloud sync): if duplicacy moved the chunks around in the repository this could have helped, but since it does not, (optimized) server-side copy will never occur.

That’s another good point. My storage pool is using btrfs, and my Synology is pretty old, so it doesn’t have a lot of resources: 1 GB of RAM and a CPU without hardware encryption support.
