I also want to chime in on the praise.
TL;DR: Like Crashplan, but better in every regard. Faster. Less resource intensive. More portable. Gives the user better control over their data. And it has a CLI!
I’ve been a Crashplan customer for years, initially on the Home edition and then forced over to the Small Business plan when Home was sunset. Ever since Crashplan gave the heads-up that Home was going away, I’ve been reading up on and evaluating a lot of alternatives.
Like others in this thread, I tried Duplicati as well, after reading through its design docs. It looked fine on paper. Only after putting the software through its paces did a number of major flaws surface.
- Horrible restore design, leading to extremely slow restores during a disaster (restoring to a new machine). The problem is that the software relies on having local index files (gigantic sqlite files) available on the machine. In case of a disk failure, you need to download the entire friggin backup and store it somewhere locally, then use a separate, out-of-band program to re-create those index databases. That process would take days to weeks with a 1TB backup set. Only then can you start restoring data. Weeks of downtime for a rather small data set isn’t really viable.
- Very slow backup jobs. Each delta backup run of my 1TB data set took anywhere from 30 minutes to several hours, and the majority of that time was spent on very slow file system scanning.
- Very slow file browsing. Opening the backup’s root folders on the program’s restore tab took 10 minutes. Expanding any node (directory) took another 10-15 minutes [1]. This meant that restoring a file five directories down would take an hour just to navigate to it, provided I knew exactly where it resided (which is never the case). Completely useless for data sets larger than a demo set.
- Bugs, bugs and more bugs. I ran into a lot of critical ones, such as VSS failing completely and backup jobs failing without anything helpful in the logs, leaving me to dig through the source code on GitHub to try to figure out what the heck was going on.
- Slow bug fixes. GitHub issues I ran into lingered for six months before being picked up.
- .NET. This is a subjective issue, since I’m not a Microsoft developer. As the project relies on volunteer work, it took me a lot of effort to fix some of the bugs locally. Personally I’d have preferred a language more common in the Unix camp (C/C++ or Go), since that would have made it easier to contribute.
All in all, Duplicati isn’t even close to being fit for purpose.
After exploring other options, including various rsync-based ones, I felt so depressed about the state of the backup space that I just decided to pony up for a Crashplan business license. Crashplan seems like the Atlassian of backup services: no one is very happy with it, but it does provide the feature set people are looking for, at an affordable price.
By chance I recently stumbled across a Reddit discussion where Duplicacy was mentioned and, with very low expectations, decided to quickly try it out. The experience with Duplicati meant I knew exactly what to look for with regard to reliability, security and efficiency, which made the evaluation quick. I was super impressed with the CLI. The Web UI, though quite confusing, worked out of the box with my candidate storage providers (S3 and Wasabi). After reading through the excellent guide on the developer’s website, I mostly understood how to navigate the Web UI, and Duplicacy has now become my main backup agent.
What I really appreciate about Duplicacy is the following:
- It’s ridiculously fast at both full and delta backups. I backed up my current 500GB data set, consisting of a million files or so, in just a couple of minutes [2]. Delta backups are even faster.
- Very fast directory scanning. The bottleneck is entirely my disks, so I don’t see how the developer can improve much on this.
- Restoring is a snappy process. The Web UI directory browser doesn’t have any of the design faults that Duplicati has, and allows me to interactively navigate through my directory tree as I would expect in a file browser.
- Restoring through the CLI worked perfectly, and it’s the most likely route during disaster recovery [3].
- The developer is very active and responsive. I think the commercial model allows him to treat the project seriously and to treat users of the software as customers, rather than with the “hey, if it doesn’t work, pull requests are welcome” mentality that most spare-time OSS hobby projects suffer from. A backup solution has to safeguard my data, and this is a space where I want someone who is both passionate and financially motivated to maintain the product.
- Client-side encryption [4]. This is a minimum bar for any candidate product in this space, yet I was surprised by how few products/projects offer this fundamental element.
Compared to Crashplan, my reference backup solution, Duplicacy gives me much faster backup and restore with more control. It does so at a total price below Crashplan’s [5], given my rather modest backup size at present.
Let me conclude with a tip on a great storage backend for Duplicacy: Wasabi (which I mentioned above).
It’s modelled on the de facto standard (S3) API, complete with its own AWS IAM clone. This makes it trivial to set up a policy that allows Duplicacy to read from and write to a bucket but prevents it from deleting anything, which helps against ransomware. Every now and then I temporarily add the DeleteObject permission and run a manual prune job [6] to reclaim space and save on storage cost. The rest of the time, having delete disabled gives me peace of mind.
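To give an idea of what that looks like in practice, here is a rough sketch in Python with boto3 (Wasabi speaks the AWS APIs). The bucket name, user name and IAM endpoint are placeholders rather than my actual setup, and you could just as well paste the policy document into Wasabi’s console by hand; the point is simply that DeleteObject is missing from the allowed actions.

    # Sketch of a no-delete policy for the IAM user whose keys Duplicacy uses.
    # Bucket/user names and the Wasabi IAM endpoint are placeholders.
    import json
    import boto3

    BUCKET = "my-duplicacy-bucket"  # hypothetical bucket name

    policy = {
        "Version": "2012-10-17",
        "Statement": [
            {   # listing the bucket is needed for backups and restores
                "Effect": "Allow",
                "Action": ["s3:ListBucket", "s3:GetBucketLocation"],
                "Resource": f"arn:aws:s3:::{BUCKET}",
            },
            {   # read and write objects, but no s3:DeleteObject here --
                # add it temporarily before a prune run [6], then remove it
                "Effect": "Allow",
                "Action": ["s3:GetObject", "s3:PutObject"],
                "Resource": f"arn:aws:s3:::{BUCKET}/*",
            },
        ],
    }

    # Attach the policy to the user whose access keys Duplicacy is configured with.
    iam = boto3.client("iam", endpoint_url="https://iam.wasabisys.com")
    iam.put_user_policy(
        UserName="duplicacy",
        PolicyName="duplicacy-no-delete",
        PolicyDocument=json.dumps(policy),
    )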
The transfer performance is great if you reside in Europe. I use eu-central-1 and get amazing, almost AWS-level transfer speeds at a fraction of the price.
I pair this bulk backup to Wasabi with another copy of the most critical data to S3, since AWS is far less likely to fold from financial hardship than a startup.
If Wasabi were to go belly up, I’d just copy the files over to S3 and eat the increased cost. Duplicacy plus a managed bucket service has turned out to be the perfect pairing for offsite backups on every dimension, except perhaps extreme penny-pinching [7].
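For what it’s worth, Duplicacy can also replicate snapshots between storages itself with its copy command, which is probably how I’d handle that migration. A minimal sketch (Python shelling out to the CLI; the storage name, snapshot ID and S3 URL below are placeholders, and the exact storage URL syntax is documented in the guide):

    import subprocess

    # Register a second, copy-compatible storage alongside the default one.
    # "offsite-s3" and the URL are illustrative only -- check the guide for
    # the exact S3 URL syntax for your region.
    subprocess.run(
        ["duplicacy", "add", "-copy", "default", "offsite-s3",
         "my-snapshot-id", "s3://eu-central-1@amazon.com/my-bucket/backups"],
        check=True,
    )

    # Copy existing revisions from the default (Wasabi) storage to S3.
    subprocess.run(
        ["duplicacy", "copy", "-from", "default", "-to", "offsite-s3"],
        check=True,
    )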
I never thought I’d find a perfect solution in this space, but having vetted this cross-platform backup product finally allows me to close the book on my backup struggles. When people ask (companies or startups), I’m finally able to give a confident answer to the question “How should I deal with backups?”: use Duplicacy.
[1]: Every time a user navigates to a directory node, Duplicati reads the entire index database, runs SQL queries on it, and writes out a completely new temporary database tens of gigabytes in size. The developer hasn’t explained that design choice, but it makes the restore feature completely useless, even when the indices reside on fast SSDs or a RAM disk!
[2]: I’m on a symmetric 1 Gbps connection and get about 600 Mbps transfer speed to Wasabi (a bit higher to S3). Crashplan, in comparison, took a day for the same amount of data.
[3]: Point the duplicacy CLI at the remote storage, enter the decryption secret and voila. Data restored in a few minutes.
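To make footnote [3] a bit more concrete, my disaster-recovery notes boil down to something like the sketch below (Python shelling out to the duplicacy CLI; the snapshot ID, storage URL, revision number and paths are placeholders, not my real setup):

    import os
    import subprocess

    # Start from an empty directory on the replacement machine.
    os.makedirs("/restore/target", exist_ok=True)
    os.chdir("/restore/target")

    # Re-attach the directory to the existing (encrypted) remote storage.
    # The CLI prompts for the encryption password; it can also be supplied
    # through the DUPLICACY_PASSWORD environment variable.
    subprocess.run(
        ["duplicacy", "init", "-e", "my-snapshot-id",
         "wasabi://eu-central-1@s3.eu-central-1.wasabisys.com/my-bucket"],
        check=True,
    )

    # See which revisions exist, then restore everything from the chosen one.
    subprocess.run(["duplicacy", "list"], check=True)
    subprocess.run(["duplicacy", "restore", "-r", "42"], check=True)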
[4]: This is actually one point on which I think the software could improve: steal the great idea from Duplicati of letting the user choose between a built-in encryption implementation and piping through an external GPG program. Since so much can go wrong with security, and cryptographic implementations in particular, letting users pick an encryption program they trust would help a lot, especially in the enterprise space or for people who work with sensitive corporate/government data.
[5]: Including the Duplicacy paid license and the cost of storage on Wasabi/S3.
[6]: Takes just a few seconds, and then I disable the delete permission again.
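For reference, the manual prune step amounts to roughly the following; the retention flags are illustrative, not my actual policy:

    import subprocess

    # Example retention: drop snapshots older than a year, and keep only one
    # snapshot per week for anything older than 30 days. Run this only while
    # DeleteObject is temporarily granted, then revoke the permission again.
    subprocess.run(
        ["duplicacy", "prune", "-keep", "0:365", "-keep", "7:30"],
        check=True,
    )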
[7]: Depends on how large your backup sets are. For a few hundred gigs, bucket solutions are cheaper than cloud drives (Dropbox/iDrive/Gdrive/ADrive…), but once you approach 2TB the latter may be more cost effective. Then again, those come with other issues to worry about, such as ransomware and backup performance, which are non-issues with good bucket solutions.