B2 Bucket Setup with Duplicacy & NAS Support

I have a couple of questions.

The first one has to do with B2 and the bucket setup for retention. Looking at B2, by default, any time a file is changed it keeps a copy of the old version of the file. I'm not familiar with how Duplicacy's repositories are set up, but if they are single files that contain all history (similar to how, say, Time Machine on Apple computers works), then every backup would result in large amounts of storage being consumed on B2. Thus, my question: if using B2, should this option be turned off? Or does Duplicacy actually store files raw (albeit encrypted), such that I would need to rely on B2's versioning?

Second, has anyone run Duplicacy directly on a NAS (in my case, a Synology)? Via Docker or installed directly? My thought is that you could probably do so via Docker (and it would likely be CLI-only at the moment), but I haven't seen a post from anyone doing so, unless I'm blind.

Conceptually, Duplicacy concatenates all your files into one huge stream, chops the stream up into chunks, and then uploads the chunks as individual files. There’s not a 1:1 relationship between these chunks and the file(s) they contain. Many small files could be included in a single chunk, or a large file could be spread across multiple chunks that it doesn’t share with anything else.

Because of the way the chunking is done, a change to a single file is only going to impact a small number of chunks (possibly just one). Only these chunks would be recomputed and their contents stored. The change in space required is roughly proportional to the magnitude of the change in the data you’re backing up.
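If it helps to see the idea in code, here is a toy Go sketch (not Duplicacy's actual chunker; the window size, boundary rule, and chunk sizes are invented for the example). It chunks a 1 MiB stream, makes a small edit in the middle, re-chunks, and counts how many chunk IDs were already stored. Nearly all of them are, so almost nothing new would need to be uploaded:

```go
// Toy content-defined chunker (NOT Duplicacy's real algorithm; the window,
// boundary rule, and chunk sizes are invented for illustration). A rolling
// hash over the last few bytes decides where chunks end, so an edit in the
// middle of a big stream only disturbs the chunk(s) around the edit.
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"math/rand"
)

const (
	window = 48            // rolling-hash window in bytes
	mask   = (1 << 11) - 1 // boundary when low bits are zero: ~2 KiB average chunks
	base   = 31
)

// chunkIDs splits data at content-defined boundaries and returns the
// SHA-256 (truncated for readability) of each chunk.
func chunkIDs(data []byte) []string {
	var basePow uint32 = 1
	for i := 0; i < window; i++ {
		basePow *= base // base^window, used to drop the byte leaving the window
	}

	var ids []string
	var h uint32
	start := 0
	for i := range data {
		h = h*base + uint32(data[i])
		if i >= window {
			h -= uint32(data[i-window]) * basePow
		}
		if h&mask == 0 || i == len(data)-1 {
			sum := sha256.Sum256(data[start : i+1])
			ids = append(ids, hex.EncodeToString(sum[:8]))
			start = i + 1
		}
	}
	return ids
}

func main() {
	// deterministic pseudo-random stand-in for "all your files concatenated"
	rng := rand.New(rand.NewSource(1))
	data := make([]byte, 1<<20) // 1 MiB
	rng.Read(data)

	before := chunkIDs(data)

	// simulate editing one file in the middle of the stream
	edited := make([]byte, 0, len(data)+32)
	edited = append(edited, data[:500000]...)
	edited = append(edited, []byte("<a small change to one file>")...)
	edited = append(edited, data[500000:]...)
	after := chunkIDs(edited)

	stored := make(map[string]bool)
	for _, id := range before {
		stored[id] = true
	}
	reused := 0
	for _, id := range after {
		if stored[id] {
			reused++
		}
	}
	// nearly every chunk is unchanged; only the one(s) around the edit are new
	fmt.Printf("chunks before: %d  after: %d  already stored: %d\n",
		len(before), len(after), reused)
}
```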

That having been said, Duplicacy has its own mechanism for keeping old copies of files. It can do this even using storage back ends that don’t have any concept of versioning at all. You don’t need to turn on versioning on the provider. Versioning the chunks doesn’t get you much, if anything. It’s hard to imagine a case where it would actually help you.

I am pretty sure I saw a posting from someone doing exactly what you describe - Duplicacy on Docker on a NAS. I can’t remember if it was Synology or QNAP but I’m pretty sure it’s here. The nice thing about programs written in Go (which Duplicacy is) is that they are single self-contained binaries. They are dead-easy to “install”. You just copy the file to the destination and run it. I’m talking CLI.


Actually, come to think of it, chunks are named after their contents, so chunks will never change, and versioning won't impact those at all. Only the non-chunk files are updated, and right now the only one I can think of is the config file, which is tiny. So the versioning setting won't have any measurable impact on storage space. If you want to reduce your storage footprint you have to do it with the mechanisms in Duplicacy; I believe it has a prune command to remove old snapshots.
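To illustrate the "named after their contents" part, here is a minimal Go sketch of a content-addressed store. The ChunkStore type and Put method are made-up names for illustration, and Duplicacy's real ID derivation and encryption details differ, but the key property is the same: a chunk file with a given name can never be overwritten with different contents, so B2's versioning has nothing to version.

```go
// Sketch of content-addressed storage: the chunk's name is derived from its
// contents, so a chunk file is either brand new or already there byte-for-byte.
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// ChunkStore stands in for a bucket or folder of chunk files, keyed by chunk ID.
type ChunkStore map[string][]byte

// Put stores a chunk under the hash of its contents. If a chunk with that ID
// already exists it is skipped; nothing is ever overwritten, so provider-side
// versioning never sees more than one version of any chunk file.
func (s ChunkStore) Put(chunk []byte) (id string, uploaded bool) {
	sum := sha256.Sum256(chunk)
	id = hex.EncodeToString(sum[:])
	if _, exists := s[id]; exists {
		return id, false // identical chunk already stored: dedupe, no upload
	}
	s[id] = append([]byte(nil), chunk...)
	return id, true
}

func main() {
	store := ChunkStore{}

	id1, up1 := store.Put([]byte("some chunk of backup data"))
	id2, up2 := store.Put([]byte("some chunk of backup data")) // same contents
	id3, up3 := store.Put([]byte("different contents, different name"))

	fmt.Println(id1 == id2, up1, up2) // true true false: second copy skipped
	fmt.Println(id1 == id3, up3)      // false true: new contents, new chunk file
}
```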

Thanks for providing the information. Based on what you're saying, turning off versioning (at B2) would be beneficial then. The issue is that if Duplicacy were updating chunks, B2 would mark each changed chunk (however big it is) as a new version. Depending on how big the chunk is (let's say it's 2 GB), B2 would then charge me for 4 GB of storage, even if the chunk isn't significantly different. So I'm glad you confirmed that I can turn it off and not worry about having it on.

As for the NAS option, I'm going to search for that now. Thanks.

Duplicacy won’t ever update a chunk. Its architecture is completely predicated on this. If source file(s) change, one or more new chunk(s) will be calculated and those new chunks will be uploaded. The new chunks will have different names from the old chunks. They don’t replace the old chunks. Both the old and new chunks are present.

Each snapshot has a manifest that identifies the chunks it contains. The first snapshot would reference the old chunks and the second snapshot would reference the new chunks. This is what allows Duplicacy to store multiple versions.

It’s why I said you had to use Duplicacy’s mechanisms to reduce the required storage space. The only way to get rid of those old chunks is to delete all of the snapshots that refer to them.
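Here is a small Go sketch of that relationship (the types and names are illustrative, not Duplicacy's actual format): each snapshot is a manifest of chunk IDs, and a chunk only becomes removable once no remaining snapshot references it, which is essentially what Duplicacy's prune command has to work out.

```go
// Sketch of why old chunks only go away once every snapshot that mentions
// them is deleted. Types and names here are illustrative only.
package main

import "fmt"

// Snapshot is a manifest: a revision number plus the IDs of the chunks that
// make up that revision of the backup.
type Snapshot struct {
	Revision int
	Chunks   []string
}

// unreferenced returns the chunk IDs that no remaining snapshot refers to;
// only those can be safely deleted from the storage.
func unreferenced(allChunks []string, remaining []Snapshot) []string {
	inUse := make(map[string]bool)
	for _, snap := range remaining {
		for _, id := range snap.Chunks {
			inUse[id] = true
		}
	}
	var dead []string
	for _, id := range allChunks {
		if !inUse[id] {
			dead = append(dead, id)
		}
	}
	return dead
}

func main() {
	// revision 1 referenced chunks a,b,c; revision 2 replaced c with d
	rev1 := Snapshot{Revision: 1, Chunks: []string{"a", "b", "c"}}
	rev2 := Snapshot{Revision: 2, Chunks: []string{"a", "b", "d"}}
	allChunks := []string{"a", "b", "c", "d"}

	// while both snapshots exist, nothing can be removed
	fmt.Println(unreferenced(allChunks, []Snapshot{rev1, rev2})) // []

	// delete revision 1 and chunk "c" finally becomes garbage
	fmt.Println(unreferenced(allChunks, []Snapshot{rev2})) // [c]
}
```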

In practice I don’t think the versioning setting is going to have any impact on the storage space. But for what it’s worth, I have it turned off.

This is the one I was thinking of, but the person was using QNAP. I don’t know anything about QNAP or Synology but they do make reference to Docker so I guess they work in a similar fashion: QNAP integration - anyone with experience?


@Danny thanks for the explanations! Yes, all Duplicacy chunks are immutable, so the versioning setting really doesn’t matter.

I haven't tried running Duplicacy inside a Docker container, but I have tested the arm build on my QNAP ts-212. You may need the linux build if your QNAP model is different. Also, if it gives you a certificate error, you may need to install the root CA certificates under /etc/ssl/certs/.