How to optimize Duplicacy for Storj

I have seen some posts addressing how to optimize Duplicacy for Storj and would like to know more. I am using the Web-UI version.

  1. Should I set a custom chunk size? If so, how? Can it be done in the Web UI?
  2. Are there any other things I should think about to reduce costs and improve performance?

It depends, but likely yes. Increasing the chunk size helps performance with any storage that has nontrivial latency; with Storj it also reduces the number of segments and therefore cost (each chunk below Storj's 64 MB segment limit is its own segment, so 1 TB stored as 4 MB chunks is roughly 262,000 segments, while 32 MB chunks bring that down to about 33,000).

  1. I would set the average chunk size to 32MB (the default is 4MB). Depending on the type of data you back up, some other chunking scheme may or may not be more appropriate. For example, if you back up virtual machine images, then a fixed chunk size of 64MB will provide the best performance and storage utilization.

    • No, it cannot be done in the Web UI. You would need to download the Duplicacy CLI and initialize the storage from some temporary empty repository, then add the already-initialized storage in the Web UI (see the CLI sketch below).
  2. With Storj you have two options:

    1. Native integration:
      Pros: can achieve massive performance, limited only by your internet connection and hardware
      Cons: you will be limited by your internet connection and hardware; most consumer-level routers and switches will experience stability issues. Your upstream traffic will be about 2.7x the actual data upload rate, because native integration does the chunking, encryption, and erasure coding locally and uploads directly to the storage nodes; this is a heavy load on your CPU and internet channel.
    2. S3 Gateway:
      Pros: the Storj-hosted gateway will handle all that processing. You won’t have upload amplification or any special hardware requirements. It will literally behave like any other S3 storage.
      Cons: you won’t reach the performance possible with native integration.

    For most users and backup scenarios I would suggest the S3 gateway. If you ever need to restore at maximum possible performance, you can always configure native integration and do the restore that way; you are not burning any bridges by going with either approach.
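To make the CLI step above concrete, here is a minimal sketch of initializing a Storj-backed storage (via the S3 gateway) with a larger average chunk size and then handing it over to the Web UI. Everything here is illustrative: the snapshot ID, bucket name, and gateway region are placeholders, and you should check `duplicacy init -help` for the exact option syntax of your CLI version.

```
# Run this from a temporary empty directory; it only writes the storage
# config to the bucket, no backups are made from here.
mkdir -p /tmp/storj-init && cd /tmp/storj-init

# -e         encrypt the storage
# -c 32M     average chunk size (the default is 4M)
# -min/-max  chunk size range (the defaults are c/4 and c*4)
# The CLI will prompt for the S3 access key/secret generated from your Storj
# gateway credentials, and for a storage encryption password.
duplicacy init -e -c 32M -min 16M -max 64M temp-repo \
    s3://us1@gateway.storjshare.io/my-duplicacy-bucket

# Then add the same bucket as a storage in the Web UI; it will pick up the
# already-initialized configuration, chunk size included.
```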

My 2 cents…

A few years ago, I discovered Storj and thought I’d found the perfect backup solution, complete with built-in geographic redundancy. However, I soon encountered a critical drawback: per-segment pricing.

Most of my files are small: office documents, markdown files, and plain text. With Duplicacy’s default average chunk size of 4MB, the per-segment fees added roughly 50% to the storage costs, making it comparable to Backblaze B2 or other S3-compatible services. Increasing the chunk size reduced the segment fees, but it caused storage costs to rise rapidly: even minor updates to a small office file (a few KB) generated multiple MB-sized chunks, inflating costs with each backup.

Additionally, Storj’s project-specific password requirement complicated the web interface, which often displayed incorrect information and hindered usability. The egress fees, while infrequent, were another frustration: although I rarely needed to restore or download backups, transferring them between locations after project completion incurred unexpected charges.

Ultimately, managing backups on Storj required significant effort. After struggling with these issues, I switched back to Backblaze B2 for a simpler solution.

This should not have happened. Does it happen consistently?

Can you elaborate more, please? I’d expect the amount of small files that constantly change to be small, so even a 10x storage cost for those would not make any difference; in either case it would be under a cent.

I’m wondering if configuring fixed chunking is the real culprit here: changing a small file in the middle would result in a massive number of new chunks, because now the sausage is shredded differently. Did you set a fixed chunk size or specify a range (e.g. 16-64)?
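For reference, the distinction at init time looks roughly like this (the snapshot ID and storage URL are placeholders): fixed chunking means min = max = average, while the default content-defined chunking uses a range around the average.

```
# Content-defined (variable) chunking: boundaries follow the data, so a small
# edit typically produces only a few new chunks. A 16-64MB range around 32MB:
duplicacy init -e -c 32M -min 16M -max 64M my-backups <storage-url>

# Fixed chunking: every chunk is exactly 64MB. Great for VM images, but an
# insertion in the middle of the data shifts everything after it, so many
# chunks change on each backup:
duplicacy init -e -c 64M -min 64M -max 64M my-backups <storage-url>
```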

I don’t observe what you report with my backup, but I’ll try to reproduce with fixed chunking.