Diabolical download speeds off OneDrive using Duplicacy

So, your storage cost is $9.65/month, likely because you are using the very expensive “S3 Standard” storage tier. Consider “S3 Glacier Instant Retrieval” or “S3 Intelligent-Tiering” instead; their cost structure is much better aligned with the backup use case.
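As a rough illustration of the difference, here is a back-of-envelope sketch. The per-GB prices below are approximate us-east-1 rates I'm assuming for illustration; check the current AWS pricing page before relying on them.

```python
# Back-of-envelope S3 storage class comparison (approximate prices, not quotes).
standard_per_gb = 0.023      # S3 Standard, $/GB-month (approx. us-east-1)
glacier_ir_per_gb = 0.004    # S3 Glacier Instant Retrieval, $/GB-month (approx.)

# Infer the amount of stored data from the $9.65/month bill:
data_gb = 9.65 / standard_per_gb
print(f"Stored data: ~{data_gb:.0f} GB")                                   # ~420 GB
print(f"Same data on Glacier IR: ~${data_gb * glacier_ir_per_gb:.2f}/mo")  # ~$1.68/mo
```

The same data would cost a small fraction per month on an archival tier, at the price of retrieval fees if you ever need to restore.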

The source of the problem however is this:

It’s a download from S3. That is expected to be expensive, and it’s also expected that you should never need to do it, until your machine and all local copies of the data burn in flames and you need to do a full restore.

So, what caused 684GB to be downloaded from S3?

check -chunks?


Dunno! I ran a ‘prune’ a couple of times to check how it would work going forward, but other than that I did not download anything. Just ran backups.

Anyway, I’ve extricated myself from the AWS service. If I can’t understand or control my spend, I need to look for alternatives. Recently (on Windows, admittedly) I was backing up with SyncBackPro, writing directly to an online OneDrive account. It managed up to 10MB/s, so I may just go with that since it’s no extra cost to me.

Picking alternative storage is not a solution for unexplained egress. Regardless of which storage provider you end up using, there should be no mysterious traffic, even if it isn’t explicitly charged.

I would review your logs. While I, too, think the root cause @sevimo suggested is the likely culprit, you need to get to the bottom of it; 600GB is not a bag of peanuts to get lost in the shuffle.

How would you avoid this charge? Can you run a check without -chunks to avoid downloading?

You don’t need the -chunks flag to run check. Without it, check only verifies the existence of referenced chunks and assumes that if a chunk file exists, it has the right content. This avoids most of the download activity during a check.
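For reference, the two invocations look like this (run from a repository initialized against the S3 storage):

```shell
# Existence-only check: lists chunks in the storage and verifies that every
# chunk referenced by a snapshot is present. No chunk contents are downloaded.
duplicacy check

# Full content verification: downloads every chunk and verifies its hash.
# Against S3 this is what can generate hundreds of GB of egress.
duplicacy check -chunks
```

So for routine verification, the plain `check` is the cheap option; reserve `-chunks` for when you specifically need to validate chunk contents.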


Got it, I just looked up the documentation and posts about the chunks and files flags. Thanks!

On the note of checking chunk content, I think it’s also worth mentioning that AWS doesn’t charge for data transfer between an S3 bucket and other AWS services in the same region.

So, if you really want (or need) to check the content of chunks stored in S3 for some reason, it might be cheaper and/or faster to perform the check using an EC2 instance. (This is especially true if you already have an EC2 instance that you’re willing to use.)

To be clear, there are still some costs associated with checking chunk content with S3:

  • You’ll still be charged for S3 GET and LIST requests.
  • If you’re using S3 Intelligent-Tiering, reading older chunks will still transition them back to the more expensive frequent-access tier.
  • If you’re using other storage classes that have retrieval fees, you’ll still be billed those retrieval fees.

With that said though, if you already have your mind set on checking chunk content, these costs are relevant regardless of whether you use an EC2 instance or not.

This means the main thing to consider is whether the cost of the EC2 instance would be cheaper than the S3 data transfer costs.
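To make that comparison concrete, here is a rough sketch for the 684GB mentioned earlier. The egress and instance prices are approximate us-east-1 rates assumed for illustration (and GET/LIST request charges are ignored), so treat the numbers as ballpark only.

```python
# Rough cost comparison: verifying 684GB of chunks over the internet vs. from EC2.
# Prices are approximate us-east-1 assumptions, not quotes; request fees ignored.
egress_per_gb = 0.09      # S3 -> internet data transfer, $/GB (approx.)
data_gb = 684

egress_cost = data_gb * egress_per_gb
print(f"Internet egress: ~${egress_cost:.2f}")   # ~$61.56

# Same-region S3 -> EC2 transfer is free, so you pay only for instance time.
# Hypothetical small instance at ~$0.05/hr running a 6-hour check:
ec2_cost = 0.05 * 6
print(f"EC2 instance (~6h): ~${ec2_cost:.2f}")   # ~$0.30
```

Under these assumptions the instance time is a rounding error next to the egress bill, which is why running the check inside the same region can be so much cheaper.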

I haven’t personally tried this, so it’s hard for me to say for sure whether it’s actually cheaper or not. I figured I’d at least point it out for those interested in checking chunk content with S3 storage though.