Backblaze B2 bucket size does not match storage size

Hiya,

I’ve got two local Duplicacy repos using the same local storage, which I copy to Backblaze B2 once a day.

Duplicacy reports both the local storage and the Backblaze storage as being 2.2 TB in size, but Backblaze reports the bucket size as 3.6 TB.

The “Keep all versions of the file (default)” option is enabled on the B2 bucket, and there are no other repos being copied to that same bucket. I prune the local storage before the copy action, and the Backblaze storage after the copy action - both prunes use the same settings (-keep 30:365 -keep 7:90 -keep 1:7 -a -exhaustive).
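
For anyone unfamiliar with the retention flags above: each -keep n:m means "keep one snapshot every n days for snapshots older than m days". Below is a rough sketch of how those three rules map a snapshot's age to an interval. The function and variable names are made up for illustration and are not Duplicacy internals, and the strict "older than" comparison is an assumption.

```python
# Sketch only: how a set of -keep n:m rules could map a snapshot's age
# to a retention interval. Names are hypothetical, not Duplicacy internals.
from typing import Optional

# (n, m) pairs from "-keep 30:365 -keep 7:90 -keep 1:7", sorted by m descending.
KEEP_RULES = [(30, 365), (7, 90), (1, 7)]


def retention_interval(age_days: int, rules=KEEP_RULES) -> Optional[int]:
    """Return n (keep one snapshot every n days) for the first rule whose
    age threshold m is exceeded, or None if no rule applies yet."""
    for n, m in rules:
        if age_days > m:  # assumption: "older than m days" is a strict comparison
            return n
    return None  # younger than every threshold: every snapshot is kept


for age in (3, 10, 100, 400):
    print(age, retention_interval(age))
```

So with these settings, snapshots up to a week old are all kept, then one per day up to 90 days, one per week up to a year, and one per 30 days after that.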

Anyone have any ideas what might cause this difference?

It’s probably this:

From what I can find that seems to be the recommended setting (B2 Lifecycle Settings · Issue #168 · gilbertchen/duplicacy · GitHub) - is that still the current recommendation?

I interpreted “Should I disable Backblaze B2 Cloud Lifecycle Settings? - #5 by andrew.heberle” to mean that a prune will free up the used space in the B2 bucket - is that not the case?

I don’t see the point in keeping versions of chunks and revisions, because they are immutable once created.

If the idea is to protect yourself against a corruption of the chunk in the storage (in which case you have a bigger problem, because your storage is not reliable) or even an accidental deletion made by the user, I think a better solution is to use write-only keys and create storage with erasure coding.

I don’t see the point in keeping versions of chunks and revisions, because they are immutable once created.

Agreed, but as gilbertchen recommended this setting in B2 Lifecycle Settings · Issue #168 · gilbertchen/duplicacy · GitHub, I’m a little hesitant to start experimenting with it…

I understand that this lifecycle setup can be a problem in very specific cases, so each one should assess whether their use case (number of repositories backed up to the same storage, prune policy, etc) might have a problem with that.

Specific case example:

@jt70471 that is a very rare case but it could happen to you too. Suppose that an old backup to be deleted by the prune command contains the only copy of a file. But before the prune command renames the chunks that compose the file, a new backup from another repository (still unknown to the prune command) happens to include the same file but doesn’t upload all the chunks since they are already in the storage. The prune command goes ahead to rename all the chunks (or hide the chunks using B2’s hide markers) without realizing that they are needed by another backup (which is still in progress at this time so the final snapshot file has not been uploaded).

(by gchen)

Ref: Restore is very slow · Issue #362 · gilbertchen/duplicacy · GitHub

I have a practice of using simple, granular settings, which isolates possible points of failure (actually pretty much eliminates them in practice). My view is that backups should be reliable, not necessarily complex.

Simple configuration example: I don’t use prune. It just doesn’t make sense in my use case, because the “savings” in storage cost don’t justify the time I would spend configuring it.

I understand your fear of adopting a “not recommended” configuration… But look, it’s not unanimous… :wink::

Thanks for all the feedback.

I’m not entirely sure if the B2 lifecycle settings actually cause the discrepancy in size. But I’m going to try the “Keep only the last version of the file” option - worst case scenario, I have several other copies of my files :wink: .

Will report back in a few days.

The “Keep all versions of the file (default)” option is the right one to use. If another option is selected, then fossils (which are chunks temporarily marked for deletion) may be deleted prematurely.

I think the size discrepancy is caused by a large number of fossils. You can run prune -exhaustive -exclusive -dryrun to see if there are any. Normally fossils will eventually be cleaned up after some time, but if you want to delete them now you can run prune -exhaustive -exclusive.

Thanks! I will try pruning the fossils and will let you know the results.

Indeed, lots of unreferenced fossils, so I’m going to delete those.

Thanks for all the help :slight_smile: !