Wasabi reliability, data loss? Oh my

Agree to disagree here. Backup software with no support for the most reliable and cheapest backup storage is, in my opinion, still a massive oversight.

I can’t, because facts don’t support the claim.

Most backup software today does not support Glacier. Arq is an exception. By that logic most backup software is faulty with massive oversights?

Restic thread on the same topic, with a very good explanation of why they don’t support it: Restic and S3 Glacier Deep Archive - Features and Ideas - restic forum. The reasoning is almost verbatim the same as with duplicacy.

This is indeed nice, and could benefit other high latency providers, not just storj.

It’s just as reliable as any other (Amazon) hot storage – both Glacier Deep Archive and, say, Intelligent-Tiering provide 11 nines of durability. And so do Storj and Backblaze…

To be clear, I said “reliable and cheap”. Can you point to an equally (or more) reliable storage at the price of Glacier deep storage?

You inserted the word “faulty” – I never claimed it was a fault to not include it, just a massive oversight, which I still personally believe no matter what the competition may or may not have.

Of course not. But the cost of storage is just one side of it. There are many more variables to consider – cost of restore, speed of restore, the effect of long minimum retention periods; and, from a development perspective, the risk of changing the software architecture at such a late stage of development. This may negate all the benefits of supporting such storage.

To be clear, I feel it would be nice if duplicacy supported it, and I would use it myself. BTW, I don’t see a feature request thread about it – there are a few discussions like this one S3 Glacier Deep Archive? and this one [Cold Storage] Compatibility with OVH Cloud Public Archive, but I don’t see a feature request. Maybe we should start one.

Understood. (For me these are synonymous)

That’s fair.

sorry for necro, always appreciate your takes in these threads.

i’m on wasabi and run the basic check (do chunks exist) weekly. these reports are making me nervous though so i’m going to switch to GCS - would you bother doing duplicacy checks on GCS (or S3), or do you assume that they are reliable?

Wasabi has issues with availability, not durability. They still promise the same durability as Amazon and others. Whether to trust that claim is a different story.

I do that after every backup. This check is mostly there to protect against application errors; it is worth doing regardless of the storage backend.

Validating chunk consistency, on the other hand, is counterproductive. If you don’t trust the remote to do the one job you pay them to do – keep data intact – don’t use that remote. Downloading the entire dataset weekly to ensure it’s still valid is silly, and it only proves the data was valid at the time of the check; you still won’t know whether it is valid now.
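For reference, the post-backup routine I mean amounts to something like the sketch below (the repository path is a placeholder, and it assumes the duplicacy CLI is on your PATH):

```bash
# Placeholder path – wherever your duplicacy-initialized repository lives.
cd /path/to/repository || exit 1

# Back up, then run the basic check: it only verifies that every chunk
# referenced by the snapshots exists on the storage – nothing is downloaded.
duplicacy backup
duplicacy check
```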

yeah, i have a local nas that i back up to, do the basic check daily on that (and wasabi, i confused myself), and a chunk consistency check on the nas once per week.

yes the problem is that i don’t think i do trust Wasabi any more, not with the stories of them losing data because of bugs and then not telling any of their customers.

If your NAS has a self-healing filesystem (btrfs, zfs), then you could (should) run a periodic scrub on the NAS instead. This will find and fix discrepancies across the whole filesystem, not just the duplicacy datastore.
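A rough sketch of what that looks like on zfs, assuming a pool named tank (placeholder name):

```bash
# Kick off a scrub: zfs verifies every block against its checksum and
# repairs from the mirror/parity copy where needed.
zpool scrub tank

# Later, review the result – look for "scrub repaired ..." and the CKSUM column.
zpool status -v tank

# To automate it, a monthly cron entry along these lines would do:
#   0 3 1 * * /sbin/zpool scrub tank
```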

ZFS just patched a recently discovered data corruption bug that had been in the code for a very long time. The nature of it was that a scrub wouldn’t detect it. In this case, a chunk check would find it (I hope…). But I agree, it shouldn’t be a necessary part of normal life.

unfortunately it’s RAID1 only.

It was undiscovered for so long specifically because it occurs under circumstances most users don’t encounter. It’s an extremely rare bug. The recent block cloning feature increased the repro rate, at which point it was promptly fixed. The vast majority of users don’t jump on new features, so for them it would have remained an extremely remote possibility, not worth worrying about.

And lastly, due to the nature of the bug, it would never have happened with duplicacy data in the first place.

If anything, this is validation that it’s not worth wasting time validating chunks. On the other hand, duplicacy checks each chunk only once, so after the first pass only newly uploaded chunks would be verified. If doing so buys you peace of mind – why not.
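For completeness, the chunk verification in question is the -chunks option of the check command; run from the repository, against the default storage, it would just be:

```bash
# Downloads and verifies chunk contents; chunks that already passed verification
# in an earlier run are remembered and skipped, so later runs only cover new chunks.
duplicacy check -chunks
```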

It was initially discovered by someone doing a very normal workflow: compiling code. Newer versions of coreutils introduced reflink support to cp, which exercised the conditions necessary to hit it. And they only noticed because their workflow failed.

Sure, on the initial write of the data. As far as I know, duplicacy doesn’t use reflinks or sparse files, but I very well could have cp’d my repo somewhere else in the pool and silently tripped over the bug.
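For illustration, the sort of copy that would have exercised that path on a recent coreutils (the paths here are made up):

```bash
# Since coreutils 9.x, cp defaults to --reflink=auto, so on a pool with block
# cloning enabled this creates reflink copies instead of rewriting every byte.
cp -a --reflink=auto /tank/duplicacy-storage /tank/duplicacy-storage-copy
```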

Not sure I understand that point at all. An application-level checksum would be the only way to detect such a bug. I’m not advocating a chunk check routine, simply pointing out that there are use cases for it, and “self-healing” filesystems are not infallible.
