Wasabi reliability, data loss?, oh my

It’s Glacier Deep Archive.

I’m thinking of rearchitecting the whole backup over the holidays: rclone immutable data to archive storage and keep mutable data in hot storage. Price won’t matter because there is very little mutable data.

I see. Good to know that there are multiple Glacier products. But is my understanding correct that all of these are technically S3 storage classes, which means they can all be accessed by Arq?

Then again, duplicacy also supports S3 but not Glacier. So S3 can’t be the sole criterion. :thinking:

Could you elaborate? What is “rclone immutable data”? Do you mean this: GitHub - emmetog/immutable-backups: A wrapper around rclone to perform immutable backups, both full and incremental (and restore them)? Or do you mean you want to use rclone to back up to Glacier?

And what do you mean by “keep mutable data in hot storage”?

Is Arq no longer going to be part of your backup setup? I was just about to finally give it a try…

S3 is a protocol. Amazon’s cold storage is offline: data needs to be “thawed” (transferred to hot storage) before it can be accessed, which can take hours. The application needs to be able to handle that: request the files, wait, and then proceed with restoring the data.
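
For illustration, here is roughly what that flow looks like against the S3 API (a minimal sketch using boto3; the bucket name, key, and tier choice are placeholders, not how any particular backup app does it):

```python
import time
import boto3

s3 = boto3.client("s3")
bucket, key = "my-backup-bucket", "chunks/abc123"  # placeholders

# Ask S3 to thaw the object out of the archival tier.
s3.restore_object(
    Bucket=bucket,
    Key=key,
    RestoreRequest={"Days": 1, "GlacierJobParameters": {"Tier": "Bulk"}},
)

# Poll until the restore completes (hours for Glacier, up to ~12-48h for Deep Archive).
while True:
    head = s3.head_object(Bucket=bucket, Key=key)
    if 'ongoing-request="false"' in head.get("Restore", ""):
        break
    time.sleep(3600)

# Only now can the object actually be downloaded.
data = s3.get_object(Bucket=bucket, Key=key)["Body"].read()
```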

Duplicacy does not support those types of storage today, so you can’t use Amazon Glacier of any kind with duplicacy.

You could however use Google Cloud Storage’s Archive tier: it’s slightly more expensive and has twice the minimum retention period, but it does not require thawing. I don’t know what its free tier parameters are, i.e. how much data you can restore for free, but it is one of the options usable with duplicacy that will likely still be cheaper than hot storage.

Photos and videos make up the majority of my data. That data never changes, it is not compressible, and cannot be deduplicated. Versioning it is only useful to protect against bit rot on the source.

Hence, instead of using duplicacy to back it up, I could rclone copy the data to the cloud using access keys that prohibit deletion. Then changed data (read: corrupted) will fail to overwrite the good data in the cloud.
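
As a rough sketch, the policy behind such a restricted access key could look something like this (the bucket name is a placeholder and the statement list is illustrative, not a vetted policy):

```python
import json

BUCKET = "my-photo-archive"  # placeholder bucket name

# Illustrative IAM-style policy: the key rclone uses can list, read, and upload
# objects, but has no s3:DeleteObject permission, so files already uploaded
# cannot be removed with it (and with bucket versioning enabled, old versions
# also survive overwrite attempts).
policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": ["s3:ListBucket"],
            "Resource": f"arn:aws:s3:::{BUCKET}",
        },
        {
            "Effect": "Allow",
            "Action": ["s3:PutObject", "s3:GetObject"],
            "Resource": f"arn:aws:s3:::{BUCKET}/*",
        },
    ],
}

print(json.dumps(policy, indent=2))
```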

The rest of the data, which will benefit from versioning, is small: a few hundred GB at most, stuff like documents, spreadsheets, and projects. It can be backed up to hot storage, which does not require thawing and is supported by duplicacy: for example, Amazon’s Standard or Infrequent Access tiers. Their higher cost is irrelevant because the total amount of data is very small.

I will still use it to back up that hot data, but now for a different reason: it supports backing up cloud-only files. I have 3TB of stuff in iCloud and my Mac has only 1TB of space. I don’t have any other way to handle that.


or to protect against errors made by the “part” between the keyboard and the chair.

Absolutely; I’d argue that part malfunctions more often than the hardware. The same solution applies though: forbid deleting once uploaded (this also takes care of modifications; most cloud storage protocols, and S3 specifically, require delete permission to modify a file).

Wow, this seems like a massive oversight for any backup program. I was about to look into Glacier deep archive, but I guess I’ll have to do that in Arq instead of Duplicacy.

I’ve noticed myself relying on Arq way more often than Duplicacy nowadays (better Storj compatibility, offline file support, and now Glacier deep archive). Perhaps I shouldn’t have just bought the lifetime Duplicacy license. Oh well.

I would not call it a “massive oversight”. It’s a missing, rather niche, feature that appeals to a small minority of home users, but unlike adding support for yet another storage provider (see how Arq added Storj by just templating the S3 parameters), this requires some re-architecting. The risk could be much larger than the potential reward.

At their core, archival tiers are for archiving: stuff you don’t need but don’t want to throw away. Archival is not the same as backup. Some apps support it for backup, but even in a disaster recovery scenario, waiting 12 hours for data to defrost only appeals to price-focused users, i.e. home users.

Arq decided to support it, but it is in the minority.

Can you elaborate on this one? Arq uses the Storj S3 gateway, and so can duplicacy. In addition, duplicacy supports the native Storj backend. So how is it that Arq’s compatibility is better?

Arq automatically adjusts the block size for perfect compatibility with Storj’s object sizes, without needing to manually edit storage config settings the way Duplicacy does.
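
As a rough illustration of why that matters (assuming Storj’s 64 MiB maximum segment size and its per-segment fee; the dataset size and chunk sizes below are made up):

```python
import math

SEGMENT_SIZE = 64 * 1024**2   # Storj stores objects in segments of up to 64 MiB
DATASET = 2 * 1024**4         # hypothetical 2 TiB of backup data

def segments_used(chunk_size: int) -> int:
    """Each uploaded chunk occupies at least one billable segment."""
    chunks = math.ceil(DATASET / chunk_size)
    return chunks * math.ceil(chunk_size / SEGMENT_SIZE)

# Smaller chunks -> many more segments -> higher per-segment overhead.
for chunk_mib in (4, 16, 64):
    n = segments_used(chunk_mib * 1024**2)
    print(f"{chunk_mib:>3} MiB chunks: {n:,} segments")
```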


Agree to disagree here. Backup software with no support for the most reliable and cheapest backup storage is, in my opinion, still a massive oversight.

I can’t, because facts don’t support the claim.

Most backup software today does not support Glacier. Arq is an exception. By that logic most backup software is faulty with massive oversights?

Restic thread on the same topic, with a very good explanation of why they don’t support it: Restic and S3 Glacier Deep Archive - Features and Ideas - restic forum. The reasoning is almost verbatim the same as with duplicacy.

This is indeed nice, and could benefit other high-latency providers, not just Storj.

It’s just as reliable as any other (Amazon) hot storage: both Glacier Deep Archive and, say, Intelligent-Tiering provide 11 nines of durability. And so do Storj and Backblaze…

To be clear, I said “reliable and cheap”. Can you point to an equally (or more) reliable storage at the price of Glacier Deep Archive?

You inserted the word “faulty” – I never claimed it was a fault to not include it, just a massive oversight, which I still personally believe no matter what the competition may or may not have.

Of course not. But the cost of storage is just one side of it. There are many more variables to consider: cost of restore, speed of restore, the effect of long minimum retention, and, from a development perspective, the risk of changing the software architecture at such a late stage of development. These may negate all the benefits of supporting such storage.

To be clear, I feel it would be nice if duplicacy supported it, and I would use it myself. BTW, I don’t see a feature request thread about it: there are a few discussions like this one S3 Glacier Deep Archive? and this one [Cold Storage] Compatibility with OVH Cloud Public Archive, but I don’t see an actual feature request. Maybe we should start one.

Understood. (For me these are synonymous)

That’s fair.

Sorry for the necro; I always appreciate your takes in these threads.

I’m on Wasabi and run the basic check (do chunks exist) weekly. These reports are making me nervous though, so I’m going to switch to GCS. Would you bother doing duplicacy checks on GCS (or S3), or do you assume that they are reliable?

Wasabi has issues with availability, not durability. They still promise the same durability as Amazon and others. Whether to trust that claim is a different story.

I do that after every backup. This check is mostly there to protect against application errors; it is worth doing regardless of the storage backend.

Validating chunk consistency, on the other hand, is counterproductive. If you don’t trust the remote to do the one job you pay them to do (keep data intact), don’t use that remote. Downloading the entire dataset weekly to ensure it’s still valid is silly, and it only proves the data was valid at the time of the check; you still won’t know whether it is valid now.
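
To be clear about the difference between the two kinds of check, here is a minimal sketch against a generic S3-compatible bucket with boto3 (the bucket name and chunk keys are placeholders; duplicacy’s check roughly corresponds to the first loop, while full chunk verification corresponds to the second):

```python
import hashlib
import boto3
from botocore.exceptions import ClientError

s3 = boto3.client("s3")
BUCKET = "my-backup-bucket"         # placeholder
CHUNK_KEYS = ["chunks/ab/cdef01"]   # placeholder list of chunk object keys

# Cheap check: do the chunks referenced by the snapshots still exist?
# One HEAD request per chunk, nothing is downloaded.
for key in CHUNK_KEYS:
    try:
        s3.head_object(Bucket=BUCKET, Key=key)
    except ClientError:
        print(f"missing chunk: {key}")

# Expensive check: download every chunk and verify its content.
# This is the part that requires pulling the whole dataset down.
for key in CHUNK_KEYS:
    body = s3.get_object(Bucket=BUCKET, Key=key)["Body"].read()
    digest = hashlib.sha256(body).hexdigest()
    # ...compare digest against the hash recorded for that chunk
```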

Yeah, I have a local NAS that I back up to, do the basic check daily on that (and on Wasabi, I confused myself), and a chunk consistency check on the NAS once per week.

Yes, the problem is that I don’t think I do trust Wasabi any more, not with the stories of them losing data because of bugs and then not telling any of their customers.

If your NAS has a self-healing filesystem (btrfs, zfs), then you could (should) run a periodic scrub on the NAS instead. That will find and fix discrepancies across the whole filesystem, not just the duplicacy datastore.

ZFS just patched a recently discovered data corruption bug that had been in the code for a very long time. The nature of it was that a scrub wouldn’t detect it. In this case, a chunk check would find it (I hope…). But I agree, it shouldn’t be a necessary part of normal life.

Unfortunately it’s RAID1 only.

It was undiscovered for so long specifically because it occurs under circumstances most users don’t encounter. It’s an extremely rare bug. The recent block cloning feature exacerbated the repro rate, at which point it was promptly fixed. The vast majority of users don’t jump on new features, so for them it would have remained an extremely remote possibility, not worth worrying about.

And lastly, due to the nature of the bug, it would never have happened to duplicacy data in the first place.

If anything, this is validation that it’s not worth wasting time validating chunks. On the other hand, Duplicacy checks every chunk only once, so only newly uploaded chunks would be verified. If doing so buys you peace of mind, why not.

It was initially discovered by someone doing a very normal workflow…compiling code. Newer versions of coreutils introduced reflink support to cp, which exercised the conditions necessary to hit it. And they only noticed because their workflow failed.

Sure, on the initial write of the data. As far as I know, duplicacy doesn’t use reflinks or sparse files, but I very well could have cp’d my repo somewhere else in the pool and silently tripped over the bug.

Not sure I understand that point at all. An application-level checksum would be the only way to detect such a bug. I’m not advocating a chunk check routine, simply pointing out that there are use cases for it, and that “self-healing” filesystems are not infallible.
