New to backups and confused about 3-2-1 strategy and bit rot

Hi all,

I’m sure you’re all sick of answering these very basic questions, but I’m probably missing something along the way in my months of reading about the topic.

I’m a long-time lurker, first-time poster, trying to make a decision for the long term. I have recently digitized decades’ worth of family videos and photos totaling ~2TB. These files currently sit on a disk inside my Unraid server that is used exclusively to hold them and is otherwise spun down. I also have 3 copies of these files on external HDDs, 2 of which I have given to other family members and 1 of which I keep in my closet.

I have also set up Nextcloud for me and my family to upload new photos and videos, which, along with the large ‘historical’ archive, will make up our ‘family memories’, so to speak. I’d like to keep these newer files, as well as the historical files, safe using the 3-2-1 backup rule, and easily accessible for me and my family to look at every once in a while.

Initially I thought to have a drive in my Unraid server that houses the historical and new files, along with a separate drive for a local backup, plus a copy of that backup on a cloud solution. However, the more I read, the more I realise it may not be as simple as this (or maybe I am overthinking it). I think it all boils down to bit rot and how to deal with it:

  1. Would I be right in thinking that if bit rot affects my local ‘active’ copy, it wouldn’t affect the ‘backup’ copy, and Duplicacy wouldn’t overwrite it either? - meaning that if I ever find a ‘rotted’ file, I can just restore it from the backup and we are back online?
  2. What if bit rot affects a backup chunk? That would mean I’m relying on a backup that may not actually work. Does Duplicacy detect this in any way? Can Duplicacy use the second backup in the cloud to ‘fix’ the rotted one?
  3. From my understanding, I can mitigate bit rot by using a self-healing filesystem such as ZFS. My only objection is the additional cost, since I’d have to buy drives specifically for this purpose - for proper bit rot correction on both the backup and the active pools, I’d need 4 drives (2 each, as far as I understand). Am I right in thinking this way, or is it overkill if I have a second cloud backup to restore from in case of issues?

Again, apologies if this is all very basic. I have read previous, somewhat related posts, but none that I could find really address this particular scenario.

Many thanks in advance!

  1. Pretty accurate. IMO ‘bit rot’ (literal bits flipping) is an overblown concern on modern drives these days - what’s more likely is sudden drive failure or one or many bad sectors (I don’t think of this as the same thing as bit rot, since bits on spinning rust generally don’t fade on their own, and SSDs tend to lose bits only after extremely long periods of offline time, i.e. years).
    But yes, Duplicacy won’t overwrite with bad data because a) file content is only backed up if the metadata is deemed to have been modified (there’s a rough sketch of this idea after the list), b) read errors shouldn’t create a valid snapshot, and c) you have snapshot revisions of previous backups, so you’ll normally have plenty of time to detect a bad source drive even if (b) fails for whatever reason, and you can then go back to an earlier revision (or other backup media).

  2. You should take measures to read drive data (both source and backup destination) periodically, to test that the drive is still functioning. That’s not Duplicacy or backup specific: if you haven’t read the written bits from a drive in months, you’ll never know whether it’s still in good nick. So run check jobs often, and check -chunks every once in a while, unless you have some other HDD monitoring tool (I used StableBit Scanner to scan every sector of every HDD in my system over the course of 30 days). Watching SMART alone probably isn’t enough; testing your backups by restoring or otherwise reading them is the minimum. Duplicacy will detect any corrupt bits when you read those chunks, or during the process of copying backups to another backup destination - there’s a sketch of scheduling those checks after the list. And indeed, a second backup destination can be used to ‘fix’ another - I’ve done it a few times myself, although there’s a bit of a manual process involved.

  3. Having enough copies is really the best strategy IMO; ZFS wouldn’t hurt, but I wouldn’t say it’s necessary. Duplicacy does, however, have an optional erasure coding feature, which you can enable on a new storage (and you can copy between backup destinations with and without erasure coding enabled). It’s good enough to detect and correct a small number of bad sectors or corrupt bits, though not to repair missing chunks (a secondary backup can help with that).
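
To make (a) above concrete, here’s a toy sketch of the general idea behind metadata-based change detection. This is not Duplicacy’s actual code - just an illustration, with a made-up files_to_back_up helper and a plain dict standing in for the previous snapshot:

```python
from pathlib import Path

def files_to_back_up(root: Path, previous: dict[str, tuple[int, float]]) -> list[Path]:
    """Return only files whose size or mtime changed since the last snapshot.

    Files with unchanged metadata are never re-read, so a silently flipped
    bit in their content can't replace the good copy already in the backup.
    """
    changed = []
    for path in root.rglob("*"):
        if not path.is_file():
            continue
        stat = path.stat()
        rel = str(path.relative_to(root))
        if previous.get(rel) != (stat.st_size, stat.st_mtime):
            changed.append(path)  # new or modified -> content gets read and chunked
    return changed
```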
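
And to make the check routine from point 2 less abstract, here’s a minimal sketch of scheduling it, assuming the Duplicacy CLI is on your PATH and the repository has already been initialised against your storage - the paths are placeholders:

```python
#!/usr/bin/env python3
"""Run Duplicacy's check from a scheduler (cron, Unraid User Scripts, etc.)."""
import datetime
import subprocess

REPO_DIR = "/mnt/user/family-archive"  # placeholder: an initialised repository

def run(args: list[str]) -> None:
    print("$ " + " ".join(args))
    subprocess.run(args, cwd=REPO_DIR, check=True)

if __name__ == "__main__":
    # Fast check: verifies that every chunk referenced by your snapshots exists.
    run(["duplicacy", "check"])

    # Heavier check: downloads and verifies chunk contents, which is what
    # actually catches a rotted chunk. Run it occasionally (here: on the 1st
    # of each month) since it reads a lot of data.
    if datetime.date.today().day == 1:
        run(["duplicacy", "check", "-chunks"])
```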

Long story short: 3 or more copies, at least 1 off-site, is an excellent strategy regardless of the tool - and Duplicacy is pretty robust in dealing with corruption. But always test your backups, often.
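
If you also want a tool-agnostic way of doing that ‘otherwise reading’ on the source archive itself, a simple checksum manifest is enough to tell you whether any file has silently changed since you archived it. A rough sketch - the paths are placeholders and there’s nothing Duplicacy-specific about it:

```python
#!/usr/bin/env python3
"""Build, then periodically re-verify, a SHA-256 manifest of an archive."""
import hashlib
import json
from pathlib import Path

ARCHIVE_ROOT = Path("/mnt/user/family-archive")           # placeholder
MANIFEST = Path("/mnt/user/appdata/archive.sha256.json")  # placeholder; keep it on another drive

def sha256(path: Path) -> str:
    """Hash a file in 1 MiB chunks so large videos don't sit in RAM."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        while chunk := f.read(1 << 20):
            h.update(chunk)
    return h.hexdigest()

def build() -> None:
    """Record a checksum for every file under the archive root."""
    manifest = {
        str(p.relative_to(ARCHIVE_ROOT)): sha256(p)
        for p in sorted(ARCHIVE_ROOT.rglob("*")) if p.is_file()
    }
    MANIFEST.write_text(json.dumps(manifest, indent=2))

def verify() -> int:
    """Re-hash every file; report anything missing or silently changed."""
    bad = 0
    for rel, expected in json.loads(MANIFEST.read_text()).items():
        path = ARCHIVE_ROOT / rel
        if not path.is_file():
            print(f"MISSING  {rel}")
            bad += 1
        elif sha256(path) != expected:
            print(f"CORRUPT  {rel}")
            bad += 1
    return bad

if __name__ == "__main__":
    if MANIFEST.exists():
        raise SystemExit(verify())  # non-zero exit if anything looks off
    build()
```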


This is great, thank you for your reply - this clarifies how Duplicacy handles this scenario. I can rest easy now that I know Duplicacy won’t overwrite with bad data.

From reading previous posts on this forum, it has been suggested multiple times that Duplicacy shouldn’t really be ‘responsible’ for data and backup integrity. Despite this, the erasure coding feature was implemented to detect and correct chunk corruption - a nice addition overall, albeit somewhat outwith the scope of the tool. I lean more towards letting the filesystem or environment handle data corruption at the core rather than delegating it to Duplicacy: it probably saves space and likely performs more robustly, especially at the filesystem level.

I think I’ll go with ZFS for Nextcloud and the local backups, for peace of mind regarding bit rot. That, in conjunction with an external backup, should be robust enough for my needs at present.

Now to learn more about ZFS, Nextcloud, and setting up Duplicacy for this scenario!