I understand that Duplicacy functions as a file-level backup tool. I’m very interested in leveraging its backup efficiency to keep regular ol’ files backed up well, for anything that is not resident on ZFS pools (those already get robust and efficient replication via zfs send/recv).
I will transition my Linux OS disks to ZFS, so those are handled.
That leaves primarily the Windows OS drives to back up.
I would generally be satisfied with a file-level backup, but I wanted to explore going one step further: having a full safety net for the Windows and macOS OS disks.
It’s possible to make full bootable backups of operating systems with imaging tools, e.g. Veeam on Windows and CCC (among others) on macOS; many choices exist. Regularly imaging the source computers in full consumes some resources (mainly processing time), but I want to see whether it can be made practical.
My question is about evaluating Duplicacy’s performance in this somewhat extreme case: terabyte-scale files, with deduplication of content chunks within a single file. A few tens or hundreds of gigabytes may change between successive versions of a disk image.
The use case is to periodically take full disk images of the macOS and Windows OS disks. The question is about the efficiency of syncing these huge image files regularly.
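To be explicit about what I’m assuming, here is a toy Python sketch of the kind of content-defined chunking I believe Duplicacy uses (a rolling hash picks chunk boundaries from the content itself, so boundaries re-synchronize after an insertion). The window size, average chunk size, and hash below are made-up illustration values, not Duplicacy’s actual parameters:

```python
import hashlib
import os

# Toy content-defined chunking (CDC): declare a chunk boundary whenever a
# rolling hash over the last WINDOW bytes hits a fixed bit pattern, so
# boundaries depend only on nearby content and re-synchronize after an
# insertion. All constants are made-up toy values, not Duplicacy's settings.
WINDOW = 48                        # rolling-hash window, in bytes
MASK = (1 << 12) - 1               # ~4 KiB average chunk size for this toy
BASE = 257
MOD = (1 << 31) - 1
OUT = pow(BASE, WINDOW, MOD)       # weight of the byte leaving the window

def chunk_hashes(data: bytes) -> set[str]:
    """Split data into content-defined chunks and return their SHA-256 hashes."""
    hashes, start, h = set(), 0, 0
    for i, byte in enumerate(data):
        h = (h * BASE + byte) % MOD
        if i >= WINDOW:
            h = (h - data[i - WINDOW] * OUT) % MOD   # roll the oldest byte out
        if i + 1 - start >= WINDOW and (h & MASK) == MASK:
            hashes.add(hashlib.sha256(data[start:i + 1]).hexdigest())
            start = i + 1
    hashes.add(hashlib.sha256(data[start:]).hexdigest())
    return hashes

if __name__ == "__main__":
    original = os.urandom(1_000_000)
    edited = original[:500_000] + os.urandom(64) + original[500_000:]  # 64-byte insert
    old, new = chunk_hashes(original), chunk_hashes(edited)
    print(f"{len(old & new)} of {len(new)} chunks in the edited file already exist")
```

The point of the toy demo is that an insertion only disturbs the chunks around the splice; everything downstream keeps its hashes, which is the behavior I’m hoping scales to terabyte-sized images.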
- One huge full disk image per computer is generated and stored locally, or say on a ZFS-backed Samba share on my NAS.
- Let’s say I keep one copy of each, and once a week the full disk image gets replaced in place. So some of the content changes, but it should be only a fraction of the overall image file’s content.
- To implement the offsite copy of a 3-2-1 backup scheme, I could use ZFS send/recv replication to keep a second ZFS pool synchronized offsite (in Backblaze, let’s say), OR I could implement this with Duplicacy into Backblaze (see the back-of-envelope sketch after this list).
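Here is the back-of-envelope arithmetic behind that worry; the 1.5TB image size, ~50GB of weekly churn, and gigabit uplink are placeholder assumptions of mine, not measurements:

```python
# Back-of-envelope egress estimate for the offsite copy, per machine.
# All figures are my own assumptions for illustration, not measurements.

IMAGE_SIZE_GB = 1500        # full disk image size
WEEKLY_CHURN_GB = 50        # data that actually changed since the last image
UPLINK_MBPS = 1000          # nominal gigabit uplink

def upload_hours(gigabytes: float, mbps: float) -> float:
    """Hours to upload `gigabytes` at `mbps`, ignoring protocol overhead."""
    return gigabytes * 8 * 1000 / mbps / 3600

print(f"naive full re-upload : {upload_hours(IMAGE_SIZE_GB, UPLINK_MBPS):6.1f} h/week")
print(f"delta-only (dedup)   : {upload_hours(WEEKLY_CHURN_GB, UPLINK_MBPS):6.1f} h/week")
```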
The question here is: if ~1GB changes on a given computer that has 2TB of storage and is therefore producing, say, a 1.5TB full disk image every day, I need Duplicacy to be intelligent enough to transfer only ~1GB to the backup target. The question is simple: will it do this, or will it re-transfer 1.5TB? Obviously we must assume the disk image is neither compressed nor encrypted, so that blocks can match up. As mentioned above, my fibered-up LAN may handle a few terabytes of images getting recorded each day or week, but that would not be practical for offsite egress even on gigabit internet.
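To be concrete about the expectation, here is a toy simulation of the in-place-overwrite case (a raw, uncompressed image where changed sectors keep their offsets). With fixed-size chunks, only the chunks overlapping the changed region should get new hashes; the 64MB buffer and 1MiB chunk size are scaled-down stand-ins, not Duplicacy’s defaults:

```python
import hashlib
import os

# Toy simulation: modify a small region of a buffer in place and count how
# many fixed-size chunk hashes change. Sizes are scaled-down stand-ins
# (a 64 MiB "image" with a 1 MiB change), not real image sizes.
CHUNK = 1 << 20                      # 1 MiB chunks (toy value)

def fixed_chunk_hashes(data: bytes) -> list[str]:
    return [hashlib.sha256(data[i:i + CHUNK]).hexdigest()
            for i in range(0, len(data), CHUNK)]

image = bytearray(os.urandom(64 * 1024 * 1024))   # stand-in for the disk image
before = fixed_chunk_hashes(bytes(image))

# Overwrite 1 MiB somewhere in the middle, at the same offset (no insertion).
offset = 37 * 1024 * 1024 + 12345
image[offset:offset + (1 << 20)] = os.urandom(1 << 20)
after = fixed_chunk_hashes(bytes(image))

changed = sum(1 for a, b in zip(before, after) if a != b)
print(f"{changed} of {len(after)} chunks need to be re-uploaded")
```

If edits shifted later content (insertions/deletions), every downstream fixed-size chunk would change and content-defined chunking like the sketch further up would be needed; but raw disk images don’t move sectors around, so I’d expect either chunking style to dedupe this well.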
Assuming the above is possible and can work, I also want to know what sort of control we might have over it. With ZFS, we would simply replace the images and then use snapshots to preserve past state at desired intervals. We could also (a questionable approach, however) enable deduplication and keep multiple copies of older images around.
I imagine that with Duplicacy, the snapshots going back in time can be explicitly managed, which would be nice. The question is: can we delete intermediate snapshots? For example, if we take one snapshot each week but I want to keep only one per month for anything older than 3 months, I would need to delete the 2nd, 3rd, and 4th weekly backups from each of those older months.
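To spell out the thinning I mean, here is a sketch of the retention math in Python (purely illustrative of the policy; my understanding is that Duplicacy’s prune command with -keep rules is designed for exactly this kind of schedule, but I’d appreciate confirmation):

```python
from datetime import date, timedelta

# Toy retention calculator for the policy I described: keep every weekly
# snapshot for 3 months, then thin to one snapshot per month beyond that.
# Dates and cutoffs are illustrative assumptions.

def thin(snapshots: list[date], today: date) -> tuple[list[date], list[date]]:
    """Return (keep, delete) lists for weekly snapshots under the policy."""
    cutoff = today - timedelta(days=90)          # "older than 3 months"
    keep, delete, seen_months = [], [], set()
    for snap in sorted(snapshots, reverse=True): # newest first
        if snap >= cutoff:
            keep.append(snap)                    # recent: keep all weeklies
        elif (snap.year, snap.month) not in seen_months:
            seen_months.add((snap.year, snap.month))
            keep.append(snap)                    # newest weekly of an old month
        else:
            delete.append(snap)                  # 2nd/3rd/4th weekly of an old month
    return keep, delete

if __name__ == "__main__":
    today = date(2024, 6, 1)
    weeklies = [today - timedelta(weeks=w) for w in range(26)]  # half a year
    keep, delete = thin(weeklies, today)
    print(f"keep {len(keep)} snapshots, delete {len(delete)}")
```

My understanding is also that each Duplicacy snapshot is just a list of chunk references, so deleting the intermediate revisions would only free the chunks that no remaining revision still references.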