Contacting tech support

I am considering Duplicity.

Can I use my AWS Snowball as a source to bring back 50 TB of files, excluding duplicates and sorting them by year?

Asher

I am using a late-2013 Mac Pro.

Or could I use the multiple hard drives on my desktop that also contain the original data, and copy just the unique files to a new RAID or a series of large drives, organized by year?

Asher

http://opf8.com

I suppose you mean Duplicacy…

Not sure what you mean by “bring back”. Before you can bring back anything with Duplicacy, you first need to back something up (using duplicacy)…

I want to back up my 50 TB of data. So far I have loaded it onto an AWS Snowball. I still have the Snowball connected to my late-2013 Mac Pro (6-core, 3.5 GHz, 16 GB of RAM). I also have all the original HDDs connected by either Thunderbolt 2 or USB 3.

I want to copy either from the Snowball or from the original HDDs to a new desktop RAID, with no duplicates and sorted by year.

I am writing a review of using cloud backups and consolidating massive data sets.

Thanks!

Asher

I think duplicacy (and duplicity, and any other backup solution) is the wrong tool for your purpose, if my understanding of what you are trying to accomplish is correct.

Duplicacy is a backup solution that stores historical versions of your data (without wasting space twice if multiple sources (files or computers) contain the same chunks of data) in an opaque container that only duplicacy can read. You can then restore your data to the state it was in at some point in the past. It’s an effective way-back machine.
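To make that concrete, the basic Duplicacy CLI workflow looks roughly like this (the snapshot id and paths below are placeholders):

```sh
# initialize a repository in the folder you want to back up;
# the storage can be a local disk or a cloud bucket
cd /path/to/repository
duplicacy init my-snapshot-id /path/to/storage

# create a backup (stores deduplicated chunks in the storage)
duplicacy backup

# later: restore files to the state recorded in revision 1
duplicacy restore -r 1
```

Note that what ends up under /path/to/storage is chunk containers, not a browsable copy of your files.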

Based on the second paragraph of the quote above, and the fact that you have so far loaded the data onto a DAS, it seems that you don’t need a versioned backup. You just want to collect unique files from multiple drives and consolidate them into a single destination, arranged in a certain way: sorted by year.

What you need is doable with a simple bash script (to rename files at the destination), rsync or rclone to copy the data, and a utility like Araxis’s Find Duplicate Files to get rid of duplicates by content, regardless of file name, in case your data contains multiple instances of the same renamed file; see the sketch below.
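For illustration, here is a minimal sketch of that consolidation step, assuming macOS tools (BSD stat, shasum) and placeholder volume paths; a real 50 TB run would want rsync or rclone for the copying and more careful handling of name collisions:

```bash
#!/usr/bin/env bash
# Minimal sketch, not a finished tool: collect unique files from several
# source drives into year-named folders on a destination RAID.
# Duplicates are detected by SHA-256 content hash, so renamed copies
# of the same file are caught too. All paths are placeholders.
set -euo pipefail

DEST="/Volumes/NewRAID"                          # hypothetical destination RAID
SOURCES=("/Volumes/Photos1" "/Volumes/Photos2")  # hypothetical source drives
SEEN=$(mktemp)                                   # hashes of content already copied

find "${SOURCES[@]}" -type f -print0 |
while IFS= read -r -d '' f; do
  hash=$(shasum -a 256 "$f" | awk '{print $1}')
  grep -qx "$hash" "$SEEN" && continue           # skip duplicate content
  echo "$hash" >> "$SEEN"
  year=$(stat -f '%Sm' -t '%Y' "$f")             # modification year (BSD stat)
  mkdir -p "$DEST/$year"
  # same-named files with different content landing in one year folder
  # would still need the rename step mentioned above
  cp -p "$f" "$DEST/$year/"
done

rm -f "$SEEN"
```

Hashing is the slow part at this scale; grouping files by size first and hashing only the size collisions (which is what dedicated duplicate finders do) would be much faster.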


Thanks so much for your generous and knowledgeable answer.

So the de-duplicated “chunks” are only accessible via the Duplicacy software reassembling the data. That is like “Cloudberry”, so I understand that the storage drive for it is inaccessible to my computer directly. Still, I appreciate the value.
For now, I value to the nth degree your kind help in pointing to other, more relevant software options.

If you come across specialized solutions for backing up and mining terabytes of images, with icons to look at them, I would be happy to know. In the meantime, I am happy studying Araxis, and it does seem a very well-crafted set of tools that I can use.

Asher