Multiple backups to one storage - is this safe?


#1

Or is it better to make a separate storage for each backup?


#2

Multiple backups to one storage

This is how you should be using :d: !
If multiple computers don’t back up to the same storage, you lose quite a lot of the deduplication potential.


#3

I meant something else:
separate backup jobs (for different folders) that all use one storage as the destination.
Right now I use one storage for only one backup (with many versions); the backups of my other folders each go to their own separate destination.
I tried it and it works - in the snapshots folder I see a new folder with the backup name - but won’t the chunks overwrite each other?
Or is it better not to risk it and keep each archive separate?

Sorry for my English - it is not my primary language :wink:


#4

Yes, send them to the same destination to take advantage of deduplication.

Duplicacy deduplicates not only data coming from different folders but also data coming from different machines, as @TheBestPessimist pointed out; this is in fact a core feature that very few backup tools support.

In fact, the design does not use any knowledge of where the data comes from: the data gets split into chunks, and those chunks are referenced in the snapshot file. If a chunk already exists, it will be reused. It does not matter how that existing chunk got there; information about its origin is neither stored nor used.

Here is a description of how it works: Lock Free Deduplication · gilbertchen/duplicacy Wiki · GitHub


#5

Very interesting
I really don’t understand how identical chunks can exist in completely different data files.


#6

And this is precisely how and why it works – chunks with the same name contain, by design, the same data. A chunk’s name is the hash of the data it contains. So if they overwrite each other, no harm is done. In fact, this is how deduplication works (technically there is more to it, but this is enough for a broad explanation) – if a backup run is about to upload a chunk and it is already there, there is no need to upload anything.
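To make the naming idea concrete, here is a minimal Python sketch. It is not Duplicacy's actual code: the `ChunkStore` class is hypothetical, and plain SHA-256 is only a stand-in for whatever hash the real tool uses. It just shows that when the name is derived from the content, a second "upload" of the same bytes has nothing left to do.

```python
import hashlib


class ChunkStore:
    """Toy content-addressed storage (hypothetical, for illustration only)."""

    def __init__(self):
        self.chunks = {}   # chunk name -> chunk bytes
        self.uploads = 0   # how many chunks were actually written

    def put(self, data: bytes) -> str:
        # The name is the hash of the content (SHA-256 is an assumption here).
        name = hashlib.sha256(data).hexdigest()
        if name not in self.chunks:
            # Only chunks that are not in the storage yet get written.
            self.chunks[name] = data
            self.uploads += 1
        return name


store = ChunkStore()
a = store.put(b"holiday photo bytes")  # chunk produced by one backup job
b = store.put(b"holiday photo bytes")  # same bytes from a different job
c = store.put(b"something else")       # different bytes, different name
```

Here `a` and `b` are the same name and only two chunks are ever written, so two jobs sharing one storage cannot clobber each other's data in this model.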


#7

You are right, it’s not possible. But if data files are not completely different, they may well contain identical parts, and that’s where it helps. Media files and zip archives are already compressed, so they rarely share chunks with anything except exact copies of themselves and cannot be compressed much further – if it were possible, the format would already have done it. But a vast number of other files do deduplicate and compress well.

And consider another use case – you back up /Users/greg as one repository. Then you also back up /Users/emily as another. And then you back up /Users as well, because you are a nice admin and care about your users. All three backups run on different or identical schedules and have different retention/pruning settings. But the data from these three backup jobs overlaps completely! And that is not to mention that Greg and Emily have a shared photo album with literally identical files inside. So there will be a lot of shared chunks, and most likely, when you back up /Users after the /Users/greg and /Users/emily backups have completed, no new data will be uploaded, because it’s already there.
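The three-job scenario above can be simulated with a rough Python sketch. The assumptions are mine: fixed-size 4-byte chunks cut per file, SHA-256 names, a plain dict as the storage, and made-up file contents. The real tool splits data differently, so treat this as an illustration of the idea only.

```python
import hashlib

CHUNK_SIZE = 4  # unrealistically small, chosen so the example stays readable


def backup(files, storage):
    """Back up {path: bytes} into storage; return (snapshot, new chunk count)."""
    snapshot = {}
    uploaded = 0
    for path, data in sorted(files.items()):
        names = []
        for i in range(0, len(data), CHUNK_SIZE):
            piece = data[i:i + CHUNK_SIZE]
            name = hashlib.sha256(piece).hexdigest()
            if name not in storage:
                # Chunk is new to this storage, so it has to be uploaded.
                storage[name] = piece
                uploaded += 1
            names.append(name)
        # The snapshot only records chunk names, not where a chunk came from.
        snapshot[path] = names
    return snapshot, uploaded


# Hypothetical file contents; both users keep a copy of the same album file.
greg = {"/Users/greg/album.jpg": b"PHOTODATA123",
        "/Users/greg/notes.txt": b"greg's notes"}
emily = {"/Users/emily/album.jpg": b"PHOTODATA123",
         "/Users/emily/todo.txt": b"emily todo!!"}
users = {**greg, **emily}

storage = {}  # one shared storage for all three backup jobs
_, new_greg = backup(greg, storage)
_, new_emily = backup(emily, storage)
_, new_users = backup(users, storage)
print(new_greg, new_emily, new_users)  # prints: 6 3 0
```

Greg's job uploads everything, Emily's job uploads only her own file because the album chunks are already present, and the /Users job uploads nothing at all.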


#8

Thanks.
I understand now that my whole backup scheme was wrong.