So I have a couple of datasets that are either large blobs or have a high rate of change. Examples include vzdump files from Proxmox VM backups and surveillance videos from cameras. For all my other data, I’m finding dedupe very effective at storing extra versions without much additional storage, but I’m wondering about these datasets. My general prune arguments are as follows, with hourly backups:
`-keep 30:365 -keep 7:30 -keep 1:7`
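For context, that works out to a full prune command like the sketch below. As I understand Duplicacy's `-keep n:m` semantics, each option means "keep one snapshot every n days for snapshots older than m days," so this keeps hourlies for a week, dailies for a month, and weeklies for a year, with one snapshot per 30 days beyond that:

```
# Roughly equivalent full command, run from the repository root;
# -a applies the retention rules to all snapshot IDs in the storage.
# (-keep options must be listed in decreasing order of m.)
duplicacy prune -a -keep 30:365 -keep 7:30 -keep 1:7
```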
I suppose my question is threefold…
- Does anyone know how well Duplicacy's dedupe works on vzdump images? The base VMs are not changing much per day, so assuming the dedupe works well, this may be a non-issue.
- Surveillance video files seem like they wouldn’t have much opportunity for dedupe. Is anyone else backing up this kind of data?
- Assuming either of these doesn't dedupe efficiently, I guess I will need more aggressive pruning… Should I create a separate storage location for these datasets so that their prune settings don't conflict with my general ones above? Or should I use the `-id` option to prune just the video and vzdump IDs more aggressively, then do a second prune pass with `-all` using my general settings above (something like the sketch after this list)?
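To frame that second option, here is a rough sketch of what the two-pass prune might look like. The snapshot IDs `cameras` and `vzdump` are placeholders for whatever the real IDs are, and the aggressive retention numbers are just examples (`-keep 0:m` deletes everything older than m days):

```
# Pass 1: aggressive retention for the high-churn IDs only.
# (Hypothetical IDs; adjust the -keep values to taste.)
duplicacy prune -id cameras -keep 0:30 -keep 1:7
duplicacy prune -id vzdump  -keep 0:90 -keep 1:7

# Pass 2: the general policy across all snapshot IDs.
duplicacy prune -all -keep 30:365 -keep 7:30 -keep 1:7
```

Running the aggressive pass first should be safe, since the looser `-all` rules in pass 2 won't delete anything further from the already-pruned IDs. If you went the separate-storage route instead, I believe the same commands would just take a `-storage <name>` option.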
Thanks in advance for the community's advice!