Backup Immutability - Object Lock support?

For this to work in duplicacy, basically you would have 3 issues to deal with:

#1
Any duplicacy config files that get changed during every backup. This could be fixed by using a new config file copy every time, and deleting the old config files after the lock expires.

For example, instead of using config.data (or whatever), you would use
config.data.00001
then
config.data.00002

It should always copy the highest numbered file to a new one. The old ones can be deleted as their locks expire.

#2
Pruning / deleting files from the backup. Just don’t set any pruning shorter than the lock period. (Or don’t worry about this issue at all and silently ignore lock errors when deleting: every prune will try to delete these files again, and once the lock expires the delete will eventually succeed.)

#3
Keep old files locked somehow. Since duplicacy reuses existing chunks forever, you don’t know how long to lock them for. So, after every prune, it would need to relock any files that are not locked (but ONLY files that are not locked). This would be tricky: some files’ locks would expire some time before the next prune relocked them, and they would be vulnerable during that window. There would instead need to be something to indicate that a file should be allowed to let its lock lapse, with the lock on everything else being continually renewed.
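
For what it’s worth, B2 exposes a b2_update_file_retention call that a relock step could use. A rough, untested sketch of extending the lock on a single chunk (the API URL, auth token, file name and file ID below are placeholders, and the key would also need the object-lock write capability):

    # $B2_API_URL is the apiUrl returned by b2_authorize_account.
    # Compliance-mode retention can be extended but never shortened.
    curl -s "$B2_API_URL/b2api/v2/b2_update_file_retention" \
      -H "Authorization: $B2_AUTH_TOKEN" \
      -d '{
            "fileName": "chunks/00/aabbcc",
            "fileId": "4_zPLACEHOLDER",
            "fileRetention": {
              "mode": "compliance",
              "retainUntilTimestamp": 1767225600000
            }
          }'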

Interestingly, B2 also has “Legal Holds” now in addition to the new locks. I’m not sure how legal holds work within B2, and they don’t seem to be documented anywhere yet, but they could potentially serve as a different kind of locking mechanism.

Interesting note on Legal Holds… I might send their tech support a question on this.

On the other points - yes, it won’t be simple to implement, but should be possible.
I wonder if the prune process could extend the lock on every run for chunks it knows are still needed, and then mark somewhere the chunks that can be deleted on the next run, once their lock expires.
This would require running prune regularly, with a period shorter than the lock…

I just realized that the config file (issue #1) wouldn’t be an issue at all. Just don’t lock that file.

So, really, the only issue is #3: relocking (extending the lock on) all files after each prune and, like you said, noting in another config file which files need to be deleted on the next run.

That could be costly in transactions though (on B2), having to touch every single chunk every single week (or however often) and then update the lock on each one.

Or, just thinking here: what if, instead of all of that, duplicacy used the B2 “hide” command instead of “delete”? B2 has a feature to automatically delete hidden files after xx days. (Or is hide already being used elsewhere in the duplicacy logic?) Then remove delete from your key (and any other unneeded capabilities).

If hide isn’t being used already by the logic, that would be a very simple code change.

Then, the only new feature that would need to be implemented would be a “fix after messing up”: if a virus went in and “deleted” everything (and thus made it all hidden), we would need some way to know what to unhide, programmatically. Make some immutable file that is created after each run with a list of what chunks exist. Then just run that file through an unhide script to “fix” or reverse the virus damage.
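
As far as I understand B2, “unhiding” a file just means deleting the hide marker (the extra zero-byte version whose action is “hide”) with a key that does have deleteFiles, which makes the previous version visible again. A rough, untested sketch of such a recovery script - this one simply strips every hide marker rather than consulting a chunk list, and the API URL, token and bucket ID are placeholders:

    # Remove every "hide" marker so the previous chunk versions reappear.
    # Needs a privileged key (deleteFiles). Single page only - a real script
    # would follow nextFileName/nextFileId to paginate.
    curl -s "$B2_API_URL/b2api/v2/b2_list_file_versions" \
      -H "Authorization: $B2_AUTH_TOKEN" \
      -d "{\"bucketId\": \"$BUCKET_ID\", \"maxFileCount\": 1000}" \
    | jq -r '.files[] | select(.action == "hide") | [.fileName, .fileId] | @tsv' \
    | while IFS=$'\t' read -r name id; do
        curl -s "$B2_API_URL/b2api/v2/b2_delete_file_version" \
          -H "Authorization: $B2_AUTH_TOKEN" \
          -d "{\"fileName\": \"$name\", \"fileId\": \"$id\"}"
      done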


It does seem to me that duplicacy should be using hide instead of delete in the b2 client. This would allow us to use any retention policy we wanted via the bucket lifecycle rules. The app key could be issued without deleteFiles capability as hiding uses writeFiles.
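
The lifecycle part is just bucket configuration. Something roughly like this (untested; the bucket name is made up and newer versions of the b2 CLI may use different syntax) would keep hidden files for 30 days before B2 deletes them for good:

    # Keep hidden (soft-deleted) files for 30 days, then remove them permanently.
    # daysFromUploadingToHiding is null so B2 never hides files on its own.
    b2 update-bucket --lifecycleRules '[{
        "fileNamePrefix": "",
        "daysFromHidingToDeleting": 30,
        "daysFromUploadingToHiding": null
      }]' my-duplicacy-bucket allPrivate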

I don’t think a storage retention policy will help with backup snapshot (version) retention - they are different things.
And “hiding” files won’t help protect them - they are not actually hidden from the API and can still be deleted, even with a “write-only” API key: the b2_list_file_versions and b2_delete_file_version operations are allowed.
The only way you can protect yourself is to use storage which includes read-only snapshots.

I’m assuming your goal aligns with mine which is to keep an attacker that has compromised the machine from also deleting the remote backup using the API key present on the machine.

In b2 the possible ways to remove a file from a bucket are b2_hide_file and b2_delete_file_version. “Hiding” is equivalent to a soft delete. The only other option is to delete by version which is what duplicacy is doing. In b2, “hiding” functions very similarly to the “delete” in the AWS S3 api. In other words, it is supposed to be used for typical file deletion operations. Indeed if you look at the code for another backup tool, duplicity, it is using hide rather than delete. The lifecycle policy will decide when the file is permanently deleted (possibly immediately).

B2 hiding has a different privilege requirement than deleting. When you create the key, you can give it the writeFiles capability, which will allow b2_hide_file (which, again, might result in the file being immediately deleted depending on the bucket lifecycle policy), but not give it the deleteFiles capability needed to use b2_delete_file_version.

https://www.backblaze.com/b2/docs/b2_create_key.html
https://www.backblaze.com/b2/docs/b2_hide_file.html

Now it could be argued that this isn’t an ideal solution as you can’t easily restore to a single point in time with b2 - you would have to write a script that would go through and delete any versions newer than X point in time… and I would agree. But still, this is far more attractive than losing your entire dataset.
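
To make the privilege split concrete: with a key that has writeFiles but not deleteFiles, a hide call like the sketch below (token and bucket ID are placeholders) should be accepted, while the corresponding b2_delete_file_version call should be refused - at least that is my reading of the docs linked above.

    # Soft-delete ("hide") a chunk - only needs the writeFiles capability.
    curl -s "$B2_API_URL/b2api/v2/b2_hide_file" \
      -H "Authorization: $B2_AUTH_TOKEN" \
      -d "{\"bucketId\": \"$BUCKET_ID\", \"fileName\": \"chunks/00/aabbcc\"}"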

My goal is to protect the backup from being targeted by someone trying to wipe it out or corrupt it.

Don’t mix up Duplicacy versions (snapshots) and B2 file versions - these have nothing in common 🙂

By “hiding” a file on B2 you are not protecting it - it remains visible to the b2_list_file_versions call - so an adversary who wants to delete your backup just lists everything with that call and then uses b2_delete_file_version to delete it all, which is an unrecoverable operation.
All this is because the “write-only” API key has both write and delete capabilities.

“Hiding” a file is more of a “cosmetic” operation in B2 - it just removes the file from the “normal” view.
This is why Backblaze had to come up with the “object lock support” referenced in my original post.

Object lock can help, but it is not really compatible with Duplicacy…

I am not mixing them up, and this is not accurate. It is possible to create a key that can call b2_hide_file but not b2_delete_file_version. Please see the documentation I linked above.

Hiding is not a cosmetic operation if the bucket lifecycle policy is set to do anything other than keep all versions perpetually, as the file will eventually be permanently deleted. Hiding is a poorly named soft delete. b2_list_file_names will not return these files, and that is already the call duplicacy uses to get a file listing.

No, you cannot do that, unfortunately. When you create a key, you only have 3 options:

  • Read and Write: most capabilities
  • Read Only: no write/change capabilities
  • Write Only: deleteFiles, listBuckets, writeFiles

b2_hide_file needs writeFiles and b2_delete_file_version needs deleteFiles - both of which are given to “Write Only” API key.

I discussed this with B2 support some time before they announced object lock, and they confirmed that there was no way to secure a bucket from such an attack.

I see the confusion. These are the only options exposed by the b2 web UI. However, you can create a key with specific privileges using their API:

capabilities (required) - A list of strings, each one naming a capability the new key should have. Possibilities are: listKeys, writeKeys, deleteKeys, listBuckets, writeBuckets, deleteBuckets, listFiles, readFiles, shareFiles, writeFiles, and deleteFiles.
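
For example, a raw b2_create_key call that leaves out deleteFiles could look roughly like this (account ID, bucket ID and auth token are placeholders; the CLI equivalent is shown further down the thread):

    # Create a bucket-restricted key that can list, read and write (hide)
    # files, but cannot permanently delete file versions.
    curl -s "$B2_API_URL/b2api/v2/b2_create_key" \
      -H "Authorization: $B2_AUTH_TOKEN" \
      -d "{
            \"accountId\": \"$B2_ACCOUNT_ID\",
            \"keyName\": \"duplicacy-backup\",
            \"bucketId\": \"$BUCKET_ID\",
            \"capabilities\": [\"listBuckets\", \"listFiles\", \"readFiles\", \"writeFiles\"]
          }"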


Ok, then this is something new - as I said, my conversation with their support ended with them confirming there was no way to protect files once write permission was given.
Have you tried creating an API key with only the writeFiles capability? If that is possible, then it should be possible to use Duplicacy and protect the backup from being targeted.

You have to use the B2 CLI tool:

b2 authorize-account
b2 create-key --bucket [bucket-id] [backup-key-name] listBuckets,listFiles,readFiles,writeFiles

Or just call the API directly.
The B2 tool does not like restricted API keys, so you can’t upload with it:

 ConsoleTool cannot work with a bucket-restricted key and no listBuckets capability
 ERROR: application key has no listBuckets capability, which is required for the b2 command-line tool

I already tested this with a small C# snippet and it seems to work - maybe useful for some other project I have.

I think with this it should be possible to get Duplicacy to create sort-of write-only, no-delete backups, although I am not sure whether Duplicacy relies on listFiles…
And you’ll still need to be able to delete files somehow - possibly by running cleanup manually with a different API key.

I didn’t understand your point; the above commands with the CLI tool are calling the B2 API.

I use two keys, one for backup and one for prune.

The “backup” key has the permissions listBuckets,listFiles,readFiles,writeFiles

And the “prune” key the permissions listBuckets,listFiles,readFiles,writeFiles,deleteFiles

The particularity in my case is that I only execute prune manually (and rarely). The “prune key” is encrypted with GPG and I provide the password at run time.

I think the only vulnerability in my case is if ransomware were able to install a keylogger and capture the password I type for the prune key when I use it.
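
In case it helps anyone copying this setup, here is a rough, untested sketch of what such a prune wrapper could look like (the file names are made up, and I’m assuming Duplicacy picks up the storage credentials from the DUPLICACY_B2_ID / DUPLICACY_B2_KEY environment variables - check the docs for your storage name):

    #!/usr/bin/env bash
    # prune.sh - run manually; the prune key only exists in an encrypted file.
    set -euo pipefail
    export DUPLICACY_B2_ID="000placeholder0000000000"   # prune key ID (placeholder)
    DUPLICACY_B2_KEY="$(gpg --decrypt prune-key.gpg)"   # prompts for the GPG passphrase
    export DUPLICACY_B2_KEY
    duplicacy prune -keep 0:360 -keep 30:30 -keep 7:7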

I do exactly the same thing and it works well. The only other thing is to ensure the bucket Life Cycle settings keep prior versions (at least for a while), as writeFiles allows files to be overwritten (which matters if you only keep the latest version). That said, I think this is a requirement for fossilisation anyway.

Should an attacker choose to mess up all the chunks, at least I’d have a copy. Albeit some DIY scripting would be needed to restore a backup from previous versions of chunks.


This may work if you can run prune manually on a regular basis - not really a good option in every case. I still think read-only snapshots on the storage side are a better solution.
I also prefer a setup where Duplicacy creates a local backup (on a NAS or removable drive) and then a separate job (on a central NAS) pushes the data to cloud storage using rclone.
It looks like rclone also supports “soft delete” now; I need to figure out what would be a minimal set of key capabilities for rclone to properly mirror a Duplicacy backup to B2, including prune, and let the lifecycle rules take care of the “soft deleted” files.
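
If anyone wants to experiment with that: as far as I can tell, rclone’s B2 backend hides files instead of deleting them unless you pass --b2-hard-delete, so a mirror job could look something like this (remote name and paths are made up):

    # Mirror the local Duplicacy storage to B2. Files deleted locally (e.g. by
    # prune) become hidden versions on B2, and the bucket lifecycle rule
    # decides when they are permanently removed.
    rclone sync /mnt/nas/duplicacy-storage b2remote:my-duplicacy-bucket \
      --fast-list --transfers 8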

So you are saying that outside of pruning, duplicacy doesn’t need to delete anything? Including metadata?

@towerbr is right indeed. Also linking to another thread where we talked about this in the context of Backblaze. I think I came to the conclusion that some of this is a happy accident of how Duplicacy works on b2 storage.


Thanks! I was looking for exactly this topic to quote here, but I couldn’t find it - I was probably not using the right words in the search.