How secure is duplicacy?

I fully agree with your comment: as long as the storage is accessible from the client machine, only some sort of write-only mechanism on the storage machine itself may help against typical malware/client-breach/password-reuse scenarios. This is THE major problem with push backups to storage backends and a real show-stopper for many backup strategies.

I’m currently using duplicacy to back up to a storage on a FreeNAS box, which gives me the possibility to “freeze” the storage by creating ZFS snapshots of it. In case of a breach on the client machine, the attacker could delete the whole storage and I would still be able to roll back to a ZFS snapshot from a point in time when the storage was intact.
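A minimal sketch of that kind of “freeze”, assuming the storage lives on a dataset called tank/duplicacy (the dataset name and schedule are made up; FreeNAS can also manage periodic snapshots through its UI):

# root crontab on the FreeNAS box: snapshot the backup dataset every night at 02:00
0 2 * * * zfs snapshot tank/duplicacy@daily-$(date +\%Y\%m\%d)

# after a breach, roll back to an intact snapshot
# (-r also discards the newer, possibly tampered-with snapshots)
zfs rollback -r tank/duplicacy@daily-20240101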

Another possibility would perhaps be to have a daily task running on the storage system which changes the permissions on the storage files to read-only for the backup user and does a chown so that the backup user cannot change the permissions back. As I understand it, this would only affect “prune” commands, because “backup” commands only add files to the storage.

1 Like

On Linux you should be able to set the immutable attribute: chattr +i backup/. Root is required to do that, and you would only reset it for the period when prune is running (perhaps running the prune on the storage server itself).
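A rough sketch of that, run as root on the storage server (the storage path is a placeholder):

chattr -R +i /srv/duplicacy-storage      # lock the whole storage tree
# when a prune is due, temporarily lift the attribute, prune, and re-lock:
chattr -R -i /srv/duplicacy-storage
# ...run the duplicacy prune from a repository on this server...
chattr -R +i /srv/duplicacy-storage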

After some time, I have rearranged my backups. I’m now using duplicacy to back up from NAS1 to NAS2 via SFTP, where the SFTP user is called backup.

Inspired by your suggestion of the immutable attribute, a root cronjob on NAS2 executes the following commands from within the storage directory:

chown -R root:backup *
chmod -R 755 *
find . -type d -print0 | xargs -0 chmod 1775

The last command sets the so-called sticky bit on all directories of the storage. With it in place, only the owner of a file, which after the chown is root, can delete or rename files within those directories. Because the backup user is in the backup group, NAS1, connecting as the backup user, can still add new files during the next duplicacy backup run.
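For reference, one way to schedule this is a small root-owned script driven by cron (the script name and storage path are hypothetical; the script is just the three commands above, run from inside the storage directory):

#!/bin/sh
# /root/lock-storage.sh: re-lock the duplicacy storage on NAS2
cd /mnt/tank/duplicacy-storage || exit 1
chown -R root:backup *
chmod -R 755 *
find . -type d -print0 | xargs -0 chmod 1775

And the root crontab entry:

0 3 * * * /root/lock-storage.sh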

The prune command runs as root on NAS2. For this purpose I had to set up a local repository (source) which is more or less empty but is connected to the same storage (target) that NAS1 backs up to. Using “prune -all” makes it possible to prune the snapshot revisions created by the NAS1 backups.
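In case it helps, a rough outline of that dummy-repository setup on NAS2 (the paths, the snapshot ID nas2-prune, and the -keep values are just placeholders):

mkdir -p /root/prune-repo && cd /root/prune-repo
duplicacy init nas2-prune /mnt/tank/duplicacy-storage   # same storage NAS1 backs up to
duplicacy prune -all -keep 0:360 -keep 30:30 -keep 7:7  # example retention policy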

Hope this helps some people set up a backup that survives a malware attack or a breach.

Any further suggestions are always welcome.

4 Likes

One suggestion for cloud-based storage is to provide an access key which doesn’t have delete permissions. I only use Backblaze B2, so I don’t know if this works on others (in part perhaps because of the versioning you have to set up on Backblaze).

If your “normal” key can’t delete files (or even overwrite them), then your data can stay safe. I have a “prune” key which has delete permissions but I only use that interactively and have a script which requires that I unlock it with my PGP key. As such, it never sits around on disk unencrypted and the only time my backups are deleted (pruned) is when I do it. Of course this means I can only prune my storage manually.

I’m sure other cloud storages may have to be set up a bit differently, and the key permissions tweaked accordingly, but the above works for me. It’s probably not perfect, but it gives some protection from someone extracting my Backblaze API key from the system and reusing it to delete backups.

Does it work with rename/move operations such as during chunk upload? If you allow rename, then a malicious actor can just rename all the files into one. And without allowing rename, duplicacy can’t guarantee atomicity. What am I missing?

Since Backblaze doesn’t have a rename function as I recall, things like fossil collection are handled by hiding the file (which would still be permitted). I don’t know about the upload operation (I would have to check the code). Also, on Backblaze you need to keep all versions (or at least for 7+ days as I recall), so any overwritten chunks are not lost forever. Only a delete operation (which only my prune key can do) would permanently change/delete data.

This may be nothing more than a happy accident of using Backblaze as the storage, but it’s been working without issue for a few weeks. All my backups, checks, and prunes work as expected. To be fair, I haven’t tested every edge case. And if you try to prune with the “backup only” key (which you wouldn’t purposely do anyway, but I wanted to test what happens if someone tried), Duplicacy throws some errors; that’s not a problem, though, as the data are still there and can be recovered.

Update: The chunk upload isn’t a concern since it doesn’t involve a rename operation (Interrupted upload to cloud).

Looking into it more, this really does seem like an accidental feature of needing to keep “All versions” of files in Backblaze (to support the two-stage fossil collection). That said, if a cloud storage handled “create” and “modify” permissions differently, you might be able to craft keys accordingly (I don’t know much about the other cloud storage offerings). Backblaze doesn’t have such a distinction: “write” means both creating a file and modifying a file. In this case, the multiple versioning saves the day. Fortunately, it doesn’t actually increase storage usage, since a chunk should never change (other than being marked a fossil, a.k.a. hidden) during its lifetime before being deleted (pruned).

1 Like

How did you create a key that can write but cannot delete?

I only see these options in B2:

[screenshot of the B2 application key options]

And when I create a key with the writeFiles permission, it always comes with the deleteFiles permission.

You have to use the B2 API to assign specific capabilities (Application Keys). The website doesn’t let you get that granular. The easiest way is to use their B2 command line tool (Get the Command-Line Tool).

Using the b2 cli tool:

b2 authorize-account
b2 create-key --bucket [bucket-id] [new-backup-key-name] listBuckets,listFiles,readFiles,writeFiles
b2 create-key --bucket [bucket-id] [new-prune-key-name] listBuckets,listFiles,readFiles,writeFiles,deleteFiles

Just be sure to save the Key ID and Key when the tool outputs those values.
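As a minimal sketch of how the two keys can then be fed to Duplicacy (for the default storage the environment variable names are DUPLICACY_B2_ID and DUPLICACY_B2_KEY, as shown further down in this thread; the key values and -keep options are placeholders):

# routine backups with the backup-only key
export DUPLICACY_B2_ID=<backup-key-id>
export DUPLICACY_B2_KEY=<backup-application-key>
duplicacy backup

# manual prune runs with the delete-capable key
export DUPLICACY_B2_ID=<prune-key-id>
export DUPLICACY_B2_KEY=<prune-application-key>
duplicacy prune -all -keep 0:360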

3 Likes

Thanks! Time to regenerate all the keys … :roll_eyes: :laughing:

I think I found a problem with this approach.

If I use a key that is not allowed to delete (to perform the backups), and this key is stored in the keychain / keyring, the prune command will try to use this key as well, and it will obviously cause an error.

Possible workarounds:

  • Create a dummy entry in the preferences file with the -no-save-password option, to run the prunes. Problem: the prune can only be run manually (as mentioned above by @tallgrass). And how do you call prune without it using the “backup key” that is already in the keyring?

  • Save the keys / passwords in the preferences file, with one entry for backup and another for prune. Problem: unsafe.

  • Use environment variables in the script that runs the prune, since “If an environment variable for a password is provided, Duplicacy will always take it.”

I think the third is the way to go, right?

Any other options that I didn’t think of?

This is exactly what I do. Since the environment variables take precedence, I set them at runtime in the script which performs the prune. So that I don’t have a plaintext key sitting around, I keep the key GPG-encrypted and the script is responsible for decrypting it (and using it to set the environment variables) at runtime.

1 Like

I had debated this approach early on. I had considered setting up the same B2 bucket as a different storage in Duplicacy, then backing up to the “default” storage and pruning against the “B2-prune” storage. I believe in this case Duplicacy would not try to use the default storage credentials when accessing the B2-prune storage.
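If anyone wants to experiment with that route, it would roughly look like this (the storage name, snapshot ID, bucket name, and -keep values are all made up):

# register the same B2 bucket a second time under a different storage name
duplicacy add B2-prune my-snapshot-id b2://my-bucket

# back up to the default storage as usual; prune only against the named storage
duplicacy backup
duplicacy prune -storage B2-prune -all -keep 0:360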

Ultimately, I didn’t like where it was headed and it sounded like too much to remember. So instead I settled on the environment variable route.

1 Like

Following up after 6+ months to say I’ve been using the same “two-key” B2 approach with a script to do a weekly prune and check. I run that script manually using my prune key. I have had absolutely NO issues and everything is working just fine. All my prunes and checks behave as expected and show no errors.

For background, this storage receives daily backups from 3 different machines all using the “backup-only” key. Once a week I prune the storage using my “prune key” and keep 1 snapshot a day for the past 30 days, and 1 snapshot a month for the past year beyond that.
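For reference, with Duplicacy’s -keep n:m options (keep one snapshot every n days for snapshots older than m days), that kind of policy translates roughly to flags like these (an approximation, not my literal command):

duplicacy prune -all -keep 30:30 -keep 1:1   # monthly after 30 days, daily after 1 day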

I didn’t anticipate the massive increase in ransomware that we’ve seen, but I sleep even better at night now.

3 Likes

Hi,

Out of curiosity, should the key also have the “shareFiles” capability in order to allow Duplicacy to download?

From B2:

Lets the client create authorization tokens for downloading files.

For application keys restricted to a bucket, only files in that bucket can be authorized.

For application keys restricted to a file name prefix, only files whose name starts with that prefix can be authorized.


Also, do you think the capabilities writeBucketEncryption and readBucketEncryption could do any harm?

No. You aren’t sharing files with third parties; your readFiles capability takes care of the download/restore.

1 Like

Not sure how this works. But since you can enable Duplicacy encryption, with keys you control, I’m not sure there is much gain.

1 Like

Hi,

I would like to try this method.
How did you save the B2 “prune” key to disk?

So you run Duplicacy from another machine to prune, with the same storage init’ed on a dummy repository?

Here’s some of the snippets from my script:

# GPG encrypted file which holds the secrets
# Encrypt the file with "gpg -s -r <recipient> -e <file>"
# The intended use is to store the API keys which have delete permissions.
# We use environment variables because duplicacy will read and prefer those
# over any other storage (like gnome-keyring) which means we can override
# the "backup-only" API keys for the prune operation.
# This file should contain each environment variable on a
# separate line, with a blank line at the end, in the format:
# VARIABLE=value
SECRETS_FILE="${REPO_DIR}/.duplicacy/duplicacy-maint-secrets.gpg"

#########################################
# Get secrets
# This step will require user interaction to perform the decrypt

echo -e "Decrypting secrets from ${SECRETS_FILE}...\n"

# Exit the script if the decrypt fails
DECRYPTED_SECRETS=$(gpg --quiet --decrypt "${SECRETS_FILE}") || exit 1
echo -e "\nSecrets decrypted.  Pausing for 30 seconds for user to inspect signature...\n"
sleep 30

# Export each environment variable so duplicacy can access it
for LINE in ${DECRYPTED_SECRETS}
    do
    export ${LINE}
done

#...script continues; you now have the environment variables set from the GPG-encrypted file, but only after you manually entered the passphrase and confirmed the signature...

And finally, a sample duplicacy-maint-secrets.gpg file (decrypted of course):

DUPLICACY_B2_ID=12345
DUPLICACY_B2_KEY=key12355
2 Likes

Same machine. Same repository. Same storage.

Thank you!!

I will need to study this and try to apply it to my case, because I actually use Duplicacy Web Edition in a container and this seems a little tricky.