Feature Request: support custom B2 download URL in WebUI

So, I opened a pull request for specifying a custom download URL when backing up to B2. This allows you to get free data downloads through Cloudflare. I’m using this now myself and it’s working great.

I figured I could try it with the WebUI by replacing the duplicacy binary and manually adding the keys in the right places to the preferences files. In practice though, it looks like the preferences files are generated as needed and the “keys” section that I add is overwritten every time.

As far as I can see there isn’t a place to add keys and values to the storage configuration anywhere in the WebUI. As the WebUI seems to mostly use the duplicacy binary for all tasks, I think only the configuration interface would need to change to support the custom download URL.

Thank you for your contribution and sorry for not responding earlier. The changes look good but I haven’t had a chance to test them. My only reservation is that this URL should be part of the storage URL instead. For example, maybe we need a new storage prefix such as b2-alt:// or b2-custom://, so the URL will be like b2-alt://foo.example.com/bucket/path.

Ah, I see. Would it be enough to parse the URL with a modified regex?
Right now it’s matching:
^([\w-]+)://([\w\-@\.]+@)?([^/]+)(/(.+))?

b2 looks at groups 3 and 5 for the bucket and path respectively. The current pattern would still work if the URL was specified as b2://custom.url@bucket/path.
Alternatively, a specific pattern for b2-custom:
^b2-custom://([^/]+)/([^/]+)(/(.+))?
That would put the URL in group 1, the bucket in 2, and the optional path in 4.
If that looks good I can make the changes and push an update to the PR.
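
For comparison, here is a minimal Python sketch of how the two patterns quoted above split a storage URL. The helper name `parse_b2_custom` and the example hosts are only illustrative, and since Duplicacy itself is written in Go this is not the PR's code, just a way to see which group holds what.

```python
import re

# General storage pattern quoted above.
GENERAL = re.compile(r"^([\w-]+)://([\w\-@\.]+@)?([^/]+)(/(.+))?")
# Proposed dedicated pattern for b2-custom.
B2_CUSTOM = re.compile(r"^b2-custom://([^/]+)/([^/]+)(/(.+))?")


def parse_b2_custom(url):
    """Return (download_host, bucket, path), or None if the URL does not match."""
    m = B2_CUSTOM.match(url)
    if m is None:
        return None
    # Group 4 is None when no path follows the bucket.
    return m.group(1), m.group(2), m.group(4) or ""


if __name__ == "__main__":
    print(parse_b2_custom("b2-custom://foo.example.com/bucket/path"))
    # The general pattern also accepts the host@bucket form.
    m = GENERAL.match("b2://custom.url@bucket/path")
    print(m.group(2), m.group(3), m.group(5))
```

Note that a URL with no bucket after the host does not match the dedicated pattern at all.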

If I’m looking at this correctly, that would mean the WebUI wouldn’t need to be modified to support this. Even better 🙂 Well, I suppose an interface would be needed to support configuring the new storage type, but this could also be configured manually by editing the json file.

You don’t need a new regex; just use the general one (group 1 is b2-custom, group 3 is the URL, group 5 is the bucket and the path).

Right, for the current web version you can edit the storage URL in the json file. Once this feature gets into a CLI release I’ll update the web version to support it.

Ah, I didn’t think of that. I just pushed an update that does use a new regex. If you’d prefer, I can switch back to the default, though that would then require splitting group 5. Let me know your preference.
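
A rough Python sketch of what reusing the general pattern would involve: group 3 carries the custom host, and group 5 has to be split into bucket and path. The helper name `split_general` is made up for illustration and is not taken from the PR.

```python
import re

# General storage pattern quoted earlier in the thread.
GENERAL = re.compile(r"^([\w-]+)://([\w\-@\.]+@)?([^/]+)(/(.+))?")


def split_general(url):
    """Parse a b2-custom URL with the general pattern, then split group 5."""
    m = GENERAL.match(url)
    if m is None or m.group(1) != "b2-custom" or m.group(5) is None:
        return None
    host = m.group(3)
    # Group 5 is "bucket" or "bucket/path"; split on the first slash only.
    bucket, _, path = m.group(5).partition("/")
    return host, bucket, path
```

The dedicated pattern does this split inside the regex, while the general one leaves it to the caller.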

That is ok. We can always make changes later. I’ll run a test and then merge the pull request.

Just to put the link to the PR:

Just wanted to post an update on my own progress. I finally got around to trying my branch with the web UI. I replaced the command line binary and updated the storage URLs in duplicacy.json to use the b2-custom://b2.example.com/bucket format. I ran my scheduled tasks and restored around 200 MB of data. No data was charged against my B2 account and I could see the spike of data in the Cloudflare interface.

@gchen let me know if I can provide any assistance in your testing or if you want any changes to the PR.

When I try to add an additional storage with my existing b2 id, key, and password but with the custom URL, I receive a 101 error when trying to run a check. I have my CNAME record set up and my worker rule within Cloudflare, but I’m wondering if I’m missing a piece on the Duplicacy side. Could you possibly lay out the steps to get this working or how you did it?

I’m running saspus’ docker container, if that matters.

What is the exact error message?

The error is:

Running check command from /cache/localhost/all
Options: [-log check -storage B2-Cloudflare -a -a -tabular]
2023-01-18 08:57:56.992 INFO STORAGE_SET Storage set to b2-custom://subdomain.mydomain.com
exit status 101

For the config file, I copied my existing storage from the .json and pasted it again in the storage array, edited the URL from b2://mybucket to b2-custom://subdomain.mydomain and left everything else identical. Is that the proper method?

Edit: I just saw arno typing below (thanks, @arno !) and saw the post prior re: bucket URL. I just added the bucket to the end. This is the new error:

Running check command from /cache/localhost/all
Options: [-log check -storage B2-Cloudflare -a -a -tabular]
2023-01-18 09:32:06.968 INFO STORAGE_SET Storage set to b2-custom://mydomain/mybucket
2023-01-18 09:32:07.100 INFO BACKBLAZE_URL download URL is: https://mydomain
2023-01-18 09:32:07.341 ERROR STORAGE_CONFIG Failed to download the configuration file from the storage: https://mydomain/file/mybucket/config missing headers: [x-bz-file-id x-bz-file-name]
Failed to download the configuration file from the storage: https://mydomain/file/mybucket/config missing headers: [x-bz-file-id x-bz-file-name]

I think you’ll still need the bucket at the end of the URL, but I suppose that might also depend on the rule that you set up. Though I don’t think the rule can rewrite the request URL, so I think you definitely need the bucket in there.
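
Judging only from the log above, the custom host is used as the download base and the config file is requested under /file/<bucket>/. Here is a small Python sketch of that mapping. It is inferred from the log lines rather than from Duplicacy's source, `config_download_url` is a hypothetical helper, and the handling of a storage path after the bucket is an assumption.

```python
def config_download_url(storage_url):
    """Build the config URL that a b2-custom storage URL appears to map to."""
    prefix = "b2-custom://"
    if not storage_url.startswith(prefix):
        raise ValueError("not a b2-custom storage URL")
    host, _, rest = storage_url[len(prefix):].partition("/")
    rest = rest.strip("/")
    if not host or not rest:
        # Same situation as the first attempt above: no bucket after the host.
        raise ValueError("expected b2-custom://host/bucket[/path]")
    # Assumption: any storage path simply follows the bucket.
    return "https://{}/file/{}/config".format(host, rest)
```

With the values from the second log, `config_download_url("b2-custom://mydomain/mybucket")` gives the same URL that Duplicacy reported trying to fetch.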

This sounds like something is misconfigured with Cloudflare. I can’t be sure, but x-bz-file-id and x-bz-file-name are referenced on their page about image hosting.

The only thing I can think of is to double check your configuration. I couldn’t find the Backblaze guide that I followed, but Archive.org has it: https://web.archive.org/web/20220625041259/https://help.backblaze.com/hc/en-us/articles/217666928-Using-Backblaze-B2-with-the-Cloudflare-CDN

Oh!….
Did you follow the image hosting guide and write rules to remove the x-bz headers? I wonder if that is tripping up duplicacy?

Is your bucket public? I was following the guide to set up a worker instead of a page rule because I have a private b2 bucket. How to allow Cloudflare to fetch content from a Backblaze B2 private bucket – Backblaze Help

It is indeed public. I wonder if the worker isn’t returning all the headers. Can you try fetching your config with curl to see what is returned?

curl -I https://mydomain/file/mybucket/config

HTTP/2 200 
date: Wed, 18 Jan 2023 18:53:46 GMT
content-type: text/html
cf-ray: 78b9890e9b19dbb2-LAX
report-to: {"endpoints":[{"url":"https:\/\/a.nel.cloudflare.com\/report\/v3?s=hSffh05gnfoCWggb0BWAbA3DuKaix2cxKMh8sJT790vwQ%2F4tN32PthXEFCHGsb0pTtUCTjCrZoA2akoHzPIlYhJQkANLP3k77EJ5tCnhxda07S0gmFwRr9u7lT2lxu3x8Q%3D%3D"}],"group":"cf-nel","max_age":604800}
nel: {"success_fraction":0,"report_to":"cf-nel","max_age":604800}
server: cloudflare
alt-svc: h3=":443"; ma=86400, h3-29=":443"; ma=86400

Not sure on the details but a 200 should be good, right? Edit: Well, actually no, this is CF content, not duplicacy config. Hm

200 is good, but the content-type is wrong. Mine is application/octet-stream.
What does the actual content look like? I’d guess that your worker isn’t configured appropriately, but I don’t have any experience to troubleshoot that. I didn’t pursue using a private bucket as I already had the public version.
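
To make that check repeatable, a small Python sketch can send the same HEAD request as the curl command and report what is missing. The two header names come from the Duplicacy error message above, and application/octet-stream is the content type reported for a working public bucket; the function names are only for illustration.

```python
import urllib.request

# Headers named in the Duplicacy error message above.
REQUIRED = ("x-bz-file-id", "x-bz-file-name")


def missing_b2_headers(headers):
    """Return the required B2 headers that are absent from a header mapping."""
    present = {name.lower() for name in headers}
    return [name for name in REQUIRED if name not in present]


def check_url(url):
    """Send a HEAD request; return (status, content type, missing headers)."""
    req = urllib.request.Request(url, method="HEAD")
    # urlopen raises urllib.error.HTTPError for 4xx and 5xx responses.
    with urllib.request.urlopen(req) as resp:
        content_type = resp.headers.get("Content-Type")
        return resp.status, content_type, missing_b2_headers(resp.headers.keys())
```

Calling `check_url("https://mydomain/file/mybucket/config")` against the response shown above would presumably report a 200 with text/html and both x-bz headers missing, which is consistent with the worker returning its own page rather than the B2 object.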

Is there any reason you went with a public bucket, or rather is there any disadvantage to it? I’m using key-based encryption within duplicacy, so if this is easier, perhaps I’ll use a public bucket.

I had already been using a public bucket because… yeah. It’s just how I set it up in the first place and everything is already encrypted so I haven’t seen the need to make any changes. The one disadvantage is I think it’s technically possible for someone to discover the B2 URLs and run up my bill by exfiltrating data. I have caps in place to catch that before it gets out of control and I don’t really see it as the most likely attack, but it’s at least possible.

The other “image hosting” method is also intriguing, so if I get it working by that route (with and without private buckets and workers) I’ll post my findings.