You can have successful all-inclusive models and unsuccessful pay-as-you-go models; there is no direct dependency there. All-you-can-eat restaurants? Can be sustainable. Unlimited-bandwidth internet connections? Can be sustainable. Unlimited-minute mobile plans? Can be sustainable.
My internet provider has offered unlimited bandwidth plans for many, many years, and there is no indication they are going out of business any time soon. In fact, unlimited bandwidth for end users is more the norm than the exception nowadays. Are heavy users being subsidized by light users? Sure, just like people buying things on sale are being subsidized by people who pay full price.
And if you think that just because you're paying à la carte your services cannot be terminated or substantially changed, well… let's just say there is no universal law that prevents it from happening. You may want to check out the Nirvanix story; it's dated but still quite relevant. Some clients had to extract petabyte+ datasets on two weeks' notice.
I personally treat all storage as unreliable, and public cloud is no exception. As far as I am concerned, any single storage can disappear in its entirety at any particular moment for whatever reason, and my overall infrastructure should be resilient to such events.
I'm looking at this from a value-to-the-consumer perspective, not the service providers' ability to make a profit. And I'm not saying that a pay-as-you-go model guarantees service immortality. All of the above that you mentioned are examples of models that are horrible for the consumer, for two main reasons:
These models unfairly penalize the many light users to subsidize a few heavy abusers.
Service provider incentives are not aligned with those of the customer: the service provider is incentivized to prevent you from using the resources you paid a fixed price for, because the less (or slower) you use, the more they earn. In contrast, in pay-for-what-you-use approaches, the more (or faster) you use, the more they earn.
You can wrap it any way you want, but this conflict is fundamental and unavoidable.
I'm beginning to think that Amazon S3 is probably the better option for me. I just want to store a weekly full system backup on a Sunday (~75GB compressed) and daily incremental backups (~2GB each), and hold two weeks' worth. So about 175GB in total, which I hope never to have to download for a full system restore. Trouble is that the backup software I use doesn't appear to support S3. It supports OneDrive (Personal and Business), Dropbox and Google Drive, but that's it. Mmmmmm.
I may just think about backing up to Unraid and then letting Duplicacy back that up to S3. But this means that if my backup on the NAS was unavailable, I would potentially need to download a full week's worth of backups to something else to restore my system, instead of just restoring directly from the cloud. Never easy and straightforward, is it?!
Just wish my testing of the Duplicacy process, even with S3, was faster for a potential restore than the 8.5MB/s I experienced. Perhaps a bit of throttling on a "free" 5GB account?
In your experience, does the download speed look any faster on a paid-for S3 tier?
There must be a typo there somewhere, so I assume 200GB. Since you only want to store it for a very short time, the AWS "S3 Standard - Infrequent Access" tier seems the best fit, which results in 200GB * $0.01/GB/month = $2/month, plus API cost (see Amazon S3 Simple Storage Service Pricing - Amazon Web Services).
This is so weird. What software is this?
You can back up with Duplicacy to Unraid, and replicate with another instance of Duplicacy from Unraid to the cloud. Then, if you need to restore and Unraid is gone, you can initialize Duplicacy against that cloud destination locally and restore directly. No need to download everything in full.
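A rough sketch of that setup with the Duplicacy CLI, in case it helps (the snapshot ID, host names, paths and bucket are placeholders, and the exact storage URL formats are worth double-checking against the storage backends guide):

```
# In the repository (e.g. on the PC): the first storage is the Unraid NAS over SFTP
duplicacy init my-pc sftp://backup@unraid-nas/duplicacy-storage
duplicacy backup -stats

# Add a copy-compatible cloud storage and replicate to it (this copy step can
# just as well run on the NAS with its own instance of Duplicacy)
duplicacy add -copy default s3 my-pc s3://us-east-1@amazon.com/my-backup-bucket/duplicacy
duplicacy copy -from default -to s3 -threads 10

# Disaster recovery with the NAS gone: initialize a fresh repository directly
# against the cloud storage and restore a revision from there
duplicacy init my-pc s3://us-east-1@amazon.com/my-backup-bucket/duplicacy
duplicacy restore -r 1 -threads 10
```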
How many threads? With S3 you can use 10, 20, 40, etc. threads, and you are only limited by your ISP connection. Duplicacy has a built-in benchmark. What does it report?
Unless the daily changes are new compressed music files, images and/or videos, Duplicacy's deduplication will reduce the total storage requirement quite a bit.
A traditional 7-day round-robin schedule of 1 full + 6 incremental backups isn't needed with Duplicacy. Your very first full backup will likely be less than the estimated ~75GB. Each daily incremental afterwards will likely be less than the estimated ~2GB, and might even shrink over time depending on the type of data.
One of the huge advantages of a chunk-based deduplicating backup tool like Duplicacy is that only the very first run is a full backup. All successive uploads to the same storage destination, while technically "incremental" backups, are effectively full backups because duplicate chunks are reused. Any snapshot, including the first one, can be pruned at any time.
Nope, unfortunately reliable backups rarely are.
I haven't compared Amazon S3, but it certainly is the case for Google Drive and Microsoft OneDrive (I have business accounts for both at work).
Given that your storage requirements are around 100GB (75GB + 13 * 2GB), as an alternative to OneDrive / Google Drive / S3, consider rsync.net.
Rsync.net offers a special pricing tier for advanced users willing to pay annually (https://rsync.net/products/borg.html): 200GB of data costs $36/yr with no ingress/egress charges or bandwidth caps (the minimum charge is for 100GB = $18/yr, $0.015/GB/month thereafter).
I helped a friend set up a backup to rsync.net. No "buckets" or other opaque storage format and no special API, just standard SSH + SFTP access to a Linux account that you can store data on however you want.
A good and reliable solution must be simple and straightforward. Otherwise, you can't trust it. If at any point you feel the arrangement is becoming too cumbersome, it's time to stop, reassess, and likely start over.
I've looked at it. rsync.net may still have its uses, but it's not a good fit for backup in general, or for the OP in particular, for several reasons:
You have to pay for storage upfront, regardless of whether you use it or not, and therefore pay for wasted, unused space.
There is a 680GB minimum order.
The cost of their geo-redundant tier is comparable to the most expensive AWS S3 tier.
No egress fees: with backup you rarely, if ever, need egress, and yet you are indirectly paying for other people's egress.
No API fees: there is a very small number of calls involved in uploading backup data, and yet you are indirectly paying for other people's use of the infrastructure.
They provide an interesting solution with interesting features, but those features will be wasted if it is only used as a backup target. In fact, if you look closely, their main selling points ("What Makes rsync.net Special") are not special at all… Since they started in 2001, a lot has changed. It's hard to compete with Amazon, Google, and Microsoft.
I've just explained it right there. "Free" is an illusion. Traffic costs money. Not charging a specific customer for it means the aggregate cost is rolled into the storage cost. For the backup use case there is very little egress. Hence, the customer will be paying for other users' egress. Or, to put it another way, part of the payment will go to cover other users' egress, and as a result the customer will receive less value for the services provided, a.k.a. overpaying. The same goes for the infrastructure/API costs.
In other words: "free" things, as always, are the most expensive ones.
LOL, that's getting better and better. But I'll play ball and extend the argument. So both Google Cloud Storage and AWS charge for storage and bandwidth, and are right up there in terms of how things should work, right? But wait, Google Cloud is losing a ton of money as a business unit; this means if you use GC you're leeching off ad customers who have to pay higher ad rates to support your GC usage. Not fair.
AWS is the opposite; it is quite profitable. But wait, that means you're supporting other businesses that lose money, like Amazon's investment in Rivian electric trucks/vans. Not fair again!
saspus:
> There must be a typo there somewhere, so I assume 200GB. Since you only want to store it for a very short time, the AWS "S3 Standard - Infrequent Access" tier seems the best fit, which results in 200GB * $0.01/GB/month = $2/month, plus API cost (see Amazon S3 Simple Storage Service Pricing - Amazon Web Services).
I'll take a look. Cheers!
saspus:
> This is so weird. What software is this?
EaseUS Todo Backup, their "Enterprise" version, even though I just use it for my personal PC.
saspus:
> You can back up with Duplicacy to Unraid, and replicate with another instance of Duplicacy from Unraid to the cloud. Then, if you need to restore and Unraid is gone, you can initialize Duplicacy against that cloud destination locally and restore directly. No need to download everything in full.
I'm afraid I got into the habit of using EaseUS to do a full system/drive image backup, including the UEFI partition. To "restore" the entire system, EaseUS has a Pre-OS environment installed that can restore straight to the UEFI and boot partitions in their entirety. Not sure how I could achieve that with Duplicacy if there is a non-functioning system drive to sort out!
saspus:
> How many threads? With S3 you can use 10, 20, 40, etc. threads, and you are only limited by your ISP connection. Duplicacy has a built-in benchmark. What does it report?
It was 4 threads. Interesting; I'll test with more and see what happens!
That's irrelevant, and misses the point entirely. We can discuss this too, but it's a whole separate topic.
In that comment, I'm talking about the value for money the service provides to its customer. Anything "free" there, and/or rolled into a "fixed" cost, is a very poor value for the majority of users, by design; that's the whole point of doing it. With itemized invoices, it's much harder to screw the user over.
The profitability of a specific company is not the topic of this discussion.
Duplicacy is not designed for full-system bare-metal backup.
Generally, if you need a full system backup, a hybrid approach is advised: infrequent bare-metal backups (say, monthly, after major system updates or changes, or never) and frequent user-data-only backups (say, hourly). That way, system data does not compete with user data for space and bandwidth, and user changes don't sit in the queue behind bulk system backups.
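A minimal sketch of what such a split could look like on a Linux/Unraid box using cron (paths, users and the imaging script are placeholders; on Windows the equivalent would be Task Scheduler plus your imaging tool of choice):

```
# /etc/cron.d/backup-schedule  (hypothetical sketch; adjust paths, users and IDs)

# Frequent user-data-only backup: hourly Duplicacy run from the user-data repository
0 * * * *  backup  cd /mnt/user/documents && /usr/local/bin/duplicacy backup -stats

# Infrequent bare-metal image: 03:00 on the 1st of each month, using whatever imaging
# tool you prefer (placeholder script; EaseUS/Clonezilla/etc. have their own invocations)
0 3 1 * *  root  /usr/local/sbin/make-system-image.sh
```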
It barely costs anything at these scales! Charging for "API" calls is even sillier, though you can understand why AWS et al. do it with their services: big businesses operate at a much, much larger scale, where ingress/egress and transactions ramp up along with storage size and need to be time-critical, so they actually have a tangible cost at those levels.
This isn't necessarily the case at smaller scales, with home users and smaller providers. Quite frankly, it's really up to the provider to juggle that, and for the consumer to stop conning themselves into a worse deal over some questionable principles. This mentality is the reason certain countries don't have internet connections with unlimited bandwidth as the de facto standard.
Anyway, the fact that Google Cloud Storage and Google Drive exist in the same Workspace product proves this isn't a downside.
I agree. Those "principles" are a justification or explanation of what happens, not the ultimate goal in themselves. And yet, experience suggests that the overall quality and value of the service received strongly correlates with the alignment of incentives, as described above. So, as a shortcut, to avoid scrutinizing every service in every respect (which is often impossible to do, and even if you did, it may change tomorrow), you can go by these rules of thumb. As a result, you will end up with a better deal.
For a specific example: people who think they found a "deal" getting 6TB for $5/month through Office 365 discover, after uploading a large chunk of data, that performance sucks, the API has bugs, and Microsoft won't do anything about it. Any savings evaporate at that instant.
Disagree. With Workspace, you pretty much get storage for free: storage is not the product, it's incidental to the SaaS they are offering, the collaboration and management platform. If you recall, they never used to enforce storage quotas there (I don't know whether they do now); it was effectively "unlimited". Would it be wise to use a Google Workspace account as a replacement for GCS or AWS S3? Hell no.
But… but, I was doing it myself! Yes, I was dumb, and learned from my mistakes, so you don't have to.
I agree entirely with the general notion. We're all seeking the Holy Grail of backup solutions.
For the average user, it usually means itās on by default and just happens without any effort (e.g., iOS and Android devices backing up to their respective cloud services).
For more sophisticated users, it's most often a USB drive sitting on a desk or in a drawer. The level of complexity ranges from simple drag-and-drop to some software solution.
And then there's you, me and the other folks on this and similar websites with more advanced requirements…
For us, the journey to backup nirvana begins with deciding on which path to take (offline, DAS, NAS, cloud, or some combination thereof?); the simplicity of drag-n-drop, disk imaging or software with a multitude of backup options; and sifting through all of the service providers to find out what best meets our needs (speed, cost, compatibility, reliability, etc.).
At the end of the day, it takes us a lot of work to make our backup solution(s) look straightforward, simple, reliable, and good all at the same time.
That particular minimum only applies to the default service plans. There's a special "Borg" service plan (not limited to the Borg backup software) with a 100GB minimum ($18/yr).
While it's true that storage is paid for up front, other than the relatively small annual minimum ($18 isn't a whole lot of income for a service provider after factoring in 1%-6% for the card-issuing bank + network fees + merchant bank fees), additional storage is billed in 1GB increments at $0.015/GB/month ($0.18/GB/yr), and storage that's added/removed is prorated for the remainder of the year, so wasted space can be kept to a minimum.
For a user with 5GB to back up, there are definitely cheaper options than rsync.net if the cost of storage is paramount. But it's also why Dropbox currently has 700 million users but only 17 million paying customers (< 2.5%) and an $8.6 billion market cap while carrying over $3.2 billion in debt and other liabilities on its balance sheet. The former has been profitable while the latter is still a work in progress.
The main course plus sides might not be the best value meal ever, but it's still among the lowest overall tabs out there (it's almost dinner time).
Also true, but the standard plan might be sufficient for many users, especially for those who follow a 3-2-1 backup protocol.
Given the storage requirements @rjcorless estimated, if rsync.net's datacenter got hit by a nuke, as long as one of the devices being backed up is outside of the blast radius, there's likely plenty of time to make another backup.
I honestly don't think rsync.net is actively trying to unseat any of the big three. It'd be a futile exercise. But rsync.net doesn't have to out-compete them in order to be a sustainable business.
There's Google Photos and many other free alternatives, and yet SmugMug has a loyal paying customer base (its only free option is a 14-day trial). I used to be a SmugMug customer and would be again if the need arose, because it was a good value.
The same goes for Fastmail, which is competing with Gmail, Outlook/Hotmail and Yahoo. I have family and friends who'd balk at paying even $1/yr for email service, but enough people pay for Fastmail to have sustained it since 1999.
For sure, rsync.net won't be everything to everyone, but it's certainly something to someone for it to have lasted more than two decades so far. It's good to have a variety of options to choose from.
Sorry, I've seen no evidence here that this is the case. Unnecessarily paying for egress and API calls doesn't make something a "better deal", and it's certainly not a rule of thumb to actively go by.
Google Drive is perfectly fine as a backup destination. No download caps or costs, no API charges. Why would any sane home user choose GCS over GCD when it fits the bill and costs significantly less?
Well, I just did a test on S3 with 10, 20, 30 and 40 threads. Looks like 30 is the optimum for my bandwidth, as I could do a restore at ~15MB/s.
Can anyone tell me how to run the benchmark test? Because (i) going to either an Unraid console or the Duplicacy console resulted in a "Duplicacy not found" message, and (ii) once I found that "duplicacy_linux_x64_2.7.2" was the program to run, I got init errors and could not work out what to do, as I am using the Web GUI.
I am not CLI proficient and probably know enough to be really dangerous!! Ta.
One may come to that conclusion with the naive approach of only optimizing "storage cost on paper". But that is not the only factor. If you consider the whole package, i.e. how much time and money it costs to run Duplicacy against GCD, the cost of GCS becomes negligible in comparison.
It goes back to using the right tool for the job. GCD is not designed for bulk storage. GCS, on the other hand, is specifically designed for it. So one should expect fewer issues with the latter and more with the former.
And indeed, have you noticed I stopped posting here with technical problems quite a while ago? Have you seen all the issues I reported with GCD? The blog posts I've written? The amount of time I spent triaging them and building workarounds would have covered the cost of GCS tenfold for decades. And that is GCD, one of the better ones, one that is at least possible to make work. I gave up on OneDrive in an hour. An hour I will never get back, mind you. I value that hour, let alone the amount of time I spent triaging GCD issues, much more than the aggregate cost of years of storage on Amazon.
Have you seen any issues anyone reported with S3 or GCS that were not configuration issues? I haven't. I wonder why.
And lastly, a plot twist, invalidating the false premise of "GCD being cheap" even on paper itself: Google Drive is not cheaper than proper S3 archival-tier storage over the lifetime of a backup. Duplicacy does not support archive tiers, and thereby forces users to overpay for hot storage. Backup in hot storage is an oxymoron. The middle ground is S3 Intelligent-Tiering, but this requires reading up on and analyzing long-term costs. The end result would be massive time and money savings for home users who predominantly back up static media. This is something that Acrosync could do with duplicacy-web: wrap archival storage into an end-user product, drastically saving money for customers. But that's a whole other topic.
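To illustrate the Intelligent-Tiering middle ground, here is a rough sketch of a bucket lifecycle rule that moves Duplicacy chunks into the INTELLIGENT_TIERING storage class (the bucket name and prefix are made up; verify that mixing storage classes is safe for your workflow before applying anything like this):

```
# lifecycle.json - hypothetical rule: transition objects under chunks/ to Intelligent-Tiering
cat > lifecycle.json <<'EOF'
{
  "Rules": [
    {
      "ID": "duplicacy-chunks-to-intelligent-tiering",
      "Status": "Enabled",
      "Filter": { "Prefix": "chunks/" },
      "Transitions": [
        { "Days": 0, "StorageClass": "INTELLIGENT_TIERING" }
      ]
    }
  ]
}
EOF

# Apply it to a (made-up) bucket with the AWS CLI
aws s3api put-bucket-lifecycle-configuration \
  --bucket my-duplicacy-backups \
  --lifecycle-configuration file://lifecycle.json
```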
You need to run it in the folder where the storage is initialized. Ideally, run it on your desktop, to remove Unraid from the picture entirely.
Otherwise, go to the backup, or check the logs in the Web GUI; in the first few lines there will be a path to a temp folder.
You need to cd to that folder and run duplicacy benchmark (with the full path to the executable) in there.
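Something along these lines (the repository path and binary location are only examples; use whatever paths your Web GUI log actually shows, and the thread flags are optional):

```
# cd into the temp repository folder the Web GUI uses (example path; check your backup log)
cd ~/.duplicacy-web/repositories/localhost/0

# Run the benchmark with the full path to the CLI binary
~/.duplicacy-web/bin/duplicacy_linux_x64_2.7.2 benchmark -upload-threads 10 -download-threads 10
```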
I'm only running Duplicacy on the Unraid at the moment, not on my PC. Because I want as quick a restore process as possible, I'm using a combo of EaseUS Todo for the "bare metal" backup, and I've also set it up to run a "smart" backup every 30 minutes across all my documents, storing locally on another (non-OS) drive in my PC (this runs incremental backups throughout the day, then as midnight rolls over it creates a differential backup for the entire previous day and restarts incrementals for the next day). Then I use SyncBackPro to transfer to my Unraid NAS at the end of the day. I was just going to use Duplicacy to provide the cloud-based backup from the Unraid server.
SyncBackPro can save directly to the cloud as well, including S3 (which Todo can't).
As you may be able to tell, my "backup" strategy is a bit of a patchwork quilt, precipitated by me building my first NAS with Unraid on an old repurposed Sandy Bridge motherboard with an Intel 2500K and 16GB of RAM. An oldie but a goodie! It has a very stable overclock of up to 4.5GHz, great cooling and 8 SATA ports. Perhaps my strategy is more a reflection of ignorance than of necessity!