Backup of Google Drive to local

I would like to have a local backup of my Google Drive account saved to my NAS. I’m attempting to set up Google Drive under Storage configuration. I have the token successfully downloaded, but when I get to Directory, it won’t allow me to select the root of the Google Drive (which I assume would be a blank field); unless I choose a subfolder, it won’t let me proceed. Is this a limitation of Duplicacy? I see plenty of posts on backing up TO Google Drive (presumably to a folder within the main Drive) but nothing about backing up Drive to local, and nothing specifically about backing up to or from the root directory of Drive. Thanks for the help!

Duplicacy is intended to back up local data to a remote destination, and it does not support reading from Google Drive (unless you sync or mount it locally first; but then it becomes just a local directory).

If your NAS is Synology, check out Active Backup for Google Workspace.

Or, to use Duplicacy: mount the Google Drive with rclone mount and then configure a backup of the mounted folder with Duplicacy as if it were local.
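A rough sketch of the mount side (the remote name and mount point are placeholders; you’d configure the remote with rclone config first):

```
# "gdrive" is a previously configured rclone remote; /mnt/gdrive is hypothetical
rclone mount gdrive: /mnt/gdrive --read-only --vfs-cache-mode full --daemon
```

Then point Duplicacy at /mnt/gdrive as if it were any other local folder.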

@saspus thank you! These are really great points. My NAS is Synology, and I’ve played around with Cloud Sync, but of course it is not a true backup. I will look into Active Backup as a potential solution. Interestingly, I also use rclone regularly, but more for server-side backups and transfers, so mounting is another great idea, since the Drive would then behave as if it were local.

I made an attempt to back up Drive based on the Google Drive app mounting the cached folder in Finder, but that was riddled with various errors, and it really was not the best approach, IMO. A bit clunky and, to our point, not the intended use for either the Drive app or Duplicacy.

If you want to accomplish that on the NAS (and I’d recommend against it) – you could use Cloud Sync along with periodic filesystem snapshots for versioning, if Cloud Sync weren’t such an unbearably unstable and unreliable piece of work. No, seriously, it leads my personal top-100 rating of crappy software across the board. (That is as of DSM 6.2.4; I got rid of all my Synology boxes shortly after.) That said, you could use any other sync software – like rclone sync – to keep local data in sync with the cloud, and then Btrfs snapshots will provide cheap and lightweight deep versioning. You don’t really need anything else.
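Roughly, the idea looks like this (remote name, paths, and snapshot location are assumptions; on Synology you’d normally let Snapshot Replication handle the snapshots rather than calling btrfs by hand):

```
# One-way mirror of the Drive remote into a local folder on the NAS
rclone sync gdrive: /volume1/GoogleDriveMirror --fast-list

# A read-only Btrfs snapshot of that folder gives you cheap versioning
btrfs subvolume snapshot -r /volume1/GoogleDriveMirror \
    /volume1/snapshots/GoogleDriveMirror-$(date +%Y%m%d)
```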

A bit off-topic, but that’s precisely what I replaced my park of Synology DiskStations with: a Google Workspace account mounted locally with cache. Better performance, better reliability, and much, much cheaper. It has worked very well for me for almost a year now.

IMO it is the best approach – it involves first-party apps that are guaranteed to keep working no matter what and that you don’t need to maintain. Google Drive (formerly Google File Stream) recently switched from FUSE to SMB mounts (at least on macOS; not sure about Windows), and this seems to have improved stability. With that, your Google Drive is just a mounted SMB volume. You can back it up with any tools of your choice, including Duplicacy, to the NAS. Another advantage of this approach is that you don’t run any software on the stripped-down and concussed version of Linux that DSM is, which only works for a narrow set of Synology use cases; realistically it’s only good for SMB/SFTP/iSCSI, not much for running third-party apps. I would strongly suggest not using DSM for anything but those basic services: the amount of opaque shim layers they have inserted into the open-source components they use is terrifying and does not inspire confidence.
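As a sketch, backing the mounted volume up to the NAS could look something like this (the mount path, snapshot id, and SFTP storage location are illustrative, not a recipe):

```
# Run from inside the mounted Google Drive volume (path varies by Drive version)
cd "/Volumes/GoogleDrive/My Drive"

# Point the repository at SFTP storage on the NAS ("user", "nas", and the
# storage path are placeholders)
duplicacy init gdrive-mirror sftp://user@nas/duplicacy-storage

# Run the backup; -stats prints a summary at the end
duplicacy backup -stats
```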


When you say you mount it locally with cache, do you use rclone? Or are you using the native Drive app (formerly Drive File Stream)? So far, I have tried doing rclone sync via the mounted Google Drive (G Suite Business) folder, and I’ve also tried with Google Drive configured as an rclone remote. I got slightly different transfers each time, which was disconcerting. Looking at the logs, I was also getting various IO and similar errors. And that is why I have been interested in Duplicacy, to see how it works with Google, in case the results are more consistent and stable.

I was able to do a small test in Duplicacy yesterday, with the Drive app mounting it as a volume. Did a test restore and it looked good, but maybe rclone is the way to go with Google. And I’ll just use Duplicacy for my automated local HDD backups to the NAS.

Both. Initially I used rclone with the --vfs-cache-mode option, but it relied on FUSE for the VFS part; so did GFS/Google Drive (they shipped with their own version of osxfuse). I needed encryption on top, so I picked rclone using a Google service account from a custom project and rclone crypt for encryption. It worked really well.
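Roughly, the layering was something like this (remote names and paths are illustrative; the real config had more options):

```
# ~/.config/rclone/rclone.conf (abridged)
[gdrive]
type = drive
scope = drive
service_account_file = /path/to/service-account.json

[gdrive-crypt]
type = crypt
remote = gdrive:encrypted
password = <obscured by rclone config>

# Mount the encrypted view with a local VFS cache
rclone mount gdrive-crypt: ~/GoogleDrive --vfs-cache-mode full
```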

However, I really, really don’t like having to install third-party kernel extensions, so once Google Drive switched to SMB – I moved to it instantly. The benefit being that, besides not needing kexts, it’s a first-party app, so it will keep working and supporting it is not on me.

What do you mean? Client-to-virtual-disk transfers (rclone was faster, but both are fast enough) or to the cloud (which is irrelevant; all that is needed is that the sync occurs occasionally)?

Do you have logs? IO errors are usually the result of bad media.

It works using the public Google Drive API. Just like rclone does. Just like GFS does. There are no issues there. If you see issues with rclone but not Duplicacy, it does not mean that Duplicacy is magic – it means you still have issues; Duplicacy just hasn’t hit them yet. I would start by triaging those issues.

I would strongly recommend against doing that: if you back up to a virtual drive, then the backup is no longer deterministic: when it is “done”, the files are not yet safe in the cloud, and there is no telling when they will be.

Instead, have Duplicacy back up directly to Google Drive (as in, the Google Drive web service). If performance is not great – create your own Google project and issue credentials from there. But that is rarely a problem; it’s a backup tool, performance is irrelevant.
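For example, something like this (the snapshot id and the folder on Drive are hypothetical; init will prompt for the token file, whether it came from gcd_start or from your own project):

```
cd /path/to/data/you/want/backed/up
duplicacy init my-documents gcd://Backups/duplicacy   # prompts for the Google Drive token file
duplicacy backup -stats
```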

Rclone and Duplicacy are both written in Go and use the same Google libraries to talk to Google Drive. One is a backup tool, the other a sync tool. I’m not sure I understand how you can use one in place of the other.

Ah, you have HDDs. That might explain your IO failures.

When using rclone sync to sync the Google Drive (set up as an rclone remote) to a local folder, I’d get, say, 1234 files transferred to the local folder. When I did the same Google Drive sync to local folder_2 (for testing purposes), it would give me, say, 1245 files transferred. It would report 100% checks, 100% transferred in both cases. It would also report a few errors (I checked those and they were permission errors, because I have a few shared spreadsheets that are saved to my Drive but that it cannot create a copy of). Still not sure of the discrepancies; I need to study the logs a bit more, or do some more tests.

Yes, I did look at the logs. I actually had issues where it would not delete items/folders in the destination, which is the expected behavior if it encounters IO errors. The logs reported the following:

ERROR : Local file system at /Volumes/Test-Restore3: not deleting files as there were IO errors

But other than that, I don’t see any other errors aside from the reported issues with downloading/opening a few files due to permissions, and the occasional rate-limiting error from the Google API. (Maybe that is enough to prevent deletion at the destination? Still looking into this.) I have rclone running on a Mac with an SSD; my Test-Restore share is on the NAS, which is mounted via SMB3. Typical IronWolf drives in that.

Interestingly, I’m using a personal Google account to do these tests, so it’s the standard API. On my Enterprise Google account, I do have my own credentials. I should probably give that a try instead, but those folders are much larger.

I wouldn’t use them interchangeably. I’d just have a different workflow. All I really want is a copy of my files, stored elsewhere; if I use rclone to sync it occasionally so my destination is up to date, that would be fine. Or, I could use Duplicacy and back up occasionally.

I think this is easiest, most stable, and makes the most sense. And great point about the virtual drive. This may account for some of the issues I’ve been having.

Ah, those could be metadata files (e.g. .DS_Store) that maybe got created while you were looking at the folder :). I would not worry about a few files here and there. Or just delete all dot files and compare the folders, e.g. with diff /Volumes/folder1 /Volumes/folder2.
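Something along these lines should do for the comparison (the -x patterns just cover the usual Finder metadata):

```
# Recursive compare, ignoring macOS metadata files; -q only lists what differs
diff -rq -x '.DS_Store' -x '._*' /Volumes/folder1 /Volumes/folder2
```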

With Google documents it’s complicated. You can have rclone export them (see --drive-export-formats), and perhaps there is a similar option for Google Drive. I don’t know much about this – I don’t use Google productivity tools myself.
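For instance, something like this should export the Google-native documents as Office files while syncing (the format list is illustrative; check the rclone docs for what is supported):

```
rclone sync gdrive: /volume1/GoogleDriveMirror --drive-export-formats docx,xlsx,pptx
```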

Ah, IO errors on a mounted filesystem. Was that FUSE or SMB? I found FUSE to be very finicky on macOS (I assume your OS is macOS because of the whole /Volumes thing). I started seeing quite a few issues with it once I updated to a macOS beta, and I don’t want to play catch-up with them, so I’ve stopped using FUSE altogether.

Ah, so you rclone from the cloud to /Volumes/Test-Restore(n), which is mounted via SMB from your NAS? (Is the NAS a Synology, by any chance?)

It’s not so much about the API as about the Google project that the credentials are issued from. If you use gcd_start to generate the token, then it was issued by an Acrosync-owned project that shares resources with all the other Duplicacy users who went that route. You can create your own project, create credentials there, and use those instead to access your Google Drive, if you are hitting some limitations.

You could run rclone sync (or even rclone copy) directly on the NAS to keep a local copy of the Drive content, and then enable periodic filesystem snapshots for versioning (if your NAS supports ZFS or Btrfs). That would be cheap and performant. And you do want versioning to safeguard against corruption/ransomware/etc.; just a copy of the data is not very useful as a backup. Then you can take it even further and back up this local copy to another third-party cloud with Duplicacy.
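As a sketch of the schedule (crontab syntax for illustration; on DSM you’d more likely use Task Scheduler, and all paths are assumptions):

```
# Nightly mirror at 03:00, snapshot half an hour later (% must be escaped in crontab)
0 3 * * *  rclone sync gdrive: /volume1/GoogleDriveMirror --log-file /volume1/logs/rclone-gdrive.log
30 3 * * * btrfs subvolume snapshot -r /volume1/GoogleDriveMirror /volume1/snapshots/gdrive-$(date +\%Y\%m\%d)
```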

Oh right, those pesky files, you’re probably correct! I like a perfect match in terms of file count, but I should worry less and trust the check/verify.

The biggest difference I’ve seen between rclone and Duplicacy is that rclone will not transfer the Google documents (but does export them to .xls or .doc); Duplicacy will back up and restore the .gdoc/.gsheet documents, but they are zero bytes. Correct modification time and name, just not actual, active files. Interesting nonetheless! Honestly, Google’s very frustrating in this regard :slight_smile:

Yep, it is via SMB and the NAS is Synology, mounted in macOS.

Ah, gotcha – I will look into this. The slower speed doesn’t bother me much, but I’d certainly like to avoid those limitation errors.

Great suggestion, and you’re right about the versioning of course. It is BTRFS and I have versioning enabled. I guess in theory, that’s all I need :slight_smile: And I like the suggestion of another cloud backup eventually, though I have only just started this backup journey. Perhaps one day, Wasabi or B2 for me.


Duplicacy should probably skip those altogether – there is no point in backing them up – they are just short XML files with a URL pointing to the document stored in the cloud database. (Synology Office took the same approach – documents are just links.) There is a separate set of APIs to export the documents into a file – and that’s what rclone (and Synology Cloud Sync’s one-way sync option) does: they explicitly export the document when encountering the .gdoc/.gsheet files.

I think it is by design. Google wants to host and own everything, and this also helps with ecosystem lock-in. Neither Apple (Numbers/Pages) nor Microsoft OneDrive (Word/Excel) does that – they sync the actual documents. That’s one of the reasons I don’t use Google anything. Just Drive. Because it’s cheap :slight_smile:

In DSM 6.2.3 Update 2, Synology broke something in SMB. It became ridiculously unstable, at least with macOS. More information here: https://www.reddit.com/r/synology/comments/j7s4qk/afp_vs_smb_not_as_clearcut_anymore/

I don’t trust Synology to get things right. They make invasive changes in the open-source projects they use and break things way too often. You can try switching to AFP instead (it is ridiculous that the open-source netatalk server, based on reverse-engineering of the protocol, happens to be more stable), or better yet (since AFP seems to be slowly going away) – NFS. It is too old and too simple to be easily breakable.
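If you go the NFS route, mounting from macOS is roughly like this (share path and options are assumptions; Synology NFS usually needs the resvport option when mounting from a Mac):

```
sudo mkdir -p /private/nfs/backup
sudo mount -t nfs -o resvport,vers=3 nas:/volume1/Backup /private/nfs/backup
```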

The way I manage my data: everything I care about lives in iCloud. This gets backed up to Google Drive with Duplicacy (which runs in the background via launchd). Everything I don’t care about that much – like my media library, old projects, some temporary stuff, and other less important data – lives on the mounted (encrypted via Cryptomator) Google Drive, in the same account, because why not.
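The launchd side is nothing fancy; a minimal sketch (label, paths, and interval are illustrative, not my exact setup):

```
# Illustrative launchd agent that runs a Duplicacy backup every hour
cat > ~/Library/LaunchAgents/local.duplicacy.backup.plist <<'EOF'
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0"><dict>
  <key>Label</key><string>local.duplicacy.backup</string>
  <key>ProgramArguments</key><array>
    <string>/usr/local/bin/duplicacy</string>
    <string>backup</string>
  </array>
  <key>WorkingDirectory</key><string>/Users/me/Documents</string>
  <key>StartInterval</key><integer>3600</integer>
</dict></plist>
EOF
launchctl load ~/Library/LaunchAgents/local.duplicacy.backup.plist
```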

So there is a copy of the data in iCloud (which itself offers limited recovery) and an (encrypted) backup on Google.

B2 and Wasabi are maybe OK too, and a lot of people use them; I personally dislike Wasabi for their past shenanigans, and with B2… they allow silly bugs to sneak in, and judging by the software quality of their Personal/Business line of products, I don’t feel like trusting them with important data. But those are my personal, very subjective thoughts.