How to set up pruning in the Web GUI?

I would like to set up Duplicacy on 8 Windows computers (small company) to back them up completely to Google G Suite Drive (business plan). As a test I have installed the Duplicacy Web GUI on one PC, where I would like to back up 2 internal drives and 1 external drive, about 4 TB in total. Although there is supposedly unlimited space available on Google Drive, I would still like to have automated pruning, with the backup process running once per day.

I was wondering how I can set this up. Do I just add all hard drives in the "BACKUP" tab, run a manual backup once and then add a schedule? Can I simply add a schedule to run this backup at a 24h frequency and insert "duplicacy prune -keep 0:360 -keep 30:180 -keep 7:30 -keep 1:7" into the "Option" field? Or is there a better way; should pruning be a separate job from the backup schedule?

I was also wondering what happens if the computer is not turned on when the backup task should run. Will it run as soon as the computer turns on, or will it be skipped? And what happens if the computer is rebooted while the backup task is running?

The "Option" field is for options passed to the backup command. You should add a separate prune job to the schedule and set the -keep parameters there.
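
For reference, each -keep n:m option means "keep one snapshot every n days for snapshots older than m days" (n = 0 deletes such snapshots outright), and the rules must be listed with m in descending order. Your proposed options therefore read as:

-keep 0:360    delete snapshots older than 360 days
-keep 30:180   keep 1 snapshot every 30 days if older than 180 days
-keep 7:30     keep 1 snapshot every 7 days if older than 30 days
-keep 1:7      keep 1 snapshot every day if older than 7 days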

When the computer turns back on, the schedule will not run immediately. Instead it will run at the next scheduled time.

If the computer is rebooted while the backup task is running, you'll have some chunks uploaded to the storage. Next time the backup is run, these chunks will not be re-uploaded. If the source files didn't change, the backup will basically resume from where it left off. Otherwise, you may have some chunks uploaded but not referenced by the resumed backup. To get rid of these chunks, you can add the -exhaustive option to the prune job.
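
For example, the prune job's Options field could then contain something like this (reusing the -keep rules you proposed; note that -exhaustive makes prune enumerate every chunk in the storage, so it takes longer and is typically run occasionally rather than daily):

-keep 0:360 -keep 30:180 -keep 7:30 -keep 1:7 -exhaustive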

If I run the scheduled backup once per day, how often should the prune job run? Would it be appropriate for it to run once per day? Should it run before or after the backup process? What time window should there be between the prune job and the backup job? Can they run in parallel?

I have seen that there is a dedicated option available for prune in the Web GUI. These are the default settings:
Delete snapshots older than 1800 days
Keep 1 snapshot every 7 days if older than 30 days
Keep 1 snapshot every 1 days if older than 7 days

This seems to be different from the "-keep" option, where 4 rules are available. Should I put anything into the "Options" field at all, or is it enough to just fill in the numbers of days and save the job? Should I tick the option "Run this job in parallel with other jobs"?

Running prune once per day should be fine. It can run at any time, in parallel with the backup job or not. But I would not recommend running multiple jobs in parallel unless you have to, because that might consume too much CPU or network resources.

For the initial settings of the prune job you can enter anything; after you save the job, you can click the Options field and change the rules to the ones you prefer.
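
For what it's worth, my reading is that the three GUI defaults above correspond to these -keep options (double-check in the Options field after saving the job):

-keep 0:1800 -keep 7:30 -keep 1:7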

Thanks for the explanation. I guess I will just have to check how long the prune takes so that it will not overlap with the backup process.

Nevertheless, I have tried to do a test run and back up the whole C:\ drive, where Windows is installed as well. The Duplicacy Web GUI was backing up for 3 days with 3 days of remaining time; however, it now fails every time I run it manually, so it never finishes. Here is the last log file:

Running backup command from C:\Users\raaan/.duplicacy-web/repositories/localhost/0 to back up C:/
Options: [-log backup -storage raaan -stats]
2019-12-17 18:06:19.174 INFO REPOSITORY_SET Repository set to C:/
2019-12-17 18:06:19.175 INFO STORAGE_SET Storage set to gcd://Duplicacy
2019-12-17 18:06:23.823 INFO BACKUP_START No previous backup found
2019-12-17 18:06:23.823 INFO BACKUP_INDEXING Indexing C:\
2019-12-17 18:06:23.824 INFO SNAPSHOT_FILTER Parsing filter file \\?\C:\Users\raaan\.duplicacy-web\repositories\localhost\0\.duplicacy\filters
2019-12-17 18:06:23.824 INFO SNAPSHOT_FILTER Loaded 0 include/exclude pattern(s)
2019-12-17 18:06:25.874 WARN LIST_FAILURE Failed to list subdirectory: open \\?\C:\Config.Msi: Access is denied.

///////////////// 1000 lines with "Access is denied" and "The system cannot find the file specified."

2019-12-17 18:38:23.206 WARN OPEN_FAILURE Failed to open file for reading: open \\?\C:\Users\raaan\AppData\Local\Microsoft\Internet Explorer\CacheStorage\edb.log: The process cannot access the file because it is being used by another process.
2019-12-17 18:38:24.933 ERROR CHUNK_MAKER Failed to read 0 bytes: read \\?\C:\Users\raaan\AppData\Local\Microsoft\Outlook\info@raaan.com.ost: The process cannot access the file because another process has locked a portion of the file.
2019-12-17 18:38:35.783 INFO INCOMPLETE_SAVE Incomplete snapshot saved to C:\Users\raaan\.duplicacy-web\repositories\localhost\0/.duplicacy/incomplete
Failed to read 0 bytes: read \\?\C:\Users\raaan\AppData\Local\Microsoft\Outlook\info@raaan.com.ost: The process cannot access the file because another process has locked a portion of the file.

Please advise, what should I do?

As you're backing up the root drive, you'll encounter a lot of in-use system files…

You have a lot of warnings, but ultimately it errors out trying to back up C:\Users\raaan\AppData\Local\Microsoft\Outlook\info@raaan.com.ost. Which, again, is in-use, but it wasn't able to skip it for some reason.

You can either add *.ost to the filters (double-check the syntax; I'm not sure off the top of my head how wildcards work yet in Duplicacy), OR run the backup with the -vss option flag (click the little - under the Options column for the backup and put -vss, without quotes). (You could probably also exit Outlook temporarily just to get the backup done, but those .ost files can be quite big and are unnecessary to back up anyway.)
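
If you go the filter route, an exclude regex may be the safest bet, since Duplicacy's filters file also accepts e:-prefixed regular expressions - a sketch, so verify it against the filters documentation before relying on it:

e:(?i)\.ost$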

However, in order to use -vss, you have to run Duplicacy with elevated privileges. So if you haven't done so already, quit Duplicacy Web Edition from the icon tray and re-run it by right-clicking and choosing Run as administrator. (You can change the desktop icon to always run as admin, too.)

Personally, I would re-install Duplicacy Web Edition in service mode and -vss should work, though it's a bit of a faff around.

I personally see no sense in backing up the entire disk. You will be backing up temporary files (cache, etc.), you will have a lot of problems with files in use (such as the error above), your backup will take forever, and you will waste storage space.

And if the goal is to restore the entire disk in the event of a complete failure, chances are you won't be able to do that for your operating system and applications.

It makes a lot more sense to back up your personal files only and, in the event of a complete hard drive loss, reinstall the OS and applications on a new disk and restore the backup of the personal files. It's faster and more effective.

I totally agree with you, though I will say there's a certain degree of reassurance in selecting the top-most directory / drive as the root for your backup and then applying filters - knowing that you got everything, and knowing exactly what you don't need…

Personally, I back up from C:\Users\Droolio (as well as various other folders on other data drives). There's quite a lot of in-use stuff in there I don't care for, but I only bother to filter out the stuff that eats a lot of space. Going through all those directories is rather tedious, so rather than picking out just the Docs or Music folders, say, it's best to capture more and then exclude later.

For instance, I could change it to C:\Users and ensure any future backups automatically include new Windows user profiles. By changing it to C:\, I preclude the day I forget to adjust the backup because C:\Important_Stuff has since been created and it's the only copy in existence. Or I install a legacy app which stores its data in C:\Program Files (I still see quite a lot of that).
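
As a sketch of that capture-more-then-exclude approach, a filters file for a repository rooted at C:\ might hold exclusion-only patterns like these (the paths are hypothetical examples and the syntax is from memory - check the filters documentation; trailing slashes match directories and * does not cross path separators):

-Windows/
-Program Files/
-Program Files (x86)/
-Users/*/AppData/Local/Temp/
e:(?i)\.ost$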

BTW, -vss should be capable of backing up everything, although some things might be problematic - e.g. PST files.

PST files are always problematic… :laughing:

I use another approach (and have for many years now): I don't use the default folder C:\Users. Instead, I created a folder C:\Files, all my personal files are there (documents, media, etc.), and I back up this folder. This avoids problems with temporary files, files in use, etc., and keeps files more "isolated".

I have uninstalled Duplicacy and installed it again as a service via CMD. I am still not sure if it is running as a service or not, and I am also not sure if I need to run Duplicacy as an admin when I open the app. I have moved the C:\Users\<user>\.duplicacy-web directory into C:\ProgramData\.duplicacy-web and it asked me to enter the decryption password, but it does not state anywhere that it is running as a service.

I also haven't found any settings for VSS in the Web GUI. Should I just add "-vss" to the scheduled task's options for the backup? Are there any other pros/cons of VSS besides it being able to back up files currently in use?

What method would you then recommend to get the whole C: drive backed up and stored in the cloud? Macrium Reflect can definitely do that flawlessly offline, but I find it a bit too expensive to buy two pieces of software for all the computers.

Not sure what cmd you used, but v1.1.0 of the Web Edition installer now asks to install as a service at the end of the wizard, IF you run the installer as admin. Check in services.msc to see whether Duplicacy is listed there.
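
If you'd rather check from a command line, something like this should tell you (I'm assuming the service is registered under the name Duplicacy - services.msc will show the actual name):

sc query Duplicacy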

This is normal, I believe.

To see if the Duplicacy service is running, put 127.0.0.1:3875 into your web browser. At the moment, there isn't a companion tray icon for the service, so this is how you start Duplicacy - NOT via the desktop shortcut, which would run a second instance, and you don't want those to conflict. If you see Duplicacy in the icon tray, right-click and Quit.

Yes, exactly that. Click the '-' under the Options column.

The main pro is that you can back up in-use files. There's no real disadvantage except for a very small delay at the beginning of the backup job while it takes a shadow copy. That should only take a few seconds, but if you have a lot of data, maybe a minute or so?

As @towerbr correctly highlighted, it isn't possible for Duplicacy to capture the entire state of a working OS. You just won't be able to restore a working OS from the system files. Concentrate on irreplaceable user-created data.

Macrium Reflect is a different kind of backup tool - it images an entire drive, and you generally have limited options to exclude files/folders from such images, at least on the main OS drive, because the techniques are officially unsupported. In this regard, it's more space-efficient to run file-based backups with something like Duplicacy, because you can keep a bigger version history. Image-based backups tend to be more cumbersome, as the maintenance of full vs incremental backups is a big overhead.

(Personally, I use both Duplicacy and Veeam Agent for Windows for image-based backups - so both methods are good to have.)

Thanks for all the help and explanation. Duplicacy is now running as a service; I didn't know that the Duplicacy service automatically creates an .exe process when the service starts. I have also added the -vss option and excluded some system folders with a list that I found in another thread.

I have now noticed, though, that when I manually start the backup process, it finishes with a "completed" status in a few minutes and doesn't actually back up anything. Please see the last two logs below:

Running backup command from C:\ProgramData/.duplicacy-web/repositories/localhost/0 to back up C:/
Options: [-log backup -storage raaan -vss -stats]
2019-12-21 16:13:14.653 INFO REPOSITORY_SET Repository set to C:/
2019-12-21 16:13:14.654 INFO STORAGE_SET Storage set to gcd://Duplicacy
2019-12-21 16:13:18.498 INFO BACKUP_START No previous backup found
2019-12-21 16:13:18.500 INFO VSS_CREATE Creating a shadow copy for C:\
2019-12-21 16:13:26.711 INFO VSS_DONE Shadow copy {9BF8DF64-6861-4993-BA6E-50B4920863F1} created
2019-12-21 16:13:26.713 INFO BACKUP_INDEXING Indexing C:\
2019-12-21 16:13:26.713 INFO SNAPSHOT_FILTER Parsing filter file \\?\C:\ProgramData\.duplicacy-web\repositories\localhost\0\.duplicacy\filters
2019-12-21 16:13:26.714 INFO SNAPSHOT_FILTER Loaded 10 include/exclude pattern(s)
2019-12-21 16:13:26.828 INFO INCOMPLETE_LOAD Incomplete snapshot loaded from C:\ProgramData\.duplicacy-web\repositories\localhost\0/.duplicacy/incomplete
2019-12-21 16:13:26.828 INFO BACKUP_LIST Listing all chunks
2019-12-21 16:17:25.547 INFO FILE_SKIP Skipped 26 files from previous incomplete backup
2019-12-21 16:17:33.220 INFO BACKUP_END Backup for C:\ at revision 1 completed
2019-12-21 16:17:33.221 INFO INCOMPLETE_SAVE Removed incomplete snapshot C:\ProgramData\.duplicacy-web\repositories\localhost\0/.duplicacy/incomplete
2019-12-21 16:17:33.221 INFO BACKUP_STATS Files: 0 total, 0 bytes; 0 new, 0 bytes
2019-12-21 16:17:33.221 INFO BACKUP_STATS File chunks: 0 total, 0 bytes; 0 new, 0 bytes, 0 bytes uploaded
2019-12-21 16:17:33.221 INFO BACKUP_STATS Metadata chunks: 3 total, 8 bytes; 2 new, 6 bytes, 588 bytes uploaded
2019-12-21 16:17:33.221 INFO BACKUP_STATS All chunks: 3 total, 8 bytes; 2 new, 6 bytes, 588 bytes uploaded
2019-12-21 16:17:33.221 INFO BACKUP_STATS Total running time: 00:04:16
2019-12-21 16:17:33.240 INFO VSS_DELETE The shadow copy has been successfully deleted



Running backup command from C:\ProgramData/.duplicacy-web/repositories/localhost/0 to back up C:/
Options: [-log backup -storage raaan -vss -stats]
2019-12-21 16:20:26.594 INFO REPOSITORY_SET Repository set to C:/
2019-12-21 16:20:26.595 INFO STORAGE_SET Storage set to gcd://Duplicacy
2019-12-21 16:20:30.053 INFO BACKUP_START Last backup at revision 1 found
2019-12-21 16:20:30.054 INFO VSS_CREATE Creating a shadow copy for C:\
2019-12-21 16:20:37.478 INFO VSS_DONE Shadow copy {8E53BE0F-AD76-4C48-939F-C258D476952E} created
2019-12-21 16:20:37.480 INFO BACKUP_INDEXING Indexing C:\
2019-12-21 16:20:37.480 INFO SNAPSHOT_FILTER Parsing filter file \\?\C:\ProgramData\.duplicacy-web\repositories\localhost\0\.duplicacy\filters
2019-12-21 16:20:37.481 INFO SNAPSHOT_FILTER Loaded 10 include/exclude pattern(s)
2019-12-21 16:20:39.659 INFO BACKUP_END Backup for C:\ at revision 2 completed
2019-12-21 16:20:39.659 INFO BACKUP_STATS Files: 0 total, 0 bytes; 0 new, 0 bytes
2019-12-21 16:20:39.659 INFO BACKUP_STATS File chunks: 0 total, 0 bytes; 0 new, 0 bytes, 0 bytes uploaded
2019-12-21 16:20:39.659 INFO BACKUP_STATS Metadata chunks: 3 total, 8 bytes; 0 new, 0 bytes, 0 bytes uploaded
2019-12-21 16:20:39.659 INFO BACKUP_STATS All chunks: 3 total, 8 bytes; 0 new, 0 bytes, 0 bytes uploaded
2019-12-21 16:20:39.659 INFO BACKUP_STATS Total running time: 00:00:10
2019-12-21 16:20:39.690 INFO VSS_DELETE The shadow copy has been successfully deleted

So the best practice is to additionally install backup software which can image the whole disk, such as Macrium or Veeam, and then use Duplicacy to sync those backups to Google Drive?

You have a problem with your job because virtually nothing has been backed up:

All chunks: 3 total, 8 bytes; 2 new, 6 bytes, 588 bytes uploaded

Probably some issue with your filters.

Duplicacy is not synchronization software but backup software, like the other two mentioned.

The cause was indeed in the filters; I had copy-pasted them from the forum, but the syntax was wrong. Is there a list of recommended exclusions available that should be applied?

Perhaps I used the wrong terminology; I meant to automatically transfer the full & differential backups created by Veeam/Macrium to the cloud. Would that be a problem? Can Duplicacy handle such big files?