Monitor backup status using healthchecks.io (Windows/CLI)

akvarius · 16 June 2019 00:04

@TheBestPessimist is using healthchecks.io to monitor backup status: Scripts and utilities index.

I thought I’d show some examples in case you want to do this in your scripts. I use the CLI (command line) version, and Windows 10 and Windows 7 as backup clients.

https://healthchecks.io is a service which allows setting up “checks”. If a check does not receive a signal within a reasonable time (expected period + grace period), you can have a mail sent to you. You can send a failure signal as well, to avoid waiting for the grace period. Signals can be sent to the service by mail or an HTTP request. The service is free for limited personal use: you can monitor 20 checks, with limited logging (the last 100 events).

The beauty of this kind of service is that although it might not be 100% fail proof, it is independent of your computer which could crash or be switched off.

To avoid false positives you must select your period and grace settings wisely, especially if the computer is seldom used. (You can use the “Start” feature for this)

To test these examples, first register on the service. Set up a check and copy the URL for it. This URL you can paste into the command examples wherever I write https://hc-ping.com/your-check-url

Since I am a Windows/CLI user, I will show some examples for the Windows environment. I use PowerShell and bat files for these examples, but the object System.Net.WebClient can be used in some other languages as well.

The first simple example requires Windows 10 or a newer version of PowerShell (v3 or later). (See below for Windows 7 and older PowerShell.)
The PowerShell command is:
Invoke-RestMethod 'https://hc-ping.com/your-check-url'

You can execute this from a bat-file like this:

powershell.exe -command "&{Invoke-RestMethod 'https://hc-ping.com/your-check-url'}"

This sends a signal to the service saying that this check is OK. You probably only want to do this when a backup has been run successfully!

I recommend a scheduled job which performs a backup, logs result locally, and (if successful) reports status to healthchecks.io

Let’s say you have set up a Duplicacy repository at X:\YourRepositoryFolder, and installed the CLI files to C:\Program Files\Duplicacy. Create a batch file somewhere it cannot be modified without being administrator (e.g. C:\Program Files\Duplicacy) and schedule a task to run it for instance once a day.

A batch file in Windows is a text file with the file extension .cmd or .bat.
(.cmd looks more modern)

CD /D X:\YourRepositoryFolder
"C:\Program Files\Duplicacy\duplicacy.exe" backup
If not Errorlevel 1 powershell.exe -command "&{Invoke-RestMethod 'https://hc-ping.com/your-check-url'}"

The /D option for the CD command selects the X: drive and then changes the current folder to the path specified. This allows Duplicacy to find the repository.
The If Errorlevel command is true if the exit code of the previous command is equal to or higher than the number provided after “Errorlevel”.
“If not Errorlevel 1” is true if the exit code of the previous command is 0 (success).
(If Errorlevel 0 is always true, so it cannot be used here.)

Remember, this is just an example. You can add options from the Duplicacy user guide to show more details, and pipe log results to a log file or other things that you want.

If you want you can report a fail immediately if Duplicacy reports an error, like this:

CD /D X:\YourRepositoryFolder
"C:\Program Files\Duplicacy\duplicacy.exe" backup
If not Errorlevel 1 (
  REM Exit code is not 1 or higher, we're good!
  powershell.exe -command "&{Invoke-RestMethod 'https://hc-ping.com/your-check-url'}"
) Else (
  REM If we got here something is wrong, send the fail signal
  powershell.exe -command "&{Invoke-RestMethod 'https://hc-ping.com/your-check-url/fail'}"
)

If your script performs more tasks, for instance a local backup first and then a copy to an external destination, you should only send a signal to the monitoring service when all operations have succeeded.

If you want to send more info, or customize the User-Agent string (since it is shown in the healthchecks.io logs), you could do it like this:
powershell.exe -command "&{Invoke-RestMethod 'https://hc-ping.com/your-check-url' -headers @{'User-Agent'='%Computername%'} -body 'Documents OK' -method POST}"

This way it is possible to collect results from Duplicacy output to put in Healthchecks.io log.
(you might want to be careful with what details you publish this way)

This works on Windows 7 and uses PowerShell version 2:
powershell.exe -command "&{(New-Object System.Net.WebClient).DownloadString('https://hc-ping.com/your-check-url')}"

Customizing User-Agent and body with this is a little more complex

towerbr · 16 June 2019 00:46

Very good tutorial!

I also use healthchecks.io, but I ping the URLs directly with curl.

I created a small html file that shows me the backups status (loading the healthchecks svg images), something like a simplified dashboard:

[image: dash]

They are separated into groups: my computer, my wife’s computer, media files, etc.

akvarius · 16 June 2019 21:31

Thank you, @towerbr!!
I also like Tags, a very elegant feature, easy to set up and very useful!

More importantly, you made me check on curl, which I knew about (and wget) but didn’t realize curl is actually included in Windows 10 since build 1803 (Announcement: https://devblogs.microsoft.com/commandline/windows10v1803/). (I knew SSH is included as a feature you can enable, but curl is available out-of-the-box)

So if the client computer is Windows 10 with at least build 1803 you can do it like @towerbr:
curl https://hc-ping.com/your-check-url

The simplest code sample might then read:

CD /D X:\YourRepositoryFolder
"C:\Program Files\Duplicacy\duplicacy.exe" backup
If not Errorlevel 1 curl https://hc-ping.com/your-check-url

(As @towerbr knows) curl has a ton of options and can easily do what we did with PowerShell in the first post (I’m literally learning by doing here, please let me know if I make mistakes!):
curl --user-agent "%Computername%" --data "Documents OK" https://hc-ping.com/your-check-url
(The main reason I set the user agent string is to keep the log lines shorter and readable.)

The Healthchecks log might then show this in the entries:
#13 Jun 16 23:07 OK HTTPS POST from 11.22.33.44 - ThisPCName - Documents OK

towerbr · 16 June 2019 21:43

Good idea! I hadn’t thought of that. I’m going to check these options again, I have not played with them for a long time.

akvarius · 16 June 2019 22:49

A word on Laptops…

Healthchecks works really well for computers that are always on.

But if your computer is a laptop you might have a hard time selecting the optimal Period (time between backups). A higher Period setting lowers the chance of false positives but increases the time before flagging a real problem.

Your backup job schedule will also impact the likelihood of backup failure, since the laptop might enter sleep mode or run out of power before the scheduled backup time. You can set the task to run asap after a missed schedule, but generally, setting an early backup schedule will increase the chance that a backup will succeed, at the cost of maybe not getting the latest document changes. Setting a later schedule will improve the chances of backing up the latest changes.

Either way, a laptop will probably not deliver very regular backups, simply because of its nature.

@Christoph showed me a feature that might help for laptops in How to avoid stupid mistakes in your powershell scripts (self-test your scripts)
You can send the /start signal (https://hc-ping.com/your-check-url/start) every morning the computer is in use. (Use a scheduled task to run early in the morning, and include the setting to run if schedule missed.)

Before we move on:
Period (normally): Time between expected pings (backups).
Grace (normally): Time to allow the task to last or be delayed.

The /start part of the URL (called an endpoint) is intended to start a timer, but in our case we will use this to tell Healthchecks that our computer is in use and to expect a backup confirmation within the Grace time. This means you can set a very high Period (30 days is the max) and still get a warning within a day or two if it fails. I find a Grace time of 2 days should work, but you should experiment to find out what works for you! This allows backups to fail once without alerting.

When used like this (with a start signal), our perception of time settings changes a little:
Period: The maximum time to wait for a successful signal, so 30 days might be OK, perhaps too short for some if they use the laptop seldom.
Grace: Time to wait after a start signal before expecting a backup

Windows 10 build 1803 (and later) with curl:
curl https://hc-ping.com/your-check-url/start
Windows 10 before build 1803:
powershell.exe -command "&{Invoke-RestMethod 'https://hc-ping.com/your-check-url/start'}"
Windows 7:
powershell.exe -command "&{(New-Object System.Net.WebClient).DownloadString('https://hc-ping.com/your-check-url/start')}"

Laptop usage is pretty random compared to a server. This method will not avoid false positives (or misses) 100%, but it might reduce them a bit!

akvarius · 17 June 2019 00:17

How to include details from Duplicacy

Let’s assume again your Duplicacy repository is at X:\YourRepositoryFolder and CLI files at “C:\Program Files\Duplicacy”

If we add the -stats option to Duplicacy backup, it will show some info about the backup.
If we add the -log global option, time and date will be added to the lines.

Typical result from a backup:

2019-06-17 01:24:14.700 INFO BACKUP_STATS Files: 23879 total, 40,574M bytes; 1 new, 42K bytes
2019-06-17 01:24:14.700 INFO BACKUP_STATS File chunks: 8259 total, 40,614M bytes; 1 new, 42K bytes, 31K bytes uploaded
2019-06-17 01:24:14.700 INFO BACKUP_STATS Metadata chunks: 5 total, 7,793K bytes; 3 new, 2,163K bytes, 975K bytes uploaded
2019-06-17 01:24:14.700 INFO BACKUP_STATS All chunks: 8264 total, 40,622M bytes; 4 new, 2,206K bytes, 1007K bytes uploaded
2019-06-17 01:24:14.700 INFO BACKUP_STATS Total running time: 00:00:24

Now we can pipe the log to a file, extract some useful information into a variable, and then:

- store some of it in a local history log
- send details to Healthchecks along with the signal

Sample batch file:

REM Modify locations to your preference:
Set Log="%Temp%\Backuplog.txt"
Set Tempfile="%Temp%\TempStats.txt"
Set History="%Temp%\Backuplog_History.txt"

CD /D X:\YourRepositoryFolder
"C:\Program Files\Duplicacy\duplicacy.exe" -log backup -stats > %Log%
If Errorlevel 1 goto FAILED

REM (We get here if backup was successful)
REM One of the lines from the log (including -stats) might look like this:
REM 2019-05-17 23:53:10.418 INFO BACKUP_STATS Files: 23829 total, 40,534M bytes; 1 new, 45K bytes
REM (We don’t need the first 42 characters)
type %Log% | find "INFO BACKUP_STATS Files" > %Tempfile%

REM (Optional) Add to a history log:
type %Tempfile% >> %History%

REM Retrieve part of the line to send to the service (from character 42):
set /P Stats=<%Tempfile%
set Stats=%Stats:~42%
curl --user-agent "%Computername%" --data "%Stats%" https://hc-ping.com/your-check-url

REM This skips the rest of the script:
goto :EOF

:FAILED
REM (we could add some more info about why it failed)
curl --user-agent "%Computername%" --data "Backup failed - Exit code %errorlevel%" https://hc-ping.com/your-check-url/fail

Windows 10 before build 1803:

powershell.exe -command "&{Invoke-RestMethod https://hc-ping.com/your-check-url  -headers @{'User-Agent'='%Computername%'} -body '%Stats%' -method POST}"

[Edit:Added code for] Windows 7 with User-Agent and POST data (UploadString)

powershell.exe -command "&{($Web=New-Object System.Net.WebClient).Headers.add('User-Agent','%Computername%');$Web.UploadString('https://hc-ping.com/your-check-url', 'POST', '%Stats%')}"

Normally we don’t need to store much data in the log. Here’s an example using only User-Agent:
[Edit:Added code for] Windows 7 using only User-Agent and GET (DownloadString)

powershell.exe -command "&{($Web=New-Object System.Net.WebClient).Headers.add('User-Agent','%Computername%  %Stats%');$Web.DownloadString('https://hc-ping.com/your-check-url')}"

These Windows 7 samples also work on Windows 10.

Please consider which details you want to send to the public server in case of data breach etc.!
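
For anyone running Duplicacy on Linux or macOS, the same extraction could be sketched in bash roughly like this. It is only a sketch under a few assumptions: the function names extract_stats and send_stats are made up for this example, the log was written with the -log and -stats options as above, and hostname stands in for %Computername%.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the first "BACKUP_STATS Files" line of a Duplicacy
# log without its 42-character prefix (timestamp, "INFO" and "BACKUP_STATS").
extract_stats() {
    # cut -c 43- keeps everything from character 43 on, i.e. drops the first 42
    grep "INFO BACKUP_STATS Files" "$1" | head -n 1 | cut -c 43-
}

# Hypothetical helper: send the stats as the request body of the success ping.
# $1 is the log file, $2 is your check URL.
send_stats() {
    curl -fsS --max-time 10 --user-agent "$(hostname)" \
         --data "$(extract_stats "$1")" "$2" > /dev/null
}
```

After a successful backup it could be called as: send_stats /tmp/backuplog.txt https://hc-ping.com/your-check-url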

akvarius · 19 June 2019 22:20

A demo to show the result of the above code.

A Healthchecks report

with some details in “User Agent” and Request Body (=data), (redacted sample):

  Ping #46 

  Time Received                      Client IP
  2019-06-19T08:24:08.669066+00:00   [YourPublicIP.22.33.44]

  Protocol                           Method
  https                              POST

  User Agent
  PCName 

  Request Body
     Files: 23883 total, 40,652M bytes; 3 new, 79,807K bytes

The Healthchecks log might look something like this:

#46  Jun 19 10:24 OK  HTTPS POST from [YourPublicIP.22.33.44] - PCName - Files: 23883 total, 40,652M bytes; 3 new, 79,807K bytes   
#45  Jun 18 10:23 OK  HTTPS POST from [YourPublicIP.22.33.44] - PCName - Files: 23880 total, 40,574M bytes; 0 new, 0 bytes  
#44  Jun 17 10:24 OK  HTTPS POST from [YourPublicIP.22.33.44] - PCName - Files: 23880 total, 40,574M bytes; 2 new, 42K bytes   
#43  Jun 16 10:25 OK  HTTPS POST from [YourPublicIP.22.33.44] - PCName - Files: 23879 total, 40,574M bytes; 1 new, 42K bytes

The local %history% log file:

2019-06-19 10.24.44.343 INFO BACKUP_STATS Files: 23883 total, 40,652M bytes; 3 new, 79,807K bytes
2019-06-18 10.24.04.693 INFO BACKUP_STATS Files: 23880 total, 40,574M bytes; 0 new, 0 bytes 
2019-06-17 10.24.47.372 INFO BACKUP_STATS Files: 23880 total, 40,574M bytes; 2 new, 42K bytes 
2019-06-17 10.25.59.761 INFO BACKUP_STATS Files: 23879 total, 40,574M bytes; 1 new, 42K bytes

Christoph · 27 August 2019 06:31

3 posts were split to a new topic: A strategy for monitoring backups using sendmail

Christoph · 27 August 2019 06:37

Where will the stats show up in healthchecks?

akvarius · 27 August 2019 07:09

Good question, thanks for asking!
The log for a check has its own URL so you can bookmark it:
https://healthchecks.io/checks/<YourCheckID>/log/
(Note BTW that for personal (free) users, only the last 100 logs are kept (per check))

In the GUI; On the Checks page (where you see a list of your checks, integration, schedule etc), click the Last Ping to see details of the last ping, and also a link to logs.
You can also (On the Checks page) click the (gear) settings icon, which will show a control panel for this Check, and part of the log, but also a “More…” link to see more log items.
Or in mail reports: A summary of the check statuses is included in the mails, with links to details (same as the settings above, which at the bottom shows the log).

On each log line, you will have record number, date, status, sender address, User-Agent string and more data. This means I could have simplified the script somewhat, and it will look practically the same in the log:

REM Retrieve part of the line to send to the service (from character 42): 
set /P Stats=<%Tempfile%
set Stats=%Stats:~42%
curl --user-agent "%Computername% %Stats%" https://hc-ping.com/your-check-url

And now to save some space on the log line (and see more details), you can also remove %Computername% (as this check is probably for one computer)

Christoph · 27 August 2019 21:49

This is the part I was wondering about. Are you saying that --data bla will simply be appended to the healthchecks log entry?

Yes, this I understand. It’s a smart hack that uses the user agent variable for whatever you want to see on the log. Makes sense because healthchecks obviously outputs that variable into the log. But --data?

akvarius · 27 August 2019 23:11

Yes, some of it, but it’s like the body of the message.
In the log view, you will only see some of that text, but on the details page for a logged event (click a log event), the Request Body (--data) will be shown separately. Up to 10 kilobytes of it will be stored in each log record, so it can be inspected later (last 100 entries, or 1000 if you are a commercial user). That is too much data to handle in a (Windows) bat file I think, but perhaps not in PowerShell and *nix.

In my example it was enough to use User-Agent for everything.

We could store backup error messages or check/prune results. If the /fail endpoint is used to send the ping when backup error is detected, the log entry will stand out with a red “Failure” label and we could click it to see more details.

Duplicacy log for a single job is easily too big, but one could save some of it. Also, the backup logs will contain private information (like folder/file names) so one must consider what (parts) to store anyway!

 Ping #46 

  Time Received                      Client IP
  2019-06-19T08:24:08.669066+00:00   [YourPublicIP.22.33.44]

  Protocol                           Method
  https                              POST

  User Agent
  PCName

  Request Body
    (Example) This is the data specified by curl --data in the example above.
    Files: 23883 total, 40,652M bytes; 3 new, 79,807K bytes 
    (Here we could potentially store more data, since up to 10 KB is stored for each log entry.
    Excess data is stripped off.)
    For hobbyists, the limit is 100 log entries
    For paying users, the limit is 1000 log entries

akvarius · 8 September 2019 22:10

Luckily, both curl and the PowerShell Invoke-RestMethod can read data directly from a file, which you might want to filter first.
(In PowerShell you can filter a log and then pipe it to Invoke-RestMethod.)

There are two tasks I’d like to implement in my script:

1. Create a copy of the log file with almost anonymous data. This saves log data (for healthchecks we need to fit within 10 KB) and is safer in terms of privacy.
2. Send the status and (redacted) log file content to the server.

Filter a log file:
A simple example for a bat file would be:

set Log_Stats=%Temp%\logstat.txt
type %Log% | find "INFO BACKUP_STATS " > "%Log_Stats%"

Another tool I often use, especially for more complex filter jobs, is GNU Sed, but in Windows it might make better sense to use the native PowerShell, I’ll show a simple example at the end.

Sending the log file with PowerShell:

REM Windows 10: Send healthcheck signal with log file
powershell.exe -command "&{Invoke-RestMethod https://hc-ping.com/your-check-url -headers @{'User-Agent'='%Computername% %Stats%'} -Infile '%Log_Stats%' -method POST}"

or with curl (Windows 10 build 1803 and later):

curl --user-agent "%Computername% %Stats%" https://hc-ping.com/%HealthcheckID% --upload-file "%Log_Stats%"

In PowerShell it’s possible to combine the two tasks, so you could
Filter and send, in one command-line with PowerShell:

powershell.exe -command "&{ $stat = get-content '%Log%' | Where-Object {$_ -match ' INFO BACKUP_STATS '}|out-string ;$stat | Invoke-RestMethod https://hc-ping.com/%HealthcheckID% -headers @{'User-Agent'='%Computername% %Stats%'} -method POST}"

To break that last command down, the first thing I do is load parts of the file into a variable, filtering with Where-Object. (I could use the -replace operator after that for more modifications.) Get-Content loads the text file into an array, so I have to convert it to a string before piping to Invoke-RestMethod (or else each line would be sent as a separate item):

$stat = get-content '%Log%' | Where-Object {$_ -match ' INFO BACKUP_STATS '}|out-string

Then finally I can pipe that data to Invoke-RestMethod. (Piping data to it is an alternative to using the -Body or -InFile parameters):

$stat | Invoke-RestMethod https://hc-ping.com/%HealthcheckID% -headers @{'User-Agent'='%Computername% %Stats%'} -method POST

I’m sure there are more elegant ways of doing this, perhaps without a variable, but I just wanted to test and show you one way of doing it.
If you are concerned with privacy, filter your log files before uploading to a web server!
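
On Linux or macOS the filter-and-send step might be sketched in bash like this. Treat it as a rough sketch, not a tested recipe: the function names filter_log and send_filtered_log are invented here, and the 10240-byte cap simply mirrors the 10 KB per ping mentioned earlier in this thread.

```shell
#!/usr/bin/env bash
# Hypothetical helper: keep only the BACKUP_STATS lines of a log and cap the
# result at 10240 bytes (the 10 KB stored per ping, according to this thread).
filter_log() {
    grep " INFO BACKUP_STATS " "$1" | head -c 10240
}

# Hypothetical helper: POST the filtered log as the request body.
# "--data-binary @-" tells curl to read the body from standard input unchanged.
# $1 is the log file, $2 is your check URL.
send_filtered_log() {
    filter_log "$1" | curl -fsS --max-time 10 --data-binary @- "$2" > /dev/null
}
```

Because only the matching lines are passed on, file and folder names from the rest of the log never leave the computer.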

Erwin · 21 March 2020 10:01

This is very interesting, but won’t you miss failure notifications if the first backup completed, but not subsequent backups (and you have not restarted the computer since)?

akvarius · 21 March 2020 23:02

You raise an important point!

If your backup script has several backup operations, and only sends “success” signals after each operation but not a “fail” signal on error, then that will happen.

E.g.

OK: Backup to local
OK: Prune
OK: Check local
Failed: Copy to remote
Failed: Check remote

You need to send a “fail” signal on error. This way you will get a warning mail. You might also consider aborting the script on error (send the “fail” signal first, to end the check in a failed state).

But even if you continue and end your script with a “success” signal, you will receive a warning mail. (But you won’t get more reminders since end result was success)

Also, you can use separate check URLs for separate backup jobs (documents, pictures, etc.). Then you will be notified about the one that fails.

I discovered now my first post had a typo in the /fail sample, fixed now.
A simple “fail” signal sample for Windows 10 build 1803 and later:

curl https://hc-ping.com/your-check-url/fail
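
The “send fail and abort on the first error” idea could look roughly like this in a bash script. This is a sketch only: run_step, backup_job and HC_URL are names made up for the example, the two Duplicacy steps are placeholders for whatever your own job runs, and duplicacy is assumed to be on the PATH.

```shell
#!/usr/bin/env bash
HC_URL="https://hc-ping.com/your-check-url"   # placeholder: your own check URL

# Hypothetical helper: run one step; on error send the fail signal and report failure.
run_step() {
    "$@" && return 0
    # The failed command line goes into the request body for later inspection
    curl -fsS --max-time 10 --data "Failed: $*" "$HC_URL/fail" > /dev/null
    return 1
}

# Hypothetical job: the success ping is only sent when every step succeeded.
backup_job() {
    run_step duplicacy backup -stats || return 1
    run_step duplicacy check         || return 1
    # Further steps (prune, copy, ...) would follow the same pattern.
    curl -fsS --max-time 10 "$HC_URL" > /dev/null
}
```

Calling backup_job from the repository folder then ends the check either in a failed state at the first broken step, or with a single success ping at the very end.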

Erwin · 22 March 2020 04:31

I can totally see how this keeps Healthchecks from sending a warning if the computer has been off for 25 days, for example, but what I still don’t get is how you will receive a warning if, after the first backup has executed successfully (with /start and then the normal ping), the backup does not run the next day for some reason (assuming daily backup)? Of course if it explicitly fails you can notify Healthchecks, but what we also want to guard against is the backup script not running at all.

If at the end of the first backup you send another /start signal but then put the computer to sleep for 25 days you will receive a warning. If you don’t send another /start signal and leave your computer on for 25 days, then Healthchecks will not find it abnormal not to receive a ping (it’ll only send you a warning after 30 days).

akvarius · 22 March 2020 12:05

Yes, as I mention in the post “A word on Laptops”, the /start signal needs to be sent every day the laptop is in use, and this resets the “short” timer. Use a scheduled task to run early in the morning, and include the setting to run if schedule missed. It’s called “Run task as soon as possible after a scheduled start is missed”

It’s a separate scheduled task, not the backup script, and you can use the curl command directly to avoid potential script file missing or syntax problems. Example action (assuming Windows 10):
Program: Curl
Argument: https://hc-ping.com/your-check-url/start

And this task can fail too, so either way, this is not completely failsafe, it’s a compromise to make it work better with laptops.

(On my “server” I installed a new UPS which caused W10 to believe it was running on battery - which caused tasks with the default condition ‘start the task only if the computer is on AC power’ to not start at all. )

Another way is to send a /start signal at the start of each of the backup commands in your scripts. (But I feel a separate task is also needed in case the script fails to start.)

Erwin · 22 March 2020 15:42

This is very smart indeed! And safe to assume the task will run (scheduled tasks are quite reliable). It is important that this trigger be independent of the backup script. Thanks, I’ll implement that on my end too, it definitely is a more appropriate setup on laptops.

dreamflasher · 25 March 2021 21:32

Thanks a lot! Especially If not Errorlevel 1 curl https://hc-ping.com/your-check-url is exactly what I was looking for. Is there an equivalent on linux bash?

akvarius · 25 March 2021 22:28

Hi, others here can give better qualified answers, but maybe something like

if [ $? -lt 1 ]; then
  curl …
fi
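
A slightly fuller sketch of the same idea for bash, assuming curl is installed. The function name ping_after and the variable HC_URL are made up for this example; the /fail endpoint is the one used earlier in this thread.

```shell
#!/usr/bin/env bash
# Placeholder: replace with your own check URL.
HC_URL="https://hc-ping.com/your-check-url"

# Hypothetical helper: run a command, then ping Healthchecks with the result.
ping_after() {
    # Run the command given as arguments, e.g. ping_after duplicacy backup
    "$@"
    local status=$?
    if [ "$status" -eq 0 ]; then
        # Exit code 0: success signal (the counterpart of "If not Errorlevel 1")
        curl -fsS --max-time 10 --retry 3 "$HC_URL" > /dev/null
    else
        # Any other exit code: fail signal, so there is no waiting for the grace period
        curl -fsS --max-time 10 --retry 3 "$HC_URL/fail" > /dev/null
    fi
    return "$status"
}
```

From the repository folder it could be used as: ping_after duplicacy backup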