How to avoid stupid mistakes in your PowerShell scripts (self-test your scripts)

Christoph · 29 April 2019 09:01

Not sure if anyone is as clumsy as me and manages to have a failing PowerShell script for over four months before noticing it, but better safe than sorry, so here is a quick and easy way to self-test your scripts.

(Disclaimer: I can’t really say that I understand the intricacies of what I’m doing, so use at your own risk and please feel free to post improvements.)

So, what’s the problem? If you use scripts to run your duplicacy backups, there is a risk that you make some (often minor) edits without realizing that you just broke your script, i.e. your backup won’t run. Windows Task Scheduler won’t tell you, and if you don’t have a more sophisticated monitoring mechanism in place, chances are you won’t notice until it’s too late.

Quick and easy solution: Add the following code to the beginning of your backup script and make sure to adjust the paths accordingly. (I just couldn’t be bothered to code this more elegantly.)

What does it do? Every time the script runs, it checks itself for any syntax errors. If it doesn’t find any, it writes an empty file with a file-name indicating that all is OK. If it finds an error, it deletes that file and writes another file instead.

function Syntax-OK
{
    [CmdletBinding(DefaultParameterSetName='File')]
    param(
        [Parameter(Mandatory=$true, ParameterSetName='File')]
        [string]$Path, 

        [Parameter(Mandatory=$true, ParameterSetName='String')]
        [string]$Code
    )

    $Errors = @()
    if($PSCmdlet.ParameterSetName -eq 'String'){
        [void][System.Management.Automation.Language.Parser]::ParseInput($Code,[ref]$null,[ref]$Errors)
    } else {
        [void][System.Management.Automation.Language.Parser]::ParseFile($Path,[ref]$null,[ref]$Errors)
    }

    return [bool]($Errors.Count -lt 1)
}

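if(Syntax-OK -Path C:\duplicacy\scripts\backup_alpha_C.ps1){
   # -Force avoids an error when the status file already exists from an earlier run
   New-Item -Path 'C:\duplicacy\scripts\backup_alpha_C.ps1-STATUS_OK' -ItemType File -Force | Out-Null
   # Ignore the error raised when there is no stale status file to remove
   Remove-Item -Path 'C:\duplicacy\scripts\backup_alpha_C.ps1-STATUS_ERROR-please_check_script' -ErrorAction SilentlyContinue
}else {
   New-Item -Path 'C:\duplicacy\scripts\backup_alpha_C.ps1-STATUS_ERROR-please_check_script' -ItemType File -Force | Out-Null
   Remove-Item -Path 'C:\duplicacy\scripts\backup_alpha_C.ps1-STATUS_OK' -ErrorAction SilentlyContinue
}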

Source: validation - How can I automatically syntax check a powershell script file? - Stack Overflow

This is obviously the most basic version of the script possible. Feel free to improve it.
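
One possible improvement, as a rough sketch only: wrap the check and the status files into a single function that takes the script path as a parameter, so the paths don’t have to be repeated. The function name Update-SyntaxStatus is made up for this example; the status-file suffixes are the ones used above.

```powershell
# Sketch: check a script for syntax errors and maintain the two status files.
# Update-SyntaxStatus is an illustrative name, not an existing cmdlet.
function Update-SyntaxStatus {
    param(
        [Parameter(Mandatory = $true)]
        [string]$ScriptPath
    )

    # ParseFile needs a real file system path, so resolve relative paths first.
    $fullPath = (Resolve-Path -LiteralPath $ScriptPath -ErrorAction Stop).ProviderPath
    $okMarker    = "$fullPath-STATUS_OK"
    $errorMarker = "$fullPath-STATUS_ERROR-please_check_script"

    # ParseFile fills $parseErrors with any syntax errors it finds.
    $tokens = $null
    $parseErrors = $null
    [void][System.Management.Automation.Language.Parser]::ParseFile(
        $fullPath, [ref]$tokens, [ref]$parseErrors)
    $isOk = ($parseErrors.Count -eq 0)

    # Decide which status file to write and which stale one to clean up.
    if ($isOk) {
        $create = $okMarker
        $remove = $errorMarker
    } else {
        $create = $errorMarker
        $remove = $okMarker
    }

    New-Item -Path $create -ItemType File -Force | Out-Null
    Remove-Item -LiteralPath $remove -ErrorAction SilentlyContinue

    return $isOk
}
```

It could be called as `Update-SyntaxStatus -ScriptPath 'C:\duplicacy\scripts\backup_alpha_C.ps1'`. PowerShell parses a whole script file before it executes any of it, so it may be worth putting this function into a separate small script (or scheduled task) that checks the backup script from the outside.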

akvarius · 29 April 2019 13:08

Hi,
very nice idea!

I also like @TheBestPessimist’s idea of using https://healthchecks.io/ on success. (Scripts and utilities index)
You’d write your script to send a message to healthchecks.io only when successful.
Healthchecks can then trigger a warning if your backup script does not report success for any reason including failed script, server full, power outage etc.
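
A minimal sketch of such a success ping, assuming the check’s ping URL has the form https://hc-ping.com/<uuid>. The function names and the UUID in the usage comment are placeholders made up for this example.

```powershell
# Sketch: ping a healthchecks.io check after a successful backup.
# Get-PingUrl and Send-SuccessPing are illustrative names.
function Get-PingUrl {
    param(
        [Parameter(Mandatory = $true)]
        [string]$CheckId
    )
    # Assumption: ping URLs look like https://hc-ping.com/<uuid>
    return "https://hc-ping.com/$CheckId"
}

function Send-SuccessPing {
    param(
        [Parameter(Mandatory = $true)]
        [string]$CheckId
    )
    try {
        Invoke-RestMethod -Uri (Get-PingUrl -CheckId $CheckId) -TimeoutSec 10 | Out-Null
        return $true
    } catch {
        # A failed ping should not make the backup script itself fail.
        Write-Warning "Could not ping healthchecks.io: $_"
        return $false
    }
}

# Usage (placeholder UUID), at the very end of the backup script:
#   if ($LASTEXITCODE -eq 0) { Send-SuccessPing -CheckId '<uuid-of-your-check>' }
```

The `$LASTEXITCODE` test assumes the backup command is the last external program the script ran and that it returns 0 on success.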

Christoph · 29 April 2019 21:37

Yes, I also like healthchecks (and you don’t even need to send a message to it, you can just “ping” it by calling a secret url). The problem is: you have to set healthchecks.io to a fixed period in which it should expect a ping to come in, e.g. every day, every hour or whatever. But what if you are not using your computer for a couple of days? Or you have multiple computers and don’t use each of them every day? If you don’t turn on your PC, your script won’t run, so healthchecks.io will send you a warning. Such false positives are no good and need to be avoided.

So instead of a simple timer alerting me when the url hasn’t been triggered for a certain time, I would like a conditional timer that alerts me when the url hasn’t been triggered for a certain time provided that another url has been triggered during the last x hours.
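
To make the condition concrete, here is a small sketch of the decision logic only. It is not a healthchecks.io feature; the function name, parameter names and default time spans are all made up for illustration.

```powershell
# Sketch: alert only if the backup ping is overdue AND the machine
# has shown signs of life (a "boot" ping) within the last x hours.
function Test-BackupAlert {
    param(
        [Parameter(Mandatory = $true)][datetime]$LastBackupPing,
        [Parameter(Mandatory = $true)][datetime]$LastBootPing,
        [Parameter(Mandatory = $true)][datetime]$Now,
        # Hypothetical defaults: backup expected daily, "in use" means booted within 12 hours.
        [timespan]$MaxBackupAge = [timespan]::FromHours(24),
        [timespan]$BootWindow   = [timespan]::FromHours(12)
    )

    $backupOverdue = ($Now - $LastBackupPing) -gt $MaxBackupAge
    $recentlyUsed  = ($Now - $LastBootPing) -le $BootWindow

    return ($backupOverdue -and $recentlyUsed)
}
```

With these defaults, a PC that has been switched off for three days raises no alert, while a PC that booted two hours ago and last backed up three days ago does.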

I suggested this to Peteris, the guy running healthchecks.io, about a year ago and he replied the following:

that is an interesting use case.

One of the earliest feature requests for healthchecks was the ability to monitor the execution time of the script: Monitoring execution time of script · Issue #23 · healthchecks/healthchecks · GitHub If implemented, I think it would help here too. A “boot” would count as the start of the execution, and then the main ping would be the end of execution. Usually script execution times would be measured in seconds or minutes. Here it could take up to 12 hours but that’s fine. I’ll start looking into implementing this. Can’t promise any dates; I only have limited time to work on new features.

A stop-gap solution is to use the "Pause Monitoring of a Check" API call: API Reference - Healthchecks.io

The way this would work is:

For each desktop machine you would set up a single check, with a period of 12 hours. The normal state for all checks is "paused"

when the machine boots, it pings its check. Pinging a paused check "unpauses" it, but does not trigger a notification.

2a) sometime during the next 12 hours, the machine executes the "Pause" API call.
2b) if the machine does not pause its check, when 12 hours are up, you receive a notification

API calls are authenticated with an API key, so you would have to distribute the API key to each machine. If that is not acceptable, the API key could also be hidden in an IFTTT / Zapier style service that sits in the middle. Far from elegant but doable. As an experiment, I set up a zap in Zapier, which accepts requests like this one:

curl -H "Content-Type: application/json" --data '{"code": "5bf66975-d4c7-4bf5-b1c8-b8d8a82aa278"}' https://hooks.zapier.com/hooks/cetwdp/571957/abcd

It takes the code from the JSON payload, and fires off an authenticated "pause" API call to healthchecks.io.

I think it really speaks for itself that he took the time for such an elaborate answer. Really great! Unfortunately, I never got around to trying the proposed workaround as it is quite complicated for my skill level. But perhaps someone here would like to give it a go?
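
For anyone who wants to try, the direct variant (without Zapier in the middle) might look roughly like this. It is an untested sketch that assumes the pause call is a POST to https://healthchecks.io/api/v1/checks/<uuid>/pause authenticated with an X-Api-Key header. The function name, UUID and API key are placeholders.

```powershell
# Sketch: build the parameters for the "Pause Monitoring of a Check" API call.
# New-PauseRequest is an illustrative name; endpoint and header are assumptions (see above).
function New-PauseRequest {
    param(
        [Parameter(Mandatory = $true)][string]$CheckId,
        [Parameter(Mandatory = $true)][string]$ApiKey
    )
    return @{
        Uri     = "https://healthchecks.io/api/v1/checks/$CheckId/pause"
        Method  = 'Post'
        Headers = @{ 'X-Api-Key' = $ApiKey }
    }
}

# Step 1, at boot: ping the check, which unpauses it.
#   Invoke-RestMethod -Uri 'https://hc-ping.com/<uuid>' | Out-Null
#
# Step 2a, after a successful backup: pause the check again.
#   $request = New-PauseRequest -CheckId '<uuid>' -ApiKey '<api-key>'
#   Invoke-RestMethod @request | Out-Null
```

If the pause call is never made (step 2b), the check goes down after its 12-hour period and the notification is sent.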

In any case, I’ll take this as a reminder to check back with Peteris whether the real solution will be implemented any time soon. Will let you know.

Christoph · 30 April 2019 12:57

Well, I haven’t heard back from the developer yet but realized that it seems to be implemented already. See here.

Will try this out asap. If this works, I think it would be the solution for monitoring duplicacy backups!

akvarius · 2 May 2019 22:43

Agreed, the /start endpoint is better (less complicated and therefore more robust) than API calls, and simpler to code reliably. But I don’t see how to make it work reliably as a heartbeat.
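
For reference, a sketch of how a backup could be wrapped with /start, assuming a ping URL of the form https://hc-ping.com/<uuid>. The /fail suffix for reporting a failure is my addition and was not discussed above. Invoke-MonitoredBackup and its parameters are made-up names.

```powershell
# Sketch: signal start, then success or failure, around a backup script block.
function Invoke-MonitoredBackup {
    param(
        [Parameter(Mandatory = $true)][string]$PingUrl,
        [Parameter(Mandatory = $true)][scriptblock]$Backup,
        # Replaceable so the flow can be tried out without network access.
        [scriptblock]$Pinger = { param($Url) Invoke-RestMethod -Uri $Url -TimeoutSec 10 | Out-Null }
    )

    # Never let a failed ping break the backup itself.
    $ping = {
        param($Url)
        try { & $Pinger $Url | Out-Null } catch { Write-Warning "Ping failed: $_" }
    }

    & $ping "$PingUrl/start"
    try {
        & $Backup | Out-Null
        & $ping $PingUrl
        return $true
    } catch {
        & $ping "$PingUrl/fail"
        return $false
    }
}

# Usage (placeholder UUID; the script block must throw when the backup fails):
#   Invoke-MonitoredBackup -PingUrl 'https://hc-ping.com/<uuid>' -Backup {
#       duplicacy backup
#       if ($LASTEXITCODE -ne 0) { throw "duplicacy exited with code $LASTEXITCODE" }
#   }
```

This only covers the case where the script gets to run to the end. It does not solve the heartbeat question.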

Avoiding false positives for periodically used devices (laptops and other non-servers) will still be difficult! It’s more difficult than it sounds to reliably send a stop (or pause) signal to http://healthchecks.io/ because there are many ways a user might stop using the computer: for example, shutdown (Event ID 1074, or use a GPO shutdown script), sleep, hibernation, or being out of Wi-Fi coverage. It might not be possible to send a stop signal at all (when/if implemented by http://healthchecks.io/). (Even when you manage to start a script, it might not be able to finish in time.)

Perhaps a kind of heartbeat signal could be designed to tell that the computer is in use and one should expect a backup signal (an “OK” signal). If heartbeats are not received, the computer is inactive and could have a different and much more relaxed time limit (months even). Many things to consider, and many edge cases which would still cause false positives, or a missed warning.

For my server and users I think I will specify different time limits for each use case. E.g. a server which should back up every day (or be available for SFTP) should have a different time limit than a seldom used laptop. (For some users I might even decide to send positive confirmation instead: a status mail from the client on both success and failure.)

A false positive is also a matter of perception. The text should explain what the missing signal really means.

“Backup of laptop is DOWN” (perhaps every day) is not the same as
“Your laptop has not stored a new backup in 7 days. If you have not used your computer in this time, this is normal and can be ignored”.

I haven’t found an option to customize the text in http://healthchecks.io/ (in the free version, anyway), but there are options for the frequency of manager reports, and for test results, mails are one-shot (one time only, until service is restored, i.e. the backup is successful).

Edit:
I had to test with a live test and some text, not only the Test button

The check MyServer Duplicacy Backup has gone DOWN. 
Additional notes:
Backup to SFTP Server has not happened in the expected time

It’s not at all bad!

A previous well-known product I used for friend-to-friend backup sent false positives in that it often said the computer hadn’t connected in a long time, when it really was a case of no files to back up.

For my mostly-on “server” that I don’t touch that often, healthchecks.io should work really well out-of-the-box! (Just a little piece of code to run at success after backup)

Sorry @Christoph, didn’t mean to hijack your thread!
Excited to see if you find out more on using /start