How to check only the latest revision?

This feature would be very useful. I’m using Nagios to monitor how long it has been since the last completed backup. At the moment my script has to list all backups and then get the time of the last one, and this can take many tens of seconds.
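Roughly, the script looks like this (a minimal sketch; the field positions and GNU date -d are assumptions about the list output and my environment, not part of Duplicacy itself):

#!/usr/bin/env bash
# Warn if the newest completed backup is older than MAX_AGE_HOURS.
# The slow part is `duplicacy list`, which enumerates every revision.
# Assumes listing lines look like:
#   Snapshot <id> revision <n> created at <date> <time>
MAX_AGE_HOURS=26

last=$(duplicacy list | awk '/ revision /{d=$7" "$8} END{print d}')
if [[ -z "$last" ]]; then
    echo "CRITICAL: no completed backups found"
    exit 2
fi

age=$(( ($(date +%s) - $(date -d "$last" +%s)) / 3600 ))
if (( age > MAX_AGE_HOURS )); then
    echo "CRITICAL: last backup is ${age}h old"
    exit 2
fi
echo "OK: last backup is ${age}h old"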

3 Likes

I also think having -1, -2, etc. is not that useful (I’d even call it unintuitive). When checking my stuff, I generally look either at the last revision, or at a listing of all the files in all the revisions.

IMO just adding -last should be enough (and not -latest, since it’s longer to type and doesn’t add anything in terms of understanding what the parameter does).

1 Like

A post was split to a new topic: Partial commands by the minimal unambiguous chunk?

To add here, it might be useful to enable checking e.g. the last 5 backups. “-2” is not that useful if it checks only the 2nd latest backup, but it is handy if it checks the last two backups.

This is for users with medium levels of paranoia (well, aren’t backups for paranoids in general?)

A more elegant way is to run the following:

for /f "skip=2 tokens=4" %%i in ('duplicacy list -storage local') do set LAST_REVISION=%%i
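For non-Windows users, a roughly equivalent bash one-liner (assuming, as the tokens=4 above does, that the revision number is the 4th field of each snapshot line):

LAST_REVISION=$(duplicacy list -storage local | awk '/ revision /{r=$4} END{print r}')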

Any news on this feature? (see also this one: Restore latest version of a file)

Sorry it hasn’t been implemented. I plan to do it after the memory optimization.

2 Likes

I realize that I’m late to the parameter party here, but please allow me to make two points:

  • I agree that “-last” would be the most intuitive naming, much better than negative numbers and also better than “-latest”.
  • However, I also think a “-last n” variant would be useful for those who don’t run check after every backup. In my case I run three backups a day but only one check at night, so “-last 3” is what I’d set this to. “-last” on its own would then simply be shorthand for “-last 1”.
1 Like

I’ve been thinking a better alternative might be to base the parameter on a fixed time-scale - similar to the -keep parameter, which deals in days. So -last on its own would mean the very last revision, and -last 7 would mean the last 7 days’ worth of snapshots.

Because, as you pointed out, a very different time span may be covered by, say, 3 snapshots, depending on how long your PC has been switched off. (I do backups every 2 hours.) You’d have to work out roughly how many snapshots you might want to check/copy based on your job schedule.

A time-based parameter might yield a varying number of snapshots, but you know it always covers a fixed period of time.

2 Likes

This sounds like an interesting alternative!

I’m wondering how this would work in the case where we back up, and also check, extremely rarely.

My own example, with everything using the same GDrive storage:

  • pc/home server:
    • backup every 10 days
    • check -all every 20 days (this is the only check I’m running, and it checks everything)
  • laptop: backup every 6 hours
  • random server: backup every 7 days

With your new suggestion, I could run the check as duplicacy check -all -last 20 -tabular and it would only check the validity of the last 20 days’ worth of backups (maybe I should use 21, since the jobs run at different hours, so I might miss some backups). That’s pretty nifty!

Would this work properly with -tabular though? Will tabular only show me the last 20 days, or does it still need to check all the backups? (does it even make sense to run tabular like this?)


On another train of thought: this could also work with copy in the same way. Does it make sense to have it there?

No, I don’t think it would, since at the moment -tabular requires/implies -all - some of those columns, like unique and new, need to know about all the revisions in the storage.

I don’t know if it’s worth allowing a subset of snapshots to be checked in this way. Perhaps it’s safer, and not that much more resource-intensive, to simply check -all snapshots and chunks. After all, Duplicacy has to ListAllFiles() on the storage anyway.

Though I guess you could use -last n to do it slightly differently - work out the differential of chunks created within n days and just check the existence of those chunks one by one, though it’s a less safe form of check.

Also, I don’t know how check is implemented, but strictly speaking, the second half of the process - the stats/tabular part - shouldn’t necessarily require a physical check of all chunks beforehand. Maybe in the existing implementation the stats are built from the list of chunks held in memory after verifying those chunks exist, but the same information could be built by iterating over all the revision metadata without doing that verification. In other words, you could separate the stats and check operations.

Anyway, I’d propose that -last n - in the case of check, or copy (where I’m most eager to see such a flag) - would always do the last revision plus any snapshots created within n days. Then you wouldn’t have to specify an extra day (21) just in case. If you accidentally skipped jobs, or your PC has been off for a good while, -last n would always include at least 1 revision, just as -last on its own would.
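Until something like this exists, the proposed semantics could be approximated client-side. A rough bash sketch (assuming duplicacy list prints revisions oldest-first as “Snapshot <id> revision <n> created at <date> <time>”, and GNU date; this is not the actual implementation):

#!/usr/bin/env bash
# Approximate the proposed "-last n": the latest revision, plus every
# revision created within the last N days.
N_DAYS=${1:-7}
cutoff=$(date -d "$N_DAYS days ago" +%s)

revisions=()
latest=
while read -r rev created; do
    (( $(date -d "$created" +%s) >= cutoff )) && revisions+=("$rev")
    latest=$rev    # listing is oldest-first, so the last line wins
done < <(duplicacy list | awk '/ revision /{print $4, $7" "$8}')

[[ -n "$latest" ]] || { echo "no revisions found"; exit 0; }

# Always include at least the latest revision, even if older than N days.
[[ " ${revisions[*]} " == *" $latest "* ]] || revisions+=("$latest")

for r in "${revisions[@]}"; do
    duplicacy check -r "$r"
done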

Posting to add my support for a -r latest or -r last option.

I just set up a copy storage where I’d like to run a more aggressive prune.
I use the Web UI, so calculating the latest revision and running a CLI copy isn’t ideal.

EDIT: As a workaround I’ve set a variable LATEST_REVISION which I can reference in the CLI. Is there a way to pass that variable to the Web UI to use as an option?
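For reference, here’s roughly how I compute and use it from the CLI (a sketch; the storage names default and offsite are placeholders, and the 4th-field assumption matches the list output format used in the snippets above):

LATEST_REVISION=$(duplicacy list | awk '/ revision /{r=$4} END{print r}')
duplicacy copy -r "$LATEST_REVISION" -from default -to offsite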

2 Likes

Also, I just realised I need this desperately! The workaround seems impossible to use with duplicacy-web.

2 Likes

Bumping this request for a -latest or -last parameter that I can use with copy in the web app.

My use case:
I back up locally to a large drive (where I’d like to keep as many old versions as can fit) then do a copy to a smaller off-site drive which will only be necessary to recover from catastrophe.

Since copy copies everything, at the moment I’m limited by the size of the smaller drive.
-latest would let me prune the smaller drive more aggressively.

3 Likes

Is the only way to get the latest revision number to run “duplicacy list” and wait for it to finish? I have a backup with almost 1000 revisions and listing was very slow. So I wrote the following bash script, which uses a binary search to find the most recent revision a little more quickly. It seems to work, so I thought I’d share it here in case others find it useful.

#!/usr/bin/env bash

# Binary-search for the most recent revision instead of listing all of them.
# Note: this assumes revision numbers are contiguous, i.e. the repository
# has never been pruned.
find_latest_duplicacy_revision() {
    starting_guess=${1:-400}

    found=0
    last_existing=0    # highest revision known to exist
    last_failing=0     # lowest revision known not to exist

    rev=$starting_guess
    while (( found == 0 )); do
        echo "trying $rev"
        if duplicacy list -r $rev > /dev/null 2>&1; then
            last_existing=$rev
        else
            last_failing=$rev
        fi

        if (( last_existing == last_failing - 1 )); then
            # The bounds have met; last_existing is the answer.
            # Break explicitly so we also exit when no revision exists at
            # all (found stays 0, which would otherwise loop forever).
            found=$last_existing
            break
        elif (( last_failing == 0 )); then
            # No upper bound found yet: keep doubling the guess.
            (( rev = rev * 2 ))
        else
            # Bisect between the known-good and known-bad revisions.
            (( rev = (last_failing + last_existing) / 2 ))
        fi
    done

    if (( found > 0 )); then
        echo "found latest revision: $found"
    else
        echo "unable to find latest revision"
    fi
}

find_latest_duplicacy_revision 500
1 Like

Wouldn’t this only work if the repository was never pruned?

The list of snapshots is literally the list of files in the target under the snapshots\snapshot_id folder.

I can’t imagine getting a folder listing from any remote taking more than a few seconds. Maybe connecting directly and fetching the list would be faster.
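For example, with a local disk storage you could list that folder directly (a sketch; the storage path and snapshot id are placeholders):

# Revision files are named by revision number, so the highest-numbered
# file in the snapshot folder is the latest revision.
ls /path/to/storage/snapshots/my-snapshot-id | sort -n | tail -1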

Also, there is a feature request somewhere for duplicacy to accept -last as a revision index, to avoid all of that. Perhaps someone should implement it and submit a PR?

Relevant threads: Struggling with CLI and how to use multiple destinations - #5 by gchen

1 Like

Nobody replied to that feature request.
Yes, “someone should implement it” :smiley:

3 Likes

Has there been any progress on this request?

My use case is exactly the same as some people described above. I run regular backups to a local server using the Web UI and immediately afterwards want to copy the latest revision to cloud storage for disaster recovery. To save space and money I only keep a subset of revisions in the cloud.

Currently I am using the command line to manually supply the revision number to the copy command. Any of the solutions above (negative index on -r or -last n) would be a great help for me.

2 Likes

Hi, I am also interested in this feature.

1 Like

We are doing daily checks to make sure that the backup has run, using a bit of PowerShell.

# Dump the full revision listing to a file, then print its last line,
# which is the most recent revision.
C:\Duplicacy\Duplicacy.exe list > C:\Duplicacy\List.txt
Get-Content C:\Duplicacy\List.txt -Last 1