Why add a revision when it is identical with the previous one?

backup

#1

I noticed that the revision number increases every single time the backup runs, even when no changes have been made whatsoever:

Backup for **** at revision 30 completed
Files: 1960 total, 49,037M bytes; 0 new, 0 bytes
File chunks: 9971 total, 49,041M bytes; 0 new, 0 bytes, 0 bytes uploaded
Metadata chunks: 3 total, 1,222K bytes; 0 new, 0 bytes, 0 bytes uploaded
All chunks: 9974 total, 49,043M bytes; 0 new, 0 bytes, 0 bytes uploaded
Total running time: 00:00:03
Backup for **** at revision 31 completed
Files: 1960 total, 49,037M bytes; 0 new, 0 bytes
File chunks: 9971 total, 49,041M bytes; 0 new, 0 bytes, 0 bytes uploaded
Metadata chunks: 3 total, 1,222K bytes; 0 new, 0 bytes, 0 bytes uploaded
All chunks: 9974 total, 49,043M bytes; 0 new, 0 bytes, 0 bytes uploaded
Total running time: 00:00:04

and so on…

Is this intentional, and if so, what's the advantage compared with increasing the revision number only when the new revision differs from the previous one?


#2

Because this is the most straightforward implementation, i.e., no extra logic is needed to decide whether the new revision is the same as the latest one. I would also argue that it makes sense in some cases, for example when a different backup tag is specified, or to stick to a strict schedule of, say, one revision per day.
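
That said, if you really want to avoid keeping identical revisions, you could script it yourself around the existing commands. An untested sketch (run from the repository root; the grep/sed patterns are only guesses based on the log lines quoted above and may need adjusting):

# Untested sketch: run the backup, then drop the new revision again
# if the log reports that no files were new.
LOG=$(duplicacy backup | tee /dev/stderr)
if echo "$LOG" | grep -q 'Files: .*; 0 new, 0 bytes'; then
    # pull the revision number out of the "at revision N completed" line
    REV=$(echo "$LOG" | sed -n 's/.*at revision \([0-9][0-9]*\) completed.*/\1/p' | tail -n 1)
    # remove the identical revision; its chunks are all shared with the
    # previous revision anyway, so only the snapshot itself goes away
    duplicacy prune -r "$REV"
fi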


#3

My concern is just that with frequent (say, hourly) backups, it won't take long until the revision numbers become hard to read at a glance and invite typos when you have to enter them manually.

And the GUI restore dropdown listing all revisions becomes extremely long (and eventually also wide, forcing the user to re-adjust the width of the modal). Granted, the GUI mitigates the problem somewhat by including the time and date of the revision (as well as tags, I believe?).


#4

Came here to ask the same :slight_smile: I use Arq Backup on OS X, which I plan to move away from; it only creates a new revision if anything has changed.

Perhaps there could be a new command to check whether anything has changed, so we could script the upload, or a flag of sorts to allow for that behaviour?

In fairness, I could probably get away with backing up once a day, though I'd prefer hourly; plus the computer is asleep half of the day.

E.g. assuming I back up once an hour:

If I was working on a file throughout the day, the hourly backups would catch, say, 4 changes.

The next day I delete it and want to recover it; I'd probably have to go through each snapshot, not knowing which snapshot the file changed in? Unless there's an easy way to figure that out?

Maybe keep the revisions but have a flag to show only revisions with changes?

Another question: can Duplicacy prevent the computer from sleeping while backing up?
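
If it can't, I suppose on OS X I could just wrap the backup command in caffeinate so the machine doesn't idle-sleep while it runs. Untested thought:

caffeinate -i duplicacy backup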



#5

The history command can tell you how a file changes over time/revisions:

duplicacy history -r 10-100 relative/path/to/file

#6

@gchen thanks, I had spotted that :slight_smile: it's just not as friendly.

I think I might just script it with find . -mtime -1 or similar, to only back up if anything has changed.
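
Something along these lines, maybe (untested; the path is a placeholder, and find -mtime only looks at modification times, so it won't notice deletions):

#!/bin/sh
# Untested sketch: only back up when something in the repository was
# modified in the last day. The .duplicacy directory is excluded so the
# check isn't triggered by Duplicacy's own cache/log files.
cd /path/to/repository || exit 1
CHANGED=$(find . -type f ! -path './.duplicacy/*' -mtime -1 | head -n 1)
if [ -n "$CHANGED" ]; then
    duplicacy backup
fi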


#7

If a new revision were only created when something changed and you have a repository with very static data, you could reach the point where pruning removes those old snapshots, and your backup would then miss all the unchanged files until the next backup runs. Sure, the chances are small that you'll need the backup in exactly that window, but if it happens you'll have a very bad day.

@iluke the way Duplicacy works now, the file is included in each of your revisions, so you can restore it from the latest one. You're absolutely fine running a backup every hour, and if you don't need to keep all of these revisions for long, just prune regularly.
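
For example, a regular prune job could look something like this (the numbers are only an illustration; -keep n:m means keep one revision every n days for revisions older than m days, and n = 0 means delete them entirely):

# run regularly, e.g. once a day:
# no revisions older than 360 days, one per 30 days after 180 days,
# one per 7 days after 30 days, one per day after 7 days
duplicacy prune -keep 0:360 -keep 30:180 -keep 7:30 -keep 1:7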


#8

@Nelvin good point :slight_smile: hourly should be fine, thanks


#9

@Nelvin this is a solid reason for creating a new revision even when there is no change.