Prune -keep 1:1 keeps oldest revision

Please describe what you are doing to trigger the bug:
Running prune -keep 1:1 on a snapshot with multiple revisions during the day keeps the oldest revision, so all changes between the oldest and youngest revisions are lost.

This may happen with other -keep options. I have only seen it with -keep 1:1.

Please describe what you expect to happen (but doesn’t):
Would expect it to keep the youngest revision to preserve the changes that occurred during the day.

Please describe what actually happens (the wrong behaviour):

Hourly backup, prune and check (in that order). Got 24 revisions for the day.

I would expect revision 32 to have been kept as it encompasses all changes for that day. Instead revision 12 was kept.

11:00 pm:

2019-09-07 23:00:24.992 INFO SNAPSHOT_CHECK

      snap | rev |                          | files |   bytes | chunks |   bytes | uniq |   bytes |  new |    bytes |
 test-snap |  12 | @ 2019-09-07 00:00       |  6609 | 28,670M |   5723 | 27,532M |    3 |  1,692K |   70 | 223,587K |
 test-snap |  13 | @ 2019-09-07 01:00       |  6612 | 28,670M |   5724 | 27,532M |    3 |  1,692K |    4 |   1,738K |
 test-snap |  14 | @ 2019-09-07 02:00       |  6615 | 28,670M |   5725 | 27,532M |    3 |  1,693K |    4 |   1,739K |
 test-snap |  15 | @ 2019-09-07 03:00       |  6618 | 28,670M |   5726 | 27,532M |    3 |  1,693K |    4 |   1,739K |
 test-snap |  16 | @ 2019-09-07 04:00       |  6621 | 28,670M |   5727 | 27,532M |    3 |  1,693K |    4 |   1,740K |
 test-snap |  17 | @ 2019-09-07 05:00       |  6624 | 28,670M |   5728 | 27,532M |    3 |  1,694K |    4 |   1,741K |
 test-snap |  18 | @ 2019-09-07 06:00       |  6627 | 28,670M |   5729 | 27,532M |    3 |  1,694K |    4 |   1,741K |
 test-snap |  19 | @ 2019-09-07 07:00       |  6630 | 28,670M |   5730 | 27,532M |    3 |  1,694K |    4 |   1,742K |
 test-snap |  20 | @ 2019-09-07 08:00       |  6633 | 28,670M |   5731 | 27,532M |    3 |  1,695K |    4 |   1,743K |
 test-snap |  21 | @ 2019-09-07 09:00       |  6640 | 28,670M |   5732 | 27,532M |    3 |  1,695K |    4 |   1,752K |
 test-snap |  22 | @ 2019-09-07 10:00       |  6645 | 28,670M |   5732 | 27,532M |    3 |  1,696K |    5 |   2,422K |
 test-snap |  23 | @ 2019-09-07 11:00       |  6648 | 28,670M |   5733 | 27,532M |    3 |  1,696K |    4 |   1,748K |
 test-snap |  24 | @ 2019-09-07 12:00       |  6651 | 28,670M |   5734 | 27,532M |    3 |  1,696K |    5 |   2,415K |
 test-snap |  25 | @ 2019-09-07 13:00       |  6654 | 28,670M |   5735 | 27,533M |    3 |  1,697K |    4 |   1,749K |
 test-snap |  26 | @ 2019-09-07 14:00       |  6657 | 28,671M |   5736 | 27,533M |    3 |  1,697K |    4 |   1,750K |
 test-snap |  27 | @ 2019-09-07 15:00       |  6659 | 28,675M |   5740 | 27,536M |    4 |  1,994K |   35 |  87,382K |
 test-snap |  28 | @ 2019-09-07 18:00       |  6671 | 28,675M |   5740 | 27,536M |    3 |  1,995K |    4 |   2,100K |
 test-snap |  29 | @ 2019-09-07 19:00       |  6674 | 28,675M |   5741 | 27,536M |    3 |  1,995K |    4 |   2,054K |
 test-snap |  30 | @ 2019-09-07 21:00       |  6680 | 28,675M |   5742 | 27,537M |    3 |  1,996K |    5 |   2,429K |
 test-snap |  31 | @ 2019-09-07 22:00       |  6683 | 28,675M |   5743 | 27,537M |    3 |  1,996K |    4 |   2,057K |
 test-snap |  32 | @ 2019-09-07 23:00       |  6686 | 28,675M |   5744 | 27,537M |    5 |  2,427K |    5 |   2,427K |

Midnight:

2019-09-08 00:00:25.850 INFO SNAPSHOT_CHECK
      snap | rev |                          | files |   bytes | chunks |   bytes | uniq |   bytes |  new |    bytes |
 test-snap |  12 | @ 2019-09-07 00:00       |  6609 | 28,670M |   5723 | 27,532M |    5 |  2,304K |   70 | 223,587K |

Prune:

Options: [-log prune -storage test -keep 7:30 -keep 1:1 -a]
2019-09-08 00:00:07.404 INFO STORAGE_SET Storage set to /path/to/storage
2019-09-08 00:00:08.415 INFO RETENTION_POLICY Keep 1 snapshot every 7 day(s) if older than 30 day(s)
2019-09-08 00:00:08.415 INFO RETENTION_POLICY Keep 1 snapshot every 1 day(s) if older than 1 day(s)
2019-09-08 00:00:08.969 INFO SNAPSHOT_DELETE Deleting snapshot test-snap at revision 13
2019-09-08 00:00:09.009 INFO SNAPSHOT_DELETE Deleting snapshot test-snap at revision 14
2019-09-08 00:00:09.039 INFO SNAPSHOT_DELETE Deleting snapshot test-snap at revision 15
2019-09-08 00:00:09.068 INFO SNAPSHOT_DELETE Deleting snapshot test-snap at revision 16
2019-09-08 00:00:09.100 INFO SNAPSHOT_DELETE Deleting snapshot test-snap at revision 17
2019-09-08 00:00:09.126 INFO SNAPSHOT_DELETE Deleting snapshot test-snap at revision 18
2019-09-08 00:00:09.155 INFO SNAPSHOT_DELETE Deleting snapshot test-snap at revision 19
2019-09-08 00:00:09.183 INFO SNAPSHOT_DELETE Deleting snapshot test-snap at revision 20
2019-09-08 00:00:09.212 INFO SNAPSHOT_DELETE Deleting snapshot test-snap at revision 21
2019-09-08 00:00:09.245 INFO SNAPSHOT_DELETE Deleting snapshot test-snap at revision 22
2019-09-08 00:00:09.273 INFO SNAPSHOT_DELETE Deleting snapshot test-snap at revision 23
2019-09-08 00:00:09.300 INFO SNAPSHOT_DELETE Deleting snapshot test-snap at revision 24
2019-09-08 00:00:09.328 INFO SNAPSHOT_DELETE Deleting snapshot test-snap at revision 25
2019-09-08 00:00:09.355 INFO SNAPSHOT_DELETE Deleting snapshot test-snap at revision 26
2019-09-08 00:00:09.384 INFO SNAPSHOT_DELETE Deleting snapshot test-snap at revision 27
2019-09-08 00:00:09.414 INFO SNAPSHOT_DELETE Deleting snapshot test-snap at revision 28
2019-09-08 00:00:09.444 INFO SNAPSHOT_DELETE Deleting snapshot test-snap at revision 29
2019-09-08 00:00:09.474 INFO SNAPSHOT_DELETE Deleting snapshot test-snap at revision 30
2019-09-08 00:00:09.505 INFO SNAPSHOT_DELETE Deleting snapshot test-snap at revision 31
2019-09-08 00:00:09.532 INFO SNAPSHOT_DELETE Deleting snapshot test-snap at revision 32

@gchen i thought this pruning bug was already fixed some time ago.

This is not a bug. The current implementation always keeps the oldest revision, and then subsequent revisions that are too close will be removed.

I think it is a bug: Duplicacy was supposed to save the first revision of the day, and delete the rest from that day. That’s what i expect at least, and this seems to be what @gimgol expects as well. (i’m 100% sure there was a topic about this somewhere but for the love of xenu, i can’t find it now)

If it’s not a bug it is at least unintuitive for the end user and likely to cause confusion.

If I create a file on Sep-07 I expect it to be in the revision for that day. The way prune is currently implemented, files created on a given day may not be in the revision for that day. For example:

Wills-Air:0 will$ ~/.duplicacy-web/bin/duplicacy_osx_x64_2.2.3 list -r 12
Repository set to /Users/will
Storage set to sftp://root@192.168.1.4//srv/dev-disk-by-label-data/tmp
Snapshot wills-air revision 12 created at 2019-09-07 00:00

Let's see which Duplicacy log files we have:

Wills-Air:0 will$ ~/.duplicacy-web/bin/duplicacy_osx_x64_2.2.3 list -files -r 12 | grep backup-20190907
691 2019-09-07 00:00:03 be2ba3bf8b325a723c3fd3fe5ad115608f2007faf597ceeffc7f6874c79af4e1 .duplicacy-web/logs/backup-20190907-000001.log

I know I made hourly files …

Wills-Air:0 will$ ~/.duplicacy-web/bin/duplicacy_osx_x64_2.2.3 list -r 33
Repository set to /Users/will
Storage set to sftp://root@192.168.1.4//srv/dev-disk-by-label-data/tmp
Snapshot wills-air revision 33 created at 2019-09-08 00:00

Wills-Air:0 will$ ~/.duplicacy-web/bin/duplicacy_osx_x64_2.2.3 list -files -r 33 | grep backup-20190907
2107 2019-09-07 00:00:05 36161c47dffe3befbe8f371d01757c1473baad8fb67fd706b8fbc21f5f0e7043 .duplicacy-web/logs/backup-20190907-000001.log
2108 2019-09-07 01:00:04 010a3d9c25336a05d67d2dc2bea03cf1fbe9bc9d202f8151ae7705dac443990c .duplicacy-web/logs/backup-20190907-010001.log
2108 2019-09-07 02:00:04 5cb47e3bcb802e65e629169b355c7047a08781e141af4e357b665fa773bc40a8 .duplicacy-web/logs/backup-20190907-020001.log
2108 2019-09-07 03:00:04 788d305bfa831882276af32d7b710dfb3e0e4d9a5c66a86185f5e6cd152b7a7f .duplicacy-web/logs/backup-20190907-030001.log
2109 2019-09-07 04:00:04 21d734cfd77699da3b4d843d9343beffac032bebdea3fea01c7f747efaf622db .duplicacy-web/logs/backup-20190907-040001.log
2109 2019-09-07 05:00:04 f0af03711c4da7cdba82d4931279ec2cacb3d13f656375e11109e6261b38863a .duplicacy-web/logs/backup-20190907-050001.log
2109 2019-09-07 06:00:04 91cd51af7b9c26756df72cdf1fc87130fecb9e681d39baff2270f74d267d85a0 .duplicacy-web/logs/backup-20190907-060001.log
2109 2019-09-07 07:00:04 93697b136856dac11aaafe29750629af8fd9b850189b9c204997711f78a4ce48 .duplicacy-web/logs/backup-20190907-070001.log
2109 2019-09-07 08:00:06 c2c46fecf3d89a9b863f992af30a766032836d3c336044ea8aa771fffb2fa810 .duplicacy-web/logs/backup-20190907-080001.log
2732 2019-09-07 09:00:05 7a2da2702a312b6a737677d88a9b8cf0091d90a91b840140a69e2db293d2502e .duplicacy-web/logs/backup-20190907-090001.log
3036 2019-09-07 10:00:05 19a84c5744fd6a612f3299830b4cc91d2fe45b99aba30940ef799b88446a44c9 .duplicacy-web/logs/backup-20190907-100001.log
2109 2019-09-07 11:00:05 2ac77d4fb04b3e33e38eddb24b64cff85e86ad5bb82398dfd918640152c06976 .duplicacy-web/logs/backup-20190907-110001.log
2703 2019-09-07 12:00:06 555665d9555dec67e454de43b556a32c5dd2519a693490353c8475a664e7a539 .duplicacy-web/logs/backup-20190907-120001.log
2109 2019-09-07 13:00:04 c5009e9cba45ab213b0a3d19d6efd93413b35bb0ae0629efadbc17a5506a1dc4 .duplicacy-web/logs/backup-20190907-130001.log
2109 2019-09-07 14:00:04 613913604e586fe5e147cb2e838df1261f563f66822761b99f5115980147143d .duplicacy-web/logs/backup-20190907-140001.log
5476 2019-09-07 15:00:27 1b2faab1e5f9e84a887cebdb867a6a52634a840cbdbde0b1ed57deeb73213fc1 .duplicacy-web/logs/backup-20190907-150001.log
527 2019-09-07 16:00:59 85d768211f49c977b1a4b9093024411565a7c6664213f79ad96b18185c1256cf .duplicacy-web/logs/backup-20190907-160059.log
527 2019-09-07 17:04:14 db2a9e90cb52fc86654640f380c7098d4d513c8e5d2c7beb1575af372ad39ab9 .duplicacy-web/logs/backup-20190907-170414.log
3106 2019-09-07 18:00:07 ca59236214c72e9b9ba008d8977f49bb100746c123374fcad8973d4246163230 .duplicacy-web/logs/backup-20190907-180002.log
2110 2019-09-07 19:00:06 375867c4bef93bc2b2aa7463775255c442170a72e4962870a952cbc43bb57bdd .duplicacy-web/logs/backup-20190907-190001.log
527 2019-09-07 20:04:13 2b932d273d7e358dbb5b2ac8d898577f9c4424c37f8a9c5426e0d460aae6d784 .duplicacy-web/logs/backup-20190907-200413.log
3615 2019-09-07 21:00:06 48b911886c03df06b0ad95ec4cac41b0a3aae12b77f3c9829391261c4bab3934 .duplicacy-web/logs/backup-20190907-210001.log
2109 2019-09-07 22:00:06 b4a214c750437d50ca7be848322ea9cf69ddc2e160e30e64d2c88f248a33d277 .duplicacy-web/logs/backup-20190907-220001.log
2983 2019-09-07 23:00:07 e5248abc72d72188b3c166da2366bb8e98d7ebb3dbd1e3d220d28b9817f38a2c .duplicacy-web/logs/backup-20190907-230001.log

And it gets more confusing …

Note that backup-20190907-000001.log shows up in both revisions.

And a user is going to want to choose revision 12 in the Web-GUI because that is the day that they remember creating the file.

But when they get to the file listing for revision 12 they only have one file.

While revision 33 has all of them.

This is really quite an awkward way of thinking about backup snapshots.

IMO, it’s not unintuitive to expect files in backups to only exist after they were actually made, going by the full timestamp, including the time of day. That means considering the time when checking timestamps, rather than zeroing in on the ‘day’ alone.

In fact, it’s not even true that a file created/modified on a given day will necessarily feature in a backup created that day - regardless of pruning. e.g. something created at the end of the day (e.g. 23:01), with hourly snapshots.

Likewise, I don’t expect files that were backed up in the last month (say at the end of the month), to effectively exist in every snapshot of the ‘last month’ (e.g. August), and then the most recent kept because we want to keep the youngest ‘monthly’… because… 08?

Pruning works by intervals. Intervals of one day are really no more special than weekly or monthly. Duplicacy starts from the oldest snapshot and works forward, checking the age range and applying the matching retention policy. Any snapshot whose gap from the previously kept one is less than the policy interval gets deleted.
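
To make the interval idea concrete, here is a rough Python sketch of the behaviour as described in this thread. It is not Duplicacy's actual code: the function name `prune_by_interval` is made up for illustration, and it assumes every snapshot passed in already falls inside the age range of a single -keep policy.

```python
from datetime import datetime, timedelta

def prune_by_interval(snapshots, interval):
    """Hypothetical helper: keep the oldest snapshot, then delete any later
    snapshot that is closer than `interval` to the last one kept.

    snapshots: list of (revision, created_at) tuples.
    Returns (kept_revisions, deleted_revisions).
    """
    kept, deleted = [], []
    last_kept_time = None
    # Walk from the oldest snapshot to the newest.
    for revision, created_at in sorted(snapshots, key=lambda s: s[1]):
        if last_kept_time is not None and created_at - last_kept_time < interval:
            deleted.append(revision)
        else:
            kept.append(revision)
            last_kept_time = created_at
    return kept, deleted

# Revisions 12..32 with the creation hours shown in the check log above.
day = datetime(2019, 9, 7)
hours = list(range(16)) + [18, 19, 21, 22, 23]
snapshots = [(12 + i, day + timedelta(hours=h)) for i, h in enumerate(hours)]

kept, deleted = prune_by_interval(snapshots, timedelta(days=1))
print(kept)     # [12]
print(deleted)  # revisions 13 through 32
```

Under these assumptions the sketch gives the same result as the prune log: revision 12 survives and 13 to 32 are removed, because each of them is less than one day after the last kept revision.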

What you’re suggesting is that it should reverse this process when it gets to 1-day intervals. This seems unintuitive and kinda pointless considering there’s no guarantee - as I said above - a file created on say the 9th will feature in a backup created on the 9th. Edit: it may feature in a backup of the 10th or even later.

If I had said “may not be in the selected revision for that day” would that be less awkward?

I get your point, but I think the result of the “end of day” scenario would make more sense to the end user than the current process, which, with hourly snapshots, effectively ignores edits made after 00:59 until the next selected snapshot.

Brings up another question. Do those files then have to be reprocessed and uploaded? I can’t seem to find that happening in the logs.

So how are we expecting a user to know where to look for their file from the 9th? Start at the 9th and keep listing until they find it? Using history is an option, but what if they don’t remember the full path or are only using the Web UI? That could be several long, anxious minutes.

Not sure what you mean, but pruning only removes snapshots (and collects and deletes chunk fossils). Why would it re-upload files?

I agree that Duplicacy’s ability to find files is very lacking…

However, I expect - as a user - to look at the complete timestamp when looking for files. Even if I can remember when I created a file, I’m gonna look at several snapshots created after that day to see for certain when it was first backed up.
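
As a rough illustration of that way of searching, the Python sketch below picks the earliest remaining revision created at or after a file's timestamp. The helper name `first_revision_at_or_after` is hypothetical, and the result is only a starting point: a file written while a backup is running can still land in a revision whose listed time is slightly earlier.

```python
from bisect import bisect_left
from datetime import datetime

def first_revision_at_or_after(revisions, file_time):
    """Hypothetical helper: return the first revision created at or after
    file_time, or None if there is no such revision yet.

    revisions: list of (revision, created_at) sorted by created_at.
    """
    times = [created_at for _, created_at in revisions]
    i = bisect_left(times, file_time)
    return revisions[i][0] if i < len(revisions) else None

# The two revisions left after the prune in this thread.
remaining = [
    (12, datetime(2019, 9, 7, 0, 0)),
    (33, datetime(2019, 9, 8, 0, 0)),
]

# A log file written at 15:00:27 on Sep-07 is first worth looking for in revision 33.
print(first_revision_at_or_after(remaining, datetime(2019, 9, 7, 15, 0, 27)))  # 33
```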

I mean this isn’t really any different from other snapshot-based backups, including Shadow Copies / Previous Versions in Windows, where the default snapshot times are 7am and 12pm. If a file was created after midday, it’ll be in the next day’s 7am shadow copy.

For sure though, Duplicacy needs a better way to find files.

At the very least, the Web Edition should try to locally index files for each repository. That might go against one of Duplicacy’s principles of not keeping a database, but a temporary local index used exclusively for its own locating of files in the aid of restore etc., would be much needed.
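
Something like the following Python sketch is what I have in mind for such an index. It is only an assumption about how it could work, not an existing Duplicacy feature: `build_index` is a made-up name, and the parsing assumes each file line of `list -files` output looks like the ones pasted earlier in this thread (size, date, time, hash, path).

```python
from collections import defaultdict

def build_index(listings):
    """Hypothetical helper: map each file path to the revisions containing it.

    listings: dict of revision number -> list of `list -files` output lines.
    """
    index = defaultdict(list)
    for revision in sorted(listings):
        for line in listings[revision]:
            parts = line.split(None, 4)
            # Skip anything that is not a file line, e.g. "Repository set to ...".
            if len(parts) != 5 or not parts[0].isdigit():
                continue
            path = parts[4]
            index[path].append(revision)
    return index

# A few lines copied from the listings above.
listings = {
    12: [
        "Repository set to /Users/will",
        "691 2019-09-07 00:00:03 be2ba3bf8b325a723c3fd3fe5ad115608f2007faf597ceeffc7f6874c79af4e1 .duplicacy-web/logs/backup-20190907-000001.log",
    ],
    33: [
        "2107 2019-09-07 00:00:05 36161c47dffe3befbe8f371d01757c1473baad8fb67fd706b8fbc21f5f0e7043 .duplicacy-web/logs/backup-20190907-000001.log",
        "2108 2019-09-07 01:00:04 010a3d9c25336a05d67d2dc2bea03cf1fbe9bc9d202f8151ae7705dac443990c .duplicacy-web/logs/backup-20190907-010001.log",
    ],
}

index = build_index(listings)
print(index[".duplicacy-web/logs/backup-20190907-000001.log"])  # [12, 33]
print(index[".duplicacy-web/logs/backup-20190907-010001.log"])  # [33]
```

With an index like this, the restore screen could answer "which revisions contain this file?" without the user having to list each revision in turn.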