Misunderstanding how prune for additional storage works

I have a local and a remote storage; the remote storage was added using add -copy.

My backup/prune commands look like this (encryption/password related stuff omitted for brevity):

duplicacy-wrapper backup 
duplicacy-wrapper prune -a -keep 0:365 -keep 30:30 -keep 7:7 -keep 1:1 
duplicacy-wrapper prune -a -exclusive 
duplicacy-wrapper copy -from default -to b2
duplicacy-wrapper prune -a -storage b2 -keep 0:365 -keep 30:30 -keep 7:7 -keep 1:1 
duplicacy-wrapper prune -a -storage b2 -exclusive 
duplicacy-wrapper check -a 
duplicacy-wrapper check -a -storage b2

And this is a log showing the prune for each storage respectively:

Keep no snapshots older than 365 days
Keep 1 snapshot every 30 day(s) if older than 30 day(s)
Keep 1 snapshot every 7 day(s) if older than 7 day(s)
Keep 1 snapshot every 1 day(s) if older than 1 day(s)
Deleting snapshot backup at revision 22
Fossil collection 1 saved
The snapshot backup at revision 22 has been removed

Keep 1 snapshot every 30 day(s) if older than 30 day(s)
Keep 1 snapshot every 7 day(s) if older than 7 day(s)
Keep 1 snapshot every 1 day(s) if older than 1 day(s)
Deleting snapshot backup at revision 21
Deleting snapshot backup at revision 22
Deleting snapshot backup at revision 26
No snapshot to delete

As you can see, it doesn’t appear that the two-step fossil collection prune runs for the remote storage. I have to manually run the command later for it to work. I also tried using the -exhaustive flag, to no effect. I have two questions:

  • How do I effectively use the copy/prune command so that I don’t need to run prune on the remote storage?

  • What am I doing wrong w.r.t. the two-step fossil collection prune not running for the remote storage?

When you specify -exclusive, the prune command doesn’t run the two-step fossil collection algorithm. Instead, chunks referenced only by the snapshots to be deleted are deleted immediately, because exclusive access is assumed (meaning no other backup/copy jobs are running).

To avoid running prune on the remote storage, I think the best method would be to tag backups differently: assign one tag to the backups you want to keep and another tag to the rest. Then you can prune the local storage by tag and copy only the backups with the to-be-kept tag to the remote storage.
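For example, roughly like this (the tag names and the retention policy here are only placeholders for illustration):

duplicacy-wrapper backup -t keep             # a backup intended for the remote storage
duplicacy-wrapper backup -t temp             # a backup kept only locally
duplicacy-wrapper prune -a -t temp -keep 0:7 # prune only the temp-tagged snapshots on the local storage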

Thanks for the response, but I’m still confused. I think I already tried using -exclusive and -exhaustive together and ran into the same issue (I switched to -exclusive only after reading that -exhaustive should be used sparingly).

My expected behavior was that after running prune and then copy, I wouldn’t need to run prune on the remote storage, since those chunks/snapshots have already been deleted. Instead, it appears that what actually happens is that those chunks/snapshots are still copied over, and the chunks are not deleted until I manually run the command again at some later time. For example, here is the result of check after the commands from the OP:

1 snapshots and 8 revisions
Total chunk size is 7,508M in 2507 chunks
All chunks referenced by snapshot backup at revision 1 exist
All chunks referenced by snapshot backup at revision 10 exist
All chunks referenced by snapshot backup at revision 17 exist
All chunks referenced by snapshot backup at revision 23 exist
All chunks referenced by snapshot backup at revision 24 exist
All chunks referenced by snapshot backup at revision 25 exist
All chunks referenced by snapshot backup at revision 27 exist
All chunks referenced by snapshot backup at revision 28 exist
Storage set to blah-blah
Enter the passphrase for private.pem:********************************Listing all chunks
1 snapshots and 11 revisions
Total chunk size is 7,790M in 2622 chunks
All chunks referenced by snapshot backup at revision 1 exist
All chunks referenced by snapshot backup at revision 10 exist
All chunks referenced by snapshot backup at revision 17 exist
All chunks referenced by snapshot backup at revision 21 exist
All chunks referenced by snapshot backup at revision 22 exist
All chunks referenced by snapshot backup at revision 23 exist
All chunks referenced by snapshot backup at revision 24 exist
All chunks referenced by snapshot backup at revision 25 exist
All chunks referenced by snapshot backup at revision 26 exist
All chunks referenced by snapshot backup at revision 27 exist
All chunks referenced by snapshot backup at revision 28 exist

I think there is some inconsistency in which snapshots get deleted, and I need to run prune -exclusive -exhaustive on both storages at the same time, at some later point, to synchronize them again. I realize I could use bit-identical storages and rclone, but I would prefer not to, since if repo A has corrupted bits then repo B would also get those corrupted bits.

No, running prune after copy doesn’t guarantee that prune won’t be needed on the remote storage again. For instance, a revision kept today because it qualifies as an hourly backup may not be kept tomorrow, when it no longer qualifies as a daily backup.

There is a small pitfall when pruning local and remote storages - which you’ve just correctly highlighted - that you need to be aware of…

The issue arises from the way Duplicacy steps through the revisions to be collected for later deletion. It starts at the first revision (1 in your case) and determines the age difference between it and the next revision. If your local and remote storages were pruned on different days, or with different retention periods, you may end up with different revisions on each storage. When copy runs, it may fill the previously pruned gaps with a different set of revisions (because they differ between storages). Then when prune next runs, it deletes them again, and you end up in a cycle of recopying recently pruned revisions - because the revision numbers are off.

What you need is a situation where the set of revisions on each storage, after a prune, is the same. Then copying won’t fill the pruned gaps and will only copy new backups. The way I make sure of this is to 1) use the same retention periods for both storages, and 2) run both prunes on the same day.

To fix your setup, you can either copy revisions back in the opposite direction - you should end up with the same revisions on both storages, at which point you can abide by the ‘prune both storages’ rule - or manually delete revisions so that they match. Until you’ve fixed this mismatch, running your prune+copy script will probably keep going around in circles recopying pruned revisions.

However, I would encourage you not to use the -exclusive option - simply because you don’t need it in this scenario, and because there’s a risk if backups happen to be running at the same time. The two-step fossil collection algorithm is exactly that - it happens in two steps. Chunks found in the first run will normally get deleted in the second; the second run finds yet more chunks (for the third run) and deals with the chunks found in the first. But snapshots themselves are deleted immediately, so unless you need to see instant results with the chunks, you shouldn’t need -exclusive.
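Something like this should be enough - basically your script from the OP with the two -exclusive prune lines dropped, and the same -keep policy on both storages (just a sketch):

duplicacy-wrapper backup
duplicacy-wrapper prune -a -keep 0:365 -keep 30:30 -keep 7:7 -keep 1:1
duplicacy-wrapper copy -from default -to b2
duplicacy-wrapper prune -a -storage b2 -keep 0:365 -keep 30:30 -keep 7:7 -keep 1:1
duplicacy-wrapper check -a
duplicacy-wrapper check -a -storage b2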


Would it be possible to get tag support for the copy command?

As shown in the OP, I am doing that. Checking the logs, my most recent backup ended at 22:39:50, so everything happened on the same day.

Hmm, I think referencing the two-step fossil collection algorithm in my OP was a mistake because I wasn’t thinking straight. If I remember correctly, I was using -exclusive (which disables the two-step fossil collection) and -exhaustive to ensure the chunks were deleted as soon as possible.

Yep, however at some point your revision numbers get out of whack, e.g. you have revisions 21 and 22 on one storage but not on the other. This needs to be synchronised before you proceed with any more prunes, otherwise it’ll just be a waste of bandwidth.

You can of course do that, but it’s honestly safer not to; you just don’t need it under normal circumstances. Running -exhaustive from time to time is OK, but normally it won’t pick up anything unless a previous backup/prune aborted midway.
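If you do want an occasional -exhaustive pass, something like this should do it (a -dry-run preview first, then the real thing):

duplicacy-wrapper prune -a -exhaustive -dry-run   # preview which unreferenced chunks would be collected
duplicacy-wrapper prune -a -exhaustive            # collect chunks not referenced by any snapshot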

Sorry, I forgot that the copy command doesn’t support tags. I will add this after the new release of the web GUI.


I started over again and I’m not understanding what I’m doing wrong here. This is for the local storage:

Keep no snapshots older than 365 days
Keep 1 snapshot every 30 day(s) if older than 30 day(s)
Keep 1 snapshot every 7 day(s) if older than 7 day(s)
Keep 1 snapshot every 1 day(s) if older than 1 day(s)
Deleting snapshot backup at revision 2
Fossil collection 1 saved
The snapshot backup at revision 2 has been removed
Storage set to local-storage
Fossil collection 1 found
Fossils from collection 1 is eligible for deletion

It then proceeded to delete all of those chunks (as expected). Then for the remote storage:

Copied snapshot backup at revision 3
Storage set to remote-storage
Keep no snapshots older than 365 days
Keep 1 snapshot every 30 day(s) if older than 30 day(s)
Keep 1 snapshot every 7 day(s) if older than 7 day(s)
Keep 1 snapshot every 1 day(s) if older than 1 day(s)
Deleting snapshot backup at revision 2
An RSA private key is required to decrypt the chunk
Storage set to remote-storage
No snapshot to delete
Storage set to local-storage
Enter the passphrase for private.pem:********************************Listing all chunks
1 snapshots and 2 revisions
Total chunk size is 3,769M in 1357 chunks
All chunks referenced by snapshot backup at revision 1 exist
All chunks referenced by snapshot backup at revision 3 exist
Storage set to remote-storage
Enter the passphrase for private.pem:********************************Listing all chunks
1 snapshots and 3 revisions
Total chunk size is 3,853M in 1402 chunks
All chunks referenced by snapshot backup at revision 1 exist
All chunks referenced by snapshot backup at revision 2 exist
All chunks referenced by snapshot backup at revision 3 exist

Later, at some future date, I ran the command manually:

$ duplicacy-wrapper prune -a -storage b2 -keep 0:365 -keep 30:30 -keep 7:7 -keep 1:1
Storage set to remote-storage
Keep no snapshots older than 365 days
Keep 1 snapshot every 30 day(s) if older than 30 day(s)
Keep 1 snapshot every 7 day(s) if older than 7 day(s)
Keep 1 snapshot every 1 day(s) if older than 1 day(s)
Deleting snapshot backup at revision 2
Fossil collection 1 saved
The snapshot backup at revision 2 has been removed

Two things to note: the RSA private key message isn’t there, and it’s deleting revision 2 again. I don’t know what the difference is, since I ran the exact same command that I have in my cron and in the OP. Could it be a date/time issue, or b2 being slow? I’ll try adding a sleep and report the results.

Are you pruning one storage, performing a copy, and then pruning your other storage, as per your first post? I’m not sure if it’ll make any difference, but I’d rearrange the steps to: check first, then prune local, prune remote, and finally copy (sketched below).

Those two prunes need to be run as atomically as possible, but at the very least, on the same day. So make sure neither job spans across midnight as it could well remove a different set of revisions.
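Something like this, using your wrapper and -keep policies from the OP (keep the backup step wherever it currently sits):

duplicacy-wrapper check -a
duplicacy-wrapper check -a -storage b2
duplicacy-wrapper prune -a -keep 0:365 -keep 30:30 -keep 7:7 -keep 1:1
duplicacy-wrapper prune -a -storage b2 -keep 0:365 -keep 30:30 -keep 7:7 -keep 1:1
duplicacy-wrapper copy -from default -to b2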

Upon further inspection, I think this is what’s happening. The:

An RSA private key is required to decrypt the chunk

error message comes from the decrypt function, but the prune command doesn’t take a key parameter like some of the other commands do. However, without a deep analysis of the code, it looks like PruneSnapshots references DownloadFile/DownloadSnapshot, which then call Decrypt. So does prune need the private key to decrypt chunks, or am I misunderstanding the code?

I don’t use RSA encryption, so I’m unable to say. Have you tried providing the key file, or does it not prompt for one?

So it appears to be a bug after all. Here’s how to reproduce it: first, remove or back up your ~/.duplicacy/cache folder - this is required to force Duplicacy to download the snapshot file and therefore attempt to decrypt the chunk. Then run this command:

duplicacy-wrapper -d -v prune -storage b2 -d -a -keep 0:365 -keep 30:30 -keep 7:7 -keep 0:1

I just used 0:1 to force it to delete some revisions. And the result:

Downloaded file snapshots/backup/1
Downloaded file snapshots/backup/3
Downloaded file snapshots/backup/4
Downloaded file snapshots/backup/5
Snapshot backup at revision 1 to be deleted - older than 1 days
Snapshot backup at revision 3 to be deleted - older than 1 days
Deleting snapshot backup at revision 1
An RSA private key is required to decrypt the chunk

My guess is this is what’s happening: when prune first populates the cache, it encounters the RSA key error and doesn’t delete the snapshot. Then when I run prune again, the cache is already populated, so the prune succeeds. So the bug is that the error from Decrypt prevents prune from continuing, when it shouldn’t.
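If my guess is right, roughly this sequence shows it (the cache path as in my repro above; the -keep policy is the one from my cron):

mv ~/.duplicacy/cache ~/.duplicacy/cache.bak                                           # force the snapshot files to be downloaded again
duplicacy-wrapper -d prune -a -storage b2 -keep 0:365 -keep 30:30 -keep 7:7 -keep 1:1  # first run hits the RSA error, but the cache gets populated
duplicacy-wrapper -d prune -a -storage b2 -keep 0:365 -keep 30:30 -keep 7:7 -keep 1:1  # second run reads from the cache and the deletion goes through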

Yes, this is a bug: the prune command should not ask for the RSA private key. I think that is most likely caused by RSA encrypting a meta chunk that shouldn’t be RSA encrypted. I’ll dig into it tomorrow.

As for revision 2 being deleted only in the manual invocation - this is normal, because the age of a revision changes over time: revision 2 could have become older than 1 day and therefore fall under the -keep 1:1 rule, making it eligible for deletion this time.


The RSA encryption bug has been fixed by this commit: “Fixed a bug that caused all copied chunks to be RSA encrypted” (gilbertchen/duplicacy@cc88abd on GitHub).


Great! And will a new CLI version be released?

I’ll release a new CLI version this week.
