Invalid chunks in 1/4 snapshots

Hi,

I use duplicacy to backup 4 machines, all using SFTP backend:

  • 2x on LAN: one laptop, and an always-on Mac Mini
  • 2x remote: one laptop, and one always-on server

After I saw the new release with a fix for a possible bug using SFTP, I ran checks on all those.
3 of them are all OK, but the remote laptop is not…
I tried to check the oldest and newest revisions, and some in between, and all checks seems to fail:
Failed to decrypt the chunk ... cipher: message authentication failed
The chunk in question differ, depending on which revision I’m checking (but will sometimes be the same, when I check revisions that are close together, eg. 254 & 255)

I tried a restore of the latest revision (255), and was able to restore all of it, after excluding one file (found which using -d option with restore). So it seems only one chunk was invalid in that snapshot.

I tried to see how I could find all the invalid chunks, but it seems that the check command aborts as soon as it finds one invalid chunk, so I can’t find all of them.
But I ran the check command by specifying -r n, and tried a few revisions number, and I was not able to find a revision with all the chunks OK…

I know this laptop creates backups late in the evenings (23h15); maybe the user often closes the laptop while it is still backing up? But I would expect the revision file (created in the snapshots folder) to be created when the backup completes…?
How could we explain that many invalid chunks?
Server seems fine; all (3) other computers target the same sftp server, and backup onto the same drive, which has plenty of space left. The laptop that creates backups from LAN is scheduled to create backup hourly, and all chunks are OK for that computer.
I would expect some chunks to be invalid, if the problem was that invalid chunks could be created if the client computer was turned off during a backup.

I will update the duplicacy binary on the remote laptop tonight, and grab old logs at the same time.

How can I “fix” this situation, for new backups to be valid?
Can I just rm the invalid chunk from the latest revision, and the next backup will re-upload it (if necessary)?
I don’t really mind wiping all past revisions, as long as new ones are valid. And if I can do that without forcing the remote client to re-upload everything, that would be great!

Thanks.

Update:

The chunk that is invalid, in revision 255: dae2e81d0e6b0bd15acb9411cb4d0f579925aadf0339299082fba9a98d0aa8f9

Many files are found in chunks/da/e2e81d0e6b0bd15acb9411cb4d0f579925aadf0339299082fba9a98d0aa8f9*:

$ ls -ltr chunks/da/e2e81d0e6b0bd15acb9411cb4d0f579925aadf0339299082fba9a98d0aa8f9*
-rw-r--r-- 1 bob bob 3178496 Jun 20 02:41 chunks/da/e2e81d0e6b0bd15acb9411cb4d0f579925aadf0339299082fba9a98d0aa8f9.fcnvtdcx.tmp
-rw-r--r-- 1 bob bob       0 Jun 20 04:42 chunks/da/e2e81d0e6b0bd15acb9411cb4d0f579925aadf0339299082fba9a98d0aa8f9.evxlycbb.tmp
-rw-r--r-- 1 bob bob       0 Jun 20 04:41 chunks/da/e2e81d0e6b0bd15acb9411cb4d0f579925aadf0339299082fba9a98d0aa8f9.rapixtzs.tmp
-rw-r--r-- 1 bob bob       0 Jun 20 04:42 chunks/da/e2e81d0e6b0bd15acb9411cb4d0f579925aadf0339299082fba9a98d0aa8f9.nigfpxqj.tmp
-rw-r--r-- 1 bob bob       0 Jun 20 04:42 chunks/da/e2e81d0e6b0bd15acb9411cb4d0f579925aadf0339299082fba9a98d0aa8f9.ytedyswb.tmp
-rw-r--r-- 1 bob bob 3211264 Jun 20 10:30 chunks/da/e2e81d0e6b0bd15acb9411cb4d0f579925aadf0339299082fba9a98d0aa8f9.tfimhmpw.tmp
-rw-r--r-- 1 bob bob  360448 Jun 20 12:31 chunks/da/e2e81d0e6b0bd15acb9411cb4d0f579925aadf0339299082fba9a98d0aa8f9.lxvtuoqb.tmp
-rw-r--r-- 1 bob bob  262144 Jun 20 12:31 chunks/da/e2e81d0e6b0bd15acb9411cb4d0f579925aadf0339299082fba9a98d0aa8f9

I hexdumped the 4 files which are not empty, and diffed em.

The .fcnvtdcx.tmp file is completely different (all lines are different). But for the 3 other files, they are the same, until they stop:

$ diff -bc diff -bc e2e81d0.hex e2e81d0.lxvtuoqb.hex
# First file ends at 00040000; everything is the same in both files before that

$ diff -bc e2e81d0.lxvtuoqb.hex e2e81d0.tfimhmpw.hex
# First file ends at 00058000; everything is the same in both files before that

I also checked how many .tmp file I have in there:

$ find chunks/ -name '*.tmp' | wc -l
2788

That seems like a lot…

So I checked the backup log for Jun 20:

Command: /Users/bob/bin/duplicacy -background -log backup -stats -threads 8
2020-06-20 00:40:20.875 INFO STORAGE_SET Storage set to sftp://bob@...
2020-06-20 00:40:26.785 INFO BACKUP_START Last backup at revision 179 found
2020-06-20 00:40:26.785 INFO BACKUP_INDEXING Indexing /Users/bob
2020-06-20 00:40:26.785 INFO SNAPSHOT_FILTER Parsing filter file /Users/bob/.duplicacy/filters
2020-06-20 00:40:26.786 INFO SNAPSHOT_FILTER Loaded 70 include/exclude pattern(s)
2020-06-20 02:41:28.815 INFO BACKUP_THREADS Use 8 uploading threads
2020-06-20 02:41:28.917 INFO PACK_END Packed .DS_Store (30724)
...
2020-06-20 02:41:46.168 INFO PACK_END Packed Library/Containers/com.tinyspeck.slackmacgap/Data/Library/Application Support/Slack/Service Worker/CacheStorage/4c237d5e33167c88df3e45d9c8b59fdd4d727472/036c0077-4430-4340-83a4-faaabc782981/3d24b06e6f97e91d_0 (109639)
2020-06-20 02:41:46.182 INFO PACK_END Packed Library/Containers/com.tinyspeck.slackmacgap/Data/Library/Application Support/Slack/Service Worker/CacheStorage/4c237d5e33167c88df3e45d9c8b59fdd4d727472/036c0077-4430-4340-83a4-faaabc782981/3db9f6f88b72034a_0 (499410)
2020-06-20 04:41:50.905 WARN SFTP_RETRY Encountered an error (Uploaded file but failed to store it at /mnt/hdd5/backups/duplicacy/MacBook-Air-de-bob-bob/chunks/25/3d883926ee47078363ca25f507cd9912eabbcc937dc88de0187fe8cff56bee: failed to send packet: EOF); retry after 1 second(s)
2020-06-20 04:41:50.905 WARN SFTP_RETRY Encountered an error (failed to send packet: EOF); retry after 1 second(s)
2020-06-20 04:41:50.905 WARN SFTP_RETRY Encountered an error (Uploaded file but failed to store it at /mnt/hdd5/backups/duplicacy/MacBook-Air-de-bob-bob/chunks/d5/d6a0bd4db34b7b53436a4358dacc57778432527153d476f4eb4d4d0b1e39c6: failed to send packet: EOF); retry after 1 second(s)
2020-06-20 04:41:50.905 WARN SFTP_RETRY Encountered an error (failed to send packet: EOF); retry after 1 second(s)
2020-06-20 04:41:50.939 WARN SFTP_RETRY Encountered an error (failed to send packet: EOF); retry after 1 second(s)
2020-06-20 04:41:50.941 WARN SFTP_RETRY Encountered an error (failed to send packet: EOF); retry after 1 second(s)
2020-06-20 04:41:50.945 WARN SFTP_RETRY Encountered an error (failed to send packet: EOF); retry after 1 second(s)
2020-06-20 04:41:50.949 WARN SFTP_RETRY Encountered an error (failed to send packet: EOF); retry after 1 second(s)
2020-06-20 04:41:53.518 WARN SFTP_RETRY Encountered an error (failed to send packet: EOF); retry after 2 second(s)
2020-06-20 04:41:54.681 WARN SFTP_RETRY Encountered an error (EOF); retry after 2 second(s)
2020-06-20 04:41:56.146 WARN SFTP_RETRY Encountered an error (EOF); retry after 2 second(s)
2020-06-20 04:41:57.042 WARN SFTP_RETRY Encountered an error (EOF); retry after 2 second(s)
2020-06-20 04:41:57.853 WARN SFTP_RETRY Encountered an error (EOF); retry after 2 second(s)
2020-06-20 04:41:58.656 WARN SFTP_RETRY Encountered an error (EOF); retry after 2 second(s)
2020-06-20 04:41:59.472 WARN SFTP_RETRY Encountered an error (EOF); retry after 2 second(s)
2020-06-20 04:42:00.228 WARN SFTP_RETRY Encountered an error (EOF); retry after 2 second(s)
2020-06-20 04:42:01.004 WARN SFTP_RETRY Encountered an error (EOF); retry after 4 second(s)
2020-06-20 04:42:01.770 WARN SFTP_RETRY Encountered an error (EOF); retry after 4 second(s)
2020-06-20 04:42:04.256 WARN SFTP_RETRY Encountered an error (EOF); retry after 4 second(s)
2020-06-20 04:42:05.038 WARN SFTP_RETRY Encountered an error (EOF); retry after 4 second(s)
2020-06-20 04:42:05.819 WARN SFTP_RETRY Encountered an error (EOF); retry after 4 second(s)
2020-06-20 04:42:06.561 WARN SFTP_RETRY Encountered an error (failed to send packet: EOF); retry after 4 second(s)
2020-06-20 04:42:07.412 WARN SFTP_RETRY Encountered an error (EOF); retry after 4 second(s)
2020-06-20 04:42:08.211 WARN SFTP_RETRY Encountered an error (EOF); retry after 4 second(s)
2020-06-20 04:42:08.955 WARN SFTP_RETRY Encountered an error (EOF); retry after 8 second(s)
2020-06-20 04:42:09.735 WARN SFTP_RETRY Encountered an error (EOF); retry after 8 second(s)
2020-06-20 04:42:10.506 WARN SFTP_RETRY Encountered an error (EOF); retry after 8 second(s)
2020-06-20 04:42:11.281 WARN SFTP_RETRY Encountered an error (EOF); retry after 8 second(s)
2020-06-20 04:42:12.033 WARN SFTP_RETRY Encountered an error (EOF); retry after 8 second(s)
2020-06-20 04:42:12.795 WARN SFTP_RETRY Encountered an error (EOF); retry after 8 second(s)
2020-06-20 04:42:13.596 WARN SFTP_RETRY Encountered an error (EOF); retry after 8 second(s)
2020-06-20 04:42:18.097 WARN SFTP_RETRY Encountered an error (failed to send packet: EOF); retry after 16 second(s)
2020-06-20 04:42:19.077 WARN SFTP_RETRY Encountered an error (EOF); retry after 16 second(s)
2020-06-20 04:42:19.077 WARN SFTP_RETRY Encountered an error (Uploaded file but failed to store it at /mnt/hdd5/backups/duplicacy/MacBook-Air-de-bob-bob/chunks/44/0e5293f288023e0e5ecf48519d9b36d67b787dc927d81b6bf8ba4aa3a0bec1: EOF); retry after 8 second(s)
2020-06-20 04:42:19.852 WARN SFTP_RETRY Encountered an error (EOF); retry after 16 second(s)
2020-06-20 04:42:20.627 WARN SFTP_RETRY Encountered an error (EOF); retry after 16 second(s)
2020-06-20 04:42:21.419 WARN SFTP_RETRY Encountered an error (EOF); retry after 16 second(s)
2020-06-20 04:42:22.204 WARN SFTP_RETRY Encountered an error (EOF); retry after 16 second(s)
2020-06-20 04:42:27.785 WARN SFTP_RETRY Encountered an error (failed to send packet: EOF); retry after 16 second(s)
2020-06-20 04:42:31.898 INFO UPLOAD_PROGRESS Uploaded chunk 12 size 6770581, 4KB/s 11:53:39 14.5%
...
2020-06-20 04:42:31.927 INFO PACK_END Packed Library/Containers/com.tinyspeck.slackmacgap/Data/Library/Application Support/Slack/Service Worker/CacheStorage/4c237d5e33167c88df3e45d9c8b59fdd4d727472/036c0077-4430-4340-83a4-faaabc782981/4115dc1e964c34fc_0 (175446)
2020-06-20 04:42:32.729 INFO UPLOAD_PROGRESS Uploaded chunk 16 size 2084779, 5KB/s 11:02:30 15.4%
2020-06-20 04:42:32.731 INFO PACK_END Packed Library/Containers/com.tinyspeck.slackmacgap/Data/Library/Application Support/Slack/Service Worker/CacheStorage/4c237d5e33167c88df3e45d9c8b59fdd4d727472/036c0077-4430-4340-83a4-faaabc782981/41aa9e9729ad8ac0_0 (592785)
...
2020-06-20 04:42:32.887 INFO PACK_END Packed Library/Containers/com.tinyspeck.slackmacgap/Data/Library/Application Support/Slack/Service Worker/CacheStorage/4c237d5e33167c88df3e45d9c8b59fdd4d727472/036c0077-4430-4340-83a4-faaabc782981/48181aa6ce30844a_0 (657378)
2020-06-20 04:42:35.065 INFO UPLOAD_PROGRESS Uploaded chunk 17 size 8223369, 6KB/s 08:29:46 19.1%
2020-06-20 04:42:35.074 INFO PACK_END Packed Library/Containers/com.tinyspeck.slackmacgap/Data/Library/Application Support/Slack/Service Worker/CacheStorage/4c237d5e33167c88df3e45d9c8b59fdd4d727472/036c0077-4430-4340-83a4-faaabc782981/4893b3e72cbaf660_0 (341719)
...
2020-06-20 04:42:35.112 INFO PACK_END Packed Library/Containers/com.tinyspeck.slackmacgap/Data/Library/Application Support/Slack/Service Worker/CacheStorage/4c237d5e33167c88df3e45d9c8b59fdd4d727472/036c0077-4430-4340-83a4-faaabc782981/6560dd22f5406ed5_0 (4946)
2020-06-20 04:42:36.807 WARN SFTP_RETRY Encountered an error (failed to send packet: EOF); retry after 1 second(s)
2020-06-20 04:42:36.912 WARN SFTP_RETRY Encountered an error (EOF); retry after 32 second(s)
2020-06-20 04:42:38.655 WARN SFTP_RETRY Encountered an error (EOF); retry after 32 second(s)
2020-06-20 04:42:40.288 WARN SFTP_RETRY Encountered an error (Uploaded file but failed to store it at /mnt/hdd5/backups/duplicacy/MacBook-Air-de-bob-bob/chunks/25/3d883926ee47078363ca25f507cd9912eabbcc937dc88de0187fe8cff56bee: failed to send packet: EOF); retry after 32 second(s)
2020-06-20 04:42:40.393 WARN SFTP_RETRY Encountered an error (EOF); retry after 32 second(s)
2020-06-20 06:42:40.770 ERROR UPLOAD_CHUNK Failed to upload the chunk dae2e81d0e6b0bd15acb9411cb4d0f579925aadf0339299082fba9a98d0aa8f9: ssh: handshake failed: read tcp 192.168.0.199:63391->70.31.203.222:2201: read: connection reset by peer

Of note:

  • Backup started at 00:40:20
  • Nothing logged between 00:40:26 and …
  • 02:41:28: BACKUP_THREADS Use 8 uploading threads
  • Nothing logged between 02:41:46 and …
  • 04:41:50: WARN SFTP_RETRY Encountered an error (Uploaded file but failed to store it at ...chunks/25/3d88392...: failed to send packet: EOF); retry after 1 second(s)
  • Errors continued up to 04:42:40; then nothing until …
  • 06:42:40: ERROR UPLOAD_CHUNK Failed to upload the chunk dae2e81d0e6b0bd15acb9411cb4d0f579925aadf0339299082fba9a98d0aa8f9: ssh: handshake failed: read tcp 192.168.0.199:63391->70.31.203.222:2201: read: connection reset by peer

That remote laptop is a Macbook Pro. I think duplicacy started before the laptop was closed for the night. Then, every 2h, the laptop was woken by the OS:

Power Nap
While sleeping, your mac can backup to using Time Machine and periodically check for new email, calendar, and other iCloud updates

And duplicacy tried to continue the backup when the laptop was woken, but the sftp connexion was long disconnected by then.

The same thing happened at 10:30 - 12:30.

...
2020-06-20 10:30:51.197 INFO PACK_END Packed Library/Containers/com.tinyspeck.slackmacgap/Data/Library/Application Support/Slack/Service Worker/CacheStorage/4c237d5e33167c88df3e45d9c8b59fdd4d727472/036c0077-4430-4340-83a4-faaabc782981/3db9f6f88b72034a_0 (499410)
2020-06-20 12:30:58.448 WARN SFTP_RETRY Encountered an error (failed to send packet: EOF); retry after 1 second(s)
...
2020-06-20 12:31:27.571 INFO PACK_END Packed Library/Containers/com.tinyspeck.slackmacgap/Data/Library/Application Support/Slack/Service Worker/CacheStorage/4c237d5e33167c88df3e45d9c8b59fdd4d727472/036c0077-4430-4340-83a4-faaabc782981/6560dd22f5406ed5_0 (4946)
2020-06-20 12:31:28.612 WARN SFTP_RETRY Encountered an error (EOF); retry after 1 second(s)
2020-06-20 12:31:29.357 WARN SFTP_RETRY Encountered an error (failed to send packet: EOF); retry after 16 second(s)
2020-06-20 12:31:30.233 WARN SFTP_RETRY Encountered an error (Uploaded file but failed to store it at /mnt/hdd5/backups/duplicacy/MacBook-Air-de-bob-bob/chunks/da/e2e81d0e6b0bd15acb9411cb4d0f579925aadf0339299082fba9a98d0aa8f9: failed to send packet: EOF); retry after 8 second(s)
2020-06-20 12:31:30.245 WARN SFTP_RETRY Encountered an error (EOF); retry after 16 second(s)
2020-06-20 12:31:31.102 WARN SFTP_RETRY Encountered an error (EOF); retry after 16 second(s)
2020-06-20 12:31:32.014 WARN SFTP_RETRY Encountered an error (EOF); retry after 2 second(s)
2020-06-20 12:31:34.612 WARN SFTP_RETRY Encountered an error (failed to send packet: EOF); retry after 16 second(s)
2020-06-20 12:31:34.806 WARN SFTP_RETRY Encountered an error (failed to send packet: EOF); retry after 16 second(s)
2020-06-20 12:31:35.607 WARN SFTP_RETRY Encountered an error (EOF); retry after 4 second(s)
2020-06-20 12:31:38.173 INFO UPLOAD_PROGRESS Uploaded chunk 12 size 5580726, 7KB/s 07:18:57 21.6%
2020-06-20 12:31:38.174 INFO PACK_END Packed Library/Containers/com.tinyspeck.slackmacgap/Data/Library/Application Support/Slack/Service Worker/CacheStorage/4c237d5e33167c88df3e45d9c8b59fdd4d727472/036c0077-4430-4340-83a4-faaabc782981/6568073c4af87872_0 (58858)
...
2020-06-20 12:31:41.131 INFO PACK_END Packed Library/Containers/com.tinyspeck.slackmacgap/Data/Library/Application Support/Slack/Service Worker/CacheStorage/4c237d5e33167c88df3e45d9c8b59fdd4d727472/036c0077-4430-4340-83a4-faaabc782981/a6a737efbef32941_0 (9077)
2020-06-20 12:31:41.533 WARN SFTP_RETRY Encountered an error (EOF); retry after 1 second(s)
2020-06-20 12:31:42.735 WARN SFTP_RETRY Encountered an error (EOF); retry after 2 second(s)
2020-06-20 12:31:44.582 ERROR UPLOAD_CHUNK Failed to upload the chunk 715ba7f01e2aef64938baca74ed091fc10025ed5e22091e65f598a41ae5c45d0: Uploaded file but failed to store it at /mnt/hdd5/backups/duplicacy/MacBook-Air-de-bob-bob/chunks/71/5ba7f01e2aef64938baca74ed091fc10025ed5e22091e65f598a41ae5c45d0: file does not exist

Considering you were able to restore revision 255 by excluding one file, your storage seems fixable… but simply deleting the bad chunks won’t get Duplicacy to re-upload them, sadly.

What you can do, however, is use a new backup ID, then Duplicacy will treat it as a new repository. It’ll skip chunks that already exist on the storage, so you’ll need to identify all the bad chunks related to that file (not the .tmp files, the chunk files with no file extension) and delete or rename them out of the way.

Then do another test restore, from that new backup ID (revision 1) and if successful, tidy up your storage with a prune -exclusive.

By the way, there is an edge case - where some of your old snapshots are referencing bad chunks, so keeping them around may be risky for future backups. i.e. Bad chunks exist and are still referenced, so won’t get pruned any time soon. If future backups, using the new backup ID, somehow re-reference those chunks through the deterministic nature of the hashing algorithm, your new backups become ‘infected’. An unlikely scenario, but possible.

So, if able, try to identify all bad chunks and simply delete them. That’s if you want to keep your old snapshots around (which you can do; just not their chunks). To be safe, just delete/prune all previous snapshots and do a thorough prune -exclusive, but only after your new backup ID revisions 1 has completed 100%.

Thanks for providing the detailed logs and other useful information. It is clear now that ignoring the returned value of Close() is definitely a bug. If the connection error is detected during the call to Close(), then Duplicacy will continue to rename the incomplete chunk file, which may succeed if another thread happens to create a new connection before the rename operation.

I think CLI 2.6.2 should fix this bug. And @Droolio is right, you need to clean up bad chunks (even at the cost of removing some revision) otherwise they might corrupt future backups.

1 Like

Thanks for the pointer.
What’s is the best way to find all the bad chunks?
Check each revision one at a time, and when it reports a bad chunk, delete it, and the revision?

In this case I might suggest check -chunks, but I dunno if you can tell it to keep checking til the end and report all bad chunks. So you might have to run it several times? Obviously best if you can run Duplicacy local to the storage.

Seems like it would be helpful to modify the behavior of check so that it lists all invalid chunks in one go instead of aborting on the first one–not sure how hard that would be to implement though.

I am currently in a situation where I have a bad chunk due to the SFTP bug but I’m having a heck of a time tracking down which revision it belongs to myself