(My problem is apparently the same as in “On check: Failed to decrypt the chunk <hash> cipher: message authentication failed; retrying”, but I didn’t want to revive a Feb 2018 thread.)
I back up various things to Google Drive, using 6 different backups that share the same storage target (i.e. they use the same encryption password).
And I run `check` after each backup, but never ran `check -chunks` before, because my chunks folder on Google Drive is 5.6 TB.
A few days ago, I started running a `check -chunks -r <latest_revision>` on all 6 backups (sequentially).
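For context, the sequential checks were driven by something along these lines (a minimal sketch, not my actual script; the repository paths, the revision number and the log location are placeholders):

```sh
#!/bin/sh
# Sketch: run `check -chunks` on the latest revision of each repository,
# one after another. Paths and the revision number are placeholders.
for repo in /backups/photos /backups/documents /backups/mail; do
    cd "$repo" || exit 1
    # -chunks downloads and decrypts every chunk referenced by the revision,
    # instead of only verifying that the chunk files exist on the storage;
    # -threads 20 matches the 20 download threads mentioned further below.
    duplicacy check -chunks -threads 20 -r 100 2>&1 | tee "/tmp/check_$(basename "$repo").log"
done
```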
That found 4 corrupted chunks (plus a 5th I am still working to fix, so let’s ignore that one for now).
Most of them (I think 4 out of 5) show the following error:
```
2020-10-06 02:49:45.834 WARN DOWNLOAD_RETRY Failed to decrypt the chunk 6e134f4ea72cee3f8c4d64373f9ebeb60f706a0e43422ba97f627f49a7505f67: cipher: message authentication failed; retrying
2020-10-06 02:49:48.033 WARN DOWNLOAD_RETRY Failed to decrypt the chunk 6e134f4ea72cee3f8c4d64373f9ebeb60f706a0e43422ba97f627f49a7505f67: cipher: message authentication failed; retrying
2020-10-06 02:49:49.609 WARN DOWNLOAD_RETRY Failed to decrypt the chunk 6e134f4ea72cee3f8c4d64373f9ebeb60f706a0e43422ba97f627f49a7505f67: cipher: message authentication failed; retrying
2020-10-06 02:49:51.054 WARN DOWNLOAD_DECRYPT Failed to decrypt the chunk 6e134f4ea72cee3f8c4d64373f9ebeb60f706a0e43422ba97f627f49a7505f67: cipher: message authentication failed
```
I was able to recreate those chunks by renaming them to `<hash>.trash` on Google Drive, changing the backup `id` in the `preferences` file, and executing `backup` once. (I cancelled the backup as soon as the 4 chunks were uploaded, then changed the `id` value in `preferences` back.)
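In shell terms the procedure was roughly the following (a sketch rather than the exact commands; the backup id `my-backup` is a placeholder, and the rename itself was done in the Google Drive web UI):

```sh
# 1. On Google Drive, rename the corrupted chunk out of the way:
#    chunks/6e/<hash>  ->  chunks/6e/<hash>.trash   (done via the web UI)

# 2. Temporarily change the "id" field in .duplicacy/preferences so the next
#    backup runs under a brand-new backup id and re-uploads whatever chunks
#    are missing from the storage ("my-backup" is a placeholder):
sed -i.bak 's/"id": "my-backup"/"id": "my-backup-tmp"/' .duplicacy/preferences

# 3. Start a backup and cancel it (Ctrl-C) once the missing chunks are uploaded:
duplicacy backup -stats

# 4. Restore the original id:
mv .duplicacy/preferences.bak .duplicacy/preferences
```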
Some of the corrupted chunks were created in 2018, the others in April and July of this year:
```
-rw------- 1 gb staff 6872040 11 Sep 2018 6e/134f4ea72cee3f8c4d64373f9ebeb60f706a0e43422ba97f627f49a7505f67.trash
-rw------- 1 gb staff 6947622 4 Dec 2018 d7/c6ac07da0ae8196474aa0c25798aa87577beb4e4e11a8ade0f978af78aa3c2.trash
-rw------- 1 gb staff 16693798 12 Apr 04:59 0b/fc236e0d5ba29dd6d81a9fd5e449824a1b81c68742c350d183128963638e5d.trash
-rw------- 1 gb staff 3735846 7 Jul 11:47 f8/e966fe94683f58f72bb7f78ad2ea17ef1269d0495d185acf85c069a5a9f096.trash
```
One of the chunks (`6e`) has a different file size; the others are the same size before (corrupted) and after (OK). Here are the file sizes of the OK chunks:
```
-rw------- 1 gb staff 6872102 5 Oct 08:52 6e/134f4ea72cee3f8c4d64373f9ebeb60f706a0e43422ba97f627f49a7505f67
-rw------- 1 gb staff 6947622 5 Oct 11:22 d7/c6ac07da0ae8196474aa0c25798aa87577beb4e4e11a8ade0f978af78aa3c2
-rw------- 1 gb staff 16693798 5 Oct 11:26 0b/fc236e0d5ba29dd6d81a9fd5e449824a1b81c68742c350d183128963638e5d
-rw------- 1 gb staff 3735846 5 Oct 10:45 f8/e966fe94683f58f72bb7f78ad2ea17ef1269d0495d185acf85c069a5a9f096
```
I compared the hexdump of the corrupted and OK chunk `6e` (the one with a different file size):
```
$ hexdump -C 6e134f4ea72cee3f8c4d64373f9ebeb60f706a0e43422ba97f627f49a7505f67.trash | head -3
00000000 64 75 70 6c 69 63 61 63 79 00 a2 88 0c a0 dd 19 |duplicacy.......|
00000010 f9 5d df d9 79 f3 ef 7e 44 a9 06 1f e6 10 1d ef |.]..y..~D.......|
00000020 be ed 18 7a eb 4a c7 5f d3 4f 2e d0 a5 47 7c 47 |...z.J._.O...G|G|
$ hexdump -C 6e134f4ea72cee3f8c4d64373f9ebeb60f706a0e43422ba97f627f49a7505f67 | head -3
00000000 64 75 70 6c 69 63 61 63 79 00 e5 fb 23 72 64 26 |duplicacy...#rd&|
00000010 56 24 02 36 3e b5 a9 9c 26 45 45 f7 c8 cf 4e cc |V$.6>...&EE...N.|
00000020 aa 38 6f 77 ce f0 58 51 1c d0 0f 5f b9 04 c4 b5 |.8ow..XQ..._....|
```
Then I compared the beginning of the hexdump of the 3 other corrupted chunks against their OK counterparts. They seem to differ in the same way:
```
$ hexdump -C d7c6ac07da0ae8196474aa0c25798aa87577beb4e4e11a8ade0f978af78aa3c2.trash | head -3
00000000 64 75 70 6c 69 63 61 63 79 00 a2 c8 fb 6a 59 23 |duplicacy....jY#|
00000010 90 03 ec 06 a3 e3 dc 36 f4 e4 87 cb 70 e6 62 3b |.......6....p.b;|
00000020 0c b0 25 69 e0 e3 9c 37 3c ec 1b ad 7d c0 0e fb |..%i...7<...}...|
$ hexdump -C d7c6ac07da0ae8196474aa0c25798aa87577beb4e4e11a8ade0f978af78aa3c2 | head -3
00000000 64 75 70 6c 69 63 61 63 79 00 17 cd ac a2 75 38 |duplicacy.....u8|
00000010 c6 cd e6 41 d6 d0 0d 0d fc 15 0b 2e 56 f3 ce c6 |...A........V...|
00000020 80 b5 c3 bb 89 49 2d 0c f6 c8 f3 53 86 12 c5 e2 |.....I-....S....|
$ hexdump -C 0bfc236e0d5ba29dd6d81a9fd5e449824a1b81c68742c350d183128963638e5d.trash | head -3
00000000 64 75 70 6c 69 63 61 63 79 00 f5 06 ee 57 34 02 |duplicacy....W4.|
00000010 e1 a5 d4 d4 2b 58 64 7c 3d fb 38 49 84 44 c9 f4 |....+Xd|=.8I.D..|
00000020 2c 83 00 4f 81 5c 50 7d 1b 8d 4c f1 4a 52 d5 fd |,..O.\P}..L.JR..|
$ hexdump -C 0bfc236e0d5ba29dd6d81a9fd5e449824a1b81c68742c350d183128963638e5d | head -3
00000000 64 75 70 6c 69 63 61 63 79 00 05 c8 db e3 55 47 |duplicacy.....UG|
00000010 55 cb ac 73 c4 15 43 6c bd 72 ca 80 26 fd e8 3d |U..s..Cl.r..&..=|
00000020 86 e7 bb b5 3f be 46 49 9d 8f 07 96 66 11 9a 2e |....?.FI....f...|
$ hexdump -C f8e966fe94683f58f72bb7f78ad2ea17ef1269d0495d185acf85c069a5a9f096.trash | head -3
00000000 64 75 70 6c 69 63 61 63 79 00 60 d4 d9 53 01 88 |duplicacy.`..S..|
00000010 75 f6 13 5f 45 da d3 ee 2f 58 bf 6b dd d6 f0 7c |u.._E.../X.k...||
00000020 c5 c0 f2 1f 9f d2 d2 29 18 00 22 4a a3 90 2d ec |.......).."J..-.|
$ hexdump -C f8e966fe94683f58f72bb7f78ad2ea17ef1269d0495d185acf85c069a5a9f096 | head -3
00000000 64 75 70 6c 69 63 61 63 79 00 16 86 68 12 a7 71 |duplicacy...h..q|
00000010 55 0d df b3 2a f3 0a 40 6b d7 ab db 43 bb 52 b2 |U...*..@k...C.R.|
00000020 fd 2a d1 60 24 79 97 82 23 21 71 3a af e7 2d c8 |.*.`$y..#!q:..-.|
```
So right after the common 10-byte `duplicacy` header, the two copies are already completely different… Not sure what could have caused this corruption…
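To pin down exactly where the two copies diverge, `cmp` can be run on local copies of the corrupted and re-uploaded chunk (a quick sketch, shown here for the `6e` chunk):

```sh
# Print the position of the first differing byte between the two copies.
cmp 6e134f4ea72cee3f8c4d64373f9ebeb60f706a0e43422ba97f627f49a7505f67.trash \
    6e134f4ea72cee3f8c4d64373f9ebeb60f706a0e43422ba97f627f49a7505f67
# Expected output is something like "differ: char 11, line 1", i.e. the copies
# diverge immediately after the 10-byte "duplicacy" header, consistent with
# the hexdumps above.
```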
I checked “Manage versions” on Google Drive for the 4 failed chunks; all of them show just one version (i.e. they were written once and never changed afterwards).
I have 740,000 chunk files (in my most recent revisions). 4 out of 740,000 is a pretty low number, but it is still worrying, because the only way to notice that something is wrong is to re-download everything.
Sadly, I didn’t keep the backup logs from when those 4 chunks were created (I only kept 1 month of logs… I’ve changed that to keep all logs now.)
I will try to run a `check -chunks` for all revisions, not just the latest. Maybe I’ll be “lucky” and find a corrupted chunk created in the last month… But it took more than 36 hours to complete the `check -chunks` of just the latest revisions (with 20 threads, totalling 20-25 MB/s); I’m not sure how long it will take to check all revisions.
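(Back-of-the-envelope, assuming essentially all 5.6 TB of chunks has to be downloaded at the same 20-25 MB/s: 5,600,000 MB / ~22 MB/s ≈ 250,000 seconds, i.e. somewhere around 60-80 hours, or roughly 3 days of continuous downloading.)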
Ideas or pointers on how this could be debugged further?
Thanks.