Understanding backup and check stats (revisions and storage sizes)

long term duplicacy user here (licensed). been using it for more than 3-4 years IIRC. before the actual question - i’d like to point out that i did try to search the forum extensively before making a new post. the closest threads to my question are "Backup vs check stats confusion" (#3 by 4degrees) and "Understanding the -tabular output of the check command", however they don’t answer the questions below.

context: i use duplicacy in a single source (synology NAS), single storage (B2) situation.

first some logs/screenshots:
(a) tail of the output from the most recent backup log:

2022-01-03 09:59:00.667 INFO BACKUP_END Backup for /backuproot at revision 124 completed
2022-01-03 09:59:00.667 INFO BACKUP_STATS Files: 2682641 total, 8530G bytes; 881 new, 15,666M bytes
2022-01-03 09:59:00.667 INFO BACKUP_STATS File chunks: 1865691 total, 8533G bytes; 2922 new, 14,610M bytes, 14,592M bytes uploaded
2022-01-03 09:59:00.667 INFO BACKUP_STATS Metadata chunks: 267 total, 1,349M bytes; 23 new, 178,457K bytes, 56,824K bytes uploaded
2022-01-03 09:59:00.667 INFO BACKUP_STATS All chunks: 1865958 total, 8534G bytes; 2945 new, 14,784M bytes, 14,648M bytes uploaded
2022-01-03 09:59:00.667 INFO BACKUP_STATS Total running time: 00:57:56

(b) json payload sent by backup to the report_url

{
	"computer": "duplicacy",
	"directory": "/backuproot",
	"end_time": 1641184141,
	"new_chunk_size": 15502147584,
	"new_chunks": 2945,
	"new_file_chunk_size": 15319695360,
	"new_file_chunks": 2922,
	"new_file_size": 16426991616,
	"new_files": 881,
	"new_metadata_chunk_size": 182739968,
	"new_metadata_chunks": 23,
	"result": "Success",
	"start_time": 1641180659,
	"storage": "thevault",
	"storage_url": "b2://somewhere",
	"total_chunk_size": 9163312726016,
	"total_chunks": 1865958,
	"total_file_chunk_size": 9162238984192,
	"total_file_chunks": 1865691,
	"total_file_size": 9159017758720,
	"total_files": 2682641,
	"total_metadata_chunk_size": 1414529024,
	"total_metadata_chunks": 267,
	"uploaded_chunk_size": 15359541248,
	"uploaded_file_chunk_size": 15300820992,
	"uploaded_metadata_chunk_size": 58187776
}

(c) the tabular output from the most recent check (after the above backup):

2022-01-03 11:37:25.436 INFO SNAPSHOT_CHECK 
     snap | rev |                          |   files | bytes |  chunks | bytes |    uniq |    bytes |     new |    bytes |
 thevault |  95 | @ 2021-06-18 21:02       | 2643407 | 7462G | 1073501 | 4949G |      42 |  45,505K | 1073501 |    4949G |
 thevault |  98 | @ 2021-07-09 21:02       | 2643903 | 7470G | 1075052 | 4956G |     141 | 683,896K |    1593 |   7,541M |
 thevault |  99 | @ 2021-07-16 21:02       | 2645020 | 7503G | 1081510 | 4987G |      26 |  37,809K |    7525 |  38,018M |
 thevault | 100 | @ 2021-07-24 10:38       | 2650286 | 7531G | 1086457 | 5010G |      27 |  45,918K |    4981 |  23,698M |
 thevault | 103 | @ 2021-08-06 21:02       | 2650140 | 7598G | 1094636 | 5049G |      29 |  36,722K |    8208 |  40,239M |
 thevault | 104 | @ 2021-08-13 21:02       | 2650385 | 7609G | 1097046 | 5061G |       7 |  18,806K |    2440 |  12,117M |
 thevault | 105 | @ 2021-08-20 21:02       | 2650616 | 7614G | 1098067 | 5066G |       5 |  17,743K |    1028 |   5,021M |
 thevault | 106 | @ 2021-08-27 21:02       | 2650799 | 7621G | 1099536 | 5073G |       7 |  17,850K |    1474 |   7,352M |
 thevault | 107 | @ 2021-09-03 14:08       | 2672463 | 7804G | 1121735 | 5183G |    2919 |  16,455M |   22302 | 112,397M |
 thevault | 109 | @ 2021-09-10 21:02       | 2652984 | 7895G | 1126856 | 5208G |      15 |  59,485K |    8277 |  43,497M |
 thevault | 110 | @ 2021-09-17 21:02       | 2653873 | 7923G | 1131641 | 5232G |      16 |  29,868K |    4800 |  24,164M |
 thevault | 111 | @ 2021-09-24 21:02       | 2654568 | 7951G | 1135946 | 5253G |      33 |  49,949K |    4343 |  21,340M |
 thevault | 112 | @ 2021-10-01 21:02       | 2655123 | 7976G | 1139534 | 5270G |     560 | 870,182K |    4219 |  18,187M |
 thevault | 113 | @ 2021-10-08 21:02       | 2655531 | 7991G | 1141551 | 5280G |     483 | 739,608K |    3208 |  14,033M |
 thevault | 114 | @ 2021-10-22 21:02       | 2665450 | 8211G | 1166750 | 5402G |      35 |  52,722K |   25100 | 123,148M |
 thevault | 115 | @ 2021-10-29 21:02       | 2673457 | 8310G | 1173829 | 5437G |      29 |  50,166K |    7131 |  35,930M |
 thevault | 116 | @ 2021-11-05 21:02       | 2674574 | 8392G | 1180754 | 5471G |     608 | 907,362K |    9547 |  45,088M |
 thevault | 117 | @ 2021-11-12 21:02       | 2675090 | 8424G | 1186926 | 5501G |       9 |  20,457K |    6782 |  31,525M |
 thevault | 118 | @ 2021-11-19 21:02       | 2675658 | 8437G | 1189235 | 5512G |      10 |  33,245K |    2322 |  11,345M |
 thevault | 119 | @ 2021-11-26 21:02       | 2676080 | 8451G | 1191442 | 5522G |      11 |  39,101K |    2217 |  10,790M |
 thevault | 120 | @ 2021-12-03 21:02       | 2676912 | 8467G | 1194367 | 5536G |      12 |  15,747K |    2940 |  13,898M |
 thevault | 121 | @ 2021-12-10 21:02       | 2680029 | 8493G | 1197534 | 5551G |      11 |  17,423K |    3188 |  15,746M |
 thevault | 122 | @ 2021-12-17 21:02       | 2680455 | 8504G | 1199585 | 5561G |       9 |  22,628K |    2177 |  10,271M |
 thevault | 123 | @ 2021-12-24 21:02       | 2681893 | 8514G | 1201818 | 5572G |      10 |  22,905K |    2243 |  11,019M |
 thevault | 124 | @ 2022-01-03 09:03       | 2682641 | 8530G | 1204733 | 5586G |    2945 |  14,648M |    2945 |  14,648M |
 thevault | all |                          |         |       | 1214491 | 5624G | 1214491 |    5624G |         |          |

(d) storage size in duplicacy-web:
[screenshot not preserved; it showed 6.04T, as noted in question 3]

(e) storage size on backblaze website:
[screenshot not preserved; it showed 6.04T, as noted in question 3]

questions:

  1. For the latest revision (124), the backup log (a) shows new file chunks size as 14,610M bytes and all (file+metadata chunk) size as 14,784M bytes, however the check table (c) shows both the uniq and new chunk size as 14,648M. why the difference?
  2. what is the difference between the ‘uniq’ and ‘new’ column sets? how can new chunks for a revision be > uniq chunks to that revision?
  3. The check table (c) shows the ALL chunk size as 5624G but both duplicacy-web (d) and backblaze b2 website (e) show it as 6.04T (6,044.6GB). The only explanation is 6044000000000/1024/1024/1024 = 5628G, which means both duplicacy-web and backblaze are showing TB/GB and logs are showing TiB/GiB… which indeed seems to be the case if you check the json payload (b) and compare it to the backup sizes shown in (a)

14,784M is the original size of the new chunks, and 14,648M is their size after compression and encryption (including some chunk headers as well). The check table reports the stored size, so its 14,648M matches what the backup log says was uploaded:

2022-01-03 09:59:00.667 INFO BACKUP_STATS All chunks: 1865958 total, 8534G bytes; 2945 new, 14,784M bytes, 14,648M bytes uploaded

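‘uniq’ means chunks not shared by any other revision. ‘new’ means chunks not seen in previous revisions, though they may be shared with later revisions. That is why ‘new’ can be larger than ‘uniq’ for the same revision: a chunk first uploaded by this revision and reused by a later one still counts as new here, but is no longer unique to it.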
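
To make the uniq/new distinction concrete, here is a toy model in Python. It is only a sketch, not duplicacy's actual code: `chunk_stats` and the chunk ids are made up for illustration, and it assumes each revision is simply a set of chunk ids, listed oldest first.

```python
def chunk_stats(revisions):
    """Toy model of the 'uniq' and 'new' columns.

    revisions: list of (revision_number, set_of_chunk_ids), oldest first.
    """
    stats = {}
    seen_before = set()  # chunks referenced by any earlier revision
    for i, (rev, chunks) in enumerate(revisions):
        # chunks referenced by all revisions except this one
        others = set()
        for j, (_, other_chunks) in enumerate(revisions):
            if j != i:
                others |= other_chunks
        stats[rev] = {
            "uniq": chunks - others,      # shared with no other revision
            "new": chunks - seen_before,  # not seen in earlier revisions
        }
        seen_before |= chunks
    return stats


# hypothetical chunk ids, purely for illustration
revisions = [
    (1, {"a", "b", "c"}),
    (2, {"b", "c", "d", "e"}),
    (3, {"c", "d", "f"}),
]
stats = chunk_stats(revisions)
for rev, s in stats.items():
    print(rev, "uniq:", sorted(s["uniq"]), "new:", sorted(s["new"]))
```

In this model revision 2 has two new chunks (d, e) but only one uniq chunk (e), because d is reused by revision 3. The oldest revision counts every chunk as new and the latest revision has uniq equal to new, which is the same pattern as revisions 95 and 124 in table (c).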

Yes, this is a discrepancy. The logs compute sizes with powers of 1024, so I believe the use of KB, MB, and GB was wrong and should be replaced by KiB, MiB, and GiB, but changing that may break backward compatibility.
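
The unit mix-up can be checked against the byte counts in the JSON payload (b). A small Python sketch follows; the helper names are made up here, and the only assumption is that the log figures are the byte counts divided by powers of 1024.

```python
MIB = 1024 ** 2
GIB = 1024 ** 3

# byte counts copied from the report_url payload (b)
payload = {
    "new_chunk_size": 15502147584,
    "uploaded_chunk_size": 15359541248,
    "total_chunk_size": 9163312726016,
}


def in_mib(num_bytes):
    """Convert bytes to MiB (binary megabytes)."""
    return num_bytes / MIB


def in_gib(num_bytes):
    """Convert bytes to GiB (binary gigabytes)."""
    return num_bytes / GIB


print(in_mib(payload["new_chunk_size"]))       # 14784.0, logged as 14,784M
print(in_mib(payload["uploaded_chunk_size"]))  # 14648.0, logged as 14,648M
print(in_gib(payload["total_chunk_size"]))     # 8534.0, logged as 8534G

# the other direction: 5624G from the check table, read as GiB,
# expressed in decimal terabytes
check_all_tb = 5624 * GIB / 1000 ** 4
print(round(check_all_tb, 2))                  # 6.04
```

The payload values divide evenly into the logged figures only with binary units, and 5624 GiB comes out at roughly 6.04 decimal TB, in line with what duplicacy-web and the Backblaze site display.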
