Why is my duplicacy backup so large?

I run daily backups of a Raspberry Pi to Google Drive via duplicacy, and I was recently notified that they were taking up more than 50% of the 15GB allocation. After the first backup, the total Drive storage used by duplicacy was around 3.05GB; after just over a month, it has grown to around 8.1GB.

After seeing this increase, I made my prune "keep" rules stricter (duplicacy prune -exclusive -keep 0:30 -keep 2:7 -threads 30) to reduce the number of revisions saved, and I am now down to 6.55GB total with the following stats:

sudo duplicacy check -tabular -threads 30
Storage set to gcd://Duplicacy Backups/BIQU B1
Listing all chunks
1 snapshots and 19 revisions
Total chunk size is 6,705M in 4628 chunks
All chunks referenced by snapshot biqub1 at revision 26 exist
All chunks referenced by snapshot biqub1 at revision 28 exist
All chunks referenced by snapshot biqub1 at revision 30 exist
All chunks referenced by snapshot biqub1 at revision 32 exist
All chunks referenced by snapshot biqub1 at revision 34 exist
All chunks referenced by snapshot biqub1 at revision 36 exist
All chunks referenced by snapshot biqub1 at revision 38 exist
All chunks referenced by snapshot biqub1 at revision 40 exist
All chunks referenced by snapshot biqub1 at revision 42 exist
All chunks referenced by snapshot biqub1 at revision 44 exist
All chunks referenced by snapshot biqub1 at revision 46 exist
All chunks referenced by snapshot biqub1 at revision 48 exist
All chunks referenced by snapshot biqub1 at revision 49 exist
All chunks referenced by snapshot biqub1 at revision 50 exist
All chunks referenced by snapshot biqub1 at revision 51 exist
All chunks referenced by snapshot biqub1 at revision 52 exist
All chunks referenced by snapshot biqub1 at revision 53 exist
All chunks referenced by snapshot biqub1 at revision 54 exist
All chunks referenced by snapshot biqub1 at revision 55 exist

   snap | rev |                          |  files |   bytes | chunks |  bytes | uniq |   bytes |  new |    bytes |
 biqub1 |  26 | @ 2023-11-19 02:25       | 104816 | 10,115M |   1706 | 3,674M |   78 | 60,016K | 1706 |   3,674M |
 biqub1 |  28 | @ 2023-11-21 07:35       | 104828 | 10,429M |   1770 | 3,753M |   54 | 48,841K |  142 | 140,900K |
 biqub1 |  30 | @ 2023-11-24 07:33       | 104914 | 10,745M |   1834 | 3,924M |   73 | 56,551K |  371 | 464,092K |
 biqub1 |  32 | @ 2023-11-26 07:35       | 104903 | 10,829M |   1879 | 3,931M |  105 | 66,826K |  344 | 315,750K |
 biqub1 |  34 | @ 2023-11-28 15:16       | 104923 | 10,345M |   1789 | 3,755M |   47 | 41,162K |  120 | 106,490K |
 biqub1 |  36 | @ 2023-11-30 07:32       | 105159 | 10,355M |   1801 | 3,772M |   65 | 59,850K |  222 | 248,921K |
 biqub1 |  38 | @ 2023-12-02 07:32       | 105241 | 10,483M |   1801 | 3,784M |   73 | 77,686K |  196 | 219,792K |
 biqub1 |  40 | @ 2023-12-04 07:34       | 111127 | 10,463M |   1820 | 3,832M |   47 | 36,120K |  187 | 255,334K |
 biqub1 |  42 | @ 2023-12-06 08:34       | 111138 | 10,595M |   1834 | 3,877M |   55 | 40,086K |  143 | 165,766K |
 biqub1 |  44 | @ 2023-12-08 07:35       | 111163 | 10,577M |   1833 | 3,900M |   50 | 30,848K |  176 | 179,063K |
 biqub1 |  46 | @ 2023-12-10 07:34       | 111229 | 10,675M |   1856 | 3,919M |   72 | 40,771K |  206 | 211,973K |
 biqub1 |  48 | @ 2023-12-12 07:35       | 111235 | 10,620M |   1818 | 3,924M |   48 | 38,450K |  164 | 178,322K |
 biqub1 |  49 | @ 2023-12-13 07:36       | 111236 | 10,599M |   1837 | 3,922M |   58 | 35,453K |  123 | 100,312K |
 biqub1 |  50 | @ 2023-12-14 07:33       | 111256 | 10,636M |   1840 | 3,937M |   59 | 48,160K |  127 | 145,122K |
 biqub1 |  51 | @ 2023-12-15 07:33       | 111289 | 10,545M |   1816 | 3,922M |   35 | 24,374K |  107 | 100,889K |
 biqub1 |  52 | @ 2023-12-16 07:32       | 111292 | 10,547M |   1816 | 3,924M |   34 | 25,898K |   90 |  89,962K |
 biqub1 |  53 | @ 2023-12-18 07:33       | 111307 | 10,600M |   1833 | 3,940M |   13 |  7,665K |  133 | 134,129K |
 biqub1 |  54 | @ 2023-12-18 15:44       | 111307 | 10,632M |   1852 | 3,945M |   13 |  3,506K |   55 |  40,595K |
 biqub1 |  55 | @ 2023-12-18 16:01       | 111307 | 10,634M |   1854 | 3,948M |   16 |  6,256K |   16 |   6,256K |
 biqub1 | all |                          |        |         |   4628 | 6,705M | 4628 |  6,705M |      |          |

My questions are:

  1. Why is the size of the "all" row 6,705M when the latest revision totals only 3,948M? That's quite a big gap, so I'm curious where the discrepancy comes from and how to fix it. I noticed that the "new" bytes column sums to 6,705M, so if 6,705M is the true size, where does the 3,948M figure come from?
  2. In the "new" bytes column, each daily backup is usually on the order of 100MB or more. This is unexpectedly high, so how can I analyse the backups to figure out which files on the host system are taking up the most space in the backup by changing most often? I could then potentially exclude those files via the filters list.
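To make sure I'm reading the table correctly, here is my mental model of the accounting as a toy sketch (made-up chunk names and sizes, not my actual data) — please correct me if this is wrong:

```shell
# Toy model: three retained revisions referencing overlapping sets of
# deduplicated chunks. Chunk sizes are invented; the point is that the
# "all" row counts each unique chunk once across every retained revision
# (which is why it equals the sum of the "new" bytes column), while a
# single revision only counts the chunks it references.
declare -A size=( [a]=100 [b]=200 [c]=50 [d]=75 [e]=300 )
rev1="a b c"        # oldest retained revision
rev2="a b d"        # c was replaced by d
rev3="a b e"        # latest revision: d was replaced by e

latest=0
for c in $rev3; do latest=$((latest + size[$c])); done

all=0
for c in $(echo "$rev1 $rev2 $rev3" | tr ' ' '\n' | sort -u); do
    all=$((all + size[$c]))
done

echo "latest revision: ${latest}"   # 100+200+300 = 600
echo "all (union):     ${all}"      # 100+200+50+75+300 = 725
```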

For reference, I am backing up the entire root folder of the raspbian OS with the following exclusions via this filters list:

-dev/*
-proc/*
-sys/*
-tmp/*
-run/*
-mnt/*
-media/*
-lost+found/*
+boot/
+boot/config.txt
-boot/*
+home/
+home/user/
+home/user/printer_data/
+home/user/printer_data/comms/
-home/user/printer_data/comms/*

Therefore, my gut feeling is that the large size of each incremental daily backup could be due to backing up all of the log files on the system, which change daily; but as I said, I would like a way to prove this is true and exclude the worst offenders if I can.
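One rough way I could test the log-file hypothesis on the host itself (outside duplicacy) is to list the largest files modified in the last 24 hours under the backup root, pruning the pseudo-filesystems the filter list already excludes. The pruned directory list and the 50-line cut-off below are illustrative choices, not anything duplicacy-specific:

```shell
# List the biggest recently-modified files under a root, skipping the
# directories the filters list already excludes. GNU find is assumed
# (for -printf), which raspbian provides.
largest_recent() {                      # usage: largest_recent /
    local root="${1%/}"
    find "${root:-/}" \
        \( -path "$root/proc" -o -path "$root/sys" -o -path "$root/dev" \
           -o -path "$root/tmp" -o -path "$root/run" -o -path "$root/mnt" \
           -o -path "$root/media" \) -prune \
        -o -type f -mtime -1 -printf '%s\t%p\n' 2>/dev/null \
      | sort -rn \
      | head -50 \
      | awk -F'\t' '{printf "%10.1f KB  %s\n", $1/1024, $2}'
}
# e.g.: sudo bash -c "$(declare -f largest_recent); largest_recent /"
```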

You can use list to see what was backed up in each revision, and use diff to compare two revisions and identify the transient files you don't need, then exclude them.
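For example (revision numbers taken from the table above; the block is guarded so it is a no-op where the duplicacy binary is not installed):

```shell
# Revisions 54 and 55 are from the check -tabular output above.
if command -v duplicacy >/dev/null 2>&1; then
    # List every file recorded in one revision, with sizes:
    duplicacy list -r 55 -files

    # Show what was added, removed, or changed between two revisions;
    # files that reappear in diff after diff are the churn candidates
    # worth excluding via the filters list:
    duplicacy diff -r 54 -r 55
fi
```

Running the diff between a few consecutive pairs of revisions should quickly confirm (or rule out) the log-file theory.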