Memory usage problems

Hi! I have a serious memory usage problem with the web edition when it does checks and prunes. For a while I’ve had the problem where every morning when I go back to the computer (which is always on) I find some apps closed and a dialog on my Mac saying that the system ran out of memory. Finally I figured it out… it’s Duplicacy! I have a schedule configured to run a check+prune task on both my local backup and the copy to Wasabi at 5am. I triggered it manually and towards the end the memory usage climbed to almost 20 GB before I stopped it!! (I have 24 GB total.)

I am happy I found the cause, but how do I fix it? I came to check if there’s a new version, but the latest is still 0.2.10, which is what I am using.

Can you please advise? Thanks in advance!

So I split the check and prune tasks into separate schedules, and it appears that the problem is with the check task. My backup schedule runs every 15 minutes. Could it be that I have “too many” revisions to check?

@SkyLinx how many revisions are there and how large is the storage directory?

Hi, there were 4500 revisions and the storage used was 900 GB, more or less. To get the check to complete, as a temporary workaround I pruned revisions older than 3 days, so I have just a few hundred on the local storage. It’s now deleting the chunks with another prune. I’m going to do the same with the Wasabi copy, but if there is a fix or something to avoid such issues it would be great. Is there a limit to the number of revisions or amount of storage that Duplicacy can handle?
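
(In case it’s useful: I believe the CLI equivalent of that “delete everything older than 3 days” prune is roughly the following, where -keep 0:3 means “keep no snapshots older than 3 days” - the web edition seems to pass the same -keep options to the prune command.)

duplicacy prune -keep 0:3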

Also, I don’t understand how the storage in use was 900 GB when the actual repository is around 150 GB, given that the content has remained more or less the same over time. Now, after pruning, it’s back to a normal size.

Define “more or less the same”. The devil may be in the details here.

Or perhaps you did some testing but never did any pruning? That way, anything Duplicacy ever uploaded to that storage would still have been there.

I mean that, apart from changes to the code I am working on, pictures, etc., not much has changed since I started backing up with Duplicacy.

Since code files are really small, Duplicacy packs many of them into each chunk, so changing any one of them means new chunks get uploaded with every snapshot. Since you back up every 15 minutes (I imagine you also back up your .git/ folder), I’d imagine there is quite a lot of duplicated data in your storage.
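
If you don’t actually need .git/ in the backups (git already keeps that history), you could exclude it via the repository’s filters file. A rough sketch, assuming the e: (exclude-by-regex) pattern prefix - double-check the pattern syntax against the filters documentation for your version:

e:(^|/)\.git/

That should skip .git directories at any depth, which tend to churn a lot between 15-minute runs.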

Hi. That sounds weird, because the repos are not that big, and even if I had changed all the code continuously, I don’t think it would account for that massive difference.

Anyway, I did a prune on the Wasabi copy too, deleting all snapshots older than 3 days, but when I alternate check and prune I get this in the logs:

2019-04-16 13:21:36.902 INFO RETENTION_POLICY Keep no snapshots older than 3 days
2019-04-16 13:21:36.902 INFO RETENTION_POLICY Keep 1 snapshot every 1 day(s) if older than 7 day(s)
2019-04-16 13:21:40.615 INFO FOSSIL_COLLECT Fossil collection 1 found
2019-04-16 13:21:40.615 INFO FOSSIL_POSTPONE Fossils from collection 1 can't be deleted because deletion criteria aren't met
2019-04-16 13:21:40.618 INFO FOSSIL_COLLECT Fossil collection 2 found
2019-04-16 13:21:40.618 INFO FOSSIL_POSTPONE Fossils from collection 2 can't be deleted because deletion criteria aren't met
2019-04-16 13:21:40.618 INFO SNAPSHOT_NONE No snapshot to delete

And it doesn’t seem to delete the chunks that are no longer needed, as it did for the local backup. What should I do? The storage in use by the Wasabi copy is still 967 GB, while the local backup is 144 GB now. Thanks.

I am still wondering how to avoid these issues in the future, as I would like to keep some backups for a while. Any ideas?

4500 revisions sounds like the source of your memory consumption problem, and pruning those would help, but I’d suggest that keeping revisions of code would be more efficient with git (if you’re not already using it).

Duplicacy can then back up such repositories and you wouldn’t have to keep so many backup revisions - just use a sensible pruning policy, with a wider gap between retained revisions the further back you go, i.e. reduce hourly to daily after a week, daily to weekly after a month, weekly to monthly after 12 months, etc.
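
For example, something along these lines (just a sketch - each -keep n:m option means “keep one revision every n days for revisions older than m days”, and as far as I remember they need to be listed with the larger m values first; the web edition exposes the same options on a prune job):

duplicacy prune -keep 30:365 -keep 7:30 -keep 1:7

That keeps daily revisions after a week, weekly after a month and monthly after a year, roughly matching the policy above.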

To answer your query as to why your storage was 900 GB while the repo was only 150 GB, run a check -all -stats -tabular to see the delta info.
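
From the CLI that would look something like this (in the web edition you can add the same options to a check job):

duplicacy check -all -stats -tabular

The tabular output shows, per revision, how many new and unique chunks/bytes it added, which should reveal where the extra ~750 GB came from.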

As regards your ‘No snapshot to delete’ log - that’s probably because no new backup has been made/copied to Wasabi since those fossils were collected. This is by design: Duplicacy uses a two-step fossil collection algorithm.

You can either force a prune with the -exclusive option (make sure no backups or copies are running!) or wait until you’ve copied a newer revision for each of your repos from local to Wasabi. Then another prune should delete the collected fossils (minus any that turn out to be needed by your last copied revisions; hence the need for the two-step algorithm), and it will remove the associated snapshots.
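
The forced route would look something like this from the CLI (a sketch only - keep whatever -keep options your prune job normally uses, and make absolutely sure no backup or copy job is running, since -exclusive bypasses the two-step safety mechanism and deletes chunks directly):

duplicacy prune -keep 0:3 -exclusive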

Thanks. After a new copy to Wasabi I did a prune again, and it’s now deleting the chunks that are no longer needed. I guess this is going to take quite a while. I’m surprised, though, that 4500 revisions are too many for Duplicacy… anyway, I have now changed the schedule to do a backup every hour instead of every 15 minutes, so there should be only 168 revisions for the last week, and I’ve changed the prune to keep fewer revisions in general.

This simple change (Fix a memory issue that check -tabular uses too much memory with many… · gilbertchen/duplicacy@4b69c11 · GitHub) should significantly reduce the memory usage when there are hundreds or thousands of revisions.

Hi, how do I apply this change to the web edition I’m using? Thanks