Is it normal to have a very large "fossils" folder?

I noticed my “fossils” folder in the Duplicacy directory has a size of 2.2 TB. Simply judging by the name of the folder, I’m wondering if it’s normal that it’s so large? Is that all data that could be deleted?

I am running the prune command with the flags -keep 2:60 -keep 1:7 -a -threads 100 once a week, so I would have assumed that anything not required would always be deleted by that.

If prune is interrupted, there may be some orphans left in the datastore.

Run prune -exhaustive to clean up orphans.

Don’t delete anything manually.

Also, 100 threads?! What is the target?

Ok, thanks! The target for this is Google Drive. When I configured the prune back then, 100 threads was what worked the fastest without leading to timeouts.

When I used Google Drive via shared access in the past, prune sometimes failed to delete data unless I gave the account “manager” access: Prune Fails: GCD_RETRY: The user does not have sufficient permissions for this file - #21 by saspus. That is probably not what is happening in your case, but check whether there are any silent failures; 100 threads seems like too much :slight_smile:

I have the same issue here: my duplicacy fossils folder is very large, with the oldest file being from February 2024, even though i run prune weekly.

I have tried multiple times to prune (even -exclusive -exhaustive), but the fossils aren’t deleted.

I expected that pruning should delete the fossils eventually, or, if the fossils are still needed, then they will be moved back into the chunks folders (during check -resurrect), but that didn’t happen either.

Could someone help me please?

Duplicacy version:

VERSION:    3.2.5 (2DEF01)

Prune command used

-d -log prune -a -exhaustive -exclusive -threads 32 -keep 0:700 -keep 365:365 -keep 30:180 -keep 7:30 -keep 1:7

Oldest fossil:

Youngest fossil:

Size:

Combining -exclusive and -keep in the same command is NOT recommended my dude. See.

What backend are you using? Normal file storage?

Seems weird you have a fossils directory at all, because local file and ssh backends rename chunks to .fsl. Are they being accessed in some other way?

BTW it’s been my experience that you need to use -fossils along with -resurrect for the latter to be effective.

Anyway, try -exhaustive -exclusive on its own and see what it does.

Only thing I can think of is there’s some permission issue preventing them being deleted, but I’m still wondering about the fossils\ vs .fsl thing…

Edit: perhaps this was an old storage pulled from cloud and Duplicacy is seeing the fossil\ path and defaulting to that behaviour?

Thank you @Droolio!

Combining -exclusive and -keep in the same command is NOT recommended my dude. See.

I’m sorry, i still don’t understand why after reading your link and the search you posted. To me it looks like they can be used together without any issue, as long as no other backups occur at the same time. (too much bold? maybe too much bold)

My statement is based on this:

  • keep <n:m> - keep 1 snapshot every n days for snapshots older than m days
    • i use the same config in my day-to-day backup
  • exhaustive - remove all unreferenced chunks
    • yes please, i want fossils deleted
  • exclusive - assume exclusive access to the storage (disable two-step fossil collection)
    • yes: i don’t want fossil collection, but i hope fossil deletion still happens, as that’s part of the process right? RIGHT?
  • The last revision can only be deleted in -exclusive mode
    • that’s okay. i don’t care if i lose a revision at either end of the timeline in this instance (later edit), as all i care about now is cleaning up my fossils
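For anyone else reading, the n:m rule quoted in the list above can be sketched in a few lines of Python. This is only a simplified illustration of the documented wording, not Duplicacy's actual implementation. The function names are made up, and treating a snapshot that is exactly m days old as "older than m days" is an assumption.

```python
from datetime import datetime, timedelta


def parse_keep(spec):
    """Split an 'n:m' string such as '7:30' into integers (n, m)."""
    n, m = spec.split(":")
    return int(n), int(m)


def snapshots_to_delete(snapshot_times, keep_specs, now):
    """Return the snapshot times a simplified n:m policy would remove.

    Hypothetical helper: a real prune run also deals with tags, revisions
    and fossil collection, none of which is modelled here.
    """
    # Check the policy with the largest m first, so the oldest snapshots
    # fall under the rule written for the oldest age range.
    policies = sorted((parse_keep(s) for s in keep_specs),
                      key=lambda p: p[1], reverse=True)
    doomed = []
    last_kept = None
    for taken in sorted(snapshot_times):
        age_days = (now - taken).days
        # Assumption: "older than m days" is treated as age >= m.
        policy = next((p for p in policies if age_days >= p[1]), None)
        if policy is None:
            last_kept = taken  # younger than every threshold: always kept
            continue
        n = policy[0]
        if n == 0:
            doomed.append(taken)  # n = 0: keep nothing in this age range
        elif last_kept is not None and (taken - last_kept).days < n:
            doomed.append(taken)  # too close to the previously kept snapshot
        else:
            last_kept = taken
    return doomed


# Ten daily snapshots, the newest being one day old.
now = datetime(2024, 1, 11)
daily = [datetime(2024, 1, 1) + timedelta(days=i) for i in range(10)]
print(len(snapshots_to_delete(daily, ["0:5"], now)))
print(len(snapshots_to_delete(daily, ["2:3"], now)))
```

Note that this only decides which snapshots go. What happens to the chunks they referenced (fossil collection and later deletion) is a separate step that the sketch does not cover.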

What backend are you using? Normal file storage?

Seems weird you have a fossils directory at all, because local file and ssh backends rename chunks to .fsl. Are they being accessed in some other way?

Using Google Drive, both via API and via Google Drive app on desktop. All my automated backups and prune do it only via the API, and i tried to run the prune both with the API and the desktop app. Neither seems to touch the fossils folder.
So the folder is probably created as normal backups are done via the API.
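Since the storage is also visible through the desktop app, the fossils folder can be measured locally. Here is a minimal sketch that walks the folder and reports file count, total size and the oldest and newest modification times. The path in the example is hypothetical, and modification times shown by a sync client may not match what the API reports.

```python
import os
from datetime import datetime


def summarize_fossils(fossils_dir):
    """Walk a locally visible fossils folder; report count, bytes and mtime range."""
    count = 0
    total_bytes = 0
    oldest = newest = None
    for root, _dirs, files in os.walk(fossils_dir):
        for name in files:
            info = os.stat(os.path.join(root, name))
            count += 1
            total_bytes += info.st_size
            if oldest is None or info.st_mtime < oldest:
                oldest = info.st_mtime
            if newest is None or info.st_mtime > newest:
                newest = info.st_mtime
    return {
        "count": count,
        "total_bytes": total_bytes,
        "oldest": datetime.fromtimestamp(oldest) if count else None,
        "newest": datetime.fromtimestamp(newest) if count else None,
    }


if __name__ == "__main__":
    # Hypothetical path: replace with wherever the storage is synced locally.
    print(summarize_fossils(r"G:\My Drive\duplicacy\fossils"))
```

Running it before and after a prune gives a quick check of whether the prune actually touched the folder.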

… permissions …

I don’t think this applies to me as uploading works and pruning works. Everything seemed to work except for cleanup of the fossils folder.

perhaps this was an old storage pulled

I don’t think so? I started using Duplicacy with Google Drive long long long ago, and all my backups from all my devices are stored in GD since the beginning.

Anyway, try -exhaustive -exclusive on its own and see what it does.

Trying….

.\.duplicacy\z.exe -d -log prune -a -exhaustive -exclusive -threads 32
  • :x: When using app, it doesn’t do anything. :frowning:
  • :white_check_mark: When using the API, it deleted all the fossils. So that’s nice. :tada:
    Thank you for the suggestion @Droolio

Now comes the next question: why does -keep interfere with -exhaustive (or with -exclusive?)? I run prune multiple times per month (automatically) (-all -exhaustive -keep 0:700 -keep 365:365 -keep 30:180 -keep 7:30 -keep 1:7), so the fossil entries from < 1 month ago should have been deleted, but they were not.
There is nothing in the description of the flags which suggests this to me.

@gchen do you have an answer?

I suspect this is because Duplicacy treats the Google Drive app files as local files, which does a few things differently to cloud/API. That being: expecting .fsl files instead of looking in fossils\. Personally, I’d stick with the API.
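To see which of the two layouts a locally visible copy of the storage actually contains, something like this rough sketch could help. It assumes the storage root has a chunks folder and, for the Google Drive layout, a fossils folder next to it, as described in this thread; the function name is made up.

```python
import os


def count_fossil_layouts(storage_root):
    """Count fossils stored as *.fsl under chunks/ versus files under fossils/."""
    fsl_suffix = 0
    for _root, _dirs, files in os.walk(os.path.join(storage_root, "chunks")):
        fsl_suffix += sum(1 for name in files if name.endswith(".fsl"))
    fossils_dir = 0
    for _root, _dirs, files in os.walk(os.path.join(storage_root, "fossils")):
        fossils_dir += len(files)
    return {"fsl_suffix": fsl_suffix, "fossils_dir": fossils_dir}
```

If fossils_dir is large and fsl_suffix is zero, a prune that only looks for .fsl files would have nothing to delete, which would fit the behaviour seen with the desktop app.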

I can’t explain why they’re not deleting fossils, but regardless - you shouldn’t be using these options together in the same command. :slight_smile:

You might be alright if you run prune often enough, but -exclusive removes the safeguard against deleting the last revision. I know you say you don’t mind losing the last revision, but it kinda makes no sense when you consider Duplicacy uses the last revision to do its incremental backup! (And TBH, you should only be running -exclusive as a one-off for tidying up things anyway.) Run -keep separately.
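As a sketch of what "separately" could look like in a scheduled script: one routine command with the retention rules, and a distinct one-off cleanup command that is only run by hand while no backups are in progress. The flags are the ones already used in this thread. The executable name and the helper names are assumptions.

```python
import subprocess

DUPLICACY = "duplicacy"  # assumption: name of the CLI executable on PATH

# Retention rules copied from the prune command posted earlier in the thread.
KEEP_SPECS = ["0:700", "365:365", "30:180", "7:30", "1:7"]


def routine_prune_args(keep_specs=KEEP_SPECS):
    """Scheduled prune: retention rules only, never -exclusive."""
    args = [DUPLICACY, "-log", "prune", "-a"]
    for spec in keep_specs:
        args += ["-keep", spec]
    return args


def one_off_cleanup_args():
    """Manual cleanup: no -keep, run only while nothing else uses the storage."""
    return [DUPLICACY, "-log", "prune", "-a", "-exhaustive", "-exclusive"]


def run(args, execute=False):
    """Print the command; only execute it when explicitly asked to."""
    print(" ".join(args))
    if execute:
        subprocess.run(args, check=True)


run(routine_prune_args())
run(one_off_cleanup_args())
```

Keeping the two argument lists in separate functions makes it harder to end up with -keep and -exclusive in the same command by accident.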

I updated my wording to clarify. I used -exclusive only in this instance, because i wanted to delete the fossils. I do not use -exclusive otherwise. I know of the safeguard of 2-step fossil collection, and i am OK with that.

Updated wording from the post above: