Snapshot format documentation no longer correct?

The snapshot format, as documented in the GitHub wiki, doesn’t seem to match the current behavior. As an example, I made a simple backup of a folder containing a single file called “test”, with contents “asdf”.

Is there more up-to-date documentation somewhere? Or can someone at least tell me where to start looking in the source code?

My efforts to reverse-engineer it are below:

First, the documentation doesn’t mention compression, but I figured out the LZ4 prefix pretty easily. The decompressed JSON is then:

  "chunks": [
  "end_time": 1717355775,
  "file_size": 4,
  "files": [
  "id": "test-src",
  "lengths": [
  "number_of_files": 1,
  "options": "-hash",
  "revision": 1,
  "start_time": 1717355775,
  "tag": "",
  "version": 1

A few of these fields aren’t mentioned in the documentation.

The chunk file appears to be a list of numbers, in this case [4]. That matches the size of the file in bytes, but maybe that’s a coincidence.

The files 0d55... and bdec... don’t exist in the repo.

There are two other files: 877f649..., which appears to hold the file contents, and 47428..., which doesn’t appear to be JSON, although I can see the name of the file in there, along with a reference to b913..., which I don’t see anywhere either.

My .duplicacy/preferences file is:

        "name": "default",
        "id": "test-src",
        "repository": "",
        "storage": "REDACTED\\test-src\\..\\dest\\",
        "encrypted": false,
        "no_backup": false,
        "no_restore": false,
        "no_save_password": false,
        "nobackup_file": "",
        "keys": null,
        "filters": "",
        "exclude_by_attribute": false

And the repo config is:

    "compression-level": 100,
    "average-chunk-size": 4194304,
    "max-chunk-size": 16777216,
    "min-chunk-size": 1048576,
    "fixed-nesting": true,
    "DataShards": 0,
    "ParityShards": 0,
    "chunk-seed": "6475706c6963616379",
    "hash-key": "6475706c6963616379",
    "id-key": "6475706c6963616379",
    "chunk-key": "",
    "file-key": "",
    "rsa-public-key": ""

You can run duplicacy cat -r <revision> to see the actual content of a snapshot. When you run this command, a new field named chunk_sequence is created by reading the contents of the chunks listed in the chunks field. The chunks in chunk_sequence store the file content.

Similarly, the length_sequence field will be created from chunks listed in lengths. It is used to track the length of each chunk in chunk_sequence.
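
As a rough illustration of the two steps above, here is a minimal Python sketch. It is not Duplicacy's actual code. It assumes the chunks have already been fetched and decompressed, and that the concatenated content decodes as a JSON array, which is consistent with the [4] seen in the question but not confirmed for every version. The expand_sequence helper, the in-memory store, and the chunk names are all hypothetical.

```python
import json


def expand_sequence(snapshot, field, fetch_chunk):
    """Concatenate the chunks listed under `field` and decode the result.

    Assumption: the concatenated bytes form a JSON array. This matches
    the [4] observed in the question, but is not a documented guarantee.
    """
    raw = b"".join(fetch_chunk(chunk_id) for chunk_id in snapshot[field])
    return json.loads(raw)


# Hypothetical in-memory store standing in for decompressed chunk files.
# The chunk list is deliberately split across two chunks to show that
# the pieces are concatenated before decoding.
store = {
    "lengths-chunk": b"[4]",
    "chunks-chunk-a": b'["data-ch',
    "chunks-chunk-b": b'unk-1"]',
}

snapshot = {
    "chunks": ["chunks-chunk-a", "chunks-chunk-b"],
    "lengths": ["lengths-chunk"],
}

# Build the derived fields the same way the answer describes.
snapshot["chunk_sequence"] = expand_sequence(snapshot, "chunks", store.__getitem__)
snapshot["length_sequence"] = expand_sequence(snapshot, "lengths", store.__getitem__)

# Pair each content chunk with its length.
layout = list(zip(snapshot["chunk_sequence"], snapshot["length_sequence"]))
print(layout)
```

In this toy example the single content chunk is paired with a length of 4, mirroring the 4-byte “asdf” file from the question.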

The files field works slightly differently but still follows the same principle. First, a file_sequence field is created which contains the hashes of all metadata chunks. These chunks are then downloaded and concatenated together, and decoded into a list of Entries, each of which represents the metadata for a file.
