Diff command details

The diff command compares the same file across two snapshot revisions when a file is given; otherwise it compares the two snapshot revisions themselves.


Quick overview

NAME:
   duplicacy diff - Compare two snapshots or two revisions of a file

USAGE:
   duplicacy diff [command options] [<file>]

OPTIONS:
   -id <snapshot id>            diff snapshots with the specified id
   -r <revision> [+]            the revision number of the snapshot
   -hash                        compute the hashes of on-disk files
   -storage <storage name>      retrieve files from the specified storage
   -key <private key>           the RSA private key to decrypt file chunks
   -key-passphrase <passphrase> the passphrase to decrypt the RSA private key

Usage

duplicacy diff [command options] [<file>]

Options


-id <snapshot id>

You can specify a snapshot id other than the repository's default snapshot id.


-r <revision> [+]

This option can be given twice to select the two revisions to compare. If only one revision is given by -r, the right-hand side of the comparison will be the on-disk file.
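For example, a sketch of the two forms (the snapshot id mywork and the file path are hypothetical placeholders):

```shell
# Compare a file between revisions 10 and 11 of snapshot "mywork":
duplicacy diff -id mywork -r 10 -r 11 docs/report.txt

# With a single -r, compare revision 10 against the on-disk file:
duplicacy diff -id mywork -r 10 docs/report.txt
```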


-hash

When comparing a snapshot revision against the on-disk file, the -hash option instructs the command to compute the hash of the on-disk file.
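For instance (the file path is a hypothetical placeholder):

```shell
# Compute the hash of the on-disk file when comparing it
# against the copy stored in revision 10:
duplicacy diff -r 10 -hash docs/report.txt
```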


-storage <storage name>

You can use the -storage option to select a storage other than the default one.


-key <private key> and -key-passphrase <passphrase>

If the storage is encrypted with an RSA public key, the corresponding private key is needed to diff a file.
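A sketch (the key path, passphrase, and file path are hypothetical placeholders):

```shell
# Supply the RSA private key matching the public key the storage
# was encrypted with, plus the passphrase protecting that key:
duplicacy diff -r 10 -r 11 -key private.pem -key-passphrase "my-passphrase" docs/report.txt
```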

Note

The file must be specified with a path relative to the repository.


Meaning of the symbols used in the output

Command:

duplicacy diff -r 10 -r 11
+     is for a file in -r 11 but not in -r 10
-     is for a file in -r 10 but not in -r 11

When a file has changed:

space is for the version of the file in -r 10
*     is for the version of the file in -r 11
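As an illustration of these markers, the following sketch filters a fabricated sample of diff-style output for added files (the sample lines are invented for demonstration and are not real duplicacy output):

```shell
# Fabricated sample of diff-style output:
printf '%s\n' \
  '+ new/file.txt' \
  '- removed/file.txt' \
  '* changed/file.txt' > diff_output.txt

# Keep only files present in the later revision but not the earlier one:
grep '^+' diff_output.txt
```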

The diff command will also compare against the filesystem if only one revision is given.

Is there a way to diff a specific directory and all files under it?

When no “file” parameter is provided, the entire snapshot is compared. So I thought that something like “public/*” would compare that folder and all its contents, but I have been unable to find any wildcard syntax that works.

Also, it seems that ending a folder name in a slash (public/) and providing only a single revision will trigger the program to compare the hash of the entire folder on disk to a non-existent hash in the snapshot (therefore giving a hash-mismatch message). Is that what it’s doing, or am I reading the output wrong?

The diff command doesn’t compare directories. As a workaround, you can use the list -files command to list all files in a revision, filter out all files under the given directory, and save the output to a file:

duplicacy list -files -r <revision> | grep "path/to/dir/" > list1

Repeat this for the other revision, and then use a text diff tool to compare the two lists.
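The workaround above might look like this end to end; the duplicacy commands are shown as comments for reference, and the listings below are fabricated stand-ins for real list output:

```shell
# Produce one listing per revision (shown for reference only):
#   duplicacy list -files -r 10 | grep "path/to/dir/" > list1
#   duplicacy list -files -r 11 | grep "path/to/dir/" > list2

# Fabricated listings for illustration:
printf '%s\n' 'path/to/dir/a.txt' 'path/to/dir/b.txt' > list1
printf '%s\n' 'path/to/dir/a.txt' 'path/to/dir/c.txt' > list2

# Show lines unique to either revision (diff exits non-zero when
# the files differ, hence the || true):
diff list1 list2 || true
```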

Thanks for the suggestion. I’m sure it will be useful in the future, but unfortunately I needed to compare the file contents/hashes between a backup snapshot and the disk.

To work around this I copied the .duplicacy directory used by the web-ui for repository backups to a temp directory, edited the filters file to only include the directory I wanted to compare, ran the diff command from within that temp directory, and grep’d the output to only the directory that I wanted to compare. It’s not a perfect solution by any means, but it worked faster than comparing the entire repository directory structure.

When I subsequently restored some of the subdirectories and files identified by the diff, I came to the conclusion that Duplicacy is very inefficient when it comes to targeting a single subdirectory under a repository. All my attempts to restore a subdirectory would still crawl the entire repository directory structure on disk. This isn’t a big deal for small repositories, but when the repository is an entire disk backup this is a lot of wasted time.

I’m considering taking a stab at implementing some optimizations for this use case myself. If successful, I’ll submit a pull request on GitHub. Please let me know if there are any efforts already underway or precious “lessons learned” from your own optimization attempts so that I don’t waste my time.