Duplicacy check too slow and the connection was forcibly closed by the remote host

Hi guys, I want to implement a verify procedure for my jobs, and based on what I read the recommended command is “duplicacy check -all -files”. After running this, it was taking quite a long time (not sure if this is normal), and then I got an error message:

That was only 4 revisions, and it took around 3-4 hours before I saw the error message.

The plan is to do this once a week, but honestly this is the first time I have tried it.

Any recommendations on how to make this work properly? Thanks in advance for your time.

M.

You can try running the check with multiple threads:

duplicacy check -all -files -threads 4

But the connection being closed may be the result of hitting a rate limit. Using multiple threads may not help at all and could even make it worse.

I would suggest the -chunks option instead:

duplicacy check -all -chunks -threads 4

This ensures that each chunk will be downloaded only once.

Hi gchen, thank you very much for your fast reply. I am currently running your last suggestion and will see how it goes. But in the meantime, could you please confirm that the -all parameter checks all chunks within the target storage, i.e. from every job (on any server) that points to it?

Thanks,

M.

OK, disregard my previous question; I just reconfirmed what I thought. So “duplicacy check -all -chunks -threads 4” failed again after less than an hour with a similar error:

After confirming that -all checks every job within that storage, I am adjusting this to check only the default local job, and I can create independent verify jobs on each server (not a big deal; I actually thought it worked that way originally). This will definitely reduce the processing time of the job and hopefully avoid that error message.

I am currently running “duplicacy check -chunks”, which should finish in approximately 1 hour 15 minutes. I am not using -threads anymore because of your previous comment.

I will let you know once done.

Thanks,

M.

OK, I can confirm that duplicacy check -chunks worked this time:

So at this point I only have one question: is -chunks enough to validate the integrity of a backup job?

Thanks,

M.

-chunks should be enough. If you really want to make sure that every file is restorable, you can copy the Azure storage to a local one and then run check -files against the local storage.
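A rough sketch of that approach from the command line, assuming the Azure storage is already configured as “default” in the repository; the storage name local-verify, the snapshot ID my-snapshot-id, and the path /backup/local are placeholders:

# add a local storage that is copy-compatible with the existing "default" one
duplicacy add -copy default local-verify my-snapshot-id /backup/local

# copy every revision from the Azure storage to the local one
duplicacy copy -from default -to local-verify

# run the full file verification against the local copy
duplicacy check -storage local-verify -all -files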

Thanks gchen. However, bad news: the job failed again with the original error message “the connection was forcibly closed by…”. I have tested it on 2 different computers, with 2 different jobs pointing to the same target Azure Storage. I can confirm it failed at different chunks and at different times. For example, one of the jobs was almost done (90%) and failed after more than 10-15 hours. I am currently using duplicacy check -threads 4 -chunks.

Based on the error message, I started digging a bit more at the Azure end and tested switching to “Internet Routing”:

Unfortunately, the result was almost the same; I am not sure if there is another possible adjustment there.

Do you have any other ideas on how to adjust/handle this?

Thanks again for your time,

M.

Once again, the check process was almost done (at least 5 hours of running properly) and boom…

Verified chunk 05dfc3b9b7337551c0e6da13b5986b642dba0ffe4e17a4a83d2bdf8decdd1ff9 (20532/21810), 8.87MB/s 00:12:20 94.1%
Failed to download the chunk c46be14edee3ba69241ea60a18156e0dc91843a1f190d7b40cd9740b2c81bfb2: read tcp 192.168.244.20:50212->20.47.31.0:443: wsarecv: An existing connection was forcibly closed by the remote host.

What got my attention this time was that it returned a different exit code. It usually returns 100, but this time it was 9009. Does this give you any idea of what could be happening?

Note: not sure if this could be the reason for the new exit code, but I tried -threads 8.

Thanks,

M.

You can just run the check job again. Chunks already verified will not be downloaded again in the next run.
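If the check runs from a script, a minimal re-run sketch could look like this (the retry count of 3 is only an example, not something from gchen's reply):

#!/bin/sh
# Retry the check a few times; chunks that already passed verification
# are skipped, so each retry only downloads the remaining chunks.
for attempt in 1 2 3; do
    duplicacy check -all -chunks -threads 4 && exit 0
done
exit 1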

If you have ssh access to your server, then maybe you can run the entire checking process on the remote machine directly?

You can create a dummy repository using the local filesystem directly, which would take the unstable network out of the picture and make the checking process more robust.
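In shell terms this is roughly the following (a sketch, assuming the storage is reachable on that machine as a local path such as /mnt/duplicacy, the storage is unencrypted, and my-snapshot-id is a placeholder for an existing snapshot ID):

# create an empty "dummy" repository that points at the storage path
mkdir dummy && cd dummy
duplicacy init my-snapshot-id /mnt/duplicacy

# check the chunks over the local filesystem instead of the network
duplicacy check -all -chunks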

Hi jiahao_xu, thank you for your reply. I am not sure I understood your suggestion, because the target storage is a container in Azure, not a server.

M.

Hi gchen, I guess I can adjust this in my scripts; however, I am not sure it would be practical, because I would be retrying the job every time I get an exit code of 100 (if I got your idea right). And as far as I know, an exit code of 100 can have different causes.

Please clarify: when the check process runs with -chunks, where does it download the chunks to, and for how long are they kept?

Thanks,

M.

You can add the -log option:

duplicacy -log check -all -chunks -threads 4

And when it fails, parse the output to search for the line that contains both ERROR and DOWNLOAD_CHUNK.
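For example (a sketch; the log file name check.log is just a placeholder):

# run the check with -log so every message carries a log ID,
# then pull the chunk-download failures out of the output
duplicacy -log check -all -chunks -threads 4 | tee check.log
grep ERROR check.log | grep DOWNLOAD_CHUNK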


Thank you for that one, I think I will give it a try. But please don’t forget to answer my question: “when the check process runs with -chunks, where does it download the chunks to, and for how long are they kept?”

M.

Well, if you are using an Azure container service (e.g. Kubernetes), then I think you can modify the Dockerfile to include duplicacy in the container, set up a dummy repository from the local storage, and run duplicacy check periodically in it directly.

The chunks aren’t saved locally; instead they are downloaded into memory, checked on the fly, and then discarded once they pass the integrity check.

You can do something like this:

Dockerfile:

FROM alpine

ARG url="https://github.com/gilbertchen/duplicacy/releases/download/v2.7.2/duplicacy_linux_x64_2.7.2"

WORKDIR /root/dummy/

# /mnt/duplicacy/ is where the volume will be mounted
RUN mkdir -p /mnt/duplicacy && \
    wget "$url" -O /usr/local/bin/duplicacy && chmod +x /usr/local/bin/duplicacy
ADD duplicacy_preferences .duplicacy/preferences

ENTRYPOINT ["/usr/local/bin/duplicacy"]
CMD ["check", "-a", "-fossils", "-resurrect", "-files", "-chunks", "-stats", "-tabular", "-threads", "4"]

And the duplicacy_preferences file is:

[
    {
        "name": "default",
        "id": "<Your storage ID here>",
        "repository": "",
        "storage": "/mnt/duplicacy/",
        "encrypted": false,
        "no_backup": false,
        "no_restore": false,
        "no_save_password": false,
        "nobackup_file": "",
        "keys": null,
        "filters": "",
        "exclude_by_attribute": false
    }
]

You then build the image, push it to Azure, and use a Kubernetes YAML config to mount your duplicacy storage volume at /mnt/duplicacy.

Now when this container is started, it would run duplicacy check -a -fossils -resurrect -files -chunks -stats -tabular -threads 4 by default on your storage.

By checking the volume mounted in the container directly, instead of connecting to a remote server, it will be much quicker and more robust, and you won’t need to keep your computer running.
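One possible shape for that YAML (everything here is an assumption: the image name, a PersistentVolumeClaim called duplicacy-storage that exposes the storage, and the weekly schedule) is a Kubernetes CronJob that mounts the storage and runs the check:

apiVersion: batch/v1
kind: CronJob
metadata:
  name: duplicacy-check
spec:
  schedule: "0 3 * * 0"            # weekly, Sunday at 03:00
  jobTemplate:
    spec:
      template:
        spec:
          restartPolicy: Never
          containers:
            - name: duplicacy-check
              image: myregistry.azurecr.io/duplicacy-check:latest
              volumeMounts:
                - name: storage
                  mountPath: /mnt/duplicacy   # matches the Dockerfile above
          volumes:
            - name: storage
              persistentVolumeClaim:
                claimName: duplicacy-storage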

Edit:

I have tested this Dockerfile locally with a fake duplicacy storage I created and it worked as expected.


The chunks aren’t saved locally; instead they are downloaded into memory, checked on the fly, and then discarded once they pass the integrity check.

Got it, thank you for the clarification.

jiahao_xu, thank you very much for taking the time to provide a possible solution. I will review it closely and give it a try if needed.

Currently I am also considering another alternative, which builds on your theory (which makes a lot of sense) of running the job locally/close to the storage. The storage is located in East US, and I believe that if I run the check process from a device in the same region, the error should be avoided too. I think this should be simpler.

I will get back to you guys with the results.

Thanks,

M.


Hi guys, this is an update to this topic and, actually, the final solution. As I mentioned before, I was able to run the check process directly from a VM in the same region as the Azure storage (in this case East US). The speed difference was huge, and the job finished successfully in 3 hours. To give you an idea, the average download rate was 100 MB/s, while locally at the source server the best (not average) rate was 7 MB/s, which at one point seemed to be causing the connection drops after several hours of running. So, this is solved. Thank you both for your time and assistance in finding a solution.

M.
