Duplicacy check too slow and the connection was forcibly closed by the remote host

Hi guys, I want to implement a verify procedure for my jobs, and based on what I read the recommended command is “duplicacy check -all -files”. After running this, it was taking quite a long time (not sure if this is normal), and then I got an error message:

That was only 4 revisions, and it took around 3-4 hours before I saw the error message.

The plan is to do this once a week, but honestly this is the first time I have tried it.

Any recommendations on how to make this work properly? Thanks in advance for your time.

M.

You can try running the check with multiple threads:

duplicacy check -all -files -threads 4

But the connection being closed may be the result of hitting a rate limit. Using multiple threads may not help at all and could even make it worse.

I would suggest the -chunks option instead:

duplicacy check -all -chunks -threads 4

This ensures that each chunk will be downloaded only once.

Hi gchen, thank you very much for your fast reply. I am currently running your last suggestion and will see how it goes. But in the meantime, could you please confirm that the -all parameter checks all chunks within the target storage, i.e. from every job (on any server) that points to it?

Thanks,

M.

OK, disregard my previous question; I just reconfirmed what I thought. So “duplicacy check -all -chunks -threads 4” failed again after less than an hour with a similar error:

After confirming that -all checks every job within that storage, I am adjusting this to check only the default local job, and I can create independent verify jobs on each server (not a big deal; I actually thought it worked that way originally). This will definitely reduce the processing time of the job and hopefully avoid that error message.

I am currently running “duplicacy check -chunks”, which should finish in approximately 1 hour 15 minutes. I am not using -threads anymore because of your previous comment.

I will let you know once done.

Thanks,

M.

OK, I can confirm that duplicacy check -chunks worked this time:

So at this point I only have one question: is -chunks enough to validate the integrity of a backup job?

Thanks,

M.

-chunks should be enough. If you really want to make sure that every file is restorable, you can copy the Azure storage to a local one and then run check -files against the local storage.
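A rough sketch of that approach from the command line, assuming the Azure storage is already configured as “default” in the repository; the storage name local-verify, the snapshot ID my-snapshot-id, and the path /backup/local are placeholders:

# add a local storage that is copy-compatible with the existing "default" one
duplicacy add -copy default local-verify my-snapshot-id /backup/local

# copy every revision from the Azure storage to the local one
duplicacy copy -from default -to local-verify

# run the full file verification against the local copy
duplicacy check -storage local-verify -all -files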

Thanks gchen. However, bad news: the job failed again with the original error message “the connection was forcibly closed by…”. I have tested it on 2 different computers, with 2 different jobs pointing to the same target Azure Storage. I can confirm it failed at different chunks and at different times. For example, one of the jobs was almost done (90%) and failed after more than 10-15 hours. I am currently using duplicacy check -threads 4 -chunks.

Based on the error message, I started digging a bit more at the Azure end and tested switching to “Internet Routing”:

Unfortunately, the result was almost the same; I am not sure if there is another possible adjustment there.

Do you have any other ideas on how to adjust/handle this?

Thanks again for your time,

M.

Once again, the check process was almost done (at least 5 hours of running properly) and boom…

Verified chunk 05dfc3b9b7337551c0e6da13b5986b642dba0ffe4e17a4a83d2bdf8decdd1ff9 (20532/21810), 8.87MB/s 00:12:20 94.1%
Failed to download the chunk c46be14edee3ba69241ea60a18156e0dc91843a1f190d7b40cd9740b2c81bfb2: read tcp 192.168.244.20:50212->20.47.31.0:443: wsarecv: An existing connection was forcibly closed by the remote host.

What got my attention this time was that it returned a different exit code. It usually returns 100, but this time it was 9009. Does this give you any idea of what could be happening?

Note: not sure if this could be the reason for the new exit code, but I tried -threads 8.

Thanks,

M.

You can just run the check job again. Chunks already verified will not be downloaded again in the next run.
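If the check runs from a script, a minimal re-run sketch could look like this (the retry count of 3 is only an example, not something from gchen's reply):

#!/bin/sh
# Retry the check a few times; chunks that already passed verification
# are skipped, so each retry only downloads the remaining chunks.
for attempt in 1 2 3; do
    duplicacy check -all -chunks -threads 4 && exit 0
done
exit 1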

If you have ssh access to your server, then maybe you can run the entire checking process on the remote machine directly?

You can create a dummy repository using the local filesystem directly, which would take the unstable network out of the picture and make the checking process more robust.
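In shell terms this is roughly the following (a sketch, assuming the storage is reachable on that machine as a local path such as /mnt/duplicacy, the storage is unencrypted, and my-snapshot-id is a placeholder for an existing snapshot ID):

# create an empty "dummy" repository that points at the storage path
mkdir dummy && cd dummy
duplicacy init my-snapshot-id /mnt/duplicacy

# check the chunks over the local filesystem instead of the network
duplicacy check -all -chunks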

Hi jiahao_xu, thank you for your reply. I am not sure I understood your suggestion, because the target storage is a container in Azure, not a server.

M.

Hi gchen, I guess I can adjust this in my scripts; however, I am not sure it would be practical, because I would be retrying the job every time I get an exit code of 100 (if I got your idea right). And as far as I know, an exit code of 100 can have different causes.

Please clarify: when the check process runs with -chunks, where does it download the chunks to, and for how long are they kept?

Thanks,

M.

You can add the -log option:

duplicacy -log check -all -chunks -threads 4

And when it fails, parse the output to search for the line that contains both ERROR and DOWNLOAD_CHUNK.
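For example (a sketch; the log file name check.log is just a placeholder):

# run the check with -log so every message carries a log ID,
# then pull the chunk-download failures out of the output
duplicacy -log check -all -chunks -threads 4 | tee check.log
grep ERROR check.log | grep DOWNLOAD_CHUNK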


Thank you for that one, I think I will give it a try. But please don’t forget to answer my question: “when the check process runs with -chunks, where does it download the chunks to, and for how long are they kept?”

M.

Well, if you are using an Azure container service (e.g. Kubernetes), then I think you can modify the Dockerfile to include duplicacy in the container, set up a dummy repository from the local storage, and run duplicacy check periodically in it directly.

The chunks aren’t saved locally; instead they are downloaded into memory, checked on the fly, and then discarded once they pass the integrity check.

You can do something like this:

Dockerfile:

FROM alpine

ARG url="https://github.com/gilbertchen/duplicacy/releases/download/v2.7.2/duplicacy_linux_x64_2.7.2"

WORKDIR /root/dummy/

# /mnt/duplicacy/ is where the volume will be mounted
RUN mkdir -p /mnt/duplicacy && \
    wget "$url" -O /usr/local/bin/duplicacy && chmod +x /usr/local/bin/duplicacy
ADD duplicacy_preferences .duplicacy/preferences

ENTRYPOINT ["/usr/local/bin/duplicacy"]
CMD ["check", "-a", "-fossils", "-resurrect", "-files", "-chunks", "-stats", "-tabular", "-threads", "4"]

And the duplicacy_preferences file is:

[
    {
        "name": "default",
        "id": "<Your storage ID here>",
        "repository": "",
        "storage": "/mnt/duplicacy/",
        "encrypted": false,
        "no_backup": false,
        "no_restore": false,
        "no_save_password": false,
        "nobackup_file": "",
        "keys": null,
        "filters": "",
        "exclude_by_attribute": false
    }
]

You then build the image, push it to Azure, and use a Kubernetes YAML config to mount your duplicacy storage volume at /mnt/duplicacy.

Now when this container is started, it would run duplicacy check -a -fossils -resurrect -files -chunks -stats -tabular -threads 4 by default on your storage.

By checking the volume mounted in the container directly, instead of connecting to a remote server, it will be much quicker and more robust, and you won’t need to keep your computer running.
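One possible shape for that YAML (everything here is an assumption: the image name, a PersistentVolumeClaim called duplicacy-storage that exposes the storage, and the weekly schedule) is a Kubernetes CronJob that mounts the storage and runs the check:

apiVersion: batch/v1
kind: CronJob
metadata:
  name: duplicacy-check
spec:
  schedule: "0 3 * * 0"            # weekly, Sunday at 03:00
  jobTemplate:
    spec:
      template:
        spec:
          restartPolicy: Never
          containers:
            - name: duplicacy-check
              image: myregistry.azurecr.io/duplicacy-check:latest
              volumeMounts:
                - name: storage
                  mountPath: /mnt/duplicacy   # matches the Dockerfile above
          volumes:
            - name: storage
              persistentVolumeClaim:
                claimName: duplicacy-storage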

Edit:

I have tested this Dockerfile locally with a fake duplicacy storage I created and it worked as expected.


The chunks aren’t saved locally; instead they are downloaded into memory, checked on the fly, and then discarded once they pass the integrity check.

Got it, thank you for the clarification.

jiahao_xu, thank you very much for taking the time to provide a possible solution. I will review it closely and give it a try if needed.

Currently I am also considering another alternative, which builds on your theory (which makes a lot of sense) of running the job locally/close to the storage. The storage is located in East US, and I believe that if I run the check process from a device in the same region, the error should be avoided too. I think this should be simpler.

I will get back to you guys with the results.

Thanks,

M.


Hi guys, this is an update to this topic and, actually, the final solution. As I mentioned before, I was able to run the check process directly from a VM in the same region as the Azure storage (in this case East US). The speed difference was huge, and the job finished successfully in 3 hours. To give you an idea, the average download rate was 100 MB/s, while locally at the source server the best (not average) rate was 7 MB/s, which at one point seemed to be causing the connection drops after several hours of running. So, this is solved. Thank you both for your time and assistance in finding a solution.

M.
