Network failure causing re-indexing which takes hours

The closest post I found was this: Upload speed decreasing until Failed - #8 by saspus
However, the issue in that post seems to have been caused by limited RAM, and I have 128 GB of RAM on this server.

I just started using Duplicacy. I’m testing using it to back up many terabytes of data to AWS S3. My first test is backing up my photos folder, which is about 6 terabytes. When starting the backup, Duplicacy takes a few hours to complete the indexing step. From what I understand, indexing only takes this long the first time; after the first complete backup, it should take less time?

The issue comes shortly after Duplicacy starts sending data over the wire to S3. It seemed to freeze, so I discovered that clicking on the progress bar in the web UI opens a new window showing the console output. The last message before the upload froze was: ERROR UPLOAD CHUNK Failed to upload..RequestError: send request failed caused by: Put <bucket url> write tcp <my ip> -> <aws ip>: use of closed network connection.

I didn’t cancel the job, but eventually it quit and the UI said it failed. So I restarted the job, and it seems to have begun indexing from scratch. Is the indexing progress not saved anywhere? I know one of Duplicacy’s features is its “databaseless” design, so maybe not keeping a cache of the index is intentional?

I have a 1 Gbps down / 500 Mbps up internet connection that is usually pretty stable. The server is a Supermicro 4U with 4x Xeon processors and 128 GB of RAM. The OS is Unraid, and Duplicacy is running in a container. The data being backed up is stored on a ZFS pool of 4x 2 TB NVMe drives. I guess my questions are:

  1. Was this just an internet blip that caused a dropped connection?
  2. Is this a common error?
  3. I’m assuming there is no way to save the indexing progress in case of a restart?
  4. Are there any tips for completing this first backup?

Thanks! Hopefully I can get this resolved and continue using Duplicacy!

Yes, during the first backup all files are hashed, and that’s CPU intensive. Subsequent backups only process new or changed files, detected by size and modification time.

Your setup is the most ideal setup I have ever seen on this forum: plenty of RAM, a ZFS SSD array, massive internet bandwidth, a ridiculous amount of computing power, and a target that is the gold standard of cloud storage. There should be literally no issues. I’m very curious what is going wrong.

Is that upload chunk error the first error, after which it bails, or does Duplicacy continue uploading? Generally, Duplicacy retries multiple times before giving up. Are you saying Duplicacy just hung, did not exit, and stopped sending traffic?

Failed requests are quite common with any cloud service; that’s why there is a retry mechanism. Usually subsequent retries succeed.

You can add the -d flag to the job’s global options to see significantly more logging.
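For reference, this is a minimal sketch of the equivalent CLI invocation, assuming the repository is already initialized (the path is a placeholder):

    # run the backup with debug-level logging; -d is a global option and goes before the command
    cd /path/to/photos
    duplicacy -d backup -stats

In the web UI, putting -d in the job’s global options field has the same effect.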

It saves the incomplete snapshot, but it still needs to go and shred the data into chunks. Chunks that were already uploaded won’t be uploaded again, though.

With your setup there should be no issues.

General advice for memory-constrained devices is to split the backup into a few smaller snapshots, but 128 GB is plenty.

Tangential to the whole discussion: since you are backing up photos and media, which are incompressible and non-deduplicatable, you may benefit from increasing the default 4 MB average chunk size to something larger. This will reduce the number of chunks and result in fewer API calls, saving you money and increasing performance. You can also reduce the compression to the lowest level (Duplicacy uses zstd, and even at the lowest setting it’s still great on regular data, while any level is useless for incompressible lossy media).
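Note that the chunk size is a property of the storage and is fixed when the storage is created, so this only applies if you set up the storage anew. A minimal CLI sketch, assuming a fresh storage, a hypothetical 16M average chunk size, and placeholder snapshot ID and storage URL:

    # initialize a new storage with a 16M average chunk size instead of the default 4M;
    # -min and -max bound the variable chunk sizes around that average
    duplicacy init -c 16M -min 4M -max 64M photos <storage url>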

What Amazon S3 tier are you using? How many threads?

TL;DR: add the -d flag to the backup job and post the logs from around the failure point.

Just noticed this:

Perhaps this is the issue? Maybe Amazon is doing some sort of rate limiting, or maybe your system or your gateway is running out of resources (e.g. open connection handles, which are limited by the maximum number of file descriptors, 1024 by default). Try reducing the thread count to e.g. 32; you should still be able to saturate the upstream. (Increasing the average chunk size may also help with throughput.)
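As a rough sketch of what that looks like on the CLI (in the web UI, the same options go in the job’s options fields; 32 threads is just a starting point, not a verified optimum):

    # check the per-process file descriptor limit inside the container
    ulimit -n

    # back up with a reduced number of upload threads and debug logging enabled
    duplicacy -d backup -stats -threads 32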


Thanks for the detailed response. After reading through your reply and the docs, I had the same thought about the number of threads being an issue. I have decreased it to 32 and also enabled debug logging. I will report back what happens. Thanks again!


I think this was the issue. I let it re-index, and after a few hours of indexing it seems to be uploading to AWS. I’ve been at ~40 MB/s up for the last 20 minutes. Thanks for the help! I’ll let you know if something goes wrong. I’ll be purchasing Duplicacy!
