Threaded backups fail with i/o timeout

New to Duplicacy and am running into the issue below. It seems to happen when running 4 or 8 threads…less so with 2. Backups are too slow without threading, so I haven’t been able to complete a full backup yet.

2020-01-05 19:11:10.613 INFO SNAPSHOT_FILTER Loaded 6 include/exclude pattern(s)
2020-01-05 19:11:56.310 INFO BACKUP_THREADS Use 8 uploading threads
2020-01-05 19:46:27.355 ERROR UPLOAD_CHUNK Failed to upload the chunk 
RequestTimeout: Your socket connection to the server was not read from or written to within the timeout period, source:  read tcp 38.73.225.12:443->X.X.X.X:60462: i/o timeout status code: 400, request id: XXXXXXXXXXXX, host id:

Watching the backup process through lsof, it seems to happen when processing larger .mov files for some reason. That’s just observational; I do have a bunch of .mov files, so it might just be coincidental.
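For what it’s worth, I was watching it with something along these lines (the pgrep pattern is just whatever matches your running duplicacy process):

```
# list the files the running duplicacy process currently has open
lsof -p "$(pgrep -n -f duplicacy)"
```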

Thanks!

Info:
Duplicacy Web Edition 1.1.0 and duplicacy_linux_x64_2.3.0
OS: Ubuntu 18.04.3 LTS
Running in Docker
Backing up to Wasabi us-east-2
I have a fiber connection to the internet so I don’t think bandwidth is the issue.

Did you select the right endpoint during storage configuration? That is, the storage url should be wasabi://us-east-2@s3.us-east-2.wasabisys.com/bucket/path. I recall that if you use s3.wasabisys.com as the endpoint, Wasabi will allow you to connect, but the connection will be very unstable.
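If you set the storage up via the CLI, the init command would look roughly like this (the snapshot ID, bucket name, and path below are just placeholders):

```
duplicacy init my-snapshot-id wasabi://us-east-2@s3.us-east-2.wasabisys.com/my-bucket/path
```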

Can you also run a test backup to a us-east-1 bucket?

Thanks for the reply.

the storage url should be `wasabi://us-east-2@s3.us-east-2.wasabisys.com/bucket/path`

Yup: wasabi://us-east-2@s3.us-east-2.wasabisys.com/…/etc…

The backup runs for a while, but then that error.

I remember if you use s3.wasabisys.com as the endpoint

The endpoint has to match the region. The docs say one thing, but it has to be us-east-2@s3.us-east-2.wasabisys.com; nothing else worked for me, since the bucket was created in that region.

test backup to a us-east-1 bucket?

The bucket selection for Wasabi does not offer us-east-1 as an option, only us-east-2, us-west, and Europe options, IIRC. I’m on the east coast, so it’s close.

Side note…I’m investigating the possibility that this might be caused by an upstream router that is doing fq_codel traffic shaping. It would be the first time I have seen anything actually affected by it, but I’m working to rule it out. I know the Duplicacy backup app has rate limiting too; we’ll see if I need to use that instead.
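If it comes to that, I’d try something along these lines, assuming I’m remembering the CLI flags correctly (the numbers are just a guess on my part):

```
# drop to 2 upload threads and cap the upload rate at ~5000 kB/s
duplicacy backup -threads 2 -limit-rate 5000
```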

Is that error coming from the S3 libraries being used? I wonder if somehow there is a way to have it recover a bit more gracefully from a socket timeout (instead of failing the whole backup)…or maybe this is some kind of router shenanigans interfering with things in a way the library/protocol doesn’t like.

Follow-up: I completely turned off fq_codel traffic shaping on the router so that Duplicacy has a full pipe between the host and Wasabi. Tried to run it with 8 threads and it still produced the same error. Trying to complete a backup now with 2 threads (2d 14h) vs 8 threads (9h).

I’m open to suggestions or experiments to help get to the bottom of this.

I know it’s not the same provider, but here’s what AWS’s documentation has to say. Unless the Content-Length header is somehow being set incorrectly, it sounds like a network issue that you’d need a packet capture to troubleshoot.

When the connection between the client and the Amazon S3 server remains idle for 20 seconds or longer, Amazon S3 closes the connection. This results in the 400 RequestTimeout error. To troubleshoot this error, check the following:

The number of bytes set in the “Content-Length” header is more than the actual file size

When you send an HTTP request to Amazon S3, Amazon S3 expects to receive the amount of data specified in the Content-Length header. If the expected amount of data isn’t received by Amazon S3, and the connection is idle for 20 seconds or longer, then the connection is closed. Be sure to verify that the actual file size that you’re sending to Amazon S3 aligns with the file size that is specified in the Content-Length header.

Network issues like high latency, packet loss, and congestion

If a few packets or all packets are dropped because of a slow or poor network connection, then Amazon S3 waits for the expected number of bytes to be received. If the connection is idle for 20 seconds or longer, then Amazon S3 closes the connection and returns the 400 RequestTimeout error. To verify whether this is causing the error, perform a packet capture and check for any packet drops.
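If you do go the packet-capture route, something like this on the backup host should be enough to see whether packets are being dropped (the interface name is just an example; adjust for your setup):

```
# capture all traffic to/from the Wasabi endpoint for later inspection in Wireshark
sudo tcpdump -i eth0 -w wasabi.pcap host s3.us-east-2.wasabisys.com
```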


Wow, great information. Definitely looks like what I was running into.

As it turns out, Wasabi was having some system/networking issues and us-east-2 went down for about half a day yesterday. I’m chalking this up to issues on their end because I have not had any backup issues in the last 12 hours.

I am running my first backup of about 2TB, but it has failed and been aborted about a dozen or so times over the last few days.

What is the best way to ensure the data now in Wasabi is not corrupt? (i.e. run backup --hash once? maybe -files on check?)

Thanks again.

I would vote for a regular check and add -files on check if you really want to be sure. Just know that using the -files option will result in at least the size of your backup being downloaded (so at least 2TB in your case).
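Roughly, that would be something like the following from the CLI (or the equivalent extra options on a check schedule in the web UI):

```
# verify that every chunk referenced by the snapshots exists in the storage
duplicacy check

# additionally download and verify the contents of every file (slow; at least 2TB here)
duplicacy check -files
```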
