Can anyone tell me why my backup keeps failing?

nathansmart · 16 April 2021 00:53

Can someone tell me what’s going on with my backup? Here is the log:

Running backup command from /Users/__________/.duplicacy-web/repositories/localhost/7 to back up /Volumes/SmartDrive 8TB 3
Options: [-log backup -storage smartdrives -stats]
2021-04-15 12:26:11.021 INFO REPOSITORY_SET Repository set to /Volumes/SmartDrive 8TB 3
2021-04-15 12:26:11.021 INFO STORAGE_SET Storage set to gcd://Duplicacy Backups
2021-04-15 12:26:13.660 INFO BACKUP_START No previous backup found
2021-04-15 12:26:13.660 INFO BACKUP_INDEXING Indexing /Volumes/SmartDrive 8TB 3
2021-04-15 12:26:13.660 INFO SNAPSHOT_FILTER Parsing filter file /Users/nathanscherer/.duplicacy-web/repositories/localhost/7/.duplicacy/filters
2021-04-15 12:26:13.660 INFO SNAPSHOT_FILTER Loaded 0 include/exclude pattern(s)
2021-04-15 12:26:13.663 WARN LIST_FAILURE Failed to list subdirectory: open /Volumes/SmartDrive 8TB 3/.Trashes: permission denied
2021-04-15 12:26:15.602 INFO INCOMPLETE_LOAD Incomplete snapshot loaded from /Users/nathanscherer/.duplicacy-web/repositories/localhost/7/.duplicacy/incomplete
2021-04-15 12:26:15.602 INFO BACKUP_LIST Listing all chunks
2021-04-15 20:06:46.524 ERROR LIST_FILES Failed to list the directory chunks/: read tcp [2603:9001:340d:f291:7977:59b2:7968:efd9]:50487->[2607:f8b0:4002:c10::5f]:443: read: connection reset by peer
Failed to list the directory chunks/: read tcp [2603:9001:340d:f291:7977:59b2:7968:efd9]:50487->[2607:f8b0:4002:c10::5f]:443: read: connection reset by peer

If there is any other information needed - let me know. This keeps happening every time I start a backup. I have other hard drives that work, but I have three that fail every time. Oh, and this is backing up to Google Drive.

gchen · 16 April 2021 20:41

The log showed that it took 8 hours to list the chunks directory. Adding -threads 8 as an option to the job should help. If not, add -d as a global option to enable debug-level logging which should provide more information.
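For reference, the 8-hour figure comes from the gap between the BACKUP_LIST line and the LIST_FILES error in the log above. A quick sketch of that arithmetic in Python, with the two timestamps copied from the log (the variable names are only for illustration):

```python
from datetime import datetime

# Timestamps copied from the log in the first post.
FMT = "%Y-%m-%d %H:%M:%S.%f"
list_started = datetime.strptime("2021-04-15 12:26:15.602", FMT)  # "Listing all chunks"
list_failed = datetime.strptime("2021-04-15 20:06:46.524", FMT)   # "Failed to list the directory chunks/"

elapsed = list_failed - list_started
hours = elapsed.total_seconds() / 3600
print(f"Listing ran for {elapsed} (about {hours:.1f} hours) before the connection was reset")
```

That works out to roughly 7 hours 40 minutes spent on the listing step alone, before any file was uploaded.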

nathansmart · 16 April 2021 21:01

So I’m pretty new to all this. Are these correct settings?
[Screenshot: Screen Shot 2021-04-16 at 4.59.53 PM]

What does “-threads 8” mean? Or, rather, what is happening that it is taking 8 hours to list those chunks but it’s not doing that with other hard drives? And why would adding more threads help that? Is it just that it will list them faster? Why would it matter how long it takes to list the chunks? Shouldn’t it just keep listing them until it’s done regardless of the time?

I honestly don’t even know what “listing chunks” means so forgive my ignorant questions.

iocularis · 16 April 2021 21:28

Hi,
I suggest looking at the documentation: Chunk size details

There is a lot of information about how Duplicacy works all around the forum.

nathansmart · 17 April 2021 01:32

Thanks for the info. I still don’t really understand but that’s because I’m not really a techy person - I can handle myself a little bit but I’m not really familiar with backup concepts at all. I can follow what is being said but it doesn’t necessarily connect for me on a higher level. I’m a real dum dum!

nathansmart · 17 April 2021 01:35

Okay, I did that but it still failed. I have the full details though. Here is the log: http://nathansmart.com/show_log.txt

saspus · 17 April 2021 01:50

Can you connect to that Google Drive using other tools, like rclone or the native Google Drive client?

It seems the communication between you and Google's servers is broken:

Get https://www.googleapis.com/drive/v3/files?alt=json&fields[ SNIP ]: read tcp [2603:9001:340d:f291:1f4:1995:adc1:bc61]:54001->[2607:f8b0:4002:c08::5f]:443: read: connection reset by peer; retrying after 6.43 seconds (backoff: 4, attempts: 2)
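The "retrying after 6.43 seconds (backoff: 4, attempts: 2)" part of that line is a retry with a growing, randomized delay. The exact rule Duplicacy uses is not shown in this thread, so the Python below is only a generic sketch of exponential backoff with jitter. `call_with_backoff`, `flaky_list`, and all of the parameters are hypothetical names, not Duplicacy's code.

```python
import random
import time


def call_with_backoff(operation, max_attempts=5, initial_backoff=1.0,
                      max_backoff=64.0, sleep=time.sleep):
    """Call operation(); on ConnectionError wait a random, growing delay and retry."""
    backoff = initial_backoff
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except ConnectionError as err:
            if attempt == max_attempts:
                raise  # out of retries: let the caller see the error
            delay = random.uniform(0, 2 * backoff)  # jitter so retries do not line up
            print(f"{err}; retrying after {delay:.2f} seconds "
                  f"(backoff: {backoff:g}, attempts: {attempt})")
            sleep(delay)
            backoff = min(backoff * 2, max_backoff)  # double the backoff, up to a cap


# Hypothetical stand-in for a list request that fails twice, then succeeds.
remaining = {"failures": 2}


def flaky_list():
    if remaining["failures"] > 0:
        remaining["failures"] -= 1
        raise ConnectionError("read: connection reset by peer")
    return ["chunk-a", "chunk-b"]


# Pass a no-op sleep so the demo does not actually wait.
result = call_with_backoff(flaky_list, sleep=lambda seconds: None)
print(result)
```

The point of the pattern is that a single reset is survivable. The backup only fails when the resets keep coming until the retries run out.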

nathansmart · 17 April 2021 03:09

Yeah - I actually use Duplicacy for 8 drives uploading to Google Drive. Currently, 5 of them work, and 3 of them give me the same kinds of errors that the one I’m posting the log for gives. They all fail the same way. Could there be something about these hard drives that google doesn’t want to connect to? Meaning - are they corrupt or something?

saspus · 17 April 2021 03:26

You mean concurrently? Google may be rate limiting you. I'm not sure what the API would return in that case.

Remove -threads 8 and try to do it one by one.

If that works, you may want to consider issuing credentials from your own Google project; that might help.

nathansmart · 17 April 2021 03:31

I have done drives concurrently, but no, I am currently just doing one at a time. For the drives that are working, I have done two at a time. These three drives that fail just won’t connect no matter how I do it. There’s something specific about these drives that makes them fail every time (or, at least that’s how it feels).

If Google is rate limiting me, then it wouldn’t be letting me upload the other drives, right? For instance, this is what my screen looks like right now:

As you can see, the 14TB 1 drive failed (which is one of the three that always fail), but the 14TB 2 is working just fine.

saspus · 17 April 2021 03:48

If you have separate repositories/backup tasks for each drive, they may be running with a different number of threads or different credentials.

Have you tried reducing the number of threads?

The error is reported by the web service. There is nothing the source data can influence, other than the amount of data already backed up and hence larger list requests.
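To put rough numbers on "larger list requests": the number of chunks in a storage scales with the amount of data already uploaded, not with anything special about the source drive. A back-of-the-envelope sketch in Python, assuming an average chunk size of 4 MiB (Duplicacy's default, unless the storage was initialized with a different value) and purely hypothetical amounts of uploaded data:

```python
MIB = 1024 ** 2
TB = 1000 ** 4  # drive sizes are normally quoted in decimal terabytes

AVERAGE_CHUNK_SIZE = 4 * MIB  # assumption: default average chunk size


def estimated_chunks(uploaded_bytes, average_chunk_size=AVERAGE_CHUNK_SIZE):
    """Rough chunk count for a given amount of uploaded data (rounded up)."""
    return (uploaded_bytes + average_chunk_size - 1) // average_chunk_size


# Hypothetical amounts of data already in the storage.
for terabytes in (1, 8, 40):
    count = estimated_chunks(terabytes * TB)
    print(f"{terabytes:>3} TB uploaded -> roughly {count:,} chunks")
```

Under these assumptions, 8 TB of uploaded data is just under two million chunks, all of which have to be enumerated when the chunks directory is listed. Deduplication and compression would lower the real figure, so treat this as an upper-end estimate.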

nathansmart · 17 April 2021 04:18

I haven’t tried anything really. I just set this up with default settings last year and hoped for the best. It was working for all of my drives and only recently did I realize that these drives are failing because I have been going one by one and I’m finally to these drives. Why would Google be okay with all of the other drives (of varying size - some bigger, some smaller) but not these specific ones?

Now that I think about it - there is a chance that these drives may have gotten disconnected during an upload. Would that have messed something up?

What I’m trying now is deleting the drive from my backup list and then adding it again and restarting. I’m guessing that won’t matter since it doesn’t delete any data off of Google Drive. Is there a way to delete all of the stuff from these specific drives from GDrive? I went to my Drive but I couldn’t tell what would belong to these failing hard drives and what is connected to the ones that worked (except for the snapshots folder but the folders are empty for those drives of course).

saspus · 17 April 2021 05:39

You would not need to do that. You can clean the repo later.

From your log you are clearly running that specific backup task with 8 threads. Remove the -threads 8 argument and see if the issue reproduces.

nathansmart · 17 April 2021 10:40

Yes, I was running that but that’s because I was told earlier in the thread that it might fix the problem. When I first created this thread, I did not have any options in my backups - they were all default. I can run it again without the -threads 8 option and run the -d option so you can see the full log without it.

saspus · 17 April 2021 19:58

I think that recommendation was based on the ridiculously long time it took to list the chunks the first time. The second time (in the full log posted) most of it was cached and it took just a minute or two, so this rules out the suspected issue of, say, a token expiring during transfer.

How do you have things set up? One storage and multiple repositories targeting that storage? Or multiple storages? Are they all set up with the same token file?

Can you list the contents of that repository using a different tool, like rclone?

nathansmart · 18 April 2021 01:02

I have eight external hard drives (in two four-slot enclosures) set to back up to one folder on my Google Drive. Five of the drives work no problem. Three of them all have the same issue where it fails in the exact same way. I have no idea what is different about those three drives to make them fail while the other five work with no problem. I am able to access those three drives through other means so they don’t seem to be corrupt or broken.

I am on a Mac and I don’t really know much about this stuff at all. I found Duplicacy after having trouble with Arq Backup and set it up with basic default options. I just connected my Google Drive and added my hard drives. I don’t have any options set up and I haven’t tried to use anything else.

The question to me seems to be why does Google connect with those five drives, but refuses to connect with the other three? What is going on with those three drives to make Google fail on them? (I am currently backing up one of the five drives that works and I’m getting no error from Google)

gchen · 18 April 2021 03:37

If you add -d as a global option to enable debug level logging and post the log here we might be able to figure out what went wrong.

nathansmart · 18 April 2021 03:41

There is one up above but it was created with the -threads 8 option added. Here is a new one without any options added: http://nathansmart.com/show_log2.txt

saspus · 18 April 2021 05:47

It looks like Google drops the connection after 30 seconds of listing files:

2021-04-17 06:41:26.330 INFO BACKUP_LIST Listing all chunks
2021-04-17 06:41:26.331 TRACE LIST_FILES Listing chunks/
2021-04-17 06:41:56.906 DEBUG GCD_RETRY [0] Get....

Based on the size of the repo, there would be a massive number of chunks. Perhaps there should be (or maybe already is) a way to retrieve the list in pieces?
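On retrieving the list in pieces: the Google Drive v3 files.list call is paginated, with each response carrying a `nextPageToken` that is passed back as `pageToken` to request the next page. Whether and how Duplicacy does this internally is not shown in this thread, so the Python below is only a sketch of the pattern against a fake in-memory backend. `make_fake_backend`, `fetch_page`, `list_all`, and the page size are hypothetical.

```python
def make_fake_backend(names, page_size):
    """Return a fetch_page(page_token) function that serves names in pages.

    This stands in for the real HTTP request; no network is involved.
    """
    def fetch_page(page_token=None):
        start = int(page_token) if page_token else 0
        end = start + page_size
        response = {"files": names[start:end]}
        if end < len(names):
            response["nextPageToken"] = str(end)  # more pages remain
        return response
    return fetch_page


def list_all(fetch_page):
    """Yield every name by following nextPageToken until it is absent."""
    page_token = None
    while True:
        response = fetch_page(page_token)
        yield from response["files"]
        page_token = response.get("nextPageToken")
        if page_token is None:
            break


chunks = [f"chunk-{i:04d}" for i in range(2500)]
fetch_page = make_fake_backend(chunks, page_size=1000)
listed = list(list_all(fetch_page))
print(len(listed), "names retrieved")
```

With a listing split up this way, each request stays small. A dropped connection then only costs one page, provided the retry resumes from the last token rather than starting over.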

nathansmart · 18 April 2021 15:06

Does the number of chunks correspond with the number of files on a hard drive? Or the total size of the hard drive? How does Duplicacy decide how many chunks are in a repo?

EDIT: I read the article above and I see there is an algorithm but it still doesn’t make sense to me why bigger drives with more files are being allowed by Google, but not these drives.