Copy command never works

I’ve never had the Copy command work successfully for my backups. I get ERROR DOWNLOAD_CORRUPTED failures every time I set it up, like this:

2023-07-12 07:31:09.953 WARN DOWNLOAD_RETRY The chunk d420a34b85a9c07b38626f237558a8fbe7d49047b2a1d59e806f8420aff3ecc8 has a hash id of 18e98303aa96d04873565f7b0e599a98ef10811d78074946c03ab6c0cd28d1ed; retrying
2023-07-12 07:31:09.993 ERROR DOWNLOAD_CORRUPTED The chunk d420a34b85a9c07b38626f237558a8fbe7d49047b2a1d59e806f8420aff3ecc8 has a hash id of 18e98303aa96d04873565f7b0e599a98ef10811d78074946c03ab6c0cd28d1ed
The chunk d420a34b85a9c07b38626f237558a8fbe7d49047b2a1d59e806f8420aff3ecc8 has a hash id of 18e98303aa96d04873565f7b0e599a98ef10811d78074946c03ab6c0cd28d1ed

I’ve tried fresh Docker installs with clean, first-time backups, and I’ve tried various storage targets such as OneDrive, B2, and a local USB HDD. The storages are always initialized as copy-compatible with the source.

This has been an issue for as long as I’ve used Duplicacy (about a year now), and it got so annoying that I gave up on it entirely for a while and tried switching to CloudBerry Backup, which had its own set of unresolvable problems.

What am I doing wrong with the Copy command?

You mention you tried different storage targets, but what about your source backup storage? Has that ever changed? Have you tried creating a storage with Erasure Coding enabled?

You should probably run a check -chunks on the source first.

Sounds to me like your RAM could be a bit iffy and you might wanna check your system with memtest86+.

The source is a backup on a local HDD. The disk itself hasn’t changed, but the backups have, since I’ve tried fresh Docker setups a few times. I haven’t ever created the storage with Erasure Coding, since I’m not sure what that feature is.

I haven’t run a memtest on the system, but I don’t believe the RAM is failing. The local disk backups (that are supposed to be getting copied) have Check tasks running on a schedule and they’re all passing.

Ok, fair enough - the common denominator with all these chunk errors is that you’re copying from a single local source, so that’s where to start.

A normal check won’t test the integrity of the data within those chunks, however, so you should do a check with the -chunks flag on that source storage.
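Something along these lines, run from the repository directory - a rough sketch, where “local” is just an example of whatever you named your source storage:

    # -a checks every snapshot ID, -chunks downloads and verifies the content of
    # each chunk, and -persist keeps going past errors so you get the full list
    # of bad chunks instead of stopping at the first one.
    duplicacy check -storage local -a -chunks -persist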

Okay, I didn’t realize that was a different check with the chunks flag. I just ran it with that flag and two came back as Failed. Is that indicative of RAM issues or something else? I would need to take the NAS, which runs my other local services, offline to do a memtest86 check, and I’m hesitant to do that because it would be down for a few hours.

Not necessarily. I only mentioned memtest86+ because at first it sounded like you were trying to copy from different sources, which would have been indicative of dodgy memory.

It’s likely that you just have a few bad chunks and it is easily fixable.

What you could do is identify all those chunks (I can’t remember if you have to run check -chunks -persist to get a complete list) and rename them to .bak.

Then re-run a normal check to get a list of all the snapshots that reference missing chunks.

Manually rename those bad snapshot revisions to .bak too. (The numbered revision files under snapshots/)

Re-run checks to ensure your source storage is in a consistent state, and you should be able to copy from it.
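On a local disk storage, the whole procedure looks roughly like this. The chunk ID is the one from your log, Duplicacy’s default layout puts a chunk under a subdirectory named after its first two hex characters, and the snapshot ID and revision numbers below are made-up placeholders:

    # In the root of the local storage: rename the bad chunk(s) out of the way.
    # A chunk with ID d420a34b85a9... is normally stored at chunks/d4/20a34b85a9...
    mv chunks/d4/20a34b85a9c07b38626f237558a8fbe7d49047b2a1d59e806f8420aff3ecc8{,.bak}

    # After a plain check lists the revisions that now reference missing chunks,
    # rename those revision files too (example ID and revisions):
    mv snapshots/nas-docker/3{,.bak}
    mv snapshots/nas-docker/7{,.bak}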

If you already have a second storage where you’ve perhaps already copied some of those chunks, you may be able to fix your primary storage by copying chunks in the other direction, though it requires a bit of a hack. If you don’t, not a problem. But this is why it’s generally good to have at least two copies in addition to your original; one storage can be used to fix another.

Okay, so I identified the bad chunks and there aren’t that many - only five in total. The issue is that it’s every single snapshot: there are 86 snapshots for each backup ID, and all 86 are now listed as missing those chunks. Since it goes all the way back to snapshot 1, that suggests to me there is no “consistent state” to return to.

The backups are pretty new - I only started them on July 3.

Ok, what you can try is renaming all those chunks to .bak and running fresh backups using dummy IDs - e.g. C-Users_DUMMY - backing up the same locations you’re already backing up, just with different IDs.

You might get lucky and the fresh backup will deterministically generate identical chunk IDs and fix your original backups.

You can improve the chances of this by making a temporary location with a copy of the files as close as possible to how they were at the time of snapshot revision 1. A restore -persist could get you there, and then you could fill in any missing files (it might also help you identify which files are referenced by the bad chunks). But since all your snapshots reference these chunks, it’s likely you won’t need to do any of that and a fresh backup of your current files will do the job.

After a fresh backup, run a regular check to see if the missing chunks were filled in.
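A minimal sketch of that, assuming the repository is already set up; “dummy”, C-Users_DUMMY and the path are all made-up names, and add -e if your storage is encrypted:

    # Register the same local storage a second time under a dummy snapshot ID:
    duplicacy add dummy C-Users_DUMMY /mnt/backup/duplicacy

    # Back up the same files under the dummy ID; identical content should
    # deterministically produce identical chunks, re-creating the renamed ones:
    duplicacy backup -storage dummy -stats

    # Then a regular check across all IDs to see if the missing chunks are back:
    duplicacy check -storage dummy -a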

If you don’t get lucky, you may have to abandon the old IDs and start anew. Most of the chunks won’t need to be packed again, so you could just switch to a new set of IDs, manually delete all the bad IDs under snapshots/, and clean up with prune -exhaustive. (You can reuse the old IDs, but don’t run a prune until the new backups are all complete.) You’ll find it won’t take long at all to complete the new backups.
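If you do go the start-anew route, the cleanup is roughly this (names are placeholders again):

    # On the local storage itself, remove each snapshot ID you're abandoning:
    rm -r snapshots/old-broken-id

    # Then, from the repository, remove every chunk no longer referenced by any
    # remaining snapshot. -exclusive deletes immediately rather than via the
    # two-step fossil process, so make sure no backups are running meanwhile.
    duplicacy prune -storage local -exhaustive -exclusive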

To mitigate this in the future, you could initialise your local storage with Erasure Coding - corrupted chunks can often self-heal. And have a secondary storage so each can patch the other. Run regular checks, but also semi-regular check -chunks. (If you get into the habit of copying between two storages, the act of reading from the source will reveal any corruption, so you can make the check -chunks much less regular there.)
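For reference, Erasure Coding is just an option when the storage is first created - something like this, where the ID and path are examples and 5:2 means 5 data shards plus 2 parity shards per chunk:

    # -e encrypts the storage (optional); -erasure-coding 5:2 writes each chunk
    # as 5 data + 2 parity shards so locally corrupted chunks can be repaired.
    duplicacy init -e -erasure-coding 5:2 my-backup-id /mnt/backup/duplicacy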

To your point of abandoning the old IDs and starting anew, that’s essentially what I’ve done several times already. As I mentioned initially, I’ve tried clean-slate Docker images many times over, trashing all the local and remote backups and starting from scratch each time. I would like to assume that starting fresh wouldn’t generate bad chunks every single time.

I had a second backup going to B2 (a complete backup, since copy never works), so I decided to run check -chunks -persist on that as well. That target is even newer than my local backup, and chunks there are failing the check too. At first I was confident the backups were good, since the backups and checks were passing, but now I’m concerned that none of them are.

I can try again from the beginning, this time creating the storages with Erasure Coding enabled if you think that would help.

Personally, if I were you, I’d run that memtest86+ for a few passes just to be sure you don’t have an issue with RAM, because Duplicacy is pretty robust and bad chunks shouldn’t normally happen - especially on cloud storage. (I can’t say the same for missing chunks, but that’s another topic :wink: )

Erasure Coding is pretty useful for local backups, but you shouldn’t need to have it on for B2 (and indeed, it’ll use extra storage space for no real advantage). The good news is you can have an Erasure Coded local storage and copy to a non-Erasure B2 storage.

Again, you can save time by removing all snapshot files and letting Duplicacy fix missing chunks. However, keep in mind it won’t fix chunks that are corrupted - if it sees the deterministic chunk ID on the storage already, it’ll assume it’s in good nick and move on to uploading the next.

This is why you should weed out every bad chunk by deleting it or renaming it out of the way. A missing chunk is actually better than a bad chunk (Erasure Coding excluded). You can really only do that while you still have at least one snapshot file, by running a check -chunks or otherwise reading from the storage to reveal what’s bad.

Since egress is usually expensive with B2, you can cheat and skip most of the re-upload. Here’s how I’d do it (rough commands are sketched after the list):

  1. Create a brand new local storage and make it copy-compatible with B2, add Erasure Coding 5:2
  2. Back up to your local storage, using IDs different from those of your B2 storage. This is called pre-seeding.
  3. Copy B2 to local (existing chunks should be skipped, bad chunks will be highlighted to you).
  4. If you run into bad chunks on B2, delete/rename them and keep copying to local (don’t worry, the bad chunk won’t get copied).
  5. Delete all snapshot IDs in snapshots/ on your B2 storage (you should now only have good chunks/)
  6. Cleanup part 1. Delete whichever IDs on your local storage you don’t want to keep (in fact, you can do it all again: wipe your local snapshots/ and do a final round of local backups if you want).
  7. Copy your local to B2 (again, most chunks should already exist - you’re only sending snapshot IDs).
  8. Cleanup part 2. Now that you have at least one snapshot ID on local and B2, you can remove unreferenced chunks on both with a prune -exhaustive -exclusive.
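Roughly, with the CLI, those steps could look like the following - every storage name, ID and path is an example, so substitute your own and double-check the flags against your Duplicacy version:

    # 1. New local storage, copy-compatible with the existing B2 storage
    #    ("b2" = your existing storage name; "local" and "nas-seed" are made up;
    #    -e is the optional encryption flag):
    duplicacy add -copy b2 -e -erasure-coding 5:2 local nas-seed /mnt/backup/duplicacy

    # 2. Pre-seed: back up to the local storage under the new ID:
    duplicacy backup -storage local -stats

    # 3/4. Copy B2 -> local. Reading from B2 flags bad chunks; rename or delete
    #      those on B2 and re-run until the copy finishes cleanly:
    duplicacy copy -from b2 -to local

    # 5. Delete the revision files under snapshots/ on the B2 storage (via the
    #    B2 web UI or its CLI), leaving only verified chunks behind.

    # 6/7. Copy local -> B2; chunks already on B2 are skipped, so this mostly
    #      uploads just the snapshot files:
    duplicacy copy -from local -to b2

    # 8. Remove unreferenced chunks on both sides (-exclusive needs exclusive
    #    access, i.e. nothing else writing to either storage at the time):
    duplicacy prune -storage local -exhaustive -exclusive
    duplicacy prune -storage b2 -exhaustive -exclusive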

That’s if you wanna save time/cost with your B2 storage.

Many people want to start from scratch and that’s perfectly understandable, but if you know a little bit about how Duplicacy works under the hood, the way it’s been designed makes it entirely possible to fix (remote) storages in these circumstances.

Sorry for the late reply. I’ve been super busy and haven’t been able to get back to this.

I’m going to try to carve out some time to memtest the system and then start from scratch again with Erasure Coding on the local backup. Hopefully I don’t come back here with no RAM issues but still having chunk issues.

Hey, I know it’s been a long time, but I finally made some progress on this. I ended up migrating my NAS to a new NUC server and the backup has been working perfectly fine for the past week. Since the old machine that was giving me all these troubles is now idle, I started a memtest on it.

Six minutes into a 32GB test and it has already found 114 errors, so you were likely correct in your original assessment. I’ll let the whole test finish, but it’s already clear that the RAM has got to go. Thank you!

Good to hear!

Tis funny you mention this, as I was only today suggesting someone run a memtest86+ (their PC was bluescreening or randomly rebooting during gaming sessions).

Apparently, they’d already run the test, and it was even crashing during the memtest (which suggests other hardware issues, such as a weak PSU, although the RAM could still be iffy there).

Anyway, in your case - re-seating the RAM might work, or swapping the sticks over - but I would run memtest on each stick individually, to identify the culprit if it is just a single stick. Good luck!

