Completely new: where to start

Send me a direct message with your PayPal or similar; I’d be more than happy to cover the cost. Your assistance has been incredibly valuable, and it’s only fair that I contribute.

The only other possibility I can imagine is that at some point I’ve set some config that is limiting duplicacy performance, but I cannot think of anything. Especially taking into account the first results obtained with Storj:

That is correct.

I actually did not completely understand the use of --transfers, so I’m a bit lost here.

But wouldn’t you also have obtained worse results with duplicacy than with rclone? Isn’t that a way to confirm it? I’m unsure whether the “patch notes” mention which library they are using.

This is above my current knowledge; I would have to read a bit and come back. I guess it would mean that duplicacy splits and encrypts the data, rclone uploads it to Storj, and on restore rclone downloads it and duplicacy does the opposite. How would it affect versioning? Would it still be possible? Anyway, it seems a bit of a complex solution.

To be honest, I simply have no clue what that means :rofl:

It is simply that I’m not in any hurry; in the end I’ve been doing the wrong thing for years and (luckily) so far I haven’t lost any data. I’d rather wait, understand what I’m doing (at least a bit), and have a setup that works properly.


No worries. I derive much pleasure from tinkering with this. I could have gone to the movies instead and paid more :slight_smile:

I don’t think there is any hidden config that would affect performance, and indeed, your earlier result is much better. If nothing else changed – maybe the performance of your ISP is inconsistent and all the issues are red herrings, or, if that was via a multi-hop VPN, maybe the routing happened to be favorable.

After reading the documentation, --transfers is how many files to transfer in parallel. So if you only have 10 files and set 20 as the parameter, it will transfer all 10 files in parallel.

For in-file parallel transfer there is indeed a different parameter. We don’t need that; our files are already small enough.

So duplicacy and rclone should be getting the same results.

I’m going to download stuff with rclone from my home connection tonight, varying the number of threads, and see how it compares with what the duplicacy benchmark reports.

Good point. I’ll try to retry the test; maybe I’ve screwed something up.

Nothing changes from duplicacy’s perspective; transport to the destination just takes two hops. Duplicacy thinks it backs up to sftp, while rclone pretends to be an sftp server but in reality uploads and downloads files from Storj. But I agree, it’s a bit of overkill. Actually, I’ll try that tonight as well: duplicacy benchmark directly and over rclone. I expect exactly the same numbers. But we’ll see.
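A rough sketch of that two-hop setup, in case it helps (the user, password, port, and paths here are made-up placeholders, and duplicacy’s exact sftp URL syntax is worth double-checking):

# terminal 1: rclone pretends to be an sftp server, backed by the storj remote
rclone serve sftp storj: --addr 127.0.0.1:2022 --user backup --pass secret

# terminal 2: duplicacy thinks it is backing up to plain sftp
cd /path/to/repository
duplicacy init my-snapshot-id sftp://backup@127.0.0.1:2022/duplicacy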


Yep. Of course I did. I forgot to specify chunk sizes when uploading and these were 4MB ones. Dangit.


Rclone, downloading 10 64MB Duplicacy chunks from storj, using latest rclone:

--transfers       1       2       4       10
Down, MB/s        17.1    25.6    52.4    64.0
CPU, %            75      118     236     291

Looks like with rclone at 10 threads we hit some kind of saturation – either my internet or the CPU. I have 4 CPU cores on that machine, and it’s probably busy with other stuff like SMB. Download reached 64 MB/s.

rclone download logs, 1–10 threads
alex@truenas ~/tests-duplicacy-storj/rclone-v1.64.0-freebsd-amd64 % rm -rf /tmp/rclone && mkdir /tmp/rclone && time ./rclone copy -P --transfers 1 storj:duplicacy /tmp/rclone
Transferred:   	      640 MiB / 640 MiB, 100%, 17.060 MiB/s, ETA 0s
Transferred:           10 / 10, 100%
Elapsed time:        40.0s
./rclone copy -P --transfers 1 storj:duplicacy /tmp/rclone  24.90s user 5.37s system 75% cpu 40.211 total
alex@truenas ~/tests-duplicacy-storj/rclone-v1.64.0-freebsd-amd64 % rm -rf /tmp/rclone && mkdir /tmp/rclone && time ./rclone copy -P --transfers 2 storj:duplicacy /tmp/rclone
Transferred:   	      640 MiB / 640 MiB, 100%, 25.638 MiB/s, ETA 0s
Transferred:           10 / 10, 100%
Elapsed time:        24.7s
./rclone copy -P --transfers 2 storj:duplicacy /tmp/rclone  24.27s user 5.14s system 118% cpu 24.844 total
alex@truenas ~/tests-duplicacy-storj/rclone-v1.64.0-freebsd-amd64 % rm -rf /tmp/rclone && mkdir /tmp/rclone && time ./rclone copy -P --transfers 4 storj:duplicacy /tmp/rclone
Transferred:   	      640 MiB / 640 MiB, 100%, 52.416 MiB/s, ETA 0s
Transferred:           10 / 10, 100%
Elapsed time:        12.6s
./rclone copy -P --transfers 4 storj:duplicacy /tmp/rclone  25.79s user 4.45s system 236% cpu 12.776 total
alex@truenas ~/tests-duplicacy-storj/rclone-v1.64.0-freebsd-amd64 % rm -rf /tmp/rclone && mkdir /tmp/rclone && time ./rclone copy -P --transfers 10 storj:duplicacy /tmp/rclone
Transferred:   	      640 MiB / 640 MiB, 100%, 63.899 MiB/s, ETA 0s
Transferred:           10 / 10, 100%
Elapsed time:        10.5s
./rclone copy -P --transfers 10 storj:duplicacy /tmp/rclone  27.42s user 3.65s system 291% cpu 10.654 total

I’ve re-run the duplicacy benchmark with 10 threads and got 46.6 MB/s download performance.

duplicacy benchmark logs, 10 threads
alex@truenas ~/tests-duplicacy-storj % ./duplicacy benchmark -chunk-count 10 -chunk-size 64 -download-threads 10 -upload-threads 10 -storage storj
Storage set to storj://12EayRS2V1kEsWESU9QMRseFhdxYxKicsiFmxrsLZHeLUtdps3S@us1.storj.io:7777/duplicacy/duplicacy
Generating 256.00M byte random data in memory
Writing random data to local disk
Wrote 256.00M bytes in 0.04s: 6408.22M/s
Reading the random data from local disk
Read 256.00M bytes in 0.02s: 10416.08M/s
Split 256.00M bytes into 4 chunks without compression/encryption in 17.09s: 14.98M/s
Split 256.00M bytes into 4 chunks with compression but without encryption in 17.47s: 14.65M/s
Split 256.00M bytes into 4 chunks with compression and encryption in 17.88s: 14.32M/s
Deleting 10 temporary files from previous benchmark runs
Generating 10 chunks
Uploaded 640.00M bytes in 793.42s: 826K/s
Downloaded 640.00M bytes in 13.73s: 46.63M/s
Deleted 10 temporary files from the storage

That’s a pretty big discrepancy right there.

I’m going to try to rebuild duplicacy with an updated storj library and retry. Duplicacy uses 1.9.0 and the current one is 1.12.0. The changelog does mention the word “performance”, so I’ll just rebuild and see.

Rebuilt, with storj/uplink 1.12.0
git clone https://github.com/gilbertchen/duplicacy && cd duplicacy

# go.mod: bump the storj uplink dependency
# -       storj.io/uplink v1.9.0
# +       storj.io/uplink v1.12.0

go mod tidy
go tool dist list | grep -i free | grep 64   # confirm freebsd/amd64 is a valid build target
GOOS=freebsd GOARCH=amd64 go build -o duplicacy_storj_1.12.0 duplicacy/duplicacy_main.go

Download performance improved to 59M/s
alex@truenas ~/tests-duplicacy-storj % ~/duplicacy_storj_1.12.0 benchmark -chunk-count 10 -chunk-size 64 -download-threads 10 -upload-threads 10 -storage storj
Storage set to storj://12EayRS2V1kEsWESU9QMRseFhdxYxKicsiFmxrsLZHeLUtdps3S@us1.storj.io:7777/duplicacy/duplicacy
Generating 256.00M byte random data in memory
Writing random data to local disk
Wrote 256.00M bytes in 0.08s: 3190.96M/s
Reading the random data from local disk
Read 256.00M bytes in 0.02s: 10409.92M/s
Split 256.00M bytes into 6 chunks without compression/encryption in 16.83s: 15.21M/s
Split 256.00M bytes into 6 chunks with compression but without encryption in 17.23s: 14.86M/s
Split 256.00M bytes into 6 chunks with compression and encryption in 17.44s: 14.68M/s
Generating 10 chunks
Uploaded 640.00M bytes in 796.58s: 823K/s
Downloaded 640.00M bytes in 10.80s: 59.25M/s
Deleted 10 temporary files from the storage

I’m re-running with the stock and rebuilt duplicacy a few more times to ensure it’s not a fluke. This will have to wait a few hours, as there is quite a lot of activity on my network.


@saspus is once again a hero to the community trying to help us solve these issues. Following with interest as it seems related to my storj problems, too (linked in #15).

I’ve set up a series of benchmark runs alternating stock duplicacy with the one with the updated storj library, alternating their order, with slightly more data (chunk size 64, count 32) to make the download take longer, and with the number of threads reduced to 4 to eliminate any saturation. 10 runs total.
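Roughly along these lines (binary names are the ones from the earlier posts; the loop itself is just a sketch):

# alternate the stock and rebuilt binaries, 4 threads, 32 x 64MB chunks, 5 passes each
for i in 1 2 3 4 5; do
  ./duplicacy benchmark -chunk-count 32 -chunk-size 64 -download-threads 4 -upload-threads 4 -storage storj
  ~/duplicacy_storj_1.12.0 benchmark -chunk-count 32 -chunk-size 64 -download-threads 4 -upload-threads 4 -storage storj
done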

I did not get as clear a picture. The download performance was all over the place, between 38 and 60 MB/s (yes, on just 4 threads, which is interesting):

  • the updated version showed values anywhere between 38 and 60 MB/s,
  • and the stock version from 38 to 45 MB/s.

It seems it could have been an improvement, but it could also have been random chance due to changing network conditions.


I’ve shared the binaries with the updated storj library here, if you want to try them:

After moving from one part of the Bay Area that had symmetric AT&T fiber to one that only offers Comcast, I’m stuck with the same.

I see that you wrote this is the maximum available for you. I was also on 1000/30 until I saw that they offered a 1200 tier. I initially considered it overkill for my uses, but what wasn’t immediately clear was that it also came with an upgraded 40 Mbps upstream. I figured you had already checked this, but just in case…

Thank you. They actually silently upgraded my connection to 40 Mbps upstream literally yesterday, without telling anyone, after a few nights of outages. Apparently they are upgrading the equipment, and soon it will be possible to get up to 200 Mbps upstream with a new modem. I’m looking forward to that :slight_smile:


So it wasn’t possible to get a conclusive answer; I can try running a test with the updated version you posted.
Would it be enough to run the .exe as with regular duplicacy? Does something have to be done with go.mod?

Glad to hear that!

Yes, you can just run the .exe. I’ve uploaded the .mod just as a reference for which versions of which modules it was built with.


I’ve just repeated the benchmark and got 27.79 MB/s upload and 70.29 MB/s download for THE STOCK VERSION; I really don’t know why. Maybe something network related?
Yours reported 27.21 and 79.68 MB/s, and this has been consistent across multiple tries, so it definitely seems to be an improvement!

Download speed is the same as with rclone; I assume the upload speed is correct? Shouldn’t both be similar, since it is a symmetric connection?

Internet weather :slight_smile: maybe some bottleneck somewhere in the network… who knows.

Oh, this is nice!

@gchen, could you please bump the storj/uplink version in the next duplicacy release? It seems to perform slightly better, and the changelog does mention some performance improvements, so it seems worthwhile.


I’m not sure I understand – do you mean the download matches rclone’s but the upload does not?

I worded that poorly: when I measured speeds with rclone, I only measured the download speed, which appears to be consistent with what I’m currently seeing with duplicacy. Is it normal to have such a significant difference between upload and download speeds when using a symmetric fiber connection?

I might explore how to benchmark upload speed with rclone and give it a try.
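Something like the following should roughly mirror the earlier download test in the other direction (the /tmp path and the bucket prefix are placeholders):

# generate 10 x 64 MB files of random data, matching the download test
mkdir -p /tmp/upload-test
for i in 1 2 3 4 5 6 7 8 9 10; do
  dd if=/dev/urandom of=/tmp/upload-test/chunk-$i bs=1048576 count=64
done

# upload them in parallel, note the speed rclone reports, then clean up
rclone copy -P --transfers 10 /tmp/upload-test storj:upload-test
rclone purge storj:upload-test
rm -rf /tmp/upload-test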

Tomorrow I’ll finally begin the backup, and I’m hoping it will be a much smoother process :joy:

I want to express my gratitude once again for all the time you’ve taken to assist me. This has been an incredibly interesting and educational experience.


Based on this:

Much effort has gone into optimizing this process, for instance when you download a file we attempt to grab 39 pieces when only 29 are required eliminating slow nodes (Long Tail Elimination). This is our base parallelism and allows up to 10 nodes to respond slowly without affecting your download speeds.

When uploading we start with sending 110 erasure-coded pieces per segment in parallel out to the world but stop at 80. This has the same effect as above in eliminating slow nodes (Long Tail Elimination).

I’d expect upload to be slower: it needs to upload about 2.7 times more data (80 pieces are uploaded for every 29 pieces’ worth of data, and 80/29 ≈ 2.76).

Since with a gigabit connection you can upload at most around 120 MB/s, the Storj upload will therefore be capped at about 120 / 2.7 ≈ 44 MB/s.

If you want to upload faster, you either need a bigger upstream or need to use a gateway – either your own in the cloud or the Storj-hosted one (assuming the gateway can sustain the gigabit upstream from you). The 2.7× expansion will then happen on the gateway.
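As a very rough sketch, the Storj-hosted S3 gateway can be reached through rclone’s S3 backend with something like the following (the remote name and credentials are placeholders; the parameter names should be double-checked against rclone’s S3 provider docs):

# create an rclone remote pointing at the hosted gateway and push data through it
rclone config create storj-s3 s3 provider Storj endpoint gateway.storjshare.io access_key_id YOUR_KEY secret_access_key YOUR_SECRET
rclone copy -P --transfers 10 /path/to/data storj-s3:my-bucket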

With download, you only download the minimum number of pieces necessary to reconstruct the files. More transfers than the minimum required are started, but once enough pieces have been downloaded the others are cancelled. There is some overhead associated with clogging your connection with extra transfers that will get cancelled, but the net benefit is positive because you avoid waiting for slow nodes, and the fast ones are effectively self-selected.

You are welcome, and thank you too for the opportunity to learn something new: how to cross-compile Go programs, how to update dependencies and manage modules, and how to start an instance on AWS (I had been using Oracle Cloud before; AWS is much nicer to deal with – much less waiting and clicking around, and a much more streamlined and intuitive process. Oracle, however, provides 10 TB of free egress monthly… ).


I see, it is a direct consequence of how it is designed.

What I’m thinking about is that we’ve always set the chunk size to 64 MB to get optimum performance out of the native protocol. However, I’m not sure if it is possible to set an exact chunk size outside of the benchmark tool. As you mentioned before, -c 32 would control the average chunk size, but this leaves a lot of room for variation far from 64. I assume it is not possible to force an exact chunk size, since file sizes would always have to be multiples of 64 MB, so I’m thinking of using -c 64 -max 64 so that it tries to create 64 MB chunks but leaves room to create smaller ones when needed. Does this make any sense?

Which is not really a problem in a backup scenario: backing things up is a background process, so if it goes slowly it does not matter much (except for the initial backup, but if you have a lot of data you can do the initial backup over S3). When you need to access the data, it will be fast.

Hmmm. Wouldn’t this mean that all chunks are 64 MB? If max is equal to the average, then min must be equal to it too.

I think setting the average to 32 MiB should provide a good balance. It still keeps the chunk size variable, which is good for heterogeneous content, and it’s still almost an order of magnitude (8×) larger than the default (4 MiB).

There are scenarios where a fixed chunk size is actually preferred – backing up virtual disks is one example. In fact, Vertical Backup, by the same developer, uses a fixed block size, because it’s best suited to this type of content.
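For reference, this is roughly how those chunk sizes would be passed when initializing the storage (the snapshot ID and storage URL are placeholders, and I believe -min and -max default to a quarter and four times the average, so check duplicacy init -help before relying on this):

# variable-size chunks averaging 32 MiB
duplicacy init -c 32M my-snapshot-id storj://access-grant@us1.storj.io:7777/bucket/path

# fixed 64 MiB chunks: set min, average, and max to the same value
duplicacy init -c 64M -min 64M -max 64M my-snapshot-id storj://access-grant@us1.storj.io:7777/bucket/path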


Sure, it is not a problem by any means; I was just wondering about the reason behind that difference.

Yeah, you are right, I did not put much thought into it.

I’ll just go for that then, thanks once again :grin:
