Restore stats look weird?

OK, so I tried to test-restore a file from storage (GCD, but it probably doesn’t matter). Admittedly, I used quite an extreme case, restoring a 250GB file out of an 800GB storage, but some things don’t quite check out, or at least raise questions. Here are some log snippets:

2022-06-09 03:50:20.560 INFO RESTORE_START Restoring /mnt/NAS1/Temp to revision 5
2022-06-09 05:17:52.046 INFO DOWNLOAD_PROGRESS Downloaded chunk 1 size 2592726, 0B/s 00:00:00 0.0%

It took about 1.5 hours between RESTORE_START and the start of the actual chunk downloads. So what is happening in that timeframe? There was no significant ingress/egress activity on the router.

Transfer speed/ETA is borked; I think it takes into account these 1.5 hours during which nothing happened. -stats reports 0B/s for downloads, which is definitely incorrect (the download looks to be about 6-7MB/s according to the router stats). It eventually picks up - to 1B/s, then 2B/s, and so on :wink:

2022-06-09 05:22:47.615 INFO DOWNLOAD_PROGRESS Downloaded chunk 334 size 2573734, 0B/s 00:00:00 0.6%
2022-06-09 05:22:48.179 INFO DOWNLOAD_PROGRESS Downloaded chunk 333 size 4196292, 0B/s 00:00:00 0.6%
2022-06-09 05:22:48.808 INFO DOWNLOAD_PROGRESS Downloaded chunk 332 size 16777216, 0B/s 00:00:00 0.6%
2022-06-09 05:22:49.833 INFO DOWNLOAD_PROGRESS Downloaded chunk 335 size 14150015, 1B/s 2943499 days 18:19:57 0.6%
2022-06-09 05:22:50.326 INFO DOWNLOAD_PROGRESS Downloaded chunk 336 size 1230363, 1B/s 2943485 days 12:33:54 0.6%
2022-06-09 05:22:51.172 INFO DOWNLOAD_PROGRESS Downloaded chunk 338 size 3120998, 1B/s 2943449 days 09:37:16 0.6%

2022-06-09 05:27:32.474 INFO DOWNLOAD_PROGRESS Downloaded chunk 665 size 1486058, 1B/s 2924490 days 18:00:55 1.2%
2022-06-09 05:27:32.797 INFO DOWNLOAD_PROGRESS Downloaded chunk 666 size 2160463, 1B/s 2924465 days 17:53:12 1.2%
2022-06-09 05:27:33.036 INFO DOWNLOAD_PROGRESS Downloaded chunk 667 size 2186859, 1B/s 2924440 days 10:25:33 1.2%
2022-06-09 05:27:34.351 INFO DOWNLOAD_PROGRESS Downloaded chunk 668 size 6201784, 1B/s 2924368 days 15:42:29 1.2%
2022-06-09 05:27:34.406 INFO DOWNLOAD_PROGRESS Downloaded chunk 671 size 1467061, 2B/s 1462175 days 20:05:44 1.2%
2022-06-09 05:27:35.669 INFO DOWNLOAD_PROGRESS Downloaded chunk 670 size 5379579, 2B/s 1462144 days 16:55:55 1.2%
2022-06-09 05:27:36.815 INFO DOWNLOAD_PROGRESS Downloaded chunk 672 size 5126146, 2B/s 1462115 days 00:58:02 1.2%

2022-06-09 05:32:15.184 INFO DOWNLOAD_PROGRESS Downloaded chunk 991 size 1106953, 2B/s 1452634 days 04:42:00 1.9%
2022-06-09 05:32:16.687 INFO DOWNLOAD_PROGRESS Downloaded chunk 990 size 3241003, 2B/s 1452615 days 10:33:38 1.9%
2022-06-09 05:32:16.815 INFO DOWNLOAD_PROGRESS Downloaded chunk 992 size 4504124, 3B/s 968392 days 21:59:31 1.9%
2022-06-09 05:32:16.854 INFO DOWNLOAD_PROGRESS Downloaded chunk 993 size 3543819, 3B/s 968379 days 05:51:38 1.9%

Actually, I just checked a restore of a much smaller file (so there is no delay before the chunks download), and the download speed/ETA still show the same behaviour (i.e. basically all zeroes to start with).

I think it is a bug introduced by the major code changes for the memory optimization. Are you running an executable built from the master branch?

The stats bug should be fixed by this commit: Initialize startTime correctly in CreateChunkOperator (gilbertchen/duplicacy@fc2386f on GitHub).
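To illustrate the failure mode (a minimal sketch only; the progress struct and report function below are made up for the example, not the code from that commit): if startTime is left unset, or is set long before chunk downloads actually begin, the elapsed time is huge, the integer byte rate truncates to 0B/s, and the ETA becomes meaningless, which is what the first log excerpt shows.

```go
package main

import (
	"fmt"
	"time"
)

type progress struct {
	startTime       time.Time // must be set when chunk downloads actually begin
	downloadedBytes int64
	totalBytes      int64
}

func (p *progress) report() {
	elapsed := time.Since(p.startTime).Seconds()
	speed := int64(float64(p.downloadedBytes) / elapsed) // integer B/s, truncates toward zero
	eta := "unknown"
	if speed > 0 {
		remaining := p.totalBytes - p.downloadedBytes
		eta = (time.Duration(remaining/speed) * time.Second).String()
	}
	fmt.Printf("%dB/s, ETA %s\n", speed, eta)
}

func main() {
	p := &progress{downloadedBytes: 2_592_726, totalBytes: 250e9}

	// startTime left at the zero value (year 1): elapsed is enormous, the
	// speed truncates to 0B/s, and no sane ETA can be computed.
	p.report()

	// startTime set when the first chunk download actually starts: the
	// numbers become reasonable.
	p.startTime = time.Now().Add(-4 * time.Second)
	p.report()
}
```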

As for the 1.5 hours, I think it was spent downloading all the metadata chunks for that revision, but I agree it should not take that long. If you clear the cache under .duplicacy/cache and rerun the restore command with -d as the global option, it will tell you more about what it is doing during this period.
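Roughly speaking, the restore goes through two phases; here is a highly simplified sketch (hypothetical names and placeholder data, not the actual Duplicacy code) of why the log stays quiet until the data chunks start:

```go
package main

import "fmt"

type chunkID string

// fetchChunkSequence stands in for downloading and decoding the metadata
// chunks that list which data chunks make up the file in this revision.
// For a very large file this phase itself takes a while, and nothing is
// logged as DOWNLOAD_PROGRESS during it.
func fetchChunkSequence(revision int) []chunkID {
	return []chunkID{"chunk-0001", "chunk-0002", "chunk-0003"} // placeholder
}

// downloadChunk stands in for fetching one data chunk from the storage backend.
func downloadChunk(id chunkID) {
	fmt.Println("DOWNLOAD_PROGRESS", id)
}

func main() {
	// Phase 1: assemble the chunk list for the revision (the quiet period
	// between RESTORE_START and the first DOWNLOAD_PROGRESS line).
	chunks := fetchChunkSequence(6)

	// Phase 2: only now are the data chunks fetched and progress reported.
	for _, id := range chunks {
		downloadChunk(id)
	}
}
```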

I do indeed; I needed it for the app-specific token support for GDrive. BTW, do we expect a new official release of the CLI? Running a custom-built CLI alongside the official GUI is a bit sketchy :wink: It’s been a while, too.

Will check restore with -d, thanks!

OK, so I tried running the restore with -d, and it didn’t show anything insightful. There is still the same big time gap with no info in the log. The thing is, there is no significant bandwidth utilization during that time (compared to when the chunks finally start downloading), so I am not sure what the bottleneck is, as it is not clear what is happening.

Also, is there a way to add the -d flag with the GUI setup (manual editing of configs is fine)? I had to run it from the CLI, and unless I am missing something, you can’t just use a storage name set up in the GUI, or can you? So I basically had to reconfigure the storage with init before the restore; am I missing something?

You can add -d | in the Options input on the restore page. Anything before the | will be interpreted as global options.

Can you post the log with -d here?

Here you go:

Nothing super exciting there, though; the gap is still there:

2022-06-11 02:31:31.610 INFO RESTORE_START Restoring /mnt/NAS1/Temp to revision 6
2022-06-11 02:31:31.743 TRACE DOWNLOAD_START Downloading Snapshots/@transient/@Veeam/Alpha/Alpha2022-06-06T053016.vbk
2022-06-11 03:44:36.320 DEBUG DOWNLOAD_PREFETCH Prefetching Snapshots/@transient/@Veeam/Alpha/Alpha2022-06-06T053016.vbk chunk 668ea3cca3edb60c31ab469b4c4e57fad07b016d5dcbad570071901322e948a1
2022-06-11 03:44:36.326 TRACE DOWNLOAD_INPLACE Updating /mnt/NAS1/Temp/Snapshots/@transient/@Veeam/Alpha/Alpha2022-06-06T053016.vbk in place
2022-06-11 03:44:38.675 INFO DOWNLOAD_PROGRESS Downloaded chunk 1 size 2592726, 590B/s 5021 days 09:57:55 0.0%

You can also see that the ETA/speed fix helped (we no longer sit at 1B/s for ages), but it still starts off incorrectly (probably because it accounts for the 1hr+ during which no download happens yet):

2022-06-11 03:44:41.658 DEBUG DOWNLOAD_FETCH Fetching chunk f0696224492ef62f071c78fdd82ae2248809b8f88114c7f6e58534d27be91703
2022-06-11 03:44:44.055 INFO DOWNLOAD_PROGRESS Downloaded chunk 3 size 2936623, 2KB/s 1356 days 11:32:15 0.0%
2022-06-11 03:44:44.067 DEBUG DOWNLOAD_FETCH Fetching chunk f0355f0bdad8502a7ef9304038ac9266b6058aa31563dead55622e47c5daca43
2022-06-11 03:44:46.379 INFO DOWNLOAD_PROGRESS Downloaded chunk 4 size 2730122, 3KB/s 1056 days 12:49:28 0.0%
2022-06-11 03:44:46.390 DEBUG DOWNLOAD_FETCH Fetching chunk 3926b230516bcc5e7c70451fd89d7f3187623d65fb7cc6ca535627eb4a0c5913
2022-06-11 03:44:48.132 INFO DOWNLOAD_PROGRESS Downloaded chunk 5 size 1595011, 3KB/s 936 days 00:29:03 0.0%
2022-06-11 03:44:48.138 DEBUG DOWNLOAD_FETCH Fetching chunk d787bd11925f218a482c0cd5a78720057982019cbbfccaa2bcb4f77e1c11bbf0
2022-06-11 03:44:50.794 INFO DOWNLOAD_PROGRESS Downloaded chunk 6 size 4798704, 4KB/s 696 days 09:23:23 0.0%
2022-06-11 03:44:50.811 DEBUG DOWNLOAD_FETCH Fetching chunk 8abb55e8b7eebf4131adecd0cd6e4246bd3bb1eb0a802d286572d69d57c38927
2022-06-11 03:44:54.191 INFO DOWNLOAD_PROGRESS Downloaded chunk 7 size 5076144, 5KB/s 548 days 06:51:56 0.0%

As you can see, it starts at a few KB/s with a projected ETA of a few hundred days, before eventually converging on a steady ~1.75MB/s. Speaking of which, that is pretty slow, and all indications point to just a single thread being in use. I’m not sure whether adding -d dropped the thread count to 1 or I just didn’t pay attention before, but I am pretty sure it was faster previously, at about 6-7MB/s as I mentioned, which would be consistent with 4 threads.
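For what it’s worth, a quick back-of-the-envelope check of the numbers above lines up with that theory, assuming the reported speed is simply bytes downloaded divided by the time since RESTORE_START (that formula is my assumption, not something taken from the code):

```go
package main

import (
	"fmt"
	"time"
)

func main() {
	// Timestamps and sizes taken from the log above.
	start, _ := time.Parse("2006-01-02 15:04:05", "2022-06-11 02:31:31")
	firstChunk, _ := time.Parse("2006-01-02 15:04:05", "2022-06-11 03:44:38")
	elapsed := firstChunk.Sub(start).Seconds() // roughly 4,387s, almost all of it the metadata gap

	downloaded := 2_592_726.0 // bytes in chunk 1
	remaining := 250e9        // roughly 250GB still to restore

	speed := downloaded / elapsed        // about 590 B/s, matching the logged value
	etaDays := remaining / speed / 86400 // on the order of 5,000 days, same ballpark as "5021 days"

	fmt.Printf("speed = %.0f B/s, ETA = %.0f days\n", speed, etaDays)
}
```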

If /mnt/NAS1 is a networked drive, can you try restoring to a local disk just for testing?

-d should not affect the number of threads. From the log it looks like there was only one downloading thread. This is how you enter the options if you want to specify both the number of threads and debug logging: -d | -threads 4.

Unfortunately, I don’t have any local disks large enough to replicate that restore. /mnt/NAS1 is able to saturate a gigabit LAN on sustained transfers, though the story might be different for many small open/close transfers. I tried restoring a smaller file (~5GB) to a local SSD and also to /mnt/NAS1, and there is a difference in the gap time: 47 seconds versus 108 seconds. That seems roughly proportional to the large-file restore (scaling 47 seconds for ~5GB up to 250GB gives around 40 minutes), so you’d still expect about half an hour before the transfer starts even on a local SSD.

So, the question again: what is duplicacy doing during that time?