Copy stalling for hours with minimal chunks on MinIO backend

I’m experiencing an issue where the duplicacy copy operation in the Web GUI stalls for hours before copying any data, even when only a few chunks need to be transferred.

The command spends over 90 minutes printing hundreds of lines like this before actually copying a handful of chunks:

2025-05-03 04:10:34.747 INFO SNAPSHOT_EXIST Snapshot XXXX at revision 3654 already exists at the destination storage
...
2025-05-03 04:10:36.834 INFO SNAPSHOT_EXIST Snapshot XXXX at revision 4005 already exists at the destination storage
2025-05-03 05:58:55.430 INFO SNAPSHOT_COPY Chunks to copy: 5, to skip: 13606, total: 13611
...
2025-05-03 05:58:58.107 INFO SNAPSHOT_COPY Copied snapshot XXXX at revision 4010

I understand that Duplicacy checks for existing snapshots and chunks, but this pre-copy phase seems excessively long, given the minimal amount of data to transfer. If I perform the same backup using the duplicacy backup operation instead, it takes about 30 seconds.

Is there a way to optimize the copy phase in this case? Any insights or tips are appreciated.

Setup:

  • Duplicacy is run on a remote machine that performs offsite backups over the internet to a MinIO backend (running in Docker).
  • The MinIO storage uses erasure coding (5 data, 2 parity shards).
  • Average chunk size is 4 MB (min: 1 MB, max: 16 MB).
  • Chunks are encrypted.

It appears that enumerating files on either the source or the destination takes a lot of time and has high latency. Check whether it is CPU-bound or disk-subsystem IOPS-bound. What is the filesystem there? How is metadata handled?
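Something along these lines would answer that while the copy is running (iostat needs the sysstat package; the MinIO data path is a placeholder):

top                          # is duplicacy or minio pegging a core?
iostat -x 2                  # per-device utilization, await, IOPS
df -T /path/to/minio/data    # confirms the filesystem type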

I’d suspect MinIO to be the culprit: perhaps there is not enough RAM on the host machine to fit all the metadata, and every object lookup results in IO hitting the disk.
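For example, something like this would show how much memory the container actually gets and how much is left for the page cache (the container name minio is just a guess):

docker stats --no-stream minio   # container CPU/memory usage (container name is a guess)
free -h                          # the buff/cache column is what's left for metadata caching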

To figure out what to copy, Duplicacy needs to find out what has already been copied. Those are all metadata fetches and should be very fast on a properly configured system.
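You can time those lookups directly with the MinIO client, bypassing Duplicacy entirely; something like this (the mc alias myminio and the bucket name are placeholders):

mc alias set myminio https://minio.example.com:9000 ACCESS_KEY SECRET_KEY
time mc ls --recursive myminio/duplicacy/chunks/ | wc -l    # pure metadata listing, no data transfer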

If you control that host and it’s resource-constrained, I would not use MinIO at all and would perhaps switch to SFTP instead. Removing MinIO will free up RAM for metadata caching and improve performance. But this is all speculation until you pinpoint the bottleneck.
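For reference, a copy-compatible SFTP storage could be added to the existing repository roughly like this (storage name, snapshot ID, host, and path are placeholders; check duplicacy add -help for the exact flags):

duplicacy add -e -copy default offsite-sftp my-snapshot-id sftp://user@vps.example.com//srv/duplicacy
duplicacy copy -from default -to offsite-sftp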

In MinIO or in Duplicacy? (It does not matter for this; I’m just curious.)

It’s not CPU-bound as far as I can tell. MinIO uses around 5%, and Duplicacy at its peak uses 3-4%. The filesystem is XFS.

MinIO uses less than 1 GB, and the server has 4 GB of RAM in total.

The reason I want to try MinIO is to protect my backups from tampering (e.g., ransomware could make my backups inaccessible) by enabling bucket versioning with lifecycle management. As far as I can tell, this isn’t possible with SFTP in the same way.
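The setup I have in mind is roughly this (the alias and bucket names are placeholders; old object versions would then be expired by a lifecycle rule via mc ilm):

mc version enable myminio/duplicacy   # keep previous versions when objects are overwritten/deleted
mc version info myminio/duplicacy     # confirm versioning is on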

I have root access to the VPS, and the physical hardware node is controlled by a hosting provider. If there is an issue with the hosting environment, I can probably contact support to tweak some things.

Any suggestions on how I can troubleshoot further? I’m using an out-of-the-box configuration of MinIO, with only users, groups, and a bucket set up. I presume some adjustments might be necessary.

So far, I’ve run duplicacy benchmark:

Storage set to minios://URL@TO/bucket
Generating 256.00M byte random data in memory
Writing random data to local disk
Wrote 256.00M bytes in 0.37s: 692.38M/s
Reading the random data from local disk
Read 256.00M bytes in 0.07s: 3923.08M/s
Split 256.00M bytes into 49 chunks without compression/encryption in 3.29s: 77.86M/s
Split 256.00M bytes into 49 chunks with compression but without encryption in 4.42s: 57.92M/s
Split 256.00M bytes into 49 chunks with compression and encryption in 9.59s: 26.71M/s
Generating 64 chunks
Uploaded 256.00M bytes in 92.48s: 2.77M/s
Downloaded 256.00M bytes in 10.21s: 25.07M/s
Deleted 64 temporary files from the storage

And fio:

$ fio --name=randread --filename=/media/vdisk1/testfile   --direct=1 --rw=randread --bs=4k --size=1G --numjobs=1   --iodepth=64 --runtime=60 --time_based --ioengine=libaio   --group_reporting
randread: (g=0): rw=randread, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=64
fio-3.36
Starting 1 process
Jobs: 1 (f=1): [r(1)][100.0%][r=211MiB/s][r=53.9k IOPS][eta 00m:00s]
randread: (groupid=0, jobs=1): err= 0: pid=4143540: Sun May  4 05:10:33 2025
  read: IOPS=43.4k, BW=170MiB/s (178MB/s)(9.94GiB/60001msec)
    slat (usec): min=4, max=3402, avg= 5.28, stdev= 9.78
    clat (usec): min=2, max=235725, avg=1467.49, stdev=5454.80
     lat (usec): min=82, max=235730, avg=1472.77, stdev=5454.92
    clat percentiles (usec):
     |  1.00th=[    94],  5.00th=[   114], 10.00th=[   139], 20.00th=[   198],
     | 30.00th=[   273], 40.00th=[   351], 50.00th=[   437], 60.00th=[   529],
     | 70.00th=[   627], 80.00th=[   766], 90.00th=[  1483], 95.00th=[  6521],
     | 99.00th=[ 24249], 99.50th=[ 33162], 99.90th=[ 67634], 99.95th=[ 94897],
     | 99.99th=[149947]
   bw (  KiB/s): min=14472, max=296040, per=99.73%, avg=173255.76, stdev=70622.68, samples=119
   iops        : min= 3618, max=74010, avg=43313.91, stdev=17655.66, samples=119
  lat (usec)   : 4=0.01%, 10=0.01%, 20=0.01%, 50=0.01%, 100=2.30%
  lat (usec)   : 250=24.98%, 500=29.89%, 750=22.13%, 1000=6.98%
  lat (msec)   : 2=5.26%, 4=2.14%, 10=2.95%, 20=1.98%, 50=1.19%
  lat (msec)   : 100=0.15%, 250=0.04%
  cpu          : usr=12.52%, sys=25.72%, ctx=452559, majf=0, minf=85
  IO depths    : 1=0.1%, 2=0.1%, 4=0.1%, 8=0.1%, 16=0.1%, 32=0.1%, >=64=100.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.1%, >=64=0.0%
     issued rwts: total=2605870,0,0,0 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=64

Run status group 0 (all jobs):
   READ: bw=170MiB/s (178MB/s), 170MiB/s-170MiB/s (178MB/s-178MB/s), io=9.94GiB (10.7GB), run=60001-60001msec

Disk stats (read/write):
  vdb: ios=2604988/67, sectors=20840232/880, merge=0/7, ticks=3677663/1146, in_queue=3679798, util=62.69%

Ideally, I would let MinIO handle erasure coding, but since it requires at least 3 nodes to enable this feature, and I only have one node, I’ve gone with Duplicacy’s own erasure coding for this setup.
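For reference, the Duplicacy-side erasure coding is set when the storage is initialized, along these lines (the snapshot ID and URL are placeholders):

duplicacy init -e -erasure-coding 5:2 my-snapshot-id minios://region@minio.example.com/duplicacy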

Is the source storage that it is copying from also located on the same machine?

On the source or destination or both?

Oh, so no more than 3 GB is available to cache metadata plus all the OS stuff. That could be an issue. When Duplicacy asks MinIO “do you have this object?”, MinIO might need to go ask the disk subsystem (or consult its database, which could have been paged out to disk) for the pages. That could take a long time.
You can check disk activity with iotop.
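Something like this keeps the output limited to processes that are actually doing IO:

sudo iotop -o -d 5    # -o: only show processes doing IO, -d 5: refresh every 5 seconds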

Since this is a VPS, it probably has storage mounted over NFS or iSCSI, so paging things in may have non-trivial overhead.

You can back up to a filesystem that supports snapshots, for example Btrfs or ZFS. Then you would enable, e.g., daily snapshots with a limited lifespan. Only root can delete snapshots; that’s a common approach to protecting against ransomware.
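A minimal sketch, assuming the backup storage is a Btrfs subvolume at /srv/backups (paths are placeholders):

btrfs subvolume snapshot -r /srv/backups /srv/snapshots/backups-$(date +%F)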

Introducing MinIO just to get snapshot immutability adds quite a fat layer of additional dependencies and consumes a lot of resources, reducing the amount of free memory available for the OS to cache filesystem data. Duplicacy does not support object lock anyway, so technically it does not provide protection as bulletproof as filesystem snapshots would.

I strongly suspect disk IO to be the bottleneck. Run iotop on both the source and the destination during that phase of the copy and see if anything approaches its limit.

The duplicacy benchmark won’t catch it; it operates with a small amount of data that can be cached in RAM.

For example, the 3923.08M/s “read from local disk” result is obviously a read from the RAM cache, not from the actual disk.

The 2.77M/s upload, on the other hand, does indeed seem too slow.

And yet (assuming you have a more or less symmetric internet connection between the host that runs Duplicacy and MinIO), downloading the data that was just uploaded was about 8x faster, which also hints the data was still cached in RAM on the MinIO side, precisely because it had just been uploaded.

I’m not familiar with fio, so I’d need to look into how it works to interpret its output. But the key suspects are:

  1. slow access to data that has been evicted from the cache to disk, and
  2. a cache too small to fit even the metadata for the Duplicacy chunks.
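To make the duplicacy benchmark itself less cache-friendly, you could push more data through it than fits in RAM; something like this (flag names from memory and the storage name is a placeholder, so verify with duplicacy benchmark -help):

duplicacy benchmark -file-size 2048 -chunk-count 256 -storage offsite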

Thank you for your input, Saspus.

I’ve done some additional troubleshooting, and I’ve concluded that the performance issues are likely related to MinIO.

One of the tests I did was with rsync. To make the test somewhat realistic, I followed these steps:

  1. Copied all data to another storage location (as a precaution in case I need to reformat the drive to Btrfs or test other setups).

  2. Performed a sync to reduce the amount of data needing transfer during the test.

  3. Ran the following command to simulate a full sync: time rsync -avhu --delete --progress /path/to/source/ user@destination-server:~/minio-storage-backup/

Result:

sending incremental file list

sent 74.09M bytes  received 522.78K bytes  164.16K bytes/sec
total size is 1.41T  speedup is 18,831.02

real    7m34.449s
user    0m4.432s
sys     0m6.320s

While not amazing, it’s clearly much faster than the ~90 minutes it takes with Duplicacy + MinIO. Of course, this isn’t an apples-to-apples comparison — I still need to run duplicacy copy and check using an SFTP backend to get a more accurate picture.

Given these findings, I’m now seriously considering switching to Btrfs with snapshots for backup immutability and ransomware protection, instead of trying to optimize MinIO on this VPS.

That said — is there a recommended ready-made solution for managing Btrfs snapshots in this kind of use case, or is this something I’d need to build from scratch?

From what I can tell, the following components would be needed from day one:

  1. A system to take periodic snapshots (daily or multiple times per day).
  2. Health checks and alerts (e.g. snapshot failed, disk nearly full), ideally via webhook or Slack.
  3. Automatic pruning of old snapshots (e.g. keep for 14 days).
  4. The ability to temporarily mount a snapshot for inspection.
  5. The ability to restore storage to a specific snapshot.
  6. Manual deletion of snapshots — useful in emergency scenarios (e.g., resolving full disk space).
  7. Easy to administer on a headless server — CLI-based tooling is strongly preferred.

Any suggestions or best practices on how to set this up efficiently?


Software storage appliances usually have that functionality built in, but it shouldn’t be too hard to implement yourself.

You would write a script that calls btrfs subvolume snapshot (or something along those lines; I don’t remember the details) periodically to capture a new snapshot and then enumerates and deletes old snapshots, run from cron or a systemd service.

A periodic btrfs scrub should take care of data consistency. Email and webhook notifications can be sent via curl.
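A minimal sketch of such a script, assuming the storage is a Btrfs subvolume at /srv/backups and snapshots go to /srv/snapshots (paths, retention, and the webhook URL are all placeholders), run daily from root’s crontab or a systemd timer:

#!/bin/bash
# Rough sketch: daily read-only snapshot, 14-day pruning, webhook on failure.
set -u

SRC=/srv/backups                         # placeholder: subvolume holding the duplicacy storage
DST=/srv/snapshots                       # placeholder: where read-only snapshots live
KEEP_DAYS=14
HOOK="https://hooks.example.com/notify"  # placeholder webhook; adjust the payload to your service

notify() { curl -fsS -X POST -d "text=$1" "$HOOK" >/dev/null || true; }

# 1. Take today's read-only snapshot.
if ! btrfs subvolume snapshot -r "$SRC" "$DST/$(date +%Y-%m-%d)"; then
    notify "btrfs snapshot of $SRC failed"
    exit 1
fi

# 2. Delete snapshots older than KEEP_DAYS (names are ISO dates, so string comparison works).
cutoff=$(date -d "-${KEEP_DAYS} days" +%Y-%m-%d)
for snap in "$DST"/*; do
    [ -d "$snap" ] || continue
    name=$(basename "$snap")
    if [[ "$name" < "$cutoff" ]]; then
        btrfs subvolume delete "$snap" || notify "failed to delete old snapshot $snap"
    fi
done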

I think there is a way to have all snapshots mounted automatically into a hidden directory.

see btrfs restore

btrfs subvolume snapshot -r ...
man btrfs 

🙂

I may be wrong about the details above, so read the manual; I’m much more familiar with ZFS.