Yeah, the cross-chunk parity was a concern of mine as well, though it’s not too different from the current case where a chunk can’t be pruned because it contains a few bytes of referenced data. Cross-chunk parity would basically just be more data that hangs around a bit longer before it’s pruned. It would increase the amount of space consumed, but that’s what parity does.
Thinking about this more, though, I have more questions about the parity being within the chunk. Sure, this protects against data loss within a chunk, but what if the corruption occurs in the headers? Would two lost bytes (one in the starting header and one in the ending header) mean the whole chunk is lost? How robustly does it try to reconstruct the header from the two copies? Duplicacy could try to mix and match the two copies of the header until the checksum is satisfied. Or does there need to be a third header to build a consensus?
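To make the consensus idea concrete, here's a rough sketch of what I have in mind with a third copy: per-byte majority voting across three header copies, validated against a stored checksum. The function, header layout, and use of CRC32 are all just mine for illustration, not anything Duplicacy actually does:

```go
package main

import (
	"fmt"
	"hash/crc32"
)

// reconstructHeader rebuilds a fixed-size header from three possibly
// corrupted copies by per-byte majority vote, then checks the result
// against a checksum stored elsewhere. Purely a sketch of the idea.
func reconstructHeader(copies [3][]byte, wantCRC uint32) ([]byte, bool) {
	out := make([]byte, len(copies[0]))
	for i := range out {
		a, b, c := copies[0][i], copies[1][i], copies[2][i]
		if a == b || a == c {
			out[i] = a // at least two copies agree on this byte
		} else {
			out[i] = b // a is the odd one out (or all three differ)
		}
	}
	return out, crc32.ChecksumIEEE(out) == wantCRC
}

func main() {
	good := []byte("chunk-header: id=abc123 size=4096")
	want := crc32.ChecksumIEEE(good)

	// Two copies each lose one byte at different offsets; the third is intact.
	c1 := append([]byte(nil), good...)
	c2 := append([]byte(nil), good...)
	c3 := append([]byte(nil), good...)
	c1[3] ^= 0xFF
	c2[10] ^= 0xFF

	header, ok := reconstructHeader([3][]byte{c1, c2, c3}, want)
	fmt.Println(ok, string(header)) // true chunk-header: id=abc123 size=4096
}
```

With only two copies you'd have to brute-force which copy is right at each disagreeing byte and re-check the checksum each time, which gets expensive fast; that's really the heart of my question.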
I only ask because I recently started using Reed-Solomon on a project and specifically made sure the parity was calculated over the header data as well, so the header could be reconstructed in the event of a lost packet. Granted, that's a different use case than here, but it raised the concern in my mind. In my case I also kept the number of data and parity shards outside of the packets, though that is also less flexible.
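For what it's worth, here's roughly the shape of what I did, sketched with the klauspost/reedsolomon package (that library choice, the function names, and the 10+3 shard counts are just illustrative): the header is prepended to the payload before splitting, so the parity shards cover the header bytes too.

```go
package main

import (
	"bytes"
	"fmt"
	"log"

	"github.com/klauspost/reedsolomon"
)

// Shard counts are kept outside the packets themselves; the receiver has to
// know them up front, which is the "less flexible" part I mentioned.
const (
	dataShards   = 10
	parityShards = 3
)

// encode prepends the header to the payload and splits the combined buffer
// into shards, so the parity shards protect the header bytes as well.
func encode(header, payload []byte) ([][]byte, error) {
	enc, err := reedsolomon.New(dataShards, parityShards)
	if err != nil {
		return nil, err
	}
	combined := append(append([]byte{}, header...), payload...)
	shards, err := enc.Split(combined)
	if err != nil {
		return nil, err
	}
	if err := enc.Encode(shards); err != nil {
		return nil, err
	}
	return shards, nil
}

func main() {
	header := []byte("packet-header: seq=42 len=1024")
	payload := bytes.Repeat([]byte("data"), 256)

	shards, err := encode(header, payload)
	if err != nil {
		log.Fatal(err)
	}

	// Drop two shards, including the first one, which holds the header bytes.
	shards[0], shards[4] = nil, nil

	dec, err := reedsolomon.New(dataShards, parityShards)
	if err != nil {
		log.Fatal(err)
	}
	if err := dec.Reconstruct(shards); err != nil {
		log.Fatal(err)
	}
	ok, _ := dec.Verify(shards)
	fmt.Println("reconstructed, parity verifies:", ok)
}
```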
Despite any critiques I may bring up, I really like the idea of being able to include parity to provide extra protection in places where the storage or transport might not always be reliable.