XFS corruption on restore, extended attributes

After a restore, when I run getfacl or a number of other operations (chmod, etc.) on one of the restored folders, it returns “Structure needs cleaning”.

After running getfacl, /var/log/messages shows:
XFS (vdb): Internal error xfs_acl_from_disk at line 40 of file fs/xfs/xfs_acl.c. Caller xfs_get_acl+0xea/0x190 [xfs]

Corruption detected. Unmount and run xfs_repair

I originally thought it might be issues with my volume structure so I removed all lvm and formatted the whole disk as xfs to no avail.

xfs_repair returns a ton of
Too many ACL entries, count 1611530356
entry contains illegal value in attribute named SGI_ACL_FILE or SGI_ACL_DEFAULT
would remove attribute entry 1 for inode 62277025927

xfs_repair seems to be able to fix things by truncating the acl and removing unsupported options unless I restore too much of the data so that it exceeds the repair buffer, in which case it puts tons of data in lost+found. For 4TB of data it put about 200GB of data in lost+found.

The exact count isn’t always the same but it’s always bonkers like that. Sometimes higher or lower depending on how much I’ve restored so far I think.

I originally thought I had done something wrong, but even if I restore just a small folder, I get these issues. I’ve completely reformatted the drives multiple times in different ways including using nothing but mkfs.xfs for the whole drive (/dev/sda). I am able to copy data to the drive from other locations just fine. It’s only when I use duplicacy that the issue occurs.

A sample restore command that I’ve used is
~/duplicacy -log restore -stats -storage local -threads 16 -ignore-owner -r 1125 -- "shares/Public/*" >> restore1125public.log

I have a couple of personal licenses, but since I have a bunch of scripts at this point and I rarely use a gui, I prefer to use the cli.

I think the main issue is something weird that duplicacy is doing with ACLs but I don’t really know. I’ve been working on this for a few days now and I’m at a loss. I’m currently dead in the water with a full system failure and need to get back up asap. I’ve had to turn down a couple of jobs because I don’t have the infrastructure.

Are you running 2.7.2? If not, this issue can be the result of not handling extended attributes with the system namespace, which was fixed by https://github.com/gilbertchen/duplicacy/commit/b392302c0680ca4de569630031c43bab21f51d82.

I am using 2.7.2

A few more notes.

I seem to be able to restore to windows and even copy the restored files to the linux machine just fine.

After my system crash I changed from CentOS 7 to 8, so I’m on an updated version of the xfs filesystem.

I don’t need extended attributes restored… I don’t suppose there’s an option to ignore them on restore is there?

I think I just confirmed that this issue does not occur in 2.7.1

Maybe restoring system/user/none should be options for attributes
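
To make that suggestion concrete, here is a rough sketch of what a namespace filter could look like. Everything in it is hypothetical: the `AttrMode` type, its three values, and `filterAttributes` are names I made up for illustration and are not part of duplicacy. I'm also assuming that "system" means "restore everything, including system.*", "user" means "only user.*", and "none" means "skip attributes entirely".

```go
package main

import (
	"fmt"
	"strings"
)

// AttrMode is a hypothetical restore setting; duplicacy has no such option.
type AttrMode string

const (
	AttrSystem AttrMode = "system" // assumption: restore all namespaces, including system.*
	AttrUser   AttrMode = "user"   // assumption: restore only user.* attributes
	AttrNone   AttrMode = "none"   // skip extended attributes entirely
)

// filterAttributes returns the subset of attrs that should be written
// back to the restored file under the given mode. The input map is not
// modified.
func filterAttributes(attrs map[string][]byte, mode AttrMode) (map[string][]byte, error) {
	kept := make(map[string][]byte)
	switch mode {
	case AttrNone:
		return kept, nil
	case AttrUser:
		for name, value := range attrs {
			if strings.HasPrefix(name, "user.") {
				kept[name] = value
			}
		}
		return kept, nil
	case AttrSystem:
		for name, value := range attrs {
			kept[name] = value
		}
		return kept, nil
	default:
		return nil, fmt.Errorf("unknown attribute mode %q", mode)
	}
}
```

With something like this in front of the code that writes attributes, a restore in "user" mode would never touch system.posix_acl_access or the other system.* entries, which is the part that seems to be going wrong for me.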

I’m going to restore some more and see how it goes

Ugh… now I think I’m hitting the issue from “Fixed a bug that caused a fresh restore of large files to fail without the -overwrite option”:

 ERROR DOWNLOAD_OVERWRITE File IMG_1399.MOV already exists.  Please specify the -overwrite option to overwrite

I don’t suppose you could release a patch to skip extended attributes…

I tried to download and build, but ended up with a module issue and haven’t looked into it yet

go get: installing executables with 'go get' in module mode is deprecated.
        Use 'go install pkg@version' instead.
        For more information, see https://golang.org/doc/go-get-install-deprecation
        or run 'go help get' or 'go help install'.

I was just going to return out of the set attribute function without doing work…

A restore corrupting the entire xfs fs is a pretty major bug.

To get around the module issue, set the GO111MODULE=off environment variable.

Then you may hit a swift bug. To fix it, check out an older version:

cd $GOPATH/src/github.com/ncw/swift
git checkout tags/v1.0.50

And then it should build.

Edit: my notes originally came from this thread: Building recent version fails with "Context.App.Writer undefined (type *cli.App has no field or method Writer)"

If writing system extended attributes can corrupt the file system, I think this is an xfs bug.

lol. I don’t disagree with that.

I’m not sure the restore is working correctly for me though considering xfs_repair is reporting that there are 1611530356 ACL entries for one file. That doesn’t seem right.

I commented out everything in the SetAttributesToFile function and rebuilt it and have restored 1TB of data without any apparent issues. I’m not exactly sure what’s going on, but xfs is the default fs for a lot of modern Linux-based systems and distros now. I’d consider 2.7.2 unstable for restores for now.

Duplicacy sets xattrs via syscalls using this pretty straightforward library https://github.com/gilbertchen/xattr/blob/master/syscall_linux.go

Even if duplicacy sends bad pointers or wrong sizes to these calls, the fact that important kernel data structures can be corrupted by calls from user space is a security bug in either the Linux kernel or the xfs driver, depending on where the validation occurs.

It may be worth finding a small reproducible use case and reporting it to your distribution maintainers.

I’ve thought about that as I’ve been struggling with this. File system “corruption” is probably too strong of a word. It only warns me when I try to access the files restored by duplicacy that they are corrupt and “the structure needs cleaning”. Upon cleaning, a fair amount of the data restored is fine, but a small fraction of it ends up in the lost+found completely obfuscated. EDIT: actually, a number of times I tried with a small restore of a sample folder, it didn’t even lose any data, but it’s still concerning.

The xfs system doesn’t even go offline. Since everything else besides the restored data still seems to work fine, I’d say my original description of duplicacy “corrupting” the file system was exaggerated due to my lack of understanding of what was happening at the time. My current perspective after spending more time on this is that duplicacy is failing to restore extended attributes properly (though I don’t really understand in what way, so I say that loosely).

I’m not really sure how you’d want to handle that. You could try listing the attributes again after setting them. I’m willing to bet that xattr will fail to read them since the fs refuses to return them and you’d be able to handle the error appropriately from there. When that error occurs and halts the restore, you could even print a suggested workaround of using a -no-attributes flag or something.

I’m not really sure if it has to do with the fact that I was using an older version (2.6.1) before the crash for my backups or what, but some simple changes should be able to prevent all of these woes from afflicting other customers.