Gentoo Forums
XFS corruption
big_gie
Apprentice

Joined: 31 Aug 2004
Posts: 158

Posted: Tue Apr 05, 2011 8:08 pm    Post subject: XFS corruption

Hi all,

I had a system failure on a machine running hardware RAID1 for the OS and RAID60 for our simulation data.

Booting from SystemRescueCd, I was able to recover the OS (even though it no longer boots; GRUB doesn't even show up).

Now I'm trying to recover the RAID60 data, or at least part of it. First, I tried mounting the single XFS partition read-only but got a strange error. The error showed up in dmesg, and it looked like the kernel had crashed. To keep a kernel problem from further affecting the data, I rebooted the live CD before checking the filesystem. But I can't check it:
Quote:
# xfs_check /dev/sdb1
ERROR: The filesystem has valuable metadata changes in a log which needs to
be replayed. Mount the filesystem to replay the log, and unmount it before
re-running xfs_check. If you are unable to mount the filesystem, then use
the xfs_repair -L option to destroy the log and attempt a repair.
Note that destroying the log may cause corruption -- please attempt a mount
of the filesystem before doing this.


Trying to mount the filesystem gave (probably) the same error as before; this time I saved it. Here it is:
Quote:

# dmesg > dmesg1.txt
# mount -o ro /dev/sdb1 raid/
mount: Structure needs cleaning
# dmesg > dmesg2.txt
# diff dmesg1.txt dmesg2.txt
1098a1099,1128
> XFS mounting filesystem sdb1
> Starting XFS recovery on filesystem: sdb1 (logdev: internal)
> Filesystem "sdb1": XFS internal error xlog_valid_rec_header(1) at line 3428 of file fs/xfs/xfs_log_recover.c. Caller 0xffffffff812e39cc
>
> Pid: 4125, comm: mount Not tainted 2.6.35-std164-amd64 #2
> Call Trace:
> [<ffffffff812d0772>] xfs_error_report+0x3c/0x3e
> [<ffffffff812e39cc>] ? xlog_do_recovery_pass+0x1b5/0x5ee
> [<ffffffff812e0acf>] xlog_valid_rec_header+0xcb/0xd2
> [<ffffffff812e39cc>] xlog_do_recovery_pass+0x1b5/0x5ee
> [<ffffffff812e3e41>] xlog_do_log_recovery+0x3c/0x75
> [<ffffffff812e3e8d>] xlog_do_recover+0x13/0xd8
> [<ffffffff812e3fce>] xlog_recover+0x7c/0x8a
> [<ffffffff812de5d9>] xfs_log_mount+0xd7/0x143
> [<ffffffff812e68ec>] xfs_mountfs+0x310/0x61c
> [<ffffffff812eefd1>] ? kmem_zalloc+0x11/0x2c
> [<ffffffff812e722a>] ? xfs_mru_cache_create+0x117/0x147
> [<ffffffff812f9b6a>] xfs_fs_fill_super+0x1f8/0x372
> [<ffffffff811052b8>] get_sb_bdev+0x137/0x19a
> [<ffffffff812f9972>] ? xfs_fs_fill_super+0x0/0x372
> [<ffffffff812f7c00>] xfs_fs_get_sb+0x13/0x15
> [<ffffffff81104969>] vfs_kern_mount+0xb8/0x1a2
> [<ffffffff81104ab1>] do_kern_mount+0x48/0xe8
> [<ffffffff81119fe8>] do_mount+0x73c/0x7b2
> [<ffffffff810d1cff>] ? copy_from_user+0x3c/0x44
> [<ffffffff810d24e6>] ? strndup_user+0x58/0x82
> [<ffffffff8113876e>] compat_sys_mount+0x262/0x29c
> [<ffffffff81033fd3>] ia32_sysret+0x0/0x5
> XFS: log mount/recovery failed: error 117
> XFS: log mount failed


What's wrong? Is there a bug in the kernel, or is it just its own way of telling me the filesystem is really broken?

Could mounting with "-o ro,norecovery" cause more trouble? There are a couple of files I really need to restore, but I don't want to break things even more for so little.

Thanks for your help!
madchaz
l33t

Joined: 01 Jul 2003
Posts: 995
Location: Quebec, Canada

Posted: Tue Apr 05, 2011 8:19 pm

I would suggest you make a copy of your disk to offline media first (use dd).
That way, you can always go back.

However, as long as you are read-only, it "should" be ok, if it works.
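
On the read-only route: XFS has a "norecovery" mount option that skips log replay entirely, and it is only accepted together with "ro". Below is a minimal sketch of the salvage, assuming the device and mount point from the first post. The "results" directory and the "/mnt/usb" destination are made-up placeholders, and run/salvage are just illustrative helper names. Because the log is not replayed, the filesystem may look inconsistent and some files may be unreachable or stale.

```shell
#!/bin/sh
# Sketch only: the device, mount point and paths passed in are placeholders,
# not values verified against the system in this thread.

# Print each command; execute it only when DRY_RUN=0 is set explicitly.
run() {
    echo "+ $*"
    if [ "${DRY_RUN:-1}" = "0" ]; then
        "$@"
    fi
}

# Mount an XFS device read-only without replaying its log, copy one
# directory off it, then unmount again.
salvage() {
    dev=$1; mnt=$2; src=$3; dest=$4
    # norecovery skips log replay; XFS refuses it unless ro is given too.
    run mount -t xfs -o ro,norecovery "$dev" "$mnt" || return 1
    run cp -a "$mnt/$src" "$dest"
    run umount "$mnt"
}
```

By default this only prints the commands, e.g. "salvage /dev/sdb1 raid results /mnt/usb"; prefixing the call with DRY_RUN=0 would actually run them.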

You might want to look at the mount.xfs options to force the journal replay as well. (Again, do a backup first)

And when this is all over, I strongly suggest you schedule daily backups ;-)
_________________
Someone asked me once if I suffered from mental illness. I told him I enjoyed every second of it.
big_gie
Apprentice

Joined: 31 Aug 2004
Posts: 158

Posted: Tue Apr 05, 2011 8:22 pm

Thanks for the suggestion.
This is exactly what I've done for the OS partitions. Unfortunately, I just can't do this here, at least not yet. The XFS partition is... 40TB.
madchaz
l33t

Joined: 01 Jul 2003
Posts: 995
Location: Quebec, Canada

Posted: Tue Apr 05, 2011 8:25 pm

Yikes. And no backup? You like living dangerously.
big_gie
Apprentice

Joined: 31 Aug 2004
Posts: 158

Posted: Tue Apr 05, 2011 8:31 pm

No, no backup...
We hoped we would be fast enough at changing bad drives so the RAID could be rebuilt. But we suspect the controller is bad... Around 75% of the drives failed within 20 minutes, which crashed the OS and made things a lot more complicated than they should have been...
Backing up 40TB is quite hard. We'll probably explore the different possibilities after this ;)

For now, it's only a couple hundred megs that I need to salvage. Everything else is luxury.
madchaz
l33t

Joined: 01 Jul 2003
Posts: 995
Location: Quebec, Canada

Posted: Thu Apr 07, 2011 12:50 pm

I suggest you take a good look at the various differential backup solutions ;-)
trippels
Tux's lil' helper

Joined: 24 Nov 2010
Posts: 137
Location: Berlin

Posted: Thu Apr 07, 2011 7:25 pm

In your case I would try the "-L" option of xfs_repair.
You may lose a few seconds of data from shortly before the system failure,
because it will zero the file system log. But most of the data should
still be OK.
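
That sequence could be sketched roughly as follows: a no-modify pass ("-n") first to see what xfs_repair would report, and "-L" only as the last resort once mounting has failed, as the xfs_check message in the first post warns. The run and xfs_step helpers are illustrative names, not part of xfsprogs, and the device name is the one from the thread. How "-n" behaves with a dirty log may depend on the xfsprogs version.

```shell
#!/bin/sh
# Sketch only: helper names are hypothetical; the device is a placeholder.

# Print each command; execute it only when DRY_RUN=0 is set explicitly.
run() {
    echo "+ $*"
    if [ "${DRY_RUN:-1}" = "0" ]; then
        "$@"
    fi
}

# xfs_step check DEVICE     no-modify scan, writes nothing to the device
# xfs_step zero-log DEVICE  zero the log, then repair (destructive)
xfs_step() {
    case $1 in
        check)    run xfs_repair -n "$2" ;;
        zero-log) run xfs_repair -L "$2" ;;
        *)        echo "usage: xfs_step check|zero-log DEVICE" >&2
                  return 2 ;;
    esac
}
```

Calling "xfs_step check /dev/sdb1" and then "xfs_step zero-log /dev/sdb1" only prints the two commands unless DRY_RUN=0 is set; the filesystem must be unmounted before either one is run for real.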

You could also contact the friendly XFS developers on their mailing list
and describe your problem there:
linux-xfs@oss.sgi.com