Jimini l33t
Joined: 31 Oct 2006 Posts: 605 Location: Germany
Posted: Fri Jun 14, 2019 11:50 am Post subject: [solved] e2fsck skips block checks and fails |
Hey there,
I am currently trying to fix an ext4 FS using e2fsck. The FS is on a 24 TB RAID6, which is currently missing one disk. Unfortunately, scanning the FS takes days, fills up RAM and swap space, and gets killed in the end. I also get a huge number of lines like the following:
German original: "Block %$b von Inode %$i steht in Konflikt mit kritischen Metadaten, Blockprüfungen werden übersprungen."
Translated: "Inode %$i block %$b conflicts with critical metadata, skipping block checks."
What else can I try to scan the FS?
I use e2fsprogs-1.44.5. The system has 9.43 GB of RAM and over 100 GB of swap space (I added two swap partitions on SSDs just for this scan).
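One thing I am considering, to keep e2fsck from eating all RAM, is pointing it at on-disk scratch files via e2fsck.conf. This is only a sketch based on e2fsck.conf(5) - it assumes my e2fsprogs build has scratch_files (tdb) support and that the chosen directory has plenty of free space:
Code: | # sketch: let e2fsck spill its tables to disk instead of RAM
mkdir -p /var/cache/e2fsck
cat >> /etc/e2fsck.conf <<'EOF'
[scratch_files]
directory = /var/cache/e2fsck
EOF
e2fsck -f /dev/mapper/share |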
Code: | share ~ # tune2fs -l /dev/mapper/share
tune2fs 1.44.5 (15-Dec-2018)
Filesystem volume name: <none>
Last mounted on: <not available>
Filesystem UUID: c5f0559d-e3bd-473f-abc0-7c42b3115897
Filesystem magic number: 0xEF53
Filesystem revision #: 1 (dynamic)
Filesystem features: ext_attr dir_index filetype extent 64bit flex_bg sparse_super huge_file uninit_bg dir_nlink extra_isize
Filesystem flags: signed_directory_hash
Default mount options: user_xattr acl
Filesystem state: not clean
Errors behavior: Continue
Filesystem OS type: Linux
Inode count: 366268416
Block count: 5860269056
Reserved block count: 0
Free blocks: 5836914127
Free inodes: 366268405
First block: 0
Block size: 4096
Fragment size: 4096
Group descriptor size: 64
Blocks per group: 32768
Fragments per group: 32768
Inodes per group: 2048
Inode blocks per group: 128
RAID stride: 128
RAID stripe width: 1024
Flex block group size: 16
Filesystem created: Sat Mar 17 14:36:16 2018
Last mount time: n/a
Last write time: Fri Jun 7 10:11:34 2019
Mount count: 0
Maximum mount count: -1
Last checked: Sat Mar 17 14:36:16 2018
Check interval: 0 (<none>)
Lifetime writes: 457 MB
Reserved blocks uid: 0 (user root)
Reserved blocks gid: 0 (group root)
First inode: 11
Inode size: 256
Required extra isize: 28
Desired extra isize: 28
Default directory hash: half_md4
Directory Hash Seed: 4c37872e-3207-4ff4-8939-a428feaeb49f
Journal backup: inode blocks |
Kind regards,
Jimini _________________ "The most merciful thing in the world, I think, is the inability of the human mind to correlate all its contents." (H.P. Lovecraft: The Call of Cthulhu)
Last edited by Jimini on Sat Jun 22, 2019 8:20 am; edited 3 times in total |
389292 Guru
Joined: 26 Mar 2019 Posts: 504
Posted: Fri Jun 14, 2019 1:45 pm Post subject: |
As I was told once, you should never stop fsck partway, at least not if the partition was not mounted read-only during the check. fsck is generally considered a risky operation that should not be taken lightly. What exactly happened to your FS? |
NeddySeagoon Administrator
Joined: 05 Jul 2003 Posts: 54815 Location: 56N 3W
Posted: Fri Jun 14, 2019 6:17 pm Post subject: |
Jimini,
Don't run fsck unless you have a backup or an image of the filesystem. fsck guesses what should be on the filesystem and often makes a bad situation worse.
It works by making the filesystem metadata self-consistent, but says nothing about any user data on the filesystem.
It's one of the last things to try.
Tell us what happened to your RAID6 and how it came to be down a drive.
Are the underlying drives OK?
Can you post the underlying SMART data for all the drives in the raid set?
Code: | smartctl -a /dev/sda |
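Something along these lines will collect it for all members in one go; the device letters are only an example, adjust them to your raid set:
Code: | # example only - adjust the device list to your actual raid members
for d in /dev/sd[a-j]; do
    echo "===== $d ====="
    smartctl -a "$d"
done > smart_before.txt |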
I assume the raid set assembles, but does it?
What does Code: | cat /proc/mdstat | show?
What about Code: | mdadm -E /dev/[block_device] | for each member of the raid set?
Have you tried mounting the filesystem read-only, using an alternate superblock?
Do not expect to do in-place data recovery. _________________ Regards,
NeddySeagoon
Computer users fall into two groups:-
those that do backups
those that have never had a hard drive fail. |
Jimini l33t
Joined: 31 Oct 2006 Posts: 605 Location: Germany
Posted: Mon Jun 17, 2019 5:41 am Post subject: |
etnull & NeddySeagoon, thank you for your replies.
First of all: of course I have a backup :)
It just takes a looooooong time to copy everything back over gigabit Ethernet, and I'd like to dig a bit deeper into the problem first.
The disks are connected to two Dell PERC H200 controller cards (they are reflashed, since I wanted to use SW RAID):
01:00.0 Serial Attached SCSI controller: LSI Logic / Symbios Logic SAS2008 PCI-Express Fusion-MPT SAS-2 [Falcon] (rev 03)
02:00.0 Serial Attached SCSI controller: LSI Logic / Symbios Logic SAS2008 PCI-Express Fusion-MPT SAS-2 [Falcon] (rev 03)
These disks are assembled into /dev/md2. This array contains the LUKS container named "share".
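For reference, the storage stack is brought up roughly like this - a sketch from memory, assuming the array is defined in mdadm.conf:
Code: | mdadm --assemble /dev/md2            # RAID6 from the ten member disks
cryptsetup open /dev/md2 share       # unlock LUKS -> /dev/mapper/share (dm-1 in the kernel logs)
mount /dev/mapper/share /home/share  # the ext4 filesystem in question |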
One of the disks in this RAID6 got kicked out:
Quote: | May 19 01:13:47 backup kernel: mpt2sas_cm1: log_info(0x31110d00): originator(PL), code(0x11), sub_code(0x0d00)
May 19 01:13:47 backup kernel: mpt2sas_cm1: log_info(0x31110d00): originator(PL), code(0x11), sub_code(0x0d00)
May 19 01:13:47 backup kernel: mpt2sas_cm1: log_info(0x31110d00): originator(PL), code(0x11), sub_code(0x0d00)
May 19 01:13:47 backup kernel: mpt2sas_cm1: log_info(0x31110d00): originator(PL), code(0x11), sub_code(0x0d00)
May 19 01:13:47 backup kernel: sd 1:0:4:0: [sdh] tag#0 UNKNOWN(0x2003) Result: hostbyte=0x0b driverbyte=0x00
May 19 01:13:47 backup kernel: sd 1:0:4:0: [sdh] tag#0 CDB: opcode=0x88 88 00 00 00 00 00 07 fd 88 00 00 00 00 28 00 00
May 19 01:13:47 backup kernel: mpt2sas_cm1: log_info(0x31110d00): originator(PL), code(0x11), sub_code(0x0d00)
May 19 01:13:47 backup kernel: print_req_error: I/O error, dev sdh, sector 134055936
[...]
May 19 01:13:54 backup kernel: print_req_error: I/O error, dev sdh, sector 16
May 19 01:13:54 backup kernel: md: super_written gets error=10
May 19 01:13:54 backup kernel: md/raid:md2: Disk failure on sdh, disabling device.
May 19 01:13:54 backup kernel: md/raid:md2: Operation continuing on 9 devices.
May 19 01:13:54 share mdadm[2912]: Fail event detected on md device /dev/md2, component device /dev/sdh |
(The system itself is called "share"; it runs an LXC container with a system called "backup" on it.)
Since I was not at home, I could not replace the disk (but that's why I use RAID6 instead of RAID5). A few days later, logging in via SSH or locally was no longer possible.
After over two weeks of debugging and trying to get the problem fixed, I needed to use the system again, so I used magic SysRq "S" and "B" to write all cached data to the disks and reboot the system.
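(For anyone wanting to reproduce this: the key combination is roughly equivalent to writing to /proc/sysrq-trigger, assuming kernel.sysrq permits those functions:)
Code: | echo s > /proc/sysrq-trigger   # "S": sync all mounted filesystems
echo b > /proc/sysrq-trigger   # "B": reboot immediately, without unmounting |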
Afterwards, I noticed many errors in my syslog:
Quote: | /var/log/messages.2:Jun 7 07:34:36 share kernel: EXT4-fs error (device dm-1): ext4_lookup:1578: inode #11: comm du: deleted inode referenced: 3212912
/var/log/messages.2:Jun 7 07:34:36 share kernel: EXT4-fs error (device dm-1): ext4_lookup:1578: inode #11: comm du: deleted inode referenced: 3212955
/var/log/messages.2:Jun 7 07:34:36 share kernel: EXT4-fs error (device dm-1): ext4_lookup:1578: inode #11: comm du: deleted inode referenced: 3212956
/var/log/messages.2:Jun 7 07:34:36 share kernel: EXT4-fs error (device dm-1): ext4_lookup:1578: inode #11: comm du: deleted inode referenced: 3212957
/var/log/messages.2:Jun 7 07:34:36 share kernel: EXT4-fs error (device dm-1): ext4_lookup:1578: inode #11: comm du: deleted inode referenced: 3212958
/var/log/messages.2:Jun 7 07:34:36 share kernel: EXT4-fs error (device dm-1): ext4_lookup:1578: inode #11: comm du: deleted inode referenced: 3212959
/var/log/messages.2:Jun 7 07:34:36 share kernel: EXT4-fs error (device dm-1): ext4_lookup:1578: inode #11: comm du: deleted inode referenced: 3212960
/var/log/messages.2:Jun 7 07:34:36 share kernel: EXT4-fs error (device dm-1): ext4_lookup:1578: inode #11: comm du: deleted inode referenced: 3212961
/var/log/messages.2:Jun 7 07:34:36 share kernel: EXT4-fs error (device dm-1): ext4_lookup:1578: inode #11: comm du: deleted inode referenced: 3212962
/var/log/messages.2:Jun 7 07:34:36 share kernel: EXT4-fs error (device dm-1): ext4_lookup:1578: inode #11: comm du: deleted inode referenced: 3212963 |
That's why I wanted to check the file system.
The state of the array is as follows:
/proc/mdstat
Code: | Personalities : [raid0] [raid1] [raid10] [raid6] [raid5] [raid4] [multipath] [faulty]
md2 : active raid6 sdd[0] sdb[9] sda[8] sdh[7] sdi[6] sdc[10] sdf[3] sdg[2] sde[1]
23441080320 blocks super 1.2 level 6, 512k chunk, algorithm 2 [10/9] [UUUUU_UUUU]
bitmap: 12/22 pages [48KB], 65536KB chunk |
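In case it helps, the per-slot view can also be pulled up like this (I have left the full output out of this post):
Code: | mdadm --detail /dev/md2   # per-device state, shows the removed slot and the event count |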
Output of mdadm -E:
https://pastebin.com/rXthGAAH
Output of smartctl -a:
https://pastebin.com/FGdLFkCh
I did not try to mount the FS using an alternate superblock, since I am unable to locate one:
Quote: | dumpe2fs 1.44.5 (15-Dec-2018)
Filesystem volume name: <none>
Last mounted on: <not available>
Filesystem UUID: c5f0559d-e3bd-473f-abc0-7c42b3115897
Filesystem magic number: 0xEF53
Filesystem revision #: 1 (dynamic)
Filesystem features: ext_attr dir_index filetype extent 64bit flex_bg sparse_super huge_file uninit_bg dir_nlink extra_isize
Filesystem flags: signed_directory_hash
Default mount options: user_xattr acl
Filesystem state: not clean
Errors behavior: Continue
Filesystem OS type: Linux
Inode count: 366268416
Block count: 5860269056
Reserved block count: 0
Free blocks: 5836914127
Free inodes: 366268405
First block: 0
Block size: 4096
Fragment size: 4096
Group descriptor size: 64
Blocks per group: 32768
Fragments per group: 32768
Inodes per group: 2048
Inode blocks per group: 128
RAID stride: 128
RAID stripe width: 1024
Flex block group size: 16
Filesystem created: Sat Mar 17 14:36:16 2018
Last mount time: n/a
Last write time: Fri Jun 7 10:11:34 2019
Mount count: 0
Maximum mount count: -1
Last checked: Sat Mar 17 14:36:16 2018
Check interval: 0 (<none>)
Lifetime writes: 457 MB
Reserved blocks uid: 0 (user root)
Reserved blocks gid: 0 (group root)
First inode: 11
Inode size: 256
Required extra isize: 28
Desired extra isize: 28
Default directory hash: half_md4
Directory Hash Seed: 4c37872e-3207-4ff4-8939-a428feaeb49f
Journal backup: inode blocks |
Kind regards,
Jimini _________________ "The most merciful thing in the world, I think, is the inability of the human mind to correlate all its contents." (H.P. Lovecraft: The Call of Cthulhu) |
Jimini l33t
Joined: 31 Oct 2006 Posts: 605 Location: Germany
Posted: Thu Jun 20, 2019 7:10 am Post subject: |
I have an update: I replaced the failed disk in the array and started e2fsck again. Now it seems to be fixing a bunch of errors - I am curious whether it can finish its work this time.
Before I replaced the disk, e2fsck only complained about "Inode %$i block %$b conflicts with critical metadata, skipping block checks." - now it names actual blocks and inodes.
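The swap itself was nothing special, roughly the following - the device names are placeholders, the new disk got a different letter here:
Code: | # rough sketch of the disk replacement (placeholder device names)
mdadm --manage /dev/md2 --remove /dev/sdh   # drop the failed member
mdadm --manage /dev/md2 --add /dev/sdk      # add the replacement, rebuild starts
watch cat /proc/mdstat                      # follow the recovery |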
Kind regards,
Jimini _________________ "The most merciful thing in the world, I think, is the inability of the human mind to correlate all its contents." (H.P. Lovecraft: The Call of Cthulhu) |
Jimini l33t
Joined: 31 Oct 2006 Posts: 605 Location: Germany
Posted: Thu Jun 20, 2019 8:33 am Post subject: |
I was now able to fix the file system. Since e2fsck could fix all errors in ~2 hours, I assume that the degraded (but clean!) RAID6 was the reason for all the problems.
For me, one big question remains unanswered: how redundant is a RAID6 when the FS on it throws errors as long as one disk is missing?
Kind regards,
Jimini _________________ "The most merciful thing in the world, I think, is the inability of the human mind to correlate all its contents." (H.P. Lovecraft: The Call of Cthulhu) |
Jimini l33t
Joined: 31 Oct 2006 Posts: 605 Location: Germany
Posted: Fri Jun 21, 2019 4:58 am Post subject: |
...the problem is NOT solved.
I tried to simulate the problem and marked one of the disks in the array as faulty. Afterwards, I replaced it and rebuilt the array.
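(The "simulation" was simply the standard mdadm way of failing a member, something like this with a placeholder device name; the re-add and rebuild were done as described above:)
Code: | # mark a healthy member as faulty to simulate the failure (placeholder device)
mdadm --manage /dev/md2 --fail /dev/sdf
mdadm --manage /dev/md2 --remove /dev/sdf |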
Unfortunately, the ext4 errors occurred again:
kernel: EXT4-fs error (device dm-1): ext4_find_dest_de:1802: inode #3833864: block 61343924: comm nfsd: bad entry in directory: rec_len % 4 != 0 - offset=1000, inode=2620025549, rec_len=30675, name_len=223, size=4096
kernel: EXT4-fs error (device dm-1): ext4_lookup:1577: inode #172824586: comm tvh:tasklet: iget: bad extra_isize 13022 (inode size 256)
kernel: EXT4-fs error (device dm-1): htree_dirblock_to_tree:1010: inode #7372807: block 117967811: comm tar: bad entry in directory: rec_len % 4 != 0 - offset=104440, inode=1855122647, rec_len=12017, name_len=209, size=4096
...and so on.
Sorry if I repeat myself, but IMHO a degraded RAID6 should not lead to filesystem corruption.
dumpe2fs:
Code: | Filesystem volume name: <none>
Last mounted on: /home/share
Filesystem UUID: c5f0559d-e3bd-473f-abc0-7c42b3115897
Filesystem magic number: 0xEF53
Filesystem revision #: 1 (dynamic)
Filesystem features: ext_attr dir_index filetype extent 64bit flex_bg sparse_super large_file huge_file uninit_bg dir_nlink extra_isize
Filesystem flags: signed_directory_hash
Default mount options: user_xattr acl
Filesystem state: clean with errors
Errors behavior: Continue
Filesystem OS type: Linux
Inode count: 366268416
Block count: 5860269056
Reserved block count: 0
Free blocks: 755383351
Free inodes: 363816793
First block: 0
Block size: 4096
Fragment size: 4096
Group descriptor size: 64
Blocks per group: 32768
Fragments per group: 32768
Inodes per group: 2048
Inode blocks per group: 128
RAID stride: 128
RAID stripe width: 1024
Flex block group size: 16
Filesystem created: Sat Mar 17 14:36:16 2018
Last mount time: Fri Jun 21 05:25:34 2019
Last write time: Fri Jun 21 05:30:27 2019
Mount count: 3
Maximum mount count: -1
Last checked: Thu Jun 20 08:55:17 2019
Check interval: 0 (<none>)
Lifetime writes: 139 GB
Reserved blocks uid: 0 (user root)
Reserved blocks gid: 0 (group root)
First inode: 11
Inode size: 256
Required extra isize: 28
Desired extra isize: 28
Default directory hash: half_md4
Directory Hash Seed: 4c37872e-3207-4ff4-8939-a428feaeb49f
Journal backup: inode blocks
FS Error count: 20776
First error time: Thu Jun 20 14:18:47 2019
First error function: ext4_lookup
First error line #: 1577
First error inode #: 172824586
First error block #: 0
Last error time: Fri Jun 21 05:53:24 2019
Last error function: ext4_lookup
Last error line #: 1577
Last error inode #: 172824586
Last error block #: 0 |
And "of course", e2fsck throws dozens of "Inode %$i block %$b conflicts with critical metadata, skipping block checks" lines. Seems like I am unable to fix the FS errors, again.
Kind regards,
Jimini _________________ "The most merciful thing in the world, I think, is the inability of the human mind to correlate all its contents." (H.P. Lovecraft: The Call of Cthulhu) |
NeddySeagoon Administrator
Joined: 05 Jul 2003 Posts: 54815 Location: 56N 3W
Posted: Fri Jun 21, 2019 9:56 pm Post subject: |
Jimini,
Try Code: | mount -o ro,sb=131072 /dev/dm-1 /mnt/<someplace> |
131072 will be the first alternate superblock: the sb= option is given in 1 k units, so the backup at filesystem block 32768 on a 4 k-block filesystem becomes sb=131072.
By default mount uses the primary superblock.
I think you can pass alternate superblocks to fsck too but I've not needed to for a while.
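From memory it would look roughly like this; mke2fs -n is a dry run and should not write anything, but check the man pages before trusting my syntax. 32768 is where the first backup normally sits on a 4k-block filesystem:
Code: | # list where the backup superblocks should live (-n = dry run, writes nothing)
mke2fs -n -b 4096 /dev/dm-1
# point e2fsck at the first backup superblock (4k block size -> block 32768)
e2fsck -b 32768 -B 4096 /dev/dm-1 |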
What happens to a raid set when a component fails depends on the failure mode.
Lots of bits in computer systems these days have built-in test. However, it has its limitations.
There is an inherent flaw in the reasoning that a faulty item can detect that it is faulty.
In the face of most failure modes, raid works as intended.
Further, it's not clear that the faulty element is the cause of the filesystem corruption.
Correlation does not prove cause and effect. _________________ Regards,
NeddySeagoon
Computer users fall into two groups:-
those that do backups
those that have never had a hard drive fail. |
Jimini l33t
Joined: 31 Oct 2006 Posts: 605 Location: Germany
Posted: Sat Jun 22, 2019 8:19 am Post subject: |
NeddySeagoon, thank you for your support - due to the misleading output of e2fsck I filed a bug report (https://bugzilla.kernel.org/show_bug.cgi?id=203943).
After setting $LANG to en_GB, e2fsck provided some helpful output, and I was able to clear the erroneous superblocks with debugfs. Afterwards, e2fsck fixed a huge number of errors.
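(For reference, forcing untranslated messages is just a matter of prefixing the locale variables, roughly like this:)
Code: | # run e2fsck with untranslated (English) messages
LC_ALL=C LANG=C e2fsck -f /dev/mapper/share |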
Although some data loss occurred, it fortunately only affects directories the system was writing to while the ext4 FS was corrupted: local backup data and TV recordings.
The system is in clean shape now, but I will have a detailed look at the logs and the monitoring over the next few weeks.
Kind regards,
Jimini _________________ "The most merciful thing in the world, I think, is the inability of the human mind to correlate all its contents." (H.P. Lovecraft: The Call of Cthulhu) |
MPW n00b
Joined: 07 Jun 2020 Posts: 1
Posted: Sun Jun 07, 2020 11:14 pm Post subject: |
Hello Jimini,
I have pretty much the same problem as you had. Could you explain in more detail what you did to fix the filesystem?
I can't get a full list of corrupted inodes, as badblocks -b 4096 doesn't run on my system for reasons I don't understand. I have an 11x 4TB RAID6 (36 TB net) with ext4.
In the syslog I saw FS errors, and I deleted the broken files with debugfs clri. But I still see lots of metadata conflicts, just like you had.
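(For clarity, what I ran was along these lines - the inode number and the device are placeholders:)
Code: | # clear one corrupted inode by number (placeholders!), then re-run e2fsck
debugfs -w -R "clri <1234567>" /dev/md0
e2fsck -f /dev/md0 |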
What I don't understand: what does this have to do with the superblock - do I need to work with the backup superblock as well? My RAID is still mountable, but I don't want to destroy anything.
Best,
Matthias |
fturco Veteran
Joined: 08 Dec 2010 Posts: 1181
Posted: Mon Jun 08, 2020 4:23 pm Post subject: |
@MPW: welcome to the Gentoo forums.
MPW wrote: | I can't get a full list of corrupted inodes, as badblocks -b 4096 doesn't run on my system for reasons I don't understand. I have an 11x 4TB RAID6 (36 TB net) with ext4. |
Did you get a specific error message from badblocks? |
NeddySeagoon Administrator
Joined: 05 Jul 2003 Posts: 54815 Location: 56N 3W
Posted: Mon Jun 08, 2020 5:41 pm Post subject: |
MPW,
Why are you using badblocks?
In general, it's not useful on any HDD over 4 GB, as they do dynamic bad-block remapping.
If you suspect faulty sectors on your raid set and want to test, proceed as follows ...
Run Code: | smartctl -a /dev/... | on each drive and save the output. Post it here; it may already point to a problem drive.
Run the long test on each drive.
This is much faster than badblocks, as it's a single command to the drive and the drive tests itself, so they can all run at the same time.
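Starting them looks like this; the estimated duration is listed in the smartctl -a output, and the results land in the self-test log:
Code: | smartctl -t long /dev/sda      # kick off the long self-test (repeat per drive)
smartctl -l selftest /dev/sda  # later: check progress and results |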
Wait for the tests to complete, then run Code: | smartctl -a /dev/... | again.
Post the output again. Now we can compare before and after. The changes, not just the raw output, may be useful.
With 11 drives, put the results onto a pastebin as it will be too much for a post.
For each raid member block device, run Code: | mdadm -E /dev/... | and post the result. _________________ Regards,
NeddySeagoon
Computer users fall into two groups:-
those that do backups
those that have never had a hard drive fail. |