Goverp Advocate
Joined: 07 Mar 2007 Posts: 2202
Posted: Sun Jul 22, 2018 2:11 pm Post subject: [SOLVED ?] Raid 5 sync error -> filename? |
My box has just become unstable (kernel Oops, software breaking); I've not installed any new software or changed anything (I think...). There's nothing outstanding from maintenance: --update @world and --depclean ran OK, there's nothing in @preserved-rebuild, and revdep-rebuild says it's fine. It's all "stable" stuff, the last kernel change was a week ago, and everything had been running fine since then until today.
First symptom was grub legacy no longer finding stage 2. That might be because I foolishly forgot to remove my camera's SD card before rebooting - it appears to cause the BIOS to renumber the drives, or forget which one I boot from, or something. Easily cured by fixing the boot order in BIOS. But Oops etc. as above afterwards. So perhaps hardware?
This is a 4-disk RAID 5 array. smartctl was happy with the drives in the array (apart from one error some years ago, IIUC). So I kicked off a sync, which reported one error in syslog:
Code: | /usr/sbin/checkarray --all --idle
md127: mismatch sector in range 30758320-30758328
|
md127 is the only array. It has 4 BIOS-style partitions on it, all ext4, and it's formed from four 250 GB partitions (not at the start of the disk), one on each of 4 almost identical drives.
As it's ext4, IIUC the filesystem can't say whether the data is right and the RAID redundancy info wrong, or vice-versa, so "repair" isn't really a good move.
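To illustrate why a mismatch alone cannot say which side is wrong, here is a minimal sketch of XOR parity over one stripe. The chunk contents and the helper names `xor_parity` and `stripe_is_consistent` are made up for the example; real md stripes are far larger, and this is not md's actual code.

```python
from functools import reduce

def xor_parity(chunks):
    """XOR equal-length byte strings together, column by column."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*chunks))

def stripe_is_consistent(data_chunks, parity):
    """A stripe checks out when the stored parity equals the XOR of the data."""
    return xor_parity(data_chunks) == parity

# Three data chunks plus one parity chunk, as on a 4-disk RAID 5 stripe.
data = [b"\x01\x02", b"\x10\x20", b"\x0f\xf0"]
parity = xor_parity(data)

# Case 1: one bit flipped in a data chunk, parity intact.
bad_data = [b"\x01\x03", data[1], data[2]]
# Case 2: data intact, the same bit flipped in the parity chunk.
bad_parity = bytes([parity[0], parity[1] ^ 0x01])

print(stripe_is_consistent(data, parity))        # healthy stripe
print(stripe_is_consistent(bad_data, parity))    # mismatch
print(stripe_is_consistent(data, bad_parity))    # mismatch, same symptom
```

Both damaged stripes fail the check in exactly the same way, so without a checksum from the filesystem there is nothing to say whether the data or the parity should be trusted.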
My guess is the error doesn't actually have any bearing on my Oopses and so forth, but if I can turn that sector range into one or more filenames, I could recreate them (if software, reinstall, if data, restore from backup).
TL;DR
How do I convert the sector numbers into a filename? Googling tells me it will involve debugfs, and probably at least one level of arithmetic on partition offsets. I'd be most grateful if anyone could point me at a HowTo or worked example. _________________ Greybeard
Last edited by Goverp on Mon Jul 23, 2018 8:35 pm; edited 1 time in total |
Goverp Advocate
Joined: 07 Mar 2007 Posts: 2202
Posted: Mon Jul 23, 2018 8:34 pm Post subject: |
Oh well, I did something myself, which was to assume the message from sync gave the sector number from the start of the mapped disk. Then I looked at the partition table to deduce which partition it fell in, and subtracted that partition's offset. In this case it was the first partition, at offset 4, so the arithmetic was easy. Then I ran "debugfs /dev/md127p1" and used "icheck" to find the inode containing the sectors (there were none), and then "ncheck" to turn it into a path (but there was none).
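The arithmetic described above can be sketched as follows. One caveat: debugfs's icheck expects filesystem block numbers rather than sectors, so the sketch also divides by the number of sectors per block. The 512-byte sector size, the 4096-byte ext4 block size, and reading "offset 4" as a start sector are all assumptions, not facts from the post, and the function name `sectors_to_fs_blocks` is hypothetical. The sketch also keeps the post's assumption that the reported range is relative to the start of the array.

```python
def sectors_to_fs_blocks(first_sector, end_sector, partition_start_sector,
                         sector_size=512, fs_block_size=4096):
    """Map an array-relative sector range [first, end) to filesystem block numbers.

    sector_size and fs_block_size are assumed defaults; check the real values
    before trusting the result.
    """
    if first_sector < partition_start_sector:
        raise ValueError("range starts before the partition")
    sectors_per_block = fs_block_size // sector_size
    # Make the sectors relative to the partition, then round down to blocks.
    first_block = (first_sector - partition_start_sector) // sectors_per_block
    last_block = (end_sector - 1 - partition_start_sector) // sectors_per_block
    return list(range(first_block, last_block + 1))

# Range from the syslog message; partition start of 4 sectors is an assumption.
blocks = sectors_to_fs_blocks(30758320, 30758328, partition_start_sector=4)
print("icheck " + " ".join(str(b) for b in blocks))
```

The printed request could then be given to debugfs, for example with `debugfs -R "icheck ..." /dev/md127p1`, followed by `ncheck` on any inode it reports. The filesystem's actual block size is shown on the "Block size" line of `dumpe2fs -h`.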
I've no idea if that process is correct, though it seems logical. And it makes more sense than several threads I read in forums connected to less intelligent distros. IIUC their basic approach was to multiply the sector number by the sector size (which IMHO gives bytes from the start of whatever), treat that as a block number (???), divide by the filesystem Stride value (why not?), and at this point I lost the will to investigate further. Though they might be right and I might be wrong. Short of damaging a disk, I'm not sure how to test this.
Having decided it wasn't a real disk error, and having just had an emerge fail because of a typo in the source code that wasn't there when I read the code, I thought "maybe it's a memory error". So I booted my trusty Gentoo system rescue USB key and ran memcheck. A screenful of blood-red error messages in a trice. _________________ Greybeard