Gentoo Forums
Hard Drive dead? I sure hope not
nomind
Apprentice
Joined: 03 Feb 2005
Posts: 270

Posted: Sun Feb 26, 2006 7:43 pm    Post subject: Hard Drive dead? I sure hope not

Linux refuses to boot from my root partition because of a corrupt primary superblock. I get the good old "No such file or directory" message, followed by "unable to mount /dev/sda4: check to see that this device exists and try using alternate superblocks with e2fsck -b 8193 <device>". Of course, block 8193 doesn't help because the partition uses 4K blocks. However, block 32768 doesn't help either, even though it should. I've followed the instructions posted here, even though they don't cover exactly the same problem. Didn't help. I somehow managed to run fsck on /dev/sda4 from the LiveCD, but that didn't help either.
I think a udev upgrade failed immediately prior to this problem. IIRC, it was because I'm still on kernel 2.6.11. In any case, "emerge -u udev" failed but still asked me to run etc-update, and so I did.
Here's hoping my 1.5yr old Western Digital 80GB SATA is not on its last legs. I would greatly appreciate input regarding this matter.

Thanks.
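
For reference, a rough sketch of where the backup superblocks would normally sit. It assumes the filesystem was created with mke2fs defaults (sparse_super enabled, blocks per group = 8 x block size), which may not hold for a partition that has been resized. The function names are made up for illustration.

```shell
#!/bin/sh
# Returns 0 if block group $1 holds a backup superblock under sparse_super:
# group 1 and every power of 3, 5 and 7.
is_backup_group() {
    n=$1
    [ "$n" -eq 1 ] && return 0
    for base in 3 5 7; do
        p=$base
        while [ "$p" -lt "$n" ]; do p=$((p * base)); done
        [ "$p" -eq "$n" ] && return 0
    done
    return 1
}

# Prints candidate backup superblock locations, one per line.
# $1 = block size in bytes, $2 = filesystem size in blocks.
# ASSUMPTION: default layout, i.e. blocks per group = 8 * block size.
backup_superblocks() {
    block_size=$1
    total_blocks=$2
    per_group=$((block_size * 8))
    # With 1K blocks the first data block is 1, otherwise it is 0.
    if [ "$block_size" -eq 1024 ]; then first=1; else first=0; fi
    groups=$(( (total_blocks - first + per_group - 1) / per_group ))
    g=1
    while [ "$g" -lt "$groups" ]; do
        if is_backup_group "$g"; then
            echo $((g * per_group + first))
        fi
        g=$((g + 1))
    done
}
```

With 4K blocks the first two candidates come out as 32768 and 98304, so if 32768 fails, something like `e2fsck -b 98304 -B 4096 /dev/sda4` may be worth a try.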
I.C.Wiener
Tux's lil' helper
Joined: 25 Jul 2004
Posts: 115
Location: Furtwangen (Germany)

Posted: Mon Feb 27, 2006 12:58 am

You can read the drive's SMART log to see if it's a hardware problem. Run
Code:
smartctl -d ata -a /dev/sda
and check if there are any errors.

If you get something like this:
=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

everything should be fine with the hardware.

Note: you need a driver/kernel with ioctl() passthrough for smartctl to work with SATA disks. libata has supported this since kernel 2.6.15, so you will need either a very recent LiveCD or a very old one that uses the ancient IDE drivers to access your SATA disk.
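
If you want to script that check, here is a minimal sketch that pulls the verdict line out of the smartctl output. The function name smart_health is hypothetical, and it only looks at the overall-health line quoted above, not at the error log or the individual attributes.

```shell
#!/bin/sh
# Reads `smartctl -a` output on stdin and prints the overall-health verdict
# (e.g. PASSED), or UNKNOWN if no verdict line is present.
smart_health() {
    result=$(sed -n 's/^SMART overall-health self-assessment test result: *//p' | head -n 1)
    if [ -n "$result" ]; then
        echo "$result"
    else
        echo "UNKNOWN"
    fi
}

# Example (needs root and a kernel with SATA passthrough):
#   smartctl -d ata -a /dev/sda | smart_health
```

A PASSED verdict only means the drive has not flagged itself as failing, so it is still worth reading the rest of the output for logged errors.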
nomind
Apprentice
Joined: 03 Feb 2005
Posts: 270

Posted: Mon Feb 27, 2006 2:40 am

Thanks for the response I.C.Wiener!

I tried booting with my original 2004.3 LiveCD, but apparently it doesn't include the smartctl tool. I ran fsck.ext3 again with a more thorough check this time, but the problem persists. I guess I'll have to nuke the entire partition and reinstall from scratch next weekend. I'll probably go for a JFS filesystem this time around. Not that I blame ext3 for failing. It was my fault anyway: resizing and moving the partition around a zillion times will do that to any fs.

Thanks for the help.
I.C.Wiener
Tux's lil' helper
Joined: 25 Jul 2004
Posts: 115
Location: Furtwangen (Germany)

Posted: Mon Feb 27, 2006 4:10 am

Resizing and moving?! You didn't mention that in your first post. Before you delete anything, have a look at your HDD with fdisk/cfdisk. Some tools like PartitionMagic sometimes hide partitions after an operation has succeeded, for whatever reason. Check if the filesystem type is correct and change it if necessary.
nomind
Apprentice
Joined: 03 Feb 2005
Posts: 270

Posted: Mon Feb 27, 2006 4:57 am

Sorry for not stating that initially. Over the course of a year, I resized and moved the Gentoo partition approximately 12 times; first with PM8.0, then recently with Acronis Disk Director 10. This was mostly due to a constantly changing idea of application/data abstraction and the problems of sharing data between different OSes. Nonetheless, the last time I made changes to my disk layout was about 3 months ago.

I will check what I can with fdisk and cfdisk though.
sundialsvc4
Guru
Joined: 10 Nov 2005
Posts: 436

Posted: Mon Feb 27, 2006 4:36 pm

Then let us, of course, assume that something relating to the partition-table is the true culprit in this case, and not a hardware malfunction ... until proven otherwise, you know.
exien
n00b
Joined: 10 Jan 2005
Posts: 11

Posted: Tue Feb 28, 2006 4:12 am

I think I'm experiencing the same thing with my computer. I have a ~AMD64 2005.1 profile etc.

I restarted it today, only to have the boot process tell me that the primary block on /dev/hda2 is corrupt. I tried the same things you did, to no avail. Booting and chrooting into my environment from a LiveCD works perfectly. After the boot fails to read the drive, typing the root password gets me into the system with the hard drive mounted read-only (so obviously all the information is still there). The reason you and I are getting a "corrupt hard drive block, unable to mount" error message is that the kernel cannot mount the hard drive for one reason or another. Your hard drive is not necessarily corrupt.

Some sleuthing shows me that the /dev tree is almost empty, with about 8 items. This is why /dev/hda2 doesn't exist!! Running udevstart doesn't help (probably b/c the drive is mounted read-only).

I'm quite suspicious of udev and I think it's the reason why I'm having the problem. I have yet to fix it though.
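
One way to tell "the kernel can't see the partition" apart from "udev never created the device node" is to compare /proc/partitions with /dev. A minimal sketch follows; the function names and the optional path arguments are hypothetical and only there to make it easy to try out.

```shell
#!/bin/sh
# Returns 0 if the partition name appears in the 4th column of a
# /proc/partitions-style table (columns: major minor #blocks name).
kernel_sees() {
    awk -v n="$1" '$4 == n { found = 1 } END { exit !found }' "${2:-/proc/partitions}"
}

# Reports whether the problem looks like a kernel issue or a missing /dev node.
# $1 = partition name (e.g. hda2), $2 = table file, $3 = device directory.
diagnose() {
    part=$1
    table=${2:-/proc/partitions}
    devdir=${3:-/dev}
    if ! kernel_sees "$part" "$table"; then
        echo "kernel does not see $part"
    elif [ ! -e "$devdir/$part" ]; then
        echo "kernel sees $part but $devdir/$part is missing"
    else
        echo "$part looks present"
    fi
}

# Example: diagnose hda2
```

If the kernel lists the partition but the node is missing, that points at udev/baselayout rather than at the disk.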
Cintra
Advocate
Joined: 03 Apr 2004
Posts: 2111
Location: Norway

Posted: Tue Feb 28, 2006 7:10 am

nomind wrote:
Sorry for not stating that initially. Over the course of a year, I resized and moved the Gentoo partition approximately 12 times; first with PM8.0, then recently with Acronis Disk Director 10. This was mostly due to a constantly changing idea of application/data abstraction and the problems of sharing data between different OSes. Nonetheless, the last time I made changes to my disk layout was about 3 months ago.

I will check what I can with fdisk and cfdisk though.

You do have Acronis True Image as well don't you...?
If so, you are back on the air in a flash ;-)
_________________
"I am not bound to please thee with my answers" W.S.
exien
n00b
Joined: 10 Jan 2005
Posts: 11

Posted: Tue Feb 28, 2006 8:14 pm

Here's how I fixed the problem:
Code:
emerge =baselayout-1.11.14-r5
emerge =udev-079

I think the root cause is baselayout, and not udev, but I reverted to an older version to be safe.

Reference:
https://forums.gentoo.org/viewtopic-t-428088-highlight-superblock.html
https://forums.gentoo.org/viewtopic-t-424087-start-0.html
https://forums.gentoo.org/viewtopic-t-423958-highlight-suddenly.html
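
A downgrade like the one above can be undone by the next world update unless the newer versions are masked. A minimal sketch of generating the mask entries follows; the helper name mask_newer_than is made up, and the category names (sys-apps/baselayout, sys-fs/udev) are filled in here, not taken from the post.

```shell
#!/bin/sh
# Prints a package.mask entry that blocks every version newer than the one given.
# $1 = category/package, $2 = version to stay on.
mask_newer_than() {
    printf '>%s-%s\n' "$1" "$2"
}

# Example (as root):
#   mask_newer_than sys-apps/baselayout 1.11.14-r5 >> /etc/portage/package.mask
#   mask_newer_than sys-fs/udev 079 >> /etc/portage/package.mask
```

Remove the entries again once a fixed version is available, otherwise the packages stay pinned.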
nomind
Apprentice
Joined: 03 Feb 2005
Posts: 270

Posted: Wed Mar 01, 2006 5:09 am

Thanks for the tips exien! I'll definitely try that out tomorrow and see how it goes. Unfortunately all I can do is hope to have an older version of baselayout and udev lying around in my /usr/portage/distfiles, since the 2004.3 LiveCD doesn't recognise my network card. That and I'm currently downloading the 2006.0 LiveCD, which should take approximately another 13hrs!

@ Cintra: No I actually don't have True Image. I wasted enough money on PartitionMagic 8.0 that I didn't have much to spare for additional Acronis utilities. Should've known better than to expect safe software from Symantec.
nomind
Apprentice
Joined: 03 Feb 2005
Posts: 270

Posted: Fri Mar 03, 2006 12:05 am

As an update, I managed to get past the kernel's filesystem-confusion by upgrading both the kernel and udev to 2.6.15-r1 and 079 respectively. However, attempting to boot with the new kernel results in a kernel panic, with a message like so:
Code:
Kernel BUG at kernel/timer.c: 411!
invalid operand: 0000 [#1]
PREEMPT SMP
Modules linked in:
...
... <a bunch of hex>
...
Kernel Panic - not syncing: Fatal exception in interrupt

followed by me having to perform a hard-reboot.

I'll recompile the kernel and reduce the number of autoloaded modules tomorrow to see if that helps. If not, I'll go ahead with the fresh install. My current installation seems to be getting slower by the day anyway. Must have something to do with a "5.7% discontiguous filesystem", whatever the hell that means (I think fragmentation, not sure though).

Thanks for all your help!
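
On the "discontiguous" figure: e2fsck ends its run with a summary line that includes a "non-contiguous" percentage, which is the share of files whose blocks are not laid out in one run, so it is indeed a fragmentation measure. A small sketch for pulling that number out follows; the function name is hypothetical and it assumes the usual summary format of "files (N% non-contiguous)".

```shell
#!/bin/sh
# Reads e2fsck output on stdin and prints the non-contiguous percentage
# from the final summary line; prints nothing if no such line is found.
noncontig_pct() {
    sed -n 's/.*(\([0-9.]*\)% non-contiguous).*/\1/p' | tail -n 1
}

# Example (on an unmounted filesystem):
#   e2fsck -fn /dev/sda4 | noncontig_pct
```

A value in the single digits is usually nothing to worry about on ext3, so the slowdown probably has another cause.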
All times are GMT