Gentoo Forums
System hangs, hdd light on constantly.
blixel
Guru


Joined: 19 Jul 2004
Posts: 403
Location: Central, Florida

Posted: Thu Nov 25, 2004 9:11 pm    Post subject: System hangs, hdd light on constantly.

A while back I was having constant problems with X hanging my system quite often. That issue has since been solved but now I have a new problem.

Sometimes while I'm working at my computer, I'll notice that my system is suddenly unresponsive to *new* commands. I can minimize windows, resize, maximize, but I can't start any new programs, and the programs that are open don't let me do anything. Always when this happens (100% of the time) I'll notice that my HDD light is on. Not blinking, as if it's working, it's just on solid. I have to physically turn my computer off and turn it back on to get it working again. If I just hit the reset button, it will reboot, but doesn't "see" the hard-drives.

I do not believe this is a hardware problem as it never happens when I'm using that *other* Operating System. Also, I haven't noticed this problem under Ubuntu while I've messed with it.

I don't know what causes it to happen. Sometimes my system will run for a few hours, sometimes it will run for a few days. But in the last 3 weeks I have had to shut off my computer at least 10 times due to this problem.

Sometimes when I go away from my computer for a while, when I come back to it (the next morning for example), I will bump the mouse to bring the monitors out of power save mode. When they don't come up, I'll look down at my case and sure enough, the HDD light is on solid. At that point I can't ping or ssh into my machine, CTRL+ALT+BACKSPACE (or DEL) does nothing ... the only option is to power it off.

My guess is that I have something compiled into the kernel that is causing this problem (or something NOT compiled into the kernel that is causing it). Or maybe the current kernel (2.6.9) has a bug that is showing up with my hardware config.

Here is a link to my kernel config http://www.davidcourtney.org/kernel-config

Relevant system specs:

AMD XP3200+
1GB (2x512) DDR400 RAM
ASUS A7V880 Motherboard
IDE Western Digital 120GB, 7200RPM, 8MB Cache
NeddySeagoon
Administrator


Joined: 05 Jul 2003
Posts: 55015
Location: 56N 3W

Posted: Fri Nov 26, 2004 8:18 pm

blixel,

You have some real bleeding edge ACPI settings, for instance:-
Code:
#
# Power management options (ACPI, APM)
#
CONFIG_PM=y
# CONFIG_PM_DEBUG is not set
CONFIG_SOFTWARE_SUSPEND=y
CONFIG_PM_STD_PARTITION="/dev/hdb2"

and
Code:
# ACPI (Advanced Configuration and Power Interface) Support
#
CONFIG_ACPI=y

Try with ACPI turned off. acpi=off on the end of your kernel command line should do for a trial.
_________________
Regards,

NeddySeagoon

Computer users fall into two groups:-
those that do backups
those that have never had a hard drive fail.
blixel
Guru


Joined: 19 Jul 2004
Posts: 403
Location: Central, Florida

Posted: Fri Nov 26, 2004 9:10 pm

NeddySeagoon wrote:
blixel,

You have some real bleeding edge ACPI settings, for instance:-
Code:
#
# Power management options (ACPI, APM)
#
CONFIG_PM=y
# CONFIG_PM_DEBUG is not set
CONFIG_SOFTWARE_SUSPEND=y
CONFIG_PM_STD_PARTITION="/dev/hdb2"



Well, I've never been able to get suspend to work so I can easily take that out of the kernel. I just haven't bothered because I assumed if I wasn't making use of the suspend features, that those bits of code were ignored.
NeddySeagoon
Administrator


Joined: 05 Jul 2003
Posts: 55015
Location: 56N 3W

Posted: Fri Nov 26, 2004 9:25 pm

blixel,

ACPI is known as a generic source of problems. It's good to turn it right off to narrow things down.
blixel
Guru


Joined: 19 Jul 2004
Posts: 403
Location: Central, Florida

Posted: Fri Dec 03, 2004 3:09 am

NeddySeagoon wrote:
ACPI is known as a generic source of problems. It's good to turn it right off to narrow things down.


I just had another hard-drive hang up. Exactly 10 minutes ago. This time it happened while I was sitting here. I heard a mechanical noise come from my hard-drive. I've become somewhat familiar with hard-drive noises over the past 10 years or so and this particular sound didn't alarm me much. It sounded like the hard-drive head might have been resting itself (or something like that).

Anyway, as soon as I heard it I looked down at the computer and sure enough, the hdd light was on constantly and I couldn't do anything. I shut off the computer (because resetting doesn't work). As soon as it booted back up I checked /var/log/messages. Unfortunately there was absolutely nothing logged anywhere near the time of the lock up. The logs skip from something from over an hour ago until I rebooted.

I did however take my hdparm commands out of my startup. I'm thinking maybe the drive was trying to go to sleep and that triggers the lock up.

Here is what I had in my /etc/conf.d/local.start:

Code:
# /etc/conf.d/local.start:
# $Header: /var/cvsroot/gentoo-src/rc-scripts/etc/conf.d/local.start,v 1.4 2002/11/18 19:39:22 azarah Exp $

# This is a good place to load any misc.
# programs on startup ( 1>&2 )

HDPARM=/sbin/hdparm

$HDPARM -S 180 /dev/hda
$HDPARM -S 180 /dev/hdb
$HDPARM -c 1 -d 1 -m 16 -S 180 /dev/hdb


Several weeks ago I commented out that last line to try to resolve this problem. Now I have commented them all out and made sure hdparm was not set to start during bootup.
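
For reference, the -S 180 in those lines is a standby (spin-down) timeout. According to the hdparm man page, values from 1 to 240 count in units of 5 seconds and values from 241 to 251 count in units of 30 minutes, so 180 should work out to 15 minutes of idle time before the drive spins down. A minimal sketch of that decoding, assuming a POSIX shell; standby_seconds is a hypothetical helper name, not something hdparm provides:

```shell
#!/bin/sh
# standby_seconds is a hypothetical helper (not part of hdparm).
# It converts an "hdparm -S" value into a timeout in seconds, using the
# encoding described in the hdparm man page.
standby_seconds() {
    v=$1
    if [ "$v" -eq 0 ]; then
        echo 0                          # 0 disables the standby timeout
    elif [ "$v" -ge 1 ] && [ "$v" -le 240 ]; then
        echo $(( v * 5 ))               # multiples of 5 seconds
    elif [ "$v" -ge 241 ] && [ "$v" -le 251 ]; then
        echo $(( (v - 240) * 1800 ))    # units of 30 minutes
    else
        # 252-255 have special or vendor-defined meanings; not decoded here
        echo "value $v is not handled by this sketch" >&2
        return 1
    fi
}

# The value used in local.start above
standby_seconds 180
```

If that arithmetic is right, a hang that only shows up after the machine has been idle for a while would at least be consistent with the spin-down theory, though it proves nothing on its own.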

Unfortunately it will be at least 2 or 3 weeks (or longer) before I really know if this is the answer. Sometimes the lockups happen within hours, sometimes days, sometimes weeks.
truekaiser
l33t


Joined: 05 Mar 2004
Posts: 810

Posted: Fri Dec 03, 2004 4:29 am

I had something similar happen to my Dell laptop. It turned out the hard drive was in the first stages of dying.
monkeyhead
Tux's lil' helper


Joined: 19 Mar 2004
Posts: 97
Location: hella-where?

Posted: Fri Dec 03, 2004 5:51 am

I had very similar problems and ended up doing exactly what neddy s. suggested and it was like night and day.
blixel
Guru


Joined: 19 Jul 2004
Posts: 403
Location: Central, Florida

Posted: Fri Dec 03, 2004 5:59 am

monkeyhead wrote:
I had very similar problems and ended up doing exactly what neddy s. suggested and it was like night and day.


I recompiled my kernel as he suggested and I also disabled ACPI at boot with "acpi=off" on the kernel command line ... it didn't help. After 4 days and 2 hours of uptime (roughly) my system hung again.
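
One thing worth ruling out is a typo in the boot option, since the kernel silently ignores parameters it does not recognise. A rough sketch of a check, assuming a POSIX shell; has_param is a hypothetical helper name and the root= value in the example is made up:

```shell
#!/bin/sh
# has_param is a hypothetical helper: it succeeds if the exact word $1
# appears in the space-separated kernel command line given as $2.
has_param() {
    for word in $2; do
        [ "$word" = "$1" ] && return 0
    done
    return 1
}

# Example with a literal command line (the root= value is invented)
if has_param "acpi=off" "root=/dev/hda3 acpi=off"; then
    echo "acpi=off is present"
else
    echo "acpi=off is missing"
fi
```

On the running system the second argument would be "$(cat /proc/cmdline)", which shows the command line the kernel was actually booted with.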

Here is my latest kernel config though if anyone cares to take another stab at it.

http://www.davidcourtney.org/kernel-config


Last edited by blixel on Fri Dec 03, 2004 6:02 am; edited 1 time in total
xbmodder
Guru


Joined: 25 Feb 2004
Posts: 404

Posted: Fri Dec 03, 2004 6:00 am

yeah ditto
monkeyhead
Tux's lil' helper


Joined: 19 Mar 2004
Posts: 97
Location: hella-where?

Posted: Sat Dec 04, 2004 1:20 am

I just looked at my lilo.conf and I also include 'apci=off'. For the life of me I can't remember why... but I do remember putting it in at the same time as the 'acpi=off'.

sorry i couldn't help more. :(
xbmodder
Guru


Joined: 25 Feb 2004
Posts: 404

Posted: Sun Dec 12, 2004 4:03 pm

do a fuser -v /
All times are GMT