hannson n00b

Joined: 03 May 2005 Posts: 39
Posted: Wed Jul 12, 2006 12:18 am Post subject: Linux hangs/freezes - hdb: lost interrupt
Hi, I've got a weird problem with my computer. I left it on overnight like always, and when I came back the next morning everything was in a frozen-like state. I could switch terminals (alt 1-7) but couldn't log in on any of them until the computer snapped out of it. This has been happening repeatedly since yesterday and I have no idea what's wrong.
It's definitely not a DMA problem.
I switched from X to modular X (because at first I thought that was the problem).
I checked my HDD with a SMART tool and it passed the test.
I also checked the temperature: it was 35°C on both the motherboard and the CPU.
My computer setup is:
AMD64 3000+
512MB ram
Asus A8n-deluxe:
NFORCE4-SLI chipset
Currently (I'm writing from another computer) I'm running System Monitor on the broken computer. It shows 25% memory usage and 100% CPU, falling to 1% every 4-6 minutes.
It took around 7 minutes to go between pages in Firefox.
Any ideas?
BTW, I can't see any process using more than 2-7% CPU, maybe 14% at most all together. Still, the monitor says 100% usage.
Last edited by hannson on Wed Jul 12, 2006 8:28 pm; edited 1 time in total
 |
coolsnowmen Veteran


Joined: 30 Jun 2004 Posts: 1479 Location: No.VA
Posted: Wed Jul 12, 2006 5:28 am
100% memory at idle? What process is doing this?
top should tell you.
 |
hannson n00b

Joined: 03 May 2005 Posts: 39
Posted: Wed Jul 12, 2006 2:21 pm
No, sorry: 100% CPU usage. I'll try 'top' when I get home!
 |
coolsnowmen Veteran


Joined: 30 Jun 2004 Posts: 1479 Location: No.VA
Posted: Wed Jul 12, 2006 3:13 pm
Yeah, post whatever process is taking up 100%.
Also, can you restart and see if it still goes to 100%?
I've had nano run at 100% (after quitting badly) for no reason often enough that I put USE=minimal in front of it.
 |
hannson n00b

Joined: 03 May 2005 Posts: 39
Posted: Wed Jul 12, 2006 5:31 pm
OK, I've tried restarting. I've had to hard-reboot several times because the GUI hangs for too long and I can't log in on other terminals in the meantime.
 |
coolsnowmen Veteran


Joined: 30 Jun 2004 Posts: 1479 Location: No.VA
Posted: Wed Jul 12, 2006 5:57 pm
I think you need to find out what is breaking.
So I would boot from the live CD, chroot, and delete xdm from the runlevel.
This will let you test each part of the GUI independently, and first prove that the problem is in the GUI and not in something even earlier.
Then just start X, etc.
 |
hannson n00b

Joined: 03 May 2005 Posts: 39
Posted: Wed Jul 12, 2006 7:34 pm
Something tells me this is a damaged HDD problem...
The CPU is pegged at 99% or 100%, with most of the utilization in 'wa' (I/O wait).
Am I right? :-S
EDIT: My dmesg is getting filled with hdb: lost interrupt
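A minimal sketch, assuming a Linux kernel whose /proc/stat has an aggregate "cpu" line with an iowait column (the figure top labels 'wa'). It estimates what share of CPU time went to I/O wait between two snapshots. The function names and the sample counters are made up for illustration and do not come from this thread.

```python
# Sketch: estimate the I/O-wait share of CPU time from two snapshots of the
# aggregate "cpu" line in /proc/stat (Linux only).
# Field order per proc(5): user nice system idle iowait irq softirq steal ...


def parse_cpu_line(stat_text):
    """Return the counters from the aggregate 'cpu' line as a list of ints."""
    for line in stat_text.splitlines():
        fields = line.split()
        if fields and fields[0] == "cpu":
            return [int(value) for value in fields[1:]]
    raise ValueError("no aggregate 'cpu' line found")


def iowait_percent(before, after):
    """Percentage of elapsed CPU time spent in iowait between two snapshots."""
    deltas = [new - old for old, new in zip(before, after)]
    if len(deltas) < 5:
        return 0.0  # very old kernels have no iowait column
    # Sum only the first eight fields (user..steal); guest time, where
    # present, is already counted inside user time.
    total = sum(deltas[:8])
    if total <= 0:
        return 0.0
    return 100.0 * deltas[4] / total


def read_proc_stat(path="/proc/stat"):
    """Read the current contents of /proc/stat (not called in this demo)."""
    with open(path) as handle:
        return handle.read()


# Made-up sample snapshots, for illustration only.
SAMPLE_BEFORE = "cpu  100 0 50 800 50 0 0 0\n"
SAMPLE_AFTER = "cpu  110 0 55 810 925 0 0 0\n"

share = iowait_percent(parse_cpu_line(SAMPLE_BEFORE), parse_cpu_line(SAMPLE_AFTER))
print("iowait share: %.1f%%" % share)
```

On a live system, calling read_proc_stat() twice a few seconds apart and passing both results through parse_cpu_line() should give a number comparable to top's 'wa' column. A high value with no busy process points at storage rather than at the CPU.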
 |
coolsnowmen Veteran


Joined: 30 Jun 2004 Posts: 1479 Location: No.VA
Posted: Thu Jul 13, 2006 1:29 am
Did you start it without xdm?
How about without ANY services (networking/samba/cups/...)?
Quote: EDIT: My dmesg is getting filled with hdb: lost interrupt
That's not good; something is bad along the HD chain. I'd put the drive in another computer and try to read it, and back up anything important on it if you haven't already. When you put it in the other computer, use a known-good cable. Maybe you're lucky and it's just a bad cable.
hdb? What happened to hda? What is your fstab?
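To see how often the message occurs and on which device, the kernel log can be tallied. This is a rough sketch, assuming the lines look like the "hdb: lost interrupt" quoted above, optionally preceded by a "[seconds.micros]" timestamp that some kernels add. The function name and the sample text are hypothetical.

```python
import re
from collections import Counter

# Matches "<device>: lost interrupt", with or without a leading
# "[   12.345678]" timestamp. The pattern is an assumption based on the
# single message quoted in this thread.
LOST_IRQ = re.compile(r"^(?:\[\s*\d+\.\d+\]\s*)?(\w+): lost interrupt\s*$")


def count_lost_interrupts(dmesg_text):
    """Return a Counter mapping device name to number of lost-interrupt lines."""
    counts = Counter()
    for line in dmesg_text.splitlines():
        match = LOST_IRQ.match(line.strip())
        if match:
            counts[match.group(1)] += 1
    return counts


# Made-up sample log text, for illustration only.
SAMPLE_LOG = "\n".join([
    "hdb: lost interrupt",
    "some unrelated kernel message",
    "hdb: lost interrupt",
    "[  123.456789] hdb: lost interrupt",
])

for device, hits in count_lost_interrupts(SAMPLE_LOG).items():
    print(device, hits)
```

To feed it real data, something like `subprocess.run(["dmesg"], capture_output=True, text=True).stdout` should work, though reading the kernel log may need root on some systems. A count that keeps growing on one device, and stops after a cable swap, would support the bad-cable theory.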
 |
hannson n00b

Joined: 03 May 2005 Posts: 39
Posted: Thu Jul 13, 2006 4:29 pm
No, I was running everything. My guess is that either the cable isn't good enough or the drive is overheating, because it works for a while before it hangs. I recently moved the hard drive to another cable (because the old one couldn't use DMA for some reason) and moved it into the HDD rack inside the case (I was using a hot-swap drive in a 5.25" slot). Now I'm using the ASUS cable that came with my motherboard. It should be good enough!
My root/boot drive is /dev/hdb; there is no hda.
It's just that I used smartmontools (smartctl) to check the drive and it passed the test. :-/
 |
ZomAur n00b


Joined: 07 Jul 2006 Posts: 28 Location: Sweden
Posted: Mon Nov 27, 2006 8:50 pm
I'm having a similar problem on my server. The motherboard is a VIA. Since DMA isn't working, I've turned it off. It was working fine until about a week ago, when I started to get lockups once or twice a week. Now, sometimes I can't even boot!
The drive isn't even five months old, so I'm more suspicious of the motherboard. I've had this problem before: one drive just died, the other got the "lost interrupt", so I put it away in a box. Since those two were over two years old I just thought I was unlucky. Now I'm not so sure.
I'm using kernel 2.6.17-gentoo-r7, and all the disks have been IDE disks. Temperature is normal; it has never been above 40 as far as I know.
 |
st0ne n00b

Joined: 22 Jan 2004 Posts: 18
Posted: Tue Nov 28, 2006 9:52 am
Hi,
Same problem on my Core 2 Duo system with SATA hard drives.
Sometimes the whole system hangs: the HDD LED stays lit the whole time and everything freezes completely.
The only solution is a hard reset.
I don't know what it is, but I think it's some problem in the kernel. I also have problems with heavy IDE transfers: the CPU goes to almost 99% wa (only one core) in top,
so the system becomes unresponsive to any interaction.
My kernel is: gentoo-sources-2.6.18-r3
I've tested the kernel with and without preemption, but it has no effect.
greez st0ne
 |
neonman n00b

Joined: 01 Jul 2004 Posts: 30 Location: Luleå, sweden
Posted: Tue Nov 28, 2006 12:00 pm
Same problem here. https://forums.gentoo.org/viewtopic.php?p=3746291
The problem isn't the CPU wait % itself. The core of the problem is that the disks become extremely slow, which in turn causes the high wait % (processes waiting on I/O to complete show up as wait % in top). So this has nothing to do with CPU usage etc.; it's a problem with I/O.
I got this problem after upgrading my CPU, RAM and mobo. Same disks, and they worked just fine before the upgrade.
I can also add that I get the same problem with both the ata_piix driver and ahci.
After a reboot my system can work just fine for anywhere from 10 minutes to a day or so, then this starts.
Something is causing the SATA ports to time out; the driver then rescans and re-initializes the port. After this happens the system becomes slow as hell, and I get the low I/O performance. So maybe there are 2 problems: the first being that the port resets (check my post for dmesg output etc.), and the second being that performance suffers after the port re-initializes.
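One way to check the point about processes waiting on I/O is to list processes in uninterruptible sleep (state D), which is where tasks blocked on disk I/O usually sit. The following is a sketch only, assuming the Linux /proc/<pid>/stat layout of "pid (comm) state ..."; the function names and the sample lines are made up for illustration.

```python
import glob


def parse_pid_stat(stat_text):
    """Split one /proc/<pid>/stat line into (pid, comm, state)."""
    # comm is wrapped in parentheses and may itself contain spaces or ')',
    # so locate the LAST ')' instead of splitting on whitespace.
    open_paren = stat_text.index("(")
    close_paren = stat_text.rindex(")")
    pid = int(stat_text[:open_paren])
    comm = stat_text[open_paren + 1:close_paren]
    state = stat_text[close_paren + 1:].split()[0]
    return pid, comm, state


def uninterruptible(stat_texts):
    """Return (pid, comm) for every entry whose state is 'D'."""
    parsed = (parse_pid_stat(text) for text in stat_texts)
    return [(pid, comm) for pid, comm, state in parsed if state == "D"]


def read_all_stats():
    """Read every /proc/<pid>/stat on a live Linux system (not called here)."""
    texts = []
    for path in glob.glob("/proc/[0-9]*/stat"):
        try:
            with open(path, errors="replace") as handle:
                texts.append(handle.read())
        except OSError:
            continue  # the process exited between listing and reading
    return texts


# Made-up sample lines, truncated after a few fields, for illustration only.
SAMPLE = [
    "4321 (pdflush) D 2 0 0",
    "99 (my (odd) name) S 1 99 99",
]
print(uninterruptible(SAMPLE))
```

On an affected machine, uninterruptible(read_all_stats()) run during a stall would be expected to show the processes stuck behind the slow disk, which fits the observation that no single process is using much CPU.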