cfgauss l33t
Joined: 18 May 2005 Posts: 726 Location: USA
Posted: Sat Jul 14, 2012 10:21 pm Post subject: Hardware freeze: keyboard LEDs blink
This is a problem which used to happen rarely and now seems to happen twice a month. I'm not sure what the cause is.
The symptom is that my PS/2 keyboard freezes and two of its LEDs blink in sync. When this happens, the entire box freezes: the mouse doesn't work, KDE windows are not updated, I can't ssh to the box, etc. The only solution is a hard reboot (with the accompanying file system repair, etc.). Once the lights start blinking, I've tried swapping in a known-working PS/2 keyboard, and its LEDs start blinking too, i.e. it's frozen as well.
Does this indicate that the PS/2 port on the motherboard is malfunctioning? If so, I'd assume I could solve the problem by using a USB keyboard.
Thanks for any debugging suggestions.
[SOLVED] The video card fan is dead. See below. [/SOLVED]
[UNSOLVED]
I no longer believe that the video card fan was the root of the problem. Today the keyboard froze right after booting. The CPU and graphics card temperatures were all pretty low, as indicated by gkrellm, so I don't think overheating was the cause. This time only the keyboard froze (with blinking lights, etc.); the mouse and all other processes kept running normally.
Is this an indication that the problem is with the PS/2 keyboard hardware or the PS/2 keyboard driver?
Thanks for any debugging suggestions.
[/UNSOLVED]
Last edited by cfgauss on Thu Sep 20, 2012 8:15 pm; edited 2 times in total |
eccerr0r Watchman
Joined: 01 Jul 2004 Posts: 9892 Location: almost Mile High in the USA
Posted: Sat Jul 14, 2012 10:24 pm
When this happens, it means the kernel has panicked: a serious problem occurred that the kernel could not recover from, so it halts the machine to make sure no additional corruption can occur.
Since the machine is frozen, this is a bit harder to debug. Ideally we would catch it in the act somehow.
Is there any specific thing you did that would trigger this freeze? _________________ Intel Core i7 2700K/Radeon R7 250/24GB DDR3/256GB SSD
What am I supposed watching? |
cfgauss l33t
Joined: 18 May 2005 Posts: 726 Location: USA
Posted: Sat Jul 14, 2012 10:35 pm
eccerr0r wrote: | Is there any specific thing you did that would trigger this freeze? |
I don't think I did anything. The last time it happened was a half-hour ago. I emerged world and left it running (chromium was compiling). When I came back an hour later, it was frozen.
Is there a log file which might contain clues? If it's a kernel panic, does that mean that I have malfunctioning hardware? E.g. something on the motherboard?
Thanks for your help. |
eccerr0r Watchman
Joined: 01 Jul 2004 Posts: 9892 Location: almost Mile High in the USA
Posted: Sat Jul 14, 2012 10:42 pm
Usually in this state the machine does not sync, so even if it did try to write anything to the disk, it would not have made it.
It's probably best to set up another machine, connect it to this one over a serial console, and make the kernel log to the serial console.
But if it was just doing an emerge, perhaps the machine overheated? What CPU is this (an older AMD?) Could you try running a memtest program too, and perhaps check whether your CPU fan is clean?
But yes, this usually means a hardware issue, although software issues could also trigger it...
cfgauss l33t
Joined: 18 May 2005 Posts: 726 Location: USA
Posted: Fri Jul 27, 2012 4:12 am
eccerr0r wrote: | But if it's just doing an emerge, perhaps the machine overheated? What CPU is this (older AMD?) Could you try running a memtest program too, also perhaps check if your CPU fan is clean? |
The CPU is about four years old, an Intel Quad-Core Q6600.
I found a Gentoo HW Guide that suggested CPU stress tests and memtest. The box passed memtest, but after 2-3 hours of the cpubuild stress test the hottest core was at about 155F. (The Guide suggests that 160F is too hot.) It didn't lock up during the test, though. The CPU fan seems to function normally and the heatsink isn't particularly hot to the touch.
The Guide also suggested installing gkrellm which monitored the temp on my nVidia video card. It's 150F all the time. I checked and the attached fan is dead. I'm assuming this is the problem and will replace the card with one that has a working fan.
[EDIT] The new card with a working attached fan runs 20F cooler. Somehow I had expected more. [/EDIT]
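For readers who think in Celsius, the readings quoted in this post can be converted with a few lines of Python. This is only a minimal sketch: the function names and the labels in the dictionary are made up for illustration. Note that a temperature difference (the 20F drop) converts without the 32-degree offset.

```python
def fahrenheit_to_celsius(temp_f):
    """Convert an absolute temperature from Fahrenheit to Celsius."""
    return (temp_f - 32) * 5 / 9


def fahrenheit_delta_to_celsius(delta_f):
    """Convert a temperature difference; no 32-degree offset applies."""
    return delta_f * 5 / 9


# Readings taken from the post above; the labels are illustrative only.
readings_f = {
    "hottest CPU core under stress test": 155,
    "Guide's 'too hot' threshold": 160,
    "old video card (dead fan)": 150,
}

for label, temp_f in readings_f.items():
    print(f"{label}: {temp_f}F is about {fahrenheit_to_celsius(temp_f):.1f}C")

print(f"new card runs about {fahrenheit_delta_to_celsius(20):.1f}C cooler")
```

In other words, the stress-test peak is roughly 68C against a quoted limit of roughly 71C, and the new card's improvement is about 11C.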
Thanks very much for your suggestions.
Last edited by cfgauss on Sat Jul 28, 2012 3:22 pm; edited 1 time in total |
DirtyHairy l33t
Joined: 03 Jul 2006 Posts: 608 Location: Würzburg, Deutschland
Posted: Fri Jul 27, 2012 7:46 am
Software definitely is a possible culprit, and trying out a different kernel or provoking the panic without X running (to check whether graphics are related) won't hurt. However, the best thing would be to capture a log of the panic via serial or netconsole. This way, the backtrace might tell which parts of the kernel are involved in the panic. |
cfgauss l33t
Joined: 18 May 2005 Posts: 726 Location: USA
Posted: Fri Jul 27, 2012 1:20 pm
DirtyHairy wrote: | Software definitely is a possible culprit, and trying out a different kernel or provoking the panic without X running (to check whether graphics are related) won't hurt. However, the best thing would be to capture a log of the panic via serial or netconsole. This way, the backtrace might tell which parts of the kernel are involved in the panic. |
Does netconsole log to another box on the network? I'd like to try that. Can you point me to documentation on how to do this?
Thanks. |
depontius Advocate
Joined: 05 May 2004 Posts: 3526
Posted: Fri Jul 27, 2012 2:36 pm
Incidentally, at one point they had the LEDs blinking Morse code to tell you a little bit about the kernel panic you were having. I haven't done Morse code in some 40 years, so I wouldn't be a good one to tell you. But if you have any ham radio friends, you might see if you can trigger this while one of them is watching. I also don't know how much information is being blinked out, what information, whether it repeats, etc. _________________ .sigs waste space and bandwidth |
DirtyHairy l33t
Joined: 03 Jul 2006 Posts: 608 Location: Würzburg, Deutschland
Posted: Fri Jul 27, 2012 3:11 pm
I've never done it myself, but take a look at Documentation/networking/netconsole.txt in the Linux kernel sources.
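As a rough illustration of the receiving side: netconsole sends kernel log messages as UDP packets, so the second box only needs something listening on the target port and writing what arrives to a file. The sketch below does that in Python. It assumes the sender targets UDP port 6666 (which I believe is netconsole's default, but check the documentation), and the function names and the log file name are hypothetical. The panicking box still has to be configured as the documentation describes.

```python
import socket
import time


def open_listener(host="0.0.0.0", port=6666):
    """Bind a UDP socket; port 6666 is an assumption about the sender's target port."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind((host, port))
    return sock


def receive_once(sock, log_file, bufsize=4096):
    """Read one datagram, append it to log_file with a timestamp, return the text."""
    data, (sender_ip, _sender_port) = sock.recvfrom(bufsize)
    text = data.decode("utf-8", errors="replace")
    stamp = time.strftime("%Y-%m-%d %H:%M:%S")
    log_file.write(f"{stamp} {sender_ip} {text.rstrip()}\n")
    log_file.flush()  # flush after every message so the file stays current
    return text


def listen_forever(log_path="netconsole.log", host="0.0.0.0", port=6666):
    """Hypothetical entry point: log datagrams until interrupted with Ctrl-C."""
    with open_listener(host, port) as sock, open(log_path, "a") as log_file:
        while True:
            receive_once(sock, log_file)
```

Calling `listen_forever()` on the logging box and leaving it running should leave the last messages before a freeze in `netconsole.log`, including the panic backtrace if the kernel managed to send it.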