Gentoo Forums
[solved] Kernel Panic - Unknown Cause
noup
l33t


Joined: 21 Mar 2003
Posts: 917

PostPosted: Tue Feb 22, 2005 2:09 am    Post subject: [solved] Kernel Panic - Unknown Cause Reply with quote

Sorry if the title sounds like something out of a "geeky" movie, but I'm really out of ideas on how to solve this...
I upgraded my computer hardware on Friday (new motherboard/processor/memory) and it was up from Friday to Sunday. Since Sunday, though, I keep getting kernel panics, apparently at random. The last "major" thing I remember doing was upgrading gcc to version 3.4.3-20050110 and re-emerging system a couple of times.
I only saw the kernel panic message twice (the other times I was in X), and it looked like this:
Quote:

Kernel Panic
Version 2.84 Reloading

I don't think it's a hardware fault, since the computer worked correctly from Friday to Sunday. Now it only lasts about 2 hours. However, if I'm not using it, it seems to last longer.
One time, I noticed it was about to crash because the sound stopped working correctly, so I ran "top" and the program I was using (firefox) was at 99.9% CPU at that time.

So, does this have anything to do with the gcc upgrade? Is there a way to see any kernel output related to the system crash?
Some specs:
Code:

$ uname -a
Linux nouplab 2.6.10-gentoo-r6 #9 SMP Mon Feb 21 19:01:54 WET 2005 i686 Intel(R) Pentium(R) 4 CPU 3.00GHz GenuineIntel GNU/Linux


Thanks in advance.
_________________
noup.


Last edited by noup on Thu Apr 14, 2005 10:02 am; edited 2 times in total
TheRAt
Veteran


Joined: 03 Jun 2002
Posts: 1580

PostPosted: Tue Feb 22, 2005 3:54 am    Post subject: Reply with quote

You have recompiled your kernel to get support for your new devices? You do not mention that in your post.. Anything in the logs to indicate what was happening before the crash? Could your hardware be overheating?
_________________
All reality is the construct of the observer.

Get Firefox and rediscover the web!

BOFH Excuse #295:
The Token fell out of the ring. Call us when you find it.
noup
l33t


Joined: 21 Mar 2003
Posts: 917

PostPosted: Tue Feb 22, 2005 4:30 am    Post subject: Reply with quote

TheRAt wrote:
You have recompiled your kernel to get support for your new devices? You do not mention that in your post.. Anything in the logs to indicate what was happening before the crash? Could your hardware be overheating?

Thanks for answering.
The new devices were only the CPU (with hyperthreading) and the LAN card (which I don't use; I use my PCI wireless one). But yes, I've recompiled it several times, with the LAN card driver built in and as a module (but I don't think it matters, since it isn't being used). I didn't find any useful information in the logs. Sometimes it panics when I'm working, but the first kernel panic occurred when I was doing nothing: I was remotely connected by ssh and did some emerges, then I stopped for some time (say, half an hour, I think) and it just froze.
I don't think it was overheating, since it is pretty well cooled. Strange thing: since my post it hasn't crashed. I've been playing enemy-territory, which is very resource-hungry, and the machine is still up. :?
Isn't there a way to get the kernel to print some information in case it panics? And also, can this be due to the gcc 3.4 version?
TheRAt
Veteran


Joined: 03 Jun 2002
Posts: 1580

PostPosted: Tue Feb 22, 2005 5:20 am    Post subject: Reply with quote

Could be any number of things as the crashing is unpredictable.. I am sure there is probably a way to leave a kernel dump or something that has more error information, I just do not know how..

I'm still betting on hardware, but I have been wrong more times than I can count safely in one day, so...
noup
l33t


Joined: 21 Mar 2003
Posts: 917

PostPosted: Tue Feb 22, 2005 5:45 am    Post subject: Reply with quote

TheRAt wrote:
Could be any number of things as the crashing is unpredictable.. I am sure there is probably a way to leave a kernel dump or something that has more error information, I just do not know how..

I'm still betting on hardware, but I have been wrong more times than I can count safely in one day, so...

Heh, I hope you're wrong again (on the hardware thing)! :D
One thing I did before booting last time was enabling ACPI2 in the BIOS. Perhaps this had some influence? I thought I would be able to see my system's temperature with ACPI, since it's a new system, but acpi just gives no information... but this is another problem. :)
In menuconfig, there is a "Kernel debugging" option under "Kernel hacking"... perhaps this can help; I've got to look into how.
I'm gonna sleep now and leave this on all night to see what happens. I'll post the results first thing in the morning :D er... in the afternoon. :)
TheRAt
Veteran


Joined: 03 Jun 2002
Posts: 1580

PostPosted: Tue Feb 22, 2005 5:51 am    Post subject: Reply with quote

good luck!
GungHo
Apprentice


Joined: 27 Aug 2004
Posts: 254

PostPosted: Tue Feb 22, 2005 8:04 am    Post subject: Reply with quote

boah, you are very courageous 8), I would never compile a kernel with an unstable compiler release. Maybe a non-critical application, but not a core component of the system :?
noup
l33t


Joined: 21 Mar 2003
Posts: 917

PostPosted: Tue Feb 22, 2005 2:13 pm    Post subject: Reply with quote

GungHo wrote:
boah, you are very courageous 8), I would never compile a kernel with an unstable compiler release. Maybe a non-critical application, but not a core component of the system :?

I came to believe this the hard way... :?

I forgot to say, one last change I made was compiling the kernel with gcc-3.3.5 (yesterday). And well, the computer stayed up all night in a {emerge mplayer && sleep 2400 } cycle, and it is still up 8 hours later (bah, I couldn't sleep as much as I wanted :D ). This isn't real proof of stability, so I will leave the kernel unchanged for the next week and hope it all goes well.
But I have a question... My toolchain (linux26-headers, glibc and binutils) is all compiled with gcc-3.4. Even though I compiled my kernel with gcc-3.3.5, couldn't this affect it anyway?
noup
l33t


Joined: 21 Mar 2003
Posts: 917

PostPosted: Tue Feb 22, 2005 3:42 pm    Post subject: Reply with quote

Bah, it just broke again... :?

In my /var/log/kernel/log-2005-02-22-16:10:49 file, I have:
Code:

Feb 22 15:10:49 [kernel] scheduling ile atomic: firefox-bin/0xfffffeff/10193sch$
Feb 22 15:10:49 [kernel]  [<c01122e1>] smp_apic_timer_interrupt+0xe1/0xf0
Feb 22 15:10:49 [kernel]  [<c042b69e>] schedule_timeout+0xbe/0xc0
Feb 22 15:10:49 [kernel]  [<c015c2b9>] fget+0x49/0x60
Feb 22 15:10:49 [kernel]  [<c016f18b>] do_pollfd+0x5b/0xa0
Feb 22 15:10:49 [kernel]  [<c016f278>] do_poll+0xa8/0xd0
Feb 22 15:10:49 [kernel]  [<c016f401>] sys_poll+0x161/0x240
Feb 22 15:10:49 [kernel]  [<c016e750>] __pollwait+0x0/0xd0
Feb 22 15:10:49 [kernel]  [<c0103159>] sysenter_past_esp+0x52/0x75
Feb 22 15:10:49 [kernel] scheduling while atomic: firefox-bin/0xfffffeff/10193
Feb 22 15:10:49 [kernel]  [<c01122e1>] smp_apic_timenterrupt+0xe1/0xf0

and in /var/log/critical/log-2005-02-22-16:08:32:
Code:

Feb 22 15:08:29 [kernel] ] schedule_timeout+0xbe/0xc0
                - Last output repeated twice -
Feb 22 15:08:31 [kernel] ] sys_pl+0x161/0x240
Feb 22 15:08:31 [kernel] ] sl+0x161/0x240
Feb 22 15:08:31 [kernel] ]l+0x161/0x240
Feb 22 15:08:31 [kernel] ] sl+0x161/0x240
Feb 22 15:08:31 [kernel] ] do_po/0xd0
Feb 22 15:08:31 [kernel] ] sys_pl+0x161/0x240
                - Last output repeated twice -
Feb 22 15:08:32 [kernel] ] sl+0x161/0x240

Don't know if this is useful or not though...
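
For readers who want to pull the readable parts out of traces like the ones above, here is a minimal Python sketch. It assumes the 2.6-era syslog layout seen in this post, where the header reads "scheduling while atomic: command/preempt count/pid" and each frame reads "[<address>] symbol+0xoffset/0xsize"; those field meanings are my reading of the format, not something the log itself states. The names parse_trace, HEADER_RE and FRAME_RE are made up for illustration.

```python
import re

# Header line, e.g. "scheduling while atomic: firefox-bin/0xfffffeff/10193".
# Assumed field order: command name / preempt count (hex) / PID.
HEADER_RE = re.compile(r"scheduling while atomic: (\S+)/0x([0-9a-f]+)/(\d+)")

# Stack frame, e.g. "[<c016f401>] sys_poll+0x161/0x240".
FRAME_RE = re.compile(
    r"\[<([0-9a-f]+)>\]\s+([\w.]+)\+0x([0-9a-f]+)/0x([0-9a-f]+)"
)


def parse_trace(lines):
    """Return (header, frames) for the first trace found in `lines`.

    header is None if no intact header line is present. Garbled lines
    that match neither pattern are skipped silently.
    """
    header = None
    frames = []
    for line in lines:
        m = HEADER_RE.search(line)
        if m:
            if header is not None:
                break  # a second header starts a new trace; stop here
            header = {
                "comm": m.group(1),
                "preempt_count": int(m.group(2), 16),
                "pid": int(m.group(3)),
            }
            continue
        m = FRAME_RE.search(line)
        if m:
            frames.append({
                "address": int(m.group(1), 16),
                "symbol": m.group(2),
                "offset": int(m.group(3), 16),
                "size": int(m.group(4), 16),
            })
    return header, frames


# Lines taken from the excerpt above; the intact header is placed first
# for the demo.
SAMPLE = """\
Feb 22 15:10:49 [kernel] scheduling while atomic: firefox-bin/0xfffffeff/10193
Feb 22 15:10:49 [kernel]  [<c01122e1>] smp_apic_timer_interrupt+0xe1/0xf0
Feb 22 15:10:49 [kernel]  [<c042b69e>] schedule_timeout+0xbe/0xc0
Feb 22 15:10:49 [kernel]  [<c016f401>] sys_poll+0x161/0x240
""".splitlines()

if __name__ == "__main__":
    header, frames = parse_trace(SAMPLE)
    print(header["comm"], header["pid"])
    for frame in frames:
        print(frame["symbol"], hex(frame["offset"]))
```

The truncated lines from the critical log (such as "] sl+0x161/0x240") match neither pattern and are simply dropped, so this only helps with the parts of a trace that reached the log intact. It also says nothing about which driver left the kernel in atomic context; it just makes the call chain easier to read.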
GungHo
Apprentice


Joined: 27 Aug 2004
Posts: 254

PostPosted: Tue Feb 22, 2005 6:02 pm    Post subject: Reply with quote

Hi,

I would never compile vital parts of my box such as glibc, the kernel or binutils (maybe less critical) or other shared libs with an unstable release of gcc, because such components are the foundation of the whole system and its stability.

But I have no proof, it looks suspect to me, but ...
noup
l33t


Joined: 21 Mar 2003
Posts: 917

PostPosted: Tue Feb 22, 2005 6:28 pm    Post subject: Reply with quote

GungHo wrote:
Hi,

I would never compile vital parts of my box such as glibc, the kernel or binutils (maybe less critical) or other shared libs with an unstable release of gcc, because such components are the foundation of the whole system and its stability.

But I have no proof, it looks suspect to me, but ...

I'm recompiling my whole system with gcc 3.3.5 now. It will be the last thing I try before looking at the hardware for problems.
I'll be posting my results. :)
GungHo
Apprentice


Joined: 27 Aug 2004
Posts: 254

PostPosted: Tue Feb 22, 2005 8:10 pm    Post subject: Reply with quote

Hi,

boah, re-compiling the whole system :cry: . But that's the next stability test for your system 8)

But be aware, I cannot guarantee that this fixes your problem. I simply do not trust unstable software. But that is my personal opinion.

I just had a look at http://gcc.gnu.org/. The 3.4 line is the current release series, the 3.3 line is the previous release series, and the 4.0 line is the active development (mainline). But no release greater than 3.3.5-r1 is marked stable in Gentoo's portage. In the past there have sometimes been problems with some gcc releases, and as it's a very vital part of the system (it generates all binaries, shared libs etc.), in my opinion it's better to stay on the safe side. For Gentoo, being a source-based distribution, that matters more than for any binary-based distribution. Sigh :?

Good luck, and post the results of your system-rebuild :)
TheRAt
Veteran


Joined: 03 Jun 2002
Posts: 1580

PostPosted: Tue Feb 22, 2005 8:35 pm    Post subject: Reply with quote

sys-devel/gcc-3.4.3.20050110 works fine here... for everything.. using it for a while on my laptop and desktop and about 2 weeks on the server... no problems.. YMMV.
noup
l33t


Joined: 21 Mar 2003
Posts: 917

PostPosted: Tue Feb 22, 2005 9:45 pm    Post subject: Reply with quote

GungHo wrote:
Hi,

boah, re-compiling the whole system :cry: . But that's the next stability test for your system 8)

But be aware, I cannot guarantee that this fixes your problem. I simply do not trust unstable software. But that is my personal opinion.

I just had a look at http://gcc.gnu.org/. The 3.4 line is the current release series, the 3.3 line is the previous release series, and the 4.0 line is the active development (mainline). But no release greater than 3.3.5-r1 is marked stable in Gentoo's portage. In the past there have sometimes been problems with some gcc releases, and as it's a very vital part of the system (it generates all binaries, shared libs etc.), in my opinion it's better to stay on the safe side. For Gentoo, being a source-based distribution, that matters more than for any binary-based distribution. Sigh :?

Good luck, and post the results of your system-rebuild :)

See, I honestly don't think gcc would make my system freeze, mostly because (and this is what made me switch gcc versions) so many people were reporting that it worked so well. I understand it is a vital part of the system, but could it really produce binaries that simply freeze my system? I really don't know... :?
noup
l33t


Joined: 21 Mar 2003
Posts: 917

PostPosted: Tue Feb 22, 2005 9:46 pm    Post subject: Reply with quote

TheRAt wrote:
sys-devel/gcc-3.4.3.20050110 works fine here... for everything.. using it for a while on my laptop and desktop and about 2 weeks on the server... no problems.. YMMV.

Bahh you shouldn't have said that :D
noup
l33t


Joined: 21 Mar 2003
Posts: 917

PostPosted: Tue Feb 22, 2005 9:58 pm    Post subject: Reply with quote

OK, here is a list of possible causes. They're mostly things whose current Linux support I'm unsure about.

  • APIC enabled
  • ACPI enabled
  • APM disabled
  • ACPI2 extra tables (whatever this is) enabled in the BIOS
  • using ndiswrapper with an SMP enabled kernel

If anyone knows if any of these things usually causes a lockup, please tell me... :?
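
One way to keep track of which of these features a given kernel build actually has is to read them out of its .config. A minimal Python sketch follows. The option names (CONFIG_SMP, CONFIG_PREEMPT, CONFIG_X86_LOCAL_APIC, CONFIG_ACPI, CONFIG_APM) are the usual i386 symbols as far as I know, and both the sample text and the read_options helper are invented for illustration; this is not noup's real config. The BIOS "ACPI2" setting and ndiswrapper (an out-of-tree module) won't show up in a kernel .config at all.

```python
import re

# Kernel options matching the suspects listed above (assumed symbol names).
SUSPECTS = (
    "CONFIG_SMP",
    "CONFIG_PREEMPT",
    "CONFIG_X86_LOCAL_APIC",
    "CONFIG_ACPI",
    "CONFIG_APM",
)

# "CONFIG_FOO=y" style lines and "# CONFIG_FOO is not set" style lines.
SET_RE = re.compile(r"^(CONFIG_\w+)=(.+)$")
UNSET_RE = re.compile(r"^# (CONFIG_\w+) is not set$")


def read_options(config_text, wanted=SUSPECTS):
    """Map each wanted option to its value ('y', 'm', ...),
    'n' if explicitly unset, or 'absent' if never mentioned."""
    seen = {}
    for raw in config_text.splitlines():
        line = raw.strip()
        m = SET_RE.match(line)
        if m:
            seen[m.group(1)] = m.group(2)
            continue
        m = UNSET_RE.match(line)
        if m:
            seen[m.group(1)] = "n"
    return {name: seen.get(name, "absent") for name in wanted}


# Hypothetical .config fragment, for illustration only.
SAMPLE_CONFIG = """\
CONFIG_SMP=y
CONFIG_PREEMPT=y
CONFIG_X86_LOCAL_APIC=y
CONFIG_ACPI=y
# CONFIG_APM is not set
"""

if __name__ == "__main__":
    for name, value in read_options(SAMPLE_CONFIG).items():
        print(f"{name}: {value}")
```

On a real box the text would typically come from /usr/src/linux/.config, or from /proc/config.gz if the kernel was built with that feature. Saving the output next to each test kernel makes it easier to remember which combination was running when a lockup happened.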
noup
l33t


Joined: 21 Mar 2003
Posts: 917

PostPosted: Sat Feb 26, 2005 3:57 am    Post subject: Reply with quote

OK, some "results", at last...
Well, I think the problem is solved. :D I waited a long time before saying this, because I always had the feeling that the moment I said it my computer would crash :o (yeeepie, it didn't! 8) ).
Now to the technical details: I tried disabling apic/acpi/hyperthreading/smp/preempt both in the kernel and in the BIOS, and the computer ran rock solid for a day and a half. Then, all of a sudden, I luckily came across this thread which, to my surprise, described everything I had experienced!
So the problem was in fact ndiswrapper, which I fixed by upgrading to the latest unstable version (which is much more stable than the stable one :D ). Afterwards, I enabled acpi/apic/everything again, and all is working splendidly. ;)
Man... it feels so good not to have kernel panics :wink:
My thanks to everyone for all the help, this forum rocks! :P