Neilo n00b
Joined: 15 Apr 2005 Posts: 15 Location: Shiremoor, UK
|
Posted: Sun Jul 17, 2005 3:07 pm Post subject: Kernel Panics suddenly start to happen... |
|
|
Today my machine started crashing randomly: it halts as if someone hit the "pause" button, with whatever was on the screen left intact. Occasionally it also hangs at the initial BIOS screen, and one time it hung while booting and left me a friendly note:
cpu 0: machine check exception: 4 bank 4: b200001000010c0f
tsc 1872ea1dc0
kernel panic - not syncing: machine check
Does anyone here know what this means? I originally thought the crashes were caused by my graphics card, but now I think it could be my RAM. I'm running memtest86 at the moment to find out. Any help appreciated.
System: AMD64 3000+ Newcastle, 2x512 PC2700 Corsair, Nvidia GF6600GT, Abit KV8Pro Motherboard, running Gentoo, kernel 2.6.11.
_________________
My All-purpose Gaming PC and Web/fileserver: A64 3000+ on a Abit NF8, 2x512MB Corsair RAM, Geforce 6600GT. Running on Gentoo, formatted my Windows partition in a fit of rage. |
NeddySeagoon Administrator
Joined: 05 Jul 2003 Posts: 54810 Location: 56N 3W
|
Posted: Sun Jul 17, 2005 3:41 pm Post subject: |
|
|
Neilo,
Sounds like hardware problems or overheating.
Is it any better if you take the lid off?
Can you see the fans running?
_________________
Regards,
NeddySeagoon
Computer users fall into two groups:-
those that do backups
those that have never had a hard drive fail. |
Apopatos Guru
Joined: 17 Oct 2004 Posts: 512 Location: Hellas
|
Posted: Sun Jul 17, 2005 3:41 pm Post subject: |
|
|
As far as I know, machine check is a kernel option that automatically halts the system when the CPU reports a hardware fault, for example when it overheats.
Check the temperatures in your BIOS; maybe the fan is faulty and your processor is running at high temperatures. |
Neilo n00b
Joined: 15 Apr 2005 Posts: 15 Location: Shiremoor, UK
|
Posted: Sun Jul 17, 2005 3:45 pm Post subject: |
|
|
My system temperature is 45 °C and the CPU temperature is 47 °C, read in the BIOS just after the panic WITH the case open. All fans are running at full speed: CPU, case rear, case front, graphics card, PSU and hard drive fans. The room is pretty warm, although the PC has survived hotter times. memtest86 hung the first time through; I'm running it again to check the memory over. Any suggestions on what I should do are welcome, ideally without having to buy water cooling. Also, the CPU has not been overclocked; it's been left untweaked. |
NeddySeagoon Administrator
Joined: 05 Jul 2003 Posts: 54810 Location: 56N 3W
|
Posted: Sun Jul 17, 2005 4:23 pm Post subject: |
|
|
Neilo,
With the power off at the wall socket, make sure the memory modules and PCI cards are properly seated.
If you can remove one or more sticks of memory and the PC will still work, try each stick on its own.
Do not attempt to reseat the processor. You will need new heatsink compound if you disturb it. |
Neilo n00b
Joined: 15 Apr 2005 Posts: 15 Location: Shiremoor, UK
|
Posted: Sun Jul 17, 2005 7:38 pm Post subject: |
|
|
It seems to be OK now. I've set up a fan to pull air in from outside and blow it in the general direction of the PC, and my room has cooled down a fair bit. It *seems* to be stable again and memtest seemed fine, but I'm going to invest in better cooling for my system: a bigger CPU heatsink and some RAM with a heatsink. Better safe than sorry; it's the only decent working system I have. |
Neilo n00b
Joined: 15 Apr 2005 Posts: 15 Location: Shiremoor, UK
|
Posted: Tue Jul 19, 2005 8:49 am Post subject: |
|
|
OK, it's started happening again... but I managed to catch what one of the kernel panics said on boot, and fed it into mcelog:
CPU 0 4 northbridge TSC 1872ea1dc0
Northbridge CRC error
link number = 1
bit57 = processor context corrupt
bit61 = error uncorrected
bus error 'local node observed, request didn't time out
generic error mem transaction
generic access, level generic'
STATUS b200001000010c0f MCGSTATUS 4
Kernel panic - not syncing: Machine check
I really hope this doesn't mean a new motherboard. Am I right in thinking I need a new mobo, or should I stick a bigger cooler on my northbridge?
Edit: Ordered a chipset cooler with a fan for my NB, and an exhaust for my GFX card. Let's see if this solves the problem. |
Apopatos Guru
Joined: 17 Oct 2004 Posts: 512 Location: Hellas
|
Posted: Tue Jul 19, 2005 1:10 pm Post subject: |
|
|
Try lowering the RAM frequency in the BIOS. I had a motherboard with similar problems, and I believe it will work after that. The RAM will run at a lower frequency, but at least it will work. |
NeddySeagoon Administrator
Joined: 05 Jul 2003 Posts: 54810 Location: 56N 3W
|
Posted: Tue Jul 19, 2005 6:21 pm Post subject: |
|
|
Neilo,
The Northbridge mediates the CPU/memory transfers. The message means that a transfer went wrong somewhere, and that the error was detected and found to be uncorrectable.
That narrows it down to the CPU, the Northbridge, the memory, or maybe the PSU.
Unfortunately, the message does not appear to give the address of the failed transaction.
Code: | bit57 = processor context corrupt | means the processor state may have been corrupted by the error, so the kernel cannot safely carry on and panics instead.
Unless you have ECC memory and a motherboard that makes use of it, most errors like this will go undetected. You may find you get things crashing with signal 11 (SIGSEGV) too. |
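For anyone who wants to check the mcelog decoding by hand, here is a minimal sketch in Python. It assumes the standard x86 machine-check status layout (flag bits 57 to 63, and the bus-error code pattern 0000 1PPT RRRR IILL in the low 16 bits). The function and table names are made up for illustration and are not part of mcelog, and the northbridge-specific fields (CRC error, link number) are deliberately left out.

```python
# Sketch only: decode the generic flag bits and the bus-error code of an
# x86 machine-check status word. All names here are hypothetical helpers.

FLAG_BITS = {
    63: "VAL (status valid)",
    62: "OVER (an earlier error was overwritten)",
    61: "UC (error uncorrected)",
    60: "EN (error reporting enabled)",
    59: "MISCV (MISC register valid)",
    58: "ADDRV (ADDR register valid)",
    57: "PCC (processor context corrupt)",
}

# Field encodings of the bus-error code 0000 1PPT RRRR IILL
PARTICIPATION = ["local node originated", "local node responded",
                 "local node observed", "generic"]
ACCESS = ["memory", "reserved", "I/O", "generic"]
LEVEL = ["level 0", "level 1", "level 2", "generic"]


def decode_status(status: int) -> dict:
    """Return the set flag names and, for bus errors, the decoded code fields."""
    flags = [name for bit, name in FLAG_BITS.items() if (status >> bit) & 1]
    code = status & 0xFFFF
    result = {"flags": flags, "error_code": code}
    # Bits 15..11 equal to 00001 mark a bus/interconnect error.
    if code & 0xF800 == 0x0800:
        result["bus_error"] = {
            "participation": PARTICIPATION[(code >> 9) & 3],  # PP
            "timed_out": bool((code >> 8) & 1),               # T
            "request": (code >> 4) & 0xF,                     # RRRR, 0 = generic error
            "access": ACCESS[(code >> 2) & 3],                # II
            "level": LEVEL[code & 3],                         # LL
        }
    return result


# STATUS value taken from the mcelog output earlier in this thread.
info = decode_status(0xB200001000010C0F)
```

For the STATUS value in this thread, `info["flags"]` contains the VAL, UC, EN and PCC entries but not ADDRV, which fits the observation that no address was reported for the failed transaction. The bus-error part comes out as "local node observed", not timed out, generic access, matching the mcelog text.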
Neilo n00b
Joined: 15 Apr 2005 Posts: 15 Location: Shiremoor, UK
|
Posted: Wed Jul 20, 2005 11:52 pm Post subject: |
|
|
Lowering it by a few MHz didn't work at all. I could only drop it by 4 MHz, from 204 MHz to 200 MHz. I'm worried; I hope it's just my motherboard. |