Author |
Message |
starnix Guru
Joined: 02 Mar 2003 Posts: 530
|
Posted: Wed May 12, 2004 1:59 pm Post subject: Computer issues (SOLVED) |
|
|
About a week ago I noticed some strange behavior on my Gentoo system. I was ripping a CD and the computer just spontaneously rebooted. When it came back up I started ripping a CD again and a few minutes later the system just shut down. I ran chkrootkit and it looked like either logfile corruption or someone deleting logs. So to play it safe I started to reinstall Gentoo. Now last night I was emerging system when it just shut down again. I am beginning to think it's a hardware issue.
Now the problem is, what could it be? As far as I know, bad RAM would cause it to lock up, not reboot, and the power supply is only about 3 months old and is more than big enough to support my system. Could the CPU overheating cause this? I am thinking that maybe the heatsink on the CPU is clogged or something and causing it to overheat. Could that be the case?
I installed Fedora just to get a system up and running quickly while I planned the Gentoo reinstall, and it shut down with Fedora too.
Anyone have any ideas?
Thanks
Last edited by starnix on Fri May 14, 2004 1:51 pm; edited 1 time in total |
|
|
|
Dr_Claw n00b
Joined: 26 Dec 2003 Posts: 8 Location: en_GB
|
Posted: Wed May 12, 2004 4:15 pm Post subject: |
|
|
Faulty RAM can cause a computer to just reboot, not only lock up. I had this problem a while ago on a PC of mine and traced it down to a faulty RAM chip.
So, before checking whether anything else is wrong with the hardware, I think the best thing to start with is to check the RAM. http://www.memtest86.com/ is one of the standard ways of doing this. Just download the utility and use it as detailed, and it'll tell you if your RAM is OK or knackered!
Hope that's of some help... |
|
|
|
starnix Guru
Joined: 02 Mar 2003 Posts: 530
|
Posted: Wed May 12, 2004 4:18 pm Post subject: |
|
|
Thanks. I was hoping it was not the RAM, since it is only about 6 months old. |
|
|
|
robmoss Retired Dev
Joined: 27 May 2003 Posts: 2634 Location: Jesus College, Oxford
|
Posted: Wed May 12, 2004 4:25 pm Post subject: |
|
|
Also, don't assume that because it passes the first round of tests, everything is okay - this is not necessarily the case! Let it run through about 10 times, and if it's still all clear, it's almost certainly not the RAM (unless it's magically healed itself).
The 2.6 kernel allows you to exclude certain areas of RAM from use, and memtest86 will even output the command you need to pass to your bootloader in order to do this. So you shouldn't need to buy any new RAM.
_________________
Reality is for those who can't face Science Fiction.
emerge -U will kill your Gentoo
ecatmur, Lord of Portage Bash Scripts |
|
|
|
starnix Guru
Joined: 02 Mar 2003 Posts: 530
|
Posted: Wed May 12, 2004 5:15 pm Post subject: |
|
|
Thanks!!! |
|
|
|
Souperman Guru
Joined: 14 Jul 2003 Posts: 449 Location: Cape Town, South Africa
|
Posted: Thu May 13, 2004 12:58 pm Post subject: |
|
|
And to answer part of your original question: yes, overheating can cause this. Take a look in your BIOS setup program; the shutdown temperature threshold is usually there.
_________________
moo |
|
|
|
starnix Guru
Joined: 02 Mar 2003 Posts: 530
|
Posted: Thu May 13, 2004 2:20 pm Post subject: |
|
|
Well, I looked at my BIOS and the CPU Temp Warning is disabled. However, here's the thing: my computer ran a memory test all night without shutting down. As soon as I try to start compiling something it will shut off in about 5 minutes. I looked in the BIOS and the temp seems to be between 65 and 75. Is that too hot? Now it seems like it will shut down at about 70, because last time it shut off I rebooted quickly into the BIOS and checked the temp and it was at 63. Do you think my chip could be damaged? I blew everything out last night with compressed air; my CPU heatsink had a small cat stuck in the fins and my system case was warm to the touch. What is the normal temp for an Athlon XP 1700+ to be running at? Even with a completely cleaned-out system it still shuts off. I am thinking the chip is damaged now. Any thoughts? |
|
|
|
lbrtuk l33t
Joined: 08 May 2003 Posts: 910
|
Posted: Thu May 13, 2004 2:41 pm Post subject: |
|
|
The simplest explanation is that your PSU is malfunctioning. I had this problem years ago. Drove me mad. It was the PSU in my case. |
|
|
|
starnix Guru
Joined: 02 Mar 2003 Posts: 530
|
Posted: Thu May 13, 2004 2:56 pm Post subject: |
|
|
It is only about 6 months old, 430 W. Also, why would it run fine for 8 hours just idling and then shut down within 5 minutes when I compile? |
|
|
|
grepcomputers Guru
Joined: 16 Sep 2003 Posts: 375
|
Posted: Thu May 13, 2004 3:06 pm Post subject: |
|
|
75C is approaching 170F (sorry, I still don't properly think in metric...yet). That is pretty damn hot. What CPU are you running? I have a P4, and while the case has pretty good airflow, the computer is in a corner partly covered by a blanket (I really don't like the noise at night). The moral of this story? I don't think my CPU has ever gone over 130F, even during long compiles or long gaming sessions, and that CPU is also overclocked and *slightly* overvolted, so it runs hotter than normal.
AFAIK, CPU Temp Warning is just an alarm that beeps when your CPU gets too hot. I *really* would suggest turning this on; it will alert you to shut your computer down if the CPU is reaching potentially damaging heat levels. (My old roommate messed up the heatsink on his Athlon Classic once. When he turned his computer on we started to smell something burning and then the warning tone went off. He turned off the computer and fixed the problem, thus saving the CPU.) However, this could depend on your motherboard...
cheers...
...grep |
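Several posts in this thread switch between Celsius and Fahrenheit, so here is a minimal sketch of the conversion in plain shell arithmetic. The function name c_to_f is made up for illustration, and it only handles whole, non-negative degrees.

```shell
#!/bin/sh
# c_to_f: hypothetical helper, not something from the thread.
# Converts whole degrees Celsius to Fahrenheit, rounded to the nearest degree.
c_to_f() {
    # F = C * 9 / 5 + 32. Adding 2 before the integer division by 5
    # rounds to the nearest whole degree for non-negative input.
    echo $(( ($1 * 9 + 2) / 5 + 32 ))
}
```

Running `c_to_f 75` prints 167, which lines up with the "approaching 170F" figure above.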
|
|
|
starnix Guru
Joined: 02 Mar 2003 Posts: 530
|
Posted: Thu May 13, 2004 3:10 pm Post subject: |
|
|
I have an nForce2 motherboard, maybe 6 months old. So the RAM is about 5 months old, and the power supply and motherboard are about 6 months old. The CPU is about 2 - 3 years old. And it did seem to be pretty hot when I shut it off last night and pulled it apart. I blew out all the fans and heatsinks and tried again and it STILL shuts off when compiling. But running memtest86 it stayed up over 10 hours. So the problem is when the CPU gets racing. I first noticed it when ripping and encoding a CD, then when compiling. Do you think the CPU needs replacing? An XP 3200 would be nice.
|
|
|
|
grepcomputers Guru
Joined: 16 Sep 2003 Posts: 375
|
Posted: Thu May 13, 2004 4:12 pm Post subject: |
|
|
I'm afraid I've never owned a CPU long enough to have it go bad on me. I am not entirely sure if randomly shutting down could be a sign of a bad CPU. One problem is that it depends on how the motherboard handles particular errors. A bad CPU might throw some kind of error, and some motherboards may shut down the system...other systems may just hard lock...
One thing you could try: open up the case, have a big fan blow air into it, then try doing something CPU-intensive. If it still shuts down, either your heatsink is really badly installed, or this is not a heat issue.
I am not sure about AMD chips, but running that hot for a while might be enough to damage the chip.
Do you have a spare, compatible CPU you could try in the existing machine? If it runs properly under load, you've isolated your problem.
cheers... |
|
|
|
Souperman Guru
Joined: 14 Jul 2003 Posts: 449 Location: Cape Town, South Africa
|
Posted: Thu May 13, 2004 5:45 pm Post subject: |
|
|
Not sure if this will help you, but I have a 2600+ and the temperatures on a hot day (about 35C outside, 28C or so inside) are between about 47C idle and 52C under 1h+ full load (encoding a DivX). I have a pretty decent airflow, so my figures may not be representative of the average XP CPU, but I do know that a friend of mine has the same CPU as I do and his runs hotter than mine by about 3 or 4 degrees.
Off-topic: anyone know how I can get the little "degrees" symbol? You know ... the little superscript circle thingum.
_________________
moo |
|
|
|
starnix Guru
Joined: 02 Mar 2003 Posts: 530
|
Posted: Thu May 13, 2004 5:59 pm Post subject: |
|
|
Well, it has run fine for about 6 months without doing this. And just as a note, the weather here has gotten much warmer in the last few days. Also, last night I started compiling and the whole time was actually blowing compressed air over the heatsink to keep it cool. It still shut down. My buddy was thinking it could be a disk I/O problem. He said I should try booting a DOS boot disk, because DOS won't idle the CPU but also won't access the hard drives, and if it still dies it is most likely the CPU. What do you think of that idea? |
|
|
|
shiftless Tux's lil' helper
Joined: 08 Oct 2003 Posts: 128
|
Posted: Thu May 13, 2004 6:31 pm Post subject: |
|
|
Since it keeps dying under load, my guess would be the power supply or overheating. The harder your CPU (and the computer in general) works, the more power it will draw. A bad power supply can often work fine under a light load, but the more draw put on it, the more unstable it will become...
You may also want to emerge lm-sensors. It will show you what temperature the hardware sensors are reporting. Checking them in the BIOS after a reboot is usually not a good indicator.
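Since a reading taken in the BIOS after a reboot says little about the temperature at the moment of shutdown, one option is to log readings to disk while the load runs. The sketch below assumes lm-sensors is already installed and configured so that the `sensors` command works; the function name log_temps, the SENSORS_CMD override, and the log path in the usage line are hypothetical.

```shell
#!/bin/sh
# log_temps LOGFILE [INTERVAL] [COUNT]
# Hypothetical helper: appends COUNT timestamped sensor readings to LOGFILE,
# one every INTERVAL seconds, and flushes to disk after each reading so the
# last one survives a sudden power-off.
# SENSORS_CMD defaults to the lm-sensors "sensors" program (assumed installed).
log_temps() {
    logfile=$1
    interval=${2:-5}
    count=${3:-720}
    i=0
    while [ "$i" -lt "$count" ]; do
        {
            date '+%Y-%m-%d %H:%M:%S'
            ${SENSORS_CMD:-sensors}
        } >> "$logfile"
        sync    # push the reading out of the write cache
        i=$((i + 1))
        if [ "$i" -lt "$count" ]; then
            sleep "$interval"
        fi
    done
}
```

For example, starting `log_temps /var/log/cpu_temps.log 5 720 &` before a compile would record roughly an hour of readings, and the last entry in the file after an unexpected shutdown shows how hot things were just before it.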
I would also say that it sounds like your computer is running pretty hot... Right now, with a room temperature of 88F (31C), my computer will get up to around 110F (43C) inside the case and 154F (68C) at the CPU when it is under load. On a cooler day, it's hard for me to get it to go above 140F (60C). This is with an Athlon XP 2600+.
For an Athlon, I believe the cutoff temp that you don't want to go above is 185F (85C). |
|
|
|
starnix Guru
Joined: 02 Mar 2003 Posts: 530
|
Posted: Thu May 13, 2004 7:26 pm Post subject: |
|
|
Well, lm-sensors is not an option since I can't even get an OS installed, let alone any programs. I will try swapping the power supply and see if that works. Would the CPU really draw that much more power at 100% than at idle? It is a 430 W power supply. I do have case lights. If I unhook those and run it, and it still dies, will that rule out the PSU? |
|
|
|
PowerFactor Veteran
Joined: 30 Jan 2003 Posts: 1693 Location: out of it
|
Posted: Thu May 13, 2004 7:59 pm Post subject: |
|
|
70°C is hotter than an Athlon XP 1700+ should run. I run a very quiet, relatively low-airflow system and my 1700+ never exceeds 55°C when compiling. And that's when it is in dire need of cleaning, like right now. When the heatsink is clean it maxes out around 47°C. Granted, I do have a heatsink designed for low airflow, but I'm not moving much air over it. Your CPU should be cooler than it is, IMO. Your CPU may or may not be failing, but that isn't what is causing it to run hot.
One way you could eliminate the disk I/O possibility is to do something like this: mount a tmpfs and copy some big tar.bz2 file to it.
Code: | # Create a RAM-backed mount point so the test never touches the disk
mkdir -p /mnt/tmp
mount -t tmpfs tmpfs /mnt/tmp
cp /usr/portage/distfiles/gcc-3.3.1.tar.bz2 /mnt/tmp |
Then bunzip that file to /dev/null repeatedly.
Code: | # Decompress from RAM to /dev/null forever; stop with Ctrl-C
while true
do
    bunzip2 < /mnt/tmp/gcc-3.3.1.tar.bz2 > /dev/null
done |
That will work the CPU quite hard without using the disk. You should be able to do something like that from the LiveCD as well. If the OS were installed you could also use cpuburn, but I most definitely would NOT recommend running that without having lm-sensors configured, as you might end up with your Athlon smoking if your cooling isn't up to snuff and you don't have the BIOS auto cutoff enabled. The bunzip2 trick shouldn't get it much, if any, hotter than compiling.
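A bounded variant of the loop above can make it easier to tell how long the machine survives under load. This is only a sketch: the function name cpu_load_passes, the LOAD_CMD override, and the log file are assumptions for illustration. Note that the one-line log write after each pass touches whatever filesystem the log lives on, so keep the log on a real disk if you want it to survive the shutdown (at the cost of a tiny write per pass), or on the tmpfs if you want no disk activity at all.

```shell
#!/bin/sh
# cpu_load_passes ARCHIVE [PASSES] [LOGFILE]
# Hypothetical helper: decompresses ARCHIVE to /dev/null PASSES times and
# records each finished pass, so the log shows how far the run got.
# LOAD_CMD defaults to bunzip2, as in the loop from the post.
cpu_load_passes() {
    archive=$1
    passes=${2:-100}
    logfile=${3:-cpu_load.log}
    n=1
    while [ "$n" -le "$passes" ]; do
        # Stop early if the decompressor itself reports an error
        ${LOAD_CMD:-bunzip2} < "$archive" > /dev/null || return 1
        echo "pass $n finished at $(date '+%H:%M:%S')" >> "$logfile"
        sync    # flush the log line before the next pass starts
        n=$((n + 1))
    done
}
```

With the tmpfs copy from the post in place, `cpu_load_passes /mnt/tmp/gcc-3.3.1.tar.bz2 200 /root/cpu_load.log` would be one way to run it; the /root path is just an example.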
About the only way you can rule out the PSU is to try another one, or to see the problem go away when you replace something else. However, if you unplug your lights and it stops shutting down, that would point to the PSU as the problem.
Souperman wrote: | Off-topic: anyone know how I can get the little "degrees" symbol? You know ... the little superscript circle thingum. | If you enable HTML in your posts then &deg; should get it for you. I usually just copy and paste it from somewhere, if I bother at all; that works just as well and there's no need to enable HTML in the post. |
|
|
|
starnix Guru
Joined: 02 Mar 2003 Posts: 530
|
Posted: Thu May 13, 2004 8:25 pm Post subject: |
|
|
I wonder if the silver thermal compound between the CPU and heatsink broke down or something. I will have to check when I get home. |
|
|
|
starnix Guru
Joined: 02 Mar 2003 Posts: 530
|
Posted: Fri May 14, 2004 1:51 pm Post subject: |
|
|
It's fixed. I replaced the CPU heatsink and fan. It's been compiling ever since. Thanks for all your help, guys / gals. |
|
|
|
|