zojas Veteran
Joined: 22 Apr 2002 Posts: 1138 Location: Phoenix, AZ
Posted: Tue Dec 23, 2003 1:21 pm Post subject: machine crashing, help me diagnose |
I have a 700 MHz Athlon Thunderbird with an Asus A7V motherboard. I built the whole system from new parts in the summer of 2000.
I bought good parts throughout, except for the RAM, which was cheap and generic, and the power supply, a 300 watt Athlon-certified unit that came with the cheap case.
This machine was solidly reliable until sometime this year.
It has started to lock up hard on me. I use it as a web and email server now, so I'm typically not logged in at the console. What happens is that it stops responding to network activity; it won't answer a ping or anything.
Then when I turn on the monitor, the monitor indicates that it's plugged in to a working computer (the light turns solid green) but the screen stays black. I hit keys on the keyboard to get the display back, but it never comes back on. (I have the kernel option enabled that blanks the text console using APM.)
It first did this about 6 months ago and has done it roughly once a month since, but it happened this morning and it happened last week, so I think it is becoming more frequent.
I typically run Folding@home on it all the time, but I have turned that off as of this morning.
How would I go about diagnosing this? I believe it's hardware-related: it has occurred with multiple kernels, probably 4 or 5 versions of 2.4 and now 2.6.0.
Once last week it happened right after booting into 2.6.0 for the first time, but this morning it was just sitting there.
For peripherals, I have a parallel-port Zip drive, one DVD burner, a floppy drive, two hard drives (a 120 GB Western Digital and a 20 GB Quantum), an old PCI Rage 128, an Ethernet card, and a Sound Blaster Live card.
I don't think it's heat-related; it's the coldest part of the year here (though I don't think the room it's in gets below 70 deg F).
Any suggestions welcome.
_________________
http://www.desertsol.com/~kevin/ppc
krusty_ar Guru
Joined: 03 Oct 2002 Posts: 560 Location: Rosario, Argentina
Posted: Tue Dec 23, 2003 1:29 pm Post subject: |
Maybe your motherboard is dying. Check that all the fans that are supposed to be running actually are. I had a similar problem that went away after taking the whole box apart and reassembling it. There was also a problem with a dirty fan that was drawing either too much or too little power; the mobo thought it was borked, so the box would freeze to "protect the CPU".
_________________
I am Beta, don't expect correct behaviour from me.
Take part of the adopt an unanswered post initiative
zojas Veteran
Joined: 22 Apr 2002 Posts: 1138 Location: Phoenix, AZ
Posted: Tue Dec 23, 2003 2:08 pm Post subject: |
Good suggestion, I'm sure my fans are all crusted up. I'll clean them all and see. That's a good start.
MasterX Veteran
Joined: 26 Jun 2003 Posts: 1165
Posted: Tue Dec 23, 2003 4:36 pm Post subject: |
Are you doing any kind of overclocking?
Asus allows you to do this. I believe it is worth spending some time in your motherboard's BIOS.
zojas Veteran
Joined: 22 Apr 2002 Posts: 1138 Location: Phoenix, AZ
loudawg n00b
Joined: 15 Oct 2003 Posts: 18
Posted: Fri Jan 02, 2004 8:58 am Post subject: |
Take a very close look at the capacitors on your motherboard. There's a good possibility that they may be going bad. I had this happen to me on my motherboard. You can identify a problem by seeing if any of the capacitors are bulging and/or leaking at the top. If so, chances are this is your problem. Just a thought...
Mace68 n00b
Joined: 05 Dec 2003 Posts: 33
Posted: Fri Jan 02, 2004 9:44 am Post subject: |
These are all good suggestions.
In my experience, faulty RAM, overheating, and Windows (not an issue here) have all been responsible for this behavior. You should probably test your RAM somehow, especially if you bought cheap stuff. I think the Gentoo LiveCD comes with a memory test, if my memory serves me correctly. I've also had fairly good results with DocMemory, which can be downloaded from http://www.simmtester.com/page/products/doc/docinfo.asp. Also, as krusty_ar suggested, check your fans and make sure they are spinning properly (good and fast) to rule out any overheating issues, and blow the dust off everything, since dust holds heat in and can cause overheating as well.
I'll post again if I can think of anything else (too late to think right now).
HTH
zojas Veteran
Joined: 22 Apr 2002 Posts: 1138 Location: Phoenix, AZ
Posted: Sat Jan 03, 2004 12:12 am Post subject: |
I opened up the machine and blew all the dust out with a can of compressed air (with the vacuum cleaner sucking the dust out of the air).
I couldn't see any obviously toasted capacitors.
I pulled out all the cheap RAM and put in a single 512 MB DIMM from Crucial (bought through the link at gentoo.org, so hopefully they got a kickback).
I'll report back if it makes it to a week of uptime or if it crashes again. :)
tomk Bodhisattva
Joined: 23 Sep 2003 Posts: 7221 Location: Sat in front of my computer
Posted: Sat Jan 03, 2004 2:45 pm Post subject: |
Try memtest86 for testing the RAM http://www.memtest86.com/ you can also
It's really good at testing RAM.
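For anyone curious what a RAM tester is doing in principle, here is a rough Python sketch of the basic idea: write a known bit pattern, read it back, and report any byte that changed. This is not memtest86's actual code, and the function names (`find_mismatches`, `pattern_test`) are made up for illustration. A script like this only touches whatever memory the OS hands it and goes through the CPU cache, so it is no substitute for booting memtest86 itself.

```python
def find_mismatches(buf, pattern):
    """Return (offset, expected, actual) for every byte that differs from pattern."""
    return [(i, pattern, b) for i, b in enumerate(buf) if b != pattern]


def pattern_test(size_bytes=1 << 20, patterns=(0x00, 0xFF, 0x55, 0xAA)):
    """Fill a buffer with each pattern in turn and read it back.

    The patterns are all-zeros, all-ones, and the two alternating-bit
    bytes, chosen here as a plausible assumption, not taken from memtest86.
    """
    buf = bytearray(size_bytes)
    errors = []
    for pattern in patterns:
        buf[:] = bytes([pattern]) * size_bytes       # write pass
        errors.extend(find_mismatches(buf, pattern))  # read-back pass
    return errors


if __name__ == "__main__":
    bad = pattern_test()
    if bad:
        print(f"{len(bad)} mismatches, first at offset {bad[0][0]}")
    else:
        print("no mismatches found")
```

A clean result from something like this proves very little; a real tester covers all of physical memory and uses far nastier access patterns.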
Other than that, it's just a case of stripping the computer down to the bare minimum it needs and testing each thing one by one until you find your problem. Once you've tested everything else, it's probably your motherboard.
_________________
Search | Read | Answer | Report | Strip
zojas Veteran
Joined: 22 Apr 2002 Posts: 1138 Location: Phoenix, AZ
Posted: Tue Jan 06, 2004 12:33 am Post subject: |
I put all the old RAM from the crashing server in my other machine (a crappy Gateway motherboard with a 1 GHz PIII) and ran memtest86.
The 512 MB DIMM seems to fail. But in the past I tried to put a 512 MB DIMM in this machine and it wouldn't boot. This time it boots and the BIOS sees it, but memtest86 barfs all over it.
So anyway, I've pulled out the suspected bad 512 MB DIMM.
memtest86 seems happy with the other two modules (a 256 and a 128), so I plan to use those two plus another 256 MB DIMM I already had in this crappy Gateway machine for my desktop (640 MB RAM total) and leave the new 512 MB DIMM from Crucial by itself in the server (which used to crash but hasn't crashed once since I put the new RAM in on Fri, Jan 2).
So the problem so far seems to be a bad RAM module. I think it's at least 1.5 years old and generic. I didn't know they could go bad; I thought they either died within the first few hours or lived forever. That's what I get for buying generic RAM, I guess. Never again!
Mace68 n00b
Joined: 05 Dec 2003 Posts: 33
Posted: Tue Jan 06, 2004 8:07 am Post subject: |
Glad to hear your server's working so far. I'll have to check out memtest86 and add it to my arsenal, since I'm probably moving to Linux forever. The "died within the first few hours or lived forever" scenario is often the case with good-quality stuff. The cheap stuff seems to be more prone to erratic behavior. I find this to be true of all electronic equipment. Just my experience, though.