Author |
Message |
johntramp Guru
Joined: 03 Feb 2004 Posts: 457 Location: New Zealand
|
Posted: Sun Oct 03, 2004 7:17 pm Post subject: Computer repeatedly crashes. |
|
|
Hi, I am not sure what is causing this, but every so often my computer will crash to the point where nothing responds and ssh is unreachable. It is happening fairly often but I can see no pattern to it. I can compile programs for hours with no problem and then it will crash when using the internet with no other processes running. It has happened from a Knoppix live CD as well as my Gentoo install, so that means it is a hardware fault, right? I have run memtest86 for 13 hours with no errors; is this test 100% reliable? Are there any other similar tests or anything else I can try to figure out the problem?
Thanks
-John |
|
nlightn Apprentice
Joined: 16 Sep 2003 Posts: 171
|
Posted: Sun Oct 03, 2004 7:35 pm Post subject: |
|
|
What kind of hardware are you running? What kind of "crash" are you experiencing? Hard locks? Random power-offs? I'd be particularly suspicious of the power supply and/or hard disk. |
|
solarium_rider Tux's lil' helper
Joined: 23 Jun 2003 Posts: 88 Location: San Francisco
|
Posted: Sun Oct 03, 2004 8:11 pm Post subject: |
|
|
I too have been experiencing this lately. Sometimes it'll happen while idle, sometimes while compiling, sometimes while browsing. I can't really seem to figure it out. I've had a few crashes that actually spit out errors, and the error was related to an "Interrupt," so I'm thinking it's kernel related. I didn't write it down, so I forget the exact error message.
Last night I went to recompile firefox and checked it about 15 minutes later, and the video was completely corrupted and everything was locked up. Typically when it locks the CPU also shoots to 100%, which isn't too weird I suppose for a lock. If I have sound playing, sometimes it will keep playing for a short bit but eventually stop. I'm not sure how long though; I usually fall asleep streaming music and sometimes I'll wake up and it'll be dead.
I think we should compile a list of people having the same issues and their relevant hardware/software configurations and see if we can find a pattern.
Hardware:
cpu - x86 athlon xp
video - nvidia
sound - audigy
input devices - gravis joystick, usb mouse
other - usb printer (though it was crashing before this was connected)
Software:
kernel - 2.6.8.1-ck8 (w/ power management enabled)
X - xorg-x11 6.7.0-r2
video - nvidia drivers
wm - fluxbox
browser - firefox
configs:
CFLAGS="-march=athlon-xp -m3dnow -msse -mfpmath=sse -mmmx -Os
-pipe -fforce-addr -fomit-frame-pointer -frerun-cse-after-loop
-frerun-loop-opt -maccumulate-outgoing-args -ffast-math"
USE="3dnow mmx -nls sse -kde -gnome tiff -arts alsa mozilla dvd dvdr divx4linux
joystick xvid directfb fbcon -cups -wmf -tcpd -esd cups ppds usb"
I also have the "cool" bit set on the kt333 chipset to allow the HALT cpu cmd when idling.
I'm quite suspicious of the kernel/kernel configuration; it seems 2.6.8 and above started crashing a lot. |
|
nxsty Veteran
Joined: 23 Jun 2004 Posts: 1556 Location: .se
|
Posted: Sun Oct 03, 2004 8:25 pm Post subject: |
|
|
Update your kernel to ck9! Ck8 has a bug that can cause crashes. |
|
johntramp Guru
Joined: 03 Feb 2004 Posts: 457 Location: New Zealand
|
Posted: Sun Oct 03, 2004 10:17 pm Post subject: |
|
|
I have had some problems in the past with this RAM, but it started working again so I left it at that. My setup is as follows...
Linux odysseus 2.6.9-rc1-love1 #2 SMP Fri Oct 1 15:06:34 NZST 2004 i686 AMD Athlon(tm) XP 2600+ AuthenticAMD GNU/Linux
Albatron KX600s Pro motherboard
athlonxp 2600+
nvidia gfx
onboard via sound
x-org
fluxbox
firefox
...so quite similar to yours, solarium_rider.
The thing is, though, it has happened when running off Knoppix as well, so I think it has nothing to do with my software.
I am going to try installing something like Debian sid to see if the problem is still there. |
|
jschellhaass Guru
Joined: 20 Jan 2004 Posts: 341
|
Posted: Sun Oct 03, 2004 10:47 pm Post subject: |
|
|
What does cat /proc/interrupts show (any conflicts with the video card)?
You may want to try with acpi=off.
jeff |
|
Mben Guru
Joined: 29 Mar 2004 Posts: 465 Location: New York, USA
|
Posted: Sun Oct 03, 2004 10:58 pm Post subject: |
|
|
Do you have any dust in your case? Overheating gets me every few months if I don't blow out my case (I have cats).
Good luck |
|
firephoto Veteran
Joined: 29 Oct 2003 Posts: 1612 Location: +48° 5' 23.40", -119° 48' 30.00"
|
Posted: Sun Oct 03, 2004 11:37 pm Post subject: |
|
|
I just read about the ck8 problem on Con's mailing list. It seems that was what hit me at random sometimes. I also had this happen with ck7 once or twice, but with nothing in the logs, just a lockup. I guess it's related to having preempt turned on in the kernel. Is preempt good or bad? My system seems "quicker" with preempt on.
I ended up going back to plain 2.6.8.1 without reiser4 for now, since the newer rc's have nvidia issues. :( |
|
johntramp Guru
Joined: 03 Feb 2004 Posts: 457 Location: New Zealand
|
Posted: Mon Oct 04, 2004 12:05 am Post subject: |
|
|
jschellhaass wrote: | What does cat /proc/interrupts show (any conflicts with the video card)?
You may want to try with acpi=off.
jeff |
Quote: | [root /home/john]$ cat /proc/interrupts
CPU0
0: 224314 IO-APIC-edge timer
1: 127 IO-APIC-edge i8042
9: 0 IO-APIC-level acpi
12: 7647 IO-APIC-edge i8042
14: 11429 IO-APIC-edge ide0
15: 1435 IO-APIC-edge ide1
17: 2413 IO-APIC-level eth0
20: 1241 IO-APIC-level libata
21: 0 IO-APIC-level ehci_hcd, uhci_hcd, uhci_hcd, uhci_hcd, uhci_hcd
22: 0 IO-APIC-level via82cxxx
NMI: 0
LOC: 224316
ERR: 0
MIS: 0 |
How would I go about trying with acpi=off? What does that mean / do? |
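For anyone who wants to check output like the listing above for shared interrupt lines without eyeballing it, here is a minimal Python sketch. It assumes the 2.6-era /proc/interrupts layout shown in this post (per-CPU counts, then a single controller field such as IO-APIC-level, then a comma-separated device list). The function name `shared_irqs` and the `SAMPLE` excerpt are only for illustration. Sharing is not necessarily a conflict, since level-triggered PCI interrupts can be shared legitimately, so treat the result as a list of places to look rather than a diagnosis.

```python
# A few rows copied from the output quoted in this post.
SAMPLE = """\
           CPU0
  0:     224314    IO-APIC-edge  timer
 17:       2413   IO-APIC-level  eth0
 20:       1241   IO-APIC-level  libata
 21:          0   IO-APIC-level  ehci_hcd, uhci_hcd, uhci_hcd, uhci_hcd, uhci_hcd
NMI:          0
"""


def shared_irqs(interrupts_text):
    """Return {irq_number: [device, ...]} for IRQs that list more than one device.

    Assumes one controller field (e.g. IO-APIC-level) between the counters
    and the device list, as in the 2.6 output above.
    """
    shared = {}
    for line in interrupts_text.splitlines():
        label, sep, rest = line.partition(":")
        label = label.strip()
        if not sep or not label.isdigit():
            continue  # skip the CPU header and the NMI/LOC/ERR/MIS rows
        fields = rest.split()
        i = 0
        while i < len(fields) and fields[i].isdigit():
            i += 1  # step over the per-CPU interrupt counters
        # fields[i] is the controller type; everything after it names devices
        device_text = " ".join(fields[i + 1:])
        devices = [d.strip() for d in device_text.split(",") if d.strip()]
        if len(devices) > 1:
            shared[int(label)] = devices
    return shared


if __name__ == "__main__":
    for irq, devices in shared_irqs(SAMPLE).items():
        print(irq, devices)
```

To run it against a live system, pass in the contents of /proc/interrupts instead of `SAMPLE`, for example `shared_irqs(open("/proc/interrupts").read())`.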
|
johntramp Guru
Joined: 03 Feb 2004 Posts: 457 Location: New Zealand
|
Posted: Mon Oct 04, 2004 12:06 am Post subject: |
|
|
Mben wrote: | Do you have any dust in your case? Overheating gets me every few months if I don't blow out my case (I have cats).
Good luck |
I don't think it is heat, because my BIOS has a warning / alarm ~5 degrees before the hard shutdown. I will look into that though, maybe set up lm_sensors or whatever it is that measures the temps. |
|
jschellhaass Guru
Joined: 20 Jan 2004 Posts: 341
|
Posted: Mon Oct 04, 2004 1:04 am Post subject: |
|
|
I don't see nvidia listed anywhere. If you run cat /proc/interrupts within a terminal under X, the nvidia card should show up on one of the interrupts. I'm just wondering if you have an IRQ conflict between the nvidia card and something else.
In order to boot without acpi add acpi=off to the kernel line of the boot manager. In grub.conf it would be something like this.
Code: |
kernel /bzImage-2.6.8-gentoo-r5 root=/dev/hde3 vga=791 acpi=off
|
jeff |
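After rebooting with the edited kernel line, it can be worth confirming that the option actually reached the kernel. On Linux the running kernel's boot parameters are readable from /proc/cmdline. The following is a minimal Python sketch of that check; the helper name `has_kernel_param` and the example string (taken from the grub.conf line above, minus the kernel image path) are illustrative assumptions.

```python
def has_kernel_param(cmdline, param):
    """Return True if `param` appears as a whole word on the kernel command line."""
    return param in cmdline.split()


# Example string modelled on the grub.conf line quoted above.
EXAMPLE_CMDLINE = "root=/dev/hde3 vga=791 acpi=off"

if __name__ == "__main__":
    print(has_kernel_param(EXAMPLE_CMDLINE, "acpi=off"))
```

On a running system, the same function can be fed the real value with `has_kernel_param(open("/proc/cmdline").read(), "acpi=off")`. Matching whole words avoids a false hit on a longer option that merely contains the same text.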
|
Mben Guru
Joined: 29 Mar 2004 Posts: 465 Location: New York, USA
|
Posted: Mon Oct 04, 2004 1:31 am Post subject: |
|
|
johntramp wrote: | Mben wrote: | Do you have any dust in your case? Overheating gets me every few months if I don't blow out my case (I have cats).
Good luck | I don't think it is heat, because my BIOS has a warning / alarm ~5 degrees before the hard shutdown. I will look into that though, maybe set up lm_sensors or whatever it is that measures the temps. |
Blow it out anyway. My BIOS never classifies it as overheated, but the computer locks anyway. Just take some compressed air or a vacuum to the fans and vents (air usually works better, but be careful not to use too high a pressure or let the fans overspeed).
Good luck |
|
johntramp Guru
Joined: 03 Feb 2004 Posts: 457 Location: New Zealand
|
Posted: Mon Oct 04, 2004 2:48 am Post subject: |
|
|
I have had a look and there was a little dust that had been through the CPU fan and been blown onto the RAM. I have vacuumed this out, along with a little around the fans. I have also swapped the RAM with another computer to see how that goes.
Quote: | I don't see nvidia listed anywhere. If you run cat /proc/interrupts within a terminal under X, the nvidia card should show up on one of the interrupts. I'm just wondering if you have an IRQ conflict between the nvidia card and something else. |
Does not having installed the nvidia drivers yet affect this? I am still running the 2D nv driver that came with the install.
I will try that kernel line as well and see if that makes a difference too.
Thanks |
|
johntramp Guru
Joined: 03 Feb 2004 Posts: 457 Location: New Zealand
|
Posted: Mon Oct 04, 2004 7:25 am Post subject: |
|
|
I have just noticed that when the computer 'hung', the music I was listening to started again about a minute later for a second or so, and then stopped. This would happen about once every minute or so, so I was able to reboot the computer without a hard reset. This is still happening in Knoppix too, even with the RAM replaced.
Reading from the BIOS, after the computer had been idle for hours:
Quote: | System temp: 28 degrees C / 82 degrees F
CPU temp: 35 degrees C / 95 degrees F
Any possibilities on what else this could be? |
|
|
johntramp Guru
Joined: 03 Feb 2004 Posts: 457 Location: New Zealand
|
Posted: Mon Oct 04, 2004 9:38 am Post subject: |
|
|
I realised that I had put a UPS in line with the computer about a week ago, around the same time this started happening. I have now moved it out of the way and things seem to be looking up.
I had the computer and UPS feeding the computer, maybe that was just too much for it?
I will see how it is in a couple of hours... hopefully it is sorted.
Thanks for your help if so. |
|
Incabulos n00b
Joined: 14 Apr 2003 Posts: 28 Location: Sydney, Australia
|
Posted: Mon Oct 04, 2004 11:46 am Post subject: |
|
|
Sounds like your CPU is operating at a pretty normal temperature; the shutdowns/crashing certainly aren't caused by it overheating.
I'd check the load on your UPS too. Most have a serial cable via which you can monitor load, run time remaining, uptime, and so on. If it's overloaded then I assume power to all connected devices will fluctuate fairly badly, so sudden shutdowns or lockups might be a power problem.
'dmesg | tail' will show you the last events the kernel has seen, which might help in diagnosing things. You might also want to tone down the more aggressive compiler optimisations in your make.conf if they are set, and recompile the most crucial components with more conservative settings (glibc and the kernel come to mind).
HTH. |
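As a rough illustration of "toning down" a flag set, here is a small Python sketch that splits a CFLAGS string into a conservative baseline and everything else. The CFLAGS string is the one solarium_rider posted earlier in this thread. The baseline rule (keep -march=..., -O2 or -Os, and -pipe) is an assumption made for this example, not an official list, and the names `is_baseline` and `split_flags` are hypothetical.

```python
# CFLAGS as posted earlier in this thread by solarium_rider.
CFLAGS = (
    "-march=athlon-xp -m3dnow -msse -mfpmath=sse -mmmx -Os "
    "-pipe -fforce-addr -fomit-frame-pointer -frerun-cse-after-loop "
    "-frerun-loop-opt -maccumulate-outgoing-args -ffast-math"
)


def is_baseline(flag):
    """Assumed conservative baseline: CPU target, -O2/-Os, and -pipe only."""
    return flag.startswith("-march=") or flag in {"-O2", "-Os", "-pipe"}


def split_flags(cflags):
    """Return (baseline_flags, extra_flags), preserving the original order."""
    flags = cflags.split()
    baseline = [f for f in flags if is_baseline(f)]
    extra = [f for f in flags if not is_baseline(f)]
    return baseline, extra


if __name__ == "__main__":
    baseline, extra = split_flags(CFLAGS)
    print("keep for a test rebuild:", " ".join(baseline))
    print("drop while debugging:   ", " ".join(extra))
```

The idea is only to narrow things down: rebuild the critical packages with the baseline, see whether the lockups persist, and reintroduce the extra flags afterwards if they turn out not to matter.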
|
Mben Guru
Joined: 29 Mar 2004 Posts: 465 Location: New York, USA
|
Posted: Mon Oct 04, 2004 8:49 pm Post subject: |
|
|
If you have a regular power strip, try just taking the UPS out of the system. |
|
johntramp Guru
Joined: 03 Feb 2004 Posts: 457 Location: New Zealand
|
Posted: Mon Oct 04, 2004 11:08 pm Post subject: |
|
|
Yes, I have taken the UPS out, and now it has been up for about 4 hours and it seems to be fine.
There is no serial port on the UPS; I assume it is fairly old, as it was given to me for free.
Thanks for your help,
-John |
|
johntramp Guru
Joined: 03 Feb 2004 Posts: 457 Location: New Zealand
|
Posted: Tue Oct 05, 2004 12:45 am Post subject: |
|
|
It's happening again now without the UPS :( |
|
Moloch Apprentice
Joined: 17 Mar 2003 Posts: 293 Location: Albuquerque, NM, US
|
Posted: Tue Oct 05, 2004 1:06 am Post subject: |
|
|
For the longest time I was having problems with my system crashing when using athcool to set the cool bit for my KT333.
I kept it off, until one day I got tired of having a hot CPU and listening to that damn temperature-sensitive processor fan whine at nearly high speed.
So I spent about a week going through kernel settings and found nothing. Then I moved on to BIOS settings: after I set my BIOS to safe mode defaults, athcool worked. I believe the problem lies in a couple of settings. First, the enhance performance setting for both RAM and AGP caused the lockups. Second, the CPU decode setting, which has three values: normal, fast, and ultra. Normal and fast work fine; ultra locks it up.
I really don't notice any performance change between all these settings, so I'm happy to have found the issue.
I've also heard that some boards turn the cooling bit on by default; you can use athcool to turn it off and see if that makes a difference.
_________________
Understanding is a three-edged sword: your side, their side, and the truth. --Kosh
1010011010 |
|
johntramp Guru
Joined: 03 Feb 2004 Posts: 457 Location: New Zealand
|
Posted: Tue Oct 05, 2004 2:24 am Post subject: |
|
|
I have had a little look in my BIOS, and I will go and look a little deeper. I have not done any overclocking though, or changed anything like that in the BIOS.
Another thing is that I can leave the computer on its own, downloading or compiling or whatever, and it is fine. As soon as I jump back on the internet or anything, it will lock up again :S |
|
Moloch Apprentice
Joined: 17 Mar 2003 Posts: 293 Location: Albuquerque, NM, US
|
Posted: Tue Oct 05, 2004 2:49 am Post subject: |
|
|
Well, if it definitely seems internet-oriented, then how are you connected? Ethernet, dial-up, some USB crap, etc.? What drivers are you using? Kernel modules, something from portage, etc.? |
|
johntramp Guru
Joined: 03 Feb 2004 Posts: 457 Location: New Zealand
|
Posted: Tue Oct 05, 2004 7:17 am Post subject: |
|
|
Well, the thing is I don't think it is software, as it happens in Knoppix as well. I can also leave my computer on the internet downloading on DC and that can run flawlessly for hours.
I will try installing another distro somewhere else and see if it still happens there, maybe a stable Debian. |
|
johntramp Guru
Joined: 03 Feb 2004 Posts: 457 Location: New Zealand
|
Posted: Tue Oct 05, 2004 7:22 am Post subject: |
|
|
Incabulos wrote: | 'dmesg | tail' will show you the last events the kernel has seen, this might help in diagnosing things. |
Quote: | [root /var/log]$ dmesg | tail
ReiserFS: sda1: found reiserfs format "3.6" with standard journal
ReiserFS: sda1: using ordered data mode
ReiserFS: sda1: journal params: device sda1, size 8192, journal first block 18, max trans len 1024, max batch 900, max commit age 30, max trans age 30
ReiserFS: sda1: checking transaction log (sda1)
ReiserFS: sda1: replayed 1 transactions in 0 seconds
ReiserFS: sda1: Using r5 hash to sort names
nvidia: module license 'NVIDIA' taints kernel.
ACPI: PCI interrupt 0000:01:00.0[A] -> GSI 16 (level, low) -> IRQ 16
NVRM: loading NVIDIA Linux x86 NVIDIA Kernel Module 1.0-6111 Tue Jul 27 07:55:38 PDT 2004
r8169: eth0: link up |
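The tail above looks like ordinary boot messages. Keep in mind that dmesg only covers the current boot, so after a hard lockup and reboot any messages from before the crash would have to come from the saved system log, if they reached disk at all. For scanning a longer log, a minimal Python sketch like the following can pull out lines worth reading first. The keyword pattern is an assumed, non-exhaustive list of strings the kernel prints on serious errors, and `suspicious_lines` is a hypothetical helper name.

```python
import re

# Assumed, non-exhaustive markers of serious kernel trouble.
SUSPICIOUS = re.compile(r"\b(?:Oops|BUG|[Pp]anic|lockup|[Mm]achine [Cc]heck)\b")

# A few lines copied from the dmesg output quoted in this post.
SAMPLE = """\
ReiserFS: sda1: replayed 1 transactions in 0 seconds
nvidia: module license 'NVIDIA' taints kernel.
ACPI: PCI interrupt 0000:01:00.0[A] -> GSI 16 (level, low) -> IRQ 16
r8169: eth0: link up
"""


def suspicious_lines(log_text):
    """Return the log lines that contain any of the assumed error markers."""
    return [line for line in log_text.splitlines() if SUSPICIOUS.search(line)]


if __name__ == "__main__":
    hits = suspicious_lines(SAMPLE)
    print(hits if hits else "nothing flagged")
```

An empty result does not rule out a kernel problem; a machine that locks hard often has no chance to log anything.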
|
|