mdpye Tux's lil' helper
Joined: 18 Apr 2002 Posts: 102 Location: Nottingham, England
|
Posted: Fri Jul 05, 2002 10:19 pm Post subject: Stability, XFS, preempt, nVidia, KDE, all kinds of shi te... |
|
|
I have a new Athlon XP 2100+ box with a KT333 chipset, 512MB RAM and a GeForce 2, and I have compiled Gentoo on it in much the same way as on my old PIII. However, where the PIII was rock solid, I am seeing segfaults abound and regular lockups of Konqueror, Kicker, KDM and Konsole, particularly when I'm emerging. These are genuine freezes, a la winblows, and I'm not used to them.
Also, I am seeing strange results from top; for example, the following is sorted by CPU usage. Notice the 0.2% idle time, and yet the listed processes only add up to 8% usage. I am also seeing very high load averages (2.5+).
Code: | 11:01pm up 15 min, 0 users, load average: 1.71, 0.67, 0.24
57 processes: 53 sleeping, 4 running, 0 zombie, 0 stopped
CPU states: 90.6% user, 9.0% system, 0.0% nice, 0.2% idle
Mem: 514840K av, 233752K used, 281088K free, 0K shrd, 16K buff
Swap: 257000K av, 0K used, 257000K free 161184K cached
PID USER PRI NI SIZE RSS SHARE STAT %CPU %MEM TIME COMMAND
5788 root 18 0 4876 4876 1248 R 3.6 0.9 0:00 cc1
29714 root 15 0 271M 15M 4284 R 1.7 3.0 0:17 X
31523 root 17 0 1720 1720 460 S 1.2 0.3 0:00 cpp0
7802 mp 15 0 15036 14M 13124 R 1.0 2.9 0:00 kdeinit
13862 mp 15 0 16028 15M 13980 S 0.2 3.1 0:00 kdeinit
4342 root 16 0 1064 1064 624 S 0.2 0.2 0:00 make
31112 mp 15 0 984 984 764 R 0.1 0.1 0:00 top
1 root 15 0 528 528 452 S 0.0 0.1 0:04 init
2 root 15 0 0 0 0 SW 0.0 0.0 0:00 keventd
3 root 34 19 0 0 0 SWN 0.0 0.0 0:00 ksoftirqd_CPU0
4 root 15 0 0 0 0 SW 0.0 0.0 0:00 kswapd
5 root 25 0 0 0 0 SW 0.0 0.0 0:00 bdflush
6 root 15 0 0 0 0 SW 0.0 0.0 0:00 kupdated
7 root 15 0 0 0 0 SW 0.0 0.0 0:00 pagebuf_daemon
25404 root 21 0 0 0 0 SW 0.0 0.0 0:00 khubd
6595 root 15 0 712 712 592 S 0.0 0.1 0:00 devfsd
29855 root 15 0 580 580 508 S 0.0 0.1 0:00 metalog |
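The mismatch in the listing above can be checked by adding up the per-process %CPU column and comparing it with the busy share from the "CPU states" line. The following is only a minimal Python sketch, assuming the top layout shown in the listing (with %CPU as the ninth field); the function names `busy_from_cpu_states` and `sum_process_cpu` are hypothetical, and only the rows with non-zero %CPU are copied in.

```python
import re

def busy_from_cpu_states(line):
    """Return user + system + nice percent from top's 'CPU states:' line."""
    values = {name: float(pct) for pct, name in re.findall(r"([\d.]+)%\s+(\w+)", line)}
    return values["user"] + values["system"] + values["nice"]

def sum_process_cpu(rows):
    """Sum the %CPU column (ninth field) over top's per-process rows."""
    return sum(float(row.split()[8]) for row in rows)

# Lines copied from the listing above.
cpu_line = "CPU states: 90.6% user, 9.0% system, 0.0% nice, 0.2% idle"
rows = [
    "5788 root 18 0 4876 4876 1248 R 3.6 0.9 0:00 cc1",
    "29714 root 15 0 271M 15M 4284 R 1.7 3.0 0:17 X",
    "31523 root 17 0 1720 1720 460 S 1.2 0.3 0:00 cpp0",
    "7802 mp 15 0 15036 14M 13124 R 1.0 2.9 0:00 kdeinit",
    "13862 mp 15 0 16028 15M 13980 S 0.2 3.1 0:00 kdeinit",
    "4342 root 16 0 1064 1064 624 S 0.2 0.2 0:00 make",
    "31112 mp 15 0 984 984 764 R 0.1 0.1 0:00 top",
]

busy = busy_from_cpu_states(cpu_line)
listed = sum_process_cpu(rows)
print(f"busy {busy:.1f}%, listed processes {listed:.1f}%, unaccounted {busy - listed:.1f}%")
```

With these numbers the CPU is 99.6% busy while the listed processes account for 8.0%, leaving roughly 91.6 points unattributed. One possible explanation, not confirmed anywhere in this thread, is that short-lived compiler processes start and exit between refreshes and so never show up in the per-process rows.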
Here is a list of all the information I think might be relevant. I really hope someone can help out here.
gentoo-sources 2.4.19-r7
GCC 2.95.3-r5
KDE 3
Preempt and low latency enabled
nVidia drivers
grsecurity medium
C/CXXFLAGS="-O3 -march=i686 -pipe"
XFS filesystem
hardware:
Athlon 2100+ Palomino core
Gigabyte GA-7VRXP (KT333 chipset)
512MB RAM
20GB Maxtor HD
I doubt that my compiler flags, gcc or kde are to blame as they are all identical to the rock solid ones used on my old machine.
Any ideas? These stability problems are SERIOUS. I've lost this post once already through Konqueror going down, so I wrote it in KWrite. As far as I can tell the freezes are not specifically reproducible, but if I work while emerging (and sometimes when not), they are inevitable.
MP _________________ Cheers, MP |
|
|
|
trythil Tux's lil' helper
Joined: 06 Jun 2002 Posts: 123 Location: RHIT, Terre Haute, IN, USA
|
Posted: Fri Jul 05, 2002 11:41 pm Post subject: |
|
|
Preemptive kernel + low latency patches should be considered highly experimental. The same software configuration on different hardware configurations, as you know (or should know, heh) never guarantees similar stability.
There's some material here in the Gentoo forums about people experiencing odd behavior using those + XFS. The fact that these occur when you're emerging seems to point the fault not at KDE, but rather a bad interaction between the patches and the filesystem.
However, think you could capture some system log output and post it here? |
|
|
|
delta407 Bodhisattva
Joined: 23 Apr 2002 Posts: 2876 Location: Chicago, IL
|
Posted: Sat Jul 06, 2002 12:26 am Post subject: |
|
|
Try disabling one or both, recompiling the kernel, and running diagnostics again. _________________ I don't believe in witty sigs. |
|
|
|
taskara Advocate
Joined: 10 Apr 2002 Posts: 3763 Location: Australia
|
Posted: Sat Jul 06, 2002 12:32 am Post subject: |
|
|
hey - I have a feeling it might be the mainboard.
We built about 30 pcs with that mainboard here, and almost all of them had HUGE stability problems (we had windows not linux).
Some of them were ram, others were incompatibilities with the NIC, and other things.
Try stripping your mainboard of all non-essential components, run some RAM testers, and update to the latest BIOS version.
http://ftp.gigabyte.com.tw/support/temp/7vrxp_f7.zip
Good luck! |
|
|
|
Malakin Veteran
Joined: 14 Apr 2002 Posts: 1692 Location: Victoria BC Canada
|
Posted: Sat Jul 06, 2002 8:56 am Post subject: |
|
|
My main box uses a GA-7VRXP and I've sold quite a few. Never seen a single problem.
You could run a memory tester on it.
http://www.memtest86.com/
You could also try a vanilla kernel. I'm using 2.4.18-pre7 with preempt patch. |
|
|
|
mdpye Tux's lil' helper
Joined: 18 Apr 2002 Posts: 102 Location: Nottingham, England
|
Posted: Sat Jul 06, 2002 1:14 pm Post subject: |
|
|
trythil wrote: | Preemptive kernel + low latency patches should be considered highly experimental. The same software configuration on different hardware configurations, as you know (or should know, heh) never guarantees similar stability.
There's some material here in the Gentoo forums about people experiencing odd behavior using those + XFS. The fact that these occur when you're emerging seems to point the fault not at KDE, but rather a bad interaction between the patches and the filesystem.
However, think you could capture some system log output and post it here? |
I had a look at the logs, but I didn't save any. I'll try to recreate them later. It's just difficult to find any bloody useful logging in between all the preempt counts...
Anyway, the killing of one of the frozen apps was logged as from grsec, so I tried lowering my grsecurity level in the kernel, but to no avail. I've also tried recompiling without the preempt patch, but when I booted the new kernel, it still spat out all the preempt counts etc. I'll have a look directly at the .config, cos I've had a few fsck ups with menuconfig recently as well.
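Checking the .config directly, as described above, can also be scripted. This is a rough Python sketch, assuming the usual kernel .config line formats (`CONFIG_FOO=y` and `# CONFIG_FOO is not set`); the helper name `find_options` and the sample text are made up for illustration, and the option is matched by substring because the exact option names added by the patches are not given in this thread.

```python
import re

def find_options(config_text, keyword):
    """Map each CONFIG_* option whose name contains keyword to its value.

    Options written as '# CONFIG_FOO is not set' are reported as 'n'.
    """
    found = {}
    for line in config_text.splitlines():
        line = line.strip()
        set_match = re.match(r"(CONFIG_\w+)=(.*)", line)
        unset_match = re.match(r"# (CONFIG_\w+) is not set", line)
        if set_match and keyword in set_match.group(1):
            found[set_match.group(1)] = set_match.group(2)
        elif unset_match and keyword in unset_match.group(1):
            found[unset_match.group(1)] = "n"
    return found

# Hypothetical sample; a real check would read the kernel tree's .config file.
sample = """\
CONFIG_X86=y
CONFIG_PREEMPT=y
# CONFIG_HIGHMEM4G is not set
"""
print(find_options(sample, "PREEMPT"))
```

On a real system the text would come from the .config in the kernel source tree (commonly /usr/src/linux/.config on Gentoo, which is an assumption about this particular setup). If the option reads as disabled there but the booted kernel still prints preempt counts, it may be worth confirming that the freshly built image is the one actually being booted.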
I'm a bit stuck now really though, cos XFS+PREEMPT doesn't play nice, Reiser is still considered highly corruptible and ext3 is s...l...o...w! _________________ Cheers, MP |
|
|
|
Robert Tux's lil' helper
Joined: 19 Apr 2002 Posts: 103 Location: Syracuse, NY
|
Posted: Sun Jul 07, 2002 6:27 am Post subject: |
|
|
Quote: |
I'm a bit stuck now really though, cos XFS+PREEMPT doesn't play nice, Reiser is still considered highly corruptible and ext3 is s...l...o...w! |
You are not alone on this one ;/ I really like XFS, but have not been able to get preempt to work properly at all. I ended up using just the low latency patch, which works pretty good. I'll be glad though when they resolve the xfs/preempt issues :) |
|
|
|
KiTaSuMbA Guru
Joined: 28 Jun 2002 Posts: 430 Location: Naples Italy
|
Posted: Sun Jul 07, 2002 9:11 am Post subject: |
|
|
There are 2 different big NO-NOs here:
XFS + preempt: this must be one of the reasons for your box cowardly falling on emerging
nVIDIA + Athlon XP: there has been a rather big brouhaha on the LKML (Linux kernel mailing list), as AMD has posted information on why and how this happens, along with a 2.4.19 patch to at least limit these freezes.
I also use nVIDIA + Athlon XP and I have witnessed such freezes on both slack (vanilla 2.4.18 + nVIDIA) and gentoo (XFS). These freezes were the very reason I opted for a journaled filesystem, and XFS was the best choice. Choosing XFS, though, prevented me from using preempt, as I had already heard of the "dogfight" these 2 patches were having inside the kernel.
Interestingly, not pushing the AGP too hard (blackbox instead of kde, 1280x1024 instead of 1600x1200) made these lockups less frequent on both distros. _________________ Need to flame people LIVE on IRC? Join #gentoo-otw on freenode! |
|
|
|
klieber Bodhisattva
Joined: 17 Apr 2002 Posts: 3657 Location: San Francisco, CA
|
Posted: Mon Jul 08, 2002 5:28 pm Post subject: |
|
|
I have the same motherboard that you do, combined with an Athlon XP 1800. I use the nvidia drivers, pre-empt and low-latency kernel patches and have _zero_ stability problems. I'm also not using the on-board audio, though I doubt that makes a difference.
What I don't use: XFS and grsecurity. I use ext3 and don't have grsecurity compiled in my kernel. (even as a module) Thus, I would look at the following as possible culprits: (in order)
- XFS
- grsecurity
- Bad RAM
- CPU overheating
Though in the case of the heat issue, it would likely fry your CPU rather than just cause random instabilities.
My $.02.
--kurt _________________ The problem with political jokes is that they get elected |
|
|
|
delta407 Bodhisattva
Joined: 23 Apr 2002 Posts: 2876 Location: Chicago, IL
|
Posted: Mon Jul 08, 2002 5:34 pm Post subject: |
|
|
klieber wrote: | Though in the case of the heat issue, it would likely fry your CPU rather than just cause random instabilities. |
Mmm.... no. I've worked with a bunch of Athlons, and when given inadequate cooling they don't fry, they lock up first. If it's locked up hard, the temperature usually continues going up until it reaches equilibrium with the fan/heatsink combo, which usually plateaus the temperature above the operable range but below the melted-silicon range.
Besides which, if it was a heat issue, it would have to be off for a little while for the CPU temperature to decrease so it could boot up again.
I would bet it's XFS. _________________ I don't believe in witty sigs. |
|
|
|
abhishek Retired Dev
Joined: 28 Jun 2002 Posts: 393 Location: Los Angeles, CA
|
Posted: Mon Jul 08, 2002 5:36 pm Post subject: |
|
|
While we're on CPU heat, what progs for Linux can monitor temp sensors on my mb (Soyo Dragon)? |
|
|
|
klieber Bodhisattva
Joined: 17 Apr 2002 Posts: 3657 Location: San Francisco, CA
|
Posted: Mon Jul 08, 2002 5:48 pm Post subject: |
|
|
delta407 wrote: | Mmm.... no. |
Mmm.... yes, and I've got the trashed Athlon CPU to prove it.
--kurt _________________ The problem with political jokes is that they get elected |
|
|
|
klieber Bodhisattva
Joined: 17 Apr 2002 Posts: 3657 Location: San Francisco, CA
|
Posted: Mon Jul 08, 2002 5:51 pm Post subject: |
|
|
data_the_android wrote: | While we're on CPU heat, what progs for Linux can monitor temp sensors on my mb (Soyo Dragon)? |
lm_sensors provides the low-level stuff and I use Mondo to send me alerts when parameters get out of whack.
EDIT: link I provided for mondo appears to be down, so try the sourceforge page as a secondary link for now.
--kurt _________________ The problem with political jokes is that they get elected |
|
|
|
delta407 Bodhisattva
Joined: 23 Apr 2002 Posts: 2876 Location: Chicago, IL
|
Posted: Mon Jul 08, 2002 5:57 pm Post subject: |
|
|
klieber wrote: | delta407 wrote: | Mmm.... no. |
Mmm.... yes, and I've got the trashed Athlon CPU to prove it. |
I have a fried Thunderbird 1.0 GHz (a heatsink clip on the socket broke, so the heatsink got loose but didn't fall off) -- but, it didn't fry in that incident. I powered it off right after the snapping sound and the screen going black (using the switch on the PSU) and it survived. (It later fried when the remaining clips broke...) _________________ I don't believe in witty sigs. |
|
|
|
Ozymandias Tux's lil' helper
Joined: 10 Apr 2002 Posts: 81 Location: Netherlands
|
Posted: Tue Jul 09, 2002 9:45 am Post subject: |
|
|
try 'mem=nopentium' as a boot parameter to the kernel, as this might be the bug with AMD and AGP and speculative write.
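To confirm that the parameter actually reached the kernel after a reboot, the kernel command line can be inspected. A small Python sketch, assuming the command line is available as a string (on Linux it can be read from /proc/cmdline); the helper name `has_boot_param` and the sample command line are hypothetical.

```python
def has_boot_param(cmdline, param):
    """Return True if param appears as a whole word on the kernel command line."""
    return param in cmdline.split()

# Hypothetical command line; on a running system use open("/proc/cmdline").read().
sample_cmdline = "root=/dev/hda3 mem=nopentium"
print(has_boot_param(sample_cmdline, "mem=nopentium"))
```

Matching on whole words avoids a false positive from a longer parameter that merely contains the same text.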
greetz Ozy |
|
|
|
mdpye Tux's lil' helper
Joined: 18 Apr 2002 Posts: 102 Location: Nottingham, England
|
Posted: Tue Jul 09, 2002 1:57 pm Post subject: |
|
|
KiTaSuMbA wrote: | There are 2 different big NO-NOs here:
XFS + preempt: this must be one of the reasons for your box cowardly falling on emerging
nVIDIA + Athlon XP: there has been a rather big brouhaha on the LKML (Linux kernel mailing list), as AMD has posted information on why and how this happens, along with a 2.4.19 patch to at least limit these freezes.
|
Well, I'm trying to remove preempt from the kernel, (but it seems stubborn. I'll unmerge / remerge the gentoo sources, cos something in there is messed!), so just two questions now:
Will the AMD patch apply to the current gentoo-sources cleanly (or even at all)?
and, as some people mentioned temp, and I had considered it myself
What is considered to be a suitable operating temperature range for the CPU? I can only check from the BIOS at the moment, cos I don't have lmsensors etc set up, but generally I see 50-55, up to 60-62 after heavy CPU use. _________________ Cheers, MP |
|
|
|
mdpye Tux's lil' helper
Joined: 18 Apr 2002 Posts: 102 Location: Nottingham, England
|
Posted: Thu Jul 11, 2002 3:45 pm Post subject: OK, I've fixed it |
|
|
Removing pre-empt from the kernel does indeed solve the problem. I've been happily playing for a few hours now with no problems. I think we might need a note in the install text though, at the point where XFS is recommended, pointing out that preempt should NOT be used together with it... _________________ Cheers, MP |
|
|
|
klieber Bodhisattva
Joined: 17 Apr 2002 Posts: 3657 Location: San Francisco, CA
|
Posted: Thu Jul 11, 2002 4:11 pm Post subject: Re: OK, I've fixed it |
|
|
mdpye wrote: | I think we might need a note in the install text though, at the point where XFS is recommended, pointing out that preempt should NOT be used together with it... |
Sounds like a great suggestion to be filed on bugs.gentoo.org.
--kurt _________________ The problem with political jokes is that they get elected |
|
|
|
Malakin Veteran
Joined: 14 Apr 2002 Posts: 1692 Location: Victoria BC Canada
|
Posted: Thu Jul 11, 2002 7:12 pm Post subject: |
|
|
Quote: | What is considered to be a suitable operating temperature range for the CPU? I can only check from the BIOS at the moment, cos I don't have lmsensors etc set up, but generally I see 50-55, up to 60-62 after heavy CPU use. |
According to AMD below 67C is within operating temperature. BUT I've seen a system running at 62C and it was crashing on 3d games (until I changed the fan/heatsink). Personally I'd suggest about 58C as a high end.
What type of cpu fan/heatsink are you using?
Here's a list of AMD approved fans:
http://www.amd.com/us-en/Processors/ProductInformation/0,,30_118_756_759%5E1039%5E1050,00.html
I'd pick a model approved for your XP2100 by AMD with the lowest dba (noise) rating.
About the fried CPUs: they'll only fry if one of the clips breaks or the heatsink somehow stops being in proper contact with the CPU. This assumes, of course, that you're using a fan/heatsink approved by AMD; if you're not, wtf are you doing? Some newer motherboards have protection against this, and this will be standard soon.
If you assemble an Athlon XP system for yourself, I suggest getting the boxed retail CPU, as it comes with a good quiet fan and only costs a few dollars more. I imagine the chance of the heatsink breaking off is also much lower with the boxed fans than with a cheap OEM fan. |
|
|
|
klieber Bodhisattva
Joined: 17 Apr 2002 Posts: 3657 Location: San Francisco, CA
|
Posted: Thu Jul 11, 2002 7:32 pm Post subject: |
|
|
Malakin wrote: | According to AMD below 67C is within operating temperature. BUT I've seen a system running at 62C and it was crashing on 3d games (until I changed the fan/heatsink). Personally I'd suggest about 58C as a high end. |
Actually, the maximum die temperature is 90C according to AMD. I've regularly run an old 1700+ into the low 70s for sustained periods and not had one problem with it.
--kurt _________________ The problem with political jokes is that they get elected |
|
|
|
mdpye Tux's lil' helper
Joined: 18 Apr 2002 Posts: 102 Location: Nottingham, England
|
Posted: Thu Jul 11, 2002 8:00 pm Post subject: |
|
|
Malakin wrote: | According to AMD below 67C is within operating temperature. BUT I've seen a system running at 62C and it was crashing on 3d games (until I changed the fan/heatsink). Personally I'd suggest about 58C as a high end.
What type of cpu fan/heatsink are you using?
Here's a list of AMD approved fans:
http://www.amd.com/us-en/Processors/ProductInformation/0,,30_118_756_759%5E1039%5E1050,00.html
I'd pick a model approved for your XP2100 by AMD with the lowest dba (noise) rating.
About the fried CPUs: they'll only fry if one of the clips breaks or the heatsink somehow stops being in proper contact with the CPU. This assumes, of course, that you're using a fan/heatsink approved by AMD; if you're not, wtf are you doing? Some newer motherboards have protection against this, and this will be standard soon.
If you assemble an Athlon XP system for yourself, I suggest getting the boxed retail CPU, as it comes with a good quiet fan and only costs a few dollars more. I imagine the chance of the heatsink breaking off is also much lower with the boxed fans than with a cheap OEM fan. |
I got the OEM, thinking it was the retail box (bugger!), but I bought a bloody expensive fan from Taisol which was one of the only ones approved by AMD for a 2100+. It is sort of loud (compared with my old PIII, but that used an extractor fan and pipe rather than being attached to the heatsink), but has a good throughput and reports properly to the MB for monitoring. It also has good clips which lock down on all six socket clips rather than just two, has a copper contact and phase change thermal conductor. I also got a 38CFM extractor for the back of the case and you can feel it suck if you put your hand across the FRONT grille.
I did build the box myself, and I spent big money on a good case, AMD approved PSU, fan etc. Perhaps those extra pennies would have gone better on a 2.1GHz P4. Never mind. _________________ Cheers, MP |
|
|
|
leej Apprentice
Joined: 18 May 2002 Posts: 280
|
Posted: Tue Jul 16, 2002 5:14 pm Post subject: |
|
|
klieber wrote: |
Actually, the maximum die temperature is 90C according to AMD. I've regularly run an old 1700+ into the low 70s for sustained periods and not had one problem with it. |
But apparently, the reported temperature can be 20-30C below the actual temperature. Some Mobo's are more accurate than others I guess. I fried an Athlon TB at 80C (reported in BIOS screen when testing different HS/fans with different heat compound). The ceramic melted and I had read that it should've survived up to 90C - but it was later that I found out about the 20-30C discrepancy.
Hey, I was bored that day.
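The arithmetic behind that can be sketched in a few lines of Python. The 90C limit and the 20-30C under-reporting range are simply the figures quoted in this thread, not verified values, and the helper name `estimated_die_range` is hypothetical; real offsets will differ between motherboards.

```python
MAX_DIE_TEMP_C = 90        # maximum die temperature as quoted in this thread
OFFSET_RANGE_C = (20, 30)  # possible under-reporting, as claimed in this post

def estimated_die_range(reported_c, offset_range=OFFSET_RANGE_C):
    """Return (low, high) estimates of the real die temperature in Celsius."""
    low, high = offset_range
    return reported_c + low, reported_c + high

# 80 is the BIOS reading from this post; 62 and 55 are readings reported earlier.
for reported in (80, 62, 55):
    low, high = estimated_die_range(reported)
    if low > MAX_DIE_TEMP_C:
        verdict = "over the limit even with the smallest offset"
    elif high > MAX_DIE_TEMP_C:
        verdict = "could be over the limit"
    else:
        verdict = "under the limit"
    print(f"reported {reported}C -> estimated {low}-{high}C: {verdict}")
```

Under these assumptions a reported 80C corresponds to an estimated 100-110C, above the quoted limit even in the best case, which would be consistent with the chip not surviving.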
FWIW, I've found AMD recommended HS/Fans to be pretty abysmal. Paying two or three times the price for a superior HS/Fan, some Arctic Silver thermal compound and an exhaust usually provides far superior cooling to those naff Coolermaster fans recommended by AMD. You can't be too careful with Athlon TB's. |
|
|
|
chatwood2 n00b
Joined: 20 Jun 2002 Posts: 39 Location: Washington DC, Pittsburgh PA
|
Posted: Tue Jul 16, 2002 6:52 pm Post subject: |
|
|
Just thought I'd throw in my 2c.
I'm running an Athlon XP 1800 with an nvidia geforce4 TI4600 (running at 1600x1024) and XFS. I have not had any problems with the "amd agp bug" and never had issues with random lockups, XFS speed or file corruption. My kernel was the 2.4.18 xfs-sources (thus no pre-empt or low-latency).
I'd suggest changing your cooling setup. Arctic Silver is definitely the way to go with modern CPUs; it helps a ton. Also, disable pre-empt and low-latency if you want to use XFS, they just don't play nice.
- Chris |
|
|
|
mdpye Tux's lil' helper
Joined: 18 Apr 2002 Posts: 102 Location: Nottingham, England
|
Posted: Tue Jul 16, 2002 8:22 pm Post subject: |
|
|
leej wrote: | FWIW, I've found AMD recommended HS/Fans to be pretty abysmal. Paying two or three times the price for a superior HS/Fan, some Arctic Silver thermal compound and an exhaust usually provides far superior cooling to those naff Coolermaster fans recommended by AMD. You can't be too careful with Athlon TB's. |
I certainly didn't get one of the Coolermaster ones. I bought a more expensive one with a copper contact etc., and I also bought an exhaust fan which shifts 38cfm, enough to feel the breeze on my legs when I'm sitting at my desk!
I'm not throwing more good money after bad, I actually rather hope it dies so I can send it all back and get a P4. After all, I've followed the book to the letter, and the machine really isn't what it should be. _________________ Cheers, MP |
|
|
|
AnimalMachine Tux's lil' helper
Joined: 27 Apr 2002 Posts: 106 Location: Milwaukee, WI USA
|
Posted: Wed Jul 17, 2002 12:55 pm Post subject: |
|
|
When I bought my TB 1.4 a while back, I got this monster golden orb heat sink (I think the 7200 rpm model). It was quite a mistake because the dang thing is L O U D! There's definitely something to be said for having a quiet heatsink. In retrospect it was pretty stupid not to realize it would be loud - it was the 7200 rpm model after all, not the 5400 one.
FWIW, I just pieced together a 1800xp system from a retail box cpu. While the heatsink looks flakey it's small, quiet, and seems to get the job done. I have a normal case with one case fan installed by the cpu (can't remember the air movement rate, but nothing special) and a Geforce 3 (original make) in there, so it gets warm, but so far I've had no problems - even with extended periods of gaming in UT, Evercrack, and Warcraft 3. |
|
|
|