RioFL Guru
Joined: 31 Oct 2002 Posts: 407
|
Posted: Sun Oct 19, 2003 10:55 am Post subject: Lockups with nvidia driver - need nvidia guru |
|
|
I am getting random lockups. Sometimes only X locks, other times the entire system locks to the point where I cannot ssh into it to shut it down. I made severe hardware changes in my system, so I wiped the drive and installed from stage 1 with Pentium3 optimizations. The system does not lock when using the nv driver built into X.
hardware:
Tyan Tiger dual proc mobo using via apollo t133 chipset
cmos settings conservative, agp4x off, no fast writes, agp drive 0xEC which is the middle of nvidia's recommended range of 0xEA - 0xEE (mobo default is 0xCC), video bios shadow off.
2 pentium III 933 mhz processors
nvidia geforce 4200 ti 64mb ram with memory fins
640mb system ram
--------
important software:
2.4.20-gentoo-r7 kernel (also tried 2.4.22-aa1 kernel same problem)
kernel settings are aggressive (tried conservative.. same thing)
kde 3.1.4 and gnome 2.4.0
latest nvidia kernel. tried the nvidia-kernel-1.0.4496-r3 using emerge, and then got the version 2 off the nvidia site. same behavior. then in desperation i got the older 3123 version off the nvidia site. again same behavior.
tried both nvagp and kernel agpgart
renderaccel false
noddc true (one of my monitors lies)
twinview settings presently turned off but it locks either way
I am wondering if this behavior with the nvidia-kernel driver may be revealing a marginal/flaky power supply? after discovering both procs use 110w, i feel a 250w supply is not sufficient, so i ordered a 400w which will be delivered next week. does the nvidia-kernel driver cause the card to draw substantially more current than the nv driver? if so, this may be my problem. i dont know. i kinda need some guidance here on whether there are software issues, and if so i need fixes. the nv driver is suitable to finish my installation, but not suitable for my work.
help?
Chuck |
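For anyone following the power-supply question above, here is a rough back-of-the-envelope sketch of the headroom. It assumes the 110w figure in the post is the combined draw of both processors, and it only looks at the total nameplate rating; it says nothing about per-rail limits or about what the video card, drives, and fans actually draw, since the post gives no numbers for those. The function name `remaining_watts` is just an illustrative helper.

```python
def remaining_watts(psu_rating_w, known_loads_w):
    """Watts left on the PSU's nameplate rating after subtracting the known loads."""
    return psu_rating_w - sum(known_loads_w.values())


# The only load figure stated in the post (assumed here to be both CPUs combined).
known = {"two Pentium III 933 CPUs": 110}

for rating in (250, 400):  # the current supply and the one on order
    left = remaining_watts(rating, known)
    share = 100.0 * sum(known.values()) / rating
    print(f"{rating} W supply: CPUs take {share:.0f}% of the rating, "
          f"{left} W left for everything else")
```

With those numbers the 250w supply leaves 140 W for the board, video card, drives, and fans, and the 400w supply leaves 290 W. Whether 140 W is enough depends on component figures the thread does not provide.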
|
|
|
Yoshi Assim Apprentice
Joined: 16 Apr 2003 Posts: 234 Location: Girona (Spain)
|
Posted: Sun Oct 19, 2003 11:58 am Post subject: |
|
|
Try ck-sources kernel... |
|
|
|
RioFL Guru
Joined: 31 Oct 2002 Posts: 407
|
Posted: Sun Oct 19, 2003 12:06 pm Post subject: |
|
|
i will try ck-sources, but somehow i think i am not seeing the whole picture here. the video card worked just fine with my single 933 on an intel board, using nvidia-kernel and the gentoo sources r5 kernel with twinview, nvagp, and aggressive nvidia kernel settings running. worked perfectly. then i upgraded to this dual proc mobo and 2 matched processors, and upon reinstalling (and of course updating software in the process), everything fell apart. |
|
|
|
fizz Guru
Joined: 31 Aug 2003 Posts: 309 Location: Florida
|
Posted: Sun Oct 19, 2003 12:26 pm Post subject: |
|
|
make sure AGPGart is compiled as a module; when it's built in it will cause problems. Also, remove DRI from the kernel.
In your XF86Config, remove any reference to dri, and maybe post your xf86config so we can check it out _________________ Athlon 64 3200, MSI NEO NForce 3, 1Gig PC3700, EVGA Geforce 6800 GT |
|
|
|
RioFL Guru
Joined: 31 Oct 2002 Posts: 407
|
Posted: Sun Oct 19, 2003 12:38 pm Post subject: |
|
|
agpgart is a module and is not loaded when using nvagp option. here is the config:
Code: |
Section "Files"
RgbPath "/usr/X11R6/lib/X11/rgb"
FontPath "/usr/X11R6/lib/X11/fonts/local/"
FontPath "/usr/X11R6/lib/X11/fonts/misc/"
FontPath "/usr/X11R6/lib/X11/fonts/75dpi/:unscaled"
FontPath "/usr/X11R6/lib/X11/fonts/100dpi/:unscaled"
FontPath "/usr/X11R6/lib/X11/fonts/Type1/"
FontPath "/usr/X11R6/lib/X11/fonts/CID/"
FontPath "/usr/X11R6/lib/X11/fonts/Speedo/"
FontPath "/usr/X11R6/lib/X11/fonts/75dpi/"
FontPath "/usr/X11R6/lib/X11/fonts/100dpi/"
FontPath "/usr/X11R6/lib/X11/fonts/TTF"
ModulePath "/usr/X11R6/lib/modules"
EndSection
Section "Module"
Load "dbe"
Load "glx"
SubSection "extmod"
EndSubSection
Load "type1"
Load "freetype"
EndSection
Section "ServerFlags"
Option "blank time" "10" # 10 minutes
EndSection
Section "InputDevice"
Identifier "Keyboard1"
Driver "keyboard"
Option "AutoRepeat" "250 30"
Option "XkbRules" "xfree86"
Option "XkbModel" "pc104"
Option "XkbLayout" "us"
EndSection
Section "InputDevice"
Identifier "Mouse1"
Driver "mouse"
Option "Protocol" "imps/2"
Option "Device" "/dev/psaux"
Option "ZAxisMapping" "4 5"
EndSection
Section "Monitor"
Identifier "G810"
HorizSync 30-97
VertRefresh 50-180
EndSection
Section "Device"
Identifier "NVidia GeForce 4 4200"
Driver "nvidia"
# Driver "nv"
VideoRam 65535
Option "RenderAccel" "0"
Option "NoLogo" "1"
Option "NvAgp" "1"
Option "NoDDC" "1"
# Option "TwinView" "True"
# Option "TwinViewOrientation" "RightOf"
# Option "SecondMonitorHorizSync" "30-87"
# Option "SecondMonitorVertRefresh" "50-160"
# Option "MetaModes" "1600x1200, 1280x1024"
EndSection
Section "Screen"
Identifier "Screen 1"
Device "NVidia GeForce 4 4200"
Monitor "G810"
DefaultDepth 24
SubSection "Display"
Depth 24
Modes "1600x1200"
ViewPort 0 0
EndSubSection
EndSection
Section "ServerLayout"
Identifier "simple layout"
Screen "Screen 1"
InputDevice "Mouse1" "CorePointer"
InputDevice "Keyboard1" "CoreKeyboard"
EndSection
|
|
|
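Since the thread ends up comparing driver options between configs, here is a small sketch for pulling the active `Option` lines out of the `Device` section of an XF86Config file so two versions can be compared side by side. It is not part of any X tooling; `device_options` is a made-up helper, and it only understands the simple one-option-per-line layout used in the config posted above.

```python
import re

# Matches: Option "Name" "Value"   (the value is optional, e.g. Option "DPMS")
OPTION_RE = re.compile(r'^\s*Option\s+"([^"]+)"(?:\s+"([^"]*)")?')


def device_options(config_text):
    """Return {option_name: value} for uncommented Option lines in Device sections."""
    options = {}
    in_device = False
    for line in config_text.splitlines():
        stripped = line.strip()
        if stripped.startswith("#"):
            continue  # commented-out lines are not active
        if re.match(r'Section\s+"Device"', stripped, re.IGNORECASE):
            in_device = True
        elif stripped.lower().startswith("endsection"):
            in_device = False
        elif in_device:
            match = OPTION_RE.match(line)
            if match:
                options[match.group(1)] = match.group(2)
    return options


sample = '''
Section "Device"
Identifier "NVidia GeForce 4 4200"
Driver "nvidia"
Option "RenderAccel" "0"
Option "NvAgp" "1"
# Option "TwinView" "True"
EndSection
'''
print(device_options(sample))
```

Run against the full file (for example the text read from wherever your XF86Config lives), this lists only the options X will actually see, which makes it easier to spot a setting that was commented out by accident.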
|
|
abarrett79 n00b
Joined: 19 Oct 2003 Posts: 9
|
Posted: Sun Oct 19, 2003 4:26 pm Post subject: |
|
|
I had this same problem in RH9. I finally fixed it by running the nvidia driver installer with the -expert option and having it recompile everything in the driver. By the time I figured that out, though, I had already played with other settings and screwed up my X system beyond what I could fix. That's why I'm here on Gentoo |
|
|
|
RioFL Guru
Joined: 31 Oct 2002 Posts: 407
|
Posted: Sun Oct 19, 2003 4:55 pm Post subject: |
|
|
umm, not meaning to sound really dumb, but what -expert option? I just searched the readme and makefile and installer and .run file helps and the word expert does not appear. |
|
|
|
RioFL Guru
Joined: 31 Oct 2002 Posts: 407
|
Posted: Sun Oct 19, 2003 5:12 pm Post subject: |
|
|
one more note. i have been compiling the nvidia kernel from scratch |
|
|
|
shakti Guru
Joined: 15 May 2002 Posts: 358 Location: omnipresent
|
Posted: Sun Oct 19, 2003 5:28 pm Post subject: |
|
|
do you get the lockups in both 24 and 16 bit color or just in 24 bit? Are the lockups very random or is there some kind of pattern? I had nvidia lockups and could not find the cause until one day i found my fx5200 had self-destructed (one chip and one capacitor exploded...). no screen at all, but the system booted. replaced it with a new one and never had lockups since..... _________________ Using Gentoo since 2002. |
|
|
|
RioFL Guru
Joined: 31 Oct 2002 Posts: 407
|
Posted: Sun Oct 19, 2003 5:55 pm Post subject: |
|
|
shakti wrote: | do you get the lockups in both 24 and 16 bit color or just in 24 bit? Are the lockups very random or is there some kind of pattern? I had nvidia lockups and could not find the cause until one day i found my fx5200 had self-destructed (one chip and one capacitor exploded...). no screen at all, but the system booted. replaced it with a new one and never had lockups since..... |
both color depths, and they are quite random. it could be 5 min, 1 min, or one time it went 5 hours. it appears to be aggravated by watching a movie in mplayer; it will lock faster that way. i was emerging openoffice in one desktop and watching the hp movie in another one, and within 5 min it locked.
im wondering if a marginal power supply may cause this. i cannot find any specs comparing the current drain of the vid card in standard text mode with the current drain when it is driven by the nvidia-kernel driver. i would expect the latter to be considerably more. |
|
|
|
fizz Guru
Joined: 31 Aug 2003 Posts: 309 Location: Florida
|
Posted: Sun Oct 19, 2003 5:58 pm Post subject: |
|
|
have you tried to emerge nvidia-kernel nvidia-glx?
maybe something is going bad during compile time, just a thought _________________ Athlon 64 3200, MSI NEO NForce 3, 1Gig PC3700, EVGA Geforce 6800 GT |
|
|
|
RioFL Guru
Joined: 31 Oct 2002 Posts: 407
|
Posted: Sun Oct 19, 2003 6:54 pm Post subject: |
|
|
fizz wrote: | have you tried to emerge nvidia-kernel nvidia-glx?
maybe something is going bad during compile time, just a thought |
been there, done that. also unmerged them and installed from the version 2 tar off nvidia.com.. still no go.. i am going to look for 4363 on the mirrors. that i think worked, i may even downgrade my kernel back to -r5 to match.
i am reasonably sure its not the vid card because i substituted a pci riva tnt2 card i had lying around and it still locked. and the older driver with the older kernel did not lock when running that pci card. i am almost positive there is some kind of incompatibility between current versions of the gentoo kernel and nvidia kernel. hopefully i can find the old -r5 kernel and 4363 or an even older version of the nvidia kernel to try. |
|
|
|
RioFL Guru
Joined: 31 Oct 2002 Posts: 407
|
Posted: Sun Oct 19, 2003 6:58 pm Post subject: |
|
|
however... i pushed submit too fast , i am using the exact same versions of everything on the machine i am using now which is a single proc intel board, same 933mhz processor and i am using the pci riva tnt2 with current nvidia kernel and gentoo r7 kernel with no odd problems at all. none. i almost wonder if the nvidia kernel doesnt like working with dual processors or the via apollo t133 chipset:( if thats the case i hope they fix it VERY soon (like in 3 more days?)
Additionally, when I put the geforce card into this machine described above, it begins to lock up. In all fairness, this machine only has a 235w supply so again i may be starving the card in both machines. I don't know. What would be the power supply recommended for use with this card on a dual proc machine? I guess I will know if that is it when my 400w shows up. the other thing that makes me suspicious is my sensors voltage report .. all is well except CPU core is 1.78, and should be about 2, and the +2.5v bus is 1.53v.
and on the intel board my VCore is 1.68v and it calls for 2.0-2.44..
who knows, i am about ready to toss this thing in the trash and begin again. |
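To put the sensor readings quoted above in perspective, here is a quick sketch that works out how far each reading sits from its nominal value. The measured and nominal figures are taken from the post at face value; the 5% tolerance is only an assumption for illustration, not a spec for this board. Readings this far off can also mean the sensor scaling is misconfigured rather than that the rail is really that low, so treat the result as a prompt to check, not a diagnosis.

```python
def deviation_percent(measured, nominal):
    """Signed deviation of a reading from its nominal value, in percent."""
    return (measured - nominal) / nominal * 100.0


def flag_readings(readings, tolerance_percent):
    """Return {name: deviation %} for readings outside +/- tolerance_percent."""
    flagged = {}
    for name, (measured, nominal) in readings.items():
        deviation = deviation_percent(measured, nominal)
        if abs(deviation) > tolerance_percent:
            flagged[name] = round(deviation, 1)
    return flagged


# (measured, nominal) pairs as reported in the post; nominals are the poster's figures.
readings = {
    "CPU core": (1.78, 2.0),
    "+2.5V bus": (1.53, 2.5),
}
TOLERANCE_PERCENT = 5.0  # assumed threshold, for illustration only

print(flag_readings(readings, TOLERANCE_PERCENT))
```

Taken at face value, the core reading is about 11% under the figure the poster expected and the +2.5v reading is almost 39% under, which is why the later replies jump straight to the power supply.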
|
|
|
shakti Guru
Joined: 15 May 2002 Posts: 358 Location: omnipresent
|
Posted: Sun Oct 19, 2003 7:26 pm Post subject: |
|
|
my best guess is you have a heat problem...2 cpus more heat....watching movies...more heat....power supply to the max....more heat.....more heat = unstable system
then i read about your voltages....ahhhhhhhhhh....get a better power supply...... _________________ Using Gentoo since 2002. |
|
|
|
RioFL Guru
Joined: 31 Oct 2002 Posts: 407
|
Posted: Sun Oct 19, 2003 7:43 pm Post subject: |
|
|
shakti wrote: | my best guess is you have a heat problem...2 cpus more heat....watching movies...more heat....power supply to the max....more heat.....more heat = unstable system
then i read about your voltages....ahhhhhhhhhh....get a better power supply...... |
hopefully the supply will cure it.. heat isnt a problem. the case temp never exceeds 80F inside, the output air from the power supply fan never exceeds 90F and the cpu temps according to lmsensors never exceed 100F (after 1 hr of compiling, i can barely detect warmth at the base of the cpu coolers). the vid card has its own cpu fan and i also have a case fan blowing air into the case positioned directly opposite the vid card (at the front), and the ambient temp of this room never exceeds 70f.
there are a total of 5 case fans in this sucker to keep it extremely cool. it moves enough air that when you look at the back of the machine while its running your hair blows around. the weakest case fan is 100cfm. (as you can tell i dont care about noise, i just want results) |
|
|
|
LJ Apprentice
Joined: 27 Dec 2002 Posts: 156
|
Posted: Sun Oct 19, 2003 9:56 pm Post subject: |
|
|
If the power supply is making your system unstable, then booting to a uniprocessor kernel should make it a little more stable, and disconnecting a hard drive or CD-ROM drive should push it back away from the edge, so to speak.
Getting a good, quality power supply is important, but while you're waiting for it you may be able to get your system stable by doing one of the above or slightly increasing the core voltage to your processors (unless your video card is causing the crashes). |
|
|
|
RioFL Guru
Joined: 31 Oct 2002 Posts: 407
|
Posted: Mon Oct 20, 2003 12:10 am Post subject: |
|
|
LJ wrote: | If the power supply is making your system unstable, then booting to a uniprocessor kernel should make it a little more stable, and disconnecting a hard drive or CD-ROM drive should push it back away from the edge, so to speak.
Getting a good, quality power supply is important, but while you're waiting for it you may be able to get your system stable by doing one of the above or slightly increasing the core voltage to your processors (unless your video card is causing the crashes). |
more likely its the vid card dragging it down causing the lockups. there are no voltage adjustments, its fully automated and no place to do any manual adjustments in cmos or on the mobo. i didnt do a uniprocessor kernel but i did just as good.. i removed the 2nd cpu and it still locked. |
|
|
|
RioFL Guru
Joined: 31 Oct 2002 Posts: 407
|
Posted: Mon Oct 20, 2003 7:38 pm Post subject: |
|
|
unless someone can assure me that the nvidia kernel makes the geforce draw considerably more current, and so locks up on a marginal supply, i have my doubts about the power supply theory. i have 28hrs uptime with the nv driver while pushing the living hell outta it, with both processors averaging 90% during that entire time. yet when i use the nvidia kernel it locks up almost immediately.
i am seriously beginning to believe it is an incompatibility between the apollo t133 chipset and the nvidia kernel |
|
|
|
Jeff Poulin n00b
Joined: 31 Dec 2002 Posts: 10
|
Posted: Fri Nov 14, 2003 10:00 am Post subject: |
|
|
FWIW, I get the exact same problems too. I just switched to a dual MP platform with MPX chipset. With the nvidia kernel driver, KDE locks up very quickly (few seconds). With the NV driver, it's stable. I used to run a single proc on the same hardware (except for a different motherboard) and the nvidia kernel never gave me a problem. I'm using gentoo sources 2.4.20-r8. |
|
|
|
RioFL Guru
Joined: 31 Oct 2002 Posts: 407
|
Posted: Fri Nov 14, 2003 11:45 am Post subject: |
|
|
Jeff Poulin wrote: | FWIW, I get the exact same problems too. I just switched to a dual MP platform with MPX chipset. With the nvidia kernel driver, KDE locks up very quickly (few seconds). With the NV driver, it's stable. I used to run a single proc on the same hardware (except for a different motherboard) and the nvidia kernel never gave me a problem. I'm using gentoo sources 2.4.20-r8. |
Ok, man, in the last month I learned enough about all this to get my PhD.
first, my problem turned out to be several things, the most important being that the card was bad. i replaced the card, however there were other things too.
1. power supply was too small. had to put a 400w in.
2. chipset limitations... in the t133 chipset, agp 4x is unstable so i had to set it to 2x.
3. you have to experiment here. compile agpgart as a module. i found that there was a significant performance boost from not using the agpgart module and setting Option "NvAgp" "1" in XF86Config instead (using nvidia's own agp support).
i was afraid that 2x was going to be way too slow, but pleasantly enough, my performance in glxgears exceeds my friend's performance with a machine twice the power of mine:)
also experimenting with the fastwrite options can boost perf/cause lockups, however the SBA option appears to be quite stable.
suggest that if you have fastwrite and 4x/8x options in your bios, disable them to start with. also if your bios allows agp drive settings, i had to set mine to 0xEC. nvidia hardware likes 0xEA - 0xEE the best. my motherboard default was 0xCC.
compile agpgart as a module and for testing purposes do not autoload it or the nvidia driver. X will attempt to load them both when started. here are my device options from my xf86config file:
Section "Device"
Identifier "GeForce"
Driver "nvidia"
VideoRam 65535
# Option "DPMS"
Option "NoLogo" "1"
Option "NvAgp" "1"
Option "NoDDC" "1"
Option "TwinView" "True"
Option "TwinViewOrientation" "RightOf"
Option "SecondMonitorHorizSync" "30-87"
Option "SecondMonitorVertRefresh" "50-160"
Option "MetaModes" "1600x1200, 1280x1024"
BusID "PCI:1:0:0"
Screen 0
EndSection
I have to use NoDDC because my smaller monitor lies to the vid system about its capabilities.
you can get a lot of information about your video setup from proc, under /proc/driver/nvidia/agp. load the nvidia kernel module manually with those options while keeping X out of the picture ( /etc/init.d/xdm stop ) and you can get all the info you may need. also mess around in terminal mode for a bit with the module loaded to see if it still locks up. then bring X up and see what happens.
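A minimal sketch of collecting that proc information in one go, so it can be saved or pasted into a post. The only path it relies on is the /proc/driver/nvidia/agp location mentioned above; it makes no assumption about which entries exist underneath and simply prints whatever is there. `read_proc_tree` is an illustrative helper name, not part of the driver.

```python
import os


def read_proc_tree(root="/proc/driver/nvidia/agp"):
    """Return {path: text} for every file at or under root (empty if root is absent)."""
    contents = {}
    if os.path.isfile(root):
        paths = [root]
    else:
        # os.walk yields nothing for a missing directory, so this is safe to call blind.
        paths = [os.path.join(dirpath, name)
                 for dirpath, _, names in os.walk(root)
                 for name in names]
    for path in sorted(paths):
        try:
            with open(path, "r", errors="replace") as handle:
                contents[path] = handle.read()
        except OSError:
            contents[path] = "<unreadable>"
    return contents


if __name__ == "__main__":
    info = read_proc_tree()
    if not info:
        print("nothing found; is the nvidia module loaded?")
    for path, text in info.items():
        print(f"== {path} ==")
        print(text.rstrip())
```

Running it once from a text console with X stopped and once with X up gives two snapshots to compare if the lockups only show up in one of the two states.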
as a side note, i found that the series 2 driver from the nvidia site gave me quite a pickup in performance over the series 1 driver distributed by gentoo, so I ran emerge unmerge on nvidia-glx and nvidia-kernel and installed the driver from the nvidia site. if you do this, do not run opengl-update nvidia after installing. the nvidia install script runs the proper things to switch for you.
hope this helps some. I am not familiar with the MPX chipset but I think the above that I learned should be generic info for all chipsets. once you get it stable, then you can try one thing at a time.. increase agp to 4x or 8x and see if that locks things up. generally fastwrites appears to be a bad idea, but i hear it works perfectly with some chipsets.
this experience has aged me , but now my system is rock stable and i am glad to have gone through the learning process.
Chuck
|
|
|
|
Javier Lopez Guru
Joined: 13 Sep 2002 Posts: 377 Location: Barcelona
|
Posted: Fri Nov 14, 2003 5:17 pm Post subject: |
|
|
My system (ck-sources-2.4.22-r2, nvidia-kernel 43.63-r3) locks up when I play UT2003 or ET.
Disabling ACPI in kernel solves it. |
|
|
|
RioFL Guru
Joined: 31 Oct 2002 Posts: 407
|
Posted: Fri Nov 14, 2003 5:26 pm Post subject: |
|
|
Javier Lopez wrote: | My system (ck-sources-2.4.22-r2, nvidia-kernel 43.63-r3) locks up when I play UT2003 or ET.
Disabling ACPI in kernel solves it. |
ahh, i never think of that because i keep apm and acpi disabled by default. all power control is disabled except the on/off switch. i dont ever want a machine to decide for itself it needs to turn off or slow down or suspend or anything. when its on its meant to be totally alive until i decide to turn it off.
Chuck |
|
|
|
|