Gentoo Forums
Lockups with nvidia driver - need nvidia guru

RioFL
Guru

Joined: 31 Oct 2002
Posts: 407

Posted: Sun Oct 19, 2003 10:55 am    Post subject: Lockups with nvidia driver - need nvidia guru

I am getting random lockups. Sometimes only X locks, other times the entire system locks and I cannot even ssh into it to shut it down. I made major hardware changes in my system, so I wiped the drive and installed from stage 1 with Pentium3 optimizations. The system does not lock up when using X's built-in nv driver.

Hardware:
Tyan Tiger dual-processor motherboard using the VIA Apollo t133 chipset
CMOS settings conservative: AGP 4x off, no fast writes, AGP drive 0xEC, which is the middle of nvidia's recommended range of 0xEA - 0xEE (motherboard default is 0xCC), video BIOS shadow off.

2 Pentium III 933 MHz processors
nvidia GeForce4 Ti 4200, 64 MB RAM, with memory fins
640 MB system RAM

--------

Important software:
2.4.20-gentoo-r7 kernel (also tried the 2.4.22-aa1 kernel, same problem)
Kernel settings are aggressive (tried conservative, same thing)
KDE 3.1.4 and GNOME 2.4.0
Latest nvidia kernel driver. I tried nvidia-kernel-1.0.4496-r3 using emerge, and then got the version 2 off the nvidia site. Same behavior. Then in desperation I got the older 3123 version off the nvidia site. Again, same behavior.
Tried both NvAgp and kernel agpgart
RenderAccel false
NoDDC true (one of my monitors lies)
TwinView settings presently turned off, but it locks either way

I am wondering whether this behavior with the nvidia-kernel driver may be revealing a marginal/flaky power supply. After discovering that both procs use 110 W, I feel a 250 W supply is not sufficient, so I ordered a 400 W one, which will be delivered next week. Does the nvidia kernel driver cause the card to draw substantially more current than the nv driver? If so, this may be my problem. I don't know. I need some guidance here on whether there are software issues, and if so, what the fixes are. The nv driver is good enough to finish my installation, but not suitable for my work.

help?
:)

Chuck

Yoshi Assim
Apprentice

Joined: 16 Apr 2003
Posts: 234
Location: Girona (Spain)

Posted: Sun Oct 19, 2003 11:58 am

Try the ck-sources kernel...

RioFL
Guru

Joined: 31 Oct 2002
Posts: 407

Posted: Sun Oct 19, 2003 12:06 pm

I will try ck-sources, but somehow I think I am not seeing the root cause here. The video card worked just fine on my single-933 Intel board using nvidia-kernel and the gentoo-sources -r5 kernel, with TwinView, NvAgp, and aggressive nvidia kernel settings running. It worked perfectly. Then I upgraded to this dual-processor board and 2 matched processors, and upon reinstalling (and of course updating software in the process), everything fell apart.

fizz
Guru

Joined: 31 Aug 2003
Posts: 309
Location: Florida

Posted: Sun Oct 19, 2003 12:26 pm

Make sure agpgart is compiled as a module; when it's built in, it will cause problems. Also, remove DRI from the kernel.
In your XF86Config, remove any reference to dri, and maybe post your XF86Config so we can check it out :)
_________________
Athlon 64 3200, MSI NEO NForce 3, 1Gig PC3700, EVGA Geforce 6800 GT

RioFL
Guru

Joined: 31 Oct 2002
Posts: 407

Posted: Sun Oct 19, 2003 12:38 pm

agpgart is a module and is not loaded when using the NvAgp option. Here is the config:


Code:

Section "Files"
    RgbPath   "/usr/X11R6/lib/X11/rgb"
    FontPath   "/usr/X11R6/lib/X11/fonts/local/"
    FontPath   "/usr/X11R6/lib/X11/fonts/misc/"
    FontPath   "/usr/X11R6/lib/X11/fonts/75dpi/:unscaled"
    FontPath   "/usr/X11R6/lib/X11/fonts/100dpi/:unscaled"
    FontPath   "/usr/X11R6/lib/X11/fonts/Type1/"
    FontPath   "/usr/X11R6/lib/X11/fonts/CID/"
    FontPath   "/usr/X11R6/lib/X11/fonts/Speedo/"
    FontPath   "/usr/X11R6/lib/X11/fonts/75dpi/"
    FontPath   "/usr/X11R6/lib/X11/fonts/100dpi/"
    FontPath    "/usr/X11R6/lib/X11/fonts/TTF"
    ModulePath   "/usr/X11R6/lib/modules"
EndSection


Section "Module"
    Load   "dbe"
    Load   "glx"
    SubSection   "extmod"
    EndSubSection
    Load   "type1"
    Load   "freetype"
EndSection


Section "ServerFlags"
    Option   "blank time"   "10"   # 10 minutes
EndSection

Section "InputDevice"
    Identifier   "Keyboard1"
    Driver   "keyboard"
    Option   "AutoRepeat"   "250 30"
    Option   "XkbRules"   "xfree86"
    Option   "XkbModel"   "pc104"
    Option   "XkbLayout"   "us"
EndSection


Section "InputDevice"
    Identifier   "Mouse1"
    Driver   "mouse"
    Option   "Protocol"   "imps/2"
    Option   "Device"   "/dev/psaux"
    Option   "ZAxisMapping"  "4 5"
EndSection

Section "Monitor"
    Identifier   "G810"
    HorizSync   30-97         
    VertRefresh   50-180       

EndSection

Section "Device"
    Identifier   "NVidia GeForce 4 4200"
    Driver   "nvidia"
#   Driver   "nv"
    VideoRam   65536   # 64 MB expressed in kilobytes
    Option   "RenderAccel" "0"
    Option   "NoLogo"   "1"
    Option   "NvAgp"    "1"
    Option   "NoDDC"    "1"
#   Option   "TwinView" "True"
#   Option   "TwinViewOrientation" "RightOf"
#   Option   "SecondMonitorHorizSync" "30-87"
#   Option   "SecondMonitorVertRefresh" "50-160"
#   Option   "MetaModes" "1600x1200, 1280x1024"
EndSection



Section "Screen"
    Identifier   "Screen 1"
    Device   "NVidia GeForce 4 4200"
    Monitor   "G810"
    DefaultDepth 24
    SubSection  "Display"
        Depth   24
        Modes   "1600x1200"
        ViewPort   0 0
    EndSubSection
EndSection

Section "ServerLayout"
    Identifier   "simple layout"
    Screen      "Screen 1"
    InputDevice   "Mouse1" "CorePointer"
    InputDevice "Keyboard1" "CoreKeyboard"
EndSection



abarrett79
n00b

Joined: 19 Oct 2003
Posts: 9

Posted: Sun Oct 19, 2003 4:26 pm

I had this same problem in RH9. I finally fixed it by running the nvidia driver with the -expert option and having it recompile everything in the driver. By the time I figured that out, though, I had already played with other settings and screwed up my X system beyond what I could fix. That's why I'm here on Gentoo :)

RioFL
Guru

Joined: 31 Oct 2002
Posts: 407

Posted: Sun Oct 19, 2003 4:55 pm

Umm, not meaning to sound really dumb, but what -expert option? I just searched the readme, the makefile, the installer, and the .run file's help, and the word "expert" does not appear.

RioFL
Guru

Joined: 31 Oct 2002
Posts: 407

Posted: Sun Oct 19, 2003 5:12 pm

One more note: I have been compiling the nvidia kernel driver from scratch.

shakti
Guru

Joined: 15 May 2002
Posts: 358
Location: omnipresent

Posted: Sun Oct 19, 2003 5:28 pm

Do you get the lockups in both 24 and 16 bit color, or just in 24 bit? Are the lockups very random, or is there some kind of pattern? I had nvidia lockups and could not find the cause until one day I found my FX5200 had self-destructed (one chip and one capacitor exploded...). No screen at all, but the system booted. I replaced it with a new one and have never had lockups since.....
_________________
Using Gentoo since 2002.

RioFL
Guru

Joined: 31 Oct 2002
Posts: 407

Posted: Sun Oct 19, 2003 5:55 pm

shakti wrote:
Do you get the lockups in both 24 and 16 bit color, or just in 24 bit? Are the lockups very random, or is there some kind of pattern? I had nvidia lockups and could not find the cause until one day I found my FX5200 had self-destructed (one chip and one capacitor exploded...). No screen at all, but the system booted. I replaced it with a new one and have never had lockups since.....


Both color depths, and they are quite random. It could be 5 min, 1 min, or one time it went 5 hours. It appears to be aggravated by watching a movie in mplayer; it will lock faster that way. I was emerging openoffice on one desktop and watching the hp movie on another, and within 5 min it locked.

I'm wondering if a marginal power supply may cause this. I cannot find any specs comparing the vid card's current drain in standard text mode with its current drain when it is driven by the nvidia-kernel driver. I would expect the latter to be considerably more.

fizz
Guru

Joined: 31 Aug 2003
Posts: 309
Location: Florida

Posted: Sun Oct 19, 2003 5:58 pm

Have you tried to emerge nvidia-kernel nvidia-glx?
Maybe something is going bad during compile time, just a thought :)
_________________
Athlon 64 3200, MSI NEO NForce 3, 1Gig PC3700, EVGA Geforce 6800 GT

RioFL
Guru

Joined: 31 Oct 2002
Posts: 407

Posted: Sun Oct 19, 2003 6:54 pm

fizz wrote:
Have you tried to emerge nvidia-kernel nvidia-glx?
Maybe something is going bad during compile time, just a thought :)


Been there, done that. I also unmerged them and installed from the version 2 tar off nvidia.com. Still no go. I am going to look for 4363 on the mirrors. That one, I think, worked. I may even downgrade my kernel back to -r5 to match.

I am reasonably sure it's not the vid card, because I substituted a PCI Riva TNT2 card I had lying around and it still locked, and the older driver with the older kernel did not lock when running that PCI card. I am almost positive there is some kind of incompatibility between current versions of the Gentoo kernel and the nvidia kernel driver. Hopefully I can find the old -r5 kernel and 4363 (or an even older version) of the nvidia kernel driver to try.

RioFL
Guru

Joined: 31 Oct 2002
Posts: 407

Posted: Sun Oct 19, 2003 6:58 pm

However... I pushed submit too fast :) I am using the exact same versions of everything on the machine I am typing on now, which is a single-processor Intel board with the same 933 MHz processor, and I am using the PCI Riva TNT2 with the current nvidia kernel driver and the Gentoo -r7 kernel with no odd problems at all. None. I almost wonder if the nvidia kernel driver doesn't like working with dual processors or the VIA Apollo t133 chipset :( If that's the case, I hope they fix it VERY soon (like in 3 more days?) :)

Additionally, when I put the GeForce card into the single-processor machine described above, it begins to lock up. In all fairness, this machine only has a 235 W supply, so again I may be starving the card in both machines. I don't know. What power supply would be recommended for use with this card on a dual-processor machine? I guess I will know if that is it when my 400 W shows up. The other thing that makes me suspicious is my sensors voltage report: all is well except that CPU core is 1.78 V when it should be about 2 V, and the +2.5 V bus is at 1.53 V.

On the Intel board my VCore is 1.68 V, and it calls for 2.0-2.44 V.

Who knows. I am about ready to toss this thing in the trash and begin again.
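
For anyone comparing sensor readings like these against their expected values, here is a minimal Python sketch of the arithmetic. The "expected" figures are simply the ones quoted in this post, not verified specifications, and lm_sensors readings can themselves be off when the sensor chip is misconfigured, so treat the result as a rough guide only. The function name `deviation_percent` is made up for illustration.

```python
def deviation_percent(measured: float, expected: float) -> float:
    """Return how far a measured voltage is from its expected value, in percent."""
    return (measured - expected) / expected * 100.0


# (label, measured volts, expected volts), taken from the readings quoted in this post.
# The expected values are the poster's figures, not datasheet values.
readings = [
    ("dual board CPU core", 1.78, 2.0),
    ("dual board +2.5V bus", 1.53, 2.5),
    ("Intel board VCore", 1.68, 2.0),
]

for label, measured, expected in readings:
    print(f"{label}: {deviation_percent(measured, expected):+.1f}%")
```

A rail that reads far below its expected value points at either the supply or the sensor configuration, and this arithmetic alone cannot tell which.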

shakti
Guru

Joined: 15 May 2002
Posts: 358
Location: omnipresent

Posted: Sun Oct 19, 2003 7:26 pm

my best guess is you have a heat problem...2 cpus more heat....watching movies...more heat....power supply to the max....more heat.....more heat = unstable system
then i read about your voltages....ahhhhhhhhhh....get a better power supply......
_________________
Using Gentoo since 2002.

RioFL
Guru

Joined: 31 Oct 2002
Posts: 407

Posted: Sun Oct 19, 2003 7:43 pm

shakti wrote:
my best guess is you have a heat problem...2 cpus more heat....watching movies...more heat....power supply to the max....more heat.....more heat = unstable system
then i read about your voltages....ahhhhhhhhhh....get a better power supply......



Hopefully the supply will cure it. Heat isn't a problem. The case temp never exceeds 80F inside, the output air from the power supply fan never exceeds 90F, and the CPU temps according to lm_sensors never exceed 100F (after 1 hr of compiling, I can barely detect warmth at the base of the CPU coolers). The vid card has its own fan, and I also have a case fan at the front blowing air into the case, positioned directly opposite the vid card. The ambient temp of this room never exceeds 70F.

There are a total of 5 case fans in this sucker to keep it extremely cool. It moves enough air that when you look at the back of the machine while it's running, your hair blows around. The weakest case fan is 100 cfm. (As you can tell, I don't care about noise, I just want results) :)

LJ
Apprentice

Joined: 27 Dec 2002
Posts: 156

Posted: Sun Oct 19, 2003 9:56 pm

If the power supply is making your system unstable, then booting to a uniprocessor kernel should make it a little more stable, and disconnecting a hard drive or CD-ROM drive should push it back away from the edge, so to speak.

Getting a good, quality power supply is important, but while you're waiting for it you may be able to get your system stable by doing one of the above or by slightly increasing the core voltage to your processors (unless your video card is causing the crashes).

RioFL
Guru

Joined: 31 Oct 2002
Posts: 407

Posted: Mon Oct 20, 2003 12:10 am

LJ wrote:
If the power supply is making your system unstable, then booting to a uniprocessor kernel should make it a little more stable, and disconnecting a hard drive or CD-ROM drive should push it back away from the edge, so to speak.

Getting a good, quality power supply is important, but while you're waiting for it you may be able to get your system stable by doing one of the above or by slightly increasing the core voltage to your processors (unless your video card is causing the crashes).


More likely it's the vid card dragging it down and causing the lockups. There are no voltage adjustments; it's fully automated, and there is no place to do any manual adjustments in CMOS or on the motherboard. I didn't try a uniprocessor kernel, but I did something just as good: I removed the 2nd CPU, and it still locked.

RioFL
Guru

Joined: 31 Oct 2002
Posts: 407

Posted: Mon Oct 20, 2003 7:38 pm

Unless someone can assure me that the nvidia kernel driver makes the GeForce draw considerably more current, thereby locking up on a marginal supply, I have my doubts. I have had 28 hrs of uptime with the nv driver while pushing the living hell out of it, with both processors averaging 90% during that entire time. Yet when I use the nvidia kernel driver, it locks up almost immediately.

I am seriously beginning to believe it is an incompatibility between the Apollo t133 chipset and the nvidia kernel driver.

Jeff Poulin
n00b

Joined: 31 Dec 2002
Posts: 10

Posted: Fri Nov 14, 2003 10:00 am

FWIW, I get the exact same problems too. I just switched to a dual MP platform with MPX chipset. With the nvidia kernel driver, KDE locks up very quickly (few seconds). With the NV driver, it's stable. I used to run a single proc on the same hardware (except for a different motherboard) and the nvidia kernel never gave me a problem. I'm using gentoo sources 2.4.20-r8.

RioFL
Guru

Joined: 31 Oct 2002
Posts: 407

Posted: Fri Nov 14, 2003 11:45 am

Jeff Poulin wrote:
FWIW, I get the exact same problems too. I just switched to a dual MP platform with MPX chipset. With the nvidia kernel driver, KDE locks up very quickly (few seconds). With the NV driver, it's stable. I used to run a single proc on the same hardware (except for a different motherboard) and the nvidia kernel never gave me a problem. I'm using gentoo sources 2.4.20-r8.


OK, man, in the last month I learned enough about all this to get my PhD :D

First, my problem turned out to be several things, the most important being that the card was bad. I replaced the card, but there were other things too.

1. The power supply was too small. I had to put a 400 W one in.
2. Chipset limitations: on the t133 chipset, AGP 4x is unstable, so I had to set it to 2x.
3. You have to experiment here. Compile agpgart as a module. I found a significant performance boost from not using the agpgart module and setting Option "NvAgp" "1" in XF86Config instead (using nvidia's own AGP support).

I was afraid that 2x was going to be way too slow, but pleasantly enough, my performance in glxgears exceeds that of my friend's machine, which has twice the power of mine :)
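
To put the 2x versus 4x worry in perspective, here is a rough Python sketch of the theoretical peak AGP transfer rates. It assumes the usual AGP figures of a 32-bit (4-byte) data path on a nominal 66.66 MHz base clock; real-world throughput is lower, and the function name is invented for this example.

```python
AGP_BASE_CLOCK_MHZ = 66.66  # nominal AGP base clock (assumed standard figure)
AGP_BUS_WIDTH_BYTES = 4     # 32-bit data path


def agp_peak_mb_per_s(multiplier: int) -> float:
    """Theoretical peak transfer rate in MB/s for AGP 1x, 2x, 4x or 8x."""
    if multiplier not in (1, 2, 4, 8):
        raise ValueError("AGP multiplier must be 1, 2, 4 or 8")
    return AGP_BASE_CLOCK_MHZ * AGP_BUS_WIDTH_BYTES * multiplier


for rate in (1, 2, 4, 8):
    print(f"AGP {rate}x: about {agp_peak_mb_per_s(rate):.0f} MB/s")
```

Even at 2x the bus can in theory move over 500 MB/s, which may be part of why the drop from 4x did not show up in glxgears.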

Also, experimenting with the fast write options can boost performance or cause lockups; the SBA option, however, appears to be quite stable.

I suggest that if you have fast write and 4x/8x options in your BIOS, you disable them to start with. Also, if your BIOS allows AGP drive settings, I had to set mine to 0xEC. nvidia hardware likes 0xEA - 0xEE the best. My motherboard default was 0xCC.


Compile agpgart as a module and, for testing purposes, do not autoload it or the nvidia driver. X will attempt to load them both when started. Here are my device options from my XF86Config file:


Code:

Section "Device"
    Identifier   "GeForce"
    Driver   "nvidia"
    VideoRam   65536   # 64 MB expressed in kilobytes
#   Option   "DPMS"
    Option   "NoLogo" "1"
    Option   "NvAgp" "1"
    Option   "NoDDC" "1"
    Option   "TwinView" "True"
    Option   "TwinViewOrientation" "RightOf"
    Option   "SecondMonitorHorizSync" "30-87"
    Option   "SecondMonitorVertRefresh" "50-160"
    Option   "MetaModes" "1600x1200, 1280x1024"
    BusID   "PCI:1:0:0"
    Screen   0
EndSection
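
When changing one setting at a time, it helps to see exactly which options a Device section actually sets. The following is a small Python sketch that pulls the uncommented Option lines out of XF86Config text and explains the NvAgp value. The meanings in `NVAGP_MODES` are believed to match the NVIDIA driver README of that era, but verify them against the README shipped with your driver version. The function name and the sample text are made up for the example.

```python
import re

# NvAgp values as believed to be described in the NVIDIA README of that era
# (an assumption here; check the README for your driver version).
NVAGP_MODES = {
    "0": "AGP disabled",
    "1": "NVIDIA's internal AGP support",
    "2": "kernel agpgart",
    "3": "agpgart first, then NVIDIA's internal AGP",
}

# Matches: Option "Name" or Option "Name" "Value" at the start of a line.
OPTION_RE = re.compile(r'^\s*Option\s+"([^"]+)"(?:\s+"([^"]*)")?')


def device_options(config_text: str) -> dict:
    """Return {option name: value} for every uncommented Option line."""
    options = {}
    for line in config_text.splitlines():
        match = OPTION_RE.match(line)  # lines starting with '#' never match
        if match:
            options[match.group(1)] = match.group(2)
    return options


# Hypothetical sample modeled on the Device section above.
sample = '''
Section "Device"
    Identifier "GeForce"
    Driver     "nvidia"
#   Option     "DPMS"
    Option     "NoLogo" "1"
    Option     "NvAgp"  "1"
EndSection
'''

opts = device_options(sample)
print(opts)
print("NvAgp:", NVAGP_MODES.get(opts.get("NvAgp"), "not set or unknown"))
```

Running it on the config before and after each experiment gives a quick record of what was actually changed.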

I have to use NoDDC because my smaller monitor lies to the vid system about its capabilities.

You can get a lot of information about your video setup from /proc. Load the nvidia kernel driver manually with those options while keeping X out of the picture ( /etc/init.d/xdm stop ), then look in /proc/driver/nvidia/agp for all the info you may need. Also mess around in terminal mode for a bit with the driver loaded to see if it still locks up. Then bring X up and see what happens.
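
As a rough sketch of how to collect that information, the Python snippet below reads a status file under /proc/driver/nvidia/agp and parses its `Key: Value` lines. It assumes the layout used by the legacy driver (a `status` file with entries such as `AGP Rate`); the sample text is invented to show the parsing and is not output from a real machine.

```python
from pathlib import Path

AGP_DIR = Path("/proc/driver/nvidia/agp")  # path mentioned in the post


def parse_key_values(text: str) -> dict:
    """Parse 'Key: Value' lines into a dict, ignoring lines without a colon."""
    info = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            info[key.strip()] = value.strip()
    return info


def read_agp_status(agp_dir: Path = AGP_DIR) -> dict:
    """Read <agp_dir>/status if present; return {} when it does not exist."""
    status_file = agp_dir / "status"  # file name is an assumption
    if not status_file.is_file():
        return {}
    return parse_key_values(status_file.read_text())


# Hypothetical sample, only to demonstrate the parser; real output may differ.
sample = "Status: Enabled\nDriver: NVIDIA\nAGP Rate: 2x\nFast Writes: Disabled\nSBA: Enabled\n"
print(parse_key_values(sample))
print(read_agp_status() or "no AGP status file found (is the nvidia module loaded?)")
```

Saving this output once per experiment makes it easier to confirm that a BIOS or XF86Config change really took effect before blaming it for a lockup.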

As a side note, I found that the series 2 driver from the nvidia site gave me quite a pickup in performance over the series 1 driver distributed by Gentoo, so I unmerged nvidia-glx and nvidia-kernel (emerge unmerge) and installed the driver from the nvidia site. If you do this, do not run opengl-update nvidia after installing. The nvidia install script runs the proper things to switch for you.

Hope this helps some. I am not familiar with the MPX chipset, but I think what I learned above should be generic info for all chipsets. Once you get it stable, you can try one thing at a time: increase AGP to 4x or 8x and see if that locks things up. Generally fast writes appear to be a bad idea, but I hear they work perfectly with some chipsets.

This experience has aged me :), but now my system is rock stable and I am glad to have gone through the learning process.

Chuck



Javier Lopez
Guru

Joined: 13 Sep 2002
Posts: 377
Location: Barcelona

Posted: Fri Nov 14, 2003 5:17 pm

My system (ck-sources-2.4.22-r2, nvidia-kernel 43.63-r3) locks up when I play UT2003 or ET.

Disabling ACPI in kernel solves it.

RioFL
Guru

Joined: 31 Oct 2002
Posts: 407

Posted: Fri Nov 14, 2003 5:26 pm

Javier Lopez wrote:
My system (ck-sources-2.4.22-r2, nvidia-kernel 43.63-r3) locks up when I play UT2003 or ET.

Disabling ACPI in kernel solves it.


Ahh, I never think of that, because I keep APM and ACPI disabled by default. All power control is disabled except the on/off switch. I don't ever want a machine to decide for itself that it needs to turn off or slow down or suspend or anything. When it's on, it's meant to be totally alive until I decide to turn it off.

Chuck

All times are GMT