Author |
Message |
ZippyJay n00b
Joined: 30 Nov 2004 Posts: 73 Location: Ix
|
Posted: Wed Jul 19, 2006 1:43 pm Post subject: Server locks up... |
|
|
I have a fairly new Gentoo server that is locking up about once a week. It displays nothing at the console and requires a reboot. This box is primarily a mail server. I am not sure what is causing this and don't quite know how to go about finding out.
I have looked through the logs, but I am not sure what I am looking for. I realize that there could be a million different things causing this.
If anyone could offer some direction on how to troubleshoot this sort of problem, it would be greatly appreciated.
Thanks!
P.S. In all the years I have been using Linux, I have never had a lockup issue like this. Now that I think of it, that is kind of amazing. _________________ ZippyJay |
|
|
|
anello Guru
Joined: 17 Jul 2005 Posts: 557 Location: EU -> DE -> Stuttgart
|
Posted: Wed Jul 19, 2006 3:07 pm Post subject: |
|
|
Post the last couple entries of /var/log/messages right before it locks up. _________________ Antonino Catinello | http://catinello.eu |
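A rough sketch of one way to pull those entries out, assuming the log is /var/log/messages and that each boot shows up as a "syslogd ...: restart." line like the ones quoted later in this thread. The function name is made up for illustration.

```shell
#!/bin/sh
# Print each syslogd restart line together with the N lines logged before it.
# Assumptions: default log path /var/log/messages; restart lines look like
# "syslogd 1.4.1: restart." (format taken from the excerpt in this thread).
entries_before_restart() {
    log_file="${1:-/var/log/messages}"
    count="${2:-5}"
    # -B prints that many lines of leading context before every match;
    # the matching restart line itself is printed as well.
    grep -B "$count" 'syslogd .*: restart' "$log_file"
}
```

Calling `entries_before_restart /var/log/messages 10` would list every restart with up to ten preceding lines. The restart that follows a hard reset is the interesting one, since whatever was logged last before it is the closest record of the lockup.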
|
|
|
ZippyJay n00b
Joined: 30 Nov 2004 Posts: 73 Location: Ix
|
Posted: Wed Jul 19, 2006 3:43 pm Post subject: |
|
|
The messages log doesn't show much. It just shows syslogd restarting (as normal), and then on the 19th it shows another syslogd restart (this is when I manually rebooted it). The other entries are just sudo commands that I have run (messing around with logwatch exceptions in the days before, and later starting to look through log files). Nothing of much use.
Code: | Jul 15 03:12:59 mail syslogd 1.4.1: restart.
Jul 16 03:12:05 mail syslogd 1.4.1: restart.
Jul 17 03:12:03 mail syslogd 1.4.1: restart.
Jul 17 09:04:32 mail sudo: zippyjay : TTY=pts/0 ; PWD=/etc ; USER=root ; COMMAND=/bin/cat fstab
Jul 17 10:56:25 mail sudo: zippyjay : TTY=pts/0 ; PWD=/etc/log.d/conf ; USER=root ; COMMAND=/bin/nano -w ignore.conf
Jul 18 03:12:17 mail syslogd 1.4.1: restart.
Jul 18 09:13:25 mail sudo: zippyjay : TTY=pts/0 ; PWD=/etc/log.d/conf ; USER=root ; COMMAND=/bin/nano -w ignore.conf
Jul 19 10:51:01 mail syslogd 1.4.1: restart.
Jul 19 10:51:05 mail kernel: PCI: Found IRQ 5 for device 0000:02:0c.0
Jul 19 07:42:21 mail sudo: zippyjay : TTY=pts/0 ; PWD=/var/log ; USER=root ; COMMAND=/bin/cat auth.log
etc...
etc...
|
Thanks for your response. Any other thoughts? _________________ ZippyJay |
|
|
|
anello Guru
Joined: 17 Jul 2005 Posts: 557 Location: EU -> DE -> Stuttgart
|
Posted: Wed Jul 19, 2006 8:46 pm Post subject: |
|
|
How do you reboot? Do you need to reset the server, or can you still access the terminal and type reboot? If so, what exactly happens when you say the server locks up? _________________ Antonino Catinello | http://catinello.eu |
|
|
|
ZippyJay n00b
Joined: 30 Nov 2004 Posts: 73 Location: Ix
|
Posted: Wed Jul 19, 2006 9:20 pm Post subject: |
|
|
When the server locks up, it no longer receives or sends mail, or allows logins. The hard drive activity light stays lit. I had to perform a hard reset by pressing the power button. When it locks up, I have no access to the console/terminal or ssh. _________________ ZippyJay |
|
|
|
anello Guru
Joined: 17 Jul 2005 Posts: 557 Location: EU -> DE -> Stuttgart
|
Posted: Thu Jul 20, 2006 5:19 pm Post subject: |
|
|
Hmm, have you had any filesystem issues lately? Maybe your hard drive is corrupt. I would also suspect the RAM. Have you already run any tests on this hardware? _________________ Antonino Catinello | http://catinello.eu |
|
|
|
ZippyJay n00b
Joined: 30 Nov 2004 Posts: 73 Location: Ix
|
Posted: Fri Jul 28, 2006 3:07 pm Post subject: |
|
|
Sorry for the delayed response. I ran memtest86 from the 2006.0 minimal Gentoo install disk, and it didn't find any errors in the RAM. I haven't had a chance to run any hard disk tests. What programs do you suggest for testing the HD? I suppose I could just boot from a UBCD and run some of the test programs it provides.
Any thoughts?
Thanks, _________________ ZippyJay |
|
|
|
wynn Advocate
Joined: 01 Apr 2005 Posts: 2421 Location: UK
|
Posted: Fri Jul 28, 2006 3:25 pm Post subject: |
|
|
Quote: | What programs do you suggest for testing the HD? | The manufacturer's web site should have an HD testing program. HGST drives here so I use the IBM/Hitachi Drive Fitness test. _________________ The avatar is jorma, a "duck" from "Elephants Dream": the film and all the production materials have been made available under a Creative Commons Attribution 2.5 License, see orange.blender.org for details. |
|
|
|
ZippyJay n00b
Joined: 30 Nov 2004 Posts: 73 Location: Ix
|
Posted: Fri Jul 28, 2006 4:48 pm Post subject: |
|
|
wynn wrote: | The manufacturer's web site should have an HD testing program. HGST drives here so I use the IBM/Hitachi Drive Fitness test. |
I am assuming there is no way to run these programs on a live system, correct? In other words, I have to boot from a CD or something similar in order to run this sort of check on my HD. At least, that is how I have always performed HD tests in the past. This is not the end of the world, but it would be nice not to have to take the system down in order to test the drives. _________________ ZippyJay |
|
|
|
davidgurvich Veteran
Joined: 23 Apr 2004 Posts: 1063
|
Posted: Fri Jul 28, 2006 5:09 pm Post subject: |
|
|
You could also emerge smartmontools from portage.
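As a rough sketch of what that could look like on a running system: the smartctl flags below are standard smartmontools options, but the device name /dev/hda and the function names are assumptions for illustration, and SMART may first need enabling on the drive with `smartctl -s on`.

```shell
#!/bin/sh
# Assumptions: smartmontools is installed (emerge sys-apps/smartmontools)
# and the disk to check is /dev/hda; adjust the device for your system.
# SMARTCTL can be overridden, e.g. to point at a different binary.
SMARTCTL="${SMARTCTL:-smartctl}"

# Query the drive's health status and error log, then start a long self-test.
smart_check() {
    dev="${1:-/dev/hda}"
    "$SMARTCTL" -H "$dev"          # overall health self-assessment
    "$SMARTCTL" -l error "$dev"    # errors the drive itself has recorded
    "$SMARTCTL" -t long "$dev"     # extended self-test, runs inside the drive
}

# Show the self-test log once the test has had time to finish.
smart_results() {
    dev="${1:-/dev/hda}"
    "$SMARTCTL" -l selftest "$dev"
}
```

The self-test runs in the drive's firmware while the system stays up, so the box does not have to be taken offline for it.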
However, the problem may not be HD-related at all; it may be related to the kernel and motherboard. Have you tried booting the kernel with options that disable ACPI and other features, or perhaps using a different kernel entirely? Many ACPI issues go away with a different kernel.
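For the boot-option route, parameters such as `acpi=off` (and, commonly, `noapic`) are appended to the kernel line in the boot loader configuration. A minimal sketch for confirming after a reboot that a parameter actually reached the kernel follows; the function name is hypothetical, and the optional second argument exists only so the check can be tried against a sample file.

```shell
#!/bin/sh
# Succeed if the given parameter is present on the kernel command line.
# /proc/cmdline holds the parameters the running kernel was booted with.
has_boot_param() {
    param="$1"
    cmdline_file="${2:-/proc/cmdline}"
    # Put each parameter on its own line, then look for an exact match.
    tr ' ' '\n' < "$cmdline_file" | grep -qxF -- "$param"
}
```

For example, `has_boot_param acpi=off && echo "ACPI is disabled"` prints the message only when the running kernel was booted with that option.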
Could you post emerge --info and the motherboard information? |
|
|
|
ZippyJay n00b
Joined: 30 Nov 2004 Posts: 73 Location: Ix
|
Posted: Fri Jul 28, 2006 5:35 pm Post subject: |
|
|
I have updated the kernel with a newer version, but I didn't make any changes really. I will look into the acpi disable suggestion.
Here is the output from emerge --info
Code: | Portage 2.1-r1 (default-linux/x86/2006.0, gcc-3.3.6, glibc-2.3.6-r4, 2.6.16-gentoo-r12 i686)
=================================================================
System uname: 2.6.16-gentoo-r12 i686 Pentium III (Coppermine)
Gentoo Base System version 1.6.15
app-admin/eselect-compiler: [Not Present]
dev-lang/python: 2.3.5-r2, 2.4.3-r1
dev-python/pycrypto: 2.0.1-r5
dev-util/ccache: [Not Present]
dev-util/confcache: [Not Present]
sys-apps/sandbox: 1.2.17
sys-devel/autoconf: 2.13, 2.59-r7
sys-devel/automake: 1.4_p6, 1.5, 1.6.3, 1.7.9-r1, 1.8.5-r3, 1.9.6-r2
sys-devel/binutils: 2.16.1-r3
sys-devel/gcc-config: 1.3.13-r3
sys-devel/libtool: 1.5.22
virtual/os-headers: 2.6.11-r2
ACCEPT_KEYWORDS="x86"
AUTOCLEAN="yes"
CBUILD="i686-pc-linux-gnu"
CFLAGS="-O2 -mcpu=i686 -fomit-frame-pointer"
CHOST="i686-pc-linux-gnu"
CONFIG_PROTECT="/etc"
CONFIG_PROTECT_MASK="/etc/env.d /etc/gconf"
CXXFLAGS="-O2 -mcpu=i686 -fomit-frame-pointer"
DISTDIR="/usr/portage/distfiles"
FEATURES="autoconfig distlocks metadata-transfer sandbox sfperms strict"
GENTOO_MIRRORS="ftp://gentoo.chem.wisc.edu/gentoo/ http://gentoo.chem.wisc.edu/gentoo/ http://prometheus.cs.wmich.edu/gentoo"
MAKEOPTS="-j2"
PKGDIR="/usr/portage/packages"
PORTAGE_RSYNC_OPTS="--recursive --links --safe-links --perms --times --compress --force --whole-file --delete --delete-after --stats --timeout=180 --exclude='/distfiles' --exclude='/local' --exclude='/packages'"
PORTAGE_TMPDIR="/var/tmp"
PORTDIR="/usr/portage"
SYNC="rsync://rsync.us.gentoo.org/gentoo-portage"
USE="x86 alsa apache2 apm bitmap-fonts cli crypt dlloader dri eds emboss encode esd foomaticdb fortran gdbm gif gpm gstreamer ipv6 isdnlog jpeg libg++ libwww mp3 mpeg ncurses nls nptl ogg opengl pam pcre perl png pppd python qt3 qt4 readline reflection sasl sdl session spell spl ssl tcpd truetype truetype-fonts type1-fonts udev vhost vorbis xml xorg zlib elibc_glibc input_devices_keyboard input_devices_mouse input_devices_evdev kernel_linux userland_GNU"
Unset: CTARGET, EMERGE_DEFAULT_OPTS, INSTALL_MASK, LANG, LC_ALL, LDFLAGS, LINGUAS, PORTAGE_RSYNC_EXTRA_OPTS, PORTDIR_OVERLAY |
What motherboard info would you like? _________________ ZippyJay |
|
|
|
davidgurvich Veteran
Joined: 23 Apr 2004 Posts: 1063
|
Posted: Fri Jul 28, 2006 6:40 pm Post subject: |
|
|
What chipset, manufacturer, model #. But for now try changing the version of portage from 2.1-r1 to 2.1.1_pre3-r5 or 2.0.54-r2. Then emerge --metadata, emerge --sync, and see what happens with emerge -Dp world. |
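A sketch of how the version change might be done, assuming the target version is selected with an exact "=" package atom. The function name and the overridable EMERGE variable are made up for illustration. A pre-release version may also need to be unmasked or keyworded first; that step is not shown.

```shell
#!/bin/sh
# Assumption: a specific Portage version is requested with an exact "=" atom,
# e.g. =sys-apps/portage-2.0.54-r2.
# EMERGE can be overridden, e.g. to point at a wrapper script.
EMERGE="${EMERGE:-emerge}"

switch_portage() {
    version="$1"
    atom="=sys-apps/portage-${version}"
    # Preview what would be merged; stop here if the preview fails.
    "$EMERGE" --pretend "$atom" || return 1
    # Install that exact version without recording it in the world file.
    "$EMERGE" --oneshot "$atom"
}
```

Running `switch_portage 2.0.54-r2` would then be followed by the emerge --metadata, emerge --sync, and emerge -Dp world steps described in the post.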
|
|
|
ZippyJay n00b
Joined: 30 Nov 2004 Posts: 73 Location: Ix
|
Posted: Thu Aug 03, 2006 2:19 pm Post subject: |
|
|
davidgurvich wrote: | What chipset, manufacturer, model #. |
DELL OptiPlex GX200
P3 866MHz, 80526
Motherboard 5026D
davidgurvich wrote: | But for now try changing the version of portage from 2.1-r1 to 2.1.1_pre3-r5 or 2.0.54-r2. Then emerge --metadata, emerge --sync, and see what happens with emerge -Dp world. |
Thank you for the response. Can you explain what this will do and why I should try it? Thanks for humoring me! _________________ ZippyJay |
|
|
|
davidgurvich Veteran
Joined: 23 Apr 2004 Posts: 1063
|
Posted: Fri Aug 04, 2006 2:35 am Post subject: |
|
|
Many older motherboards have buggy ACPI.
I've had trouble with dependencies using this version of Portage, so I suggest changing to a different version.
Code: | # emerge --metadata    # rebuild the cache; --sync does that too, but more slowly because files are downloaded; the db is also different for 2.0.x and 2.1.x
# emerge --sync    # might as well get recent updates
# emerge -Dp world    # same as emerge --deep --pretend world |
|
|
|
|
|