Gentoo Forums
unexpected server halt
knight77
n00b


Joined: 29 Jun 2009
Posts: 25

Posted: Mon Jun 29, 2009 9:24 am    Post subject: unexpected server halt

Hello.

We manage 2 Gentoo servers, and one of them just shut itself down without anyone with root access telling it to do so.

Checking /var/log/everything/current we found the following lines:
[...]
Jun 29 10:21:39 [postfix/qmgr] 6EED11CAE7: removed
Jun 29 10:22:10 [vol_id] no device_
- Last output repeated 8 times -
Jun 29 10:22:13 [shutdown] shutting down for system halt
Jun 29 10:22:13 [init] Switching to runlevel: 0
Jun 29 10:22:16 [snmpd] Received TERM or STOP signal... shutting down...
[...]

As far as I can see, the server started to shut down right after the [vol_id] messages. I have no idea why the vol_id entries were logged. All I could find on the Internet is that they may be related to udev, but nothing that mentions both vol_id and "no device_".
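For anyone who wants to pull the same kind of context out of a larger log, here is a minimal Python sketch. It assumes the log has already been read into a list of lines; the function name `lines_before_shutdown` and the `[shutdown]` marker string are illustrative choices, and the sample data is simply the excerpt quoted above.

```python
from collections import deque


def lines_before_shutdown(log_lines, marker="[shutdown]", context=5):
    """Return up to `context` lines preceding the first line that contains `marker`."""
    recent = deque(maxlen=context)  # rolling window of the most recent lines
    for line in log_lines:
        if marker in line:
            return list(recent)
        recent.append(line.rstrip("\n"))
    return []  # the marker never appeared


# Sample data: the excerpt from /var/log/everything/current quoted above.
sample = [
    "Jun 29 10:21:39 [postfix/qmgr] 6EED11CAE7: removed",
    "Jun 29 10:22:10 [vol_id] no device_",
    "- Last output repeated 8 times -",
    "Jun 29 10:22:13 [shutdown] shutting down for system halt",
    "Jun 29 10:22:13 [init] Switching to runlevel: 0",
]

if __name__ == "__main__":
    for line in lines_before_shutdown(sample):
        print(line)
```

This only shows what was logged just before the halt; it cannot tell whether those lines caused it.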

Has anybody ever run into a similar problem? Any information about the server is available on request.

The Gentoo server is fully up to date and serves as a mail and web server (postfix + apache (vhosts)).

Thank you for your time or hints.

audiodef
Watchman


Joined: 06 Jul 2005
Posts: 6639
Location: The soundosphere

Posted: Mon Jun 29, 2009 8:20 pm

I've never seen this happen. Has it happened more than once?
_________________
decibel Linux: https://decibellinux.org
Github: https://github.com/Gentoo-Music-and-Audio-Technology
Facebook: https://www.facebook.com/decibellinux
Discord: https://discord.gg/73XV24dNPN

knight77
n00b


Joined: 29 Jun 2009
Posts: 25

Posted: Tue Jun 30, 2009 5:28 am

No, it hasn't happened before; this is the first time I've seen this behaviour. That's why I'm confused too: I can't figure out why the server decided to shut down by itself.

It's located in a provider's datacenter, so another possibility, as far as I can guess, is that somebody mistook it for another server and, perhaps believing it was a Windows machine, pressed Ctrl-Alt-Del, making it shut down. Still, if that were the case, it should have rebooted, not halted.
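Whether Ctrl-Alt-Del reboots or halts is decided by the `ctrlaltdel` entry in /etc/inittab (sysvinit format `id:runlevels:action:process`). A minimal Python sketch of how that could be checked follows. The sample line is the commonly shipped default and is an assumption, not the actual content of this server's inittab; the helper names are illustrative.

```python
def ctrlaltdel_command(inittab_text):
    """Return the process field of the first ctrlaltdel entry, or None if absent."""
    for raw in inittab_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        fields = line.split(":", 3)  # id:runlevels:action:process
        if len(fields) == 4 and fields[2] == "ctrlaltdel":
            return fields[3]
    return None


def classify(command):
    """Roughly classify a shutdown command line by its standalone flags."""
    if command is None:
        return "no ctrlaltdel entry"
    args = command.split()
    if "-r" in args:
        return "reboot"
    if "-h" in args or "-H" in args or "-P" in args:
        return "halt"
    return "unknown"


# Assumed sample, NOT taken from the server discussed in this thread.
SAMPLE_INITTAB = """\
# What to do at the "Three Finger Salute".
ca:12345:ctrlaltdel:/sbin/shutdown -r now
"""

if __name__ == "__main__":
    print(classify(ctrlaltdel_command(SAMPLE_INITTAB)))
```

If the real entry uses `-r`, a halt as seen in the log would point away from the Ctrl-Alt-Del theory.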

What I still can't explain is what caused the 9 [vol_id] messages. They may or may not be related to the shutdown, but so far they are the closest thing we have to an explanation for it.

Does anybody have any idea what might have caused the [vol_id] syslog messages? If any data about the server is needed, please ask.

Thank you for your time.

audiodef
Watchman


Joined: 06 Jul 2005
Posts: 6639
Location: The soundosphere

Posted: Tue Jun 30, 2009 1:17 pm

Has it happened since?

Also, any reason to believe someone might be trying to hack/crack in the data center?

knight77
n00b


Joined: 29 Jun 2009
Posts: 25

Posted: Wed Jul 01, 2009 9:32 am

No, it hasn't happened since.

I have rechecked the server and the other Gentoo server we manage (similar role, but without an MTA), and the vol_id message hasn't shown up again in at least the last 5 days.

The datacenter provider keeps pretty much everything under lock and key, so we believe there is only a very slim possibility that somebody physically interfered with the server.

As far as I can tell, there are 2 possibilities:
1. the vol_id is related to the shutdown being initiated, but we don't know how.
2. something else still unknown caused the halt.

The only "strange" thing about that server is that the temperature of one hard drive was nearing its upper limit (it had been running at 53 degrees C for a few days, while the maximum allowed in the vendor specs is 55). Right now it's running at 47.
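To keep an eye on how close the drive gets to that limit, the temperature can be read from SMART data. The sketch below is a rough Python illustration that parses the attribute table printed by `smartctl -A` and looks for attribute 194 (Temperature_Celsius). The sample row is made up for illustration, not output from this server, and some drives report temperature under a different attribute ID, so treat the ID as an assumption.

```python
def drive_temperature(smartctl_output):
    """Return the raw value of SMART attribute 194 as an int, or None if not found."""
    for line in smartctl_output.splitlines():
        fields = line.split()
        # Attribute rows: ID# NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
        if len(fields) >= 10 and fields[0] == "194":
            try:
                return int(fields[9])
            except ValueError:
                return None
    return None


def headroom(temp_c, limit_c=55):
    """Degrees C left before the vendor limit (55 is the figure quoted in this post)."""
    return limit_c - temp_c


# Hypothetical sample row, shaped like `smartctl -A` output.
SAMPLE_SMART = """\
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
194 Temperature_Celsius     0x0022   053   047   000    Old_age   Always       -       47
"""

if __name__ == "__main__":
    temp = drive_temperature(SAMPLE_SMART)
    print(temp, headroom(temp))
```

With the figures from this post, 53 degrees C leaves only 2 degrees of headroom, while 47 leaves 8.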

The rc-status on the self-halted server is the following:
Runlevel: default
apache2
coldplug
courier-authlib
courier-imapd
courier-pop3d-ssl
fcron
fwinit
local
lsa
metalog
mysql
named
net.eth0
net.eth1
netmount
postfix
pure-ftpd
rngd
saslauthd
snmpd
sshd
uptimed

Note: The lsa service is a hardware resources (CPU, RAM, Disks) monitoring agent we use.

So far we're still in the dark as to what caused the halt. If nobody else updates this thread with any ideas, I believe it can be closed; we'll reopen it (if possible) or create a new one if another unexpected shutdown happens.

Thank you for your time.

audiodef
Watchman


Joined: 06 Jul 2005
Posts: 6639
Location: The soundosphere

Posted: Wed Jul 01, 2009 6:05 pm

Since you said the hard disk was running hot, do you have a temp sensor running that could shut down the machine if it gets too hot? Or perhaps there is a HW sensor that simply shuts the machine off under certain heat conditions without the need for software daemons.

Also, if you have a log daemon running, you should check the logs for clues.

unixbhaskar
Tux's lil' helper


Joined: 29 Nov 2007
Posts: 119
Location: India

Posted: Fri Jul 03, 2009 5:48 pm    Post subject: Hope this is the right place for apache problem

When I tried to emerge apache, I got this error:

checking if POSIX sems affect threads in the same process... no
checking if SysV sems affect threads in the same process... no
checking if fcntl locks affect threads in the same process... no
checking if flock locks affect threads in the same process... no
checking for entropy source... configure: error: /dev/urandom not found or unreadable.

!!! Please attach the following file when seeking support:
!!! /var/tmp/portage/dev-libs/apr-1.3.5/work/apr-1.3.5/config.log
*
* ERROR: dev-libs/apr-1.3.5 failed.
* Call stack:
* ebuild.sh, line 49: Called src_configure
* environment, line 2653: Called econf '--enable-layout=gentoo' '--enable-nonportable-atomics' '--enable-threads' '--with-devrandom=/dev/urandom'
* ebuild.sh, line 534: Called die
* The specific snippet of code:
* die "econf failed"
* The die message:
* econf failed
*
* If you need support, post the topmost build error, and the call stack if relevant.
* A complete build log is located at '/var/tmp/portage/dev-libs/apr-1.3.5/temp/build.log'.
* The ebuild environment file is located at '/var/tmp/portage/dev-libs/apr-1.3.5/temp/environment'.
*

>>> Failed to emerge dev-libs/apr-1.3.5, Log file:

>>> '/var/tmp/portage/dev-libs/apr-1.3.5/temp/build.log'

* Messages for package dev-libs/apr-1.3.5:

*
* ERROR: dev-libs/apr-1.3.5 failed.
* Call stack:
* ebuild.sh, line 49: Called src_configure
* environment, line 2653: Called econf '--enable-layout=gentoo' '--enable-nonportable-atomics' '--enable-threads' '--with-devrandom=/dev/urandom'
* ebuild.sh, line 534: Called die
* The specific snippet of code:
* die "econf failed"
* The die message:
* econf failed
*
* If you need support, post the topmost build error, and the call stack if relevant.
* A complete build log is located at '/var/tmp/portage/dev-libs/apr-1.3.5/temp/build.log'.
* The ebuild environment file is located at '/var/tmp/portage/dev-libs/apr-1.3.5/temp/environment'.
*

Any clear-cut solution would be appreciated. Thanks in advance.
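The configure error above says /dev/urandom was not found or was unreadable, so a first step is to confirm which of those applies. Below is a minimal Python sketch of such a check; the function name is illustrative. On Linux, /dev/urandom is normally a character device (major 1, minor 9), but what was actually wrong on this system is not stated in the thread.

```python
import os
import stat


def check_random_device(path="/dev/urandom"):
    """Return a short status string describing the state of the device node."""
    try:
        st = os.stat(path)
    except FileNotFoundError:
        return "missing"
    except PermissionError:
        return "unreadable"
    if not stat.S_ISCHR(st.st_mode):
        # Exists, but is a regular file, directory, etc. rather than a device node.
        return "not a character device"
    if not os.access(path, os.R_OK):
        return "unreadable"
    return "ok"


if __name__ == "__main__":
    print(check_random_device())
```

Running it inside the same environment as the build (for example a chroot) matters, since /dev there may differ from the host's.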

knight77
n00b


Joined: 29 Jun 2009
Posts: 25

Posted: Tue Jul 07, 2009 10:35 am    Post subject: unexpected server halt

Hello again.

We checked with the datacenter provider and they confirmed nobody had accessed the room where the server is located in the timeframe when the server started the halt. As such, accidental halt by somebody in the datacenter has been ruled out.

We'll try to stop the server for a few minutes in order to check the BIOS settings for any hardware temperature protection that might be enabled. Also, as a side note, we'll try removing the cover of the tower, hoping the A/C will cool it better than the fans already installed in the case (too few).

I will post here again in case we find out something new.

Thank you for your time.

PS. What does the apache emerge error from the previous post have to do with this thread? Doesn't unixbhaskar know how to open a new thread?

audiodef
Watchman


Joined: 06 Jul 2005
Posts: 6639
Location: The soundosphere

Posted: Tue Jul 07, 2009 12:46 pm    Post subject: Re: unexpected server halt

knight77 wrote:
Hello again.
We'll try to stop the server for a few minutes in order to check the BIOS settings for any hardware temperature protection that might be enabled. Also, as a side note, we'll try to remove the cover on the tower hoping the A/C will cool it better than the already installed fans in the case (too few).


As long as you keep the circuitry dust-free, that might help. Good luck! :)

unixbhaskar
Tux's lil' helper


Joined: 29 Nov 2007
Posts: 119
Location: India

Posted: Tue Jul 07, 2009 2:54 pm

Ignore the apache thing, I have rectified it.

Knight, have you read the subject line of my post?
_________________
Musing with GNU/Linux :)

Lenovo Thinkpad x250
x86_64 Intel(R) Core(TM) i5-5200U CPU @ 2.20GHz GenuineIntel GNU/Linux
RAM : 8 GB
Kernel :Latest customized kernel
OS: Gentoo/Arch/Slackware/Debian/openSUSE/Fedora
Intel 965GM Chipset
All times are GMT