Gentoo Forums
What the heck is wrong with Gentoo
rkfsm
Tux's lil' helper


Joined: 03 Jun 2005
Posts: 100
Location: Charleston, SC, USA

Posted: Sun Jan 13, 2008 6:10 pm    Post subject: What the heck is wrong with Gentoo

Several weeks ago, my Gentoo machine died. It was like someone had picked 30% of my files at random and deleted them. I rebuilt again from scratch on a new machine. New motherboard, CPU and, most importantly, new HDD's. I made sure I used all new passwords. Well, now it has happened again, only worse. This time about 75% of the files are gone. Family photos, 20GB of music, work documents and my web site are all gone. Most of it has been backed up, but still.... WHY?

The damage is spread between two hard drives; a 40GB Maxtor and a 120GB Seagate.

Can someone tell me what I can do to analyze what's going on so that I can stop it?

RK
Veldrin
Veteran


Joined: 27 Jul 2004
Posts: 1945
Location: Zurich, Switzerland

Posted: Sun Jan 13, 2008 6:22 pm

You mentioned new HDDs: only for the system, or for your data too? 40 and 120 GB do not really sound new... how old are those two?

Install smartmontools, check the disks themselves, and run a long test:
Code:
# smartctl -t offline /dev/<hdd>
# smartctl -A /dev/<hdd>
# smartctl -t long /dev/<hdd>


The first command runs the offline data collection, which updates the SMART information about your HDD.
The second command should give you a nice table of attributes showing the approximate state of the disk.
The third does some intensive testing; this might take some time (even hours).
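To make the attribute table from the second command easier to read, here is a minimal sketch of a filter. The helper name `smart_warnings` and the choice of three counters are assumptions made for illustration, not part of smartmontools; it also assumes the usual ten-column layout of the `smartctl -A` table, where the raw value is the last column.

```shell
#!/bin/sh
# Hypothetical helper (not part of smartmontools): reads `smartctl -A`
# output on stdin and prints NAME=RAW for a few sector-related counters
# whose raw value (assumed to be the 10th column) is greater than zero.
smart_warnings() {
    awk '$2 ~ /^(Reallocated_Sector_Ct|Current_Pending_Sector|Offline_Uncorrectable)$/ && $10 + 0 > 0 {
        print $2 "=" $10
    }'
}

# Usage on a real disk (needs root and smartmontools installed):
#   smartctl -A /dev/hda | smart_warnings
```

No output only means that these three counters are zero. It does not prove the disk is healthy, so the long self-test is still worth running.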

cheers
V.
schachti
Advocate


Joined: 28 Jul 2003
Posts: 3765
Location: Gifhorn, Germany

Posted: Sun Jan 13, 2008 6:42 pm

Which file system did you use?
_________________
Never argue with an idiot. He brings you down to his level, then beats you with experience.

How-To: Storing data encrypted on DVD.
rkfsm
Tux's lil' helper


Joined: 03 Jun 2005
Posts: 100
Location: Charleston, SC, USA

Posted: Sun Jan 13, 2008 7:34 pm

/dev/hda1 /boot ext2 (OK)
/dev/hda3 / reiserfs (damaged)
/dev/hdb2 /usr/portage reiserfs (OK)
/dev/hdb3 /pub jfs (damaged)

hda2 and hdb1 are both swap partitions.

Every directory had damage done. /boot was deleted entirely, /home and /root still existed but were empty. /bin and /sbin had only a few files in each.

The system log has been corrupted.

RK
Habbit
Apprentice


Joined: 01 Sep 2007
Posts: 237
Location: 3.7137 W, 40.3873 N

Posted: Sun Jan 13, 2008 8:15 pm

Well, I don't know of a "rampage mode" kernel patch being included in the gentoo-sources patches, so if you tell us the new machine is a new installation with no relationship to the old one (new hardware, software & system configs), I can only assume the damage comes from outside. The main causes that spring to mind are power failures (usually at this point some people will scream that reiserfs is to blame, but it seems a rock-solid jfs partition has also been damaged) and human attack. You mentioned you ran a website. Can you check any logs that haven't been compromised? (You mentioned the syslog was corrupted, maybe by an attacker, so you could try your ISP's logs.) Have you put proper security measures in place? (New passwords are not enough; the Hardened profile would do much better for a server.)
_________________
Code:
~ $ objdump -d ./habbit_mind
90      xchg %rax, %rax
EB FD   jmp $-3
MostAwesomeDude
Guru


Joined: 12 Aug 2007
Posts: 373

Posted: Sun Jan 13, 2008 10:26 pm

I would gladly blame reiserfs, but something else is going on here.

First off, were the disks dirty at any point, or did they always mount clean? Are the disks really new, especially the 40GB one? (A quick search tells me that Maxtor is not making new 40GB drives.) What does badblocks have to say?
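For anyone who has not used badblocks before, here is a small sketch of the two scans that do not wipe data. The wrapper name `badblocks_cmd` and its mode names are made up for illustration; only the `-s` (progress), `-v` (verbose) and `-n` (non-destructive read-write) flags come from badblocks itself. The wrapper only prints the command so it can be reviewed before it is run as root.

```shell
#!/bin/sh
# Hypothetical wrapper: prints a badblocks command line instead of running it.
badblocks_cmd() {
    mode=$1
    dev=$2
    case $mode in
        # Default badblocks behaviour: read-only scan of the device.
        read-only)       echo "badblocks -sv $dev" ;;
        # Read-write test that restores the data; the device must be unmounted.
        non-destructive) echo "badblocks -nsv $dev" ;;
        *) echo "unknown mode: $mode" >&2; return 1 ;;
    esac
}

# Example: review the command, then run it by hand as root.
#   badblocks_cmd read-only /dev/hda
```

The destructive write test (`-w`) is left out on purpose, since it erases the disk.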
_________________
Don't believe the "n00b" under my name.
rkfsm
Tux's lil' helper


Joined: 03 Jun 2005
Posts: 100
Location: Charleston, SC, USA

Posted: Sun Jan 27, 2008 2:24 am

The Maxtor is fairly old and came from an unused MS box, but smartctl shows no errors. The Seagate was given to me at Christmas.

I have not had any dirty partitions for a long time. However, after my system crashed, when I was getting errors saying things like the init could not be found, I had to do a hard reset and when I ran fsck from a livecd to look at the damage, I had dozens of screens of text about missing links and metadata.
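When looking at damage like this from a livecd, it can help to run the checkers in report-only mode first so nothing gets rewritten before the evidence has been seen. Below is a rough sketch covering the filesystems mentioned in this thread. The function name `fsck_check_cmd` is an assumption for illustration; it only prints the command to run against an unmounted partition.

```shell
#!/bin/sh
# Hypothetical helper: maps a filesystem type to a check command that
# reports problems without repairing them. It prints the command only.
fsck_check_cmd() {
    fstype=$1
    dev=$2
    case $fstype in
        ext2|ext3) echo "e2fsck -n $dev" ;;          # open read-only, answer "no" to all prompts
        reiserfs)  echo "reiserfsck --check $dev" ;; # consistency check, no fixes
        jfs)       echo "jfs_fsck -n $dev" ;;        # report errors, do not repair
        xfs)       echo "xfs_repair -n $dev" ;;      # no-modify mode
        *) echo "no read-only check known for: $fstype" >&2; return 1 ;;
    esac
}

# Example for the damaged root partition described earlier in the thread:
#   fsck_check_cmd reiserfs /dev/hda3
```

Saving the output of each check to a file on another disk keeps a record in case the system log is lost again.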

It's all moot now.

I have reformatted and am installing hardened with SSP, PIE, PaX and SELinux on XFS (root, home and pub) and ext2 (boot and usr/portage) partitions.

RK
padoor
Advocate


Joined: 30 Dec 2005
Posts: 4185
Location: india

Posted: Sun Jan 27, 2008 3:54 am

I have a feeling the new version of boost has something to do with this problem.
I have no proof to support it,
but the following happened.
I crashed my nicely working system by installing kde-4, which failed to install with a glxchooseview error.
Then I ran a depclean and killed my whole system; gcc, among other things, went missing.
I decided to reinstall and started a new installation.
Before emerging kde4 as my desktop, I updated the system with only xorg installed.
The expat problem was minor, since only a small world set was present.
Then, after emerging boost, I found libstdc++ missing.
revdep-rebuild installed gcc-3.3,4 and the stdc problem was solved.
Then I found that boost wanted to emerge again, and this time gcc-4.1.2 went missing.
It took 6 tries to emerge the latest version of boost.
Many times it tried to kill init or reboot, some files went missing, etc.
This could be the reason for your trouble.
_________________
reach out a little bit more to catch it (DON'T BELIEVE the advocate part under my user name)
likewhoa
l33t


Joined: 04 Oct 2006
Posts: 778
Location: Brooklyn, New York

Posted: Sun Jan 27, 2008 7:57 am

padoor wrote:
I have a feeling the new version of boost has something to do with this problem.
I have no proof to support it,
but the following happened.
I crashed my nicely working system by installing kde-4, which failed to install with a glxchooseview error.
Then I ran a depclean and killed my whole system; gcc, among other things, went missing.
I decided to reinstall and started a new installation.
Before emerging kde4 as my desktop, I updated the system with only xorg installed.
The expat problem was minor, since only a small world set was present.
Then, after emerging boost, I found libstdc++ missing.
revdep-rebuild installed gcc-3.3,4 and the stdc problem was solved.
Then I found that boost wanted to emerge again, and this time gcc-4.1.2 went missing.
It took 6 tries to emerge the latest version of boost.
Many times it tried to kill init or reboot, some files went missing, etc.
This could be the reason for your trouble.


Have you filed a bug report?
gkmac
Guru


Joined: 19 Jan 2003
Posts: 333
Location: West Sussex, UK

Posted: Sun Jan 27, 2008 1:26 pm    Post subject: Re: What the heck is wrong with Gentoo

rkfsm wrote:
I rebuilt again from scratch on a new machine. New motherboard, CPU and, most importantly, new HDD's.
Did you change the PSU as well?

I've had an old hard drive come up with mysterious errors and replaced it with a brand new one, only to find the same mysterious errors manifesting.

I swapped the PSU out for a brand new one, and both new and old hard drives have been error-free since.
_________________
If ~amd64 ebuilds are cutting edge, then git-9999 ebuilds are chainsaws.
"Not everyone can ride a unicycle, does that mean we should put another wheel on it?" - Lokheed
Clad in Sky
l33t


Joined: 04 May 2007
Posts: 888
Location: Germany

Posted: Sat Feb 02, 2008 8:32 am

Hello there,
I got a similar problem with a hdd. I ran the smartctl -t long command.
How do I get the results of that test? Shell says:
Code:

eichhorn susanne # smartctl -t long /dev/hda2
smartctl version 5.37 [i686-pc-linux-gnu] Copyright (C) 2002-6 Bruce Allen
Home page is http://smartmontools.sourceforge.net/

=== START OF OFFLINE IMMEDIATE AND SELF-TEST SECTION ===
Sending command: "Execute SMART Extended self-test routine immediately in off-line mode".
Drive command "Execute SMART Extended self-test routine immediately in off-line mode" successful.
Testing has begun.
Please wait 35 minutes for test to complete.
Test will complete after Sat Feb  2 10:59:03 2008

Use smartctl -X to abort test.
eichhorn susanne #

So what do I do after these 35min? smartctl -A /dev/hda2 ?
psycco
n00b


Joined: 27 Jan 2006
Posts: 31

Posted: Sat Feb 02, 2008 9:39 am

smartctl --all /dev/hda (Prints all SMART information)
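Note that the command above names the whole disk (/dev/hda) rather than a partition (/dev/hda2), since SMART data belongs to the drive. As a rough sketch, the helper below strips a trailing partition number from a device name. The name `whole_disk` is made up for illustration, and it assumes classic hdX/sdX style names with no other digits in the path.

```shell
#!/bin/sh
# Hypothetical helper: turns /dev/hda2 into /dev/hda by dropping everything
# from the first digit onward. Only suitable for hdX/sdX style names.
whole_disk() {
    printf '%s\n' "${1%%[0-9]*}"
}

# Once the self-test has finished, its result can be read from the
# self-test log (needs root and smartmontools installed):
#   smartctl -l selftest "$(whole_disk /dev/hda2)"
# or together with everything else:
#   smartctl --all "$(whole_disk /dev/hda2)"
```

The self-test log lists each test that was run with its status, which is the part of the `--all` output that answers the question about the long test.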
jcat
Veteran


Joined: 26 May 2006
Posts: 1337

Posted: Tue Feb 05, 2008 9:34 am

padoor wrote:
I have a feeling the new version of boost has something to do with this problem.
I have no proof to support it,
but the following happened.
I crashed my nicely working system by installing kde-4, which failed to install with a glxchooseview error.
Then I ran a depclean and killed my whole system; gcc, among other things, went missing.
I decided to reinstall and started a new installation.
Before emerging kde4 as my desktop, I updated the system with only xorg installed.
The expat problem was minor, since only a small world set was present.
Then, after emerging boost, I found libstdc++ missing.
revdep-rebuild installed gcc-3.3,4 and the stdc problem was solved.
Then I found that boost wanted to emerge again, and this time gcc-4.1.2 went missing.
It took 6 tries to emerge the latest version of boost.
Many times it tried to kill init or reboot, some files went missing, etc.
This could be the reason for your trouble.


It sounds like your problem is quite different from the one in this thread!

Cheers,
jcat
cwr
Veteran


Joined: 17 Dec 2005
Posts: 1969

Posted: Tue Feb 05, 2008 12:59 pm

I've seen something like this, tho' not with Gentoo, when the CMOS setting went out
to lunch and corrupted the memory timings. Everything appeared to work, and every
file and directory written was corrupted. See if your memory timings are what you
think they are.

Good luck - Will
All times are GMT