Gentoo Forums
[Solved] File system corruption at boot time
christerk
n00b


Joined: 28 Dec 2007
Posts: 6

Posted: Fri Dec 28, 2007 3:51 pm    Post subject: [Solved] File system corruption at boot time

Hi.

I've been trying to install my third Gentoo machine and have run into a strange issue.

The install process itself works fine: booting the install disc, partitioning, getting and unpacking the stage3/portage files, chrooting, building the kernel, building stuff such as syslog and cron, installing GRUB.

However, when I reboot for the first time, I run into a pretty major problem. The kernel loads and detects my devices perfectly fine, but as soon as the root file system is about to be started I get bad error messages. I've tried reinstalling a couple of times, tweaking things, but end up in the same place. The system inevitably panics with what appears to be file system corruption. ld.so complains, I've seen some glibc error messages, and various nondescript panics.

After booting back into the install CD and mounting the file system again, I find that a number of files have become corrupt (the stage3 bzip2 archive being one example), and an fsck of the device spews error messages all over the place.

Now, I would normally suspect hard drive failure, but in this particular case I have two drives in a raid1 mirror (using md, properly configured, with kernel support compiled in). The two drives are brand new Seagate drives, if it matters. I have also tried installing on a single drive, but get the same thing.

What is strange to me is that the chrooted part of the install works flawlessly. A lot of file access and compiling going on, so if the drives were that bad, it should show up long before the system was installed. Also, /proc/mdstat never reports any problems at all. Both drives are fully operational according to md.
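
For anyone who wants to check the md status from a script rather than by eye, here is a minimal sketch. It assumes the usual /proc/mdstat layout, where each array has a status map such as [UU] (all members up) or [U_] (one member missing). The function name degraded_arrays and the SAMPLE text are made up for illustration, and the regex only handles plain mdN array names.

```python
import re


def degraded_arrays(mdstat_text):
    """Return the names of md arrays whose status map shows a missing member."""
    degraded = []
    current = None
    for line in mdstat_text.splitlines():
        header = re.match(r"^(md\d+)\s*:", line)
        if header:
            current = header.group(1)
            continue
        # The status map looks like [UU]; an underscore marks a missing member.
        status = re.search(r"\[([U_]+)\]", line)
        if current and status:
            if "_" in status.group(1):
                degraded.append(current)
            current = None
    return degraded


# Hypothetical sample, used only when /proc/mdstat cannot be read.
SAMPLE = """Personalities : [raid1]
md0 : active raid1 sdb1[1] sda1[0]
      156288256 blocks [2/2] [UU]

unused devices: <none>
"""

if __name__ == "__main__":
    try:
        with open("/proc/mdstat") as f:
            text = f.read()
    except OSError:
        text = SAMPLE
    print(degraded_arrays(text) or "all arrays report every member up")
```

Note that a clean status map only says md has not dropped a drive from the array. It says nothing about whether the data written to the drives is intact, which fits the symptoms here: both members up, yet corrupt files.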


Some basic info of the hardware:

- Asus M2N-SLI motherboard, updated with the latest BIOS
- AMD Phenom 9500 CPU
- 2x2G DDR2-800 memory (Transcend branded) in dual channel setup
- 2x160G SATA Seagate Barracudas (new)

The Motherboard/CPU combination is confirmed to be working by another person on these forums, so there are no compatibility issues on that front.

Basic install setup:

- Latest amd64 install disc
- drives set up in raid1, with ext2 fs (Tried ext3 and without raid1 as well without any change)
- Using latest amd64 stage3 package
- latest gentoo-sources, using default settings with raid1 support enabled
- Grub as boot loader


Anyone here happen to have an idea of what could cause this type of problem, or what I can do to try isolating the cause?


Last edited by christerk on Sat Dec 29, 2007 3:05 pm; edited 1 time in total
BradN
Advocate


Joined: 19 Apr 2002
Posts: 2391
Location: Wisconsin (USA)

Posted: Sat Dec 29, 2007 12:00 am    Post subject:

I suppose the best strategy is to find out exactly when the corruption occurs. I would try unmounting and remounting the partitions via the livecd, then rebooting the livecd and mounting again. If everything's still fine at that point, then the corruption must be occurring when rebooting to the real kernel.

Make sure you properly unmount the filesystems before rebooting the livecd, although I can't imagine one improper unmount would be enough to totally hose it.
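
One way to tell whether files changed between those steps is to record a checksum of every file before each unmount or reboot and compare afterwards. The following is only a sketch under some assumptions: the function names and the save/check command line are invented for illustration, and the manifest file should live somewhere other than the suspect filesystem (a USB stick, for example), since a manifest stored on the bad filesystem could itself get corrupted.

```python
import hashlib
import json
import os
import sys


def hash_file(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root):
    """Map each regular file under root (relative path) to its digest."""
    manifest = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            # Skip symlinks, device nodes, sockets and the like.
            if os.path.isfile(path) and not os.path.islink(path):
                manifest[os.path.relpath(path, root)] = hash_file(path)
    return manifest


def changed_files(old, new):
    """Return paths from the old manifest that are missing or differ now."""
    return sorted(p for p in old if new.get(p) != old[p])


if __name__ == "__main__":
    # Hypothetical usage: manifest.py save|check ROOT MANIFEST_FILE
    if len(sys.argv) == 4 and sys.argv[1] == "save":
        with open(sys.argv[3], "w") as out:
            json.dump(build_manifest(sys.argv[2]), out)
    elif len(sys.argv) == 4 and sys.argv[1] == "check":
        with open(sys.argv[3]) as saved:
            for path in changed_files(json.load(saved), build_manifest(sys.argv[2])):
                print("changed or missing:", path)
    else:
        print("usage: manifest.py save|check ROOT MANIFEST_FILE")
```

Run "save" once after the install finishes, then "check" after the remount, after the livecd reboot, and after the first boot of the real kernel. The first step that reports changed files is where the corruption creeps in.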

My first suspicion would be that the RAID setup is screwing something up, especially if it's a vendor-format RAID (as opposed to Linux software RAID). In general, if you want your root filesystem to be on RAID, I'd recommend having a non-RAID /boot partition and using an initrd to mount the root RAID filesystem. But you said you tried it without RAID, so I don't know...
christerk
n00b


Joined: 28 Dec 2007
Posts: 6

Posted: Sat Dec 29, 2007 3:04 pm    Post subject:

I managed to get the system installed in the end. What I did was to install over ReiserFS and md raid1 (been using software raid all along). After unmounting, I did a fsck which showed me some corrupt files (/proc/mdstat showed both drives up and running as normal). Fixed these and rebooted into the new system. Had some corrupt files in non-essential places (portage cache).

After the reboot it's working flawlessly. I've been running IO tests without problems and have copied in about 1.5 million files onto the raid drive. I've also rebooted multiple times to force remounts.

I suspect that there may be a slight problem between the 2.6.19 kernel that comes with the minimal install disc and my system. 2.6.23, which I installed, seems to work just fine.
All times are GMT