Gentoo Forums
luciano
Tux's lil' helper


Joined: 18 Nov 2004
Posts: 132

Posted: Wed Dec 08, 2004 8:58 pm    Post subject: dma timeouts

Hi All,

Today when I got home my fileserver was a bit chuggy, so I unmounted my 160 GB shared drive. Now a check on it says it's corrupt, and I'm unable to fix it :cry:. If ANYONE can help me out I'd be incredibly thankful!

I'm running 2.6.10-rc2-mm2 on an old Athlon. The disk in question has a single reiser4 partition. fsck.reiser4 complains about superblock magic numbers, but can't seem to fix anything due to I/O errors. This makes me think it's more of a hardware problem.

The fact that I'm also getting DMA timeout errors from the kernel/system loggers seems to point to this as well. I get messages like this:

Code:

ide: failed opcode was: unknown
end_request: I/O error, dev hdc, sector 63128031
hdc: dma_intr: status=0x51 { DriveReady SeekComplete Error }
hdc: dma_intr: error=0x40 { UncorrectableError }, LBAsect=63131479, high=3, low=12799831, sector=63131479
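
The hex values in these messages are bit fields, and the `high`/`low` pair is the failing LBA split in two. A minimal sketch of how they can be decoded, assuming the bit names the Linux IDE driver prints (the tables below are an illustrative reconstruction, not copied from the kernel source) and assuming `low` holds the bottom 24 bits of the sector number, which is consistent with the numbers logged above:

```python
# Illustrative bit tables; names follow the labels seen in IDE driver logs.
# These are an assumption reconstructed from the ATA register layout.
STATUS_BITS = [
    (0x80, "Busy"),
    (0x40, "DriveReady"),
    (0x20, "DeviceFault"),
    (0x10, "SeekComplete"),
    (0x08, "DataRequest"),
    (0x04, "CorrectedError"),
    (0x02, "Index"),
    (0x01, "Error"),
]

ERROR_BITS = [
    (0x80, "BadCRC/BadSector"),  # meaning depends on the transfer mode
    (0x40, "UncorrectableError"),
    (0x10, "SectorIdNotFound"),
    (0x04, "DriveStatusError"),
    (0x02, "TrackZeroNotFound"),
    (0x01, "AddrMarkNotFound"),
]


def decode(value, table):
    """Return the names of all bits set in value, highest bit first."""
    return [name for mask, name in table if value & mask]


def combine_lba(high, low):
    """Rebuild the sector number, assuming low carries the bottom 24 bits."""
    return (high << 24) | low


print(decode(0x51, STATUS_BITS))
print(decode(0x40, ERROR_BITS))
print(combine_lba(3, 12799831))
```

Read this way, status=0x51 is a drive that is ready but reporting an error, and error=0x40 says the drive itself could not read the sector back correctly, which is different from a CRC error on the cable.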


When I try to mount, the kernel says:

Code:
reiser4[mount(5341)]: _init_read_super (fs/reiser4/init_super.c:198)[nikita-2608]:
WARNING: hdc1: wrong master super block magic.


And when I try to fix it with fsck, it complains:

Code:

***** fsck.reiser4 started at Wed Dec  8 20:35:31 2004
Fatal: Wrong magic found in the master super block.
Master super block cannot be found. Do you want to build a new one on
(/dev/hdc1)?
(Yes/No): yes
Which block size do you use? [4096]:
Warn : A new master superblock is created on (/dev/hdc1).
Error: Can't find disk-format plugin by its id 0xffff.
Error: Cannot open the on-disk format on (/dev/hdc1)
Info : The format 'format40' is detected. Rebuilding with it.
Error: Can't read bitmap block 4943136. Input/output error.
Error: Can't load ondisk bitmap.
Error: Can't initialize block allocator.
Fatal: Failed to open the block allocator.
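
To relate the block number fsck reports to a position on the partition, the arithmetic can be sketched as below. It assumes 512-byte sectors (not stated anywhere in the output) and the 4096-byte block size that fsck.reiser4 offered as its default; the function names are made up for illustration.

```python
SECTOR_SIZE = 512   # assumption: classic 512-byte IDE sectors
BLOCK_SIZE = 4096   # the default block size offered by fsck.reiser4 above


def block_to_partition_sector(block):
    """First sector of a filesystem block, counted from the partition start."""
    return block * (BLOCK_SIZE // SECTOR_SIZE)


def block_to_byte_offset(block):
    """Byte offset of a filesystem block from the partition start."""
    return block * BLOCK_SIZE


bad_block = 4943136  # from "Can't read bitmap block 4943136"
print(block_to_partition_sector(bad_block))
print(block_to_byte_offset(bad_block))
```

These numbers are relative to the start of hdc1, not to the whole disk, so they cannot be compared directly with the LBAsect values in the kernel log without adding the partition's starting sector. Reading just that one block with something like `dd if=/dev/hdc1 of=/dev/null bs=4096 skip=4943136 count=1` is one way to check whether the I/O error is reproducible at that spot.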


I've checked other possible causes: I've changed the IDE cables to new ones and checked the connections. My server is a bit hot at times, but I'd think that after cooling it properly it shouldn't be getting I/O errors. The disk is brand new, and my primary disk that's running in the same box doesn't have any problems (it's also reiser4).

I haven't tried moving the disk to my other machine, but I'll attempt that if no one can think of anything else!
/dev/random
l33t


Joined: 26 Nov 2004
Posts: 704
Location: Austin, Texas, USA

Posted: Thu Dec 09, 2004 3:33 am

My educated guess would be that it's probably reiser4. It's possible that your IDE controller is bad, but it sounds more likely that it's the fault of reiser4. I recommend you don't use reiser4 on a server until it's stable.
luciano
Tux's lil' helper


Joined: 18 Nov 2004
Posts: 132

Posted: Thu Dec 09, 2004 1:52 pm

/dev/random wrote:
However, it's possible that your IDE controller is bad


You mean the IDE controller on the motherboard? If I stuck the disk in my other machine (and the problem was the controller), it should work then, no? I'm going to try this tonight.

What I find strangest is that reiser4progs can't fix the problem. If it was an issue with reiser4, then I should at least be able to fix the partition, I think. Obviously this doesn't mean it won't corrupt again.

Maybe I should have mentioned I was running NFS on top of it.
iulianpojar
n00b


Joined: 17 May 2004
Posts: 38
Location: Moldova

Posted: Thu Dec 09, 2004 10:16 pm

Hey, did you check the power connectors? I usually get DMA timeouts because of them. I have a lot of different hard disks that I have to swap around, and when I take the power connector from one and put it on another, it often doesn't seat properly. This happens because hard disks from different brands have power connector pins of different diameters. :D
luciano
Tux's lil' helper


Joined: 18 Nov 2004
Posts: 132

Posted: Fri Dec 10, 2004 10:43 am

Hi All,

Thanks for all your responses. I seem to have found the problem. I tried mounting the disk on another machine and got the same problem, so I installed the really useful smartmontools (emerge smartmontools) and ran a few tests.
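
The post doesn't say which tests were run, so the following is only a sketch of a typical smartctl sequence, wrapped in Python. The helper names are hypothetical and `/dev/hdc` is taken from the logs earlier in the thread; the smartctl options themselves (`-H`, `-A`, `-l`, `-t`) are standard ones.

```python
import subprocess


def smart_commands(device):
    """Build a typical sequence of smartctl invocations for one drive."""
    return [
        ["smartctl", "-H", device],              # overall health verdict
        ["smartctl", "-A", device],              # vendor attribute table
        ["smartctl", "-l", "error", device],     # drive's own error log
        ["smartctl", "-t", "long", device],      # start extended self-test
        ["smartctl", "-l", "selftest", device],  # self-test results log
    ]


def run_smart_checks(device):
    """Run each command in turn (needs root and smartmontools installed)."""
    for argv in smart_commands(device):
        subprocess.run(argv, check=False)


# Show the commands without executing them.
for argv in smart_commands("/dev/hdc"):
    print(" ".join(argv))
```

The long self-test runs in the background on the drive, so the self-test log only shows its result once the test has finished; reading it immediately after starting the test shows earlier results, if any.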

It turns out I seem to have a corrupt sector in the boot record and/or superblock! This is REALLY bad news :(, and I'm probably going to lose a bunch of data, even if I can rebuild the filesystem.

I don't understand how this could happen! I'm confused as to how you can end up with a corrupt sector on your disk. Someone tell me if this is right:

Each sector has some sort of checksum after it. If the checksum doesn't match, then the sector is considered corrupt. So this doesn't necessarily mean that there's a hardware problem; maybe the disk just shut down in the middle of writing a block.

This is the only explanation I can think of. As I said, the disk is a brand new Seagate (one month old), so I'm considering whether to claim on the warranty, unless I can be sure that it's not a hardware problem!

Once again, thank you all for your invaluable input!
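
The checksum reasoning in this post can be illustrated with a toy model. This is only a sketch: real drives protect each sector with error-correcting codes handled in firmware, and CRC-32 from `zlib` stands in for that here purely as an assumption for illustration. The function names are made up.

```python
import zlib

SECTOR_DATA = 512  # payload bytes per sector in this toy model


def write_sector(payload):
    """Store the payload followed by a 4-byte checksum of it."""
    assert len(payload) == SECTOR_DATA
    return payload + zlib.crc32(payload).to_bytes(4, "big")


def sector_ok(raw):
    """Recompute the checksum and compare it with the stored one."""
    payload, stored = raw[:SECTOR_DATA], raw[SECTOR_DATA:]
    return zlib.crc32(payload).to_bytes(4, "big") == stored


old = write_sector(b"\x00" * SECTOR_DATA)
new = write_sector(b"\xff" * SECTOR_DATA)

# Simulate a write interrupted halfway: the first half holds the new
# data, the rest (including the checksum) is still the old contents.
torn = new[:256] + old[256:]

print(sector_ok(old), sector_ok(new), sector_ok(torn))
```

In this model the torn sector fails verification even though nothing is physically wrong with the medium, which is the scenario described above. It does not rule out a genuine media defect; the two cases look the same from the checksum alone.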
All times are GMT