Gentoo Forums
harddrive dma reset issues
GenKreton
l33t

Joined: 20 Sep 2003
Posts: 828
Location: Cambridge, MA

Posted: Tue Oct 18, 2005 1:14 am    Post subject: harddrive dma reset issues

Code:
hda: dma_timer_expiry: dma status == 0x21
hda: DMA timeout error
hda: dma timeout error: status=0xd0 { Busy }
ide: failed opcode was: unknown
hda: DMA disabled
ide0: reset: success


I get this very often on my laptop. Sometimes it happens quickly; sometimes it can take hours (maybe even days) before it happens. Some additional information:

Code:

aesir ~ # hdparm -I /dev/hda

/dev/hda:

ATA device, with non-removable media
        Model Number:       IC25N030ATMR04-0
        Serial Number:      MRG2E0KBFAMKRJ
        Firmware Revision:  MOAOAD0A
Standards:
        Used: ATA/ATAPI-6 T13 1410D revision 3a
        Supported: 6 5 4 3
Configuration:
        Logical         max     current
        cylinders       16383   65535
        heads           16      1
        sectors/track   63      63
        --
        CHS current addressable sectors:    4128705
        LBA    user addressable sectors:   58605120
        LBA48  user addressable sectors:   58605120
        device size with M = 1024*1024:       28615 MBytes
        device size with M = 1000*1000:       30005 MBytes (30 GB)
Capabilities:
        LBA, IORDY(can be disabled)
        bytes avail on r/w long: 4      Queue depth: 1
        Standby timer values: spec'd by Vendor, no device specific minimum
        R/W multiple sector transfer: Max = 16  Current = 16
        Advanced power management level: 128 (0x80)
        Recommended acoustic management value: 128, current value: 254
        DMA: mdma0 mdma1 mdma2 udma0 udma1 udma2 udma3 udma4 *udma5
             Cycle time: min=120ns recommended=120ns
        PIO: pio0 pio1 pio2 pio3 pio4
             Cycle time: no flow control=240ns  IORDY flow control=120ns
Commands/features:
        Enabled Supported:
           *    NOP cmd
           *    READ BUFFER cmd
           *    WRITE BUFFER cmd
           *    Host Protected Area feature set
           *    Look-ahead
           *    Write cache
           *    Power Management feature set
                Security Mode feature set
           *    SMART feature set
           *    FLUSH CACHE EXT command
           *    Mandatory FLUSH CACHE command
           *    Device Configuration Overlay feature set
           *    48-bit Address feature set
                Automatic Acoustic Management feature set
                SET MAX security extension
                Address Offset Reserved Area Boot
           *    SET FEATURES subcommand required to spinup after power up
                Power-Up In Standby feature set
           *    Advanced Power Management feature set
           *    General Purpose Logging feature set
           *    SMART self-test
           *    SMART error logging
Security:
        Master password revision code = 65534
                supported
        not     enabled
        not     locked
                frozen
        not     expired: security count
        not     supported: enhanced erase
        26min for SECURITY ERASE UNIT.
HW reset results:
        CBLID- above Vih
        Device num = 0 determined by the jumper
Checksum: correct



Also my default hdparm settings are as follows (feel free to correct them even if they are not the cause of the problem):
Code:
/dev/hda:
 multcount    = 16 (on)
 IO_support   =  0 (default 16-bit)
 unmaskirq    =  1 (on)
 using_dma    =  1 (on)
 keepsettings =  0 (off)
 readonly     =  0 (off)
 readahead    = 64 (on)
 geometry     = 16383/255/63, sectors = 30005821440, start = 0
GenKreton

Posted: Tue Oct 18, 2005 5:50 pm

If nobody can help, does anyone else know where this would be appropriate to get help on? I was leaning towards the kernel guys but I'm not sure.
widan
Veteran

Joined: 07 Jun 2005
Posts: 1512
Location: Paris, France

Posted: Tue Oct 18, 2005 7:02 pm    Post subject: Re: harddrive dma reset issues

GenKreton wrote:
Code:
hda: dma_timer_expiry: dma status == 0x21
hda: DMA timeout error
hda: dma timeout error: status=0xd0 { Busy }
ide: failed opcode was: unknown
hda: DMA disabled
ide0: reset: success

I get this very often on my laptop. Sometimes it happens quickly, sometimes it can take hours (maybe even days) before this happens.

What happens is that your disk fails to respond in time to a command sent to it. After a while, Linux decides the drive is probably confused and resets the IDE bus. It also disables DMA because drives can become confused by a UDMA mode that is too high, and turning DMA off makes them work again (though that is unlikely to be the cause in your case).
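For anyone who wants to read the two hex values in that log themselves, here is a minimal Python sketch. The bit positions follow the standard ATA status register and the bus-master IDE status register; the label strings and function names are illustrative choices of mine, not kernel code.

```python
# ATA status register bits (standard ATA bit positions; labels are illustrative).
ATA_STATUS_BITS = {
    0x80: "Busy",
    0x40: "DriveReady",
    0x20: "DeviceFault",
    0x10: "SeekComplete",
    0x08: "DataRequest",
    0x04: "CorrectedError",
    0x02: "Index",
    0x01: "Error",
}

# Bus-master IDE status register bits (labels are illustrative).
BM_STATUS_BITS = {
    0x01: "dma_active",
    0x02: "dma_error",
    0x04: "interrupt",
    0x20: "drive0_dma_capable",
    0x40: "drive1_dma_capable",
}


def decode(value, table):
    """Return the labels of all bits from `table` that are set in `value`."""
    return [name for bit, name in sorted(table.items(), reverse=True) if value & bit]


def describe_ata_status(status):
    """While Busy is set the other status bits are not meaningful, so report only Busy."""
    if status & 0x80:
        return ["Busy"]
    return decode(status, ATA_STATUS_BITS)


print(decode(0x21, BM_STATUS_BITS))   # the "dma status == 0x21" line
print(describe_ata_status(0xD0))      # the "status=0xd0" line
```

Read this way, 0x21 says the DMA engine still considered a transfer in progress when the timer expired, and 0xd0 has the Busy bit set, which matches the "{ Busy }" in the log.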

Are you trying to put the drive in standby mode to save power? A drive in standby mode won't respond to ATA commands, and will need an IDE bus reset before it comes back to life. You can check what caused the errors with smartctl (emerge smartmontools if you don't have it):
Code:
smartctl -l error /dev/hda

If the list contains "STANDBY" or "STANDBY IMMEDIATE", then something asked the drive to go to standby mode.
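If you would rather not read the log by eye, the check can be scripted. This is only a rough sketch: the function names are made up for illustration, it assumes smartctl is installed and run with enough privileges, and it simply searches the text for the command names mentioned above.

```python
import re
import subprocess


def standby_commands(log_text):
    """Return the lines of an error log that mention STANDBY or STANDBY IMMEDIATE."""
    pattern = re.compile(r"\bSTANDBY( IMMEDIATE)?\b")
    return [line.strip() for line in log_text.splitlines() if pattern.search(line)]


def read_error_log(device="/dev/hda"):
    """Run `smartctl -l error <device>` and return its text output (needs Python 3.7+)."""
    result = subprocess.run(
        ["smartctl", "-l", "error", device],
        capture_output=True,
        text=True,
    )
    return result.stdout


# Example with a log that contains no errors at all:
print(standby_commands("SMART Error Log Version: 1\nNo Errors Logged\n"))
```

On a real machine you would call standby_commands(read_error_log("/dev/hda")); an empty list means no standby command shows up in the logged errors.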
GenKreton

Posted: Tue Oct 18, 2005 7:43 pm

I emerged it now; I guess I'll wait and check the next time I see DMA switched off.

Also, I saw this in my hdparm output.
Configuration:
        Logical         max     current
        cylinders       16383   65535

Is it bad that current is so much bigger than max?
widan

Posted: Tue Oct 18, 2005 8:43 pm

GenKreton wrote:
I emerged it now; I guess I'll wait and check the next time I see DMA switched off.

You can run it now. The drive stores the last 5 or so errors in its internal memory.
GenKreton wrote:
Is it bad that current is so much bigger than max?

No. Linux only uses LBA addressing. CHS drive geometry means nothing for modern drives, only the LBA values matter. The reason the values are set like they are is compatibility with some BIOSes.
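To see how the numbers in the hdparm output fit together, here is a small arithmetic sketch in Python. The values are taken from the output posted above; the 512-byte sector size is an assumption (the usual value for drives of this kind), and the function names are just for illustration.

```python
SECTOR_BYTES = 512  # assumed sector size


def chs_sectors(cylinders, heads, sectors_per_track):
    """Number of sectors addressable through a given CHS geometry."""
    return cylinders * heads * sectors_per_track


def size_in_units(lba_sectors, unit_bytes):
    """Capacity in whole units of `unit_bytes`, rounded down."""
    return lba_sectors * SECTOR_BYTES // unit_bytes


lba = 58605120  # "LBA user addressable sectors"

print(chs_sectors(65535, 1, 63))         # "current" geometry -> 4128705
print(chs_sectors(16383, 16, 63))        # "max" geometry     -> 16514064
print(lba * SECTOR_BYTES)                # total bytes        -> 30005821440
print(size_in_units(lba, 1024 * 1024))   # M = 1024*1024      -> 28615
print(size_in_units(lba, 1000 * 1000))   # M = 1000*1000      -> 30005
```

Neither CHS geometry comes anywhere near the 58605120 sectors reachable through LBA, which is why only the LBA figure matters. The byte total also equals the number printed after "sectors =" in the hdparm settings output earlier in the thread.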
GenKreton

Posted: Tue Oct 18, 2005 9:16 pm

Code:
aesir ~ # smartctl -l error /dev/hda
smartctl version 5.33 [i686-pc-linux-gnu] Copyright (C) 2002-4 Bruce Allen
Home page is http://smartmontools.sourceforge.net/

=== START OF READ SMART DATA SECTION ===
SMART Error Log Version: 1
No Errors Logged


And I ran
Code:
smartctl -a /dev/hda
 [remove snippets]

Warning! SMART Selective Self-Test Log Structure error: invalid SMART checksum.
SMART Selective self-test log data structure revision number 1
 SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
    1        0        0  Not_testing
    2        0        0  Not_testing
    3        0        0  Not_testing
    4        0        0  Not_testing
    5        0        0  Not_testing


That's the only thing I could find wrong. I ran the long test as well and that reported no errors.
widan

Posted: Tue Oct 18, 2005 9:24 pm

Thinking about it again, it's "normal" that no error was logged. The drive (obviously) won't store an error record while it is in standby (assuming that's the cause)... In any case, the drive seems to be fine. You can try disabling any power management tools you might have loaded and see if the "problem" disappears.
dashnu
l33t

Joined: 21 Jul 2004
Posts: 703
Location: Casco Maine

Posted: Mon Nov 14, 2005 1:59 pm

I have the same issue. kernel-2.6.13 gentoo-sources.

Similar Drive:

Code:

ATA device, with non-removable media
powers-up in standby; SET FEATURES subcmd spins-up.
        Model Number:       IC35L060AVV207-0                       
        Serial Number:      VNVB30G8UL5JXH
        Firmware Revision:  V22OA66A
Standards:
        Used: ATA/ATAPI-6 T13 1410D revision 3a
        Supported: 6 5 4 3
Configuration:
        Logical         max     current
        cylinders       16383   65535
        heads           16      1
        sectors/track   63      63
        --
        CHS current addressable sectors:    4128705
        LBA    user addressable sectors:   78156288
        LBA48  user addressable sectors:   78156288
        device size with M = 1024*1024:       38162 MBytes
        device size with M = 1000*1000:       40016 MBytes (40 GB)


SMART reports are OK.

This happens when my server is under a lot of load.

I do not have any power management set up.

I have also tried a few other kernels, with no luck.
SweD
n00b

Joined: 08 Jun 2004
Posts: 7

Posted: Fri Nov 18, 2005 3:00 pm

I had this exact same problem. What solved it for me was noticing that my hard drives were running at approximately 40-45 Celsius. Subjecting them to something intensive brought them up to the 50-55 mark. I kept getting these DMA resets all the time, especially when I put more than one disk on the same channel of my onboard Promise PDC20265 controller. I don't mean literally all the time, but once the machine had been on for a while, these errors kept popping up more and more often.

I've since bought a new, substantially bigger chassis and put in extra fans to make sure thermal issues wouldn't be part of the problem, and presto.
My hard drives now run at 25-30 Celsius, and I have not had a single DMA reset since. I mean that literally, not a single one, and I've been using the new setup daily for approximately 2 months.

The conclusion can't be a general one, I'm sure, but for me it was a thermal issue through and through.
I had this issue ever since I put a second drive in, back when I was using something like gentoo-sources-2.4.20. I kept reading the OSDL bug reports, specifically bug #2494, where many people had the same issues, thinking the cause was flaky drivers for the Promise controller. It was, to some extent, but after they were "fixed" in some kernel version or another, the problems remained until I fixed the temperature.

Just my story; mileage may well differ, of course.

Regards,

/Dennis
GenKreton

Posted: Sat Dec 10, 2005 1:40 am

Thank you, SweD. Though I am using the 2.6 series, it seems very possible that it is the same problem. It only occurred under heavy hard drive load, and since it is a laptop I can't expand my chassis the way you did, but I will make more of an effort to keep it cool and see how it turns out.
SweD

Posted: Wed Dec 21, 2005 8:09 pm

Just for completeness, and to be clear, I'm also using 2.6, and have had this problem with 2.6 for a long time. The 2.4 reference was to point out that my problems started a looong time ago :-D
Back to top
View user's profile Send private message
Display posts from previous:   
Reply to topic    Gentoo Forums Forum Index Kernel & Hardware All times are GMT
Page 1 of 1

 
Jump to:  
You cannot post new topics in this forum
You cannot reply to topics in this forum
You cannot edit your posts in this forum
You cannot delete your posts in this forum
You cannot vote in polls in this forum