Gentoo Forums
WD Raptor 75GB (WD740ADFD-00NLR1) buggy NCQ implementation
blubbi
Guru

Joined: 27 Apr 2003
Posts: 564
Location: Halle (Saale), Germany

Posted: Tue Jun 26, 2007 9:42 am    Post subject: WD Raptor 75GB (WD740ADFD-00NLR1) buggy NCQ implementation

Okay, for everyone who wants to buy this drive.
Western Digital Raptor 75GB WD740ADFD (with firmware 00NLR1)

BE WARNED

The NCQ implementation in this drive (with this firmware) is really bad, so future kernels will blacklist it for NCQ.

Code:
ata1.00: exception Emask 0x2 SAct 0x1fe00 SErr 0x0 action 0x2 frozen
ata1.00: (spurious completions during NCQ issue=0x0 SAct=0x1fe00 FIS=004040a1:00040000)
ata1.00: cmd 61/18:48:d0:4e:6d/00:00:05:00:00/40 tag 9 cdb 0x0 data 12288 out
         res 40/00:90:28:a6:6c/00:00:05:00:00/40 Emask 0x2 (HSM violation)
ata1.00: cmd 61/10:50:f0:4e:6d/00:00:05:00:00/40 tag 10 cdb 0x0 data 8192 out
         res 40/00:90:28:a6:6c/00:00:05:00:00/40 Emask 0x2 (HSM violation)
ata1.00: cmd 61/08:58:48:9c:6d/00:00:05:00:00/40 tag 11 cdb 0x0 data 4096 out
         res 40/00:90:28:a6:6c/00:00:05:00:00/40 Emask 0x2 (HSM violation)
ata1.00: cmd 61/08:60:b0:9c:6d/00:00:05:00:00/40 tag 12 cdb 0x0 data 4096 out
         res 40/00:90:28:a6:6c/00:00:05:00:00/40 Emask 0x2 (HSM violation)
ata1.00: cmd 61/28:68:90:9d:6d/00:00:05:00:00/40 tag 13 cdb 0x0 data 20480 out
         res 40/00:90:28:a6:6c/00:00:05:00:00/40 Emask 0x2 (HSM violation)
ata1.00: cmd 61/08:70:50:a1:6d/00:00:05:00:00/40 tag 14 cdb 0x0 data 4096 out
         res 40/00:90:28:a6:6c/00:00:05:00:00/40 Emask 0x2 (HSM violation)
ata1.00: cmd 61/08:78:a8:a1:6d/00:00:05:00:00/40 tag 15 cdb 0x0 data 4096 out
         res 40/00:90:28:a6:6c/00:00:05:00:00/40 Emask 0x2 (HSM violation)
ata1.00: cmd 61/08:80:b0:a1:6d/00:00:05:00:00/40 tag 16 cdb 0x0 data 4096 out
         res 40/00:90:28:a6:6c/00:00:05:00:00/40 Emask 0x2 (HSM violation)
ata1: soft resetting port
ata1: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
ata1.00: configured for UDMA/133
ata1: EH complete
SCSI device sda: 145226112 512-byte hdwr sectors (74356 MB)
sda: Write Protect is off
sda: Mode Sense: 00 3a 00 00
SCSI device sda: write cache: enabled, read cache: enabled, doesn't support DPO or FUA


To see the traces and the bug report on the kernel Bugzilla, follow the link below:
http://bugzilla.kernel.org/show_bug.cgi?id=8627
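
As far as I understand libata's error output, the SAct value in the exception line is a bitmask of the NCQ tags that were still outstanding when the error hit. A minimal sketch for decoding it (sact_tags is a made-up helper name, and reading SAct as a per-tag bitmask is my assumption, not something confirmed in this thread):

```shell
# Decode a libata SAct bitmask into the list of active NCQ tag numbers.
# sact_tags is a hypothetical helper name, not part of any existing tool.
sact_tags() {
    mask=$(( $1 ))          # accepts hex input such as 0x1fe00
    tag=0
    out=""
    while [ "$tag" -lt 32 ]; do
        # Bit N set means tag N was outstanding (assumption, see lead-in).
        if [ $(( (mask >> tag) & 1 )) -eq 1 ]; then
            out="${out:+$out }$tag"
        fi
        tag=$(( tag + 1 ))
    done
    echo "$out"
}
```

For the trace above, `sact_tags 0x1fe00` prints `9 10 11 12 13 14 15 16`, which lines up with the tag numbers on the eight failed cmd lines.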

I contacted WD and requested new firmware, but have had no response so far.

If anyone has a 74GB Raptor with a different firmware, please check whether you get HSM violations.
If not, then we should only blacklist the drives with firmware 00NLR1.

Regards
blubbi


Last edited by blubbi on Wed Jun 27, 2007 8:36 am; edited 1 time in total

tnt
Veteran

Joined: 27 Feb 2004
Posts: 1227

Posted: Wed Jun 27, 2007 3:42 am    Post subject: Re: WD Raptor 75GB (WD740ADFD-00NLR1) buggy NCQ implementati

blubbi wrote:
The NCQ implementation is really bad, so in future Kernels it will be disabled.
The NCQ implementation in this drive causes "HSM violation"


Is it bad in the kernel, so that it will be disabled for good for all drives, or are you talking only about Raptors?
_________________
gentoo user

blubbi

Posted: Wed Jun 27, 2007 8:34 am

I am talking about the Raptor I mentioned in the topic.

http://en.wikipedia.org/wiki/WD_Raptor#The_WD740GD

regards
blubbi

blubbi

Posted: Wed Jun 27, 2007 8:37 am

If anyone has a 74GB Raptor with a different firmware, please check whether you get HSM violations.
If not, then we should only blacklist the drives with firmware 00NLR1.

oldnavy23
Tux's lil' helper

Joined: 09 Jul 2007
Posts: 86
Location: USA

Posted: Mon Jul 09, 2007 6:55 pm

Do you know whether the new 150GB Raptor has the same issue?

blubbi

Posted: Mon Jul 09, 2007 7:00 pm

No idea... but buy one, try it... post your results here.

If it shows errors just return it and let us know about it.

post the output of
Code:
hdparm -I /dev/sd?
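
To compare drives it is mostly the model and firmware lines that matter. A small sketch that pulls them out of that output (hdparm_ident is a hypothetical helper name; it assumes the "Model Number:" and "Firmware Revision:" labels that hdparm -I prints):

```shell
# Print "model|firmware" from `hdparm -I` output read on stdin.
# hdparm_ident is a hypothetical helper name used for illustration only.
hdparm_ident() {
    awk -F: '
        # Take the text after the colon and trim surrounding whitespace.
        /Model Number:/      { v = $2; gsub(/^[ \t]+|[ \t]+$/, "", v); model = v }
        /Firmware Revision:/ { v = $2; gsub(/^[ \t]+|[ \t]+$/, "", v); fw = v }
        END { print model "|" fw }
    '
}
```

Usage would be something like `hdparm -I /dev/sda | hdparm_ident`, which keeps the serial number out of what gets posted.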


thanks
blubbi

blubbi

Posted: Mon Jul 09, 2007 7:01 pm

maybe you should read here:
http://en.wikipedia.org/wiki/WD_Raptor#The_WD1500

oldnavy23

Posted: Mon Jul 09, 2007 7:09 pm

Well, judging by that, it looks like NCQ is on the 150 too, so I guess it will be trial and error to test it out. Also, do you think a Raptor would make my server a lot faster if I can get it to work? I have 2 GB of RAM, a 1 TB HD and an AMD 3500.

blubbi

Posted: Mon Jul 09, 2007 7:18 pm

Depends on what kind of server it will be.
But that's OT in this thread. Better ask in a new thread or in your previous one.

And I am the wrong person to answer your question... I actually can't say how much faster your system would be with a 7200 RPM HDD WITH NCQ compared to a RAPTOR _without_ NCQ... A Raptor WITH NCQ would speed up your system, but by how much, and whether the gain is noticeable... I don't know.
Maybe a Raptor without NCQ would be faster than a 7200 RPM HDD with NCQ... I have no clue. Sorry.

regards
blubbi

obrut<-
Apprentice

Joined: 01 Apr 2005
Posts: 183
Location: near hamburg, germany

Posted: Sun Jul 15, 2007 8:03 am

hmm...
i never encountered any errors with my raptor. ncq is enabled.
Code:
[   41.405326] scsi 1:0:0:0: Direct-Access     ATA      WDC WD740ADFD-60 20.0 PQ: 0 ANSI: 5
[   41.413533] sd 1:0:0:0: [sdb] 145226112 512-byte hardware sectors (74356 MB)
[   41.421724] sd 1:0:0:0: [sdb] Write Protect is off
[   41.429740] sd 1:0:0:0: [sdb] Mode Sense: 00 3a 00 00
[   41.429771] sd 1:0:0:0: [sdb] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
[   41.438012] sd 1:0:0:0: [sdb] 145226112 512-byte hardware sectors (74356 MB)
[   41.446144] sd 1:0:0:0: [sdb] Write Protect is off
[   41.454229] sd 1:0:0:0: [sdb] Mode Sense: 00 3a 00 00
[   41.454261] sd 1:0:0:0: [sdb] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
[   41.462416]  sdb: sdb1 sdb2 sdb3 < sdb5 sdb6 sdb7 sdb8 > sdb4
[   41.515319] sd 1:0:0:0: [sdb] Attached SCSI disk
Code:
[   40.226872] ata1: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
[   40.236471] ata1.00: ATA-7: WDC WD740ADFD-60NLR1, 20.07P20, max UDMA/133
[   40.244523] ata1.00: 145226112 sectors, multi 16: LBA48 NCQ (depth 0/32)
[   40.254988] ata1.00: configured for UDMA/133
Code:
# hdparm -I /dev/sdb

/dev/sdb:

ATA device, with non-removable media
        Model Number:       WDC WD740ADFD-60NLR1
        Serial Number:      WD-WMAN********
        Firmware Revision:  20.07P20
Standards:
        Used: ATA/ATAPI-7 published, ANSI INCITS 397-2005
        Supported: 7 6 5 4
Configuration:
        Logical         max     current
        cylinders       16383   16383
        heads           16      16
        sectors/track   63      63
        --
        CHS current addressable sectors:   16514064
        LBA    user addressable sectors:  145226112
        LBA48  user addressable sectors:  145226112
        device size with M = 1024*1024:       70911 MBytes
        device size with M = 1000*1000:       74355 MBytes (74 GB)
Capabilities:
        LBA, IORDY(can be disabled)
        Queue depth: 32
        Standby timer values: spec'd by Standard, with device specific minimum
        R/W multiple sector transfer: Max = 16  Current = 16
        DMA: mdma0 mdma1 mdma2 udma0 udma1 udma2 udma3 udma4 udma5 *udma6
             Cycle time: min=120ns recommended=120ns
        PIO: pio0 pio1 pio2 pio3 pio4
             Cycle time: no flow control=120ns  IORDY flow control=120ns
Commands/features:
        Enabled Supported:
           *    SMART feature set
           *    Power Management feature set
           *    Write cache
           *    Look-ahead
           *    WRITE_BUFFER command
           *    READ_BUFFER command
           *    NOP cmd
           *    DOWNLOAD_MICROCODE
           *    48-bit Address feature set
           *    Device Configuration Overlay feature set
           *    Mandatory FLUSH_CACHE
           *    FLUSH_CACHE_EXT
           *    SMART error logging
           *    SMART self-test
           *    General Purpose Logging feature set
           *    SATA-I signaling speed (1.5Gb/s)
           *    Native Command Queueing (NCQ)
           *    Phy event counters
                DMA Setup Auto-Activate optimization
           *    Software settings preservation
           *    SMART Command Transport (SCT) feature set
           *    SCT Long Sector Access (AC1)
           *    SCT LBA Segment Access (AC2)
           *    SCT Error Recovery Control (AC3)
           *    SCT Features Control (AC4)
           *    SCT Data Tables (AC5)
                unknown 206[12]
Checksum: correct

blubbi

Posted: Sun Jul 15, 2007 10:51 am

Okay, interesting.

I see you got this:
Code:
ATA device, with non-removable media
        Model Number:       WDC WD740ADFD-60NLR1
        Serial Number:      WD-WMAN********
        Firmware Revision:  20.07P20


And I got this:
Code:
ATA device, with non-removable media
        Model Number:       WDC WD740ADFD-00NLR1
        Serial Number:      WD-WMAN.........
        Firmware Revision:  20.07P20


Now it would be interesting to know the difference between 60NLR1 and 00NLR1, and what exactly that number expresses.

If you could turn on debugging in your libata driver and recompile the kernel, you could look for NCQ misbehavior.
That would be good to know, because for now the entire WD740ADFD series is on the NCQ blacklist in the 2.6.22 kernel.

Thanks
blubbi

obrut<-

Posted: Sun Jul 15, 2007 11:27 am

when you tell me where to find those debug options i'll do that.
Code:
# uname -a
Linux desktop 2.6.21-kamikaze6 #1 SMP PREEMPT Mon Jul 2 00:11:29 CEST 2007 x86_64 AMD Athlon(tm) 64 X2 Dual Core Processor 3600+ AuthenticAMD GNU/Linux
the corresponding 2.6.22 kernel has just been compiled but not yet booted.
btw: hi, fellow countryman! *g*

blubbi

Posted: Sun Jul 15, 2007 12:53 pm

Grep your drivers/ata/libata-core.c for the string "WDC WD740ADFD-00NLR1" to see if the patch from Tejun Heo has been included.

To enable debugging do the following:

Edit include/linux/libata.h and change these two lines from #undef to #define:
#define ATA_DEBUG
#define ATA_VERBOSE_DEBUG
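
Hand-editing the header is easy to get wrong, so the change could also be scripted. A sketch, assuming the two symbols currently appear as `#undef ATA_DEBUG` and `#undef ATA_VERBOSE_DEBUG` at the start of a line (enable_ata_debug is a made-up name):

```shell
# Print a copy of libata.h with ATA_DEBUG and ATA_VERBOSE_DEBUG switched
# from #undef to #define. Every other line passes through untouched,
# including other symbols such as ATA_NDEBUG.
# enable_ata_debug is a hypothetical helper name.
enable_ata_debug() {
    sed -E 's/^#undef([[:space:]]+ATA_(VERBOSE_)?DEBUG)([[:space:]]|$)/#define\1\3/' "$1"
}
```

Run it as `enable_ata_debug include/linux/libata.h > /tmp/libata.h`, check with `diff -u` that exactly two lines changed, and only then copy the result over the original.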

- Rebuild the kernel, install it and reboot
- Look in /var/log/messages for the debug messages
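
Once the rebuilt kernel is running, the two failure signatures from this thread ("spurious completions during NCQ" and "HSM violation") can be counted in one pass. A minimal sketch (count_ncq_errors is a hypothetical helper; the log path is whatever your syslog setup writes to):

```shell
# Count log lines that carry either NCQ failure signature from this thread.
# count_ncq_errors is a hypothetical helper name.
count_ncq_errors() {
    grep -c -E 'spurious completions during NCQ|HSM violation' "$1"
}
```

For example `count_ncq_errors /var/log/messages`; a result of 0 after some heavy disk load would suggest the drive is behaving.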

Look out for
"spurious completions during NCQ"
and/or
"HSM violation"

regards
blubbi

obrut<-

Posted: Sun Jul 15, 2007 1:24 pm

changed the 2 lines from #undef to #define, but:
Code:
# make && make install
  CHK     include/linux/version.h
  CHK     include/linux/utsrelease.h
  HOSTCC  scripts/basic/fixdep
  HOSTCC  scripts/basic/docproc
  HOSTCC  scripts/mod/file2alias.o
  HOSTCC  scripts/mod/sumversion.o
  HOSTLD  scripts/mod/modpost
  CHK     include/linux/compile.h
  HOSTCC  usr/gen_init_cpio
  GEN     usr/initramfs_data.cpio.gz
  AS      usr/initramfs_data.o
  LD      usr/built-in.o
  CC      drivers/ata/libata-core.o
In file included from drivers/ata/libata-core.c:55:
include/linux/libata.h:2: error: expected '=', ',', ';', 'asm' or '__attribute__' before numeric constant
include/linux/libata.h:6: error: expected '=', ',', ';', 'asm' or '__attribute__' before 'can'
include/linux/libata.h:8: error: expected '=', ',', ';', 'asm' or '__attribute__' before 'version'
include/linux/libata.h:12: error: expected '=', ',', ';', 'asm' or '__attribute__' before 'even'
include/linux/libata.h:17: error: expected '=', ',', ';', 'asm' or '__attribute__' before 'the'
In file included from drivers/ata/libata-core.c:55:
include/linux/libata.h:18:63: error: invalid digit '9' in octal constant
include/linux/libata.h:21:43: warning: character constant too long for its type
drivers/ata/libata-core.c: In function 'ata_dev_configure':
drivers/ata/libata-core.c:1788: warning: comparison of distinct pointer types lacks a cast
make[2]: *** [drivers/ata/libata-core.o] Error 1
make[1]: *** [drivers/ata] Error 2
make: *** [drivers] Error 2

changing back does not help. 8O

p.s.:
Code:
# cat drivers/ata/libata-core.c | grep 740
        { "WDC WD740ADFD-00",   NULL,           ATA_HORKAGE_NONCQ },

in 2.6.22 there are 2 lines with Raptors

blubbi

Posted: Sun Jul 15, 2007 2:13 pm

I see you are German. Do you know about #gentoo.de on freenode?
If so, we could meet there and try to fix it. My nick is blubbi on freenode. Just send me a query.


I would suggest a "make clean && make distclean && make mrproper". Back up your .config first!

remove all modules and start with this complete new kernel.
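
Since mrproper also deletes .config, the backup is the one step not to skip. A minimal sketch (backup_kernel_config is a hypothetical helper; the paths in the usage line are only examples):

```shell
# Copy the .config of a kernel tree to a safe place before cleaning.
# Fails loudly instead of silently continuing when there is no .config.
# backup_kernel_config is a hypothetical helper name.
backup_kernel_config() {
    tree=$1
    dest=$2
    if [ ! -f "$tree/.config" ]; then
        echo "no .config found in $tree" >&2
        return 1
    fi
    cp -p "$tree/.config" "$dest"
}
```

Something like `backup_kernel_config /usr/src/linux "$HOME/kernel-config.bak" && make -C /usr/src/linux mrproper`, then copy the backup back to /usr/src/linux/.config before rebuilding.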

regards
blubbi

obrut<-

Posted: Sun Jul 15, 2007 3:09 pm

the same error again.
btw i'm in gentoo.de atm

AndyRTR
n00b

Joined: 17 Sep 2003
Posts: 13
Location: Germany

Posted: Tue Aug 14, 2007 9:34 pm

any progress? i also have a 74GB Raptor drive with the critical firmware and these entries in dmesg:

with ICH9R controller in IDE mode:
ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2
ata1.00: (BMDMA stat 0x26)
ata1.00: cmd ca/00:08:91:0d:45/00:00:00:00:00/e4 tag 0 cdb 0x0 data 4096 out
res 51/84:08:91:0d:45/00:00:00:00:00/e4 Emask 0x30 (host bus error)
ata1: soft resetting port
ata1.00: configured for UDMA/133
ata1: EH complete
sd 0:0:0:0: [sda] 145226112 512-byte hardware sectors (74356 MB)
sd 0:0:0:0: [sda] Write Protect is off
sd 0:0:0:0: [sda] Mode Sense: 00 3a 00 00
sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA

in AHCI mode:
Aug 12 03:02:21 workstation64 ata1.00: exception Emask 0x10 SAct 0x0 SErr 0x400100 action 0x2 frozen
Aug 12 03:02:21 workstation64 ata1.00: (irq_stat 0x08000000, interface fatal error)
Aug 12 03:02:21 workstation64 ata1.00: cmd 35/00:00:19:7e:3d/00:04:00:00:00/e0 tag 0 cdb 0x0 data 524288 out
Aug 12 03:02:21 workstation64 res 50/00:00:08:26:59/00:00:00:00:00/e5 Emask 0x10 (ATA bus error)
Aug 12 03:02:21 workstation64 ata1: soft resetting port
Aug 12 03:02:21 workstation64 ata1: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
Aug 12 03:02:21 workstation64 ata1.00: configured for UDMA/33
Aug 12 03:02:21 workstation64 ata1: EH complete
Aug 12 03:02:21 workstation64 sd 0:0:0:0: [sda] 145226112 512-byte hardware sectors (74356 MB)
Aug 12 03:02:21 workstation64 sd 0:0:0:0: [sda] Write Protect is off
Aug 12 03:02:21 workstation64 sd 0:0:0:0: [sda] Mode Sense: 00 3a 00 00
Aug 12 03:02:21 workstation64 sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA

so it seems to take more than just disabling NCQ. the problem still exists here with kernel 2.6.23-rc3

any idea?

btw: running ArchLinux x86_64 here.

blubbi

Posted: Wed Aug 15, 2007 12:07 pm

Mmmh, the patch should be included as of vanilla 2.6.22.
You can check this by grepping your drivers/ata/libata-core.c for "WD740ADFD-00NLR1". You should find this string in the NCQ blacklist section.
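
A sketch of that check (check_ncq_blacklist is a made-up name). Searching for the shorter string WD740ADFD may be safer, since the entry obrut<- found earlier in this thread reads "WDC WD740ADFD-00" and would not match the full model string:

```shell
# Print, with line numbers, the lines of a source file that mention a model.
# Returns non-zero and prints a note on stderr when nothing matches.
# check_ncq_blacklist is a hypothetical helper name.
check_ncq_blacklist() {
    if grep -n -F -- "$2" "$1"; then
        return 0
    fi
    echo "no entry mentioning '$2' in $1" >&2
    return 1
}
```

Run from the top of the kernel tree, for example `check_ncq_blacklist drivers/ata/libata-core.c WD740ADFD`.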

You can follow my solved bug here: http://bugzilla.kernel.org/show_bug.cgi?id=8627

But I guess your bug is not caused by NCQ, because otherwise you should see something like this:
Code:
ata1.00: exception Emask 0x2 SAct 0x1fe00 SErr 0x0 action 0x2 frozen
ata1.00: (spurious completions during NCQ issue=0x0 SAct=0x1fe00 FIS=004040a1:00040000)
ata1.00: cmd 61/18:48:d0:4e:6d/00:00:05:00:00/40 tag 9 cdb 0x0 data 12288 out
         res 40/00:90:28:a6:6c/00:00:05:00:00/40 Emask 0x2 (HSM violation)
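
One quick way to triage a trace, assuming SAct is the bitmask of outstanding NCQ tags (my reading of libata's output, not confirmed in this thread): if SAct is 0x0, no NCQ command was in flight when the exception hit. ncq_active is a hypothetical helper name:

```shell
# Succeed if a libata "exception" line shows a non-zero SAct value,
# i.e. NCQ commands were outstanding (assumption, see lead-in).
# Fails for SAct 0x0 and for lines without an SAct field.
ncq_active() {
    sact=$(printf '%s\n' "$1" | sed -n 's/.*SAct \(0x[0-9a-fA-F]*\).*/\1/p')
    [ -n "$sact" ] && [ $(( $sact )) -ne 0 ]
}
```

With the exception line above it returns success (SAct 0x1fe00); with either of AndyRTR's exception lines (SAct 0x0) it returns failure.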


I have found another bug in 2.6.22.2, and it may be present in 2.6.23 too. Have a look here:
http://bugzilla.kernel.org/show_bug.cgi?id=8791
(scroll down to the bottom)
But I still think you have a different problem, because that bug is IMHO related to MD or pata_hpt37x (not resolved yet).

I no longer have any issues with the Raptors since they got blacklisted for NCQ.

Take a chance and file a bug at http://bugzilla.kernel.org/; the guys are really fast at fixing things!
But first read these two threads: http://www.mail-archive.com/linux-ide@vger.kernel.org/msg06991.html and http://groups.google.com/group/fa.linux.kernel/browse_thread/thread/2c895e1ac0a8ccf9/d5e910d37451ca33?lnk=raot

regards
blubbi