Gentoo Forums
Disk error -- Should I be concerned
jl_678
n00b


Joined: 12 Dec 2003
Posts: 56

Posted: Wed Dec 31, 2008 3:56 pm    Post subject: Disk error -- Should I be concerned

I am running Gentoo with a 2.6 kernel, and everything is running smoothly on the machine. I currently run two mirrored 250 GB drives and have seen no RAID errors or other disk issues. While perusing /var/log/messages I found the error below. I grepped the logs and it appeared only 8 times yesterday, so it is not clear whether this is recurring or a one-time event. Has anyone seen this before, and do you have any idea what it means? Should I be concerned about it? As I mentioned, the system has been running fine. Here is the actual message:

Code:
Dec 30 10:43:41 foo hda: irq timeout: status=0xd0 { Busy }
Dec 30 10:43:41 foo ide: failed opcode was: 0xea


I appreciate any thoughts on this.

Thank you,

JL
NeddySeagoon
Administrator


Joined: 05 Jul 2003
Posts: 54809
Location: 56N 3W

Posted: Wed Dec 31, 2008 4:27 pm

jl_678,

It means the IRQ raised by the drive was not answered in a reasonable time.
It's not normally a disk problem; it usually indicates an issue somewhere else.

Post your /proc/interrupts and hdparm /dev/hda output please.
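
For reference, the status=0xd0 byte in the log can be decoded bit by bit. This is only a sketch: the function name is made up for illustration, and the bit names follow the older ATA status register naming, which is an assumption about how you want them labelled.

```shell
#!/bin/sh
# Decode an ATA status byte (e.g. 0xd0) into flag names.
# decode_ata_status is a hypothetical helper, not part of any tool.
# Bit names use the older ATA naming (DSC = seek complete, DF = device fault).
decode_ata_status() {
    status=$(( $1 ))    # accepts hex such as 0xd0
    flags=""
    for pair in 128:BSY 64:DRDY 32:DF 16:DSC 8:DRQ 4:CORR 2:IDX 1:ERR; do
        bit=${pair%%:*}
        name=${pair#*:}
        if [ $(( status & bit )) -ne 0 ]; then
            flags="$flags $name"
        fi
    done
    echo "${flags# }"
}

decode_ata_status 0xd0    # prints: BSY DRDY DSC
```

While BSY is set the other bits are not meaningful, which is presumably why the kernel prints only { Busy } for this value.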
_________________
Regards,

NeddySeagoon

Computer users fall into two groups:-
those that do backups
those that have never had a hard drive fail.
jl_678
n00b


Joined: 12 Dec 2003
Posts: 56

Posted: Sat Jan 10, 2009 7:52 pm    Post subject: Requested commands

Hi,

Sorry about the delay in getting back to this. I noticed that the problem happened twice more on Jan 3, 2009. It is odd in that it is intermittent. Here is the data you asked for:

Code:
# cat /proc/interrupts
           CPU0
  0:  414645296   IO-APIC-edge      timer
  1:          2   IO-APIC-edge      i8042
  2:          0    XT-PIC-XT        cascade
  6:          2   IO-APIC-edge      floppy
  8:          2   IO-APIC-edge      rtc
 10:   13074336   IO-APIC-fasteoi   eth0
 12:          0   IO-APIC-fasteoi   uhci_hcd:usb1, uhci_hcd:usb2
 14:   24825867   IO-APIC-edge      ide0
 15:         37   IO-APIC-edge      ide1
NMI:          0
LOC:  414645134
ERR:          0
MIS:          0


# hdparm /dev/hda

/dev/hda:
 multcount    = 16 (on)
 IO_support   =  1 (32-bit)
 unmaskirq    =  1 (on)
 using_dma    =  1 (on)
 keepsettings =  0 (off)
 readonly     =  0 (off)
 readahead    = 256 (on)
 geometry     = 30401/255/63, sectors = 488397168, start = 0
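
In the /proc/interrupts listing above, ide0 has IRQ 14 to itself, and the only shared line is IRQ 12 (the two USB controllers). A quick way to check this on another machine might look like the sketch below; shared_irqs is a hypothetical helper name, and it assumes shared lines list their devices separated by commas, as in the output above.

```shell
#!/bin/sh
# Print the numbers of interrupt lines shared by more than one device.
# Reads /proc/interrupts-style text on stdin. Assumption: a shared line
# lists its devices separated by commas, as in the listing above.
shared_irqs() {
    awk '/^ *[0-9]+:/ && /,/ { sub(/:/, "", $1); print $1 }'
}

# Sample lines copied from the post above
sample='  12:          0   IO-APIC-fasteoi   uhci_hcd:usb1, uhci_hcd:usb2
 14:   24825867   IO-APIC-edge      ide0
 15:         37   IO-APIC-edge      ide1'

printf '%s\n' "$sample" | shared_irqs    # prints: 12
```

On a live system the same function can be fed directly with `shared_irqs < /proc/interrupts`.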
NeddySeagoon
Administrator


Joined: 05 Jul 2003
Posts: 54809
Location: 56N 3W

Posted: Sat Jan 10, 2009 8:44 pm

jl_678,

I suspect it's because you have
Code:
unmaskirq    =  1 (on)
This allows lower priority IRQs to be processed inside a disk interrupt.
This is generally a bad thing, as it takes longer to process the disk IRQ and occasionally causes timeouts.
The kernel will revert to PIO when it becomes an issue. That's very slow and CPU intensive.

Use hdparm to set unmaskirq to off. It was originally intended to allow serial ports to achieve rates above 9600 baud without dropped characters.
Now, serial ports are either not used or have buffers, so the setting is not needed.
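
As a rough sketch of that step: hdparm's -u flag gets or sets the interrupt-unmask setting, so -u0 turns it off. The helper below (unmask_fix_cmd is a made-up name) only prints the command when the hdparm output shows the flag is on, so nothing is changed until you run the printed line yourself as root.

```shell
#!/bin/sh
# Given `hdparm /dev/hdX` output on stdin, print the command that would
# switch interrupt unmasking off. Prints nothing if it is already off.
# unmask_fix_cmd is a hypothetical helper; hdparm -u0 is the real flag.
unmask_fix_cmd() {
    dev=$1
    if grep -Eq 'unmaskirq[[:space:]]*=[[:space:]]*1'; then
        echo "hdparm -u0 $dev"
    fi
}

# Sample line copied from the hdparm output earlier in the thread
printf ' unmaskirq    =  1 (on)\n' | unmask_fix_cmd /dev/hda
# prints: hdparm -u0 /dev/hda
```

On a live system this would be `hdparm /dev/hda | unmask_fix_cmd /dev/hda`. The setting does not persist across reboots on its own.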
_________________
Regards,

NeddySeagoon
lucapost
Veteran


Joined: 24 Nov 2005
Posts: 1419
Location: <ud|me|ts> - Italy

Posted: Sun Jan 18, 2009 1:34 pm

I have the same problem.
This is /var/log/messages:
Code:

...
...
Jan 18 14:26:27 jarod hda: status timeout: status=0xd0 { Busy }
Jan 18 14:26:27 jarod ide: failed opcode was: unknown
Jan 18 14:26:27 jarod hda: drive not ready for command
Jan 18 14:26:27 jarod hda: status timeout: status=0xd0 { Busy }
Jan 18 14:26:27 jarod ide: failed opcode was: unknown
Jan 18 14:26:27 jarod hda: drive not ready for command
Jan 18 14:26:32 jarod hda: status timeout: status=0xd0 { Busy }
Jan 18 14:26:32 jarod ide: failed opcode was: unknown
Jan 18 14:26:32 jarod hda: drive not ready for command

This is caused by hald; when I kill it, the machine works normally.
More info:
Code:
#> emerge --info
Portage 2.1.6.4 (default/linux/amd64/2008.0, gcc-4.1.2, glibc-2.6.1-r0, 2.6.27-gentoo-r7 x86_64)
=================================================================
System uname: Linux-2.6.27-gentoo-r7-x86_64-AMD_Turion-tm-_64_X2_Mobile_Technology_TL-50-with-glibc2.2.5
Timestamp of tree: Fri, 16 Jan 2009 08:05:01 +0000
app-shells/bash:     3.2_p39
dev-lang/python:     2.5.2-r7
dev-util/cmake:      2.4.6-r1
sys-apps/baselayout: 2.0.0
sys-apps/openrc:     0.4.2
sys-apps/sandbox:    1.2.18.1-r2
sys-devel/autoconf:  2.13, 2.63
sys-devel/automake:  1.7.9-r1, 1.9.6-r2, 1.10.2
sys-devel/binutils:  2.18-r3
sys-devel/gcc-config: 1.4.0-r4
sys-devel/libtool:   1.5.26
virtual/os-headers:  2.6.27-r2
ACCEPT_KEYWORDS="amd64"
CBUILD="x86_64-pc-linux-gnu"
CFLAGS="-march=k8 -O2 -pipe"
CHOST="x86_64-pc-linux-gnu"
CONFIG_PROTECT="/etc"
CONFIG_PROTECT_MASK="/etc/ca-certificates.conf /etc/env.d /etc/fonts/fonts.conf /etc/gconf /etc/gentoo-release /etc/revdep-rebuild /etc/terminfo /etc/texmf/web2c /etc/udev/rules.d"
CXXFLAGS="-march=k8 -O2 -pipe"
DISTDIR="/usr/portage/distfiles"
FEATURES="collision-protect distlocks fixpackages parallel-fetch protect-owned sandbox sfperms strict unmerge-orphans userfetch"
GENTOO_MIRRORS="http://distfiles.gentoo.org http://distro.ibiblio.org/pub/linux/distributions/gentoo"
LANG="en_US.utf8"
LC_ALL="en_US.utf8"
LDFLAGS="-Wl,-O1"
LINGUAS="en"
MAKEOPTS="-j2"
PKGDIR="/usr/portage/packages"
PORTAGE_RSYNC_OPTS="--recursive --links --safe-links --perms --times --compress --force --whole-file --delete --stats --timeout=180 --exclude=/distfiles --exclude=/local --exclude=/packages"
PORTAGE_TMPDIR="/var/tmp"
PORTDIR="/usr/portage"
PORTDIR_OVERLAY="/usr/local/portage"
SYNC="rsync://rsync.gentoo.org/gentoo-portage"
USE="X alsa amd64 berkdb bzip2 cli cracklib crypt cups dri fortran gdbm gpm gtk iconv isdnlog jpeg midi mmx mudflap multilib ncurses nls nptl nptlonly opengl openmp pam pcre png pppd readline reflection session spl sse sse2 ssl svg sysfs tcpd tiff unicode vim-syntax xorg zlib" ALSA_CARDS="hda-intel" ALSA_PCM_PLUGINS="adpcm alaw asym copy dmix dshare dsnoop empty extplug file hooks iec958 ioplug ladspa lfloat linear meter mmap_emul mulaw multi null plug rate route share shm softvol" APACHE2_MODULES="actions alias auth_basic authn_alias authn_anon authn_dbm authn_default authn_file authz_dbm authz_default authz_groupfile authz_host authz_owner authz_user autoindex cache dav dav_fs dav_lock deflate dir disk_cache env expires ext_filter file_cache filter headers include info log_config logio mem_cache mime mime_magic negotiation rewrite setenvif speling status unique_id userdir usertrack vhost_alias" ELIBC="glibc" INPUT_DEVICES="keyboard evdev synaptics mouse" KERNEL="linux" LCD_DEVICES="bayrad cfontz cfontz633 glk hd44780 lb216 lcdm001 mtxorb ncurses text" LINGUAS="en" USERLAND="GNU" VIDEO_CARDS="nvidia"
Unset:  CPPFLAGS, CTARGET, EMERGE_DEFAULT_OPTS, FFLAGS, INSTALL_MASK, PORTAGE_COMPRESS, PORTAGE_COMPRESS_FLAGS, PORTAGE_RSYNC_EXTRA_OPTS

And the problem is the same when I disable unmaskirq via hdparm:
Code:
#> hdparm /dev/hdc
/dev/hdc:
 multcount     =  0 (off)
 IO_support    =  1 (32-bit)
 unmaskirq     =  0 (off)
 using_dma     =  1 (on)
 keepsettings  =  0 (off)
 readonly      =  0 (off)
 readahead     = 256 (on)
 geometry      = 65535/16/63, sectors = 117210240, start = 0

and with two other hard disks.
_________________
LP
lucapost
Veteran


Joined: 24 Nov 2005
Posts: 1419
Location: <ud|me|ts> - Italy

Posted: Wed Jan 21, 2009 5:48 pm

up!
_________________
LP
NeddySeagoon
Administrator


Joined: 05 Jul 2003
Posts: 54809
Location: 56N 3W

Posted: Wed Jan 21, 2009 7:27 pm

lucapost,

emerge smartmontools and look at the drive's internal error log.
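
A possible first pass with smartctl (the tool smartmontools installs) is sketched below. The helper name smart_checks is made up; it prints the commands rather than running them so you can review them first, and /dev/hda is only an example device.

```shell
#!/bin/sh
# Print smartctl invocations for a first look at a drive.
# smart_checks is a hypothetical helper; substitute your own device.
# Nothing is executed here: pipe the output to `sh` as root to run it.
smart_checks() {
    dev=$1
    echo "smartctl -H $dev"           # overall health self-assessment
    echo "smartctl -l error $dev"     # the drive's internal error log
    echo "smartctl -l selftest $dev"  # results of earlier self-tests
}

smart_checks /dev/hda
```

The second command is the one that shows the internal error log mentioned above.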
_________________
Regards,

NeddySeagoon
drescherjm
Advocate


Joined: 05 Jun 2004
Posts: 2790
Location: Pittsburgh, PA, USA

Posted: Tue Feb 03, 2009 3:17 am

Quote:
This is caused by hald; when I kill it, the machine works normally.


I am seeing similar bad behavior when hald is running. It prevents me from using hdparm to have the drive spin down after a period of inactivity. With hald installed, when I run

Code:
hdparm -S 243 /dev/sda


The hard drive activity light stays on and the system appears to have trouble accessing the disk. When hald is not started, this does not happen. Also, SMART reports no errors at all and all SMART self-tests pass.

Code:
[  140.282945] ata2.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen
[  140.282954] ata2.00: cmd e3/00:f3:00:00:00/00:00:00:00:00/40 tag 0
[  140.282956]          res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
[  140.282961] ata2.00: status: { DRDY }
[  140.282973] BAR5:00:02 01:7F 02:22 03:CA 04:00 05:00 06:00 07:00 08:00 09:00 0A:00 0B:00 0C:07 0D:00 0E:00 0F:00
[  146.554873] ata2: link is slow to respond, please be patient (ready=0)
[  152.226507] ata2: device not ready (errno=-16), forcing hardreset
[  152.226517] ata2: soft resetting link
[  158.976976] ata2: link is slow to respond, please be patient (ready=0)
[  165.001733] ata2: SRST failed (errno=-16)
[  165.001745] ata2: soft resetting link


This does not look like either of the error messages above, so I am thinking I should open my own thread. It only happens when hald is running, though.
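
For reference, the -S 243 value used above is an encoded standby timeout, not a plain number of seconds. The sketch below decodes it following the ranges described in the hdparm man page; standby_timeout is a made-up helper name. The timed-out command in the dmesg output (cmd e3 with count f3) appears to be that same request, since 0xf3 is 243.

```shell
#!/bin/sh
# Decode an `hdparm -S` value into a human-readable standby timeout.
# standby_timeout is a hypothetical helper; the ranges follow the
# description in the hdparm man page.
standby_timeout() {
    v=$1
    if [ "$v" -eq 0 ]; then
        echo "disabled"
    elif [ "$v" -le 240 ]; then
        echo "$(( v * 5 )) seconds"            # 1..240: units of 5 seconds
    elif [ "$v" -le 251 ]; then
        echo "$(( (v - 240) * 30 )) minutes"   # 241..251: units of 30 minutes
    elif [ "$v" -eq 252 ]; then
        echo "21 minutes"
    elif [ "$v" -eq 253 ]; then
        echo "vendor-defined (8 to 12 hours)"
    elif [ "$v" -eq 255 ]; then
        echo "21 minutes 15 seconds"
    else
        echo "reserved"
    fi
}

standby_timeout 243    # prints: 90 minutes
```

So the command asks the drive to spin down after an hour and a half of inactivity.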
_________________
John
Forum: Kernel & Hardware. All times are GMT.