oneeyedelf1 Tux's lil' helper


Joined: 04 Feb 2004 Posts: 124
Posted: Sun Oct 09, 2005 8:22 pm Post subject: is my hd failing? raid5 |
Well, I've had my computer all set up; all I do is turn it off when I go back home for breaks. Now I get this upon boot...
Basically, disk hda is going and protecting part of itself (see the "Host Protected Area detected" lines in the dmesg below). When I boot the drive in another computer I don't get this aggravation. Note that no real configuration change has been made since it last worked, other than that I was setting up mdadm to monitor the status of my raid5 set (which is minor and didn't involve touching the set). Is it the disk, my computer, the chipset, the kernel, or a status flag that I can reset? Please help. Oh yeah, hda is part of a raid5 data set, and separately holds the root and boot partitions. Below is the dmesg output, up to the point where the boot stops and asks for my password because md fails.
Code: |
Linux version 2.6.12-rc6 (root@localhost) (gcc version 3.3.5 (Gentoo Linux 3.3.5-r1, ssp-3.3.2-3, pie-8.7.7.1)) #3 SMP Mon Jun 20 14:27:18 EDT 2005
BIOS-provided physical RAM map:
BIOS-e820: 0000000000000000 - 000000000009fc00 (usable)
BIOS-e820: 000000000009fc00 - 00000000000a0000 (reserved)
BIOS-e820: 00000000000f0000 - 0000000000100000 (reserved)
BIOS-e820: 0000000000100000 - 000000002fff0000 (usable)
BIOS-e820: 000000002fff0000 - 000000002fff8000 (ACPI data)
BIOS-e820: 000000002fff8000 - 0000000030000000 (ACPI NVS)
BIOS-e820: 00000000fec00000 - 00000000fec01000 (reserved)
BIOS-e820: 00000000fee00000 - 00000000fee01000 (reserved)
BIOS-e820: 00000000fffc0000 - 0000000100000000 (reserved)
767MB LOWMEM available.
found SMP MP-table at 000fb9b0
On node 0 totalpages: 196592
DMA zone: 4096 pages, LIFO batch:1
Normal zone: 192496 pages, LIFO batch:31
HighMem zone: 0 pages, LIFO batch:1
DMI 2.3 present.
ACPI: RSDP (v000 AMI ) @ 0x000fa800
ACPI: RSDT (v001 AMIINT AMIINI09 0x00000010 MSFT 0x0100000d) @ 0x2fff0000
ACPI: FADT (v001 AMIINT AMIINI09 0x00000011 MSFT 0x0100000d) @ 0x2fff0030
ACPI: MADT (v001 AMIINT AMIINI09 0x00000011 MSFT 0x0100000d) @ 0x2fff00c0
ACPI: DSDT (v001 VIA VIA_K7 0x00001000 MSFT 0x0100000d) @ 0x00000000
ACPI: Local APIC address 0xfee00000
ACPI: LAPIC (acpi_id[0x01] lapic_id[0x00] enabled)
Processor #0 6:4 APIC version 16
ACPI: IOAPIC (id[0x02] address[0xfec00000] gsi_base[0])
IOAPIC[0]: apic_id 2, version 2, address 0xfec00000, GSI 0-23
ACPI: INT_SRC_OVR (bus 0 bus_irq 0 global_irq 2 dfl dfl)
ACPI: INT_SRC_OVR (bus 0 bus_irq 9 global_irq 9 low level)
ACPI: IRQ0 used by override.
ACPI: IRQ2 used by override.
ACPI: IRQ9 used by override.
Enabling APIC mode: Flat. Using 1 I/O APICs
Using ACPI (MADT) for SMP configuration information
Allocating PCI resources starting at 30000000 (gap: 30000000:cec00000)
Built 1 zonelists
Kernel command line: root=/dev/hda3
mapped APIC to ffffd000 (fee00000)
mapped IOAPIC to ffffc000 (fec00000)
Initializing CPU#0
PID hash table entries: 4096 (order: 12, 65536 bytes)
Detected 1406.876 MHz processor.
Using tsc for high-res timesource
Console: colour VGA+ 80x25
Dentry cache hash table entries: 131072 (order: 7, 524288 bytes)
Inode-cache hash table entries: 65536 (order: 6, 262144 bytes)
Memory: 774116k/786368k available (2888k kernel code, 11740k reserved, 1116k data, 240k init, 0k highmem)
Checking if this processor honours the WP bit even in supervisor mode... Ok.
Calibrating delay loop... 2768.89 BogoMIPS (lpj=1384448)
Mount-cache hash table entries: 512
CPU: After generic identify, caps: 0183fbff c1c7fbff 00000000 00000000 00000000 00000000 00000000
CPU: After vendor identify, caps: 0183fbff c1c7fbff 00000000 00000000 00000000 00000000 00000000
CPU: L1 I Cache: 64K (64 bytes/line), D cache 64K (64 bytes/line)
CPU: L2 Cache: 256K (64 bytes/line)
CPU: After all inits, caps: 0183fbff c1c7fbff 00000000 00000020 00000000 00000000 00000000
Intel machine check architecture supported.
Intel machine check reporting enabled on CPU#0.
Enabling fast FPU save and restore... done.
Checking 'hlt' instruction... OK.
CPU0: AMD Athlon(tm) Processor stepping 04
Total of 1 processors activated (2768.89 BogoMIPS).
ENABLING IO-APIC IRQs
..TIMER: vector=0x31 pin1=2 pin2=-1
Brought up 1 CPUs
CPU0 attaching sched-domain:
domain 0: span 01
groups: 01
domain 1: span 01
groups: 01
NET: Registered protocol family 16
PCI: PCI BIOS revision 2.10 entry at 0xfdb01, last bus=1
PCI: Using configuration type 1
mtrr: v2.0 (20020519)
ACPI: Subsystem revision 20050309
ACPI: Interpreter enabled
ACPI: Using IOAPIC for interrupt routing
ACPI: PCI Root Bridge [PCI0] (0000:00)
PCI: Probing PCI hardware (bus 00)
Boot video device is 0000:01:00.0
ACPI: PCI Interrupt Routing Table [\_SB_.PCI0._PRT]
ACPI: Power Resource [URP1] (off)
ACPI: Power Resource [URP2] (off)
ACPI: Power Resource [FDDP] (off)
ACPI: Power Resource [LPTP] (off)
ACPI: PCI Interrupt Link [LNKA] (IRQs 3 4 5 6 7 10 *11 12 14 15)
ACPI: PCI Interrupt Link [LNKB] (IRQs 3 4 5 6 7 10 *11 12 14 15)
ACPI: PCI Interrupt Link [LNKC] (IRQs 3 4 5 6 7 *10 11 12 14 15)
ACPI: PCI Interrupt Link [LNKD] (IRQs 3 4 5 6 7 10 *11 12 14 15)
Linux Plug and Play Support v0.97 (c) Adam Belay
pnp: PnP ACPI init
pnp: PnP ACPI: found 13 devices
SCSI subsystem initialized
PCI: Using ACPI for IRQ routing
PCI: If a device doesn't work, try "pci=routeirq". If it helps, post a report
Machine check exception polling timer started.
audit: initializing netlink socket (disabled)
audit(1128874355.788:0): initialized
Installing knfsd (copyright (C) 1996 okir@monad.swb.de).
Initializing Cryptographic API
ACPI: Power Button (FF) [PWRF]
lp: driver loaded but no devices found
Linux agpgart interface v0.101 (c) Dave Jones
[drm] Initialized drm 1.0.0 20040925
PNP: PS/2 Controller [PNP0303:PS2K,PNP0f03:PS2M] at 0x60,0x64 irq 1,12
serio: i8042 AUX port at 0x60,0x64 irq 12
serio: i8042 KBD port at 0x60,0x64 irq 1
Serial: 8250/16550 driver $Revision: 1.90 $ 8 ports, IRQ sharing disabled
ttyS0 at I/O 0x3f8 (irq = 4) is a 16550A
ttyS1 at I/O 0x2f8 (irq = 3) is a 16550A
ttyS0 at I/O 0x3f8 (irq = 4) is a 16550A
ttyS1 at I/O 0x2f8 (irq = 3) is a 16550A
parport: PnPBIOS parport detected.
parport0: PC-style at 0x378 (0x778), irq 7 [PCSPP(,...)]
lp0: using parport0 (interrupt-driven).
io scheduler noop registered
io scheduler anticipatory registered
io scheduler deadline registered
io scheduler cfq registered
Floppy drive(s): fd0 is 1.44M
FDC 0 is a post-1991 82077
ACPI: PCI Interrupt 0000:00:09.0[A] -> GSI 17 (level, low) -> IRQ 17
3c59x: Donald Becker and others. www.scyld.com/network/vortex.html
0000:00:09.0: 3Com PCI 3c905C Tornado at 0xd800. Vers LK1.1.19
Uniform Multi-Platform E-IDE driver Revision: 7.00alpha2
ide: Assuming 33MHz system bus speed for PIO modes; override with idebus=xx
PDC20268: IDE controller at PCI slot 0000:00:0a.0
ACPI: PCI Interrupt 0000:00:0a.0[A] -> GSI 18 (level, low) -> IRQ 18
PDC20268: chipset revision 2
PDC20268: ROM enabled at 0xdffe0000
PDC20268: 100% native mode on irq 18
ide2: BM-DMA at 0xdc00-0xdc07, BIOS settings: hde:pio, hdf:pio
ide3: BM-DMA at 0xdc08-0xdc0f, BIOS settings: hdg:pio, hdh:pio
Probing IDE interface ide2...
hde: Maxtor 6Y160P0, ATA DISK drive
ide2 at 0xec00-0xec07,0xe802 on irq 18
Probing IDE interface ide3...
hdg: WDC WD1600JB-00DUA3, ATA DISK drive
ide3 at 0xe400-0xe407,0xe002 on irq 18
PDC20268: IDE controller at PCI slot 0000:00:0c.0
ACPI: PCI Interrupt 0000:00:0c.0[A] -> GSI 16 (level, low) -> IRQ 16
PDC20268: chipset revision 2
PDC20268: ROM enabled at 0xdffb0000
PDC20268: 100% native mode on irq 16
ide4: BM-DMA at 0xc400-0xc407, BIOS settings: hdi:pio, hdj:pio
ide5: BM-DMA at 0xc408-0xc40f, BIOS settings: hdk:pio, hdl:pio
Probing IDE interface ide4...
hdi: WDC WD1600JB-00FUA0, ATA DISK drive
ide4 at 0xd400-0xd407,0xd002 on irq 16
Probing IDE interface ide5...
hdk: WDC WD1600JB-00EVA0, ATA DISK drive
ide5 at 0xcc00-0xcc07,0xc802 on irq 16
VP_IDE: IDE controller at PCI slot 0000:00:11.1
ACPI: PCI Interrupt 0000:00:11.1[A]: no GSI
VP_IDE: chipset revision 6
VP_IDE: not 100% native mode: will probe irqs later
VP_IDE: VIA vt8233 (rev 00) IDE UDMA100 controller on pci0000:00:11.1
ide0: BM-DMA at 0xfc00-0xfc07, BIOS settings: hda:DMA, hdb:pio
ide1: BM-DMA at 0xfc08-0xfc0f, BIOS settings: hdc:DMA, hdd:pio
Probing IDE interface ide0...
hda: probing with STATUS(0x50) instead of ALTSTATUS(0x7f)
hda: Maxtor 6L200P0, ATA DISK drive
hdb: probing with STATUS(0x00) instead of ALTSTATUS(0x7f)
hdb: probing with STATUS(0x00) instead of ALTSTATUS(0x7f)
ide0 at 0x1f0-0x1f7,0x3f6 on irq 14
Probing IDE interface ide1...
hdc: Maxtor 6Y160P0, ATA DISK drive
ide1 at 0x170-0x177,0x376 on irq 15
hde: max request size: 1024KiB
hde: 320173056 sectors (163928 MB) w/7936KiB Cache, CHS=19929/255/63, UDMA(100)
hde: cache flushes supported
hde: hde1
hdg: max request size: 1024KiB
hdg: 312581808 sectors (160041 MB) w/8192KiB Cache, CHS=19457/255/63, UDMA(100)
hdg: cache flushes supported
hdg: hdg1
hdi: max request size: 1024KiB
hdi: 312581808 sectors (160041 MB) w/8192KiB Cache, CHS=19457/255/63, UDMA(100)
hdi: cache flushes supported
hdi: hdi1
hdk: max request size: 1024KiB
hdk: 312581808 sectors (160041 MB) w/8192KiB Cache, CHS=19457/255/63, UDMA(100)
hdk: cache flushes supported
hdk: hdk1
hda: max request size: 1024KiB
hda: Host Protected Area detected.
current capacity is 398297088 sectors (203928 MB)
native capacity is 208391808845824 sectors (106696606129 MB)
hda: task_no_data_intr: status=0x51 { DriveReady SeekComplete Error }
hda: task_no_data_intr: error=0x04 { DriveStatusError }
ide: failed opcode was: 0x37
hda: 398297088 sectors (203928 MB) w/8192KiB Cache, CHS=24792/255/63, UDMA(100)
hda: cache flushes supported
hda: hda1 hda2 hda3 hda4
hdc: max request size: 1024KiB
hdc: 320173056 sectors (163928 MB) w/7936KiB Cache, CHS=19929/255/63, UDMA(100)
hdc: cache flushes supported
hdc: hdc1
libata version 1.11 loaded.
ieee1394: raw1394: /dev/raw1394 device initialized
mice: PS/2 mouse device common for all mice
md: raid5 personality registered as nr 4
raid5: measuring checksumming speed
8regs : 1880.000 MB/sec
8regs_prefetch: 1780.000 MB/sec
32regs : 1316.000 MB/sec
32regs_prefetch: 1388.000 MB/sec
pII_mmx : 3760.000 MB/sec
p5_mmx : 5044.000 MB/sec
raid5: using function: p5_mmx (5044.000 MB/sec)
md: md driver 0.90.1 MAX_MD_DEVS=256, MD_SB_DISKS=27
device-mapper: 4.4.0-ioctl (2005-01-12) initialised: dm-devel@redhat.com
Advanced Linux Sound Architecture Driver Version 1.0.9rc2 (Thu Mar 24 10:33:39 2005 UTC).
ALSA device list:
No soundcards found.
oprofile: using NMI interrupt.
NET: Registered protocol family 2
IP: routing cache hash table of 4096 buckets, 64Kbytes
TCP established hash table entries: 131072 (order: 9, 2097152 bytes)
TCP bind hash table entries: 65536 (order: 7, 786432 bytes)
TCP: Hash tables configured (established 131072 bind 65536)
ip_conntrack version 2.1 (6143 buckets, 49144 max) - 220 bytes per conntrack
ip_tables: (C) 2000-2002 Netfilter core team
input: AT Translated Set 2 keyboard on isa0060/serio0
ipt_recent v0.3.1: Stephen Frost <sfrost@snowman.net>. http://snowman.net/projects/ipt_recent/
arp_tables: (C) 2002 David S. Miller
NET: Registered protocol family 1
NET: Registered protocol family 17
md: Autodetecting RAID arrays.
md: autorun ...
md: ... autorun DONE.
EXT3-fs: mounted filesystem with ordered data mode.
VFS: Mounted root (ext3 filesystem) readonly.
Freeing unused kernel memory: 240k freed
kjournald starting. Commit interval 5 seconds
input: ImExPS/2 Generic Explorer Mouse on isa0060/serio1
Adding 1469936k swap on /dev/hda2. Priority:-1 extents:1
EXT3 FS on hda3, internal journal
md: md0 stopped.
md: raidstart(pid 4118) used deprecated START_ARRAY ioctl. This will not be supported beyond 2.6
md: autorun ...
md: considering hdk1 ...
md: adding hdk1 ...
md: adding hdi1 ...
md: adding hdg1 ...
md: adding hde1 ...
md: adding hdc1 ...
md: adding hda4 ...
md: created md0
md: bind<hda4>
md: bind<hdc1>
md: bind<hde1>
md: bind<hdg1>
md: bind<hdi1>
md: bind<hdk1>
md: running: <hdk1><hdi1><hdg1><hde1><hdc1><hda4>
raid5: device hdk1 operational as raid disk 5
raid5: device hdi1 operational as raid disk 4
raid5: device hdg1 operational as raid disk 3
raid5: device hde1 operational as raid disk 2
raid5: device hdc1 operational as raid disk 1
raid5: device hda4 operational as raid disk 0
raid5: allocated 6292kB for md0
raid5: raid level 5 set md0 active with 6 out of 6 devices, algorithm 3
RAID5 conf printout:
--- rd:6 wd:6 fd:0
disk 0, o:1, dev:hda4
disk 1, o:1, dev:hdc1
disk 2, o:1, dev:hde1
disk 3, o:1, dev:hdg1
disk 4, o:1, dev:hdi1
disk 5, o:1, dev:hdk1
md: ... autorun DONE. |
flybynite l33t

Joined: 06 Dec 2002 Posts: 620
Posted: Mon Oct 10, 2005 3:32 am Post subject: |
I am not a RAID expert, but I have had disks fail. You can use mdadm to mark this disk as failed and then run the array degraded if you need to. After that you can treat this disk like any other.
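For the mdadm step, here is a rough sketch of what failing the member and running degraded could look like. The array and member names (/dev/md0, /dev/hda4) come from the dmesg in the first post; the run and degrade_array helpers and the DRY_RUN variable are made up for illustration. By default the script only prints the commands instead of running them.

```shell
#!/bin/bash
# Hypothetical helper: print each command unless DRY_RUN=0 is set.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Hypothetical helper: mark one member as faulty, pull it out of the
# array, then show the array state so you can confirm it is degraded.
degrade_array() {
    local md="$1" member="$2"
    run mdadm "$md" --fail "$member"
    run mdadm "$md" --remove "$member"
    run mdadm --detail "$md"
}

# Names taken from the first post's dmesg; adjust for your own setup.
degrade_array /dev/md0 /dev/hda4
```

Running it with DRY_RUN=0 (as root) would execute the commands for real. Once the disk checks out, something like mdadm /dev/md0 --add /dev/hda4 should put the partition back and start a rebuild.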
This is the first I've seen of the Host Protected Area (HPA). There is some material about HPA on this page, a little way down from the top:
http://www.sleuthkit.org/informer/sleuthkit-informer-17.html
It seems some software can detect it and some can't. Maybe that is why it shows up now, after a kernel or other software upgrade?
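One quick sanity check on the HPA numbers in the dmesg from the first post. Assuming 512-byte sectors (my assumption, though it does reproduce the MB figures the kernel printed), the "native capacity" comes out at roughly 106 petabytes, which no 200 GB drive has. So the native figure looks bogus rather than a real hidden area, but that is only a guess. sectors_to_mb is a made-up helper name.

```shell
#!/bin/bash
# Convert a count of sectors to decimal megabytes (10^6 bytes).
# Assumption: 512-byte sectors.
sectors_to_mb() {
    echo $(( $1 * 512 / 1000000 ))
}

sectors_to_mb 398297088          # current capacity -> prints 203928
sectors_to_mb 208391808845824    # "native" capacity -> prints 106696606129
```

Both results match the MB values in the kernel messages, so the sector counts were at least copied correctly; it is the native count itself that is implausible.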
You need to run smartctl to check the status of the disk. This queries the disk's built-in failure detection (SMART).
* sys-apps/smartmontools
Available versions: 5.33
Installed: 5.33
Homepage: http://smartmontools.sourceforge.net/
Description: control and monitor storage systems using the Self-Monitoring, Analysis and Reporting Technology
Install it with:
emerge -va smartmontools
Then run:
smartctl -a /dev/hda
which will show something like this (this sample is from /dev/hdc on host gate1):
Code: |
gate1 ~ # smartctl -a /dev/hdc
smartctl version 5.33 [i686-pc-linux-gnu] Copyright (C) 2002-4 Bruce Allen
Home page is http://smartmontools.sourceforge.net/
=== START OF INFORMATION SECTION ===
Device Model: ST3200822A
Serial Number: 3LJ06ZE9
Firmware Version: 3.01
User Capacity: 200,049,647,616 bytes
Device is: In smartctl database [for details use: -P show]
ATA Version is: 6
ATA Standard is: ATA/ATAPI-6 T13 1410D revision 2
Local Time is: Sun Oct 9 22:28:26 2005 CDT
SMART support is: Available - device has SMART capability.
SMART support is: Enabled
=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED
General SMART Values:
Offline data collection status: (0x82) Offline data collection activity
was completed without error.
Auto Offline Data Collection: Enabled.
Self-test execution status: ( 0) The previous self-test routine completed
without error or no self-test has ever
been run.
Total time to complete Offline
data collection: ( 430) seconds.
Offline data collection
capabilities: (0x5b) SMART execute Offline immediate.
Auto Offline data collection on/off support.
Suspend Offline collection upon new
command.
Offline surface scan supported.
Self-test supported.
No Conveyance Self-test supported.
Selective Self-test supported.
SMART capabilities: (0x0003) Saves SMART data before entering
power-saving mode.
Supports SMART auto save timer.
Error logging capability: (0x01) Error logging supported.
No General Purpose Logging support.
Short self-test routine
recommended polling time: ( 1) minutes.
Extended self-test routine
recommended polling time: ( 111) minutes.
SMART Attributes Data Structure revision number: 10
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x000f   051   047   006    Pre-fail  Always       -       72127917
  3 Spin_Up_Time            0x0003   096   096   000    Pre-fail  Always       -       0
  4 Start_Stop_Count        0x0032   100   100   020    Old_age   Always       -       90
  5 Reallocated_Sector_Ct   0x0033   100   100   036    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x000f   087   060   030    Pre-fail  Always       -       568101779
  9 Power_On_Hours          0x0032   090   090   000    Old_age   Always       -       9521
 10 Spin_Retry_Count        0x0013   100   100   097    Pre-fail  Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   020    Old_age   Always       -       112
194 Temperature_Celsius     0x0022   044   052   000    Old_age   Always       -       44
195 Hardware_ECC_Recovered  0x001a   051   046   000    Old_age   Always       -       72127917
197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0010   100   100   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x003e   200   200   000    Old_age   Always       -       0
200 Multi_Zone_Error_Rate   0x0000   100   253   000    Old_age   Offline      -       0
202 TA_Increase_Count       0x0032   100   253   000    Old_age   Always       -       0
SMART Error Log Version: 1
No Errors Logged
SMART Self-test log structure revision number 1
Num Test_Description Status Remaining LifeTime(hours) LBA_of_first_error
# 1 Short offline Completed without error 00% 6427 -
# 2 Short offline Completed without error 00% 6185 -
SMART Selective self-test log data structure revision number 1
SPAN MIN_LBA MAX_LBA CURRENT_TEST_STATUS
1 0 0 Not_testing
2 0 0 Not_testing
3 0 0 Not_testing
4 0 0 Not_testing
5 0 0 Not_testing
Selective self-test flags (0x0):
After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.
|
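Out of all that output, the counters commonly watched on a suspect disk are the reallocated, pending, and uncorrectable sector counts. Here is a small sketch to pull just those out. bad_sector_attrs is a made-up name, and it assumes the ten-column attribute table layout shown above, with the raw value in the last column.

```shell
#!/bin/bash
# Read smartctl attribute output on stdin and print "name raw_value"
# for the attributes that count bad or suspect sectors.
# Assumption: attribute name is column 2, raw value is column 10.
bad_sector_attrs() {
    awk '$2 ~ /^(Reallocated_Sector_Ct|Current_Pending_Sector|Offline_Uncorrectable)$/ { print $2, $10 }'
}

# Sample lines copied from the output above; on a live system you
# would pipe in `smartctl -A /dev/hda` instead.
bad_sector_attrs <<'EOF'
  5 Reallocated_Sector_Ct   0x0033   100   100   036    Pre-fail  Always       -       0
194 Temperature_Celsius     0x0022   044   052   000    Old_age   Always       -       44
197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       0
EOF
```

Non-zero raw values on any of those three are usually a reason to look closer, for example with a short self-test (smartctl -t short /dev/hda, then smartctl -l selftest /dev/hda once it finishes).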