Gentoo Forums
emerge sync fails, is it ext3 or is my drive dying? [solved]
nick_already_taken
Tux's lil' helper


Joined: 15 Jan 2005
Posts: 137

Posted: Sat Nov 22, 2008 9:52 pm    Post subject: emerge sync fails, is it ext3 or is my drive dying? [solved]

[Part Two]

Here is a bit more information:

I have checked my drive with HUTIL 2.10, Samsung's hard drive diagnostics software. The diagnostic data shows no errors. Meanwhile I have
set my ext3 partition to be checked after each mount. In about one out of three system boots the filesystem check detects an error that has to be corrected
manually with fsck.ext3.
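
On ext3, one common way to get a check at every mount is to set the maximum mount count to 1 with tune2fs. A minimal sketch, assuming the partition is /dev/sdXN (a placeholder, the thread never names it) and that the filesystem has a non-zero fsck pass number in /etc/fstab. The `run` helper is hypothetical and only prints the commands unless RUN=1 is set:

```shell
#!/bin/sh
# DEV is a placeholder; replace it with the real ext3 partition.
DEV=${DEV:-/dev/sdXN}

# Hypothetical helper: print the command by default, execute it when RUN=1.
run() {
    if [ "${RUN:-0}" = 1 ]; then
        "$@"
    else
        echo "$@"
    fi
}

# Maximum mount count 1: the boot-time fsck finds the limit reached on every boot.
run tune2fs -c 1 "$DEV"
# Show the resulting counters and the current filesystem state.
run tune2fs -l "$DEV"
```

Running it with RUN=1 needs root; without it, the script only shows what would be executed.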

I have upgraded my motherboard to the latest ASRock BIOS version. It made no difference. I have also run memtest to make sure that my memory
isn't dying; it found nothing wrong.

Does anyone have an idea what else I could do? I am quite desperate, as I have absolutely no idea what is going on here.

Thanks for reading.



[Part One]


Hi,

I keep running into problems (see https://forums.gentoo.org/viewtopic-t-708985-highlight-.html) after running "emerge sync".

This time I receive:

Quote:

System Info: AMD Athlon 64 3700+, 2048 MB DDR-RAM, 2*160 GB HDD, Gentoo Linux, Xen
Connection: 100 MBit/s, rsync limited to 20 connections
Location: Nuernberg, Germany
Contact: Sven Wegener <swegener@gentoo.org>

receiving incremental file list
file has vanished: "/usr/portage/mail-filter/libmilter/ChangeLog"
IO error encountered -- skipping file deletion
mail-filter/libmilter/ChangeLog
rsync: rename "/usr/portage/mail-filter/libmilter/.ChangeLog.dUT8s4" -> "mail-filter/libmilter/ChangeLog": No such file or directory (2)

Number of files: 129050
Number of files transferred: 1
Total file size: 167374355 bytes
Total transferred file size: 2053 bytes
Literal data: 2053 bytes
Matched data: 0 bytes
File list size: 3027160
File list generation time: 0.001 seconds
File list transfer time: 0.000 seconds
Total bytes sent: 19591
Total bytes received: 3111611

sent 19591 bytes received 3111611 bytes 152741.56 bytes/sec
total size is 167374355 speedup is 53.45
rsync error: some files could not be transferred (code 23) at main.c(1497) [generator=3.0.2]
>>> Exceeded PORTAGE_RSYNC_RETRIES: 3
real 223.74
user 5.44
sys 7.71
* emerge --sync failed
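
Two numbers in this output can be decoded directly: exit code 23 is listed in the rsync man page as "Partial transfer due to error", and the speedup figure matches the total size divided by bytes sent plus bytes received. A small sketch of that arithmetic (rsync_speedup is a made-up helper name, not an rsync option):

```shell
#!/bin/sh
# usage: rsync_speedup TOTAL_SIZE BYTES_SENT BYTES_RECEIVED
rsync_speedup() {
    awk -v total="$1" -v sent="$2" -v recv="$3" \
        'BEGIN { printf "%.2f\n", total / (sent + recv) }'
}

# Values copied from the rsync summary above.
rsync_speedup 167374355 19591 3111611   # prints 53.45
```

So the run moved only about 3 MB, and the one thing that failed was the rename of a single file on the local disk.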


After running "ls -l", the damaged file looks like this:

Quote:

-????????? ? ? ? ? ? ChangeLog


After I run fsck.ext3 everything is fine. But only a few days later the problem shows up again.

I use a 2.6.24-gentoo-r8 kernel together with an ext3 filesystem:

Quote:

Filesystem volume name: <none>
Last mounted on: <not available>
Filesystem UUID: 8c2281ee-8457-46d8-b931-280fca816728
Filesystem magic number: 0xEF53
Filesystem revision #: 1 (dynamic)
Filesystem features: has_journal ext_attr resize_inode dir_index filetype needs_recovery sparse_super large_file
Filesystem flags: signed_directory_hash
Default mount options: (none)
Filesystem state: clean with errors
Errors behavior: Continue
Filesystem OS type: Linux
Inode count: 2564096
Block count: 10239421
Reserved block count: 511971
Free blocks: 7358250
Free inodes: 2078208
First block: 0
Block size: 4096
Fragment size: 4096
Reserved GDT blocks: 1021
Blocks per group: 32768
Fragments per group: 32768
Inodes per group: 8192
Inode blocks per group: 512
Filesystem created: Sat Jun 7 13:53:04 2008
Last mount time: Sat Oct 11 17:04:46 2008
Last write time: Sat Oct 11 15:33:17 2008
Mount count: 5
Maximum mount count: 29
Last checked: Wed Oct 8 17:12:44 2008
Check interval: 15552000 (6 months)
Next check after: Mon Apr 6 17:12:44 2009
Reserved blocks uid: 0 (user root)
Reserved blocks gid: 0 (group root)
First inode: 11
Inode size: 256
Journal inode: 8
First orphan inode: 2342969
Default directory hash: tea
Directory Hash Seed: d602c3a4-62e9-4122-9339-2b1bc1e3b0a1
Journal backup: inode blocks
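
The counters in this listing show why the periodic check would not have kicked in on its own for a while. A rough sketch of the arithmetic, with the values copied from the listing (the function names are only illustrative):

```shell
#!/bin/sh
# Seconds -> whole days.
interval_days() { echo $(( $1 / 86400 )); }
# Mounts left before the mount-count check fires: maximum minus current.
mounts_left() { echo $(( $1 - $2 )); }
# Filesystem size in MiB from block count and block size.
fs_size_mib() { echo $(( $1 * $2 / 1048576 )); }

interval_days 15552000        # "Check interval": prints 180
mounts_left 29 5              # "Maximum mount count" - "Mount count": prints 24
fs_size_mib 10239421 4096     # "Block count" * "Block size": prints 39997
```

With 24 mounts or 180 days to go, any corruption appearing in between would only be noticed when something trips over it, as rsync did here.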


I use the following brand new drive:

Quote:

/dev/sda:

ATA device, with non-removable media
Model Number: SAMSUNG HD200HJ
Serial Number: S16KJ9CPC00227
Firmware Revision: KF100-06
Standards:
Used: ATA-8-ACS revision 3b
Supported: 7 6 5 4
Configuration:
Logical max current
cylinders 16383 16383
heads 16 16
sectors/track 63 63
--
CHS current addressable sectors: 16514064
LBA user addressable sectors: 268435455
LBA48 user addressable sectors: 390721968
device size with M = 1024*1024: 190782 MBytes
device size with M = 1000*1000: 200049 MBytes (200 GB)
Capabilities:
LBA, IORDY(can be disabled)
Queue depth: 32
Standby timer values: spec'd by Standard, no device specific minimum
R/W multiple sector transfer: Max = 16 Current = 16
Advanced power management level: disabled
Recommended acoustic management value: 254, current value: 0
DMA: mdma0 mdma1 mdma2 udma0 udma1 udma2 udma3 udma4 udma5 *udma6 udma7
Cycle time: min=120ns recommended=120ns
PIO: pio0 pio1 pio2 pio3 pio4
Cycle time: no flow control=120ns IORDY flow control=120ns
Commands/features:
Enabled Supported:
* SMART feature set
Security Mode feature set
* Power Management feature set
* Write cache
* Look-ahead
* Host Protected Area feature set
* WRITE_BUFFER command
* READ_BUFFER command
* NOP cmd
* DOWNLOAD_MICROCODE
Advanced Power Management feature set
SET_MAX security extension
Automatic Acoustic Management feature set
* 48-bit Address feature set
* Device Configuration Overlay feature set
* Mandatory FLUSH_CACHE
* FLUSH_CACHE_EXT
* SMART error logging
* SMART self-test
* General Purpose Logging feature set
* 64-bit World wide name
* WRITE_UNCORRECTABLE_EXT command
* {READ,WRITE}_DMA_EXT_GPL commands
* Segmented DOWNLOAD_MICROCODE
* SATA-I signaling speed (1.5Gb/s)
* SATA-II signaling speed (3.0Gb/s)
* Native Command Queueing (NCQ)
* Host-initiated interface power management
* Phy event counters
DMA Setup Auto-Activate optimization
Device-initiated interface power management
* Software settings preservation
* SMART Command Transport (SCT) feature set
* SCT Long Sector Access (AC1)
* SCT LBA Segment Access (AC2)
* SCT Error Recovery Control (AC3)
* SCT Features Control (AC4)
* SCT Data Tables (AC5)
Security:
Master password revision code = 65534
supported
not enabled
not locked
not frozen
not expired: security count
supported: enhanced erase
50min for SECURITY ERASE UNIT. 50min for ENHANCED SECURITY ERASE UNIT.
Logical Unit WWN Device Identifier: 5000f009bc0227
NAA : 5
IEEE OUI : f0
Unique ID : 09bc00227
Checksum: correct
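
The two device sizes printed here follow directly from the LBA48 sector count, and the LBA28 figure is simply the largest value a 28-bit address can hold. A quick check, assuming 512-byte logical sectors (not stated in the output, but it is what makes the numbers agree); the helper names are made up:

```shell
#!/bin/sh
SECTOR_BYTES=512   # assumption: 512-byte logical sectors

size_mib() { echo $(( $1 * SECTOR_BYTES / 1048576 )); }   # M = 1024*1024
size_mb()  { echo $(( $1 * SECTOR_BYTES / 1000000 )); }   # M = 1000*1000

size_mib 390721968          # prints 190782
size_mb  390721968          # prints 200049
echo $(( (1 << 28) - 1 ))   # prints 268435455, the LBA28 ceiling
```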


Here is the smartctl output:

Quote:

smartctl version 5.38 [i686-pc-linux-gnu] Copyright (C) 2002-8 Bruce Allen
Home page is http://smartmontools.sourceforge.net/

=== START OF READ SMART DATA SECTION ===
SMART Error Log Version: 1
ATA Error Count: 1
CR = Command Register [HEX]
FR = Features Register [HEX]
SC = Sector Count Register [HEX]
SN = Sector Number Register [HEX]
CL = Cylinder Low Register [HEX]
CH = Cylinder High Register [HEX]
DH = Device/Head Register [HEX]
DC = Device Command Register [HEX]
ER = Error register [HEX]
ST = Status register [HEX]
Powered_Up_Time is measured from power on, and printed as
DDd+hh:mm:SS.sss where DD=days, hh=hours, mm=minutes,
SS=sec, and sss=millisec. It "wraps" after 49.710 days.

Error 1 occurred at disk power-on lifetime: 323 hours (13 days + 11 hours)
When the command that caused the error occurred, the device was active or idle.

After command completion occurred, registers were:
ER ST SC SN CL CH DH
-- -- -- -- -- -- --
40 51 18 3a bd 43 e1 Error: UNC 24 sectors at LBA = 0x0143bd3a = 21216570

Commands leading to the command that caused the error were:
CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name
-- -- -- -- -- -- -- -- ---------------- --------------------
c8 00 18 3a bd 43 e1 00 00:00:41.563 READ DMA
c8 00 20 e2 bc 43 e1 00 00:00:36.688 READ DMA
c8 00 18 b2 bc 43 e1 00 00:00:36.688 READ DMA
c8 00 18 82 bc 43 e1 00 00:00:36.688 READ DMA
c8 00 08 c2 bc 20 e1 00 00:00:36.688 READ DMA
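
The LBA in the error line can be reproduced from the register dump. For a 28-bit command such as READ DMA, the low four bits of DH hold the top of the address and CH, CL and SN hold the rest. A minimal sketch (lba28 is a hypothetical helper name, not part of smartmontools):

```shell
#!/bin/sh
# usage: lba28 DH CH CL SN   (hex strings without a 0x prefix)
lba28() {
    dh=$1 ch=$2 cl=$3 sn=$4
    # Low nibble of DH = LBA bits 27..24, then CH, CL, SN from high to low.
    echo $(( ((0x$dh & 0x0f) << 24) | (0x$ch << 16) | (0x$cl << 8) | 0x$sn ))
}

# Registers after the failed command: SC=18 SN=3a CL=bd CH=43 DH=e1
lba28 e1 43 bd 3a     # prints 21216570
echo $(( 0x18 ))      # prints 24, the sector count of the request
```

Both values match the "UNC 24 sectors at LBA = 0x0143bd3a = 21216570" summary, i.e. the drive reported an uncorrectable read over that 24-sector request.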


smartctl -H /dev/sda

Quote:

smartctl version 5.38 [i686-pc-linux-gnu] Copyright (C) 2002-8 Bruce Allen
Home page is http://smartmontools.sourceforge.net/

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED


Apart from this I have no other problems: no lock-ups, nothing. Until a few months ago I had used
reiserfs for 4 years on another drive with no problems.

Is it ext3 or is my drive starting to fail?


Last edited by nick_already_taken on Fri Jan 02, 2009 9:16 pm; edited 1 time in total
chtof
n00b


Joined: 29 Aug 2003
Posts: 62
Location: France

Posted: Sun Nov 23, 2008 12:53 am

According to http://lists.samba.org/archive/rsync/2006-December/016888.html, it can be caused by a problematic file. Can you try deleting everything under /usr/portage
Code:
rm -rf /usr/portage/*
and restart
Code:
emerge --sync
?
If the problem persists after that, try running the Samsung diagnostic while your disk is "hot".
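
As an alternative to the vendor tool, smartmontools (already used in the first post) can start the drive's built-in extended self-test from the running system, so the disk can be tested while it is warm. A sketch, assuming the disk is /dev/sda as in the first post; the `run` helper is hypothetical and only prints the commands unless RUN=1 is set:

```shell
#!/bin/sh
DISK=${DISK:-/dev/sda}   # device name taken from the first post

# Hypothetical helper: print the command by default, execute it when RUN=1.
run() {
    if [ "${RUN:-0}" = 1 ]; then
        "$@"
    else
        echo "$@"
    fi
}

run smartctl -t long "$DISK"       # start the extended (long) self-test
run smartctl -l selftest "$DISK"   # read the self-test log once the test is done
run smartctl -l error "$DISK"      # check whether new ATA errors were logged
```

The long test runs in the background on the drive itself; the self-test log is only meaningful after it has finished.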
jcat
Veteran


Joined: 26 May 2006
Posts: 1337

Posted: Sun Nov 23, 2008 11:43 am    Post subject: Re: emerge sync fails, is it ext3 or is my drive dying? (Par

nick_already_taken wrote:


Until a few months ago I had used
reiserfs for 4 years on another drive with no problems.

Is it ext3 or is my drive starting to fail?


Err, without wishing to start a flame war, Reiser is far less mature than Ext! I seriously doubt you have found some bug in Ext3; this is much more likely to be your drive, even if it's new, and even if it passes the manufacturer's diagnostic tests. I'm not saying it's definitely the drive, but it's the most likely option IMHO.

You could try backing up and reformatting that problem partition, it's worth a shot. It's also worth re-seating all cables to the drive (at both ends).



Cheers,
jcat
nick_already_taken
Tux's lil' helper


Joined: 15 Jan 2005
Posts: 137

Posted: Sun Dec 07, 2008 2:37 pm

Thanks for your advice. I have now moved my portage directory to a different partition.

After every system boot I have to run fsck manually. The funny thing is that every time the same file is reported as corrupt:
"Xorg.0.log.old". So it seems that the problem has nothing to do with rsync and the portage tree.

If I have the time I will reformat my whole root partition and see what happens.
nick_already_taken
Tux's lil' helper


Joined: 15 Jan 2005
Posts: 137

Posted: Fri Jan 02, 2009 9:16 pm

After I reformatted my root partition and restored everything, the problem seems to have vanished. I still do not know what the problem was. But of course I don't have to understand everything.
Back to top
View user's profile Send private message
Display posts from previous:   
Reply to topic    Gentoo Forums Forum Index Kernel & Hardware All times are GMT
Page 1 of 1

 
Jump to:  
You cannot post new topics in this forum
You cannot reply to topics in this forum
You cannot edit your posts in this forum
You cannot delete your posts in this forum
You cannot vote in polls in this forum