kpep01 Tux's lil' helper
Joined: 20 Oct 2005 Posts: 96 Location: Seattle Area, WA
|
Posted: Mon Aug 18, 2008 2:20 am Post subject: Weird HD Progressive Meltdown |
|
|
This problem may have started earlier on an older (now gone) x86 box. The old box died what I thought was a natural death of old age; now I wonder.
Last November (2007) I plugged all of the old hard drives into my AMD64 box and copied the files over to my SATA drives. By May of 2008, the AMD64 box was becoming unresponsive and locking up. Eventually, the core system files were wiped out.
The SATA drive (SD) that I was using was less than one year old, so that should not have been an issue.
When the system eventually wouldn't do squat, I simply rebuilt using a spare SATA drive I had sitting around. I rebuilt Gentoo on the new SATA drive, then copied my files from the old SATA drive onto the new one. It was a bit of a chore, considering the I/O errors, but I got the job done.
After the file copy process was complete, I reformatted the old SATA drive and copied my archives back onto that SD.
Again, this was last May (2008). The new SATA drive had never been used, so was clean.
Within the last week, the progressive SD meltdown began happening again. Recognizing what was happening, I was quick to make sure that my current config files, along with all of the other important stuff on the newer drive, were copied to the older SATA drive.
This process was again complicated by numerous I/O errors, but I did manage to save my files.
I then began the laborious task of rebuilding Gentoo on the newer drive, starting with a complete reformatting of the drive (if there is something living on the drive, I want it gone).
Now we come to the fun part.
I can't reformat the drive, whether using multiple partitions or using it as one single partition. There are numerous I/O errors, etc. This really shouldn't be the case on a drive that is less than six months old.
Most interesting among the error messages is (or seems to be in my paranoid mind):
Code: | sd 0:0:0:0: rejecting I/O to offline device |
There are numerous other error codes that are making me a tad paranoid.
I've gone through a very laborious process of reformatting partitions until the error codes disappear, only to find them further along the line. Once I get all the way through reformatting the drive, I start over, only to find the errors back in partitions where I thought they'd been worked out.
The same thing happens when I wipe out the partitions and try to reformat the SATA as a single partition.
I could go on at greater length, but I'd like some idea of where to start in locating this problem.
I know enough about Linux to be dangerous to myself, and that's about it. This problem, however, doesn't seem typical of the normal mundane Gentoo stuff. Call me paranoid, but I don't think the cheese is smelling too good.
(I believe this due to the progressive and recurring nature of the problem.)
Let me know what you think, and what more info you need so we can sort this problem out. _________________ There is no higher religion than human service.
To work for the common good is the highest creed.
-Albert Schweitzer |
|
|
|
bunder Bodhisattva
Joined: 10 Apr 2004 Posts: 5947
|
Posted: Mon Aug 18, 2008 3:38 am Post subject: |
|
|
Moved from Networking & Security to Kernel & Hardware. _________________
Neddyseagoon wrote: | The problem with leaving is that you can only do it once and it reduces your influence. |
banned from #gentoo since sept 2017 |
|
|
|
Janne Pikkarainen Veteran
Joined: 29 Jul 2003 Posts: 1143 Location: Helsinki, Finland
|
Posted: Mon Aug 18, 2008 5:51 am Post subject: |
|
|
Bad cable, bad RAM, too much heat, or actually a dying drive. _________________ Yes, I'm the man. Now it's your turn to decide if I meant "Yes, I'm the male." or "Yes, I am the Unix Manual Page.". |
|
|
|
manaka Apprentice
Joined: 23 Jul 2007 Posts: 178 Location: Spain
|
Posted: Mon Aug 18, 2008 7:34 am Post subject: |
|
|
Could you post the output of dmesg so we can see the kernel messages?
The output of smartctl -a /dev/sda could also be of help to spot a dying drive. _________________ Javier Miqueleiz
"Listen to your heart. It knows all things, because it came from the Soul of the World, and it will one day return there." |
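A rough way to sift a long dmesg dump before posting it is sketched here in Python. It only counts lines that contain a few substrings. The pattern list is an assumption chosen for illustration (the first entry is the message quoted earlier in the thread), and the sample text is a made-up placeholder, not real output from the affected machine.

```python
from collections import Counter

# Substrings to look for. This short list is an illustrative assumption,
# not a complete catalogue of kernel disk-error messages.
SUSPECT_PATTERNS = (
    "rejecting I/O to offline device",
    "I/O error",
)

def count_suspect_lines(log_text):
    """Count, per pattern, how many lines of log_text contain that pattern."""
    counts = Counter()
    for line in log_text.splitlines():
        for pattern in SUSPECT_PATTERNS:
            if pattern in line:
                counts[pattern] += 1
    return counts

# Hypothetical sample input for demonstration only.
SAMPLE_LOG = """\
sd 0:0:0:0: rejecting I/O to offline device
sd 0:0:0:0: rejecting I/O to offline device
placeholder line reporting an I/O error on some device
placeholder line with nothing of interest
"""

if __name__ == "__main__":
    for pattern, hits in count_suspect_lines(SAMPLE_LOG).most_common():
        print(f"{hits:3d}  {pattern}")
```

To try it on real data, the text passed to `count_suspect_lines` could be the contents of a file holding saved dmesg output. A high count for one device only hints at where to look; it does not tell a bad cable from a dying drive.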
|
|
|
pappy_mcfae Watchman
Joined: 27 Dec 2007 Posts: 5999 Location: Pomona, California.
|
Posted: Mon Aug 18, 2008 9:55 pm Post subject: |
|
|
A drive can fail at six hours old, six days old, six months old... for any number of reasons. When you consider the physics involved with surface mount chips, it's a wonder that most cutting-edge electronic systems don't just fall into smoldering piles of ashes. When one considers the physics of how the information is physically stored on the drive's platters, it's a miracle that data can survive at all.
If I were you, I'd take said drive and get it replaced. Don't worry about theories. If you still have the receipt, replace! I have a Maxtor that's going back soon. It went to sleep functional after I removed it from my tower. I went to do some funtoo experimentation, and the drive was dead in the water. It spins up, but there is no head access, clicking, or the like. It's less than two years old (5-year warranty).
I have no idea what killed it. All I know is it was dead, and I was really glad that there was NOTHING of import on the drive. As NeddySeagoon's sig says,
Code: | "Computer users fall into two groups:-
those that do backups
those that have never had a hard drive fail." |
I used to be the latter. Then I lost some REALLY important stuff. Now, I do backups. And, it just saved my bacon. Stage-4 backups rock!
Blessed be!
Pappy _________________ This space left intentionally blank, except for these ASCII symbols. |
|
|
|
aka.doode n00b
Joined: 16 May 2005 Posts: 45 Location: Sweden
|
Posted: Mon Aug 18, 2008 10:08 pm Post subject: Re: Weird HD Progressive Meltdown |
|
|
kpep01 wrote: | Last November (2007) I plugged in all of the old hard drives to my AMD64 box, and copied the files over to my SATA drives. By May of 2008, the AMD64 box was becoming unresponsive and locking up. Eventually, the core system files were wiped out.... |
What kernel version were you using? The same one all the time, or did you upgrade when a new version was released? I ask because a couple of days ago, while searching the Gentoo forums, I came across a thread stating that a bug in an older kernel (2.6.17, or something like that) could mess up your hard drive over time in a weird way. This sure sounds like that, but I honestly don't know for sure, and sadly I can't remember which thread I read it in :/
Disregard my input if you're using a fairly new kernel; if you aren't, it could be worth looking into.
Good luck |
|
|
|
cyrillic Watchman
Joined: 19 Feb 2003 Posts: 7313 Location: Groton, Massachusetts USA
|
Posted: Tue Aug 19, 2008 1:11 am Post subject: |
|
|
I would tend to agree with Janne Pikkarainen as far as the "too much heat" thing goes.
This is the number one reason I have seen my own hard drives fail.
Now, I make sure my computer cases have plenty of ventilation, especially around the hard drives. |
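One way to test the heat theory is to read the temperature the drive itself reports in the SMART attribute table (part of what smartctl -a /dev/sda prints). The Python sketch below parses that table from text. The sample table and the 50 °C limit are assumptions for illustration only, and some drives report temperature under a different attribute name or not at all.

```python
def parse_smart_attributes(text):
    """Map attribute name to the leading integer of RAW_VALUE for each table row."""
    attrs = {}
    for line in text.splitlines():
        fields = line.split()
        # Attribute rows start with a numeric ID and carry at least 10 columns;
        # the header row ("ID# ATTRIBUTE_NAME ...") is skipped by the isdigit check.
        if len(fields) >= 10 and fields[0].isdigit() and fields[9].isdigit():
            attrs[fields[1]] = int(fields[9])
    return attrs

def too_hot(attrs, limit_celsius=50):
    """True if a reported temperature exceeds the (assumed) limit."""
    temp = attrs.get("Temperature_Celsius")
    return temp is not None and temp > limit_celsius

# Hypothetical sample in the layout of the smartctl attribute table;
# the values are invented for demonstration.
SAMPLE_SMART_OUTPUT = """\
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  5 Reallocated_Sector_Ct   0x0033   100   100   036    Pre-fail  Always       -       0
194 Temperature_Celsius     0x0022   045   060   000    Old_age   Always       -       55 (Min/Max 20/60)
197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       0
"""

if __name__ == "__main__":
    attrs = parse_smart_attributes(SAMPLE_SMART_OUTPUT)
    print("temperature:", attrs.get("Temperature_Celsius"))
    print("too hot:", too_hot(attrs))
```

The same dictionary also exposes counters such as Reallocated_Sector_Ct and Current_Pending_Sector, where non-zero raw values are a common sign of a failing disk.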
|