Gentoo Forums
[Solved] System freeze. possible hard drive failure?
dystopic_utopia
n00b


Joined: 17 Sep 2019
Posts: 12

Posted: Sun Mar 29, 2020 4:39 pm    Post subject: [Solved] System freeze. possible hard drive failure?

I am posting this thread here, since it didn't really seem to fit into the other categories.

Yesterday my system froze, which is rare in my experience. I mean the type of freeze where Ctrl+Alt+F1-F6 would not bring up a console or produce any noticeable change. After giving it around 10 minutes to see if it would right itself, I did a hard power down of my laptop. The problem occurred while I was running emerge on seamonkey-2.53-r1, which I wouldn't think would be the issue, since I have compiled the prior version, as well as most of the other packages on my system. I have booted from a systemrescuecd image (I know it is Arch based now, but the copy of the minimal installation image I used didn't seem to have smartctl on it), and have run smartctl on the hard drive that came with this laptop.

Here is the output of smartctl --all /dev/sda after running smartctl --test=long: http://dpaste.com/3E37YV8
Edit: removed period.

My only assumption is that any errors from a storage device mean "prepare for imminent failure", let alone 612 of them. Could these have been the cause of my system hanging, especially during a build?

If it is at all relevant, my Gentoo partition is a LUKS->LVM->ext4 setup.
I haven't booted into Gentoo, or the OS that came with this laptop, since the freeze. I am thinking about decrypting and activating my root partition, running fsck.ext4 on it, and then mounting it read-only, so I can see if anything was logged before the freeze and at least back up anything in /home/. I will post back if I find anything that might be useful.

The laptop is a refurbished (I know, but I am on a budget) Acer Nitro 5. Additional specs to be posted, if they become relevant. My concern, of course, is whether I got a faulty hard drive, as well as the system freeze itself.

Though the laptop is still within the 90 day warranty period, I am going to assume that installing any non-Windows OS automatically voids the warranty. I will of course deal with that issue myself, if I decide to contact Acer about it. If anything, I could stick the hard drive that the original Gentoo install came from into this machine.

Anyway, thanks for reading, and as always, thanks in advance for any input from the Gentoo community.

[Moderator edit: fixed url. Forum auto-linking considers trailing periods to be part of the URL. [Fix made after Goverp posted.] -Hu]


Last edited by dystopic_utopia on Tue Mar 31, 2020 1:25 pm; edited 2 times in total
Goverp
Advocate


Joined: 07 Mar 2007
Posts: 2014

Posted: Sun Mar 29, 2020 5:10 pm

A) The URL you gave includes the trailing '.', so it doesn't work.

B) When I use the right one, the report says the drive is healthy. According to some gurus I found with Google, the 610 CRC errors are data transmission errors between the drive and motherboard - i.e. a bad cable or socket. I wonder if it's something to do with unexpected power-offs (it is a laptop). More interesting is that the drive has done 2037 hours, say about 18 months' use at 5 hours a day 5 days a week, and has been powered up nearly 900 times. I suspect you can ignore the CRC errors, they're only about one every 3 hours. The hardware will have retried them anyway. The one to worry about is reallocated sectors, and there aren't any.
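
For anyone who wants to check the arithmetic in B), here is a minimal Python sketch. It only uses the two figures quoted above (2037 power-on hours, 610 CRC errors) and the assumed usage pattern of 5 hours a day, 5 days a week; the function names are made up for illustration.

```python
# Figures quoted in the post above (taken from the smartctl report).
POWER_ON_HOURS = 2037
CRC_ERRORS = 610

# Assumed usage pattern from the post: 5 hours a day, 5 days a week.
HOURS_PER_WEEK = 5 * 5
WEEKS_PER_MONTH = 52 / 12  # average number of weeks in a calendar month


def months_of_use(power_on_hours, hours_per_week):
    """Convert power-on hours into months of use at a given weekly load."""
    return power_on_hours / hours_per_week / WEEKS_PER_MONTH


def hours_per_error(power_on_hours, errors):
    """Average number of power-on hours between logged errors."""
    return power_on_hours / errors


if __name__ == "__main__":
    months = months_of_use(POWER_ON_HOURS, HOURS_PER_WEEK)
    gap = hours_per_error(POWER_ON_HOURS, CRC_ERRORS)
    print(f"about {months:.1f} months of use")
    print(f"about one CRC error every {gap:.1f} hours")
```

Both results land close to the rounded numbers in the post: a little under 19 months, and a bit more than 3 hours between errors.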

C) I've found that sort of lockup when I've run out of RAM running a big compilation, such as qtwebengine (=chromium), err, firefox (=seamonkey) and their ilk. Nothing works for me, as the box has to page software in and out to find the program to handle the interrupt. Before blaming the hardware, I'd check you aren't overstressing the system. Typical snafus include "-jtoomany", "--jobs=toomany", not having enough swap space, using tmpfs for portage temp disk and leaving nothing over for the compiler, and so forth.
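
A rough way to sanity-check the "-jtoomany" point before a big build is to compare RAM and swap against the number of parallel jobs. The Python sketch below reads /proc/meminfo and caps the job count by memory. The 2 GiB per job budget is an assumption (a commonly repeated rule of thumb, not something stated in this thread), and the function names are hypothetical.

```python
import os

GIB_IN_KIB = 1024 * 1024
ASSUMED_GIB_PER_JOB = 2.0  # assumption: rough RAM budget per compiler job


def parse_meminfo(text):
    """Parse /proc/meminfo-style text into {field name: integer value}."""
    values = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        parts = rest.split()
        if sep and parts and parts[0].isdigit():
            values[name.strip()] = int(parts[0])  # memory fields are in kB
    return values


def suggested_jobs(meminfo, cpu_count):
    """Cap parallel jobs by CPU count and by RAM / assumed per-job budget."""
    ram_gib = meminfo.get("MemTotal", 0) / GIB_IN_KIB
    by_memory = int(ram_gib // ASSUMED_GIB_PER_JOB)
    return max(1, min(cpu_count, by_memory))


if __name__ == "__main__" and os.path.exists("/proc/meminfo"):
    with open("/proc/meminfo") as fh:
        info = parse_meminfo(fh.read())
    print("RAM  GiB:", round(info.get("MemTotal", 0) / GIB_IN_KIB, 1))
    print("Swap GiB:", round(info.get("SwapTotal", 0) / GIB_IN_KIB, 1))
    print("Suggested -j:", suggested_jobs(info, os.cpu_count() or 1))
```

This is only a starting point: a heavy package can need more per job than the assumed budget, and a tmpfs build directory eats into the same RAM.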
_________________
Greybeard
dystopic_utopia
n00b


Joined: 17 Sep 2019
Posts: 12

Posted: Sun Mar 29, 2020 9:24 pm

Thank you Goverp, for your response.

I did not find anything helpful in the logs. I also opened up the case and checked that the SATA connector was firmly attached to the hard drive. I do not have any spare cables of that type, nor could I open the case enough to see if the cable is hard wired to the motherboard. I ended up booting Gentoo, and all seems normal so far.

18 months of usage? Could the hard drive they stuck in this model have been a "gently used" component? I do not see how I could have racked up that much time, after having it only a little over two months.

I will just have to keep an eye on it for now. As for seamonkey, I will probably try sorting that one more time. Otherwise, there is always the binary release.
Ant P.
Watchman


Joined: 18 Apr 2009
Posts: 6920

Posted: Sun Mar 29, 2020 10:47 pm

The hard disk is fine. CRC errors being caught means they're doing their job (preventing silent corruption). They're infrequent enough that it's probably just a slightly flaky cable.

If it's hanging during emerge you either have an overheating issue or OOM problems. Heat is far more likely for a laptop.
dystopic_utopia
n00b


Joined: 17 Sep 2019
Posts: 12

Posted: Tue Mar 31, 2020 1:24 pm

Thank you Ant P. for the extra reassurance about the health of my hard drive.

It seems that Goverp's suggestion about not having enough swap space might have been my problem. Since my last post, I have resized my swap partition, and was able to emerge seamonkey successfully. There was a brief moment of anxiety during the emerge, when trying to switch to another xfce-terminal tab that top was running in caused a momentary hang. However, it switched after a second and showed typical resource usage for a running emerge. I even ran sensors from the lm_sensors package, and it showed that none of the cores was at its peak temperature.

I guess the lesson for this situation is to never underestimate the need for swap space, even with modern systems that have 8GB+ of RAM.
NeddySeagoon
Administrator


Joined: 05 Jul 2003
Posts: 54317
Location: 56N 3W

Posted: Tue Mar 31, 2020 1:37 pm

dystopic_utopia,

As it's a laptop and it's an interface issue,
Code:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
199 UDMA_CRC_Error_Count    0x000a   200   200   000    Old_age   Always       -       610

it's worth removing the HDD and refitting it to 'wipe' the connector pins.

This will also fix it if it's not seated properly for some reason.

You won't have a data cable that you can replace.
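
To see whether reseating the drive helped, the raw UDMA_CRC_Error_Count can be recorded now and compared later; if it stops climbing, the connection is probably fine. Below is a small Python sketch that pulls the fields out of an attribute row like the one quoted above. It assumes the ten-column layout shown in that header line; `SmartAttribute` and `parse_attribute_row` are names made up for this example, not part of smartmontools.

```python
from typing import NamedTuple


class SmartAttribute(NamedTuple):
    attr_id: int
    name: str
    value: int
    worst: int
    thresh: int
    raw_value: str


def parse_attribute_row(line):
    """Split one row of the smartctl attribute table into named fields."""
    # Columns: ID# NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
    fields = line.split(None, 9)
    if len(fields) < 10:
        raise ValueError(f"not an attribute row: {line!r}")
    return SmartAttribute(
        attr_id=int(fields[0]),
        name=fields[1],
        value=int(fields[3]),
        worst=int(fields[4]),
        thresh=int(fields[5]),
        raw_value=fields[9].strip(),  # kept as text; some drives add notes here
    )


# The row quoted in the post above.
ROW = "199 UDMA_CRC_Error_Count    0x000a   200   200   000    Old_age   Always       -       610"

attr = parse_attribute_row(ROW)
print(attr.name, "raw count:", attr.raw_value)
print("normalized value above threshold:", attr.value > attr.thresh)
```

Saving the raw count with a date after each long build would show quickly whether new errors are still being logged.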
_________________
Regards,

NeddySeagoon

Computer users fall into two groups:-
those that do backups
those that have never had a hard drive fail.
JustAnother
Apprentice


Joined: 23 Sep 2016
Posts: 186

Posted: Tue Aug 25, 2020 3:31 am

With a laptop, always put it on one of those fan bases, and always put spacers under the fan base feet to get the fan base 1/2 inch off the underlying surface. I cut up some erasers to make the spacers.

My Toshiba laptop acted up (i.e., mysterious shutdowns) until I did the above. My sister gave it to me because it always shut down. Finally I asked her how she held it. She put it on her legs.

Also, hit the fan section in the laptop with a blast of canned air. But stick a pencil in there to keep the fan from spinning up. The back EMF from the fan can blow things up.
All times are GMT