Gentoo Forums
Hardware issues - but can't work out what. Diagnostic tips?

optiluca
Guru
Joined: 16 Jan 2006
Posts: 551
Location: Rivergaro, Italy

Posted: Mon Apr 13, 2020 5:03 pm    Post subject: Hardware issues - but can't work out what. Diagnostic tips?

Hi. My 10-year-old Lenovo W510 has started hanging at random: it sits completely frozen for around 10 seconds and then powers off. During those 10 seconds the HDD light stays on, as if something were being written. Temperatures all seem OK (CPU and GPU both < 80 °C at all times). This happens on Windows 10 too, so it's not driver/software related, but I come here looking for wisdom and hoping to be able to get some sort of log from Linux...

Assuming some hardware component is failing, I've tried:
Running FurMark under Windows for 30 mins or so, in order to stress the GPU. No freeze.
Running Prime95 under Windows for 30 mins or so, in order to stress the CPU. No freeze.
Running Memtest86+, as well as Windows' own memory test, in order to stress the RAM. No freeze.
Running fio in order to stress the HDD. No freeze.

If I then start using my PC, running virtually any "mixed" load seems to cause the issue, e.g. watching YouTube while updating the system, or any video game within 15 minutes of opening it.

I'm running systemd. I've tried getting some sort of log from journalctl -b -1 and scrolling to the end, where I'd expect to see some errors, but there's nothing at all.
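
Since a hard freeze tends to leave nothing in the journal, one option is to write sensor readings to a file and flush them to disk after every sample, so that the last lines before a lockup survive. A minimal sketch, assuming the kernel exposes thermal zones under /sys/class/thermal (values in millidegrees Celsius); the function name log_temps and the default log path are made up for illustration:

```shell
#!/bin/sh
# Hypothetical helper: append one timestamped line per thermal zone to a log
# file, then flush to disk so the last samples survive a sudden power-off.
# Usage: log_temps [zone_root] [logfile]
log_temps() {
    zone_root=${1:-/sys/class/thermal}
    logfile=${2:-$HOME/temps.log}
    now=$(date '+%Y-%m-%d %H:%M:%S')
    for zone in "$zone_root"/thermal_zone*; do
        # Skip zones without a readable temp file (also covers "no zones at all").
        [ -r "$zone/temp" ] || continue
        # Some zones return a read error; skip those too.
        millideg=$(cat "$zone/temp" 2>/dev/null) || continue
        # sysfs reports millidegrees Celsius; convert to degrees.
        awk -v t="$now" -v z="${zone##*/}" -v m="$millideg" \
            'BEGIN { printf "%s %s %.1f C\n", t, z, m / 1000 }' >> "$logfile"
    done
    # Force buffered writes out to disk.
    sync
}
```

Run it in a loop, e.g. `while true; do log_temps; sleep 2; done`, and check the tail of the log after the next freeze. This only covers sensors the kernel knows about, so an unmonitored component would still not show up.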

Any ideas on where I should look/what I should try? Short of buying a new PC that is... :)

Thanks!
_________________
# "Hmm, sounds like your system froze up."
# "I don't know why. It's about 80 degrees in here!"

http://www.rinkworks.com/stupid/cs_mincing.shtml

Mistwolf
Apprentice
Joined: 07 Mar 2007
Posts: 189
Location: Edmonton, AB

Posted: Mon Apr 13, 2020 5:12 pm

Sounds like the laptop is powering off due to low battery. My old work laptop would "hibernate" (hence writing to disk) when it thought I would run out of battery power. Happened more often as time went by due to the battery starting to die.

Does the issue only occur when not plugged in to power/running on battery?

Jaglover
Watchman
Joined: 29 May 2005
Posts: 8291
Location: Saint Amant, Acadiana

Posted: Mon Apr 13, 2020 5:22 pm

Aging causes the thermal paste to lose its effectiveness. This means you have to take it apart to replace it. There is also dust buildup, which can cause all kinds of cooling issues. Some parts may overheat; there are no sensors to cover all parts. Constant overheating will also speed up the drying of electrolytic capacitors and reduce their lifespan.
In short, if you haven't done it recently, a major service is in order before you can say RIP.
_________________
My Gentoo installation notes.
Please learn how to denote units correctly!

optiluca
Guru
Joined: 16 Jan 2006
Posts: 551
Location: Rivergaro, Italy

Posted: Mon Apr 13, 2020 5:22 pm

Mistwolf wrote:
Sounds like the laptop is powering off due to low battery. My old work laptop would "hibernate" (hence writing to disk) when it thought I would run out of battery power. Happened more often as time went by due to the battery starting to die.

Does the issue only occur when not plugged in to power/running on battery?


Sorry, I should have said: I'm always plugged in when all of this is happening. :? Also, when it freezes it's completely locked up; not even the mouse cursor moves, which I think is a little more extreme than when hibernation kicks in?

optiluca
Guru
Joined: 16 Jan 2006
Posts: 551
Location: Rivergaro, Italy

Posted: Mon Apr 13, 2020 5:25 pm

Jaglover wrote:
Aging causes the thermal paste to lose its effectiveness. This means you have to take it apart to replace it. There is also dust buildup, which can cause all kinds of cooling issues. Some parts may overheat; there are no sensors to cover all parts. Constant overheating will also speed up the drying of electrolytic capacitors and reduce their lifespan.
In short, if you haven't done it recently, a major service is in order before you can say RIP.


I changed the CPU and GPU paste around a year ago, but do you think there could be something else that's overheating while not being monitored? It's weird that it only happens on "mixed" loads, but lacking other ideas I could definitely dismantle everything and do a thorough cleanup/repaste...

Ant P.
Watchman
Joined: 18 Apr 2009
Posts: 6920

Posted: Mon Apr 13, 2020 5:48 pm

Depending on the laptop it might be the battery even if it's plugged in, especially if the peak power draw is higher than the mains adapter is rated for. How much capacity does it have when it's unplugged?

optiluca
Guru
Joined: 16 Jan 2006
Posts: 551
Location: Rivergaro, Italy

Posted: Mon Apr 13, 2020 5:57 pm

Ant P. wrote:
Depending on the laptop it might be the battery even if it's plugged in, especially if the peak power draw is higher than the mains adapter is rated for. How much capacity does it have when it's unplugged?


Code:
[luca@optipad-arch ~]$ upower -i /org/freedesktop/UPower/devices/battery_BAT0
  native-path:          BAT0
  vendor:               SONY
  model:                42T4795
  serial:               2763
  power supply:         yes
  updated:              Mon 13 Apr 2020 19:54:34 CEST (39 seconds ago)
  has history:          yes
  has statistics:       yes
  battery
    present:             yes
    rechargeable:        yes
    state:               fully-charged
    warning-level:       none
    energy:              40.3704 Wh
    energy-empty:        0 Wh
    energy-full:         41.796 Wh
    energy-full-design:  50.5332 Wh
    energy-rate:         0.00874136 W
    voltage:             12.042 V
    percentage:          96%
    capacity:            82.71%
    technology:          lithium-ion
    icon-name:          'battery-full-charged-symbolic'


It lasts around 30-40 minutes under light load (it's showing its age somewhat). The mains power brick, on the other hand, is rated for 135 W in theory.

I can try running a mixed load on battery, but I'll struggle to run it for very long :D
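
For reference, the capacity figure in the upower output appears to be simply energy-full divided by energy-full-design. A small sketch of that arithmetic (battery_health is a made-up helper name, and the formula is inferred from the numbers above rather than taken from the upower documentation):

```shell
#!/bin/sh
# Hypothetical helper: battery health as a percentage of design capacity.
# Usage: battery_health <energy-full in Wh> <energy-full-design in Wh>
battery_health() {
    awk -v full="$1" -v design="$2" \
        'BEGIN { if (design <= 0) exit 1; printf "%.2f\n", full / design * 100 }'
}
```

With the values above, `battery_health 41.796 50.5332` gives 82.71, matching the reported capacity, so the pack has lost roughly 17% of its design capacity.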

NeddySeagoon
Administrator
Joined: 05 Jul 2003
Posts: 54851
Location: 56N 3W

Posted: Mon Apr 13, 2020 6:49 pm

optiluca,

I suspect something battery related too. The system should operate properly with the battery removed.

Shut the system down and unplug the external power brick.
Remove the battery.
Connect the external power. Now test.

Old lithium batteries have some very strange operating characteristics.

If this seems to fix it, recycle the old battery carefully.
_________________
Regards,

NeddySeagoon

Computer users fall into two groups:-
those that do backups
those that have never had a hard drive fail.

optiluca
Guru
Joined: 16 Jan 2006
Posts: 551
Location: Rivergaro, Italy

Posted: Mon Apr 13, 2020 10:03 pm

NeddySeagoon wrote:
optiluca,

I suspect something battery related too. The system should operate properly with the battery removed.

Shut the system down and unplug the external power brick.
Remove the battery.
Connect the external power. Now test.

Old lithium batteries have some very strange operating characteristics.

If this seems to fix it, recycle the old battery carefully.


I've just had a go, but unfortunately even with no battery installed at all the PC froze under medium load (a few web browser tabs open, mostly with YouTube videos, only one of which was playing). A couple more points which I've noticed:

1) The HDD light doesn't always stay on between the PC freezing and it switching off. I think it just freezes in whatever state it happened to be in the instant the PC locked up.
2) If I immediately try to power the PC on again after a freeze+shutdown, it often hangs at the GRUB screen already, before I even manage to make my OS selection. In general, after a lockup a new lockup seems much more likely to occur soon after boot.

It would seem that there is some sort of "state" to this problem, which could suggest something obscure (not measured) overheating, as Jaglover suggested? Though I did disassemble/clean/repaste/reassemble not that long ago, so it's not as if there is 10 years' worth of dirt and old thermal paste in there :?

NeddySeagoon
Administrator
Joined: 05 Jul 2003
Posts: 54851
Location: 56N 3W

Posted: Mon Apr 13, 2020 10:09 pm

optiluca,

When you attended to the thermal paste, did you remove all the grot in the air ducts too?
Grot on the fans is easy to spot and remove.

Jaglover
Watchman
Joined: 29 May 2005
Posts: 8291
Location: Saint Amant, Acadiana

Posted: Mon Apr 13, 2020 10:42 pm

You could lower the maximum allowed frequency for the CPU.
Code:
cpupower frequency-set -u <somelowfreq>

If the problem goes away then it is a CPU issue.
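
To pick a concrete value for <somelowfreq>, one approach is to take a percentage of the hardware maximum reported in sysfs. A sketch, assuming the cpufreq driver exposes cpuinfo_max_freq (in kHz) under /sys/devices/system/cpu/cpu0/cpufreq and that cpupower treats a bare number as kHz; suggest_cap is a made-up name, and it only prints the command rather than running it:

```shell
#!/bin/sh
# Hypothetical helper: print (not run) a cpupower command that caps the CPU
# at a given percentage of its hardware maximum frequency.
# Usage: suggest_cap [percent] [cpufreq_dir]
suggest_cap() {
    percent=${1:-50}
    cpufreq_dir=${2:-/sys/devices/system/cpu/cpu0/cpufreq}
    # cpuinfo_max_freq is reported in kHz; bail out if it cannot be read.
    max_khz=$(cat "$cpufreq_dir/cpuinfo_max_freq") || return 1
    cap_khz=$(( max_khz * percent / 100 ))
    # A bare number is assumed to be read by cpupower as kHz.
    echo "cpupower frequency-set -u $cap_khz"
}
```

For example, `suggest_cap 50` prints a command capping the CPU at half its maximum; run the printed command as root. The driver may round the cap to the nearest frequency it supports.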

optiluca
Guru
Joined: 16 Jan 2006
Posts: 551
Location: Rivergaro, Italy

Posted: Fri Apr 17, 2020 10:33 am

I dismantled and rebuilt the laptop, replacing the thermal paste. Temperatures are actually a little worse than before (I clearly did a better job last time...), but no freezes so far (it's been 2 days now).

I'm sending this message as the ultimate acid test, since I fully expect it to freeze up the second I hit send... :)

All times are GMT