Gentoo Forums
Continuous (fake?) MCE warnings on CPU overheat
Cazzantonio
Bodhisattva


Joined: 20 Mar 2004
Posts: 4514
Location: Somewere around the world

Posted: Sun Jul 07, 2019 9:40 am    Post subject: Continuous (fake?) MCE warnings on CPU overheat

I have a laptop with a Core i7-9750H and kernel 5.1.16.

I see that periodically I have these warnings in the logs:

dmesg:
Code:
[   24.653128] mce: CPU2: Core temperature above threshold, cpu clock throttled (total events = 1)
[   24.653128] mce: CPU8: Core temperature above threshold, cpu clock throttled (total events = 1)
[   24.653129] mce: CPU2: Package temperature above threshold, cpu clock throttled (total events = 1)
[   24.653159] mce: CPU8: Package temperature above threshold, cpu clock throttled (total events = 1)
[   24.653164] mce: CPU0: Package temperature above threshold, cpu clock throttled (total events = 1)
[   24.653164] mce: CPU6: Package temperature above threshold, cpu clock throttled (total events = 1)
[   24.653165] mce: CPU1: Package temperature above threshold, cpu clock throttled (total events = 1)
[   24.653166] mce: CPU7: Package temperature above threshold, cpu clock throttled (total events = 1)
[   24.653167] mce: CPU3: Package temperature above threshold, cpu clock throttled (total events = 1)
[   24.653167] mce: CPU9: Package temperature above threshold, cpu clock throttled (total events = 1)
[   24.653168] mce: CPU10: Package temperature above threshold, cpu clock throttled (total events = 1)
[   24.653169] mce: CPU4: Package temperature above threshold, cpu clock throttled (total events = 1)
[   24.653170] mce: CPU11: Package temperature above threshold, cpu clock throttled (total events = 1)
[   24.653170] mce: CPU5: Package temperature above threshold, cpu clock throttled (total events = 1)
[   24.658112] mce: CPU2: Core temperature/speed normal
[   24.658112] mce: CPU8: Core temperature/speed normal
[   24.658112] mce: CPU2: Package temperature/speed normal
[   24.658113] mce: CPU8: Package temperature/speed normal
[   24.658149] mce: CPU6: Package temperature/speed normal
[   24.658149] mce: CPU0: Package temperature/speed normal
[   24.658150] mce: CPU7: Package temperature/speed normal
[   24.658151] mce: CPU1: Package temperature/speed normal
[   24.658151] mce: CPU3: Package temperature/speed normal
[   24.658152] mce: CPU9: Package temperature/speed normal
[   24.658153] mce: CPU5: Package temperature/speed normal
[   24.658153] mce: CPU11: Package temperature/speed normal
[   24.658154] mce: CPU10: Package temperature/speed normal


As you can see, the temperature goes above the threshold and returns to normal within about 5 milliseconds. This also happens under no load at all (at idle), so it's a mystery to me. I strongly suspect that these warnings are spurious and caused by some misconfiguration.
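To put a number on how short these throttle episodes are, the log lines can be paired up per CPU. This is only a rough sketch: `LINE_RE` and `throttle_durations` are hypothetical helper names, and the regular expression assumes the exact message wording shown in the dmesg excerpt above.

```python
import re

# Assumed line shapes (taken from the excerpt above):
# [   24.653128] mce: CPU2: Core temperature above threshold, cpu clock throttled (total events = 1)
# [   24.658112] mce: CPU2: Core temperature/speed normal
LINE_RE = re.compile(
    r"\[\s*(?P<ts>\d+\.\d+)\]\s+mce:\s+CPU(?P<cpu>\d+):\s+"
    r"(?P<level>Core|Package) temperature"
    r"(?P<kind> above threshold|/speed normal)"
)


def throttle_durations(log_text):
    """Return {(cpu, level): milliseconds} for each throttle/normal pair found."""
    started = {}    # (cpu, level) -> timestamp of the "above threshold" line
    durations = {}  # (cpu, level) -> duration of the most recent episode in ms
    for line in log_text.splitlines():
        m = LINE_RE.search(line)
        if not m:
            continue
        key = (int(m["cpu"]), m["level"])
        ts = float(m["ts"])
        if m["kind"] == " above threshold":
            started[key] = ts
        elif key in started:
            durations[key] = round((ts - started.pop(key)) * 1000, 3)
    return durations


sample = """\
[   24.653128] mce: CPU2: Core temperature above threshold, cpu clock throttled (total events = 1)
[   24.653129] mce: CPU2: Package temperature above threshold, cpu clock throttled (total events = 1)
[   24.658112] mce: CPU2: Core temperature/speed normal
[   24.658112] mce: CPU2: Package temperature/speed normal
"""

for (cpu, level), ms in sorted(throttle_durations(sample).items()):
    print(f"CPU{cpu} {level}: throttled for {ms} ms")
```

On the CPU2 lines from the excerpt this gives roughly 4.98 ms for both the core and the package event. The full output of `dmesg` could be passed in the same way.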

Does anyone have experience with the same issue? Can you suggest how to start dealing with it?

Thanks
_________________
Any mans death diminishes me, because I am involved in Mankinde; and therefore never send to know for whom the bell tolls; It tolls for thee.
-John Donne
Cazzantonio
Bodhisattva


Joined: 20 Mar 2004
Posts: 4514
Location: Somewere around the world

Posted: Sun Jul 07, 2019 12:52 pm

UPDATE:
I noticed that these errors occur almost exactly at multiples of 30 minutes... This is quite surprising. There is no 30-minute timer in the kernel (to my knowledge), so could it be a hardware issue? Is it related to some ACPI bug?

Here is a list of the events (for one CPU only; the others look the same). The timestamps use an Italian locale: "dom lug 7" is "Sun Jul 7".
This is just after boot:
Code:
[dom lug  7 11:33:04 2019] mce: CPU2: Core temperature above threshold, cpu clock throttled (total events = 1)
[dom lug  7 11:33:04 2019] mce: CPU2: Core temperature/speed normal


This is the second event. From here on, they occur every 30 minutes:
Code:
[dom lug  7 12:09:50 2019] mce: CPU0: Core temperature above threshold, cpu clock throttled (total events = 1)
[dom lug  7 12:09:50 2019] mce: CPU0: Core temperature/speed normal
...
[dom lug  7 13:09:47 2019] mce: CPU0: Core temperature above threshold, cpu clock throttled (total events = 35)
[dom lug  7 13:09:47 2019] mce: CPU0: Core temperature/speed normal
...
[dom lug  7 13:39:44 2019] mce: CPU0: Core temperature above threshold, cpu clock throttled (total events = 58)
[dom lug  7 13:39:44 2019] mce: CPU0: Core temperature/speed normal
...

[dom lug  7 14:09:28 2019] mce: CPU0: Core temperature above threshold, cpu clock throttled (total events = 106)
[dom lug  7 14:09:28 2019] mce: CPU0: Core temperature/speed normal
....


During the whole time the CPU was idle.
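The spacing of the logged events can be checked with a few lines of Python. This is a sketch under assumptions: `event_gaps` and `drift_from_half_hour` are hypothetical helpers, only the HH:MM:SS part of each timestamp is parsed (so the Italian day and month names do not matter), and all events are assumed to fall on the same day. If the timestamps come from `dmesg -T`, they may also be slightly inaccurate, so small deviations should not be over-interpreted.

```python
import re
from datetime import timedelta

# Picks the HH:MM:SS part out of a timestamp such as "[dom lug  7 12:09:50 2019]".
TIME_RE = re.compile(r"(\d{2}):(\d{2}):(\d{2}) \d{4}\]")


def event_gaps(lines):
    """Return the gaps between consecutive 'above threshold' lines."""
    times = []
    for line in lines:
        if "above threshold" not in line:
            continue
        m = TIME_RE.search(line)
        if m:
            h, mi, s = map(int, m.groups())
            times.append(timedelta(hours=h, minutes=mi, seconds=s))
    return [later - earlier for earlier, later in zip(times, times[1:])]


def drift_from_half_hour(gap):
    """Seconds by which a gap differs from the nearest multiple of 30 minutes."""
    period = 30 * 60
    secs = int(gap.total_seconds())
    return secs - round(secs / period) * period


# The CPU0 lines from the excerpt above (elided lines are not included).
excerpt = [
    "[dom lug  7 12:09:50 2019] mce: CPU0: Core temperature above threshold, cpu clock throttled (total events = 1)",
    "[dom lug  7 13:09:47 2019] mce: CPU0: Core temperature above threshold, cpu clock throttled (total events = 35)",
    "[dom lug  7 13:39:44 2019] mce: CPU0: Core temperature above threshold, cpu clock throttled (total events = 58)",
    "[dom lug  7 14:09:28 2019] mce: CPU0: Core temperature above threshold, cpu clock throttled (total events = 106)",
]

for gap in event_gaps(excerpt):
    print(f"gap {gap}, drift {drift_from_half_hour(gap)} s")
```

For the four CPU0 lines quoted above, the gaps are 0:59:57, 0:29:57 and 0:29:44, that is, 3 to 16 seconds short of a multiple of 30 minutes.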

The same messages also appear with Ubuntu, so it's not me messing up the kernel configuration :-) (yes, I started using Gentoo before genkernel existed and I still configure everything by hand...)
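The "total events" value in the excerpts jumps from 1 to 35 to 58 to 106, which suggests that more events occur than are printed. One possible way to watch them directly, independent of the log, is to read the per-CPU throttle counters from sysfs. This is a sketch that assumes the running kernel exposes `core_throttle_count` and `package_throttle_count` under `/sys/devices/system/cpu/cpuN/thermal_throttle/`; `read_throttle_counts` is a hypothetical helper name.

```python
from pathlib import Path

# Assumption: the x86 thermal throttle driver exposes its counters here.
SYSFS_CPU = Path("/sys/devices/system/cpu")


def read_throttle_counts(base=SYSFS_CPU):
    """Return {cpu_name: {counter_name: value}} for CPUs with thermal_throttle."""
    counts = {}
    for throttle_dir in sorted(base.glob("cpu[0-9]*/thermal_throttle")):
        per_cpu = {}
        for name in ("core_throttle_count", "package_throttle_count"):
            counter = throttle_dir / name
            if counter.is_file():
                per_cpu[name] = int(counter.read_text().strip())
        counts[throttle_dir.parent.name] = per_cpu
    return counts


if __name__ == "__main__":
    # Prints nothing if the counters are not available on this machine.
    for cpu, per_cpu in read_throttle_counts().items():
        print(cpu, per_cpu)
```

Running this once a minute and noting when the values change would show whether the counters really step at 30-minute intervals.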
All times are GMT