Gentoo Forums
Clock speeds stuck at 800MHz after upgrading to 6.6
proxy_correlations
n00b

Joined: 11 May 2022
Posts: 6

Posted: Sun Mar 24, 2024 4:30 pm    Post subject: Clock speeds stuck at 800MHz after upgrading to 6.6

Essentially, one night I decided to finally upgrade my kernel from 6.1.57 to whatever was latest with my classic routine of letting the kernel cook overnight, but come morning everything was slow. The reason - CPU freq stuck at 800MHz even under heavy load.

Code:

❯ cpupower frequency-info

  driver: intel_pstate
  hardware limits: 800 MHz - 4.10 GHz
  available cpufreq governors: performance powersave
  current policy: frequency should be within 800 MHz and 4.10 GHz.
                  The governor "performance" may decide which speed to use
                  within this range.
  current CPU frequency: Unable to call hardware
  current CPU frequency: 800 MHz (asserted by call to kernel)



I checked everything I thought would help: scaling governor, firmware, temps, pstate controls. Everything worked fine on 6.1.57, but as soon as I booted back into 6.6, everything was slow once again. So after around 10 days of tinkering with possible culprits, recompiling, installing and reinstalling, I finally gave up and wiped everything clean.

To my surprise, after this fresh install from liveUSB, everything worked perfectly fine! So this time, instead of just copying settings from my old kernel config, I enabled each feature set when I needed them, one group at a time.

So how crazy would it be if I were to say, I finally found the (possible) culprit and it was CONFIG_DRM_AMDGPU <*/M>. Why would GPU modules/firmware affect CPU clock speeds? Perhaps there's something else I'm missing. Right now I have everything AMD related disabled, but I do need the GPU support: without it the onboard Intel integrated chip struggles to play a YT video, and mpv just gives up on 4k videos.

I haven't the faintest idea what the exact issue is, nor do I know where to look. Any insights?

Spec:
Code:
CPU: Intel i7-8705G (8) @ 4.100GHz
GPU: Intel HD Graphics 630
GPU: AMD ATI Radeon RX Vega M GL
logrusx
Advocate

Joined: 22 Feb 2018
Posts: 2521

Posted: Sun Mar 24, 2024 6:08 pm    Post subject: Re: Clock speeds stuck at 800MHz after upgrading to 6.6

proxy_correlations wrote:

So how crazy would it be if I were to say, I finally found the (possible) culprit and it was CONFIG_DRM_AMDGPU <*/M>.


It is crazy and it most definitely isn't true. I would suggest reading the documentation of the intel_pstate driver. It has its own specifics, and I would speculate you were using ACPI cpufreq driver in 6.1.

Best Regards,
Georgi
proxy_correlations
n00b

Joined: 11 May 2022
Posts: 6

Posted: Mon Mar 25, 2024 4:09 am    Post subject: Re: Clock speeds stuck at 800MHz after upgrading to 6.6

Thanks for the reply


logrusx wrote:

would speculate you were using ACPI cpufreq driver in 6.1


Damn that's right, I didn't even realize!

Code:
CONFIG_X86_ACPI_CPUFREQ=y
CONFIG_X86_ACPI_CPUFREQ_CPB=y


Apparently I had both toggled, i.e. ACPI_CPUFREQ and INTEL_PSTATE so perhaps 6.1 defaulted to ACPI and 6.6 is defaulting to intel_pstate. Still not sure why enabling AMDGPU tanks my CPU freq. Could some of its modules be forcing intel_pstate on the CPU?

I don't see much I can configure with intel_pstate except for the scaling governor, I think I'm missing something here, anything in particular I should be looking for?

Also would it be too dumb to switch to ACPI and just disable pstate? (Assuming that it'll fix the freq problem and I can enable AMDGPU like in 6.1).
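
To compare which of these options each kernel was actually built with, something along these lines may help. It is only a sketch: the function name `config_state` is made up for illustration, and the default path /usr/src/linux/.config is an assumption about where the kernel sources live.

```shell
#!/bin/sh
# Print the value of a CONFIG_ symbol from a kernel .config file:
# "y", "m", or "n" when the symbol is unset or absent.
# config_state is an illustrative helper name; the default path
# /usr/src/linux/.config is an assumption about the source location.
config_state() {
    sym="$1"
    file="${2:-/usr/src/linux/.config}"
    # Match only the exact symbol, so CONFIG_X86_ACPI_CPUFREQ does not
    # pick up CONFIG_X86_ACPI_CPUFREQ_CPB.
    val=$(sed -n "s/^${sym}=\(.*\)\$/\1/p" "$file" 2>/dev/null | head -n 1)
    echo "${val:-n}"
}

for s in CONFIG_X86_INTEL_PSTATE CONFIG_X86_ACPI_CPUFREQ CONFIG_DRM_AMDGPU; do
    echo "$s=$(config_state "$s")"
done
```

Passing a second argument, e.g. the saved .config of the 6.1.57 build, makes it possible to compare the two kernels side by side.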
logrusx
Advocate

Joined: 22 Feb 2018
Posts: 2521

Posted: Mon Mar 25, 2024 8:41 am

Quote:
``intel_pstate`` can operate in two different modes, active or passive. In the
active mode, it uses its own internal performance scaling governor algorithm or
allows the hardware to do performance scaling by itself, while in the passive
mode it responds to requests made by a generic ``CPUFreq`` governor implementing
a certain performance scaling algorithm. Which of them will be in effect
depends on what kernel command line options are used and on the capabilities of
the processor.


Quote:
Passive Mode
------------

This is the default operation mode of ``intel_pstate`` for processors without
hardware-managed P-states (HWP) support. It is always used if the
``intel_pstate=passive`` argument is passed to the kernel in the command line


Choose your preferred scaling governor (I guess you would want schedutil) and add intel_pstate=passive to your kernel command line.

Some people stick with intel_pstate, but from what I've read it's not very flexible.
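
To see which driver, governor and intel_pstate mode are in effect after a reboot, the standard cpufreq sysfs files can be read directly. The following is a minimal sketch; the function name `cpufreq_summary` and its optional directory argument are illustrative and not part of any tool.

```shell
#!/bin/sh
# Print the cpufreq driver, governor, current frequency (kHz) and the
# intel_pstate mode found under a sysfs CPU directory.
# cpufreq_summary and its directory argument are illustrative; the
# argument defaults to the real sysfs location.
cpufreq_summary() {
    base="${1:-/sys/devices/system/cpu}"
    for f in cpu0/cpufreq/scaling_driver \
             cpu0/cpufreq/scaling_governor \
             cpu0/cpufreq/scaling_cur_freq \
             intel_pstate/status; do
        if [ -r "$base/$f" ]; then
            printf '%s: %s\n' "$f" "$(cat "$base/$f")"
        else
            printf '%s: (not available)\n' "$f"
        fi
    done
}

cpufreq_summary
```

In passive mode intel_pstate/status should read "passive", and the governor line should show whatever generic governor was selected.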

Best Regards,
Georgi
e8root
Tux's lil' helper

Joined: 09 Feb 2024
Posts: 94

Posted: Mon Mar 25, 2024 7:24 pm

Recently I had to enable Speedstep in the UEFI to get proper scheduling on the CPU. In my case it had more to do with P and E cores not being recognized than with clock frequency, but in either case it might be worth checking.

Also I would recommend testing a newer kernel like 6.8.1. It costs nothing to try and maybe the issue was already fixed, which, even if you don't want to run the latest and greatest kernel, might give you some clue about what the issue might be.
_________________
Unix Wars - Episode V: AT&T Strikes Back
logrusx
Advocate

Joined: 22 Feb 2018
Posts: 2521

Posted: Mon Mar 25, 2024 8:10 pm

e8root wrote:
Recently I had to enable Speedstep in the UEFI to get proper scheduling on the CPU.


It might be the case. I've lost track of Intel's new screw ups.

Best Regards,
Georgi
e8root
Tux's lil' helper

Joined: 09 Feb 2024
Posts: 94

Posted: Mon Mar 25, 2024 9:24 pm

Actually, the way it was supposed to work is that my cores run at 100% frequency and the Linux kernel schedules threads on the fastest cores, judging by clock alone. In Windows it works exactly like that: I could disable Speedstep, which disables frequency scaling completely, and still have priority threads always scheduled to P-cores first. Linux somehow doesn't do that. It would still control core frequency, and even when I completely disabled the idle driver, which kept the frequency at 100%, it still did not schedule threads on P-cores. I won't even say how I did that, because it was a bad method and would increase idle power consumption by not issuing halt instructions to the CPU.

Long story short, my case looks to be a Linux screw-up and not Intel's.
There is an interface through which Linux queries information from the core itself, and it requires Speedstep.
I guess there might not have been a reason to design any other mechanism for scheduling threads in Linux, since every CPU supports this kind of information querying and it's not like CPUs with different clock speeds on different cores existed before Alder Lake.

All that said:
1) no idea if Speedstep can be related in any way to your issue
2) not even sure if I investigated and understood everything correctly, so please take my post with a grain of NaCl or even KCl ; )
_________________
Unix Wars - Episode V: AT&T Strikes Back
pietinger
Moderator

Joined: 17 Oct 2006
Posts: 5231
Location: Bavaria

Posted: Mon Mar 25, 2024 9:36 pm    Post subject: Re: Clock speeds stuck at 800MHz after upgrading to 6.6

proxy_correlations wrote:
[...] So this time, instead of just copying settings from my old kernel config, I enabled each feature set when I needed them, one group at a time. [...]

Maybe this is the reason of your problem ... copying old settings is fine ... IF ... you do a "make oldconfig" ...

(See more here: https://wiki.gentoo.org/wiki/User:Pietinger/Tutorials/Manual_kernel_configuration#Cheat_Sheets )

proxy_correlations wrote:
[...] So how crazy would it be if I were to say, I finally found the (possible) culprit and it was CONFIG_DRM_AMDGPU <*/M>.

As @logrusx already said: This is not the culprit.

proxy_correlations wrote:
Apparently I had both toggled, i.e. ACPI_CPUFREQ and INTEL_PSTATE so perhaps 6.1 defaulted to ACPI and 6.6 is defaulting to intel_pstate. [...]

No. If both modules are enabled in your kernel .config AND you have an Intel CPU with "hwp" capability (yes, you have; an Intel i7-8705G has hwp support; see also in: "lscpu") THEN intel_pstate will do the job (you can disable ACPI cpufreq worry-free because it is not used).
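
The hwp flag can also be checked without lscpu by looking at the flags line in /proc/cpuinfo. A small sketch follows; `cpu_has_hwp` is an illustrative name, and the optional file argument exists only so the check can be run against a saved copy.

```shell
#!/bin/sh
# Return success if the "hwp" CPU flag is listed in a cpuinfo-style file.
# cpu_has_hwp is an illustrative helper; the file argument defaults to
# /proc/cpuinfo.
cpu_has_hwp() {
    file="${1:-/proc/cpuinfo}"
    # -w matches "hwp" as a whole word, so hwp_notify or hwp_epp on
    # their own do not count as a match.
    grep -m 1 '^flags' "$file" 2>/dev/null | grep -qw 'hwp'
}

if cpu_has_hwp; then
    echo "hwp flag present"
else
    echo "hwp flag not found"
fi
```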

(See more here: https://wiki.gentoo.org/wiki/User:Pietinger/Tutorials/Manual_Configuring_Kernel_Version_6.6#Part_3_-_Must_Haves
and: https://docs.kernel.org/admin-guide/pm/intel_pstate.html )

proxy_correlations wrote:
[...]Also would it be too dumb to switch to ACPI and just disable pstate? (Assuming that it'll fix the freq problem and I can enable AMDGPU like in 6.1).

Yes it would be dumb, because Intel's P-State driver is better than ACPI cpufreq.

Have you additionally enabled intel_idle? If not, please do so:
Code:
Power management and ACPI options  --->
    [*] Cpuidle Driver for Intel Processors

_________________
https://wiki.gentoo.org/wiki/User:Pietinger
proxy_correlations
n00b

Joined: 11 May 2022
Posts: 6

Posted: Wed Mar 27, 2024 8:26 am    Post subject: Re: Clock speeds stuck at 800MHz after upgrading to 6.6

Took a while to try out everything

pietinger wrote:

Maybe this is the reason of your problem ... copying old settings is fine ... IF ... you do a "make oldconfig" ...


Always! I learned that lesson a long time ago... the hard way :(

I went over https://wiki.gentoo.org/wiki/User:Pietinger/Tutorials/Manual_Configuring_Kernel_Version_6.6#Part_3_-_Must_Haves (btw great tutorial) and now I'm running a similar setup as far as I can tell.

The good news is - there's a 50% chance the machine will boot running at good frequencies, the bad news is - there's a 50% chance it won't. Even worse news - sometimes it just bricks and everything freezes. I had a minor heart attack when the system froze in the middle of `make install`; luckily nothing broke and I was able to recompile just fine after a forced restart.

This is from that other 50%, stuck at 800 MHz

Code:
analyzing CPU 0:
  driver: intel_cpufreq
  CPUs which run at the same hardware frequency: 0
  CPUs which need to have their frequency coordinated by software: 0
  maximum transition latency: 20.0 us
  hardware limits: 800 MHz - 4.10 GHz
  available cpufreq governors: performance schedutil
  current policy: frequency should be within 800 MHz and 4.10 GHz.
                  The governor "schedutil" may decide which speed to use
                  within this range.
  current CPU frequency: Unable to call hardware
  current CPU frequency: 800 MHz (asserted by call to kernel)
  boost state support:
    Supported: yes
    Active: yes


Apparently `intel_pstate` becomes `intel_cpufreq` with `intel_pstate=passive`.

Right now I'm trying out intel_pstate=passive with cpufreq.default_governor=schedutil (Grub)
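
Whether those parameters really reached the running kernel can be confirmed from /proc/cmdline. This is a rough sketch; `has_cmdline_param` is an illustrative helper name, and the optional file argument is there only for checking a saved copy.

```shell
#!/bin/sh
# Return success if the given parameter (exact "name=value" or bare
# name) appears as a whole word in a kernel command line file.
# has_cmdline_param is an illustrative helper; the file argument
# defaults to /proc/cmdline.
has_cmdline_param() {
    param="$1"
    file="${2:-/proc/cmdline}"
    [ -r "$file" ] || return 2
    # Put each whitespace-separated parameter on its own line, then
    # look for an exact, literal, whole-line match.
    tr -s ' \t' '\n\n' < "$file" | grep -qxF -- "$param"
}

for p in intel_pstate=passive cpufreq.default_governor=schedutil; do
    if has_cmdline_param "$p"; then
        echo "$p: present"
    else
        echo "$p: missing"
    fi
done
```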

I did notice "smu firmware loading failed" when checking for loaded firmware with "dmesg -t | grep amdgpu | grep firmware"; I don't know if that has something to do with this weird behavior. Things work fine in 6.1 so I doubt this is some hardware issue.
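
A slightly wider filter than the firmware grep above can catch other amdgpu complaints in the same pass. This is only a sketch: `amdgpu_problems` is an illustrative name, and the keyword list (error, fail, stuck) is a guess at what is worth flagging, not an exhaustive set.

```shell
#!/bin/sh
# Read kernel log lines on stdin and keep only amdgpu lines that look
# like problems. amdgpu_problems is an illustrative helper and the
# keyword list is an assumption, not an exhaustive set.
amdgpu_problems() {
    grep -i 'amdgpu' | grep -iE 'error|fail|stuck'
}

dmesg -t 2>/dev/null | amdgpu_problems \
    || echo "no matching amdgpu lines (or dmesg not readable)"
```

Saving the output from a good boot and from a bad one makes it easier to see which messages only appear when the CPU is stuck at 800 MHz.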

But yeah, extremely weird stuff, I don't think AMDGPU is the culprit here, but unloading it does make everything work :?
logrusx
Advocate

Joined: 22 Feb 2018
Posts: 2521

Posted: Wed Mar 27, 2024 10:47 am

System freezes are a sign of another issue, highly probably of hardware origin. The easiest thing to do is run 8 rounds of memtest86+.

Best Regards,
Georgi
proxy_correlations
n00b

Joined: 11 May 2022
Posts: 6

Posted: Wed Mar 27, 2024 11:29 am

logrusx wrote:
highly probably of hardware origin


I thought so too! So I swapped back to 6.1.57 with AMDGPU, DRI_PRIME=1 for most of my tasks, no freezes, smooth sailing, swapped back to 6.6.21 (No AMDGPU), again no problems, toggled AMDGPU and bam it's a brick.
pietinger
Moderator

Joined: 17 Oct 2006
Posts: 5231
Location: Bavaria

Posted: Wed Mar 27, 2024 12:21 pm    Post subject: Re: Clock speeds stuck at 800MHz after upgrading to 6.6

proxy_correlations wrote:
[...] Things work fine in 6.1 so I doubt this is some hardware issue.

But yeah, extremely weird stuff, I don't think AMDGPU is the culprit here, but unloading it does make everything work :?

In your first post you wrote that this system has additionally a "GPU: AMD ATI Radeon RX Vega M GL" ... so, if everything works fine in 6.1 AND you have a problem in 6.6 WITH this module THEN I guess there is really a problem in this 6.6-AMD-driver. I highly suggest to try a 6.7 (or 6.8 ) kernel (yes, AMD did some heavy patching their stuff in all the last major kernel versions).
_________________
https://wiki.gentoo.org/wiki/User:Pietinger
logrusx
Advocate

Joined: 22 Feb 2018
Posts: 2521

Posted: Wed Mar 27, 2024 12:55 pm

proxy_correlations wrote:
logrusx wrote:
highly probably of hardware origin


I thought so too! So I swapped back to 6.1.57 with AMDGPU, DRI_PRIME=1 for most of my tasks, no freezes, smooth sailing, swapped back to 6.6.21 (No AMDGPU), again no problems, toggled AMDGPU and bam it's a brick.


If I hit such an issue, I would do everything to make sure I know what the cause is. If I were not convinced, I would run memtest86+ overnight just to be sure there's one less possible cause.

Best Regards,
Georgi
pietinger
Moderator

Joined: 17 Oct 2006
Posts: 5231
Location: Bavaria

Posted: Wed Mar 27, 2024 1:52 pm

proxy_correlations,

I have just had a look at the patches that came in 6.6.23 (it took a while because this minor version was once again very extensive). This patch looks very interesting for you:
https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?h=linux-6.6.y&id=8f3e68c6a3fff53c2240762a47a0045d89371775
_________________
https://wiki.gentoo.org/wiki/User:Pietinger
proxy_correlations
n00b

Joined: 11 May 2022
Posts: 6

Posted: Sat Mar 30, 2024 9:12 am

Apologies for the delay, this is my work PC so I have to be very strategic with time, plus a very busy week.

I did what logrusx suggested

logrusx wrote:
I would run memtest86+ overnight just to be sure there's one less possible cause.


Luckily the test passed 100%, no failing addresses!

pietinger wrote:
I highly suggest to try a 6.7 (or 6.8 ) kernel (yes, AMD did some heavy patching their stuff in all the last major kernel versions).


Code:
❯ uname -a
Linux vector 6.8.1-gentoo-x86_64 #4 SMP PREEMPT_DYNAMIC Wed Mar 27 21:56:45 IST 2024 x86_64 Intel(R) Core(TM) i7-8705G CPU @ 3.10GHz GenuineIntel GNU/Linux


Good news and bad news. Bad news first, unfortunately the issue persists :(

Tho good news is it somehow fixed (a bug?) that would make it so that the initial encryption password prompt would just disappear underneath other bootup messages. You could still unlock the machine if you just pretended like there was a prompt and just started typing. Now it shows up just fine.

pietinger wrote:
https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?h=linux-6.6.y&id=8f3e68c6a3fff53c2240762a47a0045d89371775


Hmm could be! Perhaps I should try something between 6.6 and 6.6.2 to see if it was indeed that patch that's responsible.

Right now I'm trying to see if I can enable just some modules and reproduce the issue, i.e. instead of M for AMDGPU, using <*> with built-in firmware like this:
Code:
CONFIG_EXTRA_FIRMWARE="amdgpu/vegam_ce.bin amdgpu/vegam_me.bin amdgpu/vegam_mec2.bin amdgpu/vegam_mec.bin amdgpu/vegam_pfp.bin amdgpu/vegam_rlc.bin amdgpu/vegam_sdma1.bin amdgpu/vegam_sdma.bin amdgpu/vegam_smc.bin amdgpu/vegam_uvd.bin amdgpu/vegam_vce.bin"


Also don't know how important this is but I noticed

Code:
[  122.395930] [drm:atom_op_jump [amdgpu]] *ERROR* atombios stuck in loop for more than 20secs aborting
[  122.396133] [drm:amdgpu_atom_execute_table_locked [amdgpu]] *ERROR* atombios stuck executing C6FC (len 403, WS 20, PS 0) @ 0xC815
[  122.396258] [drm:amdgpu_atom_execute_table_locked [amdgpu]] *ERROR* atombios stuck executing C410 (len 114, WS 0, PS 8) @ 0xC46D


in dmesg
pietinger
Moderator

Joined: 17 Oct 2006
Posts: 5231
Location: Bavaria

Posted: Sat Mar 30, 2024 12:01 pm

Have you tried building the AMDGPU driver statically <*> into your kernel (instead of as a <M>odule)? Do you then get the same error message in dmesg?
_________________
https://wiki.gentoo.org/wiki/User:Pietinger
Hu
Administrator

Joined: 06 Mar 2007
Posts: 22853

Posted: Sat Mar 30, 2024 2:49 pm

proxy_correlations wrote:
Tho good news is it somehow fixed (a bug?) that would make it so that the initial encryption password prompt would just disappear underneath other bootup messages. You could still unlock the machine if you just pretended like there was a prompt and just started typing. Now it shows up just fine.
Although the old behavior was undesirable, it is not really a fixable bug, so it may recur either randomly or when you do another upgrade in the future. You have the kernel printing messages as it completes various steps, and you also have the user tool printing the password prompt. Neither is dependent on the other reaching a quiet state, so messages can end up interspersed.