Author |
Message |
proxy_correlations n00b
Joined: 11 May 2022 Posts: 6
|
Posted: Sun Mar 24, 2024 4:30 pm Post subject: Clock speeds stuck at 800MHz after upgrading to 6.6 |
|
|
Essentially, one night I decided to finally upgrade my kernel from 6.1.57 to whatever was latest with my classic routine of letting the kernel cook overnight, but come morning everything was slow. The reason - CPU freq stuck at 800MHz even under heavy load.
Code: |
❯ cpupower frequency-info
driver: intel_pstate
hardware limits: 800 MHz - 4.10 GHz
available cpufreq governors: performance powersave
current policy: frequency should be within 800 MHz and 4.10 GHz.
The governor "performance" may decide which speed to use
within this range.
current CPU frequency: Unable to call hardware
current CPU frequency: 800 MHz (asserted by call to kernel)
|
I checked everything I thought would help: scaling governor, firmware, temps, pstate controls. Everything worked fine on 6.1.57, but as soon as I booted back into 6.6, everything was slow once again. So after around 10 days of tinkering with possible culprits, recompiling, installing and reinstalling, I finally gave up and just wiped everything clean.
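For anyone retracing these checks: since cpupower reports "Unable to call hardware" above, the kernel's own view in sysfs is the number to look at. A minimal sketch that prints driver, governor and current frequency (in kHz) for each cpufreq policy. The function name `cpufreq_summary` and its optional root argument are made up for this example; the file names are the standard cpufreq sysfs attributes.

```shell
# Print "policyN driver governor cur_khz" for every cpufreq policy.
# The optional first argument (sysfs CPU root) is an illustrative
# convenience so the function can be pointed at a copy of the tree.
cpufreq_summary() {
    root="${1:-/sys/devices/system/cpu}"
    for p in "$root"/cpufreq/policy*; do
        # Skip the unexpanded glob when no policy directory exists.
        [ -d "$p" ] || continue
        printf '%s %s %s %s\n' \
            "$(basename "$p")" \
            "$(cat "$p/scaling_driver")" \
            "$(cat "$p/scaling_governor")" \
            "$(cat "$p/scaling_cur_freq")"
    done
}
```

Calling `cpufreq_summary` while something heavy is running should show whether every policy sits at 800000 or only some of them.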
To my surprise, after this fresh install from liveUSB, everything worked perfectly fine! So this time, instead of just copying the settings from my old kernel config, I enabled each feature set as I needed it, one group at a time.
So how crazy would it be if I were to say, I finally found the (possible) culprit and it was CONFIG_DRM_AMDGPU <*/M>. Why would gpu modules/firmware affect CPU clock speeds? Perhaps there's something else I'm missing. Right now I have everything AMD related disabled, but I do need the GPU support, without it the onboard intel integrated chip struggles to play a YT video, mpv just gives up on 4k videos.
Tho I have not the faintest idea what the exact issue is, nor do I know where to look. Any insights?
Spec:
Code: | CPU: Intel i7-8705G (8) @ 4.100GHz
GPU: Intel HD Graphics 630
GPU: AMD ATI Radeon RX Vega M GL |
|
|
|
|
logrusx Advocate
Joined: 22 Feb 2018 Posts: 2555
|
Posted: Sun Mar 24, 2024 6:08 pm Post subject: Re: Clock speeds stuck at 800MHz after upgrading to 6.6 |
|
|
proxy_correlations wrote: |
So how crazy would it be if I were to say, I finally found the (possible) culprit and it was CONFIG_DRM_AMDGPU <*/M>. |
It is crazy and it most definitely isn't true. I would suggest reading the documentation of the intel_pstate driver. It has its own specifics and I would speculate you were using ACPI cpufreq driver in 6.1.
Best Regards,
Georgi |
|
|
|
proxy_correlations n00b
Joined: 11 May 2022 Posts: 6
|
Posted: Mon Mar 25, 2024 4:09 am Post subject: Re: Clock speeds stuck at 800MHz after upgrading to 6.6 |
|
|
Thanks for the reply
logrusx wrote: |
would speculate you were using ACPI cpufreq driver in 6.1
|
Damn that's right, I didn't even realize!
Code: | CONFIG_X86_ACPI_CPUFREQ=y
CONFIG_X86_ACPI_CPUFREQ_CPB=y |
Apparently I had both toggled, i.e. ACPI_CPUFREQ and INTEL_PSTATE so perhaps 6.1 defaulted to ACPI and 6.6 is defaulting to intel_pstate. Still not sure why enabling AMDGPU tanks my CPU freq. Could some of its modules be forcing intel_pstate on CPU?
I don't see much I can configure with intel_pstate except for the scaling governor. I think I'm missing something here; is there anything in particular I should be looking for?
Also would it be too dumb to switch to ACPI and just disable pstate? (Assuming that it'll fix the freq problem and I can enable AMDGPU like in 6.1). |
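To compare the two kernels it can help to query the relevant symbols straight from each .config rather than eyeballing menuconfig. A rough sketch, assuming the usual Gentoo location /usr/src/linux/.config; the helper name `kconfig_value` is hypothetical.

```shell
# Print the value of a kernel config symbol ("y", "m", a string) or
# "unset" when the symbol is absent or commented out.
# Arguments (both illustrative): symbol name without the CONFIG_
# prefix, and optionally the path to a .config file.
kconfig_value() {
    sym="$1"
    cfg="${2:-/usr/src/linux/.config}"
    # The trailing "=" keeps X86_ACPI_CPUFREQ from matching
    # X86_ACPI_CPUFREQ_CPB.
    val=$(sed -n "s/^CONFIG_${sym}=//p" "$cfg" | head -n 1)
    printf '%s\n' "${val:-unset}"
}
```

For example, `for s in X86_INTEL_PSTATE X86_ACPI_CPUFREQ DRM_AMDGPU; do echo "$s=$(kconfig_value "$s")"; done` run against the 6.1 and the 6.6 config shows at a glance where they differ.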
|
|
|
logrusx Advocate
Joined: 22 Feb 2018 Posts: 2555
|
Posted: Mon Mar 25, 2024 8:41 am Post subject: |
|
|
Quote: | ``intel_pstate`` can operate in two different modes, active or passive. In the
active mode, it uses its own internal performance scaling governor algorithm or
allows the hardware to do performance scaling by itself, while in the passive
mode it responds to requests made by a generic ``CPUFreq`` governor implementing
a certain performance scaling algorithm. Which of them will be in effect
depends on what kernel command line options are used and on the capabilities of
the processor.
|
Quote: | Passive Mode
------------
This is the default operation mode of ``intel_pstate`` for processors without
hardware-managed P-states (HWP) support. It is always used if the
``intel_pstate=passive`` argument is passed to the kernel in the command line
|
Choose your preferred scaling algorithm (I guess you would want schedutil) and add intel_pstate=passive to your kernel command line.
Some people stick with intel pstate but from what I've read it's not very flexible.
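After rebooting it is worth confirming that the option actually reached the kernel. A small sketch that pulls one parameter out of the command line; `cmdline_param` and its optional file argument are invented for illustration, /proc/cmdline is the real source.

```shell
# Print the value of a kernel command line parameter, or nothing if it
# is not present. The second argument (path to a cmdline file) is an
# illustrative convenience; it defaults to /proc/cmdline.
cmdline_param() {
    name="$1"
    file="${2:-/proc/cmdline}"
    # One parameter per line, then strip "name=" from the match.
    tr ' ' '\n' < "$file" | sed -n "s/^${name}=//p" | head -n 1
}
```

`cmdline_param intel_pstate` should then print `passive`, and `cat /sys/devices/system/cpu/intel_pstate/status` reports the mode the driver really ended up in (active, passive or off).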
Best Regards,
Georgi |
|
|
|
e8root Tux's lil' helper
Joined: 09 Feb 2024 Posts: 94
|
Posted: Mon Mar 25, 2024 7:24 pm Post subject: |
|
|
Recently I had to enable Speedstep in the UEFI to get proper scheduling on the CPU. In my case it had more to do with P and E cores not being recognized than with clock frequency, but in either case it might be worth checking.
Also I would recommend testing a newer kernel like 6.8.1. It costs nothing to try and maybe the issue has already been fixed, which, even if you don't want to use the latest and greatest kernel, might give you some clue about what the issue is. _________________ Unix Wars - Episode V: AT&T Strikes Back |
|
|
|
logrusx Advocate
Joined: 22 Feb 2018 Posts: 2555
|
Posted: Mon Mar 25, 2024 8:10 pm Post subject: |
|
|
e8root wrote: | Recently I had to enable Speedstep in the UEFI to get proper scheduling on the CPU. |
It might be the case. I've lost track of Intel's new screw ups.
Best Regards,
Georgi |
|
|
|
e8root Tux's lil' helper
Joined: 09 Feb 2024 Posts: 94
|
Posted: Mon Mar 25, 2024 9:24 pm Post subject: |
|
|
Actually, the way it was supposed to work is that my cores run at 100% frequency and the Linux kernel schedules threads on the fastest cores judging by clock alone. In Windows it works exactly like that: I could disable Speedstep, which disables frequency scaling completely, and still have priority threads scheduled to P-cores first. Linux somehow doesn't do that and still controls the cores' frequency, and even when I completely disabled the idle driver, which kept the frequency at 100%, it still did not schedule threads on P-cores. I won't even say how I did that because it was a bad method and would increase idle power consumption by not issuing halt instructions to the CPU.
Long story short, my case looks to be a Linux screw-up and not Intel's.
There is an interface through which Linux queries information from the core itself, and it requires Speedstep.
I guess there might not have been a reason to design any other mechanism for scheduling threads in Linux, since every CPU supports this kind of querying and it's not like there were CPUs with different clock speeds on different cores before Alder Lake.
All that said:
1) no idea if Speedstep can be related in any way to your issue
2) not even sure if I investigated and understood everything correctly, so please take my post with a grain of NaCl or even KCl ; ) _________________ Unix Wars - Episode V: AT&T Strikes Back |
|
|
|
pietinger Moderator
Joined: 17 Oct 2006 Posts: 5277 Location: Bavaria
|
Posted: Mon Mar 25, 2024 9:36 pm Post subject: Re: Clock speeds stuck at 800MHz after upgrading to 6.6 |
|
|
proxy_correlations wrote: | [...] So this time, instead of just copying the settings from my old kernel config, I enabled each feature set as I needed it, one group at a time. [...] |
Maybe this is the reason for your problem ... copying old settings is fine ... IF ... you do a "make oldconfig" ...
(See more here: https://wiki.gentoo.org/wiki/User:Pietinger/Tutorials/Manual_kernel_configuration#Cheat_Sheets )
proxy_correlations wrote: | [...] So how crazy would it be if I were to say, I finally found the (possible) culprit and it was CONFIG_DRM_AMDGPU <*/M>. |
As @logrusx already said: This is not the culprit.
proxy_correlations wrote: | Apparently I had both toggled, i.e. ACPI_CPUFREQ and INTEL_PSTATE so perhaps 6.1 defaulted to ACPI and 6.6 is defaulting to intel_pstate. [...] |
No. If both drivers are enabled in your kernel .config AND you have an Intel CPU with "hwp" capability (yes, you have; an Intel i7-8705G has hwp support; see also "lscpu") THEN intel_pstate will do the job (you can disable ACPI cpufreq worry-free because it is not used).
(See more here: https://wiki.gentoo.org/wiki/User:Pietinger/Tutorials/Manual_Configuring_Kernel_Version_6.6#Part_3_-_Must_Haves
and: https://docs.kernel.org/admin-guide/pm/intel_pstate.html )
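A quick way to confirm the HWP claim on the running system, as a rough sketch: the `flags` line of /proc/cpuinfo lists `hwp` on CPUs that advertise it. The function name `has_cpu_flag` and the optional file argument are made up for this example.

```shell
# Succeed (exit status 0) if the first "flags" line of cpuinfo contains
# the given flag as a whole word. The second argument is an
# illustrative override for the file to read.
has_cpu_flag() {
    flag="$1"
    file="${2:-/proc/cpuinfo}"
    # Take the first flags line, put one token per line, and require
    # an exact match so "hwp" does not match "hwp_notify".
    sed -n '/^flags/{p;q;}' "$file" | tr ' \t' '\n\n' | grep -x "$flag" > /dev/null
}
```

Usage would be something like `has_cpu_flag hwp && echo "HWP advertised"`.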
proxy_correlations wrote: | [...]Also would it be too dumb to switch to ACPI and just disable pstate? (Assuming that it'll fix the freq problem and I can enable AMDGPU like in 6.1). |
Yes it would be dumb, because Intel's P-State driver is better than ACPI cpufreq.
Have you additionally enabled intel_idle? If not, please do so:
Code: | Power management and ACPI options --->
[*] Cpuidle Driver for Intel Processors |
_________________ https://wiki.gentoo.org/wiki/User:Pietinger |
|
|
|
proxy_correlations n00b
Joined: 11 May 2022 Posts: 6
|
Posted: Wed Mar 27, 2024 8:26 am Post subject: Re: Clock speeds stuck at 800MHz after upgrading to 6.6 |
|
|
Took a while to try out everything
pietinger wrote: |
Maybe this is the reason for your problem ... copying old settings is fine ... IF ... you do a "make oldconfig" ...
|
Always! I learned that lesson a long time ago... the hard way
I went over https://wiki.gentoo.org/wiki/User:Pietinger/Tutorials/Manual_Configuring_Kernel_Version_6.6#Part_3_-_Must_Haves (btw great tutorial) and now I'm running a similar setup as far as I can tell.
The good news is - there's a 50% chance the machine will boot running at good frequencies; the bad news is - there's a 50% chance it won't. Even worse news - sometimes it just bricks and everything freezes. I had a minor heart attack when the system froze in the middle of `make install`; luckily nothing broke and I was able to recompile just fine after a forced restart.
This is from that other 50%, stuck at 800 MHz
Code: | analyzing CPU 0:
driver: intel_cpufreq
CPUs which run at the same hardware frequency: 0
CPUs which need to have their frequency coordinated by software: 0
maximum transition latency: 20.0 us
hardware limits: 800 MHz - 4.10 GHz
available cpufreq governors: performance schedutil
current policy: frequency should be within 800 MHz and 4.10 GHz.
The governor "schedutil" may decide which speed to use
within this range.
current CPU frequency: Unable to call hardware
current CPU frequency: 800 MHz (asserted by call to kernel)
boost state support:
Supported: yes
Active: yes
|
Apparently `intel_pstate` becomes `intel_cpufreq` with `intel_pstate=passive`.
Right now I'm trying out intel_pstate=passive with cpufreq.default_governor=schedutil (Grub)
I did notice "smu firmware loading failed" when checking for loaded firmware with "dmesg -t | grep amdgpu | grep firmware", don't know if that has something to do with this weird behavior. Things work fine in 6.1 so I doubt this is some hardware issue.
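The grep chain above only catches lines that mention firmware. A slightly wider filter, sketched here under the assumption that the interesting amdgpu lines contain "error", "fail" or "firmware" in some capitalisation; the function name `amdgpu_problems` is hypothetical.

```shell
# Read kernel log lines on stdin and keep only amdgpu lines that
# mention an error, a failure or firmware (case-insensitive).
amdgpu_problems() {
    grep -i 'amdgpu' | grep -Ei 'error|fail|firmware'
}
```

Used as `dmesg -t | amdgpu_problems`, once on 6.1 and once on 6.6, so the two outputs can be diffed.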
But yeah, extremely weird stuff, I don't think AMDGPU is the culprit here, but unloading it does make everything work |
|
|
|
logrusx Advocate
Joined: 22 Feb 2018 Posts: 2555
|
Posted: Wed Mar 27, 2024 10:47 am Post subject: |
|
|
System freezes are a sign of another issue, highly probably of hardware origin. The easiest thing to do is run 8 passes of memtest86+.
Best Regards,
Georgi |
|
|
|
proxy_correlations n00b
Joined: 11 May 2022 Posts: 6
|
Posted: Wed Mar 27, 2024 11:29 am Post subject: |
|
|
logrusx wrote: | highly probably of hardware origin |
I thought so too! So I swapped back to 6.1.57 with AMDGPU, DRI_PRIME=1 for most of my tasks, no freezes, smooth sailing, swapped back to 6.6.21 (No AMDGPU), again no problems, toggled AMDGPU and bam it's a brick. |
|
|
|
pietinger Moderator
Joined: 17 Oct 2006 Posts: 5277 Location: Bavaria
|
Posted: Wed Mar 27, 2024 12:21 pm Post subject: Re: Clock speeds stuck at 800MHz after upgrading to 6.6 |
|
|
proxy_correlations wrote: | [...] Things work fine in 6.1 so I doubt this is some hardware issue.
But yeah, extremely weird stuff, I don't think AMDGPU is the culprit here, but unloading it does make everything work |
In your first post you wrote that this system additionally has a "GPU: AMD ATI Radeon RX Vega M GL" ... so, if everything works fine in 6.1 AND you have a problem in 6.6 WITH this module THEN I guess there really is a problem in the 6.6 AMD driver. I highly suggest trying a 6.7 (or 6.8) kernel (yes, AMD did some heavy patching of their stuff in all the recent major kernel versions). _________________ https://wiki.gentoo.org/wiki/User:Pietinger |
|
|
|
logrusx Advocate
Joined: 22 Feb 2018 Posts: 2555
|
Posted: Wed Mar 27, 2024 12:55 pm Post subject: |
|
|
proxy_correlations wrote: | logrusx wrote: | highly probably of hardware origin |
I thought so too! So I swapped back to 6.1.57 with AMDGPU, DRI_PRIME=1 for most of my tasks, no freezes, smooth sailing, swapped back to 6.6.21 (No AMDGPU), again no problems, toggled AMDGPU and bam it's a brick. |
If I hit such an issue, I would do everything to make sure I know what the cause is. If I were not convinced, I would run memtest86+ overnight just to be sure there's one less possible cause.
Best Regards,
Georgi |
|
|
|
pietinger Moderator
Joined: 17 Oct 2006 Posts: 5277 Location: Bavaria
|
|
|
|
proxy_correlations n00b
Joined: 11 May 2022 Posts: 6
|
Posted: Sat Mar 30, 2024 9:12 am Post subject: |
|
|
Apologies for the delay, this is my work PC so I have to be very strategic with time, plus a very busy week.
I did what logrusx suggested
logrusx wrote: | I would run memtest86+ overnight just to be sure there's one less possible cause. |
Luckily the test passed 100%, no failing addresses!
pietinger wrote: | I highly suggest trying a 6.7 (or 6.8) kernel (yes, AMD did some heavy patching of their stuff in all the recent major kernel versions). |
Code: | ❯ uname -a
Linux vector 6.8.1-gentoo-x86_64 #4 SMP PREEMPT_DYNAMIC Wed Mar 27 21:56:45 IST 2024 x86_64 Intel(R) Core(TM) i7-8705G CPU @ 3.10GHz GenuineIntel GNU/Linux
|
Good news and bad news. Bad news first: unfortunately the issue persists.
Tho good news is it somehow fixed (a bug?) that would make it so that the initial encryption password prompt would just disappear underneath other bootup messages. You could still unlock the machine if you just pretended like there was a prompt and just started typing. Now it shows up just fine.
pietinger wrote: | https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?h=linux-6.6.y&id=8f3e68c6a3fff53c2240762a47a0045d89371775 |
Hmm could be! Perhaps I should try something between 6.6 and 6.6.2 to see if that patch is indeed responsible.
Right now I'm trying to see if I can enable just some modules and reproduce the issue, i.e. <*> instead of <M> for AMDGPU, like this:
Code: | CONFIG_EXTRA_FIRMWARE="amdgpu/vegam_ce.bin amdgpu/vegam_me.bin amdgpu/vegam_mec2.bin amdgpu/vegam_mec.bin amdgpu/vegam_pfp.bin amdgpu/vegam_rlc.bin amdgpu/vegam_sdma1.bin amdgpu/vegam_sdma.bin amdgpu/vegam_smc.bin amdgpu/vegam_uvd.bin amdgpu/vegam_vce.bin"
|
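With AMDGPU built in (<*>), the blobs named in CONFIG_EXTRA_FIRMWARE have to exist under the firmware directory at build time. A minimal sketch for checking that, assuming the default directory /lib/firmware; the function name `missing_firmware` and its parameters are invented for this example.

```shell
# Print every firmware blob from a space-separated list that is missing
# under the firmware directory. Both parameters are illustrative; the
# default directory is an assumption matching the usual
# CONFIG_EXTRA_FIRMWARE_DIR.
missing_firmware() {
    list="$1"
    dir="${2:-/lib/firmware}"
    # Unquoted on purpose: split the list on whitespace.
    for fw in $list; do
        [ -f "$dir/$fw" ] || printf '%s\n' "$fw"
    done
}
```

Passing the quoted string from CONFIG_EXTRA_FIRMWARE as the first argument should print nothing if all files are in place. If the firmware package installed compressed files, the names will not match and will show up as missing.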
Also don't know how important this is but I noticed
Code: | [ 122.395930] [drm:atom_op_jump [amdgpu]] *ERROR* atombios stuck in loop for more than 20secs aborting
[ 122.396133] [drm:amdgpu_atom_execute_table_locked [amdgpu]] *ERROR* atombios stuck executing C6FC (len 403, WS 20, PS 0) @ 0xC815
[ 122.396258] [drm:amdgpu_atom_execute_table_locked [amdgpu]] *ERROR* atombios stuck executing C410 (len 114, WS 0, PS 8) @ 0xC46D |
in dmesg |
|
|
|
pietinger Moderator
Joined: 17 Oct 2006 Posts: 5277 Location: Bavaria
|
Posted: Sat Mar 30, 2024 12:01 pm Post subject: |
|
|
Have you tried building the AMDGPU driver statically <*> into your kernel (instead of as a <M>odule)? Do you then still get the same error message in dmesg? _________________ https://wiki.gentoo.org/wiki/User:Pietinger |
|
|
|
Hu Administrator
Joined: 06 Mar 2007 Posts: 22924
|
Posted: Sat Mar 30, 2024 2:49 pm Post subject: |
|
|
proxy_correlations wrote: | Tho good news is it somehow fixed (a bug?) that would make it so that the initial encryption password prompt would just disappear underneath other bootup messages. You could still unlock the machine if you just pretended like there was a prompt and just started typing. Now it shows up just fine. | Although the old behavior was undesirable, it is not really a fixable bug, so it may recur either randomly or when you do another upgrade in the future. You have the kernel printing messages as it completes various steps, and you also have the user tool printing the password prompt. Neither is dependent on the other reaching a quiet state, so messages can end up interspersed. |
|
|
|
|