devsk Advocate
Joined: 24 Oct 2003 Posts: 3003 Location: Bay Area, CA
Posted: Tue Jan 02, 2024 3:13 am Post subject: IO Performance loss after kernel upgrades from 5.15 to 6.x
Has anybody tracked the IO performance across kernel upgrades?
I seem to have run into a situation where the performance is halved in the worst case. I am comparing three LTS kernels:
5.15.145
6.1.68
6.6.8 (purported next LTS)
I have a RAIDZ1 of 4 NVMe SSDs, running the same ZFS version, 2.2.2. The only thing I change is the kernel version; everything else is the same (same machine, same ZFS version, same pool, same userspace). The kernel config has been brought forward with 'make oldconfig', so for the config elements common to 5.15, 6.1 and 6.6 it should be nearly identical.
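For anyone who wants to double-check that, the kernel source ships a helper for comparing configs; a minimal check (the config file paths are examples, substitute your own) is:
Code: | # print every option added, removed, or changed between the two configs
cd /usr/src/linux
scripts/diffconfig /boot/config-5.15.145 /boot/config-6.6.8 |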
Zpool scrub times:
5.15.145 - 10m
6.1.68 - 12m
6.6.8 - 20m
Attached are kdiskmark shots from 5.15.145 vs 6.6.8 (I did not run it for 6.1.68). I ran it on top of the same ZFS dataset, with compression disabled. All numbers are lower for 6.6.8.
5.15.145 https://www.phoronix.com/forums/filedata/fetch?id=1431236
6.6.8 https://www.phoronix.com/forums/filedata/fetch?id=1431237
I was expecting 6.6 to post higher numbers because of all the IO enhancement work that has gone in since 5.15.
Is there something obvious that I missed, or something I need to tune specifically, to get the sequential IO speed in 6.6.8 back up?
devsk Advocate
Joined: 24 Oct 2003 Posts: 3003 Location: Bay Area, CA
Posted: Tue Jan 02, 2024 3:14 am
BTW, it's not a ZFS regression, because the raw IO speed with 'dd' in non-cached (direct) mode has fallen as well.
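For reference, the kind of raw test I mean, with O_DIRECT standing in for "non-cached" (the device name is an example; it's a read-only test, but double-check the device before running it):
Code: | # sequential read straight from one NVMe device, bypassing the page cache
dd if=/dev/nvme0n1 of=/dev/null bs=1M count=4096 iflag=direct |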
gentoo_ram Guru
Joined: 25 Oct 2007 Posts: 513 Location: San Diego, California USA
Posted: Thu Jan 11, 2024 1:22 am
I believe the default I/O schedulers have changed across the various kernel versions. That might have something to do with it. If you can boot your different kernels, then maybe compare the I/O scheduler settings. Try:
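One way to do that, assuming the standard sysfs layout (the bracketed entry is the scheduler currently in use):
Code: | # list available and active I/O scheduler for every block device
grep . /sys/block/*/queue/scheduler |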
devsk Advocate
Joined: 24 Oct 2003 Posts: 3003 Location: Bay Area, CA
Posted: Tue Jan 16, 2024 6:21 am
gentoo_ram wrote: | I believe the default I/O schedulers have changed across the various kernel versions. That might have something to do with it. If you can boot your different kernels, then maybe compare the I/O scheduler settings. |
Do you have the output from 6.6.x?
The NVMe drives use the "none" scheduler and the HDD drives use mq-deadline in 5.15. I don't think that has changed.
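If it had changed, forcing the old scheduler back at runtime would be a quick test; a minimal sketch, assuming the device is nvme0n1 (substitute your own):
Code: | # echo one of the names listed in the scheduler file; takes effect immediately
echo none > /sys/block/nvme0n1/queue/scheduler |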
devsk Advocate
Joined: 24 Oct 2003 Posts: 3003 Location: Bay Area, CA
Posted: Mon Feb 05, 2024 4:51 pm
This regression is present in 6.7.3 as well. The system boots in 45 seconds, while 5.15 boots in about 25s. A zpool scrub over a pool of NVMe drives runs at half the speed compared to 5.15. So, there are definitely some major IO issues with kernels beyond 5.15.
I am surprised that no one else is noticing any slowdowns across two LTS kernels (6.1 and 6.6). Maybe folks don't test performance that often.
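For anyone who wants to compare their own numbers, the current scrub rate shows up in the scan line while it runs (the pool name here is an example):
Code: | # look for '... scanned at .../s' in the scan: line
zpool status tank |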
logrusx Advocate
Joined: 22 Feb 2018 Posts: 2660
Posted: Mon Feb 05, 2024 5:41 pm
I recently switched to 6.6 and it seems like there's some degradation in the responsiveness of the desktop, which could be attributed to the CPU scheduler. I remember at one time I used Con Kolivas' patches to overcome that; then for a long while they were no longer necessary.
Best Regards,
Georgi
devsk Advocate
Joined: 24 Oct 2003 Posts: 3003 Location: Bay Area, CA
Posted: Mon Feb 05, 2024 6:17 pm
The worst part here is that once the IO starts to back up because of the slowness, the desktop starts to suffer: the mouse stutters, menus don't open, applications don't launch when you click on them. It's a complete shitshow.
I cannot reproduce any of that with 5.15.
I think they made that CPU scheduler change (EEVDF replacing CFS in 6.6) in a hurry and did not add a way to stay with the older code via a config option. If they had, we could at least confirm that that is indeed the issue.
Saundersx Apprentice
Joined: 11 Apr 2005 Posts: 294
Posted: Mon Feb 05, 2024 8:28 pm
Just for giggles, try this; I had a regression back in 5.19.6 and this made it tolerable.
Code: | echo 500 > /proc/sys/vm/dirty_expire_centisecs
echo $((1024*1024*256)) > /proc/sys/vm/dirty_background_bytes
echo $((1024*1024*512)) > /proc/sys/vm/dirty_bytes |
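A side note, in case anyone wants to keep values like these across reboots: the same settings can go in sysctl.d (the file name is an example; the numbers are the byte values of the echoes above):
Code: | # /etc/sysctl.d/99-writeback.conf
vm.dirty_expire_centisecs = 500
vm.dirty_background_bytes = 268435456
vm.dirty_bytes = 536870912 |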
devsk Advocate
Joined: 24 Oct 2003 Posts: 3003 Location: Bay Area, CA
Posted: Mon Feb 05, 2024 10:43 pm
Saundersx wrote: | Just for giggles, try this; I had a regression back in 5.19.6 and this made it tolerable.
Code: | echo 500 > /proc/sys/vm/dirty_expire_centisecs
echo $((1024*1024*256)) > /proc/sys/vm/dirty_background_bytes
echo $((1024*1024*512)) > /proc/sys/vm/dirty_bytes |
| Thanks, Saundersx! I know these settings very well, and they are tuned carefully for my system. I never run a Linux machine without the VM settings tuned properly; otherwise, the system becomes intolerable once IO starts.
The difference in behavior in my case comes down to a single change: the kernel version.