lagalopex Guru
Joined: 16 Oct 2004 Posts: 565
|
Posted: Sat Jun 27, 2009 11:16 am Post subject: |
|
|
I have now enabled the new in-kernel check for hung tasks. It prints a warning quite often:
INFO: task *** blocked for more than 120 seconds.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Call Trace: ...
It's most often a small program that captures from the webcam and saves the picture to the hard disk. |
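A minimal sketch of inspecting or changing that threshold from a shell, assuming a kernel built with hung task detection (CONFIG_DETECT_HUNG_TASK). The helper name `hung_task_timeout` and the `PROC_ROOT` override are illustrative assumptions, not something from the post; the override exists only so the function can be tried against a scratch directory.

```shell
#!/bin/sh
# Hypothetical helper: show or set the hung-task warning threshold.
# PROC_ROOT defaults to /proc; overriding it is only meant for dry runs.
PROC_ROOT="${PROC_ROOT:-/proc}"

hung_task_timeout() {
    knob="$PROC_ROOT/sys/kernel/hung_task_timeout_secs"
    if [ ! -e "$knob" ]; then
        echo "hung task detection not available" >&2
        return 1
    fi
    if [ $# -eq 0 ]; then
        cat "$knob"            # print the current threshold in seconds
    else
        echo "$1" > "$knob"    # 0 disables the warning; writing needs root
    fi
}
```

Called without arguments it prints the current value (120 in the warning quoted above); `hung_task_timeout 0` corresponds to the `echo 0` command the kernel message suggests.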
|
|
|
Lepaca Kliffoth l33t
Joined: 28 Apr 2004 Posts: 737 Location: Florence, Italy
|
Posted: Mon Jun 29, 2009 4:15 pm Post subject: |
|
|
kernelOfTruth wrote: | Lepaca Kliffoth, there are more factors involved in this problem:
for me it's actually the other way around:
- copying large files from one hdd to another
-> results in 100% halt of network-card
-> often also the mouse-pointer stops reacting and even the sound (if streaming on the net, of course - if playing locally it also stops)
in several situations if you raise the priority via renice of the affected apps sound and the mouse-pointer continue running/working
it pretty much looks like a bottleneck in the I/O subsystem combined with cpu-scheduler issues ... |
I thought about it and you're right, there are more variables involved. While the situation did improve, I can't say with any certainty that the networking is at fault; it could be something that is triggered more often or more strongly by the madwifi drivers, while the "faulty code" is still somewhere else. But it doesn't matter: in the end we're all just soft, cute kittens left out in the cold rain by the heartless kernel devs, meowing for a fix. _________________ It isn't enough to win - everyone else must lose, and you also have to rub it in their face (maybe chop off an arm too for good measure).
Animebox! |
|
|
|
luispa Guru
Joined: 17 Mar 2006 Posts: 359 Location: España
|
Posted: Thu Jul 02, 2009 5:12 pm Post subject: |
|
|
Hi,
It's been 4 days since I upgraded from 2.6.28 to 2.6.30. Since then I've noticed "sporadic" I/O issues, ranging from sudden slowness and glitches while watching a movie, to writes to mysql taking much longer than expected, to an unbelievable apparent hang on disk I/O while executing "sync" (twice it took more than 15 minutes to finish a sync, and while it ran I observed a miserable throughput of under ~200 KB/s to the disk).
I've had nothing similar to this issue since I installed the system in January (2.6.28); it started just when I went to 2.6.30. My system is not AMD but an Intel Core i7 920 with 12GB RAM and some 1.5TB SATA II hard drives, so the hardware shouldn't be the problem.
I'm commenting here (despite having an Intel CPU) because I just started my research and found this thread; I hope it adds value.
Regards,
Luis
PS: I'll try 2.6.29 and post back in a few days if something changes. |
|
|
|
fangorn Veteran
Joined: 31 Jul 2004 Posts: 1886
|
Posted: Fri Jul 03, 2009 8:21 am Post subject: |
|
|
@luispa
You are correct here. AMD64 is the name of the architecture and is not bound to a manufacturer.
I own many AMD-powered boxes and a Core i7 920 powered box. If anything, the Intel box is worse. OK, that might have something to do with the fact that it is built for the sole purpose of handling multiple TB of video data per month, so I/O on this machine is quite high. But as soon as a job starts writing heavily while I copy something, the machine is hardly usable any more. _________________ Video Encoding scripts collection | Project page |
|
|
|
luispa Guru
Joined: 17 Mar 2006 Posts: 359 Location: España
|
Posted: Sat Jul 04, 2009 10:55 am Post subject: |
|
|
@fangorn
Thanks for the information. As I said, here is the result with 2.6.29: no problem, back to normal behaviour. I'm not suffering problems with I/O now. Obviously I can't add any value here beyond my experience: 2.6.28: OK, 2.6.30: I/O issue, 2.6.29: OK.
The system is not under heavy load like yours, but it has lots of services installed, as I use it as a workstation (mainly photography and, rarely, transcoding video) and as a server (mail, web, mysql, wiki, ...), though without much load.
I can help by running tests, though. What commands should I use to start the test, and which ones to get the metrics?
Thanks
Luis |
|
|
|
DaggyStyle Watchman
Joined: 22 Mar 2006 Posts: 5929
|
Posted: Sat Jul 04, 2009 3:16 pm Post subject: |
|
|
Which kernel config parameters should I check under 2.6.30 in order to see if there is a difference?
Should I select group scheduling? _________________ Only two things are infinite, the universe and human stupidity and I'm not sure about the former - Albert Einstein |
|
|
|
Need4Speed Guru
Joined: 06 Jun 2004 Posts: 497
|
Posted: Thu Jul 09, 2009 3:25 pm Post subject: |
|
|
luispa wrote: | @fangorn
Thanks for the information. As I said, here is the result with 2.6.29: no problem, back to normal behaviour. I'm not suffering problems with I/O now. Obviously I can't add any value here beyond my experience: 2.6.28: OK, 2.6.30: I/O issue, 2.6.29: OK.
The system is not under heavy load like yours, but it has lots of services installed, as I use it as a workstation (mainly photography and, rarely, transcoding video) and as a server (mail, web, mysql, wiki, ...), though without much load.
I can help by running tests, though. What commands should I use to start the test, and which ones to get the metrics?
Thanks
Luis |
If you have the time, the best thing you can probably do is download the kernel tree and run a git bisect. This will allow you to identify the commit that caused this regression and then you can open a bug about it. Here's an example of how to do it: http://kerneltrap.org/node/11753 _________________ 2.6.34-rc3 on x86_64 w/ paludis
WM: ratpoison
Term: urxvt, zsh
Browser: uzbl
Email: mutt, offlineimap
IRC: weechat
News: newsbeuter
PDF: apvlv |
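A minimal sketch of how the bisect suggested above could be started, assuming a clone of the mainline kernel tree and that the release tags `v2.6.29` (reported good in this thread) and `v2.6.30` (reported bad) are the right endpoints. The function name `bisect_begin` is illustrative only.

```shell
#!/bin/sh
# Sketch: start bisecting a regression between two kernel release tags.
# Run this from inside a git clone of the kernel tree.
bisect_begin() {
    good="${1:-v2.6.29}"   # last release reported as working
    bad="${2:-v2.6.30}"    # first release reported as broken
    git bisect start &&
    git bisect bad "$bad" &&
    git bisect good "$good"
}

# git then checks out a commit roughly halfway between the two.
# Build and boot that kernel, repeat the I/O test, and report the result:
#   git bisect good     # I/O behaved normally
#   git bisect bad      # the stalls came back
# Repeat until git names the first bad commit, then clean up with:
#   git bisect reset
```

Each round halves the remaining range, so the number of kernels to build grows only with the logarithm of the number of commits between the two releases.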
|
|
|
Bogo Tux's lil' helper
Joined: 04 May 2002 Posts: 98
|
Posted: Fri Jul 17, 2009 12:48 am Post subject: |
|
|
Just a question, is anyone with this problem using an ATI/AMD video card? I used to have this problem, and it has completely gone away after replacing my ATI with Nvidia.
I had an Athlon 64 3200 with an HD 3000 series that would grind to a halt whenever I did any kind of big file operation (e.g. transferring a movie across hard drives). The desktop would become nearly unresponsive. It would sometimes take up to half a minute to switch to the next song. Even typing in a console was slow. I didn't recall having that problem with that system earlier, and I attributed it to something changing in the kernel. I had been using an Nvidia FX 5700 earlier.
Recently I got a Phenom II X4 with an ATI HD 4870, and still had the same issue. A few weeks ago I replaced the ATI card with an Nvidia GTX 260 because ATI cards do not work very well with Linux (at least for me). Ever since then, I have not experienced the problem. Nothing else changed, just the video card and associated drivers. _________________ "I know it's only rock and roll but I like it." |
|
|
|
kernelOfTruth Watchman
Joined: 20 Dec 2005 Posts: 6111 Location: Vienna, Austria; Germany; hello world :)
|
Posted: Fri Jul 17, 2009 4:14 pm Post subject: |
|
|
Bogo wrote: | Just a question, is anyone with this problem using an ATI/AMD video card? I used to have this problem, and it has completely gone away after replacing my ATI with Nvidia.
I had an Athlon 64 3200 with an HD 3000 series that would grind to a halt whenever I did any kind of big file operation (e.g. transferring a movie across hard drives). The desktop would become nearly unresponsive. It would sometimes take up to half a minute to switch to the next song. Even typing in a console was slow. I didn't recall having that problem with that system earlier, and I attributed it to something changing in the kernel. I had been using an Nvidia FX 5700 earlier.
Recently I got a Phenom II X4 with an ATI HD 4870, and still had the same issue. A few weeks ago I replaced the ATI card with an Nvidia GTX 260 because ATI cards do not work very well with Linux (at least for me). Ever since then, I have not experienced the problem. Nothing else changed, just the video card and associated drivers. |
no - it's not !
I've switched from 7600 GT to 4850 HD and it's the same before and afterwards _________________ https://github.com/kernelOfTruth/ZFS-for-SystemRescueCD/tree/ZFS-for-SysRescCD-4.9.0
https://github.com/kernelOfTruth/pulseaudio-equalizer-ladspa
Hardcore Gentoo Linux user since 2004 |
|
|
|
f0rk Apprentice
Joined: 15 Nov 2004 Posts: 273 Location: Moscow
|
Posted: Tue Jul 21, 2009 4:35 pm Post subject: |
|
|
Switching to zen-sources and using the BFQ or FIFO scheduler, together with disabling NCQ (echo 1 > /sys/block/sda/device/queue_depth), helped me improve the situation. |
|
|
|
kernelOfTruth Watchman
Joined: 20 Dec 2005 Posts: 6111 Location: Vienna, Austria; Germany; hello world :)
|
|
|
|
Syshalt n00b
Joined: 26 Sep 2003 Posts: 22
|
Posted: Thu Jul 23, 2009 8:41 am Post subject: |
|
|
Interesting. I've got a Seagate drive too - and I have the problem. So is there anyone with non-Seagate drives having exactly the same issues? |
|
|
|
DaggyStyle Watchman
Joined: 22 Mar 2006 Posts: 5929
|
Posted: Thu Jul 23, 2009 10:21 am Post subject: |
|
|
Syshalt wrote: | Interesting. I've got a Seagate drive too - and I have the problem. So is there anyone with non-Seagate drives having exactly the same issues? |
This problem isn't Seagate-exclusive... I have the same issue on 2 different computers with hard drives from WD. _________________ Only two things are infinite, the universe and human stupidity and I'm not sure about the former - Albert Einstein |
|
|
|
Fred Krogh Veteran
Joined: 07 Feb 2005 Posts: 1036 Location: Tujunga, CA
|
|
|
|
f0rk Apprentice
Joined: 15 Nov 2004 Posts: 273 Location: Moscow
|
Posted: Fri Jul 24, 2009 10:54 pm Post subject: |
|
|
kernelOfTruth wrote: |
++
in that case it's a bad / faulty ncq-implementation of your harddrives (seagate - I'm looking at you ) |
Yep, unfortunately you are right.
The increase in compilation time is about 300%. Too slow.
In the end my solution was moving back to the x86 arch (everything is fine here with the system under heavy HDD load) because, as we can see, the benefit of using amd64 on desktops is doubtful. |
|
|
|
MageSlayer Apprentice
Joined: 26 Jul 2007 Posts: 253 Location: Ukraine
|
|
|
|
Serchio n00b
Joined: 17 May 2008 Posts: 26
|
Posted: Sun Aug 09, 2009 12:29 am Post subject: |
|
|
How can I disable ncq? |
|
|
|
loftwyr l33t
Joined: 29 Dec 2004 Posts: 970 Location: 43°38'23.62"N 79°27'8.60"W
|
Posted: Sun Aug 09, 2009 1:10 am Post subject: |
|
|
echo 1 > /sys/block/[drive device]/device/queue_depth
This sets the queue depth to 1, which disables NCQ. To turn it back on, echo 31, or whatever default your dmesg shows. _________________ My emerge --info
Have you run revdep-rebuild lately? It's in gentoolkit and it's worth a shot if things don't work well.
Celebrating 5 years of Gentoo-ing. |
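A minimal sketch that wraps the command above so the current depth can be checked before and after changing it. The helper name `queue_depth` and the `SYS_ROOT` override (only there for dry runs against a scratch directory) are assumptions for illustration.

```shell
#!/bin/sh
# Hypothetical helper: read or set a disk's command queue depth.
# SYS_ROOT defaults to /sys; overriding it is only meant for dry runs.
SYS_ROOT="${SYS_ROOT:-/sys}"

queue_depth() {
    dev="$1"                                   # e.g. sda
    knob="$SYS_ROOT/block/$dev/device/queue_depth"
    if [ ! -e "$knob" ]; then
        echo "no queue_depth entry for $dev" >&2
        return 1
    fi
    if [ $# -lt 2 ]; then
        cat "$knob"                            # print the current depth
    else
        echo "$2" > "$knob"                    # depth 1 disables NCQ; needs root
    fi
}
```

`queue_depth sda` prints the current value and `queue_depth sda 1` sets it to 1. The write has to happen in a root shell; with sudo, something like `echo 1 | sudo tee /sys/block/sda/device/queue_depth` is needed, because the redirection in `sudo echo 1 > file` is performed by the calling, unprivileged shell.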
|
|
|
Serchio n00b
Joined: 17 May 2008 Posts: 26
|
Posted: Sun Aug 09, 2009 11:25 am Post subject: |
|
|
loftwyr wrote: | echo 1 > /sys/block/[drive device]/device/queue_depth
This sets the queue depth to 1, which disables NCQ. To turn it back on, echo 31, or whatever default your dmesg shows. |
It returns:
bash: /sys/block/sda/device/queue_depth: Access Denied |
|
|
|
darc n00b
Joined: 03 Sep 2006 Posts: 14
|
Posted: Sun Aug 09, 2009 11:37 am Post subject: |
|
|
Serchio wrote: | loftwyr wrote: | echo 1 > /sys/block/[drive device]/device/queue_depth
This sets the queue depth to 1, which disables NCQ. To turn it back on, echo 31, or whatever default your dmesg shows. |
It returns:
bash: /sys/block/sda/device/queue_depth: Access Denied |
That means your hardware doesn't support NCQ, so you have it off. |
|
|
|
Serchio n00b
Joined: 17 May 2008 Posts: 26
|
Posted: Sun Aug 09, 2009 11:47 am Post subject: |
|
|
darc wrote: | Serchio wrote: | loftwyr wrote: | echo 1 > /sys/block/[drive device]/device/queue_depth
This sets the queue depth to 1, which disables NCQ. To turn it back on, echo 31, or whatever default your dmesg shows. |
It returns:
bash: /sys/block/sda/device/queue_depth: Access Denied |
That means your hardware doesn't support NCQ, so you have it off. |
edit: I had forgotten that AHCI was disabled in the BIOS. Now NCQ is enabled.
edit2: I currently use this patch, and I can see a significant difference compared to not using any patch at all. |
|
|
|
MageSlayer Apprentice
Joined: 26 Jul 2007 Posts: 253 Location: Ukraine
|
Posted: Sun Aug 09, 2009 5:09 pm Post subject: |
|
|
Guys, please post something that can be compared.
I think Interbench (http://users.on.net/~ckolivas/interbench/) results should be a fair basis.
Moreover, it's Con Kolivas' tool, so we have some confidence about what it does. We are aiming for interactivity, aren't we?
These are my results.
Hardware: Compaq nx7010 laptop (Pentium-M 1.7GHz, 512MB)
Common options for the measurements:
Code: |
vm.swappiness=20
vm.vfs_cache_pressure=30
vm.dirty_background_bytes = 0
vm.dirty_background_ratio = 2
vm.dirty_bytes = 0
vm.dirty_expire_centisecs = 3000
vm.dirty_ratio = 10
vm.dirty_writeback_centisecs = 500
vm.highmem_is_dirtyable = 0
echo 1024 > /sys/block/sda/queue/nr_requests
|
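A minimal sketch of turning a settings list like the one above into runtime commands, assuming the list has been saved to a file. The function name `sysctl_cmds` and the file name `settings.txt` are hypothetical. The function only prints `sysctl -w` commands so they can be reviewed first; it does not change anything itself.

```shell
#!/bin/sh
# Hypothetical helper: convert "vm.key = value" lines read from stdin into
# "sysctl -w vm.key=value" commands. Lines that do not match (such as the
# echo into /sys/block/...) are skipped. Nothing is executed here.
sysctl_cmds() {
    sed -n 's/^[[:space:]]*\(vm\.[a-z_]*\)[[:space:]]*=[[:space:]]*\([0-9][0-9]*\)[[:space:]]*$/sysctl -w \1=\2/p'
}
```

For example, `sysctl_cmds < settings.txt` prints one command per setting, and the reviewed output can then be run from a root shell. Some keys may not exist on a given kernel, in which case sysctl reports an error for that key.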
Without patch http://bugzilla.kernel.org/show_bug.cgi?id=12309#c397 plus "clocksource=acpi_pm"
Code: |
Using 787575 loops per ms, running every load for 30 seconds
Benchmarking kernel 2.6.30-zen2-31386-g752ddf5 at datestamp 200908091549
--- Benchmarking simulated cpu of Audio in the presence of simulated ---
Load     Latency +/- SD (ms)  Max Latency  % Desired CPU  % Deadlines Met
None     0.008 +/- 0.00884    0.019        100            100
Video    0.023 +/- 0.37       9.05         100            100
X        0.008 +/- 0.00901    0.021        100            100
Burn     0.008 +/- 0.00917    0.021        100            100
Write    0.057 +/- 0.487      9.06         100            100
Read     0.027 +/- 0.0561     0.566        100            100
Compile  0.036 +/- 0.223      4.72         100            100
Memload  0.138 +/- 0.898      15.6         100            100
--- Benchmarking simulated cpu of Video in the presence of simulated ---
Load     Latency +/- SD (ms)  Max Latency  % Desired CPU  % Deadlines Met
None     7.25 +/- 13.8        46.5         95.9           68.4
X        14.7 +/- 21.3        80           70.7           37.5
Burn     44.9 +/- 47.8        82.5         27.9           0.396
Write    9.91 +/- 17.2        76.6         83.4           57.3
Read     7.58 +/- 12.8        48.5         98             61.8
Compile  46.7 +/- 51.7        115          25.9           3.18
Memload  12.9 +/- 24.1        333          75             46.3
--- Benchmarking simulated cpu of X in the presence of simulated ---
Load     Latency +/- SD (ms)  Max Latency  % Desired CPU  % Deadlines Met
None     68.8 +/- 97.4        231          15.6           6.58
Video    122 +/- 178          411          7.97           3.21
Burn     283 +/- 355          675          12.3           2.73
Write    81.8 +/- 117         348          13.6           5.46
Read     77.6 +/- 110         259          13.7           5.55
Compile  304 +/- 380          675          12.6           2.76
Memload  90.1 +/- 132         448          11.2           4.59
--- Benchmarking simulated cpu of Gaming in the presence of simulated ---
Load     Latency +/- SD (ms)  Max Latency  % Desired CPU
None     230 +/- 232          262          30.3
Video    389 +/- 393          445          20.4
X        378 +/- 383          437          20.9
Burn     875 +/- 892          931          10.3
Write    278 +/- 283          453          26.5
Read     262 +/- 264          291          27.6
Compile  980 +/- 1001         1066         9.26
Memload  316 +/- 323          520          24
|
With patch http://bugzilla.kernel.org/show_bug.cgi?id=12309#c397 plus "clocksource=tsc"
Code: |
Using 787575 loops per ms, running every load for 30 seconds
Benchmarking kernel 2.6.30-zen2-31386-g752ddf5-dirty at datestamp 200908091528
--- Benchmarking simulated cpu of Audio in the presence of simulated ---
Load     Latency +/- SD (ms)  Max Latency  % Desired CPU  % Deadlines Met
None     0.003 +/- 0.00369    0.006        100            100
Video    0.003 +/- 0.00379    0.006        100            100
X        0.003 +/- 0.00674    0.136        100            100
Burn     0.003 +/- 0.00417    0.04         100            100
Write    0.046 +/- 0.411      7.44         100            100
Read     0.017 +/- 0.033      0.38         100            100
Compile  0.042 +/- 0.317      4.48         100            100
Memload  0.053 +/- 0.652      12.9         100            100
--- Benchmarking simulated cpu of Video in the presence of simulated ---
Load     Latency +/- SD (ms)  Max Latency  % Desired CPU  % Deadlines Met
None     0.003 +/- 0.00388    0.04         100            100
X        4.12 +/- 10.3        49.7         93.9           91.1
Burn     21 +/- 28.4          77           59.8           34.2
Write    0.476 +/- 3.46       49.9         99.6           98.4
Read     0.013 +/- 0.0347     0.408        100            100
Compile  23.9 +/- 32.6        80.2         49.4           31.3
Memload  1.75 +/- 8.37        107          94.9           92.6
--- Benchmarking simulated cpu of X in the presence of simulated ---
Load     Latency +/- SD (ms)  Max Latency  % Desired CPU  % Deadlines Met
None     20.6 +/- 37.4        107          26.9           18.5
Video    58.8 +/- 86.1        210          17.3           8.4
Burn     202 +/- 258          508          5.2            1.29
Write    31.3 +/- 53.2        208          20.5           12.8
Read     26.1 +/- 44.7        120          22.9           14.8
Compile  214 +/- 274          528          8.36           2.12
Memload  48.3 +/- 78.9        306          16.4           8.28
--- Benchmarking simulated cpu of Gaming in the presence of simulated ---
Load     Latency +/- SD (ms)  Max Latency  % Desired CPU
None     98.5 +/- 99.1        119          50.4
Video    198 +/- 199          229          33.6
X        185 +/- 188          228          35.1
Burn     490 +/- 496          506          16.9
Write    125 +/- 127          224          44.5
Read     113 +/- 113          129          47
Compile  540 +/- 547          643          15.6
Memload  135 +/- 143          348          42.5
|
With patch http://bugzilla.kernel.org/show_bug.cgi?id=12309#c397 plus "clocksource=acpi_pm"
Code: |
Using 787575 loops per ms, running every load for 30 seconds
Benchmarking kernel 2.6.30-zen2-31386-g752ddf5-dirty at datestamp 200908091940
--- Benchmarking simulated cpu of Audio in the presence of simulated ---
Load     Latency +/- SD (ms)  Max Latency  % Desired CPU  % Deadlines Met
None     0.007 +/- 0.00785    0.041        100            100
Video    0.044 +/- 0.537      7.94         100            100
X        0.006 +/- 0.00724    0.03         100            100
Burn     0.007 +/- 0.00724    0.018        100            100
Write    0.026 +/- 0.299      7.3          100            100
Read     0.02 +/- 0.0352      0.319        100            100
Compile  0.036 +/- 0.294      6.05         100            100
Memload  0.037 +/- 0.275      5.7          100            100
--- Benchmarking simulated cpu of Video in the presence of simulated ---
Load     Latency +/- SD (ms)  Max Latency  % Desired CPU  % Deadlines Met
None     0.006 +/- 0.00739    0.042        100            100
X        7.23 +/- 13.6        49.5         96             70.3
Burn     27.7 +/- 32.6        73.8         42.5           10.8
Write    1.5 +/- 6.12         48.9         99.1           94.2
Read     0.383 +/- 2.36       25.8         100            98.2
Compile  28.6 +/- 37.2        105          40.7           21.3
Memload  2.28 +/- 16.2        583          95.3           89.1
--- Benchmarking simulated cpu of X in the presence of simulated ---
Load     Latency +/- SD (ms)  Max Latency  % Desired CPU  % Deadlines Met
None     35.2 +/- 55.9        145          19             10.8
Video    70.7 +/- 103         256          14.7           6.66
Burn     225 +/- 286          549          9.72           2.46
Write    49.4 +/- 73.6        225          15.2           7.56
Read     46.1 +/- 68.2        160          15.8           7.67
Compile  239 +/- 303          555          10.7           2.73
Memload  55.3 +/- 80.3        220          18.1           8.46
--- Benchmarking simulated cpu of Gaming in the presence of simulated ---
Load     Latency +/- SD (ms)  Max Latency  % Desired CPU
None     134 +/- 134          148          42.8
Video    246 +/- 248          281          28.9
X        234 +/- 237          278          30
Burn     587 +/- 598          651          14.6
Write    158 +/- 161          269          38.7
Read     150 +/- 151          168          40
Compile  652 +/- 662          783          13.3
Memload  183 +/- 198          759          35.4
|
As you can see, there really are some improvements, but it's hard to call them considerable.
P.S. "clocksource=jiffies" just hangs the system. My HDD does not support NCQ. |
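To put numbers on the comparison, a small sketch that computes the relative change in average latency between two runs. The helper name `pct_change` is an assumption; the inputs are taken from the tables above. These are single 30-second runs with large standard deviations, so the percentages are only a rough indication.

```shell
#!/bin/sh
# Hypothetical helper: percent change from a baseline latency to a new one.
# A negative result means the latency went down.
pct_change() {
    LC_ALL=C awk -v old="$1" -v new="$2" \
        'BEGIN { printf "%.1f\n", (new - old) / old * 100 }'
}

# Gaming under Compile load, clocksource=acpi_pm: unpatched vs patched
pct_change 980 652     # prints -33.5
# Gaming under Compile load: unpatched (acpi_pm) vs patched (tsc)
pct_change 980 540     # prints -44.9
# Video under Write load, clocksource=acpi_pm: unpatched vs patched
pct_change 9.91 1.5    # prints -84.9
```

Comparing the two acpi_pm runs isolates the effect of the patch, since the tsc run changes the clocksource as well.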
|
|
|
wrc1944 Advocate
Joined: 15 Aug 2002 Posts: 3456 Location: Gainesville, Florida
|
Posted: Mon Aug 10, 2009 5:11 pm Post subject: |
|
|
For those having this problem, especially if they have SATA drives, it would probably be worth a shot to try the deadline scheduler instead of cfq.
Everything I've read over the last year or so seems to indicate there is still an I/O problem with cfq on some systems, and also that with SATA drives deadline is often a better scheduler than cfq. Kernel >=2.6.30-rc4 seemed to improve it somewhat (as mentioned), but I'm still sticking with deadline myself until I'm convinced this is really fixed with cfq.
You need deadline support enabled in your kernel (it probably already has it, but check your .config file). If not, you'll need to recompile your kernel and enable deadline, but if it does already have it, just append your grub kernel line with
and reboot.
If it makes a difference great, but if not, just remove the append. _________________ Main box- AsRock x370 Gaming K4
Ryzen 7 3700x, 3.6GHz, 16GB GSkill Flare DDR4 3200mhz
Samsung SATA 1000GB, Radeon HD R7 350 2GB DDR5
OpenRC Gentoo ~amd64 plasma, glibc-2.40-r5, gcc-14
kernel-6.11.3 USE=experimental python3_12.7-final-0 |
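The boot parameter itself did not survive in the post above. On 2.6-series kernels the usual one for this purpose is `elevator=deadline`, which is given here as an assumption about what was meant. The scheduler can also be switched per disk at runtime through sysfs, without a reboot. A minimal sketch, with hypothetical helper names and a `SYS_ROOT` override that exists only for dry runs:

```shell
#!/bin/sh
# SYS_ROOT defaults to /sys; overriding it is only meant for dry runs.
SYS_ROOT="${SYS_ROOT:-/sys}"

# Print the active scheduler for a disk. The kernel lists all available
# schedulers and marks the active one with brackets, e.g.
#   noop anticipatory deadline [cfq]
active_scheduler() {
    sed 's/.*\[\(.*\)\].*/\1/' "$SYS_ROOT/block/$1/queue/scheduler"
}

# Select a scheduler at runtime. Needs root, and the scheduler must be
# built into the kernel or loaded as a module.
set_scheduler() {
    echo "$2" > "$SYS_ROOT/block/$1/queue/scheduler"
}
```

`active_scheduler sda` shows what is in use now and `set_scheduler sda deadline` switches that disk over. The change lasts until the next reboot, which makes it convenient for a quick comparison before touching the grub line.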
|
|
|
Serchio n00b
Joined: 17 May 2008 Posts: 26
|
Posted: Mon Aug 10, 2009 6:49 pm Post subject: |
|
|
@wrc1944 Do you have deadline set as the default I/O scheduler in the kernel? |
|
|
|
wrc1944 Advocate
Joined: 15 Aug 2002 Posts: 3456 Location: Gainesville, Florida
|
Posted: Wed Aug 12, 2009 11:40 pm Post subject: |
|
|
Serchio,
Yes, I currently use deadline, but I also enable cfq whenever I compile a new kernel, so I can test it if I happen to hear of any promising fixes or patches. It never hurts to have the option built in, so you can append the grub kernel line for a different scheduler. _________________ Main box- AsRock x370 Gaming K4
Ryzen 7 3700x, 3.6GHz, 16GB GSkill Flare DDR4 3200mhz
Samsung SATA 1000GB, Radeon HD R7 350 2GB DDR5
OpenRC Gentoo ~amd64 plasma, glibc-2.40-r5, gcc-14
kernel-6.11.3 USE=experimental python3_12.7-final-0 |
|
|
|
|