Gentoo Forums
AMD64 system slow/unresponsive during disk access (Part 2)
timeBandit
Bodhisattva

Joined: 31 Dec 2004
Posts: 2719
Location: here, there or in transit

Posted: Sat Sep 19, 2009 12:29 am    Post subject: AMD64 system slow/unresponsive during disk access (Part 2)

Continuation of AMD64 system slow/unresponsive during disk access....

For convenience, in case this changeover happened in mid-conversation, the above link refers to the last page of the prior topic, not the first.
_________________
Plants are pithy, brooks tend to babble--I'm content to lie between them.
Super-short f.g.o checklist: Search first, strip comments, mark solved, help others.
joi_
Apprentice

Joined: 28 Mar 2005
Posts: 171

Posted: Sun Sep 20, 2009 9:47 pm

there's a new thread at LKML about responsiveness under I/O load
http://lkml.org/lkml/2009/9/19/288
speak up there people!
lagalopex
Guru

Joined: 16 Oct 2004
Posts: 565

Posted: Mon Sep 21, 2009 12:04 pm

I tried the BFS scheduler, but my usb system would afterwards not recognize any new devices.
So I have to wait a bit longer...

In the lkml thread they mentioned:
Code:
for i in /sys/block/sd?/queue/iosched/quantum ; do echo 1 > "$i" ; done

Quote:
quantum:
The amount of requests we select for dispatch when the driver asks for work to do and the current pending list is empty.
Default is 4.

Will now use it, but looks good so far.
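
The one-liner above overwrites quantum without showing what was there before. A minimal sketch that prints the old value first, so it can be put back later. The function name set_quantum and the optional sysfs-root argument (there only so the function can be tried on a scratch directory) are assumptions for illustration, not something from the LKML thread. The quantum file is only present when the active I/O scheduler provides it, so disks without it are skipped.

```shell
#!/bin/sh
# Hypothetical helper: set the "quantum" tunable on every sd? disk,
# printing the previous value so it can be restored later.
# $1 = new value, $2 = sysfs root (assumption: defaults to /sys).
set_quantum() {
    new=$1
    root=${2:-/sys}
    for f in "$root"/block/sd?/queue/iosched/quantum; do
        # Skip if the glob matched nothing or the file is not writable.
        [ -w "$f" ] || continue
        old=$(cat "$f")
        echo "$new" > "$f"
        echo "$f: $old -> $new"
    done
}

# Usage (as root): set_quantum 1
```

Like the original loop, this does not survive a reboot; it would have to be run again from a local start-up script.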
kernelOfTruth
Watchman

Joined: 20 Dec 2005
Posts: 6111
Location: Vienna, Austria; Germany; hello world :)

Posted: Mon Sep 21, 2009 1:43 pm

lagalopex wrote:
I tried the BFS scheduler, but my usb system would afterwards not recognize any new devices.
So I have to wait a bit longer...

In the lkml thread they mentioned:
Code:
for i in /sys/block/sd?/queue/iosched/quantum ; do echo 1 > "$i" ; done

Quote:
quantum:
The amount of requests we select for dispatch when the driver asks for work to do and the current pending list is empty.
Default is 4.

Will now use it, but looks good so far.


8O

I thought the higher the better (I'm currently using 64 or sometimes 8) but I will try 1

thanks !
_________________
https://github.com/kernelOfTruth/ZFS-for-SystemRescueCD/tree/ZFS-for-SysRescCD-4.9.0
https://github.com/kernelOfTruth/pulseaudio-equalizer-ladspa

Hardcore Gentoo Linux user since 2004 :D
kernelOfTruth
Watchman

Joined: 20 Dec 2005
Posts: 6111
Location: Vienna, Austria; Germany; hello world :)

Posted: Thu Sep 24, 2009 10:02 pm

all of those recommendations to drastically lower dirty_background_ratio, dirty_ratio, quantum, etc. seem to have made it orders of magnitude worse !

some new stuff:

1) for this to work you need to switch to CFQ (if you use zen-sources or a kernel which has support for BFQ, better use that)

Code:
for i in /sys/block/sd*; do
    # use "cfq" here if your kernel has no BFQ support
    /bin/echo "bfq" > "$i/queue/scheduler"
done


2) if you have buggy ncq-implementation in your drive-firmware (Seagate - I'm looking at you :P )

set slice_idle to "0" - you don't need this with newer kernels and/or BFQ

so the code would look like:

Code:
for i in /sys/block/sd*; do
    /bin/echo "bfq" > "$i/queue/scheduler"
    # default: 6; 0 fixes low throughput with drives that have a buggy NCQ implementation
    /bin/echo "0" > "$i/queue/iosched/slice_idle"
done


also enable:

# "the following practically disables NCQ which is buggy"
# "on a lot of drives and known to drive CFQ and other i/o schedulers crazy :D"
Code:
for i in /sys/block/sd*; do
    /bin/echo "2" > "$i/device/queue_depth"
done


(you can also set it to "1" - 0 doesn't work)

3)

set up portage

Code:
PORTAGE_NICENESS=19
PORTAGE_IONICE_COMMAND="ionice -c 3 -p \${PID}"


in /etc/make.conf

setting up your daemons

execute:

Code:
for i in $(pidof kjournald) ; do ionice -c3 -p "$i" ; done
for i in $(pidof kjournald2) ; do ionice -c3 -p "$i" ; done
for i in $(pidof pdflush) ; do renice 10 "$i" ; done
for i in $(pidof kcryptd) ; do ionice -c1 -p "$i" ; done



if the above ionice settings break anything (if at all), please report !

4) renice X or some of X

Code:
for i in $(pidof X) ; do renice -10 "$i" ; done
for i in $(pidof kwin) ; do renice -10 "$i" ; done

...


this doesn't fix all hiccups for me but improves interactivity A LOT

thanks for reading until now !

enjoy ! :)
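
The tunables from steps 1) and 2) are easier to compare if they are listed before and after changing them. A minimal read-only sketch, assuming the same /sys/block/sd* layout used above; the function name show_disk_tunables and the optional sysfs-root argument are made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: print scheduler, slice_idle and queue_depth per disk.
# $1 = sysfs root (assumption: defaults to /sys; a scratch directory works too).
show_disk_tunables() {
    root=${1:-/sys}
    for d in "$root"/block/sd*; do
        [ -d "$d" ] || continue
        name=${d##*/}
        for t in queue/scheduler queue/iosched/slice_idle device/queue_depth; do
            if [ -r "$d/$t" ]; then
                echo "$name $t: $(cat "$d/$t")"
            else
                # slice_idle is absent unless the active scheduler provides it.
                echo "$name $t: n/a"
            fi
        done
    done
}

# Usage: show_disk_tunables
```

Nothing is written, so it can be run as a normal user.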
mamunata
Apprentice

Joined: 30 Nov 2004
Posts: 169

Posted: Fri Sep 25, 2009 11:50 am

I've been following this topic for a while, because I have a laptop with a Turion64 CPU running 64-bit Gentoo, and I/O performance is terrible: system load climbs to 10, or even 15-20, under heavy disk usage, and the system becomes completely unusable.
I've got the impression that there are various "tricks" that reduce the problem, and many of them are for SATA (sd*) hard disks (with or without NCQ), but my laptop has an IDE hard disk and the problem is fully present.
So am I missing something, or is the problem independent of the hard disk type and inherent to all HDDs?
Elv13
Guru

Joined: 13 Nov 2005
Posts: 388
Location: Socialist land of North America

Posted: Sat Sep 26, 2009 8:20 pm

Try kernel 2.6.30-rc5: it falls just after a patch that really worked well and before a redesign that broke things again. I use SATA drives, but it may work for you too.
sidamos
Apprentice

Joined: 16 Dec 2007
Posts: 246

Posted: Sat Sep 26, 2009 10:09 pm

I am having the same problem since 2.6.30 and I also do not have SATA. 2.6.29 works better. But this is not fixed in 2.6.31, as far as I understood.
kernelOfTruth
Watchman

Joined: 20 Dec 2005
Posts: 6111
Location: Vienna, Austria; Germany; hello world :)

Posted: Sun Sep 27, 2009 2:53 pm

mamunata wrote:
I've been following this topic for a while, because I have a laptop with a Turion64 CPU running 64-bit Gentoo, and I/O performance is terrible: system load climbs to 10, or even 15-20, under heavy disk usage, and the system becomes completely unusable.
I've got the impression that there are various "tricks" that reduce the problem, and many of them are for SATA (sd*) hard disks (with or without NCQ), but my laptop has an IDE hard disk and the problem is fully present.
So am I missing something, or is the problem independent of the hard disk type and inherent to all HDDs?


there is more than one problem here:

*) the low throughput is related to buggy firmware on several SATA drives with NCQ

*) the lagging / bad desktop interactivity seems to be specific to the x86_64 / amd64 architecture; CFQ and CFS (the CPU scheduler) seem to be involved

there are more:

*) extremely low throughput with enabled barriers

*) ...
sidamos
Apprentice

Joined: 16 Dec 2007
Posts: 246

Posted: Sun Sep 27, 2009 3:11 pm

sidamos wrote:
I am having the same problem since 2.6.30 and I also do not have SATA. 2.6.29 works better. But this is not fixed in 2.6.31, as far as I understood.


Additional info: I am running AMD64X2 CPU with 32 bit Gentoo.
krinn
Watchman

Joined: 02 May 2003
Posts: 7470

Posted: Sun Sep 27, 2009 3:16 pm

sidamos wrote:
sidamos wrote:
I am having the same problem since 2.6.30 and I also do not have SATA. 2.6.29 works better. But this is not fixed in 2.6.31, as far as I understood.


Additional info: I am running AMD64X2 CPU with 32 bit Gentoo.


So the problem goes beyond the AMD64 (64-bit) version? Maybe the title should be changed, then... As it stands, it seems to refer to the AMD64 arch and not the CPU.
Cyker
Veteran

Joined: 15 Jun 2006
Posts: 1746

Posted: Sun Sep 27, 2009 7:08 pm

krinn wrote:
sidamos wrote:
sidamos wrote:
I am having the same problem since 2.6.30 and I also do not have SATA. 2.6.29 works better. But this is not fixed in 2.6.31, as far as I understood.


Additional info: I am running AMD64X2 CPU with 32 bit Gentoo.


So the problem goes beyond the AMD64 (64-bit) version? Maybe the title should be changed, then... As it stands, it seems to refer to the AMD64 arch and not the CPU.


Well, on going from 2.6.28 to 2.6.30, I've been experiencing the total opposite!

I'm running standard 32-bit x86 on an Opteron 180 and am getting much lower iowait and much better throughput on my HDs and network connections.
nero37
Tux's lil' helper

Joined: 15 Jun 2007
Posts: 141
Location: Ireland

Posted: Mon Sep 28, 2009 11:40 am

kernelOfTruth wrote:
4) renice X or some of X

Code:
for i in `pidof X` ; do renice -10 $i ; done
for i in `pidof kwin` ; do renice -10 $i ; done


Where is the best place to have this run automatically at start-up? The window manager starts too late to renice it from an init.d script, and since renice needs root privileges here, it can't easily be done after login.
lagalopex
Guru

Joined: 16 Oct 2004
Posts: 565

Posted: Tue Sep 29, 2009 2:01 pm

Anybody else running bfs?
I am currently running a 2.6.31.1 with the bfs-240 patch applied.
It now works for me and is pretty responsive as far as I can tell ;)
kernelOfTruth
Watchman

Joined: 20 Dec 2005
Posts: 6111
Location: Vienna, Austria; Germany; hello world :)

Posted: Wed Sep 30, 2009 4:50 pm

lagalopex wrote:
Anybody else running bfs?
I am currently running a 2.6.31.1 with the bfs-240 patch applied.
It now works for me and is pretty responsive as far as I can tell ;)


++

anyone who says that this isn't the fault of the I/O scheduler / the VFS / the CPU scheduler (CFS) has NO idea :twisted:

the most responsive kernel for me (without BFS) so far was a 2.6.24.something based zen-kernel

could have been that it had RSDL applied / ported - unfortunately I don't have the patch or sources for it anymore ...

anyways:

try 2.6.31-zen2 with BFS (build 240) enabled, there's almost no lagging anymore - additionally the desktop is much much more responsive :D
lagalopex
Guru

Joined: 16 Oct 2004
Posts: 565

Posted: Thu Oct 01, 2009 9:43 am

With bfs240 I again had a keyboard issue, so I switched to 2.6.32-rc1. As stated in "Kernel 2.6.30/31 desktop interactivity patch (no BFS)", some improvements for CFS got merged into mainline. It's better, IMO...

bfs300 is now available... will try it the next days...

Update: Now with bfs302 as the 300 still had a keyboard issue for me. ck noted that switching to evdev helps... I will see...

Update2: bfs303 and ck ran a new test against vanilla-2.6.32-rc3. The rc is doing well ;)
MageSlayer
Apprentice

Joined: 26 Jul 2007
Posts: 253
Location: Ukraine

Posted: Sun Oct 11, 2009 7:04 am

I can confirm that the keyboard problem under Xorg + BFS is gone after switching to evdev.
evdev 2.1.3 is the only version that does not have buggy key repetition, though.
Back to top
View user's profile Send private message
Lepaca Kliffoth
l33t
l33t


Joined: 28 Apr 2004
Posts: 737
Location: Florence, Italy

PostPosted: Mon Oct 12, 2009 7:29 pm    Post subject: Reply with quote

BFS solves the problem completely for me. It doesn't just make it better: under the kind of loads that used to cause problems even with the few kernel versions that were better, the mouse jerkiness and frame skipping in videos simply don't appear.

Things that used to be problematic and now are fine:

- copy stuff over SMB
- check out linux git sources
- paludis -s

It is now impossible for me to reproduce the problem.

I'm using the latest zen-sources.
_________________
It isn't enough to win - everyone else must lose, and you also have to rub it in their face (maybe chop off an arm too for good measure).
Animebox!
Sujao
l33t

Joined: 25 Sep 2004
Posts: 677
Location: Germany

Posted: Wed Oct 21, 2009 5:23 pm

Lepaca Kliffoth wrote:
BFS solves the problem completely for me. It doesn't just make it better: under the kind of loads that used to cause problems even with the few kernel versions that were better, the mouse jerkiness and frame skipping in videos simply don't appear.

Things that used to be problematic and now are fine:

- copy stuff over SMB
- check out linux git sources
- paludis -s

It is now impossible for me to reproduce the problem.

I'm using the latest zen-sources.


Does it also solve the dd if=/dev/zero of=dump problem where the system locks up as soon as RAM is full?

Where can I get zen-sources? zen-sources.org seems to be down.
forkboy
Apprentice

Joined: 24 Nov 2004
Posts: 200
Location: Blackpool, UK

Posted: Wed Oct 21, 2009 10:42 pm

Quote:
Where can I get zen-sources? zen-sources.org seems to be down.

See www.zen-kernel.org and this thread.
wazoo42
Apprentice

Joined: 13 Apr 2004
Posts: 165

Posted: Wed Oct 28, 2009 1:20 am

They are available in portage (my portage tree shows .31-r4 is the latest one), as well as from the overlay, zen-sources. I found BFS helps, but I'm not sure it eliminated the problem. Then again, I haven't done any sort of concrete testing...
Sujao
l33t

Joined: 25 Sep 2004
Posts: 677
Location: Germany

Posted: Wed Oct 28, 2009 4:59 am

8O ... no can't be true .... :? ..... :) ... :D .... 8O ... :twisted: Oh my God. It actually seems to work. I tried out the things that have been annoying me for almost 2 years now and my system remained responsive. Usually, when I write a big (>5GB) file to the hard drive with mmg, or copy such files, the click response time in most applications is between 10 and 30s. The mmg window doesn't get refreshed until it has finished after several minutes, and videos don't play properly. Now I can surf easily with Firefox, even play a movie with no visible delay, and the mmg window updates as expected. I'll report back if something changes....I still can't really believe that this is true.

EDIT: It's a shame the vanilla kernel doesn't accomplish this. Thanks to the Zen guys! I love you right now!

EDIT2: Moving an 8GB file still increases the response time of applications. For example, the smplayer open-file dialog reacts after 3-4s, but this is acceptable. Still much better than 30s. And video still plays smoothly.
mamunata
Apprentice

Joined: 30 Nov 2004
Posts: 169

Posted: Wed Oct 28, 2009 5:18 pm

Sujao wrote:
8O ... no can't be true .... :? ..... :) ... :D .... 8O ... :twisted: Oh my God. It actually seems to work. I tried out the things that have been annoying me for almost 2 years now and my system remained responsive. Usually, when I write a big (>5GB) file to the hard drive with mmg, or copy such files, the click response time in most applications is between 10 and 30s. The mmg window doesn't get refreshed until it has finished after several minutes, and videos don't play properly. Now I can surf easily with Firefox, even play a movie with no visible delay, and the mmg window updates as expected. I'll report back if something changes....I still can't really believe that this is true.

EDIT: It's a shame the vanilla kernel doesn't accomplish this. Thanks to the Zen guys! I love you right now!

EDIT2: Moving an 8GB file still increases the response time of applications. For example, the smplayer open-file dialog reacts after 3-4s, but this is acceptable. Still much better than 30s. And video still plays smoothly.

Can you post which options you compiled the kernel with? I'm running zen-kernel-2.6.31-r4 and performance is not much better.
I've noticed (but I'm not 100% sure) that if I leave my PC alone and don't work on it for a couple of hours, the PC is very slow when I get back.
kernelOfTruth
Watchman

Joined: 20 Dec 2005
Posts: 6111
Location: Vienna, Austria; Germany; hello world :)

Posted: Wed Oct 28, 2009 6:10 pm

mamunata wrote:
Sujao wrote:
8O ... no can't be true .... :? ..... :) ... :D .... 8O ... :twisted: Oh my God. It actually seems to work. I tried out the things that have been annoying me for almost 2 years now and my system remained responsive. Usually, when I write a big (>5GB) file to the hard drive with mmg, or copy such files, the click response time in most applications is between 10 and 30s. The mmg window doesn't get refreshed until it has finished after several minutes, and videos don't play properly. Now I can surf easily with Firefox, even play a movie with no visible delay, and the mmg window updates as expected. I'll report back if something changes....I still can't really believe that this is true.

EDIT: It's a shame the vanilla kernel doesn't accomplish this. Thanks to the Zen guys! I love you right now!

EDIT2: Moving an 8GB file still increases the response time of applications. For example, the smplayer open-file dialog reacts after 3-4s, but this is acceptable. Still much better than 30s. And video still plays smoothly.

Can you post which options you compiled the kernel with? I'm running zen-kernel-2.6.31-r4 and performance is not much better.
I've noticed (but I'm not 100% sure) that if I leave my PC alone and don't work on it for a couple of hours, the PC is very slow when I get back.


you need to select BFS (as a CPU scheduler instead of the stock-scheduler CFS) and select BFQ for the i/o-scheduler

kernel-config:
Code:
# CONFIG_CPU_CFS is not set
CONFIG_CPU_BFS=y
CONFIG_CPU_BFS_AUTOISO=y
# CONFIG_BFS_CUSTOM_RR is not set



Code:
for i in /sys/block/sd*; do
         /bin/echo "bfq" >  "$i/queue/scheduler"
done


the following might also provide you with additional throughput:

Code:
for i in /sys/block/sd*; do
         /bin/echo "256" >  "$i/queue/read_ahead_kb"
         /bin/echo "192" >  "$i/queue/max_sectors_kb"
         /bin/echo "1"   >  "$i/queue/rq_affinity"
         /bin/echo "0"   >  "$i/queue/nomerges"
done
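
Since these four values are overwritten in place, it may help to record the current ones first. A minimal sketch that prints the echo commands needed to put the present values back; the function name snapshot_queue_tunables, the optional sysfs-root argument and the /root/restore-queue.sh path in the usage note are assumptions for illustration only.

```shell
#!/bin/sh
# Hypothetical helper: emit shell commands that restore the current values
# of the four queue tunables, so an experiment can be undone.
# $1 = sysfs root (assumption: defaults to /sys).
snapshot_queue_tunables() {
    root=${1:-/sys}
    for d in "$root"/block/sd*; do
        for t in read_ahead_kb max_sectors_kb rq_affinity nomerges; do
            f=$d/queue/$t
            # Skip tunables this kernel does not expose.
            [ -r "$f" ] || continue
            printf "echo %s > '%s'\n" "$(cat "$f")" "$f"
        done
    done
}

# Usage (as root, before changing anything):
#   snapshot_queue_tunables > /root/restore-queue.sh
# and later, to undo the experiment:
#   sh /root/restore-queue.sh
```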

Sujao
l33t

Joined: 25 Sep 2004
Posts: 677
Location: Germany

Posted: Wed Oct 28, 2009 6:13 pm

Well, it's not as good as in the first test. This time I had to hash a ~6GB file and there were big delays. I had a Flash video running in Firefox, another movie in PAL, and the hash. Firefox didn't react to clicks for more than 1 min, until the hash was finished. The Flash video ran with no problems. The movie stopped about every 15s for 1-2s, but I think it was much worse with the old kernel: the movie would freeze completely and only run for 1-2s every 15s.

Overall it seems to be better but I didn't make any identical benchmarks. I simply try things that used to lock my system. So maybe my first test was somehow different and that is why it ran so flawlessly.

My kernel configuration (The forum doesn't seem to accept such a long text):
http://pastebin.com/f78f5fec1

EDIT:
  • CONFIG_CPU_BFS_AUTOISO=y was not set for me, recompiling at the moment. What does it do exactly? Almost no help available. Scheduler for the X-Server?
  • Code:
    for i in /sys/block/sd*; do
             /bin/echo "bfq" >  "$i/queue/scheduler"
    done

    Was not necessary for me. All drives have the same output:
    Code:
    # cat /sys/block/sda/queue/scheduler
    noop fifo anticipatory deadline cfq vr [bfq]

  • mamunata wrote:
    I've noticed (but I'm not 100% sure) that if I leave my PC alone and don't work on it for a couple of hours, the PC is very slow when I get back.

    I don't think this is caused by a faulty I/O scheduler. I see the same behaviour. I think desktop applications that haven't been accessed for a while are simply pushed out of RAM, so when you come back their libraries have to be reread from the hard drive. Can anyone confirm that?
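
In the scheduler file shown above, the active scheduler is the name in square brackets. A small sketch for pulling just that name out, e.g. to check every disk from a script; the function name active_scheduler is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: print the active I/O scheduler, i.e. the name in
# square brackets, given the contents of a queue/scheduler file on stdin.
# Prints nothing if no name is bracketed.
active_scheduler() {
    sed -n 's/.*\[\([^]]*\)].*/\1/p'
}

# Usage: active_scheduler < /sys/block/sda/queue/scheduler
```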


EDIT2: OK, seriously.. :x .... Sorry guys. This is the second time I thought the problem was kind of solved. Now I did three "comparable" benchmarks with (1) my old 2.6.30-gentoo-r1, (2) linux-2.6.31-zen4 with CONFIG_CPU_BFS_AUTOISO=n and (3) linux-2.6.31-zen4 with CONFIG_CPU_BFS_AUTOISO=y. The benchmark: dumping a file from hdd A to hdd B, viewing a movie from hdd A and hdd C, viewing the same Flash video, and hashing a file from hdd A and hdd B with sha256sum.

The result (just my subjective perception): 2 and 3 make no difference; 1 is a little worse, but not as much worse as I would have expected. I think I need a more detailed automatic benchmark with measurable results to make further statements, as my previous claims turned out to be false.
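
One way to get measurable numbers instead of a subjective impression is to time a small synchronous write repeatedly, once on an idle system and once while the heavy copy or hash job is running. This is only a rough sketch, not an established benchmark: the function name probe_latency and its arguments are made up, and it assumes GNU date (for %N) and GNU dd (for conv=fsync).

```shell
#!/bin/sh
# Hypothetical probe: time small write+fsync operations and print the
# elapsed milliseconds of each, one per line.
# $1 = directory on the disk under test
# $2 = number of samples (default 10)
# $3 = pause between samples in seconds (default 1)
probe_latency() {
    dir=$1
    count=${2:-10}
    pause=${3:-1}
    i=0
    while [ "$i" -lt "$count" ]; do
        start=$(date +%s%N)
        # Write one 4 KiB block and force it to disk before returning.
        dd if=/dev/zero of="$dir/probe.$$" bs=4k count=1 conv=fsync 2>/dev/null
        end=$(date +%s%N)
        echo $(( (end - start) / 1000000 ))
        i=$((i + 1))
        sleep "$pause"
    done
    rm -f "$dir/probe.$$"
}

# Usage: probe_latency /mnt/hddA 30 > idle.txt
#        (start the copy/hash job, then) probe_latency /mnt/hddA 30 > loaded.txt
```

Comparing the two files for each kernel gives at least a crude number per configuration; /mnt/hddA is a placeholder path.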
All times are GMT