criacow n00b
Joined: 08 Jan 2008 Posts: 1 Location: Vancouver, Canada
Posted: Tue Jan 08, 2008 10:36 pm Post subject: slow NFS degradation, 98% access calls |
Good afternoon!
We've got a cluster of Gentoo boxes. A central unit load-balances across four servers, each of which is an NFS client of a sixth box:
Code: |
load-balancer
/ | | \ <- traffic forwarding
box1 box2 box3 box4
\ | | / <- NFS mounts
central fileshare
|
At first, everything's great: the client does a GETATTR call, gets the reply, and the file is served.
Over the course of a week or two, more and more ACCESS calls precede every fetch, until it starts looking like this:
Code: |
No. Time Source Destination Protocol Info
367848 58.923514 192.168.1.13 192.168.1.1 NFS V3 ACCESS Call (Reply In 367849), FH:0xb85fb85d
367849 58.923785 192.168.1.1 192.168.1.13 NFS V3 ACCESS Reply (Call In 367848)
367850 58.923800 192.168.1.13 192.168.1.1 NFS V3 ACCESS Call (Reply In 367851), FH:0xb85fb85d
367851 58.923915 192.168.1.1 192.168.1.13 NFS V3 ACCESS Reply (Call In 367850)
367852 58.923928 192.168.1.13 192.168.1.1 NFS V3 ACCESS Call (Reply In 367853), FH:0x7bd67bd7
367853 58.924045 192.168.1.1 192.168.1.13 NFS V3 ACCESS Reply (Call In 367852)
367854 58.924059 192.168.1.13 192.168.1.1 NFS V3 ACCESS Call (Reply In 367855), FH:0xb85fb85d
367855 58.924173 192.168.1.1 192.168.1.13 NFS V3 ACCESS Reply (Call In 367854)
367856 58.924185 192.168.1.13 192.168.1.1 NFS V3 ACCESS Call (Reply In 367857), FH:0x7bd67bd7
367857 58.924301 192.168.1.1 192.168.1.13 NFS V3 ACCESS Reply (Call In 367856)
367858 58.924313 192.168.1.13 192.168.1.1 NFS V3 ACCESS Call (Reply In 367859), FH:0x637b7c7a
367859 58.924430 192.168.1.1 192.168.1.13 NFS V3 ACCESS Reply (Call In 367858)
367860 58.924444 192.168.1.13 192.168.1.1 NFS V3 ACCESS Call (Reply In 367861), FH:0xb85fb85d
367861 58.924558 192.168.1.1 192.168.1.13 NFS V3 ACCESS Reply (Call In 367860)
367862 58.924570 192.168.1.13 192.168.1.1 NFS V3 ACCESS Call (Reply In 367863), FH:0x7bd67bd7
367863 58.924687 192.168.1.1 192.168.1.13 NFS V3 ACCESS Reply (Call In 367862)
367864 58.924698 192.168.1.13 192.168.1.1 NFS V3 ACCESS Call (Reply In 367865), FH:0x637b7c7a
367865 58.924814 192.168.1.1 192.168.1.13 NFS V3 ACCESS Reply (Call In 367864)
367866 58.924826 192.168.1.13 192.168.1.1 NFS V3 ACCESS Call (Reply In 367867), FH:0xf3bdecbc
367867 58.924944 192.168.1.1 192.168.1.13 NFS V3 ACCESS Reply (Call In 367866)
367868 58.924964 192.168.1.13 192.168.1.1 NFS V3 ACCESS Call (Reply In 367869), FH:0xb85fb85d
367869 58.925078 192.168.1.1 192.168.1.13 NFS V3 ACCESS Reply (Call In 367868)
367870 58.925090 192.168.1.13 192.168.1.1 NFS V3 ACCESS Call (Reply In 367871), FH:0x7bd67bd7
367871 58.925211 192.168.1.1 192.168.1.13 NFS V3 ACCESS Reply (Call In 367870)
367872 58.925223 192.168.1.13 192.168.1.1 NFS V3 ACCESS Call (Reply In 367873), FH:0x637b7c7a
367873 58.925339 192.168.1.1 192.168.1.13 NFS V3 ACCESS Reply (Call In 367872)
367874 58.925351 192.168.1.13 192.168.1.1 NFS V3 ACCESS Call (Reply In 367875), FH:0xf3bdecbc
367875 58.925466 192.168.1.1 192.168.1.13 NFS V3 ACCESS Reply (Call In 367874)
367876 58.925478 192.168.1.13 192.168.1.1 NFS V3 GETATTR Call (Reply In 367877), FH:0xdbbbc4ba
367877 58.925593 192.168.1.1 192.168.1.13 NFS V3 GETATTR Reply (Call In 367876) Regular File mode:0644 uid:1000 gid:440
|
nfsstat -c starts to look like this:
Code: |
Client rpc stats:
calls retrans authrefrsh
369630277 866766 0
Client nfs v2:
null getattr setattr root lookup readlink
0 0% 0 0% 0 0% 0 0% 0 0% 0 0%
read wrcache write create remove rename
0 0% 0 0% 0 0% 0 0% 0 0% 0 0%
link symlink mkdir rmdir readdir fsstat
0 0% 0 0% 0 0% 0 0% 0 0% 0 0%
Client nfs v3:
null getattr setattr lookup access readlink
0 0% 1075664587 23% 38260 0% 4181249 0% -711993663 76% 0 0%
read write create mkdir symlink mknod
192202 0% 402150 0% 324530 0% 135 0% 0 0% 0 0%
remove rmdir rename link readdir readdirplus
327099 0% 0 0% 56 0% 0 0% 0 0% 225024 0%
fsstat fsinfo pathconf commit
4 0% 2 0% 0 0% 268642 0%
|
and ACCESS will eventually climb to 98% of all calls.
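A note on reading the nfsstat output above: the negative access count is most likely a 32-bit counter printed as a signed value. That is an assumption, since the post does not say which kernel or nfs-utils version is in use. A minimal Python sketch that reinterprets the posted figures under that assumption (`as_unsigned32` is a hypothetical helper name, not part of any tool):

```python
# Assumption: the counters shown by nfsstat are 32 bits wide, so a value past
# 2**31 - 1 prints as negative and a total past 2**32 wraps around.

def as_unsigned32(value: int) -> int:
    """Reinterpret a signed 32-bit reading as an unsigned 32-bit count."""
    return value % 2**32

# Non-zero NFS v3 client counters, copied from the nfsstat -c output in the post.
v3_ops = {
    "getattr": 1075664587,
    "setattr": 38260,
    "lookup": 4181249,
    "access": -711993663,
    "read": 192202,
    "write": 402150,
    "create": 324530,
    "mkdir": 135,
    "remove": 327099,
    "rename": 56,
    "readdirplus": 225024,
    "fsstat": 4,
    "fsinfo": 2,
    "commit": 268642,
}

unwrapped = {op: as_unsigned32(count) for op, count in v3_ops.items()}
total = sum(unwrapped.values())

# Whole-percent share of each operation, truncated like the posted table.
shares = {op: int(100 * count / total) for op, count in unwrapped.items()}

if __name__ == "__main__":
    print("access calls (unsigned):", unwrapped["access"])
    print("access share:", shares["access"], "%")
    print("getattr share:", shares["getattr"], "%")
    print("total modulo 2**32:", total % 2**32)
```

Under that reading, access comes to roughly 3.58 billion calls, the 76% and 23% shares match the posted table, and the per-operation total modulo 2^32 equals the `calls` figure on the rpc stats line, which suggests that counter has wrapped as well.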
I've searched through the forums and through Google, but can't find anything relevant. Any clue why this happens? Is there somewhere that filehandles get incorrectly cached, or something along those lines?
Thanks in advance!
-criacow |
guruvan Tux's lil' helper
Joined: 21 Aug 2007 Posts: 132
Posted: Thu Jan 10, 2008 9:00 am Post subject: |
Sounds like you're on the right track. I found one short discussion at http://www.scooter.cx/~mozbot/%23vesta-20050517-070000.xml where someone mentions a similar problem (not much to go on). From your log excerpt, it would seem that the filehandles are not being released. Maybe you can isolate it to certain types of files, to files that are accessed while a certain operation takes place, or to some reason that the clients aren't closing the files? Can you plot the filehandles and the access pattern? Is it a growing number of files that are each being accessed over and over, or does the number of accesses for each original file operation grow over time?
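One way to answer those two questions from a capture is sketched below in Python. It assumes the trace is exported as text in the same column layout as the excerpt in the first post; the function names are hypothetical, and the sample lines are copied from that excerpt.

```python
import re
from collections import Counter

# Matches NFS v3 call lines such as
# "... NFS V3 ACCESS Call (Reply In 367849), FH:0xb85fb85d"; replies do not match.
CALL_RE = re.compile(r"NFS\s+V3\s+(\w+)\s+Call\b.*?FH:(0x[0-9a-fA-F]+)")

def parse_calls(lines):
    """Yield (procedure, filehandle) for every NFS v3 call line."""
    for line in lines:
        match = CALL_RE.search(line)
        if match:
            yield match.group(1), match.group(2)

def access_calls_per_handle(lines):
    """Count ACCESS calls per filehandle: is the set of handles growing?"""
    return Counter(fh for proc, fh in parse_calls(lines) if proc == "ACCESS")

def access_runs(lines):
    """Length of each ACCESS run that precedes a non-ACCESS call:
    is the number of accesses per file operation growing?
    A trailing run with no following call is not reported."""
    runs, current = [], 0
    for proc, _ in parse_calls(lines):
        if proc == "ACCESS":
            current += 1
        else:
            runs.append(current)
            current = 0
    return runs

# A few lines copied from the capture in the first post.
SAMPLE = [
    "367848 58.923514 192.168.1.13 192.168.1.1 NFS V3 ACCESS Call (Reply In 367849), FH:0xb85fb85d",
    "367849 58.923785 192.168.1.1 192.168.1.13 NFS V3 ACCESS Reply (Call In 367848)",
    "367850 58.923800 192.168.1.13 192.168.1.1 NFS V3 ACCESS Call (Reply In 367851), FH:0xb85fb85d",
    "367852 58.923928 192.168.1.13 192.168.1.1 NFS V3 ACCESS Call (Reply In 367853), FH:0x7bd67bd7",
    "367876 58.925478 192.168.1.13 192.168.1.1 NFS V3 GETATTR Call (Reply In 367877), FH:0xdbbbc4ba",
]

if __name__ == "__main__":
    print(access_calls_per_handle(SAMPLE))
    print(access_runs(SAMPLE))
```

Running the two functions on captures taken a few days apart would show whether the per-handle counts or the run lengths are the part that grows.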
Simpler to fix, and sometimes easy to overlook:
Have you done a trace on the network to see the contents of the excess NFS packets? Do they have the right source addresses? In particular, is there any way the NFS server could resolve a box's address to the wrong interface (i.e. via DNS)? That could be a possible explanation for files not closing.
Can you find a way to reproduce this under a test load?
When in doubt, rip it out! Maybe roll up another fresh cluster box, with a freshly compiled toolchain, glibc, kernel, and NFS daemons (not necessarily upgraded).
_________________ Everything is broken......(b.dylan).
guruvan |