Accessing http://ptolemy.eecs.berkeley.edu/~cxh and http://ptolemy.eecs.berkeley.edu/~eal is slow.
If I use Firefox's Tools -> Web Developer -> Toggle Tools, then I can see that the GET of /~cxh/
is taking 43 seconds. Clicking on "Show Request Details" shows that the bulk of the request time is "waiting".
The same is true when running on a machine on campus.
Looking at the logs, if the trailing slash redirect from http://ptolemy.eecs.berkeley.edu/~cxh to http://ptolemy.eecs.berkeley.edu/~cxh/ typically takes a second.
180.76.15.149 - - [25/Feb/2016:21:35:25 -0800] "GET /~cxh HTTP/1.1" 301 338 180.76.15.156 - - [25/Feb/2016:21:35:26 -0800] "GET /~cxh/ HTTP/1.1" 200 26395However, around February 26, the delays of 43 seconds started to occur:
50.188.21.247 - - [26/Feb/2016:20:38:30 -0800] "GET /~cxh HTTP/1.1" 301 338 50.188.21.247 - - [26/Feb/2016:20:39:13 -0800] "GET /~cxh/ HTTP/1.1" 200 26395
Not much on the machine changed, though in the logs on Feb 24, messages about key ".htaccess" not found
started appearing:
Feb 24 15:38:05 moog automount[9038]: lookup_read_master: lookup(nisplus): couldn't locate nis+ table auto.master Feb 24 15:39:08 moog automount[9038]: umount_autofs_indirect: ask umount returned busy /home/eecs Feb 24 15:39:11 moog automount[26694]: lookup_read_master: lookup(nisplus): couldn't locate nis+ table auto.master Feb 24 15:40:26 moog automount[26694]: key ".htaccess" not found in map source(s). Feb 24 15:41:27 moog automount[26694]: key ".htaccess" not found in map source(s). Feb 24 15:55:51 moog automount[26694]: key ".htaccess" not found in map source(s).
/usr/local/apache/conf/extra/httpd-vhosts.conf
looks like:
# Settings for user home directories # # Required module: mod_userdir # # UserDir: The name of the directory that is appended onto a user's home # directory if a ~user request is received. Note that you must also set # the default access control for these directories, as in the example below. # UserDir public_html # # Control access to UserDir directories. The following is an example # for a site where these directories are restricted to read-only. # <Directory "/home/*/public_html"> AllowOverride FileInfo AuthConfig Limit Indexes Options MultiViews Indexes SymLinksIfOwnerMatch IncludesNoExec <Limit GET POST OPTIONS> Order allow,deny Allow from all </Limit> <LimitExcept GET POST OPTIONS> Order deny,allow Deny from all </LimitExcept> </Directory> # Ptolemy_Localization endWatching the network traffic from moog to home10 shows that moog sends a packet and then sends another one 43 seconds later
[root@moog etc]# tcpdump -v -u host home10 tcpdump: listening on eth0, link-type EN10MB (Ethernet), capture size 65535 bytes 13:18:16.026684 IP (tos 0x0, ttl 64, id 11414, offset 0, flags [DF], proto TCP (6), length 228) moog.eecs.berkeley.edu.1045667650 > naphome.EECS.Berkeley.EDU.nfs: 172 access fh[1:80000438:0:3d0:242a8ec1:80000438:0:3d0:242a8ec1:40:0:125:0] 001f 13:18:16.026969 IP (tos 0x0, ttl 61, id 32763, offset 0, flags [DF], proto TCP (6), length 176) naphome.EECS.Berkeley.EDU.nfs > moog.eecs.berkeley.edu.1045667650: reply ok 120 access attr: DIR 755 ids 6269/244 sz 16384 c 0003 13:18:16.027031 IP (tos 0x0, ttl 64, id 11415, offset 0, flags [DF], proto TCP (6), length 224) moog.eecs.berkeley.edu.1062444866 > naphome.EECS.Berkeley.EDU.nfs: 168 getattr fh[1:80000438:0:7bdbbc:2e01796a:80000438:0:3d0:242a8ec1:40:0:125:0] 13:18:16.027257 IP (tos 0x0, ttl 61, id 33019, offset 0, flags [DF], proto TCP (6), length 168) naphome.EECS.Berkeley.EDU.nfs > moog.eecs.berkeley.edu.1062444866: reply ok 112 getattr DIR 755 ids 6269/244 sz 12288 13:18:16.066805 IP (tos 0x0, ttl 64, id 11416, offset 0, flags [DF], proto TCP (6), length 52) moog.eecs.berkeley.edu.863 > naphome.EECS.Berkeley.EDU.nfs: Flags [.], cksum 0x5283 (correct), ack 931053800, win 5360, options [nop,nop,TS val 772890630 ecr 3250707893], length 0 13:18:59.074712 IP (tos 0x0, ttl 64, id 11417, offset 0, flags [DF], proto TCP (6), length 228) moog.eecs.berkeley.edu.1079222082 > naphome.EECS.Berkeley.EDU.nfs: 172 access fh[1:80000438:0:7bdbbc:2e01796a:80000438:0:3d0:242a8ec1:40:0:125:0] 001f 13:18:59.074930 IP (tos 0x0, ttl 61, id 58036, offset 0, flags [DF], proto TCP (6), length 176) naphome.EECS.Berkeley.EDU.nfs > moog.eecs.berkeley.edu.1079222082: reply ok 120 access attr: DIR 755 ids 6269/244 sz 12288 c 0003 13:18:59.074956 IP (tos 0x0, ttl 64, id 11418, offset 0, flags [DF], proto TCP (6), length 52) moog.eecs.berkeley.edu.863 > naphome.EECS.Berkeley.EDU.nfs: Flags [.], cksum 0x012e (correct), ack 125, win 5360, options [nop,nop,TS val 772933638 ecr 3250750941], length 0If I comment out:
# <Limit GET POST OPTIONS> # Order allow,deny # Allow from all # </Limit> # <LimitExcept GET POST OPTIONS> # Order deny,allow # Deny from all # </LimitExcept>and do
/etc/init.d/apachectl graceful
, then the time is much improved, but the problem still occurs.
The network traffic is different:
[root@moog etc]# tcpdump -v -u host home10 tcpdump: listening on eth0, link-type EN10MB (Ethernet), capture size 65535 bytes 13:22:09.618089 IP (tos 0x0, ttl 64, id 11554, offset 0, flags [DF], proto TCP (6), length 224) moog.eecs.berkeley.edu.2522062658 > naphome.EECS.Berkeley.EDU.nfs: 168 getattr fh[1:80000438:0:9f9fd6:2e01797d:80000438:0:3d0:242a8ec1:40:0:125:0] 13:22:09.618336 IP (tos 0x0, ttl 61, id 53910, offset 0, flags [DF], proto TCP (6), length 168) naphome.EECS.Berkeley.EDU.nfs > moog.eecs.berkeley.edu.2522062658: reply ok 112 getattr REG 755 ids 6269/30 sz 24535 13:22:09.618360 IP (tos 0x0, ttl 64, id 11555, offset 0, flags [DF], proto TCP (6), length 52) moog.eecs.berkeley.edu.863 > naphome.EECS.Berkeley.EDU.nfs: Flags [.], cksum 0x7691 (correct), ack 931348148, win 5881, options [nop,nop,TS val 773124181 ecr 3250941487], length 0 13:22:09.618683 IP (tos 0x0, ttl 64, id 11556, offset 0, flags [DF], proto TCP (6), length 224) moog.eecs.berkeley.edu.2538839874 > naphome.EECS.Berkeley.EDU.nfs: 168 getattr fh[1:80000438:0:dadd7d:2e01797e:80000438:0:3d0:242a8ec1:40:0:125:0] 13:22:09.618895 IP (tos 0x0, ttl 61, id 13891, offset 0, flags [DF], proto TCP (6), length 168) naphome.EECS.Berkeley.EDU.nfs > moog.eecs.berkeley.edu.2538839874: reply ok 112 getattr REG 644 ids 6269/30 sz 1738 13:22:09.638880 IP (tos 0x0, ttl 64, id 11557, offset 0, flags [DF], proto TCP (6), length 224) moog.eecs.berkeley.edu.2555617090 > naphome.EECS.Berkeley.EDU.nfs: 168 getattr fh[1:80000438:0:dadd7f:2e01797e:80000438:0:3d0:242a8ec1:40:0:125:0] 13:22:09.639132 IP (tos 0x0, ttl 61, id 6212, offset 0, flags [DF], proto TCP (6), length 168) naphome.EECS.Berkeley.EDU.nfs > moog.eecs.berkeley.edu.2555617090: reply ok 112 getattr REG 644 ids 6269/30 sz 312 13:22:09.678787 IP (tos 0x0, ttl 64, id 11558, offset 0, flags [DF], proto TCP (6), length 52) moog.eecs.berkeley.edu.863 > naphome.EECS.Berkeley.EDU.nfs: Flags [.], cksum 0x7400 (correct), ack 233, win 5881, options [nop,nop,TS val 773124242 ecr 3250941507], length 0
Interestingly, pages that are NFS mounted on robotics seem to be ok.
In addition, if rewrite-robotics.conf
is updated so that ~eal
and ~cxh
are available:
RewriteRule ^/~cxh$ http://robotics.eecs.berkeley.edu/~cxh/ [L] RewriteRule ^/~cxh/$ /home/eecs/cxh/public_html/index.html [L] RewriteRule ^/~cxh/(.*) /home/eecs/cxh/public_html/$1 [L] RewriteRule ^/~eal$ http://robotics.eecs.berkeley.edu/~eal/ [L] RewriteRule ^/~eal/$ /home/eecs/eal/public_html/index.html [L] RewriteRule ^/~eal/(.*) /home/eecs/eal/public_html/$1 [L]
then the problem does not occur with
http://robotics.eecs.berkeley.edu/~eal/ or
http://robotics.eecs.berkeley.edu/~cxh/.
One difference between the two sites is that the Ptolemy site has SSI enabled.
Just to provide another data point, I have not noticed anything like this on www.eecs since the switchover to the new NetApps.
It does sound more like an NFS problem than an Apache configuration problem. FYI, there is my automounter configuration for home directories on www.eecs (which is running Red Hat 7).
/home/eecs /usr/sww/share/etc/automount/maps/auto.home.eecs -soft,intr,bg,nolockHere is a sample 'mount' output that shows all the options.
home30.EECS.Berkeley.EDU:/home30/robm on /home/eecs/robm type nfs (rw,relatime,vers=3,rsize=65536,wsize=65536,namlen=255,soft,nolock,proto=tcp,timeo=600,retrans=2,sec=sys,mountaddr=169.229.3.163,mountvers=3,mountport=635,mountproto=udp,local_lock=all,addr=169.229.3.163)I've moved this ticket into our 'fileserver' queue and hope that Pathma will be able to work with you on the NFS issue. He will probably be curious about your NFS configuration so please send that along for a home that is experiencing the problem.
Editing
/etc/auto.master
and changing/home/eecs /etc/auto.hometo/home/eecs /etc/auto.home soft,intr,bg,nolockand then doing/etc/init.d/autofs reload
seemed to help?Email from Pathma
Thanks for the details, thats very helpful.
How reproducible is this? I tried it just now and it was as you said exactly 43s and then of course it was fast after that.
I have seen sessions hang in the past even with the old netapps, but I am not sure if its a storage issue.
On another note I am noticing that access from moog comes via various different IP address on the netapp. Now I had specified the IPs for each volume, so I am expecting at most 2 IPs for ptolemy, ie naphome and napcluster. However I am also seeing access via napvault and napproj which is non optimal.
I need to check to see how this is happening.