[Linux-cluster] Httpd Process io blocked

Marc Grimme grimme at atix.de
Tue Mar 7 12:43:20 UTC 2006


Sebastien,
On Tuesday 07 March 2006 12:35, Sébastien DIDIER wrote:
> 2006/3/7, Marc Grimme <grimme at atix.de>:
> > Hi,
> > to debug you could use strace. E.g. executing strace -p 14970 will
> > probably show you that the process is waiting for a lock, as the ps
> > output already does. My first guess would be that you use apache with
> > php and sessions.
>
> Thanks. But strace doesn't output anything and becomes immune to Ctrl-C. It
> needs a SIGKILL to exit, and the traced process stays in T state. It
> seems that it can't retrieve the last system call while the process
> is in D state.
Hmm, that sounds familiar. If you trace the root httpd with -f
and -t and look out for large gaps between timestamps, you'll probably find
processes waiting for locks. The D state is a good indicator (ps ax | grep " D "
and look at the pids). Do the pids of the D processes change from time to time,
or do they stay the same?
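
To watch whether the D-state pids change, something like the following could be run a few times in a row. It is only a rough sketch, assuming a Linux /proc that exposes /proc/<pid>/stat and /proc/<pid>/wchan; the function names are my own and not part of any tool.

```python
import os

def parse_stat(stat_text):
    """Return (pid, comm, state) from the contents of /proc/<pid>/stat."""
    # comm sits in parentheses and may contain spaces or ')', so split on
    # the last ')' instead of on whitespace.
    head, _, tail = stat_text.rpartition(")")
    pid, _, comm = head.partition(" (")
    return int(pid), comm, tail.split()[0]

def read_wchan(proc_root, pid):
    """Return the kernel wait channel of a process, or '?' if unreadable."""
    try:
        with open(os.path.join(proc_root, str(pid), "wchan")) as f:
            return f.read().strip() or "-"
    except OSError:
        return "?"

def blocked_processes(proc_root="/proc"):
    """Yield (pid, comm, wchan) for processes in uninterruptible sleep (D)."""
    if not os.path.isdir(proc_root):
        return
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue
        try:
            with open(os.path.join(proc_root, entry, "stat")) as f:
                pid, comm, state = parse_stat(f.read())
        except (OSError, ValueError, IndexError):
            continue  # process exited in the meantime, or unexpected format
        if state == "D":
            yield pid, comm, read_wchan(proc_root, pid)

if __name__ == "__main__":
    for pid, comm, wchan in blocked_processes():
        print(pid, comm, wchan)
```

If the same pids show up with the same wait channel on every run, those processes are really stuck and not just briefly waiting.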
>
> > If so, the phplib uses flocks for locking the session-ids. Normally
> > one process locks a session. If another process comes along
> > to get an flock on that session, it has to wait until the earlier flock is
> > released. It very often happens that the other process gets that flock when
> > the client and session are not available any more. Then the flock is held
> > until the apache process times out.
>
> I don't think it is session related, because I store session files
> outside the GFS mount point (in /tmp) and I run a load balancer based
> upon the source address (to always send requests to the same server and
> thus keep sessions)
Yes, I agree. Sessions get lost if the node fails, right?
>
> But, we are using mysql query caching (with some libraries like AdoDb)
> inside the GFS mount point. Do you think it could be the cache files
> which are dead-locked ?
It depends on how those files are locked, and on how and when the locks are set
and released. If a lock is set when the apache child is forked and released
when the process terminates, then yes, that could happen. If only the accesses
to the data in those files are protected with flocks, then it should perform
quite well.
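
To illustrate the second, well-behaved case: the lock is taken just before the cache file is read or written and dropped right afterwards, so it is never held for the lifetime of an apache child. This is only a sketch in Python with made-up helper names (locked, write_cache, read_cache). It is not how AdoDb does its caching, which I have not checked.

```python
import fcntl
from contextlib import contextmanager

@contextmanager
def locked(path, mode="r", exclusive=False):
    """Open `path` and hold an flock on it only inside the `with` body."""
    with open(path, mode) as f:
        fcntl.flock(f, fcntl.LOCK_EX if exclusive else fcntl.LOCK_SH)
        try:
            yield f
        finally:
            # Push buffered data to the file before anyone else can lock it.
            f.flush()
            fcntl.flock(f, fcntl.LOCK_UN)

def write_cache(path, data):
    """Replace the cache file contents under an exclusive lock."""
    # Append mode, so the file is not truncated before the lock is held.
    with locked(path, "a", exclusive=True) as f:
        f.truncate(0)
        f.write(data)

def read_cache(path):
    """Read the cache file under a shared lock."""
    with locked(path, "r") as f:
        return f.read()
```

With this pattern a waiter is blocked only for the duration of one read or write. A lock taken at fork time would instead block every other process until the child goes away.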

Is that query caching part of perl-adodb, or did you implement it yourselves?

Have a look and play with strace, and watch out for large time gaps and the
syscalls associated with them. I would expect you to end up with
flock timeouts.
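
Picking the large gaps out of a long trace by eye is tedious, so here is a small sketch that does it. It assumes the trace was written with something like strace -f -t -o httpd.trace -p 4426 (the file name is just an example), so each line starts with an optional pid and an HH:MM:SS timestamp. The 5 second threshold is an arbitrary choice, and the function name is my own.

```python
import re

LINE_RE = re.compile(
    r"^(?:\[pid\s+(\d+)\]\s+|(\d+)\s+)?"    # optional pid prefix from -f
    r"(\d\d):(\d\d):(\d\d)(?:\.(\d+))?\s+"  # wall-clock time from -t / -tt
    r"(.*)$"                                # the syscall text
)

def slow_calls(lines, threshold=5.0):
    """Return (pid, gap_seconds, call_text) for each gap >= threshold.

    The gap is measured between two consecutive lines of the same pid and
    attributed to the earlier line, i.e. the call that was in progress.
    """
    last = {}   # pid -> (timestamp in seconds, text of that pid's last line)
    found = []
    for line in lines:
        m = LINE_RE.match(line.strip())
        if not m:
            continue
        pid = m.group(1) or m.group(2) or "-"
        now = int(m.group(3)) * 3600 + int(m.group(4)) * 60 + int(m.group(5))
        now += float("0." + m.group(6)) if m.group(6) else 0.0
        if pid in last:
            prev_time, prev_text = last[pid]
            gap = now - prev_time
            if gap < 0:          # trace ran across midnight
                gap += 86400
            if gap >= threshold:
                found.append((pid, gap, prev_text))
        last[pid] = (now, m.group(7))
    return found

# Example: for pid, gap, call in slow_calls(open("httpd.trace")): print(pid, gap, call)
```

Note that with -t the gap also includes whatever the process did in user space between the two calls, so treat the result as a hint for where to look, not as an exact measurement.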

Hope that helps,
regards Marc.
>
> > We have made a patch for better locking with php, which you can find at
> > http://www.open-sharedroot.org in the downloads section.
> > Hope that helps
> > Regards Marc.
> >
> > On Tuesday 07 March 2006 11:50, Sébastien DIDIER wrote:
> > > Hi,
> > >
> > > I'm running a two-node GFS cluster which hosts web sites. The GFS
> > > partition is on an iSCSI device and, for now, I'm using manual fencing.
> > >
> > > Today, I got 5 httpd processes on both nodes which got stuck in IO
> > > blocking state. I suspected a GFS filesystem corruption but I haven't
> > > got any output from the kernel. I ran a fsck two days ago after a
> > > power failure.
> > >
> > > Here's the wait state of the processes (same on the other node).
> > >
> > > # ps -o pid,tt,user,fname,wchan -C apache
> > >   PID TT       USER     COMMAND  WCHAN
> > >  4426 ?        root     apache   -
> > > 14970 ?        www-data apache   glock_wait_internal
> > > 15103 ?        www-data apache   glock_wait_internal
> > > 16780 ?        www-data apache   glock_wait_internal
> > > 16959 ?        www-data apache   glock_wait_internal
> > > 14936 ?        www-data apache   finish_stop
> > > 12859 ?        www-data apache   -
> > > 13005 ?        www-data apache   -
> > > 13311 ?        www-data apache   semtimedop
> > > 13390 ?        www-data apache   semtimedop
> > >
> > > How can I debug this problem further? And how can I bring my httpd
> > > processes back without a reboot?
> > >
> > > Many thanks for your help.
> > >
> > > Regards,
> > > Sébastien DIDIER
> > >
> > > --
> > > Linux-cluster mailing list
> > > Linux-cluster at redhat.com
> > > https://www.redhat.com/mailman/listinfo/linux-cluster
> >
> > --
> > Gruss / Regards,
> >
> > Marc Grimme
> > Phone: +49-89 121 409-54
> > http://www.atix.de/               http://www.open-sharedroot.org/
> >
> > **
> > ATIX - Ges. fuer Informationstechnologie und Consulting mbH
> > Einsteinstr. 10 - 85716 Unterschleissheim - Germany
>

-- 
Gruss / Regards,

Marc Grimme
Phone: +49-89 121 409-54
http://www.atix.de/               http://www.open-sharedroot.org/

**
ATIX - Ges. fuer Informationstechnologie und Consulting mbH
Einsteinstr. 10 - 85716 Unterschleissheim - Germany




