
Re: [Linux-cluster] GFS load average and locking

Marc Grimme wrote:

Although the strace does not show the output I know, the problem description sounds like deja vu. We had loads of problems with sessions stored on GFS and httpd processes ending up in "D" state for some time (at high-load times we had ServerLimit httpd processes in D state per node, which left the service unavailable). As I posted already, we think the cause is the "bad" locking of sessions by PHP (the PHP sessions are on GFS, and strace showed those timeouts on the session files).

When you issue a "session_start", or whatever that function is called, the session file is locked via an flock syscall. That lock is held until you end the session, which is done implicitly when the TCP connection to the client is closed. Now another httpd process (on whatever node) calls "session_start" and tries an flock on that session file while another process already holds the lock. That process may then run into the observed timeouts (30-60 secs), which, as far as I remember, relate to the timeout of the TCP connection defined in httpd.conf or to some timeout in php.ini (there is an explanation for this, but I cannot remember it ;-) ).

Nevertheless, in our scenario the problem was the "bad" session handling by PHP. We have made a patch for phplib with which you can disable the locking, or just do the locking implicitly and thereby keep consistency while session data is read or written. With it we could make Apache work as expected, and we have not seen any "D" processes for a year now.
Oh yes, the patch can be found at www.opensharedroot.org in the download section.
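
To make the lock contention described above concrete, here is a minimal Python sketch. It is not PHP's session code and not the phplib patch. It assumes Linux flock semantics on a local temporary file, and it uses two separate open() calls in one process as a stand-in for two httpd processes. The helper names (try_lock, demo) and the "sess_" prefix are made up for illustration.

```python
import fcntl
import os
import tempfile


def try_lock(fd):
    """Attempt a non-blocking exclusive flock; return True if acquired."""
    try:
        fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
        return True
    except BlockingIOError:
        # Another open file description already holds the lock.
        return False


def demo():
    """Show that a second opener cannot lock the file until the first unlocks."""
    with tempfile.NamedTemporaryFile(prefix="sess_") as tmp:
        # Two independent open() calls stand in for two httpd processes
        # (assumption: flock locks belong to the open file description).
        holder = os.open(tmp.name, os.O_RDWR)
        waiter = os.open(tmp.name, os.O_RDWR)
        try:
            first = try_lock(holder)    # "session_start" in request 1
            second = try_lock(waiter)   # request 2 arrives while lock is held
            fcntl.flock(holder, fcntl.LOCK_UN)  # request 1 ends its session
            third = try_lock(waiter)    # request 2 can now proceed
        finally:
            os.close(holder)
            os.close(waiter)
        return first, second, third


if __name__ == "__main__":
    print(demo())
```

Without LOCK_NB the second flock call would simply block until the holder releases the lock, which is the waiting behaviour the description above attributes to the second httpd process.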

Besides: you will never encounter this on a local filesystem or on NFS, as NFS does not support flocks and silently ignores them.


This does look like the problem description sent out by the savvis.net folks during our off-list email exchanges. However, without actually looking at the thread traces (taken while the processes are in D state), it is difficult to be sure. One way to obtain the exact thread trace is to use the "crash" tool to do a back trace (e.g. "bt <pid>"; you need the kernel debuginfo RPM though). Britt, do let us know whether this php patch helps, and/or use the crash command to obtain the thread trace output.
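
To find the PIDs to pass to "bt <pid>", one option is to scan /proc for processes currently in D state. The following is a rough Python sketch under the assumption of a Linux /proc layout. The function names (parse_stat, d_state_processes) are hypothetical helpers, not part of crash or any cluster tool.

```python
import os


def parse_stat(stat_line):
    """Return (pid, comm, state) from the contents of /proc/<pid>/stat."""
    # The command name is wrapped in parentheses and may itself contain
    # spaces or ")", so split on the first "(" and the last ")".
    left = stat_line.index("(")
    right = stat_line.rindex(")")
    pid = int(stat_line[:left])
    comm = stat_line[left + 1:right]
    state = stat_line[right + 1:].split()[0]
    return pid, comm, state


def d_state_processes(proc_root="/proc"):
    """List (pid, comm) for every process in uninterruptible sleep ("D")."""
    found = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # skip non-process entries such as "self" or "meminfo"
        try:
            with open(os.path.join(proc_root, entry, "stat")) as f:
                pid, comm, state = parse_stat(f.read())
        except (OSError, ValueError):
            continue  # the process exited while we were scanning
        if state == "D":
            found.append((pid, comm))
    return sorted(found)


if __name__ == "__main__":
    for pid, comm in d_state_processes():
        print(pid, comm)
```

A process may leave D state between the scan and the back trace, so the list is only a snapshot.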

On the other hand, I don't understand how a local (non-cluster) filesystem can be immune to this problem.

-- Wendy
