
[Linux-cluster] GFS2 and D state HTTPD processes

Hi All,



We have 6 nodes running GFS2 under CentOS 5.3, all connecting via Cisco 2960G switches to an MD3000i with 8 x 146GB SAS 15K drives. These nodes run a PHP website, pulling their PHP and image files from a GFS2 volume exported over iSCSI from the MD3000i.


The problem we have is that, since inception, we’ve seen HTTPD processes go into the ‘D’ state (‘zombied’), and the only way we have to recover from that is to restart all the nodes in the cluster.
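
As a starting point for debugging this kind of hang, a minimal Python sketch along the following lines can list the processes currently in ‘D’ state together with the kernel function each one is sleeping in. It assumes a Linux /proc layout that exposes /proc/<pid>/stat and /proc/<pid>/wchan, and it assumes Python 3 (the Python shipped with CentOS 5.3 is older, so it would need adapting there). The function names parse_stat and find_d_state are illustrative only and are not part of any GFS2 tooling.

```python
import os


def parse_stat(text):
    """Return (pid, comm, state) from the contents of /proc/<pid>/stat."""
    # comm is wrapped in parentheses and may itself contain spaces or ')',
    # so split on the first '(' and the last ')'.
    lpar = text.index("(")
    rpar = text.rindex(")")
    pid = int(text[:lpar].strip())
    comm = text[lpar + 1:rpar]
    state = text[rpar + 1:].split()[0]
    return pid, comm, state


def find_d_state(proc_root="/proc"):
    """List (pid, comm, wchan) for processes in uninterruptible sleep ('D')."""
    found = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # skip non-process entries such as 'self' or 'meminfo'
        base = os.path.join(proc_root, entry)
        try:
            with open(os.path.join(base, "stat")) as f:
                pid, comm, state = parse_stat(f.read())
        except (OSError, ValueError, IndexError):
            continue  # process exited, or stat was unreadable
        if state != "D":
            continue
        try:
            with open(os.path.join(base, "wchan")) as f:
                wchan = f.read().strip()
        except OSError:
            wchan = "?"  # wchan not available on this kernel
        found.append((pid, comm, wchan))
    return sorted(found)


if __name__ == "__main__":
    for pid, comm, wchan in find_d_state():
        print(f"{pid}\t{comm}\t{wchan}")
```

Run on each node while the hang is in progress; if the stuck httpd processes all report a GFS2 or DLM related wait channel, that at least points at the cluster filesystem rather than the web server itself.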


I’ve tuned demote_secs down from 300 to 20 seconds on the assumption that file locking is causing the issue. We’re also running with the following gfs_controld settings:


        <gfs_controld plock_ownership="1" plock_rate_limit="0"/>


Can anyone give me some pointers on what we should be investigating to find out why this is failing? I’ve had our networks team go over the networking and it all seems fine. The MTU is set correctly on the MD3000i and on the individual nodes. I’ve also used the ping_pong tool: against a single file on the GFS2 volume we get around 90K locks per second. If I run ping_pong against the same file from two nodes, that drops to around 70 locks per second. I don’t think that’s the issue though.
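
For a rough cross-check of the single-node figure, the sketch below times how many lock/unlock cycles one process can perform on the first byte of a file using POSIX byte-range (fcntl) locks, which is the kind of lock ping_pong exercises. It is not a substitute for ping_pong, which coordinates lockers across nodes; it only measures uncontended cycles from one process. It assumes Python 3, and the function name lock_rate and the path in the usage note are hypothetical.

```python
import fcntl
import os
import time


def lock_rate(path, seconds=1.0):
    """Lock and unlock byte 0 of `path` repeatedly; return cycles per second."""
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o644)
    try:
        count = 0
        start = time.monotonic()
        deadline = start + seconds
        while time.monotonic() < deadline:
            # Exclusive POSIX lock on one byte at offset 0, then release it.
            fcntl.lockf(fd, fcntl.LOCK_EX, 1, 0, os.SEEK_SET)
            fcntl.lockf(fd, fcntl.LOCK_UN, 1, 0, os.SEEK_SET)
            count += 1
        elapsed = time.monotonic() - start
    finally:
        os.close(fd)
    return count / elapsed
```

Calling, for example, lock_rate("/mnt/gfs2/locktest.dat") on one node, and then on two nodes at the same time against the same file, should show whether the drop in lock rate reproduces outside ping_pong.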


If anyone can provide some insight into what to change, what to debug, or how to investigate this further, it’d be greatly appreciated.





Gavin Conway

Senior Engineer, Operations (Systems Group), UKSolutions


Telephone: 0845 004 1333, option 2

Email: gavin conway uksolutions co uk

Web: www.uksolutions.co.uk

UKS Ltd, Birmingham Road, Studley, Warwickshire, B80 7BG Registered in England Number 3036806

This email must be read in conjunction with the legal & service notices on http://www.uksolutions.co.uk/disclaimer.html
