[Linux-cluster] Cluster service restarting Locally

saju john saju8 at rediffmail.com
Sat Mar 11 10:50:07 UTC 2006


  
Dear Mr. Hohberger,

Thanks for the reply.

I saw your comment on the problem I reported, i.e. that lock traffic is getting network-starved.

But I see it differently. When I stop clumanager on one of the nodes, service restarts are far less frequent than they were when clumanager was running on both nodes. My assumption is that the problem is caused by some corruption of the metadata written to the quorum partition, since both nodes write to it concurrently. It may be due to a bug in the raw device driver; I am not sure. The interesting question then is how the cluster worked all this time (for me, around one year without any major problem).
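
One way to put numbers on that comparison would be to count the lock timeouts and service starts in syslog for each period (clumanager running on both nodes versus on one node only). Below is a minimal sketch in Python, assuming the messages appear one per line with the same wording as in my original report. The function name count_events and the log path in the comment are only illustrative.

```python
import re
from collections import Counter

# Message fragments copied from the log excerpt quoted in this thread.
PATTERNS = {
    "lock_timeout": re.compile(r"clusvcmgrd\[\d+\]: <err> Unable to obtain cluster lock"),
    "lock_denied": re.compile(r"clulockd\[\d+\]: <warning> Denied"),
    "service_start": re.compile(r"<notice> service notice: Starting service"),
}


def count_events(lines):
    """Count how many lines match each pattern in PATTERNS."""
    counts = Counter()
    for line in lines:
        for name, pattern in PATTERNS.items():
            if pattern.search(line):
                counts[name] += 1
    return counts


# Illustrative usage (the path is an assumption, adjust to the local syslog file):
#   with open("/var/log/messages", errors="replace") as fh:
#       print(count_events(fh))
```

Running it over the log from each period and comparing the service_start counts would show whether the restarts really drop when only one node writes to the quorum partition.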

Could you please consider this as well when releasing RHCS3U7?


Thank You,
Saju John
Linux System Administrator,
Thuraya Satellite Telecommunications Company
UAE, Sharjah

On Thu, 09 Mar 2006 Lon Hohberger wrote:
>On Mon, 2006-03-06 at 06:47 +0000, saju john wrote:
> >
> >
> > Dear All,
> >
> > I have a 2 node cluster with RHAS3 update 3.
> > Kernel : 2.4.21-20.ELsmp
> > Clumanager : clumanager-1.2.16-1
> >
> > For more than a year everything had been fine. Suddenly it started
> > showing the following and restarted the service locally:
> >
> > clusvcmgrd[1388]: <err> Unable to obtain cluster lock: Connection
> > timed out
> > clulockd[1378]: <warning> Denied A.B.C.D: Broken pipe
> > clulockd[1378]: <err> select error: Broken pipe
> > clusvcmgrd: [1625]: <notice> service notice: Stopping service
> > postgresql ...
> > clusvcmgrd: [1625]: <notice> service notice: Running user script
> > '/etc/init.d/postgresql stop'
> > clusvcmgrd: [1625]: <notice> service notice: Stopped service
> > postgresql
> > clusvcmgrd: [1625]: <notice> service notice: Starting service
> > postgresql ...
> > clusvcmgrd: [1625]: <notice> service notice: Running user script
> > '/etc/init.d/postgresql start'
> > clusvcmgrd: [1625]: <notice> service notice: Started service
> > postgresql ...
>
>It should be fixed in RHCS3U7
>
>-- Lon
>