
RE: [Linux-cluster] clurgmgrd - <err> #48: Unable to obtain cluster lock: Connection timed out



Hi Robert

We have also run into this problem a lot!

First of all, I highly recommend installing a newer rgmanager; your version has a lot of bugs.

At minimum, use rgmanager-1.9.54-4.222484hf, but now that 4.5 has been released, take that one.

We also have the dlm problems (BZ#206463 and BZ#199673), and Support recommended RHEL4 U5 to us, which should fix this problem.

--> Install RHEL4 Update 5 and all your problems should be fixed.
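
To judge whether the upgrade actually helps, it may be useful to count how often the lock timeout shows up in syslog before and after. The following is only a rough sketch: the function name `find_lock_timeouts` and the regular expression are illustrative assumptions based on the log format quoted in this thread, not part of rgmanager or any Red Hat tool.

```python
import re

# Pattern for the clurgmgrd error quoted in this thread, e.g.
# "May  7 13:18:39 db1 clurgmgrd[17888]: <err> #48: Unable to obtain cluster lock: ..."
# (assumed syslog layout: timestamp, host, process[pid], message)
LOCK_ERR = re.compile(
    r"^(?P<stamp>\w{3}\s+\d+\s+\d{2}:\d{2}:\d{2})\s+(?P<host>\S+)\s+"
    r"clurgmgrd\[\d+\]:\s+<err>\s+#48:"
)


def find_lock_timeouts(lines):
    """Return a (timestamp, host) tuple for every clurgmgrd #48 error line."""
    hits = []
    for line in lines:
        match = LOCK_ERR.match(line)
        if match:
            hits.append((match.group("stamp"), match.group("host")))
    return hits


# Sample lines taken from the syslog excerpt in the original report.
sample = [
    "May  7 13:18:39 db1 clurgmgrd[17888]: <err> #48: Unable to obtain cluster lock: Connection timed out",
    "May  7 13:18:41 db1 kernel: dlm: Magma: reply from 2 no lock",
    "May  7 13:18:41 db1 clurgmgrd[17888]: <notice> Stopping service WATSON",
]

for stamp, host in find_lock_timeouts(sample):
    print(stamp, host)
```

On a real node, the same function could be fed the lines of the syslog file (commonly /var/log/messages, though the path depends on the syslog configuration) instead of the sample list.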

Regards, Mike

-----Original Message-----
From: linux-cluster-bounces redhat com on behalf of Robert Hurst
Sent: Mon 14.05.2007 21:46
To: linux clustering
Subject: RE: [Linux-cluster] clurgmgrd - <err> #48: Unable to obtain cluster lock: Connection timed out
 
Any new thoughts on this?  Is it a new bug, and is it fixed with U5?  I
have a ticket open, but your insights on how likely this is to be a
recurring bug would be helpful.  Thanks.


On Fri, 2007-05-11 at 19:54 -0400, rhurst bidmc harvard edu wrote:
> We are using RHEL 4 U4 with the GFS/CS that works for that release:
>  
> $ rpm -q rgmanager dlm dlm-kernel magma magma-plugins
> 
> rgmanager-1.9.54-1
> dlm-1.0.1-1
> dlm-kernel-2.6.9-44.9
> magma-1.0.6-0
> magma-plugins-1.0.9-0
> 
> Would the just-announced GFS/CS for U5 help any?  Looks like a lot of
> issues were addressed.
>  
> Robert Hurst, Sr. Caché Administrator
> Beth Israel Deaconess Medical Center
> 1135 Tremont Street, REN-7
> Boston, Massachusetts   02120-2140
> 617-754-8754 · Fax: 617-754-8730 · Cell: 401-787-3154
> Any technology distinguishable from magic is insufficiently advanced.
> 
> 
> ______________________________________________________________________
> From: linux-cluster-bounces redhat com on behalf of Lon Hohberger
> Sent: Fri 5/11/2007 4:19 PM
> To: linux clustering
> Subject: Re: [Linux-cluster] clurgmgrd - <err> #48: Unable to obtain cluster lock: Connection timed out
> 
> 
> On Mon, May 07, 2007 at 01:54:56PM -0400, rhurst bidmc harvard edu wrote:
> > What could cause clurgmgrd to fail like this?  If clurgmgrd has a hiccup
> > like this, is it supposed to shut down its services?  Is there something
> > in our implementation that could have prevented this from shutting down?
> >
> > For unexplained reasons, we just had our CS service (WATSON) go down on
> > its own, and the syslog entry details the event as:
> >
> > May  7 13:18:39 db1 clurgmgrd[17888]: <err> #48: Unable to obtain
> > cluster lock: Connection timed out
> > May  7 13:18:41 db1 kernel: dlm: Magma: reply from 2 no lock
> > May  7 13:18:41 db1 kernel: dlm: reply
> > May  7 13:18:41 db1 kernel: rh_cmd 5
> > May  7 13:18:41 db1 kernel: rh_lkid 200242
> > May  7 13:18:41 db1 kernel: lockstate 2
> > May  7 13:18:41 db1 kernel: nodeid 0
> > May  7 13:18:41 db1 kernel: status 0
> > May  7 13:18:41 db1 kernel: lkid ee0388
> > May  7 13:18:41 db1 clurgmgrd[17888]: <notice> Stopping service WATSON
> 
> This is usually a dlm bug.  Once the DLM gets into this state,
> rgmanager blows up.  Which rgmanager are you using?
> 
> (There's only one lock per service; the complexity of the service
> doesn't matter...)
> 
> --
> Lon Hohberger - Software Engineer - Red Hat, Inc.
> 
> --
> Linux-cluster mailing list
> Linux-cluster redhat com
> https://www.redhat.com/mailman/listinfo/linux-cluster
> 
> 
> 






