Re: Re: [Linux-cluster] client doesn't start when lock master is not ready

Thanks Adam! This explains the cause of the problem.

Yes, I started experiencing this problem after the upgrade from GFS-6.0.0-15 to GFS-6.0.2-24.

On Thu, 24 Feb 2005 Adam Manthei wrote:
>On Thu, Feb 24, 2005 at 04:45:27PM -0000, Raj Kumar wrote:
> > Hi All,
> > We have a two-node system using GFS. One node is the lock server and the
> > other is just a client. We restarted our servers recently and brought up
> > the lock client before bringing up the lock master. lock_gulmd is set to
> > restart at levels 3, 4 and 5. The lock client system just hung with the
> > message "Starting lock_gulmd..." in the boot process. It's clear that this
> > happened because the lock master server wasn't available at the time. Once
> > the lock master server started, the lock client system started successfully.
>This is the desired behavior.  Adjust the following value in
>/etc/sysconfig/gfs if you don't like it.
># GULM_QUORUM_TIMEOUT -- amount of time to wait for there to be a master
>#    before giving up.  If GULM_QUORUM_TIMEOUT is positive, then we will
>#    wait GULM_QUORUM_TIMEOUT seconds before giving up and failing when
>#    a master server is not found.  If GULM_QUORUM_TIMEOUT is zero, then
>#    wait indefinitely for a master server.  If GULM_QUORUM_TIMEOUT is
>#    negative, just start lock_gulmd and not worry about whether it is
>#    quorate.
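
For the archives, here is how I read that comment. This is only a minimal sketch, assuming /etc/sysconfig/gfs is sourced as plain shell variable assignments. The value -1 is my own choice, and the helper describe_quorum_timeout is hypothetical: it just restates the three cases from the comment and is not part of the init script.

```shell
# /etc/sysconfig/gfs fragment (sketch). Only the variable name comes from
# the comment quoted above; the value chosen here is an assumption.
GULM_QUORUM_TIMEOUT=-1   # negative: start lock_gulmd without waiting for quorum

# Hypothetical helper, NOT part of init.d/lock_gulmd: prints how a given
# value would be interpreted according to the quoted comment.
describe_quorum_timeout() {
    if [ "$1" -gt 0 ]; then
        echo "wait $1 seconds for a master, then fail"
    elif [ "$1" -eq 0 ]; then
        echo "wait indefinitely for a master"
    else
        echo "start lock_gulmd without checking quorum"
    fi
}

describe_quorum_timeout "$GULM_QUORUM_TIMEOUT"
```

If I understand correctly, the negative setting is the one that should let the client finish booting and leave lock_gulmd waiting for the master.
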
> > I noticed that previously the client system started even when the lock
> > master was not available, and the status of lock_gulmd on the client was
> > set to "pending". But now the system doesn't start until the master server
> > is also started.
>Did you have the system mounting GFS automatically?  Apparently not since it
>would have "hung" there too.  The client node should have eventually timed
>out after 5 minutes without a master server to log into.
> > Has this changed recently?
>Define recently... sort of need the version information you are using :)
>My guess is that since you are complaining about this behavior, you just
>upgraded from GFS-6.0.0-15 to GFS-6.0.2-24.  From the rpm change log:
>* Mon Nov 15 2004 Chris Feist <cfeist redhat com> 6.0.2-0
>- init.d/lock_gulmd will not start if quorum is not established after
>  a specified time (rbz135732).
>- init.d/lock_gulmd will not stop if GFS is mounted (rbz135730).
>- pool init.d scripts no longer hang on startup until console input
>  is provided (rbz137382).
> > It is possible that other administrators in the
> > group may have to restart the system at times. If they start the client
> > before the master (or worse, they don't start the master at all) then the
> > system will not complete its boot process and other services remain
> > unavailable.
>Your nodes won't be able to mount GFS if the cluster's gulm servers
>aren't quorate, so what's the problem?
> > I would like
> > the system to complete its boot process and have lock_gulmd stay in the
> > pending state until the master comes back. Is there any trick to achieve
> > this behavior?
>One other suggestion.  I usually start sshd immediately after networking on
>my machines so that I can get into them as soon as possible.  This often
>helps when dealing with complaints of this nature.
