[Linux-cluster] Re: nanny segfault problem
Christopher Barry
Christopher.Barry at qlogic.com
Tue Nov 13 22:53:20 UTC 2007
On Tue, 2007-11-13 at 15:14 -0500, Christopher Barry wrote:
> script got scraped by my gateway - attached here as a textfile
>
>
> On Tue, 2007-11-13 at 15:05 -0500, Christopher Barry wrote:
> > Greetings All,
> >
> > running RHEL4U5
> >
> > I have a bunch of services on my cluster w/ access via redundant
> > directors.
> >
> > I've created a generic service checking script, which I'm specifying in
> > lvs.cf's 'send_program' config parameter.
> >
> > script is attached to this post. see that for how it works with the
> > symlinks described below.
> >
> > I create symlinks to the script for every service I want to check, with
> > their name containing the port to hit, as in:
> > /sbin/lvs-<port>.sh
> >
> > so the symlink name to check ssh availability, for instance, is:
> > /sbin/lvs-22.sh
> >
> > The script works fine, and returns the first contiguous block of
> > [[:alnum:]] text data from the connection attempt for use with the
> > expect line of lvs.cf.
> >
> >
> > The problem is, when nanny is spawned by pulse, all of the nanny
> > processes segfault.
> >
> > > Nov 13 14:40:44 kop-sds-dir-01 lvs[17740]: create_monitor for ssh_access/kop-sds-01 running as pid 17749
> > > Nov 13 14:40:44 kop-sds-dir-01 nanny[17749]: making 10.32.12.11:22 available
> > > Nov 13 14:40:44 kop-sds-dir-01 kernel: nanny[17749]: segfault at 000000000000006c rip 000000335e570810 rsp 0000007fbfffe978 error 4
> >
> > this occurs almost instantly for every nanny process.
> >
> > Can anyone venture a guess as to what is happening?
> >
> > see my lvs.cf here:
> > http://nanny-error.pastebin.com/m592f7911
> >
> >
All,
More interesting developments:
If I start pulse with:
# pulse -v --nodaemon
everything (kinda) works.
# pulse -v
does not work at all, however.
Something is different between daemon mode and foreground mode, beyond
simply backgrounding the process.
I was thinking this may be a permissions issue, but I'd already changed
the mode of my script to 4755.
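
For anyone reading without the attachment, the symlink scheme described in the quoted message could be sketched roughly as below. This is not the attached script, only an illustration of the idea under some assumptions: the function names port_from_name and first_alnum_block are made up for this sketch, the real server address is assumed to arrive as the first argument (for example via a %h token in send_program), and the connection uses bash's /dev/tcp redirection, which is only available if bash was built with it.

```shell
#!/bin/bash
# Hypothetical sketch of a generic LVS check script; NOT the attached one.
# Intended to be invoked through a symlink such as /sbin/lvs-22.sh.

# Derive the port from the invoked name, e.g. /sbin/lvs-22.sh -> 22
port_from_name() {
    local base="${1##*/}"          # strip the directory part
    base="${base#lvs-}"            # strip the "lvs-" prefix
    printf '%s\n' "${base%.sh}"    # strip the ".sh" suffix
}

# Print the first contiguous run of [[:alnum:]] characters read from stdin
first_alnum_block() {
    grep -o -m1 '[[:alnum:]]\{1,\}' | head -n1
}

# Only attempt a connection when a host argument was supplied (assumption:
# the host is passed as $1).
if [ "$#" -ge 1 ]; then
    host="$1"
    port="$(port_from_name "$0")"
    # Open a TCP connection on file descriptor 3 (bash-specific feature)
    exec 3<>"/dev/tcp/$host/$port" || exit 1
    # Read one line of whatever the service sends first, waiting up to 5 s
    IFS= read -r -t 5 banner <&3
    exec 3>&-                      # close the connection
    printf '%s\n' "$banner" | first_alnum_block
fi
```

With a sketch like this, an ssh server that greets with a banner beginning "SSH-2.0-" would yield "SSH", which is the string the expect line in lvs.cf would then have to match. Services that send nothing until they receive a request (HTTP, for instance) would produce empty output and would need a request written to the socket first.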