[Linux-cluster] nanny segfault problem

Alex Kompel alex.kompel at 23andme.com
Tue Nov 13 20:57:05 UTC 2007


On 11/13/07, Christopher Barry <christopher.barry at qlogic.com> wrote:
>
>  Greetings All,
>
> running RHEL4U5
>
> I have a bunch of services on my cluster w/ access via redundant
> directors.
>
> I've created a generic service checking script, which I'm specifying in
> lvs.cf's 'send_program' config parameter.
>
> script is attached to this post. see that for how it works with the
> symlinks described below.
>
> I create symlinks to the script for every service I want to check, with
> their name containing the port to hit, as in:
> /sbin/lvs-<port>.sh
>
> so the symlink name to check ssh availability, for instance, is:
> /sbin/lvs-22.sh
>
> The script works fine, and returns the first contiguous block of
> [[:alnum:]] text data from the connection attempt for use with the
> expect line of lvs.cf.
>
>
> The problem is, when nanny is spawned by pulse, all of the nanny
> processes segfault.
>
> > Nov 13 14:40:44 kop-sds-dir-01 lvs[17740]: create_monitor for ssh_access/kop-sds-01 running as pid 17749
> > Nov 13 14:40:44 kop-sds-dir-01 nanny[17749]: making 10.32.12.11:22 available
> > Nov 13 14:40:44 kop-sds-dir-01 kernel: nanny[17749]: segfault at 000000000000006c rip 000000335e570810 rsp 0000007fbfffe978 error 4
>
> this occurs almost instantly for every nanny process.
>
> Can anyone venture a guess as to what is happening?
>

Try running nanny manually in the foreground and see whether you get any
error messages. RHEL5 nanny (0.8.4) has a bug where it segfaults when
printing syslog messages longer than 80 characters. Could be that. The
patch below fixes it by giving each vsnprintf attempt its own copy of the
argument list.

*** util.c      2002-04-25 21:19:57.000000000 -0700
--- util.new    2007-10-10 13:27:43.000000000 -0700
***************
*** 49,55 ****

    while (1)
      {
!       ret = vsnprintf (buf, bufLen, format, args);
        if ((ret > -1) && (ret < bufLen))
        {
          break;
--- 49,58 ----

    while (1)
      {
!       va_list try_args;
!       va_copy(try_args, args);
!       ret = vsnprintf (buf, bufLen, format, try_args);
!       va_end(try_args);
        if ((ret > -1) && (ret < bufLen))
        {
          break;
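
For anyone wondering why the one-line change matters: once a va_list has
been passed to vsnprintf, its value is indeterminate, so a retry loop that
passes the same va_list a second time is undefined behavior (on x86_64 this
typically shows up as a crash on the second pass). Below is a minimal
standalone sketch of the same retry pattern with va_copy. It is not nanny's
actual util.c; the function names format_alloc_v and format_alloc and the
16-byte starting size are made up for illustration, and it assumes a C99
vsnprintf that returns the required length on truncation.

```c
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper (not nanny's code): format into a heap buffer,
 * growing it until the whole message fits. Caller frees the result. */
static char *format_alloc_v(const char *format, va_list args)
{
    size_t buf_len = 16;            /* deliberately small to force a retry */
    char *buf = malloc(buf_len);

    if (buf == NULL)
        return NULL;

    for (;;) {
        va_list try_args;
        int ret;
        char *bigger;

        /* Each attempt consumes its own copy; 'args' itself is never
         * handed to vsnprintf, so it stays valid for the next pass. */
        va_copy(try_args, args);
        ret = vsnprintf(buf, buf_len, format, try_args);
        va_end(try_args);

        if (ret < 0) {              /* encoding/output error */
            free(buf);
            return NULL;
        }
        if ((size_t)ret < buf_len)  /* fit, including the trailing NUL */
            return buf;

        /* C99 vsnprintf reports the length it needed; grow to match. */
        buf_len = (size_t)ret + 1;
        bigger = realloc(buf, buf_len);
        if (bigger == NULL) {
            free(buf);
            return NULL;
        }
        buf = bigger;
    }
}

/* Variadic front end, in the style of a logging wrapper. */
static char *format_alloc(const char *format, ...)
{
    va_list args;
    char *s;

    va_start(args, format);
    s = format_alloc_v(format, args);
    va_end(args);
    return s;
}
```

Calling format_alloc("making %s:%d available", "10.32.12.11", 22) goes
through the loop twice, since the message does not fit in the first 16
bytes, which is exactly the path that breaks without va_copy.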