[Spacewalk-list] Spacewalk 1.5 Monitoring issue

trm asn trm.nagios at gmail.com
Fri Aug 5 06:18:01 UTC 2011


On Thu, Aug 4, 2011 at 9:10 PM, David Nutter <davidn at bioss.ac.uk> wrote:

> On Thu, Aug 04, 2011 at 12:41:13PM +0100, David Nutter wrote:
> > On Tue, Aug 02, 2011 at 07:34:46PM +0100, David Nutter wrote:
> > > On Tue, Jul 26, 2011 at 11:27:19AM +0530, trm asn wrote:
> > >
> > > *snip*
> > >
> > > > 'ProbeRecord', 'get_hostAddress'); I resolved that error by editing
> > > > "NPRecords.pm" and changing the parameter "hostAddress" to
> > > > "HOSTADDRESS".
> > > >
> > > > Now my "scout config push" succeeds, but unfortunately the
> > > > monitoring status is still "PENDING, Awaiting Update".
> > > >
> > > >
> > > > -bash-3.2$ rhn-runprobe --probe=261 --log=all=4
> > > > Can't locate object method "hostname" via package
> > > > "NOCpulse::Probe::Config::ProbeRecord" at
> > > > blib/lib/Class/MethodMaker/Engine.pm (autosplit into
> > > > blib/lib/auto/Class/MethodMaker/Engine/new.al) line 941.
> > > >
> > > > Any idea...
> > >
> > > Hi,
> > >
> >
> > *snip*
> >
> > After a bit more digging I've narrowed down the problem area.
> >
> > The method NOCpulse::Probe::Shell::Unix::connect seems to be failing
> > when running under the scheduler but not when running under
> > rhn-runprobe.
>
> *snip*
>
> > What seems to be happening on the scheduler is the command's STDOUT is
> > getting written to the logfile and not read by Unix::read_result. In
> > the first trace you can see this in the last two log lines:
> >
> > 2011-08-04 12:31:44  Linux#2.6.18-238.12.1.el5xen#2834
> > 2011-08-04 12:31:44  NOCPULSE-1312457503-STATUS 0
>
> Got it. At long last. We're being bitten by this perl bug:
>
> https://rt.perl.org/rt3/Public/Bug/Display.html?id=66224
>
> The test script there does indeed fail on my CentOS 5.6 system.
>
> Reason is that Process.pm redirects STDOUT and STDERR before spawning
> the probe runner. Then, the STDOUT of any subsequent processes
> (e.g. those started by open3 in Unix.pm) ends up going to STDOUT, not
> the IO::Handle created for it to write to.
>
> I believe rhn-runprobe doesn't exhibit the problem because it doesn't
> use Process.pm; it just calls the ProbeRunner directly.
>
> Anyway, applying the patch below from the perl bug report does indeed fix
> the problem on my spacewalk. All probes now update in the web UI and
> appear to be recording data.
>
> --- Open3.pm.~1~        2005-03-19 18:43:52.000000000 +0100
> +++ Open3.pm    2010-04-29 10:30:54.000000000 +0200
> @@ -200,6 +200,9 @@
>        # A tie in the parent should not be allowed to cause problems.
>        untie *STDIN;
>        untie *STDOUT;
> +        open(STDOUT, ">&=1");
> +        open(STDERR, ">&=2");
> +
>        # If she wants to dup the kid's stderr onto her stdout I need to
>        # save a copy of her stdout before I put something else there.
>        if ($dad_rdr ne $dad_err && $dup_err
>
> Question is, do I file a bug against RHEL5 or look at rewriting the
> logic in UnixCommand.pm to work around the issue? I hardly think that
> patching core perl modules is a sensible thing to do on a production
> server!
>
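The symptom described above (the command's output landing in the logfile instead of being read back) can be illustrated with plain file-descriptor inheritance. This is only a rough shell sketch, not the Perl bug itself: `demo_inherited_stdout` is a made-up name, the subshell merely stands in for the scheduler, and the status text is a placeholder.

```shell
#!/bin/sh
# Sketch: a child process inherits whatever file descriptor 1 the parent
# left in place, so its output follows the parent's redirection.
demo_inherited_stdout() {
    log=$(mktemp)
    # The subshell stands in for the scheduler (an assumption): it
    # redirects its own stdout to a log file, then starts a child
    # without giving that child a pipe of its own.
    ( exec >"$log"; sh -c 'echo STATUS 0' )
    # Nothing reached the caller's stdout; the text is in the log.
    echo "log contains: $(cat "$log")"
    rm -f "$log"
}

demo_inherited_stdout
```

The child never chose the logfile; it simply wrote to descriptor 1, which is why re-associating STDOUT with descriptor 1 before the dup, as the patch does, changes where the output ends up.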
> Regards,
>
> --
> David Nutter                            Tel: +44 (0)131 650 4888
> BioSS, JCMB, King's Buildings, Mayfield Rd, EH9 3JZ. Scotland, UK
>
Hi,

After changing the parameters mentioned above, I am now getting another
error:

Status: UNKNOWN, The RHN Monitoring Daemon (RHNMD) is not responding: Auto
configuration failed 11851:error:0200100D:system library:fopen:Permission
denied:bss_file.c:122:fopen('/etc/pki/tls/openssl.cnf','rb')
11851:error:2006D002:BIO routines:BIO_new_file:system lib:bss_file.c:127:
11851:error:0E078002:configuration file routines:DEF_LOAD:system
lib:conf_def.c:199:. Please make sure the daemon is running and the host is
accessible from the monitoring scout. Command was: /usr/bin/ssh -l nocpulse
-p 4545 -i /var/lib/nocpulse/.ssh/nocpulse-identity -o
StrictHostKeyChecking=no -o BatchMode=yes 192.169.1.20  /bin/sh -s


But if I run the same connection manually from the Spacewalk server:

su - nocpulse
-bash-3.2$ ssh -i /var/lib/nocpulse/.ssh/nocpulse-identity nocpulse@192.168.1.20

Last login: Fri Aug  5 11:16:33 2011 from 192.168.1.100
-bash-3.2$

-bash-3.2$ /usr/bin/ssh -l nocpulse -p 4545 -i
/var/lib/nocpulse/.ssh/nocpulse-identity -o StrictHostKeyChecking=no -o
BatchMode=yes 192.168.1.20  /bin/sh -s
free -m
             total       used       free     shared    buffers     cached
Mem:          1000        779        221          0        128        467
-/+ buffers/cache:        182        817
Swap:         4094        185       3909
Killed by signal 2.

It connects and returns the output without any issue.
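
Since the error names /etc/pki/tls/openssl.cnf, a quick readability check on that file may help narrow things down. The sketch below is only that: `check_readable` is a made-up helper, not a Spacewalk tool, and running it from an interactive `su - nocpulse` shell only shows what that shell can read. The scout process may run under different conditions, which is an assumption on my side.

```shell
#!/bin/sh
# Hypothetical helper (not part of Spacewalk): report whether the
# current user can read the given file.
check_readable() {
    if [ -r "$1" ]; then
        echo "readable: $1"
    else
        echo "NOT readable: $1"
        return 1
    fi
}

# Path copied from the fopen() "Permission denied" error above.
CNF=/etc/pki/tls/openssl.cnf
check_readable "$CNF" || echo "inspect mode and ownership of $CNF"
ls -l "$CNF" 2>/dev/null || true
```

If the file is readable here but the probe still fails, the difference is presumably in how the scout starts ssh rather than in the file itself.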

Any ideas?