
[Linux-cluster] rgmanager ceases to send syslog messages



Oddly, rgmanager (clurgmgrd) on one member node has stopped sending some of its syslog messages, in particular the periodic 'status' message for a service it is running.  This is a problem for us, because we monitor syslog messages on a centralized server to track which services are running on which node.

Is there a signal or event that can trigger clurgmgrd to restart its monitoring and logging of its running service?

The last log entries showing 'WATSON status' follow.  I realize there was a problem with this particular cluster.conf change, but the change had nothing to do with the WATSON service, and all other nodes are still sending their 'service status' syslog messages.  Why would 'WATSON status' just stop?

Aug  6 14:38:35 db5 clurgmgrd: [16354]: <info> Executing /etc/init.d/WATSON status
Aug  6 14:39:05 db5 clurgmgrd: [16354]: <info> Executing /etc/init.d/WATSON status
Aug  6 14:39:20 db5 ccsd[13802]: Update of cluster.conf complete (version 187 -> 188).
Aug  6 14:39:25 db5 clurgmgrd[16354]: <notice> Reconfiguring
Aug  6 14:39:25 db5 clurgmgrd[16354]: <info> Loading Service Data
Aug  6 14:39:25 db5 clurgmgrd[16354]: <err> Error storing ip: Duplicate
Aug  6 14:39:26 db5 clurgmgrd[16354]: <err> Unique attribute collision. type=clusterfs attr=device value=/dev/VGCCC1/lvol0
Aug  6 14:39:26 db5 clurgmgrd[16354]: <err> Error storing clusterfs resource
Aug  6 14:39:26 db5 clurgmgrd[16354]: <err> Unique attribute collision. type=clusterfs attr=device value=/dev/VGCCC1/lvol1
Aug  6 14:39:26 db5 clurgmgrd[16354]: <err> Error storing clusterfs resource
Aug  6 14:39:26 db5 clurgmgrd[16354]: <info> Stopping changed resources.
Aug  6 14:39:26 db5 clurgmgrd[16354]: <info> Restarting changed resources.
Aug  6 14:39:26 db5 clurgmgrd[16354]: <info> Starting changed resources.
Aug  6 14:39:26 db5 clurgmgrd: [16354]: <info> Executing /etc/init.d/syslogger stop
Aug  6 14:39:27 db5 clurgmgrd: [16354]: <info> Executing /etc/init.d/luci stop
Aug  6 14:39:27 db5 clurgmgrd: [16354]: <info> Executing /etc/init.d/webmin stop
Aug  6 14:39:27 db5 clurgmgrd: [16354]: <info> Executing /etc/init.d/nagios stop

I continue to get messages from clurgmgrd on this node, but only for Magma events such as membership changes, e.g.:

Aug  7 16:09:03 db5 clurgmgrd[16354]: <info> Magma Event: Membership Change
Aug  7 16:09:03 db5 clurgmgrd[16354]: <info> State change: db1 UP

