[Linux-cluster] Cluster logging issues + rgmanager doesn't notice failed vms

Bart Verwilst lists at verwilst.be
Mon Aug 20 12:24:38 UTC 2012


At the same time, I notice a hanging /etc/libvirt/qemu gfs2 mount on vm02
(while /var/lib/libvirt/sanlock still works fine). vm01 and vm03 have
perfectly accessible mounts. Nothing special to see in syslog or dmesg.

/dev/mapper/iscsi_cluster_qemu on /etc/libvirt/qemu type gfs2 
(rw,relatime,hostdata=jid=2)
/dev/mapper/iscsi_cluster_sanlock on /var/lib/libvirt/sanlock type gfs2 
(rw,relatime,hostdata=jid=2)
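
For what it's worth, these are the generic stuck-mount checks I'd run next on
vm02 (nothing cluster-specific; assumes sysrq is enabled and dlm_tool is
installed):

# processes blocked in uninterruptible sleep (D state), typically stuck in gfs2/dlm
ps axo pid,stat,wchan,cmd | awk '$2 ~ /^D/'

# dump blocked-task stacks into the kernel log, then look for gfs2/dlm frames
# (requires kernel.sysrq enabled)
echo w > /proc/sysrq-trigger
dmesg | tail -n 60

# list the DLM lockspaces this node is a member of
dlm_tool ls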

Any ideas?

Bart

Bart Verwilst wrote on 20.08.2012 14:11:
> Hello again ;)
>
> My cluster seems to be logging only to /var/log/syslog, and even then
> only from the corosync daemon; the logs under /var/log/cluster are all empty:
>
> root at vm01-test:~# ls -al /var/log/cluster/*.log
> -rw------- 1 root root 0 Aug 16 06:50 /var/log/cluster/corosync.log
> -rw------- 1 root root 0 Aug 20 06:39 /var/log/cluster/dlm_controld.log
> -rw------- 1 root root 0 Aug 20 06:39 /var/log/cluster/fenced.log
> -rw------- 1 root root 0 Aug  7 06:27 /var/log/cluster/fence_na.log
> -rw------- 1 root root 0 Aug 16 06:50 /var/log/cluster/gfs_controld.log
> -rw------- 1 root root 0 Aug 20 06:39 /var/log/cluster/qdiskd.log
> -rw------- 1 root root 0 Aug 20 06:39 /var/log/cluster/rgmanager.log
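>
> My reading of cluster.conf(5) is that per-daemon log files can also be
> forced explicitly with logging_daemon stanzas, something like this
> (untested on my side, file names as above):
>
> <logging debug="on" to_syslog="yes" to_logfile="yes" logfile_priority="debug">
>         <!-- untested; per-daemon overrides per cluster.conf(5) -->
>         <logging_daemon name="corosync" to_logfile="yes" logfile="/var/log/cluster/corosync.log"/>
>         <logging_daemon name="rgmanager" to_logfile="yes" logfile="/var/log/cluster/rgmanager.log"/>
> </logging>
>
> but even without that, I'd expect debug="on" to produce something in
> those files.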
>
> Also, I've shut down my 2 VMs, either with virt-manager or with halt
> from the CLI on the guest itself.
> virsh list on all 3 nodes shows no running guests. However:
>
> root at vm01-test:~# clustat
> Cluster Status for kvm @ Mon Aug 20 14:10:20 2012
> Member Status: Quorate
>
>  Member Name                               ID   Status
>  ------ ----                               ---- ------
>  vm01-test                                    1 Online, Local, rgmanager
>  vm02-test                                    2 Online, rgmanager
>  vm03-test                                    3 Online, rgmanager
>  /dev/mapper/iscsi_cluster_quorum             0 Online, Quorum Disk
>
>  Service Name                   Owner (Last)                   State
>  ------- ----                   ----- ------                   -----
>  vm:intux_firewall              vm02-test                      started
>  vm:intux_zabbix                vm02-test                      started
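>
> If I understand the vm.sh agent correctly, rgmanager only notices a dead
> guest through its periodic status check, so the state can be queried by
> hand (rg_test syntax as I read it in the rgmanager docs; untested here):
>
> # what the resource agent thinks the state of the vm is (syntax untested)
> rg_test test /etc/cluster/cluster.conf status vm firewall
>
> # what libvirt itself reports on each node
> virsh domstate firewall
>
> # stop/start through the cluster instead of inside the guest
> clusvcadm -d vm:firewall
> clusvcadm -e vm:firewall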
>
>
> My config:
>
> <cluster name="kvm" config_version="14">
>   <logging debug="on"/>
>   <clusternodes>
>     <clusternode name="vm01-test" nodeid="1">
>       <fence>
>         <method name="apc">
>           <device name="apc01" port="1" action="off"/>
>           <device name="apc02" port="1" action="off"/>
>           <device name="apc01" port="1" action="on"/>
>           <device name="apc02" port="1" action="on"/>
>         </method>
>       </fence>
>     </clusternode>
>     <clusternode name="vm02-test" nodeid="2">
>       <fence>
>         <method name="apc">
>           <device name="apc01" port="8" action="off"/>
>           <device name="apc02" port="8" action="off"/>
>           <device name="apc01" port="8" action="on"/>
>           <device name="apc02" port="8" action="on"/>
>         </method>
>       </fence>
>     </clusternode>
>     <clusternode name="vm03-test" nodeid="3">
>       <fence>
>         <method name="apc">
>           <device name="apc01" port="2" action="off"/>
>           <device name="apc02" port="2" action="off"/>
>           <device name="apc01" port="2" action="on"/>
>           <device name="apc02" port="2" action="on"/>
>         </method>
>       </fence>
>     </clusternode>
>   </clusternodes>
>   <fencedevices>
>     <fencedevice agent="fence_apc" ipaddr="apc01" secure="on" login="device" name="apc01" passwd="xxx"/>
>     <fencedevice agent="fence_apc" ipaddr="apc02" secure="on" login="device" name="apc02" passwd="xxx"/>
>   </fencedevices>
>   <rm log_level="5">
>     <failoverdomains>
>       <failoverdomain name="any_node" nofailback="1" ordered="0" restricted="0"/>
>     </failoverdomains>
>     <vm domain="any_node" max_restarts="2" migrate="live" name="firewall" path="/etc/libvirt/qemu/" recovery="restart" restart_expire_time="600"/>
>     <vm domain="any_node" max_restarts="2" migrate="live" name="zabbix" path="/etc/libvirt/qemu/" recovery="restart" restart_expire_time="600"/>
>   </rm>
>   <totem rrp_mode="none" secauth="off"/>
>   <quorumd interval="2" tko="4" device="/dev/mapper/iscsi_cluster_quorum"/>
> </cluster>
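>
> (After each config change I bump config_version, then validate and push
> it out with the usual:
>
> # sanity-check the edited cluster.conf, then distribute the new revision
> ccs_config_validate
> cman_tool version -r
>
> so all three nodes should be on revision 14.)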
>
> I hope you guys can shed some light on this. Versions: cman, rgmanager,
> ... = 3.1.7-0ubuntu2.1, corosync = 1.4.2-2.
>
> Kind regards,
>
> Bart



