[Linux-cluster] Cluster logging issues + rgmanager doesn't notice failed vms
Bart Verwilst
lists at verwilst.be
Mon Aug 20 12:24:38 UTC 2012
At the same time, I notice a hanging /etc/libvirt/qemu gfs2 mount
(while /var/lib/libvirt/sanlock still works fine) on vm02. vm01 and vm03
have perfectly accessible mounts. Nothing special to see in syslog or
dmesg.
/dev/mapper/iscsi_cluster_qemu on /etc/libvirt/qemu type gfs2 (rw,relatime,hostdata=jid=2)
/dev/mapper/iscsi_cluster_sanlock on /var/lib/libvirt/sanlock type gfs2 (rw,relatime,hostdata=jid=2)
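
In case it helps, this is roughly the kind of data I can gather on vm02 to
see where the qemu mount is stuck (just a sketch, assuming the usual
cman/dlm userland tools are installed; I may well be looking in the wrong
place):

# processes stuck in uninterruptible sleep (D state), typically the ones
# blocked on the hung gfs2 mount
ps -eo pid,stat,wchan:32,cmd | awk '$2 ~ /D/'

# dump blocked-task backtraces to the kernel log (needs sysrq enabled)
echo w > /proc/sysrq-trigger
dmesg | tail -n 60

# DLM lockspaces and fence/dlm/gfs group state as seen from this node
dlm_tool ls
group_tool ls
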
Any ideas?
Bart
Bart Verwilst wrote on 20.08.2012 14:11:
> Hello again ;)
>
> My cluster seems to be logging only to /var/log/syslog, and even then
> only from the corosync daemon; the logs under /var/log/cluster are empty:
>
> root at vm01-test:~# ls -al /var/log/cluster/*.log
> -rw------- 1 root root 0 Aug 16 06:50 /var/log/cluster/corosync.log
> -rw------- 1 root root 0 Aug 20 06:39 /var/log/cluster/dlm_controld.log
> -rw------- 1 root root 0 Aug 20 06:39 /var/log/cluster/fenced.log
> -rw------- 1 root root 0 Aug  7 06:27 /var/log/cluster/fence_na.log
> -rw------- 1 root root 0 Aug 16 06:50 /var/log/cluster/gfs_controld.log
> -rw------- 1 root root 0 Aug 20 06:39 /var/log/cluster/qdiskd.log
> -rw------- 1 root root 0 Aug 20 06:39 /var/log/cluster/rgmanager.log
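>
> (My understanding, possibly wrong, is that those per-daemon files only get
> written if the <logging> element asks for it. Something like the sketch
> below is what I would try; the to_logfile/logfile attributes and the
> logging_daemon names are taken from cluster.conf(5), I haven't verified
> them on this setup yet.)
>
> # sketch: replace the bare <logging debug="on"/> in /etc/cluster/cluster.conf
> # with per-daemon overrides, e.g.:
> #
> #   <logging debug="on" to_syslog="yes" to_logfile="yes">
> #     <logging_daemon name="rgmanager" logfile="/var/log/cluster/rgmanager.log"/>
> #     <logging_daemon name="fenced" logfile="/var/log/cluster/fenced.log"/>
> #     <logging_daemon name="qdiskd" logfile="/var/log/cluster/qdiskd.log"/>
> #   </logging>
> #
> # then bump config_version (14 -> 15 here), validate and push it out:
> ccs_config_validate
> cman_tool version -r 15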
>
> Also, I've shut down my two VMs, either with virt-manager or with halt
> from the CLI on the guest itself.
> virsh list on all 3 nodes shows no running guests. However:
>
> root at vm01-test:~# clustat
> Cluster Status for kvm @ Mon Aug 20 14:10:20 2012
> Member Status: Quorate
>
>  Member Name                             ID   Status
>  ------ ----                             ---- ------
>  vm01-test                                  1 Online, Local, rgmanager
>  vm02-test                                  2 Online, rgmanager
>  vm03-test                                  3 Online, rgmanager
>  /dev/mapper/iscsi_cluster_quorum           0 Online, Quorum Disk
>
>  Service Name                Owner (Last)                State
>  ------- ----                ----- ------                -----
>  vm:intux_firewall           vm02-test                   started
>  vm:intux_zabbix             vm02-test                   started
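>
> (Side note on the stale "started" state: as far as I understand it,
> rgmanager only notices a dead domain on its periodic status check of the
> vm resource, so the commands below are just a sketch of how I would try to
> confirm what it actually sees; the rg_test syntax is from memory and may
> be slightly off.)
>
> # one-off status check of a vm resource through the resource agent
> rg_test test /etc/cluster/cluster.conf status vm firewall
>
> # and, if needed, make rgmanager drop / re-take control of a service
> clusvcadm -d vm:intux_firewall
> clusvcadm -e vm:intux_firewall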
>
>
> My config:
>
> <cluster name="kvm" config_version="14">
> <logging debug="on"/>
> <clusternodes>
> <clusternode name="vm01-test" nodeid="1">
> <fence>
> <method name="apc">
> <device name="apc01" port="1" action="off"/>
> <device name="apc02" port="1" action="off"/>
> <device name="apc01" port="1" action="on"/>
> <device name="apc02" port="1" action="on"/>
> </method>
> </fence>
> </clusternode>
> <clusternode name="vm02-test" nodeid="2">
> <fence>
> <method name="apc">
> <device name="apc01" port="8" action="off"/>
> <device name="apc02" port="8" action="off"/>
> <device name="apc01" port="8" action="on"/>
> <device name="apc02" port="8" action="on"/>
> </method>
> </fence>
> </clusternode>
> <clusternode name="vm03-test" nodeid="3">
> <fence>
> <method name="apc">
> <device name="apc01" port="2" action="off"/>
> <device name="apc02" port="2" action="off"/>
> <device name="apc01" port="2" action="on"/>
> <device name="apc02" port="2" action="on"/>
> </method>
> </fence>
> </clusternode>
> </clusternodes>
> <fencedevices>
> <fencedevice agent="fence_apc" ipaddr="apc01" secure="on"
> login="device" name="apc01" passwd="xxx"/>
> <fencedevice agent="fence_apc" ipaddr="apc02" secure="on"
> login="device" name="apc02" passwd="xxx"/>
> </fencedevices>
> <rm log_level="5">
> <failoverdomains>
> <failoverdomain name="any_node" nofailback="1" ordered="0"
> restricted="0"/>
> </failoverdomains>
> <vm domain="any_node" max_restarts="2" migrate="live"
> name="firewall" path="/etc/libvirt/qemu/" recovery="restart"
> restart_expire_time="600"/>
> <vm domain="any_node" max_restarts="2" migrate="live" name="zabbix"
> path="/etc/libvirt/qemu/" recovery="restart"
> restart_expire_time="600"/>
> </rm>
> <totem rrp_mode="none" secauth="off"/>
> <quorumd interval="2" tko="4"
> device="/dev/mapper/iscsi_cluster_quorum"></quorumd>
> </cluster>
>
> I hope you guys can shed some light on this. Versions: CMAN, rgmanager, ... =
> 3.1.7-0ubuntu2.1, corosync = 1.4.2-2
>
> Kind regards,
>
> Bart