[Linux-cluster] openais issue

Alan A alan.zg at gmail.com
Mon Sep 28 20:24:48 UTC 2009


I ran into the same issue. I had a node that was working fine and then it locked
up. When I tried to start cman, it told me the openais service could not read
the library because the buffer was full.

All of this went away after I downgraded cman to the recommended version... Is
there a workaround, a fix?

On Mon, Sep 28, 2009 at 10:03 AM, Paras pradhan <pradhanparas at gmail.com> wrote:

> The only thing I noticed after stopping the vm using xm on all nodes and
> starting it using clusvcadm is the message
>
> "Virtual machine guest1 is blocked"
>
> The whole DEBUG file is attached.
>
>
> Thanks
> Paras.
>
> On Fri, Sep 25, 2009 at 5:53 PM, brem belguebli
> <brem.belguebli at gmail.com> wrote:
> > There's a problem with the script that is called by rgmanager to start
> > the VM; I don't know what causes it.
> >
> > Maybe you should try something like this:
> >
> > 1) stop the VM on all nodes with xm commands
> > 2) edit the /usr/share/cluster/vm.sh script and add the following
> > lines (right after the #!/bin/bash line):
> >   exec >/tmp/DEBUG 2>&1
> >   set -x
> > 3) start the VM with clusvcadm -e vm:guest1
> >
> > It should fail as it did before.
> >
> > Open the /tmp/DEBUG file and you will be able to see where it
> > fails (it may generate a lot of debug output).
> >
> > 4) remove the debug lines from /usr/share/cluster/vm.sh
> >
> > Post the DEBUG file if you're not able to see where it fails.
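> >
> > For reference (just a sketch), with those two lines added the top of
> > /usr/share/cluster/vm.sh would look roughly like this:
> >
> >   #!/bin/bash
> >
> >   # temporary debugging: send all output to /tmp/DEBUG and trace commands
> >   exec >/tmp/DEBUG 2>&1
> >   set -x
> >
> >   # ... rest of the original vm.sh left unchanged ...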
> >
> > Brem
> >
> > 2009/9/26 Paras pradhan <pradhanparas at gmail.com>:
> >> No, I am not starting it manually, nor using automatic init scripts.
> >>
> >> I started the vm using: clusvcadm -e vm:guest1
> >>
> >> I have just stopped it using clusvcadm -s vm:guest1. For a few seconds it
> >> says guest1 is started, but after a while I can see guest1 on all
> >> three nodes.
> >>
> >> clustat says:
> >>
> >>  Service Name                                  Owner (Last)          State
> >>  ------- ----                                  ----- ------          -----
> >>  vm:guest1                                     (none)                stopped
> >>
> >> But I can see the vm from xm li.
> >>
> >> This is what I can see from the log:
> >>
> >>
> >> Sep 25 17:19:01 cvtst1 clurgmgrd[4298]: <notice> start on vm "guest1"
> >> returned 1 (generic error)
> >> Sep 25 17:19:01 cvtst1 clurgmgrd[4298]: <warning> #68: Failed to start
> >> vm:guest1; return value: 1
> >> Sep 25 17:19:01 cvtst1 clurgmgrd[4298]: <notice> Stopping service vm:guest1
> >> Sep 25 17:19:02 cvtst1 clurgmgrd[4298]: <notice> Service vm:guest1 is
> >> recovering
> >> Sep 25 17:19:15 cvtst1 clurgmgrd[4298]: <notice> Recovering failed
> >> service vm:guest1
> >> Sep 25 17:19:16 cvtst1 clurgmgrd[4298]: <notice> start on vm "guest1"
> >> returned 1 (generic error)
> >> Sep 25 17:19:16 cvtst1 clurgmgrd[4298]: <warning> #68: Failed to start
> >> vm:guest1; return value: 1
> >> Sep 25 17:19:16 cvtst1 clurgmgrd[4298]: <notice> Stopping service vm:guest1
> >> Sep 25 17:19:17 cvtst1 clurgmgrd[4298]: <notice> Service vm:guest1 is
> >> recovering
> >>
> >>
> >> Paras.
> >>
> >> On Fri, Sep 25, 2009 at 5:07 PM, brem belguebli
> >> <brem.belguebli at gmail.com> wrote:
> >>> Have you started your VM via rgmanager (clusvcadm -e vm:guest1) or
> >>> using xm commands outside of cluster control (or maybe through an
> >>> automatic init script)?
> >>>
> >>> When clustered, you should never start services (manually or through
> >>> an automatic init script) outside of cluster control.
> >>>
> >>> The thing to do would be to stop your vm on all the nodes with the
> >>> appropriate xm command (I'm not using xen myself) and then try to
> >>> start it with clusvcadm, as sketched below.
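> >>>
> >>> Something along these lines (since I'm not a xen user, treat the exact
> >>> xm syntax as a guess and check it on your side):
> >>>
> >>>   # on every node that shows the guest:
> >>>   xm shutdown guest1        # or 'xm destroy guest1' if it won't shut down
> >>>
> >>>   # then, from one node only, let the cluster start it:
> >>>   clusvcadm -e vm:guest1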
> >>>
> >>> Then see if it is started on all nodes (send clustat output)
> >>>
> >>>
> >>>
> >>> 2009/9/25 Paras pradhan <pradhanparas at gmail.com>:
> >>>> Ok, please see below. My vm is running on all nodes even though clustat
> >>>> says it is stopped.
> >>>>
> >>>> --
> >>>> [root at cvtst1 ~]# clustat
> >>>> Cluster Status for test @ Fri Sep 25 16:52:34 2009
> >>>> Member Status: Quorate
> >>>>
> >>>>  Member Name                                      ID   Status
> >>>>  ------ ----                                      --   ------
> >>>>  cvtst2                                             1   Online, rgmanager
> >>>>  cvtst1                                             2   Online, Local, rgmanager
> >>>>  cvtst3                                             3   Online, rgmanager
> >>>>
> >>>>  Service Name                                Owner (Last)          State
> >>>>  ------- ----                                ----- ------          -----
> >>>>  vm:guest1                                   (none)                stopped
> >>>> [root at cvtst1 ~]#
> >>>>
> >>>>
> >>>> ---
> >>>> o/p of xm li on cvtst1
> >>>>
> >>>> --
> >>>> [root at cvtst1 ~]# xm li
> >>>> Name                                      ID Mem(MiB) VCPUs State   Time(s)
> >>>> Domain-0                                   0     3470     2 r-----  28939.4
> >>>> guest1                                      7      511     1 -b----   7727.8
> >>>>
> >>>> o/p of xm li on cvtst2
> >>>>
> >>>> --
> >>>> [root at cvtst2 ~]# xm li
> >>>> Name                                      ID Mem(MiB) VCPUs State   Time(s)
> >>>> Domain-0                                   0     3470     2 r-----  31558.9
> >>>> guest1                                     21      511     1 -b----   7558.2
> >>>> ---
> >>>>
> >>>> Thanks
> >>>> Paras.
> >>>>
> >>>>
> >>>>
> >>>> On Fri, Sep 25, 2009 at 4:22 PM, brem belguebli
> >>>> <brem.belguebli at gmail.com> wrote:
> >>>>> It looks like no.
> >>>>>
> >>>>> Can you send the output of clustat from when the VM is running on
> >>>>> multiple nodes at the same time?
> >>>>>
> >>>>> And, by the way, another one after having stopped it (clusvcadm -s vm:guest1)?
> >>>>>
> >>>>>
> >>>>>
> >>>>> 2009/9/25 Paras pradhan <pradhanparas at gmail.com>:
> >>>>>> Is anyone else having the same issue as mine? The virtual machine
> >>>>>> service is not being handled properly by the cluster.
> >>>>>>
> >>>>>>
> >>>>>> Thanks
> >>>>>> Paras.
> >>>>>>
> >>>>>> On Mon, Sep 21, 2009 at 9:55 AM, Paras pradhan <pradhanparas at gmail.com> wrote:
> >>>>>>> Ok.. here is my cluster.conf file
> >>>>>>>
> >>>>>>> --
> >>>>>>> [root at cvtst1 cluster]# more cluster.conf
> >>>>>>> <?xml version="1.0"?>
> >>>>>>> <cluster alias="test" config_version="9" name="test">
> >>>>>>>        <fence_daemon clean_start="0" post_fail_delay="0" post_join_delay="3"/>
> >>>>>>>        <clusternodes>
> >>>>>>>                <clusternode name="cvtst2" nodeid="1" votes="1">
> >>>>>>>                        <fence/>
> >>>>>>>                </clusternode>
> >>>>>>>                <clusternode name="cvtst1" nodeid="2" votes="1">
> >>>>>>>                        <fence/>
> >>>>>>>                </clusternode>
> >>>>>>>                <clusternode name="cvtst3" nodeid="3" votes="1">
> >>>>>>>                        <fence/>
> >>>>>>>                </clusternode>
> >>>>>>>        </clusternodes>
> >>>>>>>        <cman/>
> >>>>>>>        <fencedevices/>
> >>>>>>>        <rm>
> >>>>>>>                <failoverdomains>
> >>>>>>>                        <failoverdomain name="myfd1" nofailback="0" ordered="1" restricted="0">
> >>>>>>>                                <failoverdomainnode name="cvtst2" priority="3"/>
> >>>>>>>                                <failoverdomainnode name="cvtst1" priority="1"/>
> >>>>>>>                                <failoverdomainnode name="cvtst3" priority="2"/>
> >>>>>>>                        </failoverdomain>
> >>>>>>>                </failoverdomains>
> >>>>>>>                <resources/>
> >>>>>>>                <vm autostart="1" domain="myfd1" exclusive="0" max_restarts="0"
> >>>>>>>                        name="guest1" path="/vms" recovery="restart" restart_expire_time="0"/>
> >>>>>>>        </rm>
> >>>>>>> </cluster>
> >>>>>>> [root at cvtst1 cluster]#
> >>>>>>> ------
> >>>>>>>
> >>>>>>> Thanks!
> >>>>>>> Paras.
> >>>>>>>
> >>>>>>>
> >>>>>>> On Sun, Sep 20, 2009 at 9:44 AM, Volker Dormeyer <volker at ixolution.de> wrote:
> >>>>>>>> On Fri, Sep 18, 2009 at 05:08:57PM -0500,
> >>>>>>>> Paras pradhan <pradhanparas at gmail.com> wrote:
> >>>>>>>>> I am using the cluster suite for HA of xen virtual machines. Now I am
> >>>>>>>>> having another problem. When I start my xen vm on one node, it
> >>>>>>>>> also starts on the other nodes. Which daemon controls this?
> >>>>>>>>
> >>>>>>>> This is usually done by clurgmgrd (which is part of the rgmanager
> >>>>>>>> package). To me, this sounds like a configuration problem. Maybe
> >>>>>>>> you can post your cluster.conf?
> >>>>>>>>
> >>>>>>>> Regards,
> >>>>>>>> Volker
> >>>>>>>>



-- 
Alan A.