[rdo-list] Newton HA galera-ready dependency error

Charles Short cems at ebi.ac.uk
Wed Oct 12 10:48:36 UTC 2016


Ok, I updated puppet-tripleo with virt-customize as suggested, and
verified (using guestmount) that the patch is now present in the image.
I am now redeploying.
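
For anyone following along, the update and check can be sketched roughly as
below. This is only a sketch under assumptions: the image file name
overcloud-full.qcow2, the mount point /mnt/overcloud and the function names
are illustrative and not taken from this thread. It also assumes the repo
file is already present inside the image, and that the package installs the
manifest under /usr/share/openstack-puppet, as seen on the undercloud.

```shell
#!/bin/bash
# Sketch only: image name, mount point and function names are assumptions.
IMAGE="${IMAGE:-overcloud-full.qcow2}"
MANIFEST_REL="usr/share/openstack-puppet/modules/tripleo/manifests/profile/pacemaker/database/mysql.pp"

# Update puppet-tripleo inside the image, using the repos already configured in it.
update_puppet_tripleo() {
    virt-customize -a "$IMAGE" --run-command 'yum -y update puppet-tripleo'
}

# Return 0 if the given manifest contains the short-node-name lookup.
manifest_has_patch() {
    grep -q 'mysql_short_node_names' "$1"
}

# Mount the image read-only, check the manifest, then unmount again.
verify_image() {
    local mnt="${1:-/mnt/overcloud}"
    local rc=0
    mkdir -p "$mnt"
    guestmount -a "$IMAGE" -i --ro "$mnt" || return 1
    manifest_has_patch "$mnt/$MANIFEST_REL" || rc=$?
    guestunmount "$mnt"
    return "$rc"
}

# Typical order (nothing is run automatically):
#   update_puppet_tripleo && verify_image
#   openstack overcloud image upload --update-existing
```

The functions are only defined, not called, so the file can be sourced and
each step run by hand before redeploying.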

Charles

On 12/10/2016 10:56, Marius Cornea wrote:
> I see that in that repo puppet-tripleo got updated on Friday,
> 2016-10-07. If the repo file is present inside the image you could
> update puppet-tripleo with virt-customize, then upload the image to
> the undercloud Glance with 'openstack overcloud image upload
> --update-existing' and redeploy.
>
> On Wed, Oct 12, 2016 at 11:45 AM, Charles Short <cems at ebi.ac.uk> wrote:
>> Just checked what I did to build the image previously.
>>
>> I used
>>   export DELOREAN_TRUNK_REPO="http://trunk.rdoproject.org/centos7-newton/current/"
>>
>> Which is the same repo I used to install the Undercloud.
>> Maybe I need to do a yum update in the image with libguestfs-tools prior to
>> building the image?
>>
>> I will rebuild again anyway in case I made an error
>>
>> C
>>
>>
>>
>> On 12/10/2016 10:28, Charles Short wrote:
>>> Hi,
>>>
>>> Ok,  I will rebuild using the undercloud repo and report back.
>>>
>>> Thanks for your help
>>>
>>> Charles
>>>
>>> On 12/10/2016 09:44, Marius Cornea wrote:
>>>> Oh, that explains it. It looks like the overcloud image doesn't
>>>> contain the patch.
>>>>
>>>> I'm not familiar with the image build process but according to the
>>>> docs[1] I think the packages get installed from the repo specified by
>>>> export DELOREAN_TRUNK_REPO so maybe you should try to rebuild the
>>>> image and use the same repo as the one set on the undercloud.
>>>>
>>>> [1]
>>>> http://docs.openstack.org/developer/tripleo-docs/basic_deployment/basic_deployment_cli.html
>>>>
>>>> On Wed, Oct 12, 2016 at 10:30 AM, Charles Short <cems at ebi.ac.uk> wrote:
>>>>> Hi,
>>>>>
>>>>> We noticed something that may explain why this patch is not working,
>>>>> and which may be related to the way the Undercloud built the image.
>>>>>
>>>>> This is the difference between the puppet files on the undercloud and
>>>>> the puppet files in the image:
>>>>>
>>>>> [stack@hh-extcl05-undercloud ~]$ grep mysql_short_node_names /usr/share/openstack-puppet/modules/tripleo/manifests/profile/pacemaker/database/mysql.pp
>>>>> $galera_node_names_lookup = hiera('mysql_short_node_names', hiera('mysql_node_names', $::hostname))
>>>>>
>>>>> [root@overcloud-controller-1 puppet]# grep mysql_short_node_names /etc/puppet/modules/tripleo/manifests/profile/pacemaker/database/mysql.pp
>>>>> (nothing found)
>>>>>
>>>>>
>>>>> On 12/10/2016 09:09, Marius Cornea wrote:
>>>>>> That's odd. I encountered the same issue and it was caused by missing
>>>>>> this patch.  What do you get if you do sudo hiera
>>>>>> mysql_short_node_names on the controller node?
>>>>>>
>>>>>> On Wed, Oct 12, 2016 at 12:48 AM, Charles Short <cems at ebi.ac.uk> wrote:
>>>>>>> Hi,
>>>>>>>
>>>>>>> puppet-tripleo-5.2.0-0.20161007035759.f32e484.el7.centos.noarch
>>>>>>>
>>>>>>> The patch seems to be already present -
>>>>>>>
>>>>>>>      grep short /usr/share/openstack-puppet/modules/tripleo/manifests/profile/pacemaker/database/mysql.pp
>>>>>>>
>>>>>>>      # short name which is already registered in pacemaker until we get around
>>>>>>>      $galera_node_names_lookup = hiera('mysql_short_node_names', hiera('mysql_node_names', $::hostname))
>>>>>>>
>>>>>>> Charles
>>>>>>>
>>>>>>> On 11/10/2016 19:39, Marius Cornea wrote:
>>>>>>>> I think the issue is caused by the addresses in wsrep_cluster_address
>>>>>>>> not matching the pacemaker node names:
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> wsrep_cluster_address=gcomm://overcloud-controller-0.internalapi.localdomain,overcloud-controller-1.internalapi.localdomain,overcloud-controller-2.internalapi.localdomain
>>>>>>>>
>>>>>>>> Could you please confirm what version of puppet-tripleo you've got
>>>>>>>> installed on the overcloud nodes and if it contains the following
>>>>>>>> patch:
>>>>>>>>
>>>>>>>>
>>>>>>>> https://review.openstack.org/#/c/382883/1/manifests/profile/pacemaker/database/mysql.pp
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> Marius
>>>>>>>>
>>>>>>>> On Tue, Oct 11, 2016 at 7:42 PM, Charles Short <cems at ebi.ac.uk> wrote:
>>>>>>>>> Ok, the install finished with the same error.
>>>>>>>>> The latest pcs status etc.:
>>>>>>>>>
>>>>>>>>> http://pastebin.com/ZK683gZe
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On 11/10/2016 17:35, Charles Short wrote:
>>>>>>>>>> Deployment almost finished...so
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> http://pastebin.com/zE9B19XB
>>>>>>>>>>
>>>>>>>>>> This shows the pcs status as the deployment nears the end, and pcs
>>>>>>>>>> resource show galera
>>>>>>>>>>
>>>>>>>>>> Charles
>>>>>>>>>>
>>>>>>>>>> On 11/10/2016 16:59, Marius Cornea wrote:
>>>>>>>>>>> Great, thanks for checking this.
>>>>>>>>>>>
>>>>>>>>>>> On Tue, Oct 11, 2016 at 5:58 PM, Charles Short <cems at ebi.ac.uk> wrote:
>>>>>>>>>>>> Currently having more generic deployment issues (no valid host found etc.).
>>>>>>>>>>>> I can work around/solve these.
>>>>>>>>>>>> I don't yet have another stack to analyse, but will do soon.
>>>>>>>>>>>>
>>>>>>>>>>>> Charles
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> On 11/10/2016 16:35, Marius Cornea wrote:
>>>>>>>>>>>>> Did it succeed in bringing the Galera nodes to Master? You can ssh to
>>>>>>>>>>>>> the nodes and run 'pcs resource show galera' even though the
>>>>>>>>>>>>> deployment hasn't finished. I'm interested to see how the
>>>>>>>>>>>>> wsrep_cluster_address is set to see if it's affected by the resource
>>>>>>>>>>>>> agent issue described in
>>>>>>>>>>>>> https://bugs.launchpad.net/tripleo/+bug/1628521
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Tue, Oct 11, 2016 at 5:18 PM, Charles Short <cems at ebi.ac.uk> wrote:
>>>>>>>>>>>>>> Looks similar to this bug (still waiting on deployment to finish)
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> https://bugzilla.redhat.com/show_bug.cgi?id=1368214
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On 11/10/2016 15:25, Charles Short wrote:
>>>>>>>>>>>>>>> Sorry for the delay.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Just redeploying to make sure I can repeat the same error.
>>>>>>>>>>>>>>> Should not be long.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Charles
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On 11/10/2016 14:24, Marius Cornea wrote:
>>>>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Could you also please paste the output of 'pcs resource show galera'?
>>>>>>>>>>>>>>>> It looks like all the galera nodes show up as slaves:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>         Master/Slave Set: galera-master [galera]
>>>>>>>>>>>>>>>>             Slaves: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Thanks
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On Tue, Oct 11, 2016 at 2:16 PM, Charles Short <cems at ebi.ac.uk> wrote:
>>>>>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Here you are -
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>         - Heat stack error  - http://pastebin.com/E8KZa2vE
>>>>>>>>>>>>>>>>>         - PCS status  - http://pastebin.com/z34gSLq6
>>>>>>>>>>>>>>>>>         - mariadb.log - http://pastebin.com/APFXPBLc
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Thanks
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Charles
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> On 11/10/2016 12:07, Marius Cornea wrote:
>>>>>>>>>>>>>>>>>> Hi Charles,
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Could you please paste the output of 'pcs status'? The log in
>>>>>>>>>>>>>>>>>> /var/log/mariadb/mariadb.log might also be a good indicator.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> On Tue, Oct 11, 2016 at 11:16 AM, Charles Short <cems at ebi.ac.uk> wrote:
>>>>>>>>>>>>>>>>>>> To add, I built my own image from
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> http://cloud.centos.org/centos/7/images/CentOS-7-x86_64-GenericCloud.qcow2
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> as the images in
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> http://buildlogs.centos.org/centos/7/cloud/x86_64/tripleo_images/newton/delorean/
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> caused sporadic ramdisk loading errors (hung at x% loaded on boot).
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Does my image now need to be customised in any way for HA to work?
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Charles
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> On 11/10/2016 09:55, Charles Short wrote:
>>>>>>>>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> I am installing Newton with TripleO on baremetal HP blades.
>>>>>>>>>>>>>>>>>>>> I can deploy a single-controller overcloud stack with no problem;
>>>>>>>>>>>>>>>>>>>> however, when I choose three controllers (including
>>>>>>>>>>>>>>>>>>>> /usr/share/openstack-tripleo-heat-templates/environments/puppet-pacemaker.yaml)
>>>>>>>>>>>>>>>>>>>> the deployment fails.
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> The heat stack error first complains "Dependency Exec[galera-ready]
>>>>>>>>>>>>>>>>>>>> has failures", which in turn causes lots of other errors.
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> I have deployed Liberty and Mitaka successfully in the past on
>>>>>>>>>>>>>>>>>>>> baremetal with three controllers, and this is the first time I have
>>>>>>>>>>>>>>>>>>>> seen this error.
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> Charles
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> --
>>>>>>>>>>>>>>>>>>> Charles Short
>>>>>>>>>>>>>>>>>>> Cloud Engineer
>>>>>>>>>>>>>>>>>>> Virtualization and Cloud Team
>>>>>>>>>>>>>>>>>>> European Bioinformatics Institute (EMBL-EBI)
>>>>>>>>>>>>>>>>>>> Tel: +44 (0)1223 494205
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> _______________________________________________
>>>>>>>>>>>>>>>>>>> rdo-list mailing list
>>>>>>>>>>>>>>>>>>> rdo-list at redhat.com
>>>>>>>>>>>>>>>>>>> https://www.redhat.com/mailman/listinfo/rdo-list
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> To unsubscribe: rdo-list-unsubscribe at redhat.com

-- 
Charles Short
Cloud Engineer
Virtualization and Cloud Team
European Bioinformatics Institute (EMBL-EBI)
Tel: +44 (0)1223 494205
