[rdo-list] Updates to RDO slaves and jobs in ci.centos.org

David Moreau Simard dms at redhat.com
Fri Apr 21 11:55:04 UTC 2017


The performance is not great because of "rdo-ci-slave01", the node Ansible
runs from.

We all know that node has performance problems (especially I/O).
For example, a promote job [1] takes 1 hour and 4 minutes, while the
equivalent generic job [2] (run on a cloudslave) finishes in about 35
minutes.

I mean, it takes rdo-ci-slave01 more than five (5!) minutes just to
bootstrap the job (clone weirdo, create a virtualenv with ara, ansible and
shade, and initialize ara).
The same thing takes less than 30 seconds on a cloudslave.
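
For a rough sense of scale, the figures above work out as follows. This is
only a back-of-the-envelope sketch: the durations are the approximate ones
quoted in this message, and the variable names are made up for illustration.

```python
# Approximate durations quoted above, converted to seconds.
promote_job = 64 * 60        # promote job on rdo-ci-slave01: 1 hour 4 minutes
generic_job = 35 * 60        # equivalent generic job on a cloudslave: ~35 minutes
bootstrap_slave01 = 5 * 60   # bootstrap on rdo-ci-slave01: "more than five minutes" (lower bound)
bootstrap_cloud = 30         # bootstrap on a cloudslave: "less than 30 seconds" (upper bound)

job_slowdown = promote_job / generic_job
bootstrap_slowdown = bootstrap_slave01 / bootstrap_cloud

print(f"whole job: about {job_slowdown:.1f}x slower on rdo-ci-slave01")
print(f"bootstrap: at least {bootstrap_slowdown:.0f}x slower on rdo-ci-slave01")
```

Since the bootstrap figures are a lower bound on one side and an upper bound
on the other, the real bootstrap ratio is somewhat higher than what this
prints.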

[1]:
https://ci.centos.org/job/weirdo-master-promote-packstack-scenario001/1080/
[2]:
https://ci.centos.org/view/rdo/view/weirdo/job/weirdo-generic-packstack-scenario001/515/

David Moreau Simard
Senior Software Engineer | Openstack RDO

dmsimard = [irc, github, twitter]

On Apr 21, 2017 4:22 AM, "Alfredo Moralejo Alonso" <amoralej at redhat.com>
wrote:

> On Fri, Apr 21, 2017 at 2:40 AM, David Moreau Simard <dms at redhat.com>
> wrote:
> > WeIRDO jobs were tested manually on the rdo-ci-slave01 (promote slave)
> > on which the jobs would not run successfully yesterday.
> >
> > Everything now looks good after untangling the update issue from
> > yesterday and WeIRDO promote jobs have been switched to rdo-cloud.
> >
>
> Nice! I've seen weirdo jobs in
> https://ci.centos.org/view/rdo/view/promotion-pipeline/job/rdo_trunk-promote-master-current-tripleo/44/
> run in RDO Cloud with pretty good performance. They seem to run
> slower than jobs running on dusty servers in ci.centos but faster than
> the rest of the servers.
>
> I'll keep an eye on it too to find out if there is any abnormal behavior.
>
>
> > I'll be monitoring this closely but let me know if you see any problems.
> >
> > David Moreau Simard
> > Senior Software Engineer | Openstack RDO
> >
> > dmsimard = [irc, github, twitter]
> >
> >
> > On Thu, Apr 20, 2017 at 12:26 AM, David Moreau Simard <dms at redhat.com>
> > wrote:
> >> Hi,
> >>
> >> There have been a few updates worth mentioning and explaining to a wider
> >> audience as far as RDO is concerned on the ci.centos.org environment.
> >>
> >> First, please note that all packages on the five RDO slaves have been
> >> updated to the latest version.
> >> Until now, they had not been updated to 7.3.
> >>
> >> The rdo-ci-slave01 node (the "promotion" slave) ran into some issues
> >> that took some time to fix: EPEL was enabled and it picked up python
> >> packages it shouldn't have.
> >> Things seem to be back in order now, but some jobs might have failed in
> >> a weird way; triggering them again should be fine.
> >>
> >> Otherwise, all generic WeIRDO jobs are now running on OpenStack
> >> virtual machines provided by the RDO Cloud.
> >> This is done through the "rdo-virtualized" slave tag.
> >> The "rdo-promote-virtualized" tag will be used for the weirdo promote
> >> jobs once we're sure there are no more issues running them on the
> >> promotion slave.
> >>
> >> These tags are designed to work with WeIRDO jobs only for the time
> >> being. Please contact me if you'd like to run virtualized workloads
> >> from ci.centos.org.
> >>
> >> This amounts to around 35 fewer jobs running on Duffy ci.centos.org
> >> hardware in total on a typical day (including generic weirdo jobs and
> >> promote weirdo jobs).
> >>
> >> I've re-shuffled the capacity around a bit, considering we've now
> >> freed significant capacity for bare-metal based TripleO jobs.
> >> The slave threads are now as follows:
> >> - rdo-ci-slave01: 12 threads (up from 11), tagged with "rdo-promote"
> >>   and "rdo-promote-virtualized"
> >> - rdo-ci-cloudslave01: 6 threads (up from 4), tagged with "rdo"
> >> - rdo-ci-cloudslave02: 6 threads (up from 4), tagged with "rdo"
> >> - rdo-ci-cloudslave03: 8 threads (up from 4), tagged with
> >>   "rdo-virtualized"
> >> - rdo-ci-cloudslave04: 8 threads (down from 15), tagged with
> >>   "rdo-virtualized"
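
The allocation listed above can be tallied per tag with a few lines of
Python. This is just an illustrative sketch: the numbers are copied from the
list, while the data layout and the helper name threads_per_tag are made up
here and do not correspond to any actual configuration file.

```python
# Thread counts per slave as listed above: (before, after, tags).
slaves = {
    "rdo-ci-slave01":      (11, 12, ["rdo-promote", "rdo-promote-virtualized"]),
    "rdo-ci-cloudslave01": (4, 6, ["rdo"]),
    "rdo-ci-cloudslave02": (4, 6, ["rdo"]),
    "rdo-ci-cloudslave03": (4, 8, ["rdo-virtualized"]),
    "rdo-ci-cloudslave04": (15, 8, ["rdo-virtualized"]),
}

def threads_per_tag(slaves):
    """Sum the new thread count of every slave carrying each tag.

    A slave with two tags shares its threads between them, so its
    count shows up under both tags.
    """
    totals = {}
    for _before, after, tags in slaves.values():
        for tag in tags:
            totals[tag] = totals.get(tag, 0) + after
    return totals

total_before = sum(before for before, _after, _tags in slaves.values())
total_after = sum(after for _before, after, _tags in slaves.values())

print(threads_per_tag(slaves))
print(f"total threads: {total_before} -> {total_after}")
```

With the numbers above, the "rdo-virtualized" tag ends up with 16 threads,
and the overall total goes from 38 to 40.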
> >>
> >> There is a specific reason why cloudslave03 and cloudslave04 amount to
> >> 16 threads between the two: it is to match the quota we have been
> >> given in terms of capacity at RDO cloud.
> >> The threads will be used to artificially limit the number of jobs run
> >> against the cloud concurrently without needing to implement queueing
> >> on our end.
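
The idea of using a fixed number of threads as a concurrency cap can be
modelled with a small standalone sketch. It is not how Jenkins implements
this; it only imitates 16 slots with a semaphore, and the run_job function
and the job count of 40 are hypothetical.

```python
import threading
import time

QUOTA = 16  # cloudslave03 (8) + cloudslave04 (8), per the message above
slots = threading.BoundedSemaphore(QUOTA)

lock = threading.Lock()
running = 0
peak = 0

def run_job(job_id):
    """Hypothetical job: waits for a free slot, then 'runs' briefly."""
    global running, peak
    with slots:                  # blocks while all 16 slots are busy
        with lock:
            running += 1
            peak = max(peak, running)
        time.sleep(0.01)         # stand-in for the real work against the cloud
        with lock:
            running -= 1

# Submit more jobs than there are slots; the extra ones simply wait.
jobs = [threading.Thread(target=run_job, args=(i,)) for i in range(40)]
for t in jobs:
    t.start()
for t in jobs:
    t.join()

print(f"peak concurrent jobs: {peak} (quota: {QUOTA})")
```

However many jobs are submitted, the peak never exceeds the quota, so the
waiting happens on the slave side and no separate queue is needed.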
> >>
> >> Otherwise, you'll notice the net effect for the "rdo" and "rdo-promote"
> >> tags isn't much. At least for the time being, capacity is very much the
> >> same, since I've re-allocated cloudslave03 to load balance virtualized jobs.
> >> However, jobs are likely to be more reliable and faster now that they
> >> won't have to retry for nodes, because we're less likely to hit
> >> rate-limiting.
> >>
> >> I'll monitor the situation over the next few days and bump the numbers
> >> if everything is looking good.
> >> That said, I'd like to hear your feedback on whether you feel things are
> >> looking better and whether we are running into "out of inventory" errors
> >> less often.
> >>
> >> Let me know if you have any questions,
> >>
> >> David Moreau Simard
> >> Senior Software Engineer | Openstack RDO
> >>
> >> dmsimard = [irc, github, twitter]
>