[Pulp-list] REST API performance-related question(s)

Deej Howard Deej.Howard at neulion.com
Tue Dec 19 16:17:20 UTC 2017


                Absolutely, Brian – you can find the data at
http://www.wildkidz.com/user_uploads/cProfile.zip  Thanks for answering!



                I did actually use a few of the tools to look at the data,
but as I mentioned, that didn’t get me very far – at least not yet.  As an
example, the output for the “orphan” call (task
e30b0bb4-8354-4959-b17e-05ce380bf679), sorted by cumulative time, looked
like:



Mon Dec 18 19:06:07 2017    /data/pulp/var/lib/pulp/c_profiles/e30b0bb4-8354-4959-b17e-05ce380bf679

         4511314 function calls (4375770 primitive calls) in 8.681 seconds

   Ordered by: cumulative time

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.000    0.000    8.676    8.676 /usr/lib/python2.7/site-packages/pulp/server/async/tasks.py:101(__call__)
        1    0.000    0.000    8.676    8.676 /usr/lib/python2.7/site-packages/celery/app/trace.py:432(__protected_call__)
        1    0.000    0.000    8.676    8.676 /usr/lib/python2.7/site-packages/pulp/server/managers/content/orphan.py:208(delete_orphans_by_type)
        4    0.144    0.036    8.659    2.165 /usr/lib/python2.7/site-packages/pulp/server/managers/content/orphan.py:87(generate_orphans_by_type)
    22491    0.116    0.000    7.619    0.000 /usr/lib64/python2.7/site-packages/pymongo/cursor.py:661(count)

…(further output omitted for brevity)…
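
For reference, I pulled that view out of the profile file with the standard
pstats module, roughly like this:

    import pstats

    # Load the per-task profile Pulp wrote out and show the top entries,
    # sorted by cumulative time (path taken from the run above).
    stats = pstats.Stats(
        "/data/pulp/var/lib/pulp/c_profiles/e30b0bb4-8354-4959-b17e-05ce380bf679")
    stats.sort_stats("cumulative").print_stats(5)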



                Obviously, the top 4 functions are where the lion’s share of
the time is being spent, and it seems clear that in this case the
*generate_orphans_by_type* function (and/or functions it calls) is largely
to blame.  But this is where I run into trouble: I haven’t yet figured out
how to track down the actual culprit(s) from there.  The very next line of
the cProfile output shows a function that is called 22K+ times – I don’t
think that’s where the real problem lies.  Or is it?  Looking at the
*generate_orphans_by_type* code, it does in fact call a *count* function
within a loop…

This is starting to look like, upon removal of any artifact (unit), that
function examines every artifact in my repo, comparing some sort of ID for
what I assume is a reference count, right?  That would mean I can expect
this function to get slower as the number of artifacts (units) in my repo
grows, yes?  Or am I totally missing something here (I admit I’m no Python
guru, so that’s entirely possible)?
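
If I sketch what I think that loop is doing (this is just my reading of it,
not the actual Pulp source; the collection and field names are guesses based
on the profile output), it looks something like:

    from pymongo import MongoClient

    db = MongoClient()["pulp_database"]  # stand-in; Pulp manages its own connection

    def generate_orphans_sketch(type_collection_name):
        """Yield content units of one type that no repository references."""
        for unit in db[type_collection_name].find():
            # One round trip to Mongo per unit; with ~22K units in the repo,
            # this would account for the 22491 cursor count() calls above.
            refs = db["repo_content_units"].find(
                {"unit_id": unit["_id"]}).count()
            if refs == 0:
                yield unit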





*From:* Brian Bouterse [mailto:bbouters at redhat.com]
*Sent:* Monday, December 18, 2017 3:59 PM
*To:* Deej Howard <Deej.Howard at neulion.com>
*Cc:* pulp-list <pulp-list at redhat.com>
*Subject:* Re: [Pulp-list] REST API performance-related question(s)



Is there some way you can post the cProfile data? If you don't have a place
to post, one option may be to file an issue at pulp.plan.io.

Consider using one of these tools [4] to look for the line of Pulp code
that has the longest cumulative time. The tools provide sorting so that
should be easy. Then if you look at that part of the Pulp code you can get
an idea of how your task is spending its time. Usually what I find is that
Pulp is waiting on either the disk or the database, and the cProfile report
can show you that. For example, if you're waiting on mongo you'll see a lot
of time attributed to lines in PyMongo code. That means Pulp calls a PyMongo
method like find() or count() and then just waits for several seconds. That
would be a "waiting on the db" issue that you can observe with cProfile.
Once you know where the issue is, we can talk about ways to improve it. Even
in cases of DB wait, perhaps there is a way to restructure the code to read
less from the database, so there are still things that can be done.
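
For instance (purely illustrative, not actual Pulp code), a per-unit count()
loop can often be restructured into a single bulk read plus an in-memory
membership test:

    def find_orphans_batched(db, type_collection_name):
        """Fetch the set of referenced unit ids once, then test membership."""
        # One distinct() query replaces thousands of count() calls; the
        # collection and field names here are made up for the example.
        referenced = set(db["repo_content_units"].distinct("unit_id"))
        for unit in db[type_collection_name].find():
            if unit["_id"] not in referenced:
                yield unit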



If you post the data, maybe someone can help root cause the performance
issue.


[4]:
https://docs.pulpproject.org/dev-guide/debugging.html#analyzing-profiles



-Brian





On Mon, Dec 18, 2017 at 5:44 PM, Deej Howard <Deej.Howard at neulion.com>
wrote:

                Hoping to follow up on my own questions, I attempted to take
advantage of the cProfile functionality [3] against a new run of my cleanup
script (profiling was enabled within the Apache/Pulp container to capture
data on the server side).  This run had an even more isolated run-time data
set, with only a single invocation of each of the “unassociate”, “orphans”,
and “publish” operations (clocking in at 15.22, 10.28, and 60.59 seconds,
respectively), and I did in fact end up with cProfile data for each of these
3 tasks.  This is a very nice feature, and I’ll bet that data would be
really useful to someone much more familiar with Pulp than I am… but so far
I haven’t managed to make much use of it, or to see how it is affected by my
specific repository configuration.
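
In case it helps anyone reproduce this, enabling it amounted to adding the
[profiling] section described in [3] to /etc/pulp/server.conf inside the
container (the directory shown is, as I understand it, the default):

    [profiling]
    enabled: true
    directory: /var/lib/pulp/c_profiles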

                Still looking forward to some insights from the experts.



[3] https://docs.pulpproject.org/dev-guide/debugging.html



*From:* Deej Howard [mailto:Deej.Howard at neulion.com]
*Sent:* Friday, December 15, 2017 10:44 AM
*To:* pulp-list <pulp-list at redhat.com>
*Cc:* deej.howard at neulion.com
*Subject:* REST API performance-related question(s)



                Hi, I’m using the 2.14.3 release in a Docker-based
configuration (details below), and I’ve noticed some performance-related
issues in a script-based artifact cleanup job that is run on a daily
basis.  The artifacts in question are of our own construction, incorporated
via the Pulp plugin mechanism, and all residing in a single repository
(there are around 22K artifacts in that one repo at this point).  The
Python script makes various Pulp REST API calls, and I’ve put in some extra
code to give me feedback on how much time each call takes.  The “query”
calls have acceptable performance (typically under a second), but others are
much slower: calls to “unassociate” and “orphans” take somewhere around 10s,
and calls to “publish” take around 45s.

                I’m looking for some guidance on how I can improve this
performance.  I’m not the original author of this code, but I was lucky(?)
enough to inherit it.  The core algorithm essentially does some queries to
get the essential “keys” for the artifacts in question, then calls
“unassociate” with the relevant JSON payload for those artifacts, followed
by “orphans” to do the actual clean-up, then “publish” after that
completes.  This cycle is potentially executed multiple times per run of the
cleanup script (once per “artifact group”).
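
Concretely, each cycle boils down to a sequence of REST calls along these
lines (a simplified sketch of our script; the host, credentials, type id,
distributor id, and filter field are placeholders):

    import requests

    BASE = "https://pulp.example.com/pulp/api/v2"  # placeholder host
    AUTH = ("admin", "admin")                      # placeholder credentials

    def cleanup_cycle(repo_id, unit_ids):
        # 1. Unassociate the targeted units from the repository.
        requests.post(
            "%s/repositories/%s/actions/unassociate/" % (BASE, repo_id),
            json={"criteria": {"type_ids": ["our_type"],
                               "filters": {"unit": {"_id": {"$in": unit_ids}}}}},
            auth=AUTH, verify=False)

        # 2. Remove the now-orphaned units of our content type.
        requests.delete("%s/content/orphans/our_type/" % BASE,
                        auth=AUTH, verify=False)

        # 3. Republish the repository so the removal is visible.
        requests.post("%s/repositories/%s/actions/publish/" % (BASE, repo_id),
                      json={"id": "our_distributor"}, auth=AUTH, verify=False)

    # Each call returns a 202 with a task reference; the real script polls
    # the tasks API until each task completes before moving on.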

                Some specific questions I have:

   - Is the methodology outlined above appropriate for removing artifacts
   from a repository, or would some other mechanism be better/more efficient?
   - In the documentation for implementing support for new types[1], there
   is mention of a type definition JSON file that belongs in
   /usr/lib/pulp/plugins/types[2]. Unfortunately, it’s not clear which of
   the Pulp components (Qpid? MongoDB? Resource manager? Workers?) uses that
   information, and our installation has no files at all in that directory.
   We have other repo types installed (puppet, python), so I would have
   expected at least one such file, especially since puppet_module is the
   example given in the documentation.  It sounds like this could improve
   performance via search indexes or similar shortcuts (see the sketch after
   this list).  Where can I find more details and/or more extensive examples?
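
For illustration, this is the kind of file I expected to find there, modeled
on the puppet_module example in [2] (the type id and field names here are
hypothetical stand-ins for our artifact type):

    {"types": [{
        "id": "our_artifact_type",
        "display_name": "Our Artifact",
        "description": "Proprietary ZIP artifact plus tracking metadata",
        "unit_key": ["name", "version"],
        "search_indexes": [["name"], ["version"]]
    }]}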



[1]
https://docs.pulpproject.org/dev-guide/newtypesupport/plugin/example.html

[2]
https://docs.pulpproject.org/dev-guide/newtypesupport/plugin/type_defs.html

Environment Details

   - Pulp 2.14.3 using Docker containers based on CentOS 7: one Apache/Pulp
   API container, one Qpid message broker container, one Mongo DB container,
   one Celery worker management container, one resource manager/task
   assignment container, and two Pulp worker containers.  All containers are
   running within a single Docker host, dedicated to only Pulp-related
   operations.  The diagram at
   http://docs.pulpproject.org/en/2.14/user-guide/scaling.html was used as
   a guide for this setup (see the sketch after this list).
   - Artifacts are company-proprietary (configured as a Pulp plugin), but
   essentially are a single ZIP file with attached metadata for tracking and
   management purposes.
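
Roughly, the container layout looks like this (a compose-style sketch from
memory; the image names are placeholders, not the actual images we build):

    # docker-compose.yml (illustrative only)
    version: "2"
    services:
      apache-pulp-api:  {image: our/pulp-api, ports: ["443:443"]}
      qpid:             {image: our/qpid}
      mongodb:          {image: our/mongodb}
      celerybeat:       {image: our/pulp-celerybeat}
      resource-manager: {image: our/pulp-resource-manager}
      worker1:          {image: our/pulp-worker}
      worker2:          {image: our/pulp-worker}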

