[Freeipa-users] 2.20 dirsrv memory usage

Stephen Ingram sbingram at gmail.com
Thu Jul 19 06:29:34 UTC 2012


On Wed, Jul 18, 2012 at 2:26 PM, Dmitri Pal <dpal at redhat.com> wrote:
> On 07/18/2012 05:09 PM, Stephen Ingram wrote:
>> On Wed, Jul 18, 2012 at 1:52 PM, Dmitri Pal <dpal at redhat.com> wrote:
>>> On 07/18/2012 04:27 PM, Stephen Ingram wrote:
>>>> On Wed, Jul 18, 2012 at 1:06 PM, Dmitri Pal <dpal at redhat.com> wrote:
>>>>> On 07/18/2012 03:45 PM, Stephen Ingram wrote:
>>>>>> On Wed, Jul 18, 2012 at 12:28 PM, John Dennis <jdennis at redhat.com> wrote:
>>>>>>> On 07/18/2012 02:59 PM, Stephen Ingram wrote:
>>>>>>>> On Wed, Jul 18, 2012 at 6:45 AM, Petr Vobornik <pvoborni at redhat.com>
>>>>>>>> wrote:
>>>>>>>>> On 07/17/2012 11:43 PM, Stephen Ingram wrote:
>>>>>>>>>
>>>>>>>>> 8><------
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>>>> I'm beginning to think this is just the Web UI itself instead of 389
>>>>>>>>>>>> although it is really difficult to tell. I've poured over the debug
>>>>>>>>>>>> logs and didn't see anything that caused me concern.
>>>>>>>>>>>>
>>>>>>>>>>>> It's certainly usable, but I just got really spoiled by the
>>>>>>>>>>>> unbelievable quickness of 2.1.3. When your release notes indicate it
>>>>>>>>>>>> should be faster, what are you comparing it to? Maybe this only
>>>>>>>>>>>> happens with upgraded instances and not fresh installs.
>>>>>>>>>>>
>>>>>>>>>>> It is always possible something didn't get upgraded properly but I've
>>>>>>>>>>> done
>>>>>>>>>>> 2.1.3 -> 2.2.0 upgrades and haven't seen this. When we say something is
>>>>>>>>>>> faster we're always referring to the previous version (or versions).
>>>>>>>>>> Maybe I was just lucky with 2.1.3. On a first load it might take some
>>>>>>>>>> time to load the "frame" as I call it. But the data would load almost
>>>>>>>>>> instantaneously from there (certainly no more than 1 s) as you moved
>>>>>>>>>> from page to page. Here, even if I return to the same page, the system
>>>>>>>>>> acts as if the data is begin fetched for the very first time as it is
>>>>>>>>>> no faster than the first load. Maybe that is significant to the
>>>>>>>>>> problem?
>>>>>>>>> I think the culprit is Web UI paging capabilities introduced in 2.2. With
>>>>>>>>> lot of users, responses might grow in size. You can check their size and
>>>>>>>>> duration in browser developers tools. I suggest chrome/chromium - press
>>>>>>>>> F12
>>>>>>>>> and choose 'network' tab.
>>>>>>>>>
>>>>>>>>> This new feature can't be disabled in configuration. To test if the
>>>>>>>>> slowdown
>>>>>>>>> is done by paging you can (at own risk) replace line
>>>>>>>>> /usr/share/ipa/ui/facet.js:538
>>>>>>>>>
>>>>>>>>> that.pagination = spec.pagination === undefined ? true : spec.pagination;
>>>>>>>>>
>>>>>>>>> with:
>>>>>>>>>
>>>>>>>>> that.pagination = false;
>>>>>>>>>
>>>>>>>>> Note: It will break some other parts of the UI - so for testing only.
>>>>>>>> I've made the substitution in the code (was line 507 for me-do I have
>>>>>>>> a different version?). Looking at the time chart in Chrome I see that
>>>>>>>> the bulk of the time is for /ipa/session waiting. Would "waiting" mean
>>>>>>>> waiting for the directory server or memcached?
>>>>>>> Actually neither, it means waiting for a response from the web server
>>>>>>> (technically it's making an RPC call via HTTP Ajax). The RPC call needs to
>>>>>>> go through the web server, memcached, and typically will invoke one or more
>>>>>>> directory server queries, and run a bunch of Python to massage everything
>>>>>>> before the RPC returns with the result.
>>>>>>>
>>>>>>> It doesn't look like you've got much difference in times between with
>>>>>>> pagination on and pagination off. I don't know the pagination code but I
>>>>>>> suspect it's run after the RPC call returns so the RPC timing is not telling
>>>>>>> us much with respect to that.
>>>>>>>
>>>>>>> Waiting for up to 3 seconds for an RPC call does seem on the high side. Do
>>>>>>> you have a lot of LDAP data?
>>>>>> No. 49 users, 17 hosts, 25 services, 6 DNS zones, only 1 of which has
>>>>>> any significant amount of hosts in it.
>>>>>>
>>>>>>> But really, unless we get timing results for each component we're grasping
>>>>>>> at straws :-(
>>>>>> Understood.
>>>>>>
>>>>>> Steve
>>>>> Do you have a replica and does this replica behave the same?
>>>> No replica yet. I wanted to get the memory leak issue solved first.
>>>> All I have to compare to is the old 2.1.3 before. This one is much
>>>> slower. I just can't seem to figure out what's wrong. The upgrade
>>>> seemed to complete successfully and there were no errors in the log.
>>>> The only things I've found thus far (earlier in this thread) are the
>>>> unindexed entries (all hosts entries) that Rich seemed to think might
>>>> be slowing things up. As the slowness is on every page, I wouldn't
>>>> think that would be the problem.
>>>>
>>>> I wouldn't have said as much about this were it not for the promised
>>>> faster speed mentioned in the release notes. It's comparable to the
>>>> old 2.0 release candidates so I thought it might have been due to the
>>>> complexity of the feature additions.
>>>>
>>>> Steve
>>> Is the time correct on this system?
>> Yes. HW clock is GMT and localtime is Pacific Daylight Time and it is
>> accurate to the minute.
>>
>> Steve
> Can you check the Kerberos logs whether the ldap service ticket is
> acquired on every http request via browser?

That looks to be the case. I tailed the log while accessing each page and got:

Jul 18 22:43:15 ipa1.example.com krb5kdc[1243](info): TGS_REQ (4
etypes {18 17 16 23}) xxx.xx.xx.xx: ISSUE: authtime 1342640481, etypes
{rep=18 tkt=18 ses=18}, HTTP/ipa1.example.com at EXAMPLE.COM for
ldap/ipa1.example.com at EXAMPLE.COM

Jul 18 22:43:15 ipa1.example.com krb5kdc[1243](info): ...
CONSTRAINED-DELEGATION s4u-client=admin at EXAMPLE.COM

Jul 18 22:43:16 ipa1.example.com krb5kdc[1243](info): TGS_REQ (4
etypes {18 17 16 23}) xxx.xx.xx.xx: ISSUE: authtime 1342640481, etypes
{rep=18 tkt=18 ses=18}, HTTP/ipa1.example.com at EXAMPLE.COM for
ldap/ipa1.example.com at EXAMPLE.COM

Jul 18 22:43:16 ipa1.example.com krb5kdc[1243](info): ...
CONSTRAINED-DELEGATION s4u-client=admin at EXAMPLE.COM

Interestingly, this even came up when doing something as simple as
moving to the next page of users. I'm guessing this isn't correct?

> Also http logs might shed some light on what is going on a request. May
> be this would give us some hint.

Everything there looks normal. It's accessing /ipa/session/json every
time and the error log shows what appear to be queries to the
directory server. I compared this to my older v 2.1.3 logs and it
looks the same except for the addition of the session.

> What you are experiencing is wrong and alarming. This is why I want to
> drill down to the root of the problem.

I agree. I was all excited to see the faster IPA and instead I'm stuck
with one that chugs along. Perhaps it is several things together that
are adding up to more time. Petr pointed me in a great direction using
the developer panel in Chrome to see the various pieces that are
taking time. This is illuminating each component that loads. For
example, a simple click to a new page of users generates two hits to
/ipa/session:

json /ipa/session POST 200 Success application/json jquery.js:7365
Script 9.30KB 8.78KB 1.47s 1.46s 1.46s 6ms
json /ipa/session POST 200 Success application/json jquery.js:7365
Script 74.65KB 73.41KB 1.80s 1.61s 1.61s

As you can see just flipping the page took 3.27s for something that
used to be almost instantaneous. As each request waits for a response,
I'm not sure why this shouldn't be one request. Perhaps, with the
kerberos requests, the directory server query and the Web UI jquery
stuff, it requires two requests and things are simply adding up.

Steve




More information about the Freeipa-users mailing list