[Freeipa-users] FreeIPA 3.3 performance issues with many hosts

Dominik Korittki d.korittki at mittwald.de
Wed Oct 7 15:03:45 UTC 2015



On 07.10.2015 at 15:25, thierry bordaz wrote:
> On 10/07/2015 11:19 AM, Martin Kosek wrote:
>> On 10/05/2015 02:13 PM, Dominik Korittki wrote:
>>>
>>> On 01.10.2015 at 21:52, Rob Crittenden wrote:
>>>> Dominik Korittki wrote:
>>>>> Hello folks,
>>>>>
>>>>> I am running two FreeIPA servers with around 100 users and around
>>>>> 15,000 hosts, which users log in to via ssh. The FreeIPA servers
>>>>> (which run CentOS 7.0) ran well for a while, but as more and more
>>>>> hosts were migrated to become FreeIPA hosts, things started to get
>>>>> slow and unstable.
>>>>>
>>>>> For example, it's hard to maintain hostgroups that have more than
>>>>> 1,000 hosts. The ipa host-* commands get slower as the hostgroup
>>>>> grows. Is this normal?
>>>> You mean the ipa hostgroup-* commands? Whenever the entry is displayed
>>>> (show and add) it needs to dereference all members so yes, it is
>>>> understandable that it gets somewhat slower with more members. How slow
>>>> are we talking about?
>>>>
>>>>> We also experience random dirsrv segfaults. Here's a dmesg line from
>>>>> the latest:
>>>>>
>>>>> [690787.647261] traps: ns-slapd[5217] general protection ip:7f8d6b6d6bc1 sp:7f8d3aff2a88 error:0 in libc-2.17.so[7f8d6b650000+1b6000]
>>>> You probably want to start here:
>>>> http://www.port389.org/docs/389ds/FAQ/faq.html#debugging-crashes
>>> A stacktrace from the latest crash is attached to this email. After
>>> restarting the service, this is what I get in
>>> /var/log/dirsrv/slapd-INTERNAL/errors (hostname is ipa01.internal):
>> Ludwig or Thierry, can you please take a look at the stack and file
>> 389-DS ticket if appropriate?
>
> Hello Dominik,
>
> DS is crashing during a BIND, and from the argument values we can guess
> it was due to a heap corruption that corrupted its operation pblock.
> This bind operation was likely a victim of the heap corruption rather
> than responsible for it.
>
> Using valgrind is the best way to track down such a problem, but as you
> already suffer from bad performance I doubt it would be acceptable.
> How frequently does it crash? Did you identify some kind of test case?

At first the crashes happened on a daily basis. Simply restarting the
dirsrv daemon resolved the issue for another day, but later on the daemon
did not survive more than 15 minutes most of the time. There were
exceptions though: sometimes the daemon ran for several hours before it
crashed.
I did not really identify a test case. However, I suspected it could have
something to do with replication, as I have seen replication-related
errors in the dirsrv error log (mentioned in an earlier mail in this thread).
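
To make those replication-related errors easier to spot, here is a minimal
sketch of a filter for the errors log. It assumes the log has one entry per
line in the format quoted further down in this mail; the function name and
the list of phrases are my own illustration, taken from that excerpt, and
are certainly not exhaustive.

```python
import re

# Matches NSMMReplicationPlugin entries in the errors-log format quoted in
# this thread, e.g.
# [05/Oct/2015:13:51:30 +0200] NSMMReplicationPlugin - agmt="cn=..." (ipa02:389): message
LINE_RE = re.compile(
    r'^\[(?P<ts>[^\]]+)\]\s+NSMMReplicationPlugin\s+-\s+'
    r'(?:changelog program\s+-\s+)?'
    r'agmt="(?P<agmt>[^"]+)"\s+\((?P<peer>[^)]+)\):\s+(?P<msg>.*)$'
)

# Phrases copied from the log excerpt in this thread (assumption: these are
# the interesting ones; other failure messages will not be caught).
PROBLEM_PHRASES = (
    "has been purged",
    "must be reinitialized",
    "requires administrator action",
    "auth failed",
)


def replication_problems(lines):
    """Yield (timestamp, agreement, message) for suspicious replication entries."""
    for line in lines:
        m = LINE_RE.match(line.strip())
        if m and any(p in m.group("msg") for p in PROBLEM_PHRASES):
            yield m.group("ts"), m.group("agmt"), m.group("msg")


# Two entries from the excerpt: the first needs attention, the second does not.
SAMPLE = [
    '[05/Oct/2015:13:51:30 +0200] NSMMReplicationPlugin - '
    'agmt="cn=masterAgreement1-ipa02.internal-pki-tomcat" (ipa02:389): '
    'Incremental update failed and requires administrator action',
    '[05/Oct/2015:13:51:33 +0200] NSMMReplicationPlugin - '
    'agmt="cn=meToipa02.internal" (ipa02:389): '
    'Replication bind with GSSAPI auth resumed',
]

for ts, agmt, msg in replication_problems(SAMPLE):
    print(ts, agmt, msg, sep=" | ")
```

For a real run, the lines of /var/log/dirsrv/slapd-INTERNAL/errors would be
passed in instead of SAMPLE.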

So I did the following:
ipa01 has a replication agreement with ipa02. ipa01 was the one with the
segfaults. I removed ipa01 from the replication agreement
(ipa-replica-manage del), ran ipa-server-install --uninstall on ipa01
and re-created ipa01 as a replica of ipa02. Since then I have not
experienced any crashes (so far).
Instead I'm having trouble rebuilding a clean replication agreement (old
RUV data is still in the database), but that's another story that I will
eventually post to the mailing list as a new topic.

As for valgrind: I have never used it before. Is there a handy explanation
of how to use it in combination with 389-ds? If I still experience those
crashes and manage to get it working, I could try it out.
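
As a lighter first step than valgrind, the dmesg line quoted earlier in
this thread can at least be turned into an offset inside libc. The sketch
below assumes the usual "traps:" line layout and that the mapping printed
in brackets starts at the library's load address; the function name is my
own. The resulting offset could then be looked up with a tool such as
addr2line, provided matching debuginfo is installed.

```python
import re

# Layout of a kernel "traps:" line as seen in this thread:
# traps: COMM[PID] REASON ip:HEX sp:HEX error:HEX in LIB[BASE+SIZE]
TRAP_RE = re.compile(
    r'traps: (?P<comm>\S+)\[(?P<pid>\d+)\] (?P<what>.+?) '
    r'ip:(?P<ip>[0-9a-f]+) sp:(?P<sp>[0-9a-f]+) error:(?P<err>[0-9a-f]+)'
    r' in (?P<lib>\S+)\[(?P<base>[0-9a-f]+)\+(?P<size>[0-9a-f]+)\]'
)


def fault_offset(line):
    """Return (process name, library, offset of the faulting ip in the library)."""
    m = TRAP_RE.search(line)
    if m is None:
        raise ValueError("not a recognised traps: line")
    ip, base, size = (int(m.group(k), 16) for k in ("ip", "base", "size"))
    # Sanity check: the instruction pointer must lie inside the mapping.
    if not base <= ip < base + size:
        raise ValueError("ip lies outside the reported mapping")
    return m.group("comm"), m.group("lib"), ip - base


line = ("[690787.647261] traps: ns-slapd[5217] general protection "
        "ip:7f8d6b6d6bc1 sp:7f8d3aff2a88 error:0 "
        "in libc-2.17.so[7f8d6b650000+1b6000]")
comm, lib, offset = fault_offset(line)
# prints: ns-slapd faulted in libc-2.17.so at offset 0x86bc1
print("%s faulted in %s at offset 0x%x" % (comm, lib, offset))
```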


Kind regards,
Dominik Korittki

>
> thanks
> thierry
>>> [05/Oct/2015:13:51:30 +0200] - slapd started.  Listening on All Interfaces port 389 for LDAP requests
>>> [05/Oct/2015:13:51:30 +0200] - Listening on All Interfaces port 636 for LDAPS requests
>>> [05/Oct/2015:13:51:30 +0200] - Listening on /var/run/slapd-INTERNAL.socket for LDAPI requests
>>> [05/Oct/2015:13:51:30 +0200] slapd_ldap_sasl_interactive_bind - Error: could not perform interactive bind for id [] mech [GSSAPI]: LDAP error -2 (Local error) (SASL(-1): generic failure: GSSAPI Error: Unspecified GSS failure. Minor code may provide more information (No Kerberos credentials available)) errno 0 (Success)
>>> [05/Oct/2015:13:51:30 +0200] slapi_ldap_bind - Error: could not perform interactive bind for id [] authentication mechanism [GSSAPI]: error -2 (Local error)
>>> [05/Oct/2015:13:51:30 +0200] NSMMReplicationPlugin - agmt="cn=meToipa02.internal" (ipa02:389): Replication bind with GSSAPI auth failed: LDAP error -2 (Local error) (SASL(-1): generic failure: GSSAPI Error: Unspecified GSS failure.  Minor code may provide more information (No Kerberos credentials available))
>>> [05/Oct/2015:13:51:30 +0200] NSMMReplicationPlugin - changelog program - agmt="cn=masterAgreement1-ipa02.internal-pki-tomcat" (ipa02:389): CSN 54bea480000000600000 not found, we aren't as up to date, or we purged
>>> [05/Oct/2015:13:51:30 +0200] NSMMReplicationPlugin - agmt="cn=masterAgreement1-ipa02.internal-pki-tomcat" (ipa02:389): Data required to update replica has been purged. The replica must be reinitialized.
>>> [05/Oct/2015:13:51:30 +0200] NSMMReplicationPlugin - agmt="cn=masterAgreement1-ipa02.internal-pki-tomcat" (ipa02:389): Incremental update failed and requires administrator action
>>> [05/Oct/2015:13:51:33 +0200] NSMMReplicationPlugin - agmt="cn=meToipa02.internal" (ipa02:389): Replication bind with GSSAPI auth resumed
>>>
>>>
>>> These lines have been present since I replayed an LDIF dump from ipa02 to
>>> ipa01, but I didn't think they were related to the segfault problem
>>> (which is why I said there were no related problems in the logfile).
>>>
>>> But I am starting to believe that these errors could be related to each
>>> other.
>>>
>>>
>>> Kind regards,
>>> Dominik Korittki
>>>
>>>
>>>>
>>>>> Nothing in /var/log/dirsrv/slapd-INTERNAL/errors, which relates to the
>>>>> problem.
>>> Not sure about that anymore.
>>>
>>>>> I'm thinking about migrating to latest CentOS 7 FreeIPA 4, but does
>>>>> that solve my problems?
>>>>>
>>>>> FreeIPA server version is 3.3.3-28.el7.centos
>>>>> 389-ds-base.x86_64 is 1.3.1.6-26.el7_0
>>>>>
>>>>>
>>>>>
>>>>> Kind regards,
>>>>> Dominik Korittki
>>>>>
>>>>
>>>>
>>>>
>>>
>
>
>
>

-- 
Mittwald CM Service GmbH & Co KG
Königsberger Straße 6
32339 Espelkamp

Tel: +49(0)5772-293-100
Fax: +49(0)5772-293-333

Geschäftsführer: Robert Meyer
HRA 6640, AG Bad Oeynhausen
Komplementaerin: Robert Meyer Verwaltungs GmbH, HRB 13260, AG Bad Oeynhausen



