[Freeipa-users] ns-slapd high cpu usage
Andrew E. Bruno
aebruno2 at buffalo.edu
Mon Jul 13 16:36:20 UTC 2015
On Mon, Jul 13, 2015 at 05:29:13PM +0200, Ludwig Krispenz wrote:
>
> On 07/13/2015 05:05 PM, Andrew E. Bruno wrote:
> >On Mon, Jul 13, 2015 at 04:58:46PM +0200, Ludwig Krispenz wrote:
> >>can you get a pstack of the slapd process along with a top -H to find th
> >>ethread with high cpu usage
> >Attached is the full stacktrace of the running ns-slapd proccess. top -H
> >shows this thread (2879) with high cpu usage:
> >
> >2879 dirsrv 20 0 3819252 1.962g 11680 R 99.9 3.1 8822:10 ns-slapd
> this thread is a replication thread sending updates, what is strange is that
> the current csn_str is quite old (july, 7th), I can't tell which agreeement
> this thread is handling, but looks like it is heavily reading the changeglog
> and sending updates. anything changed recently in replication setup ?
Yes, we had one replica fail on (6/19) which we removed (not this one
showing high CPU load). Had to perform some manual cleanup of the ipa-ca
RUVs. Then we added the replica back in on 7/1. Since then, replication
appears to have been running normally between the 3 replicas. We've been
monitoring utilization since 7/1 and only recently seen this spike (past
24 hours or so).
On a side note, we get hit with this bug often:
https://www.redhat.com/archives/freeipa-users/2015-July/msg00018.html
(rouge sssd_be processing hammering a replica).
This causes high ns-slapd utilization on the replica and restarting sssd
on the client host immediately fixes the issue. However, in this
case, we're not seeing this behavior.
> >
> >
> >
> >
> >
> >>On 07/13/2015 04:46 PM, Andrew E. Bruno wrote:
> >>>We have 3 freeipa-replicas. Centos 7.1.1503, ipa-server 4.1.0-18, and
> >>>389-ds 1.3.3.1-16.
> >>>
> >>>Recently, the ns-slapd process on one of our replicas started showing higher
> >>>than normal CPU usage. ns-slapd is pegged at high CPU usage more or less
> >>>constantly.
> >>>
> >>>Seems very similar to this thread:
> >>>https://www.redhat.com/archives/freeipa-users/2015-February/msg00281.html
> >>>
> >>>There are a few errors showing in /var/log/dirsrv/slapd-[domain]/errors (not
> >>>sure if these are related):
> >>>
> >>>
> >>>[13/Jul/2015:02:56:49 -0400] retrocl-plugin - retrocl_postob: operation failure [1]
> >>>[13/Jul/2015:04:11:50 -0400] - dn2entry_ext: Failed to get id for changenumber=2277387,cn=changelog from entryrdn index (-30993)
> >>>[13/Jul/2015:04:11:50 -0400] - Operation error fetching changenumber=2277387,cn=changelog (null), error -30993.
> >>>[13/Jul/2015:04:11:50 -0400] DSRetroclPlugin - replog: an error occured while adding change number 2277387, dn = changenumber=2277387,cn=changelog: Operations error.
> >>>[13/Jul/2015:04:11:50 -0400] retrocl-plugin - retrocl_postob: operation failure [1]
> >>>[13/Jul/2015:07:41:49 -0400] - Operation error fetching Null DN (01de36ac-295411e5-b94db2ab-07afbca6), error -30993.
> >>>[13/Jul/2015:07:41:49 -0400] - dn2entry_ext: Failed to get id for changenumber=2281464,cn=changelog from entryrdn index (-30993)
> >>>[13/Jul/2015:07:41:49 -0400] - Operation error fetching changenumber=2281464,cn=changelog (null), error -30993.
> >>>[13/Jul/2015:07:41:49 -0400] DSRetroclPlugin - replog: an error occured while adding change number 2281464, dn = changenumber=2281464,cn=changelog: Operations error.
> >>>[13/Jul/2015:07:41:49 -0400] retrocl-plugin - retrocl_postob: operation failure [1]
> >>>
> >>>
> >>>access logs seem to be showing normal activity. Here's the number of open
> >>>connections:
> >>>
> >>># ls -al /proc/`cat /var/run/dirsrv/slapd-[domain].pid`/fd|grep socket|wc -l
> >>>62
> >>>
> >>>Note: the other two replicas have much higher open connections (>250) and low
> >>>cpu load avgs.
> >>>
> >>>Here's some output of logconv.pl from our most recent access log on the replica
> >>>with high cpu load:
> >>>
> >>>Start of Logs: 13/Jul/2015:04:49:18
> >>>End of Logs: 13/Jul/2015:10:06:11
> >>>
> >>>Processed Log Time: 5 Hours, 16 Minutes, 53 Seconds
> >>>
> >>>Restarts: 0
> >>>Total Connections: 2343
> >>> - LDAP Connections: 2120
> >>> - LDAPI Connections: 223
> >>> - LDAPS Connections: 0
> >>> - StartTLS Extended Ops: 45
> >>> Secure Protocol Versions:
> >>> - TLS1.2 128-bit AES - 45
> >>>
> >>>Peak Concurrent Connections: 22
> >>>Total Operations: 111865
> >>>Total Results: 111034
> >>>Overall Performance: 99.3%
> >>>
> >>>Searches: 95585 (5.03/sec) (301.64/min)
> >>>Modifications: 3369 (0.18/sec) (10.63/min)
> >>>Adds: 0 (0.00/sec) (0.00/min)
> >>>Deletes: 0 (0.00/sec) (0.00/min)
> >>>Mod RDNs: 0 (0.00/sec) (0.00/min)
> >>>Compares: 0 (0.00/sec) (0.00/min)
> >>>Binds: 7082 (0.37/sec) (22.35/min)
> >>>
> >>>Proxied Auth Operations: 0
> >>>Persistent Searches: 0
> >>>Internal Operations: 0
> >>>Entry Operations: 0
> >>>Extended Operations: 5317
> >>>Abandoned Requests: 416
> >>>Smart Referrals Received: 0
> >>>
> >>>VLV Operations: 96
> >>>VLV Unindexed Searches: 0
> >>>VLV Unindexed Components: 32
> >>>SORT Operations: 64
> >>>
> >>>Entire Search Base Queries: 0
> >>>Paged Searches: 3882
> >>>Unindexed Searches: 0
> >>>Unindexed Components: 5
> >>>
> >>>FDs Taken: 2566
> >>>FDs Returned: 2643
> >>>Highest FD Taken: 249
> >>>
> >>>Broken Pipes: 0
> >>>Connections Reset By Peer: 0
> >>>Resource Unavailable: 0
> >>>Max BER Size Exceeded: 0
> >>>
> >>>Binds: 7082
> >>>Unbinds: 2443
> >>> - LDAP v2 Binds: 0
> >>> - LDAP v3 Binds: 6859
> >>> - AUTOBINDs: 223
> >>> - SSL Client Binds: 0
> >>> - Failed SSL Client Binds: 0
> >>> - SASL Binds: 6814
> >>> GSSAPI - 6591
> >>> EXTERNAL - 223
> >>> - Directory Manager Binds: 0
> >>> - Anonymous Binds: 6591
> >>> - Other Binds: 491
> >>>
> >>>
> >>>
> >>>
> >>>strace timing on the ns-slapd process:
> >>>
> >>>
> >>>% time seconds usecs/call calls errors syscall
> >>>------ ----------- ----------- --------- --------- ----------------
> >>> 94.40 0.346659 5977 58 poll
> >>> 4.10 0.015057 15057 1 restart_syscall
> >>> 0.91 0.003353 57 59 59 getpeername
> >>> 0.49 0.001796 150 12 futex
> >>> 0.10 0.000364 73 5 read
> >>>------ ----------- ----------- --------- --------- ----------------
> >>>100.00 0.367229 135 59 total
> >>>
> >>>
> >>>top output (with threads 'H'):
> >>>
> >>> PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
> >>> 2879 dirsrv 20 0 3819252 1.962g 11680 R 99.9 3.1 8822:10 ns-slapd
> >>> 2895 dirsrv 20 0 3819252 1.962g 11680 R 34.1 3.1 115:10.62 ns-slapd
> >>> 2889 dirsrv 20 0 3819252 1.962g 11680 R 2.4 3.1 115:34.42 ns-slapd
> >>> 2917 dirsrv 20 0 3819252 1.962g 11680 S 2.4 3.1 115:26.87 ns-slapd
> >>> 2898 dirsrv 20 0 3819252 1.962g 11680 S 2.1 3.1 116:33.12 ns-slapd
> >>> 2904 dirsrv 20 0 3819252 1.962g 11680 S 2.1 3.1 115:08.56 ns-slapd
> >>> 2892 dirsrv 20 0 3819252 1.962g 11680 S 1.8 3.1 115:33.04 ns-slapd
> >>> 2897 dirsrv 20 0 3819252 1.962g 11680 R 1.8 3.1 114:54.28 ns-slapd
> >>> 2914 dirsrv 20 0 3819252 1.962g 11680 R 1.8 3.1 116:03.35 ns-slapd
> >>> 2907 dirsrv 20 0 3819252 1.962g 11680 S 1.5 3.1 115:42.25 ns-slapd
> >>> 2910 dirsrv 20 0 3819252 1.962g 11680 R 1.5 3.1 116:01.99 ns-slapd
> >>> 2870 dirsrv 20 0 3819252 1.962g 11680 R 1.2 3.1 611:30.22 ns-slapd
> >>> 2890 dirsrv 20 0 3819252 1.962g 11680 S 1.2 3.1 115:18.25 ns-slapd
> >>> 2891 dirsrv 20 0 3819252 1.962g 11680 S 1.2 3.1 115:22.24 ns-slapd
> >>> 2899 dirsrv 20 0 3819252 1.962g 11680 R 1.2 3.1 116:11.85 ns-slapd
> >>> 2888 dirsrv 20 0 3819252 1.962g 11680 S 0.9 3.1 114:51.19 ns-slapd
> >>> 2896 dirsrv 20 0 3819252 1.962g 11680 R 0.9 3.1 115:46.84 ns-slapd
> >>> 2915 dirsrv 20 0 3819252 1.962g 11680 S 0.9 3.1 115:49.34 ns-slapd
> >>> 2887 dirsrv 20 0 3819252 1.962g 11680 R 0.6 3.1 115:49.85 ns-slapd
> >>> 2894 dirsrv 20 0 3819252 1.962g 11680 S 0.6 3.1 115:58.02 ns-slapd
> >>> 2911 dirsrv 20 0 3819252 1.962g 11680 S 0.6 3.1 116:22.84 ns-slapd
> >>> 2913 dirsrv 20 0 3819252 1.962g 11680 S 0.6 3.1 114:43.56 ns-slapd
> >>>
> >>>
> >>>ns-slapd stays pegged >99%. Trying to figure out what ns-slapd is doing? Any
> >>>pointers on where else to look?
> >>>
> >>>Thanks in advance.
> >>>
> >>>--Andrew
> >>>
> >>--
> >>Manage your subscription for the Freeipa-users mailing list:
> >>https://www.redhat.com/mailman/listinfo/freeipa-users
> >>Go to http://freeipa.org for more info on the project
> >>
> >>
>
>
More information about the Freeipa-users
mailing list