Re: [Freeipa-users] General status of my FreeIPA servers - is there a method for cleaning them?

On Fri, Apr 13, 2012 at 16:41, Rich Megginson <rmeggins redhat com> wrote:
> On 04/13/2012 02:30 PM, Dan Scott wrote:
>> On Fri, Apr 13, 2012 at 15:24, Rich Megginson<rmeggins redhat com>  wrote:
>>> It's not a problem until it's a problem :-)  I would go ahead and run
>> I cleaned up a load of these entries, but now I think I've broken the
>> replication between fileserver1 and 3:
>> fileserver1:/var/log/dirsrv/slapd-ECG-MIT-EDU/errors
>> [13/Apr/2012:15:57:56 -0400] NSMMReplicationPlugin - changelog program
>> - agmt="cn=meTofileserver3.ecg.mit.edu" (fileserver3:389): CSN
>> 4f5039960000002b0000 not found, we aren't as up to date, or we purged
>> [13/Apr/2012:15:57:56 -0400] NSMMReplicationPlugin -
>> agmt="cn=meTofileserver3.ecg.mit.edu" (fileserver3:389): Data required
>> to update replica has been purged. The replica must be reinitialized.
>> [13/Apr/2012:15:57:56 -0400] NSMMReplicationPlugin -
>> agmt="cn=meTofileserver3.ecg.mit.edu" (fileserver3:389): Incremental
>> update failed and requires administrator action
>> fileserver3:/var/log/dirsrv/slapd-ECG-MIT-EDU/errors
>> [13/Apr/2012:16:19:38 -0400] NSMMReplicationPlugin - changelog program
>> - agmt="cn=meTofileserver1.ecg.mit.edu" (fileserver1:389): CSN
>> 4f031e76001d000b0000 not found, we aren't as up to date, or we purged
>> Is it safe to run:
>> [root fileserver3 ~]# ipa-replica-manage re-initialize --from
>> fileserver1.ecg.mit.edu
>> I want to make sure I get it the correct way round!
> Are you sure that fileserver1 has the correct data?

Maybe? :) I've snapshotted both VMs and re-initialized fileserver3 from
fileserver1 - looking good so far.
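
As a side note for reading the "CSN ... not found" errors quoted above: a
389 Directory Server CSN is 20 hex digits, which split into an 8-digit
timestamp (seconds since the epoch), a 4-digit sequence number, a 4-digit
replica ID and a 4-digit sub-sequence number. Below is a minimal sketch for
seeing which replica ID a given CSN belongs to. decode_csn is a helper name
made up for this mail, not an IPA or 389 DS tool.

```shell
#!/bin/sh
# Split a 389 DS CSN (20 hex digits) into its four fields and print them
# in decimal. decode_csn is a hypothetical helper, not a shipped command.
decode_csn() {
    csn=$1
    # Reject anything that is not exactly 20 hex digits.
    case $csn in
        ""|*[!0-9a-fA-F]*)
            echo "not a hex CSN: $csn" >&2
            return 1
            ;;
    esac
    if [ "${#csn}" -ne 20 ]; then
        echo "CSN must be 20 hex digits: $csn" >&2
        return 1
    fi
    csn_ts=$(printf '%s' "$csn" | cut -c1-8)      # seconds since the epoch
    csn_seq=$(printf '%s' "$csn" | cut -c9-12)    # sequence number
    csn_rid=$(printf '%s' "$csn" | cut -c13-16)   # replica ID
    csn_sub=$(printf '%s' "$csn" | cut -c17-20)   # sub-sequence number
    echo "time=$((0x$csn_ts)) seq=$((0x$csn_seq)) rid=$((0x$csn_rid)) subseq=$((0x$csn_sub))"
}

# The two CSNs from the error logs quoted above.
decode_csn 4f5039960000002b0000
decode_csn 4f031e76001d000b0000
```

For the two CSNs in the logs this reports replica IDs 43 and 11. With GNU
date, "date -u -d @<time>" turns the time field into a readable date.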

I cleaned up all the "ruv_compare_ruv: RUV [changelog max RUV] does
not contain element" errors in the logs for each of fileservers 1, 2
and 3. The ldapsearch for
is still showing entries though. Is that OK?

Also, the PKI-CA error logs are showing RUV errors. Should I clean
those too? I guess that I would need to modify the commands for the CA
instance (-b o=ipaca -p 7389 -h localhost).
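
To make that concrete, here is a read-only sketch of looking at the RUV
entry on the CA instance before cleaning anything. It assumes the usual
389 DS convention that the RUV is stored in the tombstone entry with
nsuniqueid ffffffff-ffffffff-ffffffff-ffffffff under the replicated suffix.
The port and base DN are the ones mentioned above. show_ca_ruv and
ruv_replicas are names made up for this mail.

```shell
#!/bin/sh
# Sketch only: read the RUV entry of the CA suffix (search, no modification).
# Needs a live server; -W prompts for the Directory Manager password.
show_ca_ruv() {
    ldapsearch -x -LLL -H ldap://localhost:7389 \
        -D "cn=directory manager" -W \
        -b "o=ipaca" \
        "(&(nsuniqueid=ffffffff-ffffffff-ffffffff-ffffffff)(objectclass=nstombstone))" \
        nsds50ruv
}

# Hypothetical helper: print "replica-id url" for each RUV element read
# from LDIF on stdin. The {replicageneration} value is skipped.
ruv_replicas() {
    sed -n 's/^nsds50ruv: {replica \([0-9][0-9]*\) \(ldaps*:\/\/[^}]*\)}.*/\1 \2/p'
}

# Usage on a master:
#   show_ca_ruv | ruv_replicas
```

Any replica ID listed there that points at a host which is no longer a
replica would be a candidate for cleaning, the same as on the main instance.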

>>>>>> fileserver3's /var/log/dirsrv/slapd-PKI-IPA/errors contains lots of:
>>>>>> [13/Apr/2012:13:52:50 -0400] slapi_ldap_bind - Error: could not send
>>>>>> startTLS request: error -1 (Can't contact LDAP server) errno 107
>>>>>> (Transport endpoint is not connected)
>>>>> This is a real connection error - could be cert or hostname lookup
>>>>> related.
>>>> How do I find out if it's cert or hostname lookup? Which hostname?
>>>> Fileserver3 runs DNS, and it seems to be working fine.
>>> Try ldapsearch - on server3
>>> LDAPTLS_CACERTDIR=/etc/dirsrv/slapd-PKI-IPA ldapsearch -x -ZZ -H
>>> ldap://server2.fqdn -D "cn=directory manager" -W -s base -b ""
>>> If that works, check to make sure the replication agreement has the
>>> correct
>>> server2.fqdn
>>> If that doesn't work, use ldapsearch -d 1 -x ..... to get further
>>> debugging
>>> information.
>> The replication agreements (according to ipa-replica-manage) all have
>> the correct host names - I'm not sure what ldapsearch command to run
>> to check the replication agreements.
> ipa-replica-manage --list?  or something like that?

That's what I was using - they are all correct.
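
For the ldapsearch side of that question, a sketch assuming the standard
389 DS layout, where replication agreements are entries of objectclass
nsds5replicationagreement under cn=config. list_agreements and
agreement_hosts are names made up for this mail. The URL argument is an
assumption, so point it at whichever instance is being checked.

```shell
#!/bin/sh
# Sketch only: list the replication agreements stored under cn=config.
# Needs a live server; -W prompts for the Directory Manager password.
list_agreements() {
    ldapsearch -x -LLL -H "${1:-ldap://localhost:389}" \
        -D "cn=directory manager" -W \
        -b "cn=config" "(objectclass=nsds5replicationagreement)" \
        cn nsDS5ReplicaHost nsDS5ReplicaPort nsds5replicaLastUpdateStatus
}

# Hypothetical helper: print only the target host of each agreement,
# reading the LDIF produced above on stdin.
agreement_hosts() {
    awk -F': ' 'tolower($1) == "nsds5replicahost" { print $2 }'
}

# Usage on each master:
#   list_agreements | agreement_hosts
#   list_agreements ldap://localhost:7389 | agreement_hosts   # PKI-IPA instance
```

The nsDS5ReplicaHost values are what the server actually connects to, so
they are the names worth testing with the startTLS ldapsearch quoted above.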

>>>> The /var/log/dirsrv/slapd-ECG-MIT-EDU/errors is
>>>> now full of:
>>>> [13/Apr/2012:14:59:19 -0400] NSMMReplicationPlugin - conn=1 op=571
>>>> csn=4f70a9e5000100060000: Can't created glue entry
>>>> cn=fileserver4.ecg.mit.edu,cn=masters,cn=ipa,cn=etc,dc=ecg,dc=mit,dc=edu
>>>> uniqueid=6949d104-775b11e1-abce82a1-a45dd3c3, error 68
>>>> Should I delete the LDAP entry which is trying to replicate
>>>> fileserver2 with fileserver4?
>>> Yes.  And it may be due to the fact that the entry it is trying to delete
>>> has those tombstone children that have to be deleted too.
>> OK, I'll see how this goes, once the tombstones are gone.

The tombstones for ECG-MIT-EDU are gone now, but I'm still receiving this
message in the logs.

I think that's enough for this week - I'll look into it more next
week. Thanks for your help, have a good weekend.


