[Freeipa-devel] Move replication topology to the shared tree

Simo Sorce simo at redhat.com
Mon Jun 2 12:59:24 UTC 2014


First of all, very good summary, thanks a lot!
Replies in line.

On Mon, 2014-06-02 at 10:46 +0200, Ludwig Krispenz wrote:
> Ticket 4302 is a request for an enhancement: Move replication topology 
> to the shared tree
> 
> 
> There has been some discussion in comments in the ticket, but I'd like 
> to open the discussion to a wider audience to get an agreement on what 
> should be implemented, before writing a design spec.
> 
> The implementation requires a new IPA plugin for 389 DS and possibly 
> an enhancement of the 389 replication plugin (which depends on some 
> decisions below). In the following I will use the terms “topology 
> plugin” for the new plugin and “replication plugin” for the existing 
> 389 multimaster replication plugin.
> 
> 
> Let's start with the requirements: what should be achieved by this RFE?
> 
> In my opinion there are three different levels of features to implement 
> with this request
> 
> - providing all replication configuration information consistently 
> across all deployed servers, e.g. to easily visualize the replication 
> topology.
> 
> - Allowing sanity checks on the replication configuration, denying 
> modifications which would break the replication topology or issuing 
> warnings.
> 
> - Using the information in the shared tree to trigger changes to the 
> replication configuration on the corresponding servers; this means 
> allowing the replication configuration to be controlled completely via 
> modifications of entries in the shared tree
> 
> 
> The main questions are
> 
> 1] which information is needed in the shared tree (e.g. which 
> parameters in the repl config should be modifiable)
> 
> 2] how the information is organized and stored (layout of the repl 
> config information in the shared tree)
> 
> 3] how the information in the shared tree interacts with the 
> configuration in cn=config, and how the topology plugin and the 
> replication plugin interact
> 
> 
> ad 1] to verify the topology, e.g. connectivity, info about all 
> existing replication agreements is needed. The replication agreements 
> only contain info about the target and the parameters for the 
> connection to the target, but not about the origin. If the data has to 
> be evaluated on any server, information about the origin has to be 
> added, e.g. replicaID, serverID, ...
> 
> In addition, if the agreement config has to be changed based on the 
> shared tree, all required parameters need to be present, e.g. 
> replicatedAttributeList, strippedAttrs, replicationEnabled, ...
> 
> Replication agreements only provide information on connections where 
> replication is configured; if connectivity is to be checked, 
> independent info about all deployed servers/replicas is needed.
> 
> If the topology should be validated, do we need parameters defining 
> requirements, e.g. that each replica must be connected to 1, 2, 3, ... 
> others, or the type of topology (ring, mesh, star, ...)?

Ok, from a topology point of view you need the same elements needed to
define a graph, that is: nodes and segments.

We already have the list of masters in the cn=etc tree, so all we need
is to add segments (i.e. connection objects).

As for parameters, my idea is that we have a general set of parameters
(e.g. replicatedAttributeList, strippedAttrs) in the general topology
configuration; then we might override them on a per-connection basis if
needed (which should be very rare).

Also note we may need multiple topology sets, because we may have to
distinguish between the replication topology for the main shared tree
and the replication topology for other databases.

However we may want to be able to mark a topology for 'multiple' sets.
For example we may want to have by default the same topology both for
the main database and for the CA database.
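
To sketch what I mean (all DNs, attribute and objectclass names below
are made up for illustration, not a schema proposal): the defaults
would sit on the topology container, and a segment entry would only
carry its two endpoints plus any rare per-connection override.

# hypothetical container; one child set per replicated database
dn: cn=topology,cn=etc,dc=example,dc=com
objectClass: nsContainer
cn: topology
# defaults inherited by all segments (illustrative attribute names)
replicatedAttributeList: (objectclass=*) $ EXCLUDE memberof
strippedAttrs: modifiersName modifyTimestamp

# one segment (graph edge) in the set for the main shared tree;
# a parallel set could exist for the CA database
dn: cn=master1-master2,cn=realm,cn=topology,cn=etc,dc=example,dc=com
objectClass: topologySegment
cn: master1-master2
leftNode: master1.example.com
rightNode: master2.example.com
# per-segment override, expected to be very rare
strippedAttrs: modifiersName

A segment (or a whole set) that should apply to both the main database
and the CA database could then simply be marked for multiple sets
instead of being duplicated.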

> ad 2] the data required are available in the replicationAgreement (and 
> possibly replica) entries, but the question is whether there should be 
> a 1:1 relationship to entries in the shared tree or a condensed 
> representation, and whether there should be a server- or 
> connection-oriented view.

My answer is no to a 1:1 mapping: we need only one object per
connection, while config entries are per direction (and different ones
on different servers).
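
To illustrate, again with made-up names: one segment object in the
shared tree would stand for the pair of one-way agreements we have
today, each of which lives only in cn=config on its own server.

# shared tree: a single two-way object, replicated to all servers
dn: cn=master1-master2,cn=realm,cn=topology,cn=etc,dc=example,dc=com

# cn=config on master1: one direction
dn: cn=meTomaster2.example.com,cn=replica,cn=dc\3Dexample\2Cdc\3Dcom,
 cn=mapping tree,cn=config

# cn=config on master2: the other direction
dn: cn=meTomaster1.example.com,cn=replica,cn=dc\3Dexample\2Cdc\3Dcom,
 cn=mapping tree,cn=config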

> In my opinion a 1:1 relation is straightforward, easy to handle and 
> easy to extend (not all the data of a repl agreement needs to be 
> present, and other attributes are possible). The downside may be a 
> larger number of entries, but this is no problem for the directory 
> server or replication, and the utilities, e.g. to visualize a 
> topology, will handle it.

In general, we want a more abstract and easier-to-handle view for the
topology plugin.

> If the number of entries should be reduced, information on multiple 
> replication agreements would have to be stored in one entry, and the 
> problem arises of how to group the data belonging to one agreement. 
> LDAP does not provide a simple way to group attribute values in one 
> entry, so all the info related to one agreement (origin, target, 
> replicated attrs and other repl configuration info) could be stored in 
> a single attribute, which would make the attribute about as nicely 
> readable and manageable as ACIs.

We can easily use subtypes if really needed; this info is quite core to
the IPA code and will not generally be accessed by random clients.
However, as I indicated above, we really need one object per graph
segment, which represents a two-way connection, so we shouldn't have
issues (but sharing topologies between different databases may
reintroduce the problem :)

> If topology verification and connectivity checking are an integral 
> part of the feature, I think a connection-oriented view is not 
> sufficient, as it might be incomplete; so a server view is required, 
> and the server entry would then have the connection information as 
> subentries or as attributes.

We already have the list of servers, so we only need to add the list of
connections in the topology view. We may need to amend the server
objects to add additional data in some cases, for example to indicate
whether a server is fully installed or not (on creation the topology
plugin would complain that the server is disconnected until we create
the first segment, but that may actually be a good thing :-)
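
For instance (the attribute name here is invented for the example), the
existing master entry could grow a flag the topology plugin checks
before warning about a disconnected server:

dn: cn=master1.example.com,cn=masters,cn=ipa,cn=etc,dc=example,dc=com
changetype: modify
add: topologySetupComplete
topologySetupComplete: FALSE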

> Ad 3] The replication configuration is stored under cn=config and can 
> be modified either by LDAP operations or by editing dse.ldif. With the 
> topology plugin another source of configuration changes comes into play.
> 
> The first question is: which information has precedence? I think if 
> there is info in the shared tree it should be used, and the information 
> in cn=config should be updated accordingly. This also means that the 
> topology plugin needs to intercept all mods to the entries in 
> cn=config and have them ignored, handle all updates to the shared 
> tree, and trigger changes to the cn=config entries, which would then 
> trigger rebuilds of the in-memory replication objects.

Yes, I agree.
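
Concretely, a direct modification of an agreement under cn=config, like
the purely illustrative one below, would be intercepted by the topology
plugin and rejected (or reverted); the admin would change the
corresponding segment entry in the shared tree instead.

dn: cn=meTomaster2.example.com,cn=replica,cn=dc\3Dexample\2Cdc\3Dcom,
 cn=mapping tree,cn=config
changetype: modify
replace: nsDS5ReplicatedAttributeList
nsDS5ReplicatedAttributeList: (objectclass=*) $ EXCLUDE memberof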

> Next question: how to handle changes made directly in dse.ldif? If 
> everything should be done by the topology plugin, it would have to 
> verify and compare the info in cn=config and in the shared tree at 
> every startup of the directory server, which might be complicated by 
> the fact that the replication plugin might already be started and repl 
> agreements might be active before the topology plugin is started and 
> can do its work (plugin startup order and dependencies need to be 
> checked).

Why do we care which one starts first?
We can simply change replication agreements at any time, so the fact
that the replication topology (and therefore the agreements) can change
after startup should not be an issue.

> Next next question: should there be a “bootstrapping” of the config 
> information in the shared tree?
> 
> I think yes: the topology plugin could check at startup whether there 
> is a representation of the config info in the shared tree and, if not, 
> construct it, so that after deployment and enabling of the topology 
> plugin the information in the shared tree would be initialized.

Nope, the topology plugin should simply log a loud warning in the error
log and wait quietly until the topology is provided. This is needed to
allow us to handle migrations gracefully and carefully construct the
topology tree at install time w/o having the topology plugin interfere.
We will probably need a big 'enabled/disabled' flag on the topology tree
base object so we can construct the tree w/o waking up the plugin at
every change during the install phase.
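
Something along these lines (the flag name is invented), on the same
base object sketched earlier, so the installer can populate the tree
first and only flip the switch at the end:

dn: cn=topology,cn=etc,dc=example,dc=com
changetype: modify
replace: topologyManagementEnabled
topologyManagementEnabled: on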

> I think that not every part of the feature has to be handled in the 
> topology plugin; we could also ask for enhancements in the 389 
> replication plugin itself. There could be an extension to the replica 
> and replication agreement entries to reference an entry in the shared 
> tree. The replication plugin could check at startup whether these 
> entries contain replication configuration attributes and if so use 
> them, otherwise use the values in cn=config. The presence of the 
> reference would indicate to the topology plugin that initialization is 
> done.
> 
> In my opinion this would simplify the coordination at startup and 
> avoid unnecessary re-evaluations, and other deployments could benefit 
> from this new feature in the directory server (one could e.g. have one 
> entry for replication agreements containing the fractional replication 
> configuration, and it would be identical on all servers)

I really do not want to touch the replication plugin. It works just fine
as it is, and handling topology has nothing to do with handling the
low-level details of replication. To each its own.
If other deployments want to use the topology plugin, we can later move
it to the 389ds codebase and generalize it.

> So my proposal would contain the following components:
> 
> 1] Store the replication configuration in the shared tree in a 
> combination of server and connection views (I think we need both) and 
> map the replication configuration to these entries. I would prefer a 
> direct mapping (with a subset of the cn=config attributes and required 
> additions)

Nack, we already have the list of servers; we just need one object per
connection (graph segment), not one per agreement.

> 2] provide a topology plugin to do consistency checks and topology 
> verification, handle updates to trigger modifications in cn=config, 
> and intercept and reject direct mods to cn=config entries. At startup, 
> verify whether the shared tree objects are present, initialize them if 
> not, and apply them to cn=config if required

Ack

> 3] enhance the replication plugin to handle config information in the 
> shared tree. This would allow config changes to be handled 
> consistently, whether applied to the shared config, as cn=config mods, 
> or as dse.ldif changes. This feature might also be interesting for 
> other DS deployments

Nack, leave the replication plugin alone; the topology plugin should do
all the topology work. Dealing with interactions between two plugins
would tie them together and make things a lot more complicated than
necessary. It would also bind the development of the topology plugin to
two schedules (both 389ds and FreeIPA), also making the logistics of
developing the topology plugin more complicated.

Simo.

-- 
Simo Sorce * Red Hat, Inc * New York
