S 4.389 Partitioning and replication in OpenLDAP

Initiation responsibility: Head of IT, IT Security Officer

Implementation responsibility: Administrator

Distributing subtrees of a directory service to different servers (partitioning) is an effective way to achieve a higher level of availability by spreading the load. The servers need to exchange information on all changes made to the data by means of replication so that the copies of the directories are up to date at all times. The appropriate replication mode must be selected depending on the network connections and the availability requirements.

This safeguard describes the possible implementation of these concepts using OpenLDAP. For the general planning of partitioning and replication see safeguard S 2.409 Planning of partitioning and replication in the directory service.

Partitioning

Partitioning a directory service in OpenLDAP is straightforward to configure. If a part of the directory is outsourced, or if a server is to know which other server stores specific subtrees, the corresponding suffix must be created as an object of the "referral" object class in the global configuration of this server. The address of the server holding the outsourced subtree is assigned to the "ref" attribute. The server then answers client operations relating to this part of the directory service by referring the clients to this address. This assignment is referred to as "Subordinate Knowledge Information": the server "knows" which part of the directory tree can be found on which server. If the server itself is to include outsourced subtrees when answering search requests, the respective database must be marked with the "subordinate" directive (or with "olcSubordinate" in the configuration of the database). This procedure is referred to as gluing.

A subordinate server, in turn, does not hold exact information on which other servers store the subtrees above its own subtree or at the same level as its subtree. Whenever an operation does not match a suffix of the server, the server answers with a global "referral", i.e. the requesting client is referred to a server that might be able to give the response. For partitions, the superior directory service is entered here. In this context, the referral is also called "Superior Knowledge Information", although the directive can also be used irrespective of any partitioning. The directory services addressed in the referrals do not have to run OpenLDAP.
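The decision a server makes between answering locally, returning a subordinate referral, and returning the global referral can be sketched as a toy model. The following Python sketch is not OpenLDAP code; the class name, the function names, and all host names are hypothetical and chosen only for illustration, and DNs are compared as plain strings.

```python
from dataclasses import dataclass, field


def is_under(dn: str, base: str) -> bool:
    """True if dn equals base or lies below it (plain string comparison)."""
    return dn == base or dn.endswith("," + base)


@dataclass
class ToyServer:
    """Toy model of one directory server (hypothetical, for illustration)."""
    suffix: str                                            # subtree held locally
    subordinate_refs: dict = field(default_factory=dict)   # subtree DN -> URL
    superior_ref: str = ""                                 # global referral

    def resolve(self, dn: str) -> str:
        """Return 'local' or the URL the client would be referred to."""
        dn = dn.lower()
        # Subordinate knowledge: the most specific outsourced subtree wins.
        for subtree in sorted(self.subordinate_refs, key=len, reverse=True):
            if is_under(dn, subtree.lower()):
                return self.subordinate_refs[subtree]
        if is_under(dn, self.suffix.lower()):
            return "local"
        # Superior knowledge: the DN matches no suffix of this server.
        return self.superior_ref


server = ToyServer(
    suffix="dc=example,dc=com",
    subordinate_refs={"ou=people,dc=example,dc=com": "ldap://b.example.com"},
    superior_ref="ldap://root.example.net",
)
print(server.resolve("uid=jd,ou=people,dc=example,dc=com"))
print(server.resolve("ou=groups,dc=example,dc=com"))
print(server.resolve("dc=other,dc=org"))
```

In this model, a request below the outsourced subtree yields the subordinate referral, a request elsewhere below the local suffix is answered locally, and any other request yields the global referral.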

Using the "chain" (chaining) overlay, the server itself can also follow referrals. The client then does not notice any partitioning and always receives a final response from the server it originally contacted. This works regardless of the capabilities of the clients, which might not be able to process referrals themselves.

Replication

In OpenLDAP, replication is implemented using the "LDAP Sync Replication Engine" (syncrepl) mechanism. The mechanism is adapted to BerkeleyDB and is only supported by the "back-bdb" and "back-hdb" backends. This means that OpenLDAP cannot simply be used as an agent to replicate directory services for which the slapd server is only a proxy.

Before the "syncrepl" mechanism was developed, the "stand-alone LDAP update replication daemon" (slurpd) was used for replication. This program ran as a service, like the slapd server, and maintained the copies of the directory service contents. The service did not function reliably and was officially removed from OpenLDAP with the introduction of version 2.4. Information on "slurpd" in outdated documentation should be regarded as historical. Under no circumstances may "slurpd" be used.

Master and slave, provider and consumer

Traditionally, the servers involved in replication are referred to as master and slave. The master is the actual directory service; the directory service contents can be accessed with write privileges on this server. The slave only holds a copy of all information of the directory service and grants only read access to this copy. Since version 2.3, this strict separation no longer applies in OpenLDAP. For replication in OpenLDAP, a consumer service obtains data from a provider service. It is important to understand that a consumer acts as a client towards the provider, although the consumer itself offers its replica as a server to other clients. The server security settings of the consumer do not apply to the connection to the provider. Instead, the client configuration applies; this configuration must be carried out carefully on a consumer, although it is normally not required on servers. In particular, it must be taken into consideration that the consumer has to perform a "bind" on the provider and that any access restrictions and search limits of the user account used for this purpose might impair the replication.

refreshOnly and refreshAndPersist

Replication can be run either in pull mode or in push mode. In pull mode, referred to as "refreshOnly" in OpenLDAP, the consumer asks the provider for changes at defined intervals. For this purpose, the consumer reports how current its stored data is in the form of "Sync Cookies". Based on this information, a search covering all changes since the time reported by the consumer is started on the provider. In this case, the provider does not "know" the consumer; it only responds to search requests. To ensure that these searches return the correct results, it is particularly important that the clocks of provider and consumer run as synchronously as possible (see S 4.227 Use of a local NTP server for time synchronisation and S 4.348 Time synchronisation in virtual IT systems). In push mode, referred to as "refreshAndPersist" in OpenLDAP, the connection between provider and consumer remains open and the provider continuously sends all changes to the consumer. As a rule of thumb when selecting the replication method, "refreshOnly" is the more useful the larger the amount of data to be replicated is, and "refreshAndPersist" should be preferred the more important it is that changes reach the consumer promptly.
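The pull mode described above can be illustrated with a small toy model. This Python sketch is not how syncrepl is implemented: the class and method names are hypothetical, a simple change counter stands in for the sync cookie (which in the text above is based on time), and deletions are ignored.

```python
class ToyProvider:
    """Holds entries together with the change number of their last write."""

    def __init__(self):
        self.entries = {}   # dn -> (attributes, change number)
        self.counter = 0    # stands in for the state reported in a cookie

    def write(self, dn, attrs):
        self.counter += 1
        self.entries[dn] = (dict(attrs), self.counter)

    def search_since(self, cookie):
        """Answer one poll: all entries changed after the given cookie."""
        changed = {dn: attrs
                   for dn, (attrs, number) in self.entries.items()
                   if number > cookie}
        return changed, self.counter


class ToyConsumer:
    """Polls the provider; the provider keeps no state about the consumer."""

    def __init__(self):
        self.replica = {}
        self.cookie = 0

    def refresh(self, provider):
        changed, new_cookie = provider.search_since(self.cookie)
        self.replica.update(changed)
        self.cookie = new_cookie
        return len(changed)   # number of entries transferred in this poll


provider, consumer = ToyProvider(), ToyConsumer()
provider.write("cn=a,dc=example,dc=com", {"mail": "a@example.com"})
provider.write("cn=b,dc=example,dc=com", {"mail": "b@example.com"})
print(consumer.refresh(provider))   # first poll transfers both entries
print(consumer.refresh(provider))   # nothing changed since the last poll
```

Between two polls the replica can be stale, which is the trade-off against a persistent connection in "refreshAndPersist".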

Configuration of a replication

To configure the replication of a directory service using OpenLDAP, several steps are required:

Delta replication

In general, the provider sends all attributes of a changed entry, whether as a search result or as part of replication, even if only one attribute of the entry was changed. In connection with the "accesslog" overlay (see also S 4.407 Logging when using OpenLDAP), it is also possible to record the changes to attributes in detail and to subsequently transfer only these changes using the "syncrepl" mechanism. This requires a more complex configuration. For frequent changes to small attributes of relatively large objects, this option should be tested; for only a few objects or small quantities of data, delta replication is not required.
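The difference between transferring a whole entry and transferring only the changes can be sketched as follows. This is a toy model in Python under stated assumptions, not the "accesslog" or "syncrepl" implementation: entries are plain dictionaries, the function names are hypothetical, and a removed attribute is represented by None.

```python
def full_payload(new_entry: dict) -> dict:
    """Conventional replication: every attribute of the changed entry."""
    return dict(new_entry)


def delta_payload(old_entry: dict, new_entry: dict) -> dict:
    """Delta replication: only added, changed, or removed attributes."""
    delta = {k: v for k, v in new_entry.items() if old_entry.get(k) != v}
    delta.update({k: None for k in old_entry if k not in new_entry})
    return delta


def apply_delta(entry: dict, delta: dict) -> dict:
    """Apply a delta on the consumer side and return the updated entry."""
    result = dict(entry)
    for key, value in delta.items():
        if value is None:
            result.pop(key, None)
        else:
            result[key] = value
    return result


# A large entry in which only one small attribute changes.
old = {"cn": "Jane Doe", "jpegPhoto": "x" * 10000, "loginCount": "41"}
new = dict(old, loginCount="42")

print(sum(len(v) for v in full_payload(new).values()))        # whole entry
print(sum(len(v) for v in delta_payload(old, new).values()))  # change only
```

The example mirrors the case named in the text: a frequently changing small attribute on a relatively large object, where the delta is far smaller than the full entry.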

Multi-master and mirror-mode operation

It is also possible to configure a multi-master operation. In a multi-master operation, there is more than one server that can be accessed with write privileges, and the masters are both providers and consumers for each other. The purpose of this operating mode is that write access to the directory service remains possible in the event of a server failure without first having to adapt the configuration (as would be necessary for a slave or pure consumer). This operating mode is controversial and is considered inappropriate by some members of the OpenLDAP team, since it can threaten the consistency of a directory. This is the case when competing changes are made on two masters at the same time. A multi-master operation is not required for an OpenLDAP installation in an information system with normal protection requirements. If there are high or very high requirements regarding availability, a multi-master configuration can be tested. As a rule of thumb, the more important continuous availability is, the more useful a multi-master operation is; the more important the integrity of the data at all times is, the less useful it is.
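Why competing changes threaten consistency can be shown with a deliberately simplified model. The Python sketch below assumes that conflicts are resolved by keeping the write with the newer timestamp and, for equal timestamps, the higher server ID; this rule, the function name, and the values are assumptions made for illustration and are not taken from the OpenLDAP documentation.

```python
def merge(local: tuple, remote: tuple) -> tuple:
    """Keep the (timestamp, server_id, value) write that sorts last.

    Tuples compare element by element, so the newer timestamp wins and
    the server ID breaks ties (assumed rule, for illustration only).
    """
    return max(local, remote)


# Two clients change the same attribute on different masters at the same time.
write_on_master_1 = (100, 1, "room 12")
write_on_master_2 = (100, 2, "room 47")

# After the masters have replicated to each other, both hold the same value ...
state_1 = merge(write_on_master_1, write_on_master_2)
state_2 = merge(write_on_master_2, write_on_master_1)
print(state_1 == state_2)

# ... but one of the two acknowledged writes has been silently discarded.
print(state_1[2])
```

Both masters converge, yet the client whose write lost the comparison received a success response for a change that no longer exists, which is the integrity concern raised above.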

As an intermediate option between single-master and multi-master operation, there is also the possibility of a mirror-mode operation. In this operating mode, there are also several servers through which the directory service can be accessed with write privileges. However, an external monitoring component always defines one active server that performs the changes. If a server fails, the monitoring component automatically determines which other server becomes the active server. Delta replication is not yet supported in this operating mode. Another disadvantage is that the directory service, although designed redundantly, is no longer available if the monitoring component fails.

Review questions: