Introduction
A service registry is a datastore for the data structures (schemas) and API definitions that applications use to communicate with one another. It serves as a central location where application developers can register and find the schemas used by particular applications.
Why service registries matter
Modern software design relies on distributed and loosely coupled microservices which exchange data through application programming interfaces (APIs).
Across a large enterprise and beyond, this exchange of data among applications is mission critical. All of your applications are sending data back and forth every second to keep your business up and running. So ensuring the integrity of that data is critically important. And how do you make sure all these diverse applications are equipped to actually consume this vital data? One of the key solutions is a service registry.
A messaging system that transports data—like Apache Kafka, for example—does not provide any inherent data verification. What happens if a data producer sends out data that is not consumable? For example, what if the producer adds or removes a field or changes the data format? If the consumer of the data is not apprised of this change, it will not be able to process the data properly—and the worst case scenario is that the entire system breaks down.
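The failure described above can be reproduced in a few lines. The following is a minimal sketch, not tied to Kafka or any real client library: the message shape, the field names, and the consume_order function are all hypothetical, chosen only to show a consumer breaking when a producer renames a field without notice.

```python
import json


def consume_order(raw: bytes) -> float:
    """Consumer written against the original message shape: {"id", "amount"}."""
    message = json.loads(raw)
    # Hypothetical processing step that depends on the "amount" field.
    return message["amount"] * 2


# Original producer output: the consumer handles it without trouble.
old_payload = json.dumps({"id": 1, "amount": 10.0}).encode()
total = consume_order(old_payload)

# The producer renames "amount" to "total" and tells no one.
new_payload = json.dumps({"id": 1, "total": 10.0}).encode()
try:
    consume_order(new_payload)
    broke = False
except KeyError:
    # The transport delivered the bytes faithfully; only the consumer notices.
    broke = True
```

Nothing in the transport layer objects to the second payload. The error surfaces only inside the consumer, which is why the change is hard to trace back to its cause.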
Before any data exchange takes place, the data consumer needs to know the structure of the data—called the schema—that is being used by the producer. The consumer also needs to know when any changes are made to that schema. The data must be able to evolve without causing a disruption in the messaging system.
A producer can send the schema to the consumers manually, by sending an email with the file attached, for example. However, as with many manual processes, this can be complicated, error prone, and difficult to audit—and the end result is that services stop working, and there is no easy way to pinpoint the cause of the failure.
On the other hand, a service registry can provide this information via an easily accessible platform. It serves as a central location where developers of producer applications can register the schemas they are using for particular applications. Developers of consumer applications use the same registry to find those schemas, so that their applications can consume data from that producer. Schema formats that can be stored in a service registry include Apache Avro, JSON Schema, and Google Protocol Buffers.
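The register-and-find workflow can be sketched with a toy in-memory registry. This is an illustration under stated assumptions, not the API of any real registry product: the InMemoryRegistry class, its register and lookup methods, and the "orders-value" artifact ID are all invented for the example. The schema itself is an ordinary JSON Schema document.

```python
import copy


class InMemoryRegistry:
    """Toy stand-in for a service registry: maps an artifact ID to a schema."""

    def __init__(self) -> None:
        self._artifacts: dict[str, dict] = {}

    def register(self, artifact_id: str, schema: dict) -> None:
        # Store a copy so later edits by the caller do not alter the registry.
        self._artifacts[artifact_id] = copy.deepcopy(schema)

    def lookup(self, artifact_id: str) -> dict:
        # Raises KeyError if nothing is registered under this ID.
        return self._artifacts[artifact_id]


registry = InMemoryRegistry()

# Producer side: publish the schema used for a hypothetical "orders" stream.
order_schema = {
    "type": "object",
    "properties": {"id": {"type": "integer"}, "amount": {"type": "number"}},
    "required": ["id", "amount"],
}
registry.register("orders-value", order_schema)

# Consumer side: fetch the same schema by its ID before reading any data.
fetched = registry.lookup("orders-value")
```

The point of the sketch is the shared lookup key: producer and consumer never exchange the schema directly, they only agree on where to find it.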
In addition to schemas, your service registry can store other assets, also called "artifacts." For example, API specifications for application-level synchronous communication can also be stored in the service registry. As your services become more numerous and complex, the service registry becomes more useful.
The service registry is a concept that has been around for many years, but it has recently attracted renewed interest because it is well suited to governing data exchange among microservices. The service registry serves as a single source of truth about the data structure of a given application, agreed upon by developers of the producer and consumer applications. It supports a "contract-first" approach. Rather than coding the application first, and then offering a contract as an afterthought so other applications or organizations can communicate with your application, the service registry specifies the contract up front, including inputs, outputs, payload specifications, and possibly even validation rules. Everything is clearly posted so there is no question about how interactions should work.
Using a service registry with Apache Kafka
Taking Apache Kafka as an example, let's examine how a service registry works. A service registry is a good fit here because Kafka does not provide consumers with the data structure automatically and does not verify data. Since Kafka does not parse or even read your data, it does not spend resources on inspection, and consequently is able to distribute data to consumers very quickly. If Kafka did take the time to verify data, performance would be much lower. So the lack of data governance is not a flaw in Kafka. On the contrary, it enables one of Kafka's main advantages: high performance.
However, you must implement some other governance on the structure of the data, to ensure the consuming applications can consume the data properly. The solution is a service registry, which not only provides rules but also enforces them.
Producers and consumers exchange data via Kafka. A service registry lets them document, share, and agree on the metadata that defines that traffic from the start, to avoid data-related errors down the road. The metadata is provided in the form of schemas stored in the service registry.
The developer of the producer application registers the schema in the service registry, which then ensures that the producer is adhering to the specifications of its own schema. The service registry will even go so far as to reject bad data that does not conform to the registered schema.
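Rejecting non-conforming data can be pictured as a validation step that runs before a record is sent. The sketch below is a deliberately simplified stand-in: the schema is a plain mapping from field name to Python type rather than Avro or JSON Schema, and SchemaViolation, validate, send, and the outbox list (standing in for a topic) are hypothetical names.

```python
class SchemaViolation(Exception):
    """Raised when a record does not match the registered schema."""


# Hypothetical, simplified schema: field name -> required Python type.
ORDER_SCHEMA = {"id": int, "amount": float}


def validate(record: dict, schema: dict) -> None:
    """Raise SchemaViolation if the record's fields or types differ from the schema."""
    missing = schema.keys() - record.keys()
    extra = record.keys() - schema.keys()
    if missing or extra:
        raise SchemaViolation(f"missing={sorted(missing)} extra={sorted(extra)}")
    for field, expected in schema.items():
        if not isinstance(record[field], expected):
            raise SchemaViolation(f"{field!r} should be {expected.__name__}")


def send(record: dict, outbox: list) -> None:
    """Validate first; only conforming records reach the outbox."""
    validate(record, ORDER_SCHEMA)
    outbox.append(record)


outbox: list = []
send({"id": 1, "amount": 9.5}, outbox)  # conforms, so it is accepted

try:
    send({"id": 2, "amount": "9.5"}, outbox)  # amount is a string, so it is rejected
except SchemaViolation as err:
    rejected_reason = str(err)
```

Because the check runs on the producer side, the bad record never reaches the messaging system, and the error message names the offending field.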
Schemas can be registered by a developer of a specific producer application, as described above, or they can be registered by an organization for general use among the development team. In this second case, the service registry functions as a library for developers of producer applications, as well as consumer applications.
Meanwhile, developers of consumer applications also utilize the service registry like a library, retrieving schemas so they are able to build applications that consume the data from producer applications. When changes are made to the schema, the service registry provides consumers with the latest updated schema.
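Serving the latest schema implies that the registry keeps a version history per artifact. A minimal sketch of that idea follows. The VersionedRegistry class and its method names are assumptions made for illustration, and the schemas are reduced to a list of field names.

```python
class VersionedRegistry:
    """Toy registry that keeps every version of each schema (hypothetical API)."""

    def __init__(self) -> None:
        self._versions: dict[str, list[dict]] = {}

    def register(self, artifact_id: str, schema: dict) -> int:
        """Append a new version and return its number (numbering starts at 1)."""
        versions = self._versions.setdefault(artifact_id, [])
        versions.append(schema)
        return len(versions)

    def latest(self, artifact_id: str) -> tuple[int, dict]:
        """Return the newest version number together with its schema."""
        versions = self._versions[artifact_id]
        return len(versions), versions[-1]

    def get(self, artifact_id: str, version: int) -> dict:
        """Return one specific version, for consumers pinned to an older contract."""
        if version < 1:
            raise ValueError("version numbers start at 1")
        return self._versions[artifact_id][version - 1]


registry = VersionedRegistry()
v1 = registry.register("orders-value", {"fields": ["id", "amount"]})
v2 = registry.register("orders-value", {"fields": ["id", "amount", "currency"]})

# A consumer asking for the latest schema sees the added "currency" field.
latest_version, latest_schema = registry.latest("orders-value")
```

Keeping older versions retrievable lets a consumer that has not yet been updated continue to read against the contract it was built for.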
Benefits of a service registry
A service registry delivers the following advantages to the development team and the business:
Decouple data structure from applications
You can use a service registry to decouple the structure of your data from your applications and to share and manage your data structures and API descriptions at runtime using a REST interface.
Superior data quality
A service registry validates the schema and detects errors in the data to ensure data integrity. The service registry can include rules to ensure that uploaded content is syntactically and semantically valid, and is backward and forward compatible with other versions. For example, the service registry will stop a producer from sending bad data that does not conform to the schema.
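A backward-compatibility rule can be sketched as a check that runs when a new schema version is uploaded. The rule set below is a simplified assumption for illustration, loosely modeled on how record schemas usually evolve. It is not the exact rule of any particular registry or schema format, and the schema representation (field name mapped to a type and an optional default) is invented for the example.

```python
def is_backward_compatible(old: dict, new: dict) -> bool:
    """Return True if a reader using `new` can read records written with `old`.

    Simplified rules (an assumption for illustration):
    - a field present in both versions must keep its type;
    - a field added in `new` must declare a default value.
    Fields removed in `new` are simply ignored by the reader.
    """
    for name, spec in new.items():
        if name in old:
            if old[name]["type"] != spec["type"]:
                return False
        elif "default" not in spec:
            return False
    return True


v1 = {"id": {"type": "int"}, "amount": {"type": "double"}}

# Adds a field with a default: old records can still be read.
v2_ok = {**v1, "currency": {"type": "string", "default": "USD"}}

# Adds a field with no default: old records would lack a value for it.
v2_bad = {**v1, "currency": {"type": "string"}}

# Changes the type of an existing field.
v2_retyped = {"id": {"type": "string"}, "amount": {"type": "double"}}

results = [
    is_backward_compatible(v1, v2_ok),
    is_backward_compatible(v1, v2_bad),
    is_backward_compatible(v1, v2_retyped),
]
```

A registry that enforces such a rule would accept the first candidate and refuse the other two at upload time, before any consumer is affected.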
Single documented source of truth
A service registry provides a single source of truth, validated and agreed upon by all parties involved.
Increased developer productivity
The service registry enables consistent reuse of schemas and API designs, saving developers time when building a producer or consumer application.
Cost savings
Detecting data-related errors early in the development lifecycle, rather than at runtime, avoids the higher cost of finding and fixing those errors downstream.