Deploying Apache Kafka on a container orchestration platform like Kubernetes allows event-driven applications to be automated, scaled, and deployed anywhere. In short, Kubernetes magnifies the inherent flexibility of apps built on Apache Kafka.
Why run Apache Kafka on Kubernetes?
Apache Kafka is frequently deployed on the Kubernetes container management system, which is used to automate deployment, scaling, and operation of containers across clusters of hosts. Apache Kafka on Kubernetes goes hand-in-hand with cloud-native development, the next generation of application development. Cloud-native applications are independent, loosely coupled, and distributed services that deliver high scalability via the cloud. In the same way, the event-driven applications built on Kafka are loosely coupled and designed to scale across a distributed hybrid cloud environment.
A key benefit for operations teams of running Apache Kafka on Kubernetes is infrastructure abstraction: it can be configured once and run everywhere. Modern operations teams typically manage diverse arrays of on-premises and cloud resources, and Kubernetes lets them treat these assets as pools of compute to which they can allocate their software, including Apache Kafka. The same Kubernetes layer also provides a single environment for managing all of their Apache Kafka instances.
The inherent scalability of Kubernetes is a natural complement to Apache Kafka. Kubernetes allows applications to scale resources up and down with a simple command, or to scale automatically based on usage, making the most economical use of computing, networking, and storage resources. Kubernetes also gives Apache Kafka the portability to span on-premises, public, private, and hybrid clouds, and to run on different operating systems.
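As an illustration of that elasticity, a Kafka consumer application deployed on Kubernetes can be scaled automatically with a standard HorizontalPodAutoscaler. This is a minimal sketch, not Kafka-specific, and the deployment name "order-consumer" is purely illustrative:

```yaml
# Hypothetical autoscaler for a Kafka consumer deployment named
# "order-consumer" (illustrative name). Standard Kubernetes HPA:
# Kubernetes adds or removes pods to hold average CPU near the target.
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: order-consumer
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: order-consumer
  minReplicas: 2
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 75
```

Manual scaling is the "simple command" case: something like `kubectl scale deployment order-consumer --replicas=5` resizes the same deployment on demand.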
Strimzi: Making Apache Kafka work on Kubernetes
Operating Apache Kafka manually is a complex endeavor that requires extensive configuration of many components. Running Apache Kafka on bare metal, or on virtual machines for that matter, is complicated: deploying, monitoring, updating, and rolling back the nodes are all difficult, error-prone tasks.
Solving this complexity is where the open source Strimzi project enters the picture. Strimzi uses operators to deploy Apache Kafka configurations smoothly and seamlessly. Operators, the state-of-the-art mechanism for deploying and managing applications on Kubernetes, provide development flexibility because they abstract away the infrastructure, allowing developers to deploy applications without detailed knowledge of it. The developer doesn't need to know the technicalities, such as how many machines or what type of hardware, because operators automatically provision the infrastructure and manage the details.
Strimzi offers the benefits of Infrastructure as Code (IaC): the developer writes a code-like, declarative description of the desired infrastructure, and Strimzi carries out those instructions. Strimzi can even simplify the deployment of Apache Kafka in high-availability modes, which is otherwise difficult.
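As a sketch of that declarative style, assuming the Strimzi `v1beta2` Kafka custom resource (field names and the recommended layout vary across Strimzi versions, e.g. with KRaft and node pools), an entire highly available cluster can be described in one manifest that the operator reconciles:

```yaml
# Illustrative Strimzi Kafka custom resource: the cluster name
# "my-cluster" and sizes are example values, not prescriptions.
apiVersion: kafka.strimzi.io/v1beta2
kind: Kafka
metadata:
  name: my-cluster
spec:
  kafka:
    replicas: 3              # three brokers for high availability
    listeners:
      - name: plain
        port: 9092
        type: internal
        tls: false
    storage:
      type: persistent-claim # durable broker storage via PVCs
      size: 100Gi
  zookeeper:
    replicas: 3
    storage:
      type: persistent-claim
      size: 10Gi
  entityOperator:
    topicOperator: {}        # manage topics as KafkaTopic resources
    userOperator: {}         # manage users as KafkaUser resources
```

Applying the manifest (for example with `kubectl apply -f`) is the whole "instruction manual": the operator provisions the pods, services, and storage, and a later edit, such as changing `replicas`, is rolled out for you.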
The operator in Strimzi addresses many security concerns for Apache Kafka, which is another important reason to run it. Strimzi automates security for Apache Kafka on Kubernetes, with single sign-on, encryption, and authentication, so the developer does not have to spend time implementing basic security features.
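For illustration, again assuming the Strimzi `v1beta2` API, encryption and authentication are enabled declaratively on a listener inside the Kafka resource. This fragment sketches a TLS-encrypted listener with mutual-TLS client authentication; Strimzi also offers an `oauth` authentication type for single sign-on setups:

```yaml
# Fragment of a Strimzi Kafka resource (not a complete manifest).
# Listener and field values are illustrative.
spec:
  kafka:
    listeners:
      - name: secure
        port: 9093
        type: internal
        tls: true            # encrypt traffic on this listener
        authentication:
          type: tls          # mutual-TLS client authentication
    authorization:
      type: simple           # ACLs, managed through KafkaUser resources
```

The operator then generates and rotates the cluster and client certificates itself, which is most of the work a developer would otherwise do by hand.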
Finding the right EDA solutions
Red Hat AMQ Streams, part of Red Hat Integration, is the Red Hat enterprise distribution of Apache Kafka and the Strimzi project. Much of the extra value that AMQ Streams brings to Apache Kafka is focused on the use of Apache Kafka on Kubernetes, or on Red Hat OpenShift, the Red Hat distribution of Kubernetes.
Red Hat AMQ Streams on OpenShift delivers Apache Kafka on Kubernetes to enable enterprise-grade, event-driven architectures that support distributed data streams and stream-processing microservices-based applications. AMQ Streams is particularly well-suited for high-scale, high-throughput scenarios because the inherent partitioning in Apache Kafka helps address scalability requirements.