Why run Apache Kafka on Kubernetes?

Deploying Apache Kafka on a container orchestration platform like Kubernetes allows event-driven applications to be automated, scaled, and deployed anywhere. In short, Kubernetes magnifies the inherent flexibility of apps built on Apache Kafka.

Apache Kafka is frequently deployed on Kubernetes, a container management system that automates the deployment, scaling, and operation of containers across clusters of hosts. Apache Kafka on Kubernetes goes hand in hand with cloud-native development. Cloud-native applications are independent, loosely coupled, distributed services that deliver high scalability via the cloud. In the same way, the event-driven applications built on Kafka are loosely coupled and designed to scale across a distributed hybrid cloud environment.

A key benefit for operations teams of running Apache Kafka on Kubernetes is infrastructure abstraction: a deployment can be configured once and run everywhere. Operations teams today typically manage a diverse mix of on-premises and cloud resources, and Kubernetes allows them to treat these assets as pools of compute resources to which they can allocate their software, including Apache Kafka. Furthermore, this same Kubernetes layer provides a single environment for managing all of their Apache Kafka instances.

The inherent scalability of Kubernetes is a natural complement to Apache Kafka. Kubernetes allows applications to scale resources up and down with a simple command, or scale automatically based on usage, to make the most economical use of computing, networking, and storage resources. Kubernetes also gives Apache Kafka the portability to span on-premises environments and public, private, or hybrid clouds, and to run on different operating systems.
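To make the idea of scaling "automatically based on usage" concrete, the Kubernetes Horizontal Pod Autoscaler documents its replica calculation as the current replica count multiplied by the ratio of the observed metric to the target metric, rounded up. The Python sketch below reproduces only that arithmetic. The function name and the sample numbers are illustrative assumptions, and the real autoscaler also applies tolerances and stabilization behavior that are not modeled here.

```python
import math


def desired_replicas(current_replicas: int, current_metric: float, target_metric: float) -> int:
    """Replica count suggested by the documented HPA ratio formula.

    desired = ceil(current_replicas * current_metric / target_metric)
    """
    if current_replicas < 1:
        raise ValueError("current_replicas must be at least 1")
    if target_metric <= 0:
        raise ValueError("target_metric must be positive")
    return math.ceil(current_replicas * current_metric / target_metric)


if __name__ == "__main__":
    # Hypothetical workload: 4 pods averaging 90% CPU against a 60% target.
    print(desired_replicas(4, 90.0, 60.0))
    # The same pods averaging 30% CPU against the same target.
    print(desired_replicas(4, 30.0, 60.0))
```

In this sketch a workload running above its target grows and one running below it shrinks, which is the behavior the paragraph above describes in prose.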

Operating Apache Kafka manually is a complex endeavor that requires extensive configuration of many components. This is true whether it runs on bare metal or on virtual machines: deploying, monitoring, updating, and rolling back the nodes are all difficult tasks.

Solving this complexity is where the Strimzi open source project enters the picture. Strimzi uses operators to deploy and manage Apache Kafka configurations. Operators, an established Kubernetes mechanism for deploying and managing applications, provide development flexibility because they abstract at the infrastructure level, allowing developers to deploy applications without much information about the infrastructure. The developer doesn't need to know the technicalities (such as how many machines or what type of hardware) because operators automatically provision the infrastructure and manage the details.

Strimzi offers the benefits of Infrastructure as Code (IaC): the developer writes a declarative, code-like description of the desired infrastructure, and Strimzi carries out those instructions. Strimzi can also simplify the deployment of Apache Kafka in high availability modes, which is otherwise difficult.
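As a rough illustration of what such a code-like description can look like, the Python sketch below builds a Kafka custom resource of the kind a Strimzi operator watches and prints it as JSON. The field names follow the `kafka.strimzi.io/v1beta2` schema as an assumption. Newer Strimzi releases move toward KRaft and node pools and may not accept the `zookeeper` section, so check the documentation for the installed version. The function name, cluster name, and storage sizes are hypothetical.

```python
import json


def kafka_cluster_manifest(name: str, brokers: int = 3) -> dict:
    """Build a Strimzi-style Kafka custom resource as a plain dict (a sketch)."""
    replication = min(brokers, 3)
    return {
        "apiVersion": "kafka.strimzi.io/v1beta2",  # assumed API version
        "kind": "Kafka",
        "metadata": {"name": name},
        "spec": {
            "kafka": {
                # Several brokers plus replicated topics give high availability.
                "replicas": brokers,
                "listeners": [
                    # Internal listener with TLS encryption enabled.
                    {"name": "tls", "port": 9093, "type": "internal", "tls": True},
                ],
                "config": {
                    "default.replication.factor": replication,
                    "min.insync.replicas": max(1, replication - 1),
                },
                "storage": {"type": "persistent-claim", "size": "100Gi"},
            },
            "zookeeper": {
                "replicas": 3,
                "storage": {"type": "persistent-claim", "size": "10Gi"},
            },
            # Operators that manage topics and users declaratively.
            "entityOperator": {"topicOperator": {}, "userOperator": {}},
        },
    }


if __name__ == "__main__":
    # Hypothetical cluster name; the JSON output describes the desired state.
    print(json.dumps(kafka_cluster_manifest("my-cluster"), indent=2))
```

The printed JSON could be saved to a file and applied with `kubectl apply -f`, assuming a Strimzi operator is already installed in the cluster. The point of the sketch is that the desired cluster is stated as data, and the operator is responsible for making the running system match it.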

Security is another important reason to run Strimzi. Its operator addresses many security concerns for Apache Kafka on Kubernetes, automating single sign-on, encryption, and authentication, so the developer does not have to spend time implementing basic security features.

Red Hat AMQ Streams, part of Red Hat Integration, is a Red Hat enterprise distribution of Apache Kafka and the Strimzi project. Much of the extra value that AMQ Streams brings to Apache Kafka is focused on running Apache Kafka on Kubernetes, or on Red Hat OpenShift, which is the Red Hat distribution of Kubernetes.

Red Hat AMQ Streams on OpenShift delivers Apache Kafka on Kubernetes to enable enterprise-grade, event-driven architectures that support distributed data streams and stream-processing microservices-based applications. AMQ Streams is particularly well-suited for high-scale, high-throughput scenarios because the inherent partitioning in Apache Kafka helps address scalability requirements.
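The role partitioning plays in scalability can be sketched in a few lines of Python: records with the same key always land on the same partition, and different keys spread across partitions so that brokers and consumers can share the load. Kafka's default partitioner hashes keyed records with murmur2. This sketch swaps in CRC32 from the standard library as a stand-in, so the partition numbers it produces will not match a real cluster. The key format and the partition count are made-up examples.

```python
import zlib
from collections import Counter


def partition_for(key: bytes, num_partitions: int) -> int:
    """Map a record key to a partition by hashing (CRC32 stand-in, not murmur2)."""
    if num_partitions < 1:
        raise ValueError("num_partitions must be at least 1")
    return zlib.crc32(key) % num_partitions


if __name__ == "__main__":
    num_partitions = 6  # hypothetical topic with six partitions
    keys = [f"customer-{i}".encode() for i in range(1000)]
    counts = Counter(partition_for(k, num_partitions) for k in keys)
    # Show how many of the sample keys each partition would receive.
    for partition in sorted(counts):
        print(partition, counts[partition])
```

Because the mapping depends only on the key and the partition count, adding partitions raises the ceiling on parallel consumers, at the cost of changing which partition existing keys map to.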
