Until Kubernetes Federation is ready for prime time, a number of stopgap solutions have sprung up for spreading cluster endpoints across geographic locations: stretch clusters and multiple clusters across multiple datacenters. This article describes in detail how to configure HAproxy to load balance across the endpoints of multiple clusters, and how to configure Keepalived to keep HAproxy itself highly available.
In a production environment, global server load balancing (GSLB) or a Global Traffic Manager (GTM) would be used to return a different IP address based on the originating location of the request. This helps ensure that traffic from Virginia or New York is directed to the location closest to the originating request.
To simulate geographically dispersed DNS, two A records were created for the same hostname, one for the endpoint in each datacenter. Each record points to a virtual IP address (VIP), and each HAproxy node normally owns one of them, so the configuration resembles an active/active cluster. Lastly, each HAproxy server prefers to serve traffic from the service pool in its home datacenter.
dig +noall +answer haproxy.example.com
haproxy.example.com. 1800 IN A 10.19.114.20
haproxy.example.com. 1800 IN A 10.19.114.21
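As a quick sanity check on the simulated DNS, the dig answer can be piped through a small helper that confirms both VIPs are published. This is only a sketch and not part of the original setup: the function name check_vips is hypothetical, and it assumes input in the `dig +noall +answer` format shown above.

```shell
# check_vips VIP...: read `dig +noall +answer` output on stdin and verify
# that every address given as an argument appears as an A record.
# (Hypothetical helper; it only parses text and performs no DNS lookup.)
check_vips() {
  answer=$(cat)
  missing=0
  for vip in "$@"; do
    # In the dig answer format, field 4 is the record type and field 5 the address.
    if ! printf '%s\n' "$answer" |
        awk -v ip="$vip" '$4 == "A" && $5 == ip { found = 1 } END { exit !found }'; then
      echo "missing A record for $vip" >&2
      missing=1
    fi
  done
  return "$missing"
}
```

A possible invocation is `dig +noall +answer haproxy.example.com | check_vips 10.19.114.20 10.19.114.21`, which returns a non-zero status if either address is absent.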
This article assumes HAproxy and Keepalived have already been installed and partially configured. For more information on preparing nodes for HAproxy and Keepalived please see the following article. At the end of this document, fully functioning configuration files will be included as a template for use.
Two HAproxy instances will be used, one for each datacenter and its OpenShift Container Platform cluster.
This image is a reflection of the final HAproxy configuration:
HAproxy Load Balancer Configuration
The HAproxy load balancers distribute traffic across port groups. A sample config for Datacenter A's HAproxy is shown below:
frontend main80 *:80
    default_backend router80
backend router80
    balance source
    option allbackups
    mode tcp
    server clus1-infra-0.example.com clus1-infra-0.example.com:80 check
    server clus1-infra-1.example.com clus1-infra-1.example.com:80 check
    server clus1-infra-2.example.com clus1-infra-2.example.com:80 check
    server clus2-infra-0.example.com clus2-infra-0.example.com:80 check backup
    server clus2-infra-1.example.com clus2-infra-1.example.com:80 check backup
    server clus2-infra-2.example.com clus2-infra-2.example.com:80 check backup
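Notice that the load balancer prefers the local datacenter's nodes in clus1 and uses the clus2 nodes, which are marked backup, only when the health checks on all of the clus1 nodes fail. With option allbackups set, traffic is then balanced across all of the backup servers instead of being sent only to the first one.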
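To confirm which servers a backend treats as primary and which as backup, the config can be summarized with a short awk filter. The sketch below is an illustration under assumptions: list_servers is a hypothetical helper name, and it only understands simple `server <name> <address> [options]` lines like the ones in this article.

```shell
# list_servers BACKEND: read an HAProxy config on stdin and print
# "primary <name>" or "backup <name>" for each server in that backend.
# (Hypothetical helper; it does not validate the config.)
list_servers() {
  awk -v want="$1" '
    # Any section keyword starts a new section and ends the previous one.
    $1 ~ /^(global|defaults|frontend|backend|listen)$/ {
      in_backend = ($1 == "backend" && $2 == want)
      next
    }
    # Server options start at field 4; look for the "backup" keyword.
    in_backend && $1 == "server" {
      role = "primary"
      for (i = 4; i <= NF; i++) if ($i == "backup") role = "backup"
      print role, $2
    }
  '
}
```

Assuming the config lives at the default /etc/haproxy/haproxy.cfg, running `list_servers router80 < /etc/haproxy/haproxy.cfg` on each load balancer should show the clus1 and clus2 roles swapped between the two datacenters.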
The opposite configuration in Datacenter B may look like this:
..omitted..
frontend main80 *:80
    default_backend router80
backend router80
    balance source
    option allbackups
    mode tcp
    server clus2-infra-0.example.com clus2-infra-0.example.com:80 check
    server clus2-infra-1.example.com clus2-infra-1.example.com:80 check
    server clus2-infra-2.example.com clus2-infra-2.example.com:80 check
    server clus1-infra-0.example.com clus1-infra-0.example.com:80 check backup
    server clus1-infra-1.example.com clus1-infra-1.example.com:80 check backup
    server clus1-infra-2.example.com clus1-infra-2.example.com:80 check backup
..omitted..
The inverse applies here: HAproxy B prefers nodes in clus2. Keepalived manages the virtual IPs shared between the two HAproxy nodes and, if one node fails, moves that node's VIP to the surviving node.
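To see whether a given load balancer currently holds a VIP, the interface's addresses can be inspected with the ip command. The helper below is a sketch, not part of the original configuration: holds_vip is a hypothetical name, and it assumes the one-line-per-address format produced by `ip -o -4 addr show`.

```shell
# holds_vip VIP: read `ip -o -4 addr show` output on stdin and succeed
# if the given address is configured on this host.
# (Hypothetical helper; field 3 is "inet" and field 4 is "address/prefix".)
holds_vip() {
  awk -v vip="$1" '
    $3 == "inet" { split($4, a, "/"); if (a[1] == vip) found = 1 }
    END { exit !found }
  '
}
```

For example, `ip -o -4 addr show dev ens192 | holds_vip 10.19.114.20 && echo "VIP present"` could be run on either HAproxy node; ens192 is the interface name used in the Keepalived configuration in this article.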
Keepalived Configuration
Much like the HAproxy configuration above, the Keepalived configuration differs in each datacenter.
In Datacenter A, we have the following Keepalived config for the VIPs:
..omitted..
vrrp_instance OCP_vi1 {
    state MASTER
    interface ens192
    virtual_router_id 51
    priority 100
    advert_int 10
    unicast_src_ip 10.19.114.18
    unicast_peer {
        10.19.114.19
    }
    virtual_ipaddress {
        10.19.114.20
    }
..omitted..
vrrp_instance OCP_vi2 {
    state BACKUP
    interface ens192
    virtual_router_id 61
    priority 98
    advert_int 10
    unicast_src_ip 10.19.114.18
    unicast_peer {
        10.19.114.19
    }
    virtual_ipaddress {
        10.19.114.21
    }
..omitted..
The mirrored configuration for Datacenter B resembles the following:
..omitted..
vrrp_instance OCP_vi1 {
    state BACKUP
    interface ens192
    virtual_router_id 51
    priority 98
    advert_int 10
    unicast_src_ip 10.19.114.19
    unicast_peer {
        10.19.114.18
    }
    virtual_ipaddress {
        10.19.114.20
        # dev ens192
    }
..omitted..
vrrp_instance OCP_vi2 {
    state MASTER
    interface ens192
    virtual_router_id 61
    priority 100
    advert_int 10
    unicast_src_ip 10.19.114.19
    unicast_peer {
        10.19.114.18
    }
    virtual_ipaddress {
        10.19.114.21
    }
Notice that for instance OCP_vi1 the load balancer in datacenter A is the preferred owner, with datacenter B as the backup. For OCP_vi2 the roles are reversed.
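The priority and advert_int values above determine who owns each VIP and roughly how long a takeover can take. The sketch below models both, under stated assumptions: the function names are hypothetical, the election ignores tie-breaking and preemption settings, and the timing uses the VRRPv2 formula from RFC 3768 (master down interval = 3 x advertisement interval + skew, where skew = (256 - priority) / 256 seconds), which applies when a master disappears without a clean shutdown.

```shell
# elect_master: read "node priority" lines for the nodes that are still
# alive and print the node with the highest priority.
# (Simplified model; real VRRP breaks priority ties by IP address.)
elect_master() {
  sort -k2,2nr | head -n 1 | awk '{ print $1 }'
}

# master_down_interval ADVERT_INT PRIORITY: seconds a backup waits without
# hearing an advertisement before it takes over (VRRPv2, RFC 3768).
master_down_interval() {
  awk -v adv="$1" -v prio="$2" \
    'BEGIN { printf "%.3f\n", 3 * adv + (256 - prio) / 256 }'
}
```

With the values in this article, `printf 'haproxy-0 100\nhaproxy-1 98\n' | elect_master` picks haproxy-0 for OCP_vi1, and `master_down_interval 10 98` suggests the backup could wait a little over 30 seconds before claiming the VIP. A smaller advert_int would shorten that window at the cost of more frequent advertisements.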
Testing failover
Additionally, an HAproxy group has been set up to test failover via round robin load balancing.
# Both VIPs are online and load balancing will bounce to either datacenter
[root@stretch-master-0 ~]# while [ 1 ];do curl haproxy:81 && sleep 5;done
clus1-infra-2
clus2-infra-0
clus1-infra-1
clus2-infra-2
clus1-infra-0
clus2-infra-1
# Fail datacenter A
[root@haproxy-0 ~]# systemctl stop keepalived
[root@clus1-master-0 ~]# while [ 1 ];do curl haproxy:81 && sleep 5;done
clus2-infra-0
clus2-infra-1
clus2-infra-2
clus2-infra-0
# Restore datacenter A then fail datacenter B
[root@haproxy-0 ~]# systemctl start keepalived
[root@haproxy-1 ~]# systemctl stop keepalived
[root@clus1-master-0 ~]# while [ 1 ];do curl haproxy:81 && sleep 5;done
clus1-infra-2
clus1-infra-1
clus1-infra-0
clus1-infra-1
clus1-infra-0
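Reading the responses by eye works for a handful of requests; for longer runs, the output can be tallied per cluster. The helper below is a sketch under assumptions: tally_clusters is a hypothetical name, and it expects one hostname per line in the `clusN-infra-M` form shown in the transcripts above.

```shell
# tally_clusters: read one responding hostname per line on stdin and
# print "<cluster> <count>" for each cluster seen.
# (Hypothetical helper; assumes names of the form clusN-infra-M.)
tally_clusters() {
  # Strip the "-infra-M" suffix, then count occurrences per cluster.
  sed 's/-infra-[0-9]*$//' | sort | uniq -c | awk '{ print $2, $1 }'
}
```

One possible use is `for i in 1 2 3 4 5 6; do curl -s haproxy:81; sleep 5; done | tally_clusters`. With both VIPs online, both clusters should appear in the tally; after Keepalived is stopped in one datacenter, only the surviving datacenter's cluster should be listed.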
Conclusion
This post has described the configuration of HAproxy and Keepalived to keep the services of multiple OpenShift Container Platform clusters online and highly available in the event of a failure. This configuration, coupled with OCP's HA features, provides maximum uptime for containers and microservices in your production environment.
Complete Configuration Files:
Datacenter A Configuration Files:
* haproxy-dc-a.cfg
* keepalived-dc-a.conf
Datacenter B Configuration Files:
* haproxy-dc-b.cfg
* keepalived-dc-b.conf