, , ,

HA Galera Cluster with HAProxy and Keepalived: Simplifying the Setup

When managing a database cluster in production, ensuring high availability and load balancing is critical. Initially, we used GLB (Global Load Balancer) on each webserver to balance connections to our four Galera MySQL servers. GLB worked fine, but maintaining it across multiple webservers quickly became cumbersome. So we decided to simplify and centralize our setup using HAProxy for load balancing and Keepalived for failover.

This blog post explains why we moved away from GLB, how we implemented HAProxy and Keepalived, and the benefits we gained from this setup.


Why We Replaced GLB with HAProxy

GLB served its purpose but came with significant downsides:

  1. Maintenance Overhead
    • GLB had to be installed and configured on every webserver. This is not required but well, technical dept… Any updates or changes required adjusting multiple configurations, which added complexity.
  2. Scalability Challenges
    • Scaling the database infrastructure or adding new servers required reconfiguring GLB across all webservers.
  3. Centralized Load Balancing with HAProxy
    • By centralizing load balancing to HAProxy, we simplified management. HAProxy runs on just two servers, allowing us to manage all traffic in one place. And we already had them anyway.
  4. HAProxy’s Advanced Features
    • HAProxy offers robust health checks, detailed stats, and better performance under high load. It is designed to handle large amounts of TCP traffic efficiently.
  5. Keepalived for High Availability
    • With Keepalived, we introduced a Virtual IP (VIP) that floats between two HAProxy servers. If one HAProxy server fails, the VIP moves seamlessly to the backup server, ensuring uninterrupted access.

In short, HAProxy + Keepalived reduced maintenance, increased reliability, and made scaling our Galera cluster much easier.


The Architecture

Here’s the current setup:

  1. Web Requests
    • Users access the web servers through two virtual IPs (VIPs) managed by Keepalived.
  2. Database Connections
    • The web servers now connect to a single VIP (e.g., 192.168.1.180), managed by Keepalived, which routes traffic through HAProxy.
    • HAProxy load balances these connections across the four Galera database nodes: 192.168.1.81 to 192.168.1.84.
  3. Failover
    • Keepalived ensures the VIP stays online by promoting a backup HAProxy server to MASTER if the primary server fails.

Implementing HAProxy for Galera Load Balancing

The following HAProxy configuration handles load balancing and health checks for the Galera cluster:

Copied!
frontend mysql_frontend bind 192.168.1.180:3306 mode tcp default_backend galera_backend backend galera_backend mode tcp balance roundrobin option mysql-check user haproxy server db1 192.168.1.81:3306 check server db2 192.168.1.82:3306 check server db3 192.168.1.83:3306 check server db4 192.168.1.84:3306 check

Key Points:

  • mode tcp: Ensures HAProxy handles raw MySQL TCP connections.
  • balance roundrobin: Distributes connections evenly across the Galera nodes.
  • mysql-check user haproxy: Performs health checks to ensure nodes are up.

Configuring Keepalived for Failover

To manage the VIP, we set up Keepalived with VRRP (Virtual Router Redundancy Protocol) on both HAProxy servers. Below is the configuration for the primary server:

Copied!
vrrp_instance VI_MYSQL { state MASTER interface eth0 virtual_router_id 53 priority 100 advert_int 1 authentication { auth_type PASS auth_pass S3cr3t } virtual_ipaddress { 192.168.1.180 } track_script { chk_haproxy } }

Explanation:

  • state MASTER: This server is the primary.
  • virtual_ipaddress: The VIP that clients connect to.
  • track_script: Monitors the health of HAProxy using a simple script.

Health Check Script
Keepalived uses this script to ensure HAProxy is running:

Copied!
#!/bin/bash systemctl is-active --quiet haproxy exit $?

Make the script executable:

Copied!
chmod +x /usr/local/bin/chk_haproxy

On the backup HAProxy server, the state is set to BACKUP and the priority is lower (e.g., 90).


Results and Benefits

The switch from GLB to HAProxy + Keepalived delivered immediate improvements:

  1. Reduced Maintenance
    • Instead of managing GLB on multiple web servers, we now manage HAProxy centrally on two servers.
  2. High Availability
    • If the primary HAProxy server fails, Keepalived automatically moves the VIP to the backup server, ensuring continuous operation.
  3. Scalability
    • Adding more Galera nodes or HAProxy servers is straightforward and requires minimal changes.
  4. Simpler Management
    • With all database traffic going through HAProxy, monitoring and troubleshooting are much easier.

By replacing GLB with HAProxy and Keepalived, we significantly reduced maintenance complexity, improved failover reliability, and made the setup easier to scale. HAProxy’s robust features combined with Keepalived’s failover capabilities give us a highly available and efficient load balancing solution for our Galera cluster.

This setup is now live in production and performing flawlessly. If you’re managing a Galera cluster or looking for a scalable load balancing solution, consider HAProxy and Keepalived.


What about you?
Have you faced similar challenges with database load balancing? Let me know what solutions you’ve implemented, or if you have questions about this setup!