In environments where uptime and responsiveness are critical, the database is often the backbone of an application’s infrastructure. A Galera cluster provides a high-availability database system with synchronous multi-master replication. Paired with GLB (Galera Load Balancer), client connections are spread across the cluster so that no single database node becomes a bottleneck. This guide walks through setting up a Galera cluster and configuring GLB to balance connections across it.
What is a Galera Cluster?
A Galera cluster is a multi-master, virtually synchronous replication system for MySQL and MariaDB. Every node in the cluster is a “master” node, so read and write operations can occur on any node. Each transaction’s write set is replicated to all nodes and certified at commit time, which keeps the data consistent across nodes and makes Galera well suited to high-availability environments.
Why Use GLB for Load Balancing?
GLB (Galera Load Balancer) is a lightweight TCP load balancer from Codership, the developers of Galera. It works at the connection level: each incoming client connection is forwarded to one of the configured destinations, and a destination that stops accepting connections is taken out of rotation. On its own GLB does not inspect Galera state; checking that a node is actually synced requires its optional watchdog feature. With GLB in front of the cluster, applications connect to a single address while traffic is spread across the nodes.
Setting Up a Galera Cluster
Prepare the Nodes
Start with at least three nodes (the minimum recommended for high availability), each running MySQL or MariaDB with Galera support.
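The three-node minimum follows from quorum arithmetic: the cluster keeps accepting writes only while a strict majority of its nodes can still see each other. The sketch below works that out for a few cluster sizes. It assumes every node carries the same weight, and the function names are illustrative helpers, not part of Galera.

```shell
# quorum N: smallest number of nodes that is still a strict majority of N
quorum() {
  echo $(( $1 / 2 + 1 ))
}

# tolerated_failures N: nodes that can be lost at once while a majority remains
tolerated_failures() {
  echo $(( $1 - ($1 / 2 + 1) ))
}

for n in 2 3 5; do
  echo "nodes=$n quorum=$(quorum "$n") tolerated_failures=$(tolerated_failures "$n")"
done
```

A two-node cluster tolerates no failures, because losing either node leaves the survivor without a majority. Three nodes is the smallest size that survives the loss of one.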
Install MariaDB with Galera
On each node, install MariaDB (or MySQL) with Galera support.
    sudo apt update
    sudo apt install mariadb-server galera-4
Configure Galera on Each Node
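Edit the MySQL configuration file (/etc/mysql/my.cnf) on each node to configure Galera settings.

    [mysqld]
    # Settings Galera requires
    binlog_format=ROW
    default_storage_engine=InnoDB
    innodb_autoinc_lock_mode=2
    # Listen on all interfaces so the other nodes and the load balancer can connect
    bind-address=0.0.0.0

    wsrep_on=ON
    wsrep_provider=/usr/lib/galera/libgalera_smm.so
    wsrep_cluster_name="my_galera_cluster"
    wsrep_cluster_address="gcomm://IP_NODE1,IP_NODE2,IP_NODE3"
    wsrep_node_name="node1"
    wsrep_sst_method=rsync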
Replace IP_NODE1, IP_NODE2, and IP_NODE3 with the actual IP addresses of each Galera node. Each node should have a unique wsrep_node_name.
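Since the cluster address is identical on every node and only the node name changes, these two lines can be generated rather than typed by hand. The following is a minimal sketch; galera_node_settings is a hypothetical helper, and the 192.0.2.x addresses are documentation placeholders standing in for the real node IPs.

```shell
# galera_node_settings NODE_NAME IP...: print the wsrep lines that depend on
# the node list and on the name of the node being configured.
galera_node_settings() {
  local name=$1
  shift
  local IFS=,   # makes "$*" join the remaining arguments with commas
  printf 'wsrep_cluster_address="gcomm://%s"\n' "$*"
  printf 'wsrep_node_name="%s"\n' "$name"
}

# Settings for the second node of a three-node cluster
galera_node_settings node2 192.0.2.11 192.0.2.12 192.0.2.13
```

Running it once per node, with only the first argument changed, guarantees that all nodes share the same address list.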
Start the Cluster:
On the first node, initialize the cluster:

    sudo galera_new_cluster

On the other nodes, start the MariaDB service:

    sudo systemctl start mariadb
Verify Cluster Status:
Log into MySQL on any node and check the cluster status:

    SHOW STATUS LIKE 'wsrep%';

Ensure wsrep_cluster_size reflects the total number of nodes.
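The same check can be scripted. The sketch below reads status rows as tab-separated name and value pairs, which is the format the mysql client prints in batch mode, and reports whether the node looks healthy. Besides wsrep_cluster_size it also looks at wsrep_cluster_status, wsrep_local_state_comment, and wsrep_ready. The function name check_wsrep and the choice of which variables to test are assumptions made for illustration.

```shell
# check_wsrep EXPECTED_SIZE: read "name<TAB>value" status rows on stdin,
# print "healthy" or "unhealthy", and return 0 or 1 accordingly.
check_wsrep() {
  awk -F'\t' -v want="$1" '
    $1 == "wsrep_cluster_size"        { size = $2 }
    $1 == "wsrep_cluster_status"      { status = $2 }
    $1 == "wsrep_local_state_comment" { state = $2 }
    $1 == "wsrep_ready"               { ready = $2 }
    END {
      ok = (size == want && status == "Primary" && state == "Synced" && ready == "ON")
      print (ok ? "healthy" : "unhealthy")
      exit (ok ? 0 : 1)
    }'
}

# Usage against a live node (not executed here):
#   mysql -N -B -e "SHOW GLOBAL STATUS LIKE 'wsrep%'" | check_wsrep 3
```

The -N flag suppresses the column header and -B selects tab-separated batch output, so each row arrives in the form the function expects.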
Setting Up GLB for Load Balancing
With the Galera cluster in place, we can now set up GLB to handle load balancing.
Install GLB
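On the server that will act as the load balancer, install GLB. It is usually built from source, as packages are not always readily available. The build uses autotools, so a C compiler, make, autoconf, automake, and libtool must be installed first:

    git clone https://github.com/codership/glb.git
    cd glb
    ./bootstrap.sh
    ./configure
    make
    sudo make install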
Configure GLB
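glbd does not read a backend configuration file. The listen address and the list of destinations (the Galera nodes) are passed on the command line, in the form glbd [OPTIONS] LISTEN_ADDRESS [DESTINATION_LIST]. Each destination is written as HOST:PORT and may carry an optional weight (HOST:PORT:WEIGHT). For the three nodes above, the destination list is:

    IP_NODE1:3306 IP_NODE2:3306 IP_NODE3:3306

If glbd is started through the sample service script shipped in the source tree (files/glbd.sh), the same values go into that script's settings file (/etc/default/glbd or /etc/sysconfig/glbd, depending on the distribution) as LISTEN_ADDR and DEFAULT_TARGETS.
Run GLB
Start glbd with the listen port followed by the destination list. For example, to run it as a daemon listening on port 3307:

    sudo glbd --daemon 3307 IP_NODE1:3306 IP_NODE2:3306 IP_NODE3:3306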
Configure Applications to Connect via GLB:
Update the database connection settings in your applications to point to the GLB load balancer server on port 3307 instead of connecting directly to any single Galera node.
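As a quick check before touching application settings, the mysql command-line client can be pointed at the balancer. One detail is easy to miss: when the host is given as localhost, the client uses the local Unix socket and ignores the port, so it never reaches GLB. Forcing TCP avoids that. The sketch below only builds the client options; glb_client_args is a hypothetical helper, and 192.0.2.10 is a placeholder for the load balancer's address.

```shell
# glb_client_args GLB_HOST [PORT]: print mysql client options that force a TCP
# connection to the balancer. PORT defaults to 3307, the port used in this guide.
glb_client_args() {
  local host=$1 port=${2:-3307}
  printf -- '--protocol=TCP --host=%s --port=%s\n' "$host" "$port"
}

glb_client_args 192.0.2.10

# Usage against a live balancer (not executed here; app_user is a placeholder):
#   mysql $(glb_client_args 192.0.2.10) -u app_user -p -e "SELECT @@hostname"
```

Repeating the SELECT @@hostname query over several new connections should return different node names if connections are being spread across the cluster.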
Benefits of Using a Galera Cluster with GLB
- High Availability: Every node holds a full copy of the data, so the cluster keeps serving requests when one node fails. GLB sends new connections to the remaining active nodes.
- Load Distribution: GLB spreads client connections across the cluster. It balances TCP connections rather than individual queries, so both reads and writes are shared among the nodes in proportion to the connections each one receives.
- Automatic Failover: When a node stops accepting connections, GLB removes it from the rotation and puts it back once it is reachable again. Detecting a node that is reachable but not synced requires GLB's optional watchdog checks.
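Failover at the balancer does not rescue a connection that was already open to the failed node. That connection is dropped, and the client has to reconnect to be routed to a surviving node. A small retry wrapper is one way to handle this in scripts. The sketch below is a generic helper written for illustration and is not part of GLB or Galera.

```shell
# retry MAX_ATTEMPTS DELAY_SECONDS COMMAND...: rerun COMMAND until it succeeds
# or MAX_ATTEMPTS is reached. Returns 0 on success, 1 when attempts run out.
retry() {
  local max=$1 delay=$2 attempt=1
  shift 2
  until "$@"; do
    if [ "$attempt" -ge "$max" ]; then
      return 1
    fi
    attempt=$((attempt + 1))
    sleep "$delay"
  done
}

# Usage against a live balancer (not executed here; names are placeholders):
#   retry 5 2 mysql --protocol=TCP -h 192.0.2.10 -P 3307 -u app_user -e "SELECT 1"
```

Each attempt opens a new connection through GLB, so a retry after a node failure lands on one of the remaining nodes. Applications would normally get the same effect from their driver's reconnect or connection-pool settings.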
Conclusion
A Galera cluster behind GLB gives applications a single database endpoint backed by several fully replicated nodes. Replication keeps the data consistent across the cluster, and connection-level balancing spreads the load and routes around failed nodes, which improves performance and reduces downtime.