All examples assume two nodes that are reachable by their short name and IP address:
- node1 - 192.168.1.1
- node2 - 192.168.1.2
The convention followed is that [ALL] # denotes a command that needs to be run on all cluster machines, and [ONE] # indicates a command that only needs to be run on one cluster host.
RHEL 6.4 onwards
Pacemaker ships as part of the Red Hat High Availability Add-on. The easiest way to try it out on RHEL is to install it from the Scientific Linux or CentOS repositories.
If you are already running CentOS or Scientific Linux, you can skip this step. Otherwise, to teach the machine where to find the CentOS packages, run:
[ALL] # cat <
Next we use yum to install pacemaker and the other packages we will need:
[ALL] # yum install pacemaker cman pcs ccs resource-agents
Configure Cluster Membership and Messaging
The supported stack on RHEL 6 is based on CMAN, so that's what Pacemaker uses too.
We now create a CMAN cluster and populate it with some nodes. Note that the name cannot exceed 15 characters (we'll use 'pacemaker1').
[ONE] # ccs -f /etc/cluster/cluster.conf --createcluster pacemaker1
[ONE] # ccs -f /etc/cluster/cluster.conf --addnode node1
[ONE] # ccs -f /etc/cluster/cluster.conf --addnode node2
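The 15-character limit on the cluster name is easy to trip over, so it can be worth checking a candidate name before handing it to ccs. The following is a minimal sketch; the helper name valid_cluster_name is hypothetical and not part of ccs or CMAN.

```shell
# Hypothetical helper: succeed only if the name is between 1 and 15
# characters long, the limit stated above for CMAN cluster names.
valid_cluster_name() {
    name=$1
    [ -n "$name" ] && [ "${#name}" -le 15 ]
}

# Check the name used in this guide before creating the cluster.
if valid_cluster_name "pacemaker1"; then
    echo "pacemaker1: ok"
else
    echo "pacemaker1: name must be 1-15 characters" >&2
fi
```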
Next we need to teach CMAN how to send its fencing requests to Pacemaker. We do this regardless of whether or not fencing is enabled within Pacemaker.
[ONE] # ccs -f /etc/cluster/cluster.conf --addfencedev pcmk agent=fence_pcmk
[ONE] # ccs -f /etc/cluster/cluster.conf --addmethod pcmk-redirect node1
[ONE] # ccs -f /etc/cluster/cluster.conf --addmethod pcmk-redirect node2
[ONE] # ccs -f /etc/cluster/cluster.conf --addfenceinst pcmk node1 pcmk-redirect port=node1
[ONE] # ccs -f /etc/cluster/cluster.conf --addfenceinst pcmk node2 pcmk-redirect port=node2
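The fencing redirect follows the same pattern for every node: one shared fence device, then a method and a fence instance per node. For clusters with more than two nodes, a small generator can save typing. This is only a sketch: the function name fence_redirect_cmds is hypothetical, and it prints the commands shown above rather than running them, so the output can be reviewed first.

```shell
# Print (not run) the ccs commands that redirect CMAN fencing requests to
# Pacemaker for every node name passed as an argument. The command forms
# are copied from the steps above; only the node names vary.
fence_redirect_cmds() {
    conf=/etc/cluster/cluster.conf
    echo "ccs -f $conf --addfencedev pcmk agent=fence_pcmk"
    for node in "$@"; do
        echo "ccs -f $conf --addmethod pcmk-redirect $node"
    done
    for node in "$@"; do
        echo "ccs -f $conf --addfenceinst pcmk $node pcmk-redirect port=$node"
    done
}

fence_redirect_cmds node1 node2
```

Once the printed commands look right, they could be run on one node by piping the output to sh.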
Now copy /etc/cluster/cluster.conf to all the other nodes that will be part of the cluster.
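One way to copy the file is scp. The sketch below assumes root SSH access between the nodes, which this guide does not set up, and the function name copy_conf_cmds is hypothetical. It prints one scp command per remaining node instead of running it.

```shell
# Print (not run) an scp command for each node that still needs the file.
# Assumes root can log in to the other nodes over SSH.
copy_conf_cmds() {
    conf=/etc/cluster/cluster.conf
    for node in "$@"; do
        echo "scp $conf root@$node:$conf"
    done
}

# In the two-node example, node1 holds the file and node2 needs a copy.
copy_conf_cmds node2
```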
Start the Cluster
CMAN was originally written for rgmanager and assumes the cluster should not start until the node has quorum, so before we try to start the cluster, we need to disable this behavior:
[ALL] # echo "CMAN_QUORUM_TIMEOUT=0" >> /etc/sysconfig/cman
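Because the command above appends with >>, running it twice leaves two copies of the setting in the file. A guarded append avoids that. The sketch below uses a hypothetical helper, ensure_line, and demonstrates it on a temporary file rather than on /etc/sysconfig/cman.

```shell
# Append a line to a file only if that exact line is not already present.
# grep -x matches whole lines and -F treats the pattern as a fixed string.
ensure_line() {
    line=$1
    file=$2
    touch "$file"
    grep -qxF -- "$line" "$file" || echo "$line" >> "$file"
}

# Demonstration on a scratch file: the second call changes nothing.
demo=$(mktemp)
ensure_line "CMAN_QUORUM_TIMEOUT=0" "$demo"
ensure_line "CMAN_QUORUM_TIMEOUT=0" "$demo"
cat "$demo"
rm -f "$demo"
```

On a real node the second argument would be /etc/sysconfig/cman.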
Now, on each machine, run:
[ALL] # service cman start
[ALL] # service pacemaker start
A note for users of prior RHEL versions
The original cluster shell (crmsh) is no longer available on RHEL. To help people make the transition there is a quick reference guide for those wanting to know what the pcs equivalent is for various crmsh commands.
Set Cluster Options
With so many devices and possible topologies, it is nearly impossible to include Fencing in a document like this. For now we will disable it.
[ONE] # pcs property set stonith-enabled=false
One of the most common ways to deploy Pacemaker is in a 2-node configuration. However, quorum as a concept makes no sense in this scenario (because you only have it when more than half the nodes are available), so we'll disable it too.
[ONE] # pcs property set no-quorum-policy=ignore
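The arithmetic behind the quorum problem can be checked directly. The sketch below assumes each node contributes exactly one vote; the function name quorum_votes is hypothetical. It shows that a 2-node cluster needs both nodes for quorum and therefore tolerates no failures.

```shell
# Votes needed for quorum: strictly more than half of the nodes,
# assuming one vote per node.
quorum_votes() {
    echo $(( $1 / 2 + 1 ))
}

for n in 2 3 5; do
    need=$(quorum_votes "$n")
    echo "$n nodes: quorum needs $need, tolerates $(( n - need )) failure(s)"
done
```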
For demonstration purposes, we will force the cluster to move services after a single failure:
[ONE] # pcs resource rsc defaults migration-threshold=1
Add a Resource
Let's add a cluster service. To keep things easy, we'll choose one that doesn't require any configuration and works everywhere. Here's the command:
[ONE] # pcs resource create my_first_svc ocf:pacemaker:Dummy op monitor interval=120s
"my_first_svc" is the name the service will be known as.
"ocf:pacemaker:Dummy" tells Pacemaker which script to use (Dummy - an agent that's useful as a template and for guides like this one), which namespace it is in (pacemaker) and what standard it conforms to (OCF).
"op monitor interval=120s" tells Pacemaker to check the health of this service every 2 minutes by calling the agent's monitor action.
You should now be able to see the service running using either of the following commands:
[ONE] # pcs status
[ONE] # crm_mon -1
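The agent specification passed to pcs resource create, ocf:pacemaker:Dummy, packs three colon-separated fields into one string: standard, namespace and agent. As a small illustration, it can be taken apart with shell parameter expansion. The function name split_agent is hypothetical and only meant to make the three fields visible.

```shell
# Split a specification of the form standard:namespace:agent into its
# three fields using POSIX parameter expansion.
split_agent() {
    spec=$1
    standard=${spec%%:*}      # text before the first colon
    rest=${spec#*:}           # text after the first colon
    namespace=${rest%%:*}     # middle field
    agent=${rest#*:}          # text after the second colon
    echo "standard=$standard namespace=$namespace agent=$agent"
}

split_agent "ocf:pacemaker:Dummy"
```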
Simulate a Service Failure
We can simulate an error by telling the service to stop directly (without telling the cluster):
[ONE] # crm_resource --resource my_first_svc --force-stop
If you now run crm_mon in interactive mode (the default), you should see (within the monitor interval - 2 minutes) the cluster notice that my_first_svc failed and move it to another node.
Next Steps
- Configure Fencing
- Add more services - see Clusters from Scratch for examples of how to add IP address, Apache and DRBD to a cluster
- Learn how to make services prefer a specific host
- Learn how to make services run on the same host
- Learn how to make services start and stop in a specific order
- Find out what else Pacemaker can do - see Pacemaker Explained for a comprehensive list of concepts and options