Product SiteDocumentation Site

5.3. Perform a Failover

Since our ultimate goal is high availability, we should test failover of our new resource before moving on.
First, find the node on which the IP address is running.
[root@pcmk-1 ~]# pcs status
Cluster name: mycluster
Stack: corosync
Current DC: pcmk-2 (version 1.1.18-11.el7_5.3-2b07d5c5a9) - partition with quorum
Last updated: Mon Sep 10 16:55:26 2018
Last change: Mon Sep 10 16:53:42 2018 by root via cibadmin on pcmk-1

2 nodes configured
1 resource configured

Online: [ pcmk-1 pcmk-2 ]

Full list of resources:

 ClusterIP      (ocf::heartbeat:IPaddr2):       Started pcmk-1
You can see that the status of the ClusterIP resource is Started on a particular node (in this example, pcmk-1). Shut down Pacemaker and Corosync on that machine to trigger a failover.
[root@pcmk-1 ~]# pcs cluster stop pcmk-1
Stopping Cluster (pacemaker)...
Stopping Cluster (corosync)...

Note

A cluster command such as pcs cluster stop nodename can be run from any node in the cluster, not just the affected node.
Verify that pacemaker and corosync are no longer running:
[root@pcmk-1 ~]# pcs status
Error: cluster is not currently running on this node
Go to the other node, and check the cluster status.
[root@pcmk-2 ~]# pcs status
Cluster name: mycluster
Stack: corosync
Current DC: pcmk-2 (version 1.1.18-11.el7_5.3-2b07d5c5a9) - partition with quorum
Last updated: Mon Sep 10 16:57:22 2018
Last change: Mon Sep 10 16:53:42 2018 by root via cibadmin on pcmk-1

2 nodes configured
1 resource configured

Online: [ pcmk-2 ]
OFFLINE: [ pcmk-1 ]

Full list of resources:

 ClusterIP      (ocf::heartbeat:IPaddr2):       Started pcmk-2

Daemon Status:
  corosync: active/disabled
  pacemaker: active/disabled
  pcsd: active/enabled
Notice that pcmk-1 is OFFLINE for cluster purposes (its pcsd is still active, allowing it to receive pcs commands, but it is not participating in the cluster).
Also notice that ClusterIP is now running on pcmk-2 — failover happened automatically, and no errors are reported.

Quorum

If a cluster splits into two (or more) groups of nodes that can no longer communicate with each other (aka. partitions), quorum is used to prevent resources from starting on more nodes than desired, which would risk data corruption.
A cluster has quorum when more than half of all known nodes are online in the same partition, or for the mathematically inclined, whenever the following equation is true:
total_nodes < 2 * active_nodes
For example, if a 5-node cluster split into 3- and 2-node paritions, the 3-node partition would have quorum and could continue serving resources. If a 6-node cluster split into two 3-node partitions, neither partition would have quorum; pacemaker’s default behavior in such cases is to stop all resources, in order to prevent data corruption.
Two-node clusters are a special case. By the above definition, a two-node cluster would only have quorum when both nodes are running. This would make the creation of a two-node cluster pointless, but corosync has the ability to treat two-node clusters as if only one node is required for quorum.
The pcs cluster setup command will automatically configure two_node: 1 in corosync.conf, so a two-node cluster will "just work".
If you are using a different cluster shell, you will have to configure corosync.conf appropriately yourself.
Now, simulate node recovery by restarting the cluster stack on pcmk-1, and check the cluster’s status. (It may take a little while before the cluster gets going on the node, but it eventually will look like the below.)
[root@pcmk-1 ~]# pcs cluster start pcmk-1
pcmk-1: Starting Cluster...
[root@pcmk-1 ~]# pcs status
Cluster name: mycluster
Stack: corosync
Current DC: pcmk-2 (version 1.1.18-11.el7_5.3-2b07d5c5a9) - partition with quorum
Last updated: Mon Sep 10 17:00:04 2018
Last change: Mon Sep 10 16:53:42 2018 by root via cibadmin on pcmk-1

2 nodes configured
1 resource configured

Online: [ pcmk-1 pcmk-2 ]

Full list of resources:

 ClusterIP      (ocf::heartbeat:IPaddr2):       Started pcmk-2

Daemon Status:
  corosync: active/disabled
  pacemaker: active/disabled
  pcsd: active/enabled