Pacemaker is a cluster resource manager.
It achieves maximum availability for your cluster services (a.k.a. resources) by detecting and recovering from node- and resource-level failures, using the messaging and membership capabilities provided by your preferred cluster infrastructure (either OpenAIS or Heartbeat).
It can do this for clusters of practically any size and comes with a powerful dependency model that allows the administrator to accurately express the relationships (both ordering and location) between the cluster resources.
Virtually anything that can be scripted can be managed as part of a Pacemaker cluster.
It is also worth reiterating that Pacemaker is NOT a fork of Heartbeat, although this seems to be a common misconception. Pacemaker is a continuation of the CRM (a.k.a. the v2 resource manager) that was originally developed for Heartbeat but has since become its own project. See the project history for more information.
Please let us know which distribution you use for Pacemaker by filling out our usage poll.
Pacemaker currently ships with Fedora (since 12), Red Hat Enterprise Linux (since 6.0 beta1), openSUSE (since 11.0), Debian (since "Squeeze"), Ubuntu LTS (since 10.04 "Lucid Lynx") and as a key component of the High Availability Extension for SUSE Linux Enterprise Server 11 (available free of charge to existing SLES10 customers).
See our Install page for more information.
Building from Source
Pacemaker and its dependencies can also be easily compiled from source on all Linux-based distributions and most BSD-based ones (including Mac OS X).
See SourceInstall for more information.
|Version||Current Release||First Released||This Release||Next Release|
|1.1||1.1.13||15 Jan 2010||24 Jun 2015||TBD|
|Version||Last Release||First Released||Last Released|
|1.0||1.0.13||9 Oct 2008||13 Feb 2013|
|0.7||0.7.3||25 Jun 2008||22 Sep 2008|
|0.6||0.6.7||16 Jan 2008||15 Dec 2008|
See Also: Releases
Common node configurations that Pacemaker supports.
Supported Cluster Stacks
|stonithd||The Heartbeat fencing subsystem.|
|lrmd||Short for Local Resource Management Daemon. Non-cluster aware daemon that presents a common interface to the supported resource types. Interacts directly with resource agents (scripts).|
|pengine||Short for Policy Engine. Computes the next state of the cluster based on the current state and the configuration. Produces a transition graph containing a list of actions and dependencies.|
|cib||Short for Cluster Information Base. Contains definitions of all cluster options, nodes, resources, their relationships to one another and current status. Synchronizes updates to all cluster nodes.|
|crmd||Short for Cluster Resource Management Daemon. Largely a message broker for the PEngine and LRM, it also elects a leader to co-ordinate the activities (including starting/stopping resources) of the cluster.|
|openais||The OpenAIS messaging and membership layer.|
|heartbeat||The Heartbeat messaging layer, an alternative to OpenAIS.|
|ccm||Short for Consensus Cluster Membership. The Heartbeat membership layer.|
The CIB uses XML to represent both the cluster’s configuration and current state of all resources in the cluster. The contents of the CIB are automatically kept in sync across the entire cluster and are used by the PEngine to compute the ideal state of the cluster and how it should be achieved.
The PEngine's resulting list of instructions is then fed to the DC (Designated Co-ordinator). Pacemaker centralizes all cluster decision making by electing one of the CRMd instances to act as a master. Should the elected CRMd process, or the node it is on, fail, a new one is quickly established.
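The election and failover behaviour described above can be pictured with a small Python sketch. The selection rule used here (the live node whose name sorts first) is purely an assumption for illustration, since the criteria the CRMd actually uses are not covered on this page, and `elect_dc` is a hypothetical helper rather than a Pacemaker API.

```python
def elect_dc(live_nodes):
    """Pick one coordinator from the set of live nodes.

    Assumption for illustration only: the node whose name sorts first wins.
    """
    if not live_nodes:
        raise ValueError("no live nodes to elect a DC from")
    return min(live_nodes)


# Hypothetical three-node cluster.
live = {"node1", "node2", "node3"}
dc = elect_dc(live)

# The node hosting the DC fails, so the survivors elect a replacement.
live.discard(dc)
new_dc = elect_dc(live)

print(dc, new_dc)
```

The point of the sketch is only that every surviving node can apply the same deterministic rule to the same membership list and arrive at the same coordinator.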
The DC carries out the PEngine’s instructions in the required order by passing them either to the local LRMd (Local Resource Management daemon) or, via the cluster messaging infrastructure, to CRMd peers on other nodes (which in turn pass them on to their own LRMd processes).
The peer nodes all report the results of their operations back to the DC. Based on the expected and actual results, the DC will either execute any actions that needed to wait for the previous one to complete, or abort processing and ask the PEngine to recalculate the ideal cluster state based on the unexpected results.
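As a rough illustration of the flow just described, the following Python sketch runs a set of actions in dependency order and stops as soon as a result differs from what was expected. It is a toy model, not Pacemaker's implementation: `run_transition`, the action names and the `"ok"` result value are all hypothetical, and the ordering is delegated to the standard library's `graphlib` module (Python 3.9 or later).

```python
from graphlib import TopologicalSorter


def run_transition(graph, execute, expected="ok"):
    """Run actions in dependency order; abort on the first unexpected result.

    graph maps each action to the set of actions that must complete first.
    Returns (completed_actions, aborted_action), where aborted_action is
    None if every action returned the expected result.
    """
    completed = []
    for action in TopologicalSorter(graph).static_order():
        if execute(action) != expected:
            # In the real cluster this is where the policy engine would be
            # asked to recalculate; here we simply stop and report.
            return completed, action
        completed.append(action)
    return completed, None


# Hypothetical transition: IP address first, then filesystem, then database.
graph = {
    "start_ip": set(),
    "start_fs": {"start_ip"},
    "start_db": {"start_fs"},
}

# Every action succeeds.
ok_done, ok_failed = run_transition(graph, lambda action: "ok")

# The filesystem start reports an unexpected result.
results = {"start_fs": "error"}
bad_done, bad_failed = run_transition(graph, lambda a: results.get(a, "ok"))

print(ok_done, ok_failed)
print(bad_done, bad_failed)
```

In the second run only `start_ip` completes; `start_db` is never attempted because the action it depends on did not return the expected result.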
In some cases, it may be necessary to power off nodes in order to protect shared data or complete resource recovery. For this, Pacemaker comes with STONITHd. STONITH is an acronym for Shoot-The-Other-Node-In-The-Head and is usually implemented with a remote power switch. In Pacemaker, STONITH devices are modeled as resources (and configured in the CIB) so that they can be easily monitored for failure. STONITHd, however, takes care of understanding the STONITH topology, such that its clients simply request that a node be fenced and it does the rest.
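Because the CIB is XML and STONITH devices are configured in it like any other resource, ordinary XML tooling can inspect it. The sketch below parses a heavily simplified fragment that is only loosely modeled on the CIB layout; it has not been validated against the real schema, and every id, node name and agent type in it (including `hypothetical-power-switch`) is made up for illustration.

```python
import xml.etree.ElementTree as ET

# Simplified, illustrative fragment; not a complete or validated CIB.
CIB_XML = """
<cib>
  <configuration>
    <nodes>
      <node id="1" uname="node1"/>
      <node id="2" uname="node2"/>
    </nodes>
    <resources>
      <primitive id="cluster-ip" class="ocf" provider="heartbeat" type="IPaddr2"/>
      <primitive id="fence-node2" class="stonith" type="hypothetical-power-switch"/>
    </resources>
    <constraints>
      <rsc_location id="ip-prefers-node1" rsc="cluster-ip" node="node1" score="100"/>
    </constraints>
  </configuration>
  <status/>
</cib>
"""

root = ET.fromstring(CIB_XML.strip())

# Map each resource id to its class.
resources = {p.get("id"): p.get("class") for p in root.iter("primitive")}

# Fencing devices are simply the resources whose class is "stonith".
fencing = sorted(rid for rid, cls in resources.items() if cls == "stonith")

# Location preferences as (resource, node, score) tuples.
locations = [
    (c.get("rsc"), c.get("node"), int(c.get("score")))
    for c in root.iter("rsc_location")
]

print(resources)
print(fencing)
print(locations)
```

The same approach works for any other section of the document, such as listing the nodes or reading the status section.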
Pacemaker came to life in late 2003 when Lars convinced SUSE to hire me to implement a new cluster resource manager for Heartbeat. Although simple to configure, the old version 1 cluster manager had four key deficiencies:
- Maximum of 2-nodes
- Highly coupled design and implementation
- Overly simplistic group-based resource model
- Inability to detect and recover from resource-level failures
Then, a year and a half later, on Saturday, July 30, 2005, Heartbeat 2.0.0 was released containing the first public version of the CRM.
After many successful releases, the decision was made at the end of 2007 to spin off the CRM into its own project after the 2.1.3 Heartbeat release in order to:
- support both the OpenAIS and Heartbeat cluster stacks equally
- decouple the release cycles of two projects at very different stages of their life-cycles
- foster clearer package boundaries, thus leading to better and more stable interfaces
This transition was completed on January 16, 2008 with the 0.6.0 release of Pacemaker, which was the first to support both cluster stacks. The (feature-frozen) 0.6 stable series was derived from, and fully compatible with, the 2.1.3 CRM. It received bug-fix-only updates throughout 2008 and 2009 before being deprecated in March 2010.
The current Pacemaker stable series is 1.0 and contains many improvements over prior releases, including:
- A more intuitive syntax
- Failure (migration) thresholds and timeouts
- A tool for making offline configuration changes
- A unified command line configuration tool that hides the underlying XML
- Rules, instance_attributes, meta_attributes and sets of operations can be defined once and referenced in multiple places
- The ability to connect to the CIB from non-cluster machines
- The ability to trigger recurring actions at known times
- A more powerful RelaxNG-based configuration schema