Pacemaker

From ClusterLabs

Pacemaker is a cluster resource manager.

It achieves maximum availability for your cluster services (a.k.a. resources) by detecting and recovering from node- and resource-level failures, using the messaging and membership capabilities provided by your preferred cluster infrastructure (either OpenAIS or Heartbeat).

It can do this for clusters of practically any size and comes with a powerful dependency model that allows the administrator to accurately express the relationships (both ordering and location) between the cluster resources.

Virtually anything that can be scripted can be managed as part of a Pacemaker cluster.

It is also worth reiterating that Pacemaker is NOT a fork of Heartbeat, although this is a common misconception. Pacemaker is a continuation of the CRM (a.k.a. the v2 resource manager), which was originally developed for Heartbeat but has since become its own project. See the project history for more information.

Features

Availability

Please let us know which distribution you use for Pacemaker by filling out our usage poll.

Installation Channels

Vendor Packages

Pacemaker currently ships with Fedora (since 12), Red Hat Enterprise Linux (since 6.0 beta1), openSUSE (since 11.0), Debian (since "Squeeze"), Ubuntu LTS (since 10.04 "Lucid Lynx") and as a key component of the High Availability Extension for SUSE Linux Enterprise Server 11 (available free of charge to existing SLES10 customers).

Upstream Packages

Binary packages are also available for current versions of RHEL and Fedora.

See our Install page for more information.

Building from Source

Pacemaker and its dependencies can also be easily compiled from source for all Linux-based distributions and most BSD ones (including Mac OS X).

See our Install page for more information.

Current Releases

Supported Branches

Version  Current Release  First Released  This Release  Next Release
1.1      1.1.9            Jan 15, 2010    Mar 8, 2013   July 2013
1.0      1.0.13           Oct 9, 2008     Feb 13, 2013  As needed

Deprecated Branches

Version  Last Release  First Released  Last Released
0.7      0.7.3         June 25, 2008   Sep 22, 2008
0.6      0.6.7         Jan 16, 2008    Dec 15, 2008

See Also: Releases

Example Configurations

Common node configurations that are possible with Pacemaker:

  • Two-node Active/Passive clusters using Pacemaker and DRBD are a cost-effective solution for many High Availability situations.
  • By supporting many nodes, Pacemaker can dramatically reduce hardware costs by allowing several active/passive clusters to be combined and share a common backup node.
  • When shared storage is available, every node can potentially be used for failover. Pacemaker can even run multiple copies of services to spread out the workload.
  • Pacemaker 1.2 will include enhancements to simplify the creation of split-site clusters.

Architecture

Supported Cluster Stacks

Cluster stack based on OpenAIS
Legacy cluster stack based on Heartbeat

Internals

Cluster stack

Cluster Components

Component Description
stonithd The Heartbeat fencing subsystem.
lrmd Short for Local Resource Management Daemon. Non-cluster aware daemon that presents a common interface to the supported resource types. Interacts directly with resource agents (scripts).
pengine Short for Policy Engine. Computes the next state of the cluster based on the current state and the configuration. Produces a transition graph containing a list of actions and dependencies.
cib Short for Cluster Information Base. Contains definitions of all cluster options, nodes, resources, their relationships to one another and current status. Synchronizes updates to all cluster nodes.
crmd Short for Cluster Resource Management Daemon. Largely a message broker for the PEngine and LRM, it also elects a leader to co-ordinate the activities (including starting/stopping resources) of the cluster.
openais The OpenAIS messaging and membership layer.
heartbeat The Heartbeat messaging layer, an alternative to OpenAIS.
ccm Short for Consensus Cluster Membership. The Heartbeat membership layer.

Functional Overview

The CIB uses XML to represent both the cluster’s configuration and current state of all resources in the cluster. The contents of the CIB are automatically kept in sync across the entire cluster and are used by the PEngine to compute the ideal state of the cluster and how it should be achieved.
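To make the split between configuration and status more concrete, the following Python sketch parses a heavily simplified CIB-like document. The overall layout (a configuration section holding nodes and resources, and a separate status section) follows the general shape of the CIB, but the node names, resource ids and the summarize_cib helper are made up for illustration, and a real CIB carries far more detail than this.

```python
import xml.etree.ElementTree as ET

# Simplified, hypothetical CIB-like document. Node names and resource ids
# are invented; a real CIB contains many more elements and attributes.
CIB_XML = """
<cib>
  <configuration>
    <nodes>
      <node id="1" uname="node-a"/>
      <node id="2" uname="node-b"/>
    </nodes>
    <resources>
      <primitive id="web-ip" class="ocf" provider="heartbeat" type="IPaddr2"/>
      <primitive id="web-server" class="ocf" provider="heartbeat" type="apache"/>
    </resources>
  </configuration>
  <status>
    <node_state uname="node-a"/>
    <node_state uname="node-b"/>
  </status>
</cib>
"""


def summarize_cib(xml_text):
    """Return the configured nodes and resources, plus the nodes that
    currently have an entry in the status section."""
    root = ET.fromstring(xml_text)
    nodes = [n.get("uname") for n in root.findall("./configuration/nodes/node")]
    resources = [r.get("id") for r in root.findall("./configuration/resources/primitive")]
    reporting = [s.get("uname") for s in root.findall("./status/node_state")]
    return {"nodes": nodes, "resources": resources, "status_reported_by": reporting}


summary = summarize_cib(CIB_XML)
print(summary)
```

Keeping configuration and status in one document is what lets a single consumer, such as the PEngine, read both the desired and the observed state from the same place.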

This list of instructions is then fed to the DC (Designated Co-ordinator). Pacemaker centralizes all cluster decision making by electing one of the CRMd instances to act as a master. Should the elected CRMd process, or the node it is on, fail, a new one is quickly established.

The DC carries out the PEngine’s instructions in the required order by passing them either to the local LRMd (Local Resource Management daemon) or, via the cluster messaging infrastructure, to CRMd peers on other nodes (which in turn pass them on to their own LRMd process).

The peer nodes all report the results of their operations back to the DC. Based on the expected and actual results, the DC will either execute any actions that needed to wait for the previous one to complete, or abort processing and ask the PEngine to recalculate the ideal cluster state based on the unexpected results.
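The execute-or-abort behaviour described above can be sketched as a small Python model. This is only an illustration of the idea, not Pacemaker's implementation: the run_transition function, the action names and the use of the standard library's graphlib module are all assumptions made for the example.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def run_transition(graph, execute):
    """Toy model of walking a transition graph.

    graph maps each action to the set of actions it must wait for.
    execute(action) returns True when the result matched expectations.
    Returns (completed_actions, failed_action); failed_action is None
    when every action succeeded.
    """
    sorter = TopologicalSorter(graph)
    sorter.prepare()
    completed = []
    while sorter.is_active():
        for action in sorter.get_ready():
            if not execute(action):
                # Unexpected result: stop here so that a new graph can
                # be computed from the updated cluster state.
                return completed, action
            completed.append(action)
            sorter.done(action)  # unblocks actions waiting on this one
    return completed, None


# Hypothetical failover of two resources from node-a to node-b.
GRAPH = {
    "stop web-ip on node-a": {"stop web-server on node-a"},
    "start web-ip on node-b": {"stop web-ip on node-a"},
    "start web-server on node-b": {"start web-ip on node-b"},
}

# Every action succeeds: the whole graph is executed in order.
completed, failed = run_transition(GRAPH, lambda action: True)
print(completed, failed)

# One action fails: processing is aborted at that point.
partial, aborted_at = run_transition(
    GRAPH, lambda action: action != "start web-ip on node-b"
)
print(partial, aborted_at)
```

In the second run the two stop actions complete, the first start fails, and the remaining action is never attempted, which mirrors the point at which the DC would ask the PEngine for a recalculated plan.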

In some cases, it may be necessary to power off nodes in order to protect shared data or complete resource recovery. For this, Pacemaker comes with STONITHd. STONITH is an acronym for Shoot-The-Other-Node-In-The-Head and is usually implemented with a remote power switch. In Pacemaker, STONITH devices are modeled as resources (and configured in the CIB) so that they can be easily monitored for failure. STONITHd, however, takes care of understanding the STONITH topology, so its clients simply request that a node be fenced and it does the rest.

Pacemaker Composition

Project History

Pacemaker came to life in late 2003 when Lars convinced SUSE to hire me to implement a new cluster resource manager for Heartbeat. Although simple to configure, the old version 1 cluster manager had four key deficiencies:

  • Maximum of 2 nodes
  • Highly coupled design and implementation
  • Overly simplistic group-based resource model
  • Inability to detect and recover from resource-level failures

Then, a year and a half later, on Saturday, July 30, 2005, Heartbeat 2.0.0 was released containing the first public version of the CRM.

After many successful releases, the decision was made at the end of 2007 to spin off the CRM into its own project after the 2.1.3 Heartbeat release in order to:

  • support both the OpenAIS and Heartbeat cluster stacks equally
  • decouple the release cycles of two projects at very different stages of their life-cycles
  • foster clearer package boundaries, thus leading to better and more stable interfaces

This transition was completed on January 16, 2008 with the 0.6.0 release of Pacemaker, which was the first to support both cluster stacks. The (feature-frozen) 0.6 stable series was derived from, and fully compatible with, the 2.1.3 CRM. It received bug-fix-only updates throughout 2008 and 2009 before being deprecated in March 2010.

The current Pacemaker stable series is 1.0 and contains many improvements over prior releases, including:

  • A more intuitive syntax
  • Failure (migration) thresholds and timeouts
  • A tool for making offline configuration changes
  • A unified command line configuration tool that hides the underlying XML
  • Rules, instance_attributes, meta_attributes and sets of operations can be defined once and referenced in multiple places
  • The ability to connect to the CIB from non-cluster machines
  • The ability to trigger recurring actions at known times
  • A more powerful RelaxNG-based configuration schema