Pacemaker 1.1

Configuration Explained

An A-Z guide to Pacemaker's Configuration Options

Edition 10

Andrew Beekhof

Primary author 
Red Hat

Dan Frîncu

Romanian translation 

Philipp Marek

Style and formatting updates. Indexing. 
LINBit

Tanja Roth

Utilization chapter; Resource Templates chapter; Multi-Site Clusters chapter
SUSE

Lars Marowsky-Bree

Multi-Site Clusters chapter 
SUSE

Yan Gao

Utilization chapter; Resource Templates chapter; Multi-Site Clusters chapter
SUSE

Thomas Schraitle

Utilization chapter; Resource Templates chapter; Multi-Site Clusters chapter
SUSE

Dejan Muhamedagic

Resource Templates chapter 
SUSE

Legal Notice

Copyright © 2009-2017 Andrew Beekhof.
The text of and illustrations in this document are licensed under version 4.0 or later of the Creative Commons Attribution-ShareAlike International Public License ("CC-BY-SA")[1].
In accordance with CC-BY-SA, if you distribute this document or an adaptation of it, you must provide the URL for the original version.
In addition to the requirements of this license, the following activities are looked upon favorably:
  1. If you are distributing Open Publication works on hardcopy or CD-ROM, you provide email notification to the authors of your intent to redistribute at least thirty days before your manuscript or media freeze, to give the authors time to provide updated documents. This notification should describe modifications, if any, made to the document.
  2. All substantive modifications (including deletions) be either clearly marked up in the document or else described in an attachment to the document.
  3. Finally, while it is not mandatory under this license, it is considered good form to offer a free copy of any hardcopy or CD-ROM expression of the author(s) work.

Abstract

The purpose of this document is to definitively explain the concepts used to configure Pacemaker. To achieve this, it will focus exclusively on the XML syntax used to configure Pacemaker's Cluster Information Base (CIB).

Table of Contents

Preface
1. Document Conventions
1.1. Typographic Conventions
1.2. Pull-quote Conventions
1.3. Notes and Warnings
2. We Need Feedback!
1. Read-Me-First
1.1. The Scope of this Document
1.2. What Is Pacemaker?
1.3. Pacemaker Architecture
1.3.1. Internal Components
1.4. Types of Pacemaker Clusters
2. Configuration Basics
2.1. Configuration Layout
2.2. The Current State of the Cluster
2.3. How Should the Configuration be Updated?
2.3.1. Editing the CIB Using XML
2.3.2. Quickly Deleting Part of the Configuration
2.3.3. Updating the Configuration Without Using XML
2.4. Making Configuration Changes in a Sandbox
2.5. Testing Your Configuration Changes
2.5.1. Small Cluster Transition
2.5.2. Complex Cluster Transition
2.6. Do I Need to Update the Configuration on All Cluster Nodes?
3. Cluster-Wide Configuration
3.1. CIB Properties
3.1.1. Working with CIB Properties
3.2. Cluster Options
3.2.1. Querying and Setting Cluster Options
3.2.2. When Options are Listed More Than Once
4. Cluster Nodes
4.1. Defining a Cluster Node
4.2. Where Pacemaker Gets the Node Name
4.3. Node Attributes
4.4. Managing Nodes in a Corosync-Based Cluster
4.4.1. Adding a New Corosync Node
4.4.2. Removing a Corosync Node
4.4.3. Replacing a Corosync Node
4.5. Managing Nodes in a Heartbeat-based Cluster
4.5.1. Adding a New Heartbeat Node
4.5.2. Removing a Heartbeat Node
4.5.3. Replacing a Heartbeat Node
5. Cluster Resources
5.1. What is a Cluster Resource?
5.2. Resource Classes
5.2.1. Open Cluster Framework
5.2.2. Linux Standard Base
5.2.3. Systemd
5.2.4. Upstart
5.2.5. System Services
5.2.6. STONITH
5.2.7. Nagios Plugins
5.3. Resource Properties
5.4. Resource Options
5.4.1. Resource Meta-Attributes
5.4.2. Setting Global Defaults for Resource Meta-Attributes
5.4.3. Resource Instance Attributes
5.5. Resource Operations
5.5.1. Monitoring Resources for Failure
5.5.2. Monitoring Resources When Administration is Disabled
5.5.3. Setting Global Defaults for Operations
5.5.4. When Implicit Operations Take a Long Time
5.5.5. Multiple Monitor Operations
5.5.6. Disabling a Monitor Operation
6. Resource Constraints
6.1. Scores
6.1.1. Infinity Math
6.2. Deciding Which Nodes a Resource Can Run On
6.2.1. Location Properties
6.2.2. Asymmetrical "Opt-In" Clusters
6.2.3. Symmetrical "Opt-Out" Clusters
6.2.4. What if Two Nodes Have the Same Score
6.3. Specifying the Order in which Resources Should Start/Stop
6.3.1. Ordering Properties
6.3.2. Optional and mandatory ordering
6.4. Placing Resources Relative to other Resources
6.4.1. Colocation Properties
6.4.2. Mandatory Placement
6.4.3. Advisory Placement
6.4.4. Colocation by Node Attribute
6.5. Resource Sets
6.6. Ordering Sets of Resources
6.6.1. Ordered Set
6.6.2. Ordering Multiple Sets
6.6.3. Resource Set OR Logic
6.7. Colocating Sets of Resources
7. Alerts
7.1. Alert Agents
7.2. Alert Recipients
7.3. Alert Meta-Attributes
7.4. Alert Instance Attributes
7.5. Alert Filters
7.6. Using the Sample Alert Agents
7.7. Writing an Alert Agent
8. Rules
8.1. Rule Properties
8.2. Node Attribute Expressions
8.3. Time- and Date-Based Expressions
8.3.1. Date Specifications
8.3.2. Durations
8.3.3. Sample Time-Based Expressions
8.4. Using Rules to Determine Resource Location
8.4.1. Location Rules Based on Other Node Properties
8.4.2. Using score-attribute Instead of score
8.5. Using Rules to Control Resource Options
8.6. Using Rules to Control Cluster Options
8.7. Ensuring Time-Based Rules Take Effect
9. Advanced Configuration
9.1. Connecting from a Remote Machine
9.2. Specifying When Recurring Actions are Performed
9.3. Handling Resource Failure
9.3.1. Failure Counts
9.3.2. Failure Response
9.4. Moving Resources
9.4.1. Moving Resources Manually
9.4.2. Moving Resources Due to Connectivity Changes
9.4.3. Migrating Resources
9.5. Tracking Node Health
9.5.1. Node Health Attributes
9.5.2. Node Health Strategy
9.5.3. Measuring Node Health
9.6. Reloading Services After a Definition Change
10. Advanced Resource Types
10.1. Groups - A Syntactic Shortcut
10.1.1. Group Properties
10.1.2. Group Options
10.1.3. Group Instance Attributes
10.1.4. Group Contents
10.1.5. Group Constraints
10.1.6. Group Stickiness
10.2. Clones - Resources That Get Active on Multiple Hosts
10.2.1. Clone Properties
10.2.2. Clone Options
10.2.3. Clone Instance Attributes
10.2.4. Clone Contents
10.2.5. Clone Constraints
10.2.6. Clone Stickiness
10.2.7. Clone Resource Agent Requirements
10.3. Multi-state - Resources That Have Multiple Modes
10.3.1. Multi-state Properties
10.3.2. Multi-state Options
10.3.3. Multi-state Instance Attributes
10.3.4. Multi-state Contents
10.3.5. Monitoring Multi-State Resources
10.3.6. Multi-state Constraints
10.3.7. Multi-state Stickiness
10.3.8. Which Resource Instance is Promoted
10.3.9. Requirements for Multi-state Resource Agents
10.4. Bundles - Isolated Environments
10.4.1. Bundle Properties
10.4.2. Docker Properties
10.4.3. rkt Properties
10.4.4. Bundle Network Properties
10.4.5. Bundle Storage Properties
10.4.6. Bundle Primitive
10.4.7. Bundle Node Attributes
10.4.8. Bundle Meta-Attributes
10.4.9. Limitations of Bundles
11. Reusing Parts of the Configuration
11.1. Reusing Resource Definitions
11.1.1. Configuring Resources with Templates
11.1.2. Using Templates in Constraints
11.1.3. Using Templates in Resource Sets
11.2. Reusing Rules, Options and Sets of Operations
11.3. Tagging Configuration Elements
11.3.1. Configuring Tags
11.3.2. Using Tags in Constraints and Resource Sets
12. Utilization and Placement Strategy
12.1. Utilization attributes
12.2. Placement Strategy
12.3. Allocation Details
12.3.1. Which node is preferred to get consumed first when allocating resources?
12.3.2. Which node has more free capacity?
12.3.3. Which resource is preferred to be assigned first?
12.4. Limitations and Workarounds
13. STONITH
13.1. What Is STONITH?
13.2. What STONITH Device Should You Use?
13.3. Special Treatment of STONITH Resources
13.4. Unfencing
13.5. Configuring STONITH
13.5.1. Example STONITH Configuration
13.6. Advanced STONITH Configurations
13.6.1. Example Dual-Layer, Dual-Device Fencing Topologies
13.7. Remapping Reboots
14. Status — Here be dragons
14.1. Node Status
14.2. Transient Node Attributes
14.3. Operation History
14.3.1. Simple Operation History Example
14.3.2. Complex Operation History Example
15. Multi-Site Clusters and Tickets
15.1. Challenges for Multi-Site Clusters
15.2. Conceptual Overview
15.2.1. Ticket
15.2.2. Dead Man Dependency
15.2.3. Cluster Ticket Registry
15.2.4. Configuration Replication
15.3. Configuring Ticket Dependencies
15.4. Managing Multi-Site Clusters
15.4.1. Granting and Revoking Tickets Manually
15.4.2. Granting and Revoking Tickets via a Cluster Ticket Registry
15.4.3. General Management of Tickets
15.5. For more information
A. FAQ
B. More About OCF Resource Agents
B.1. Location of Custom Scripts
B.2. Actions
B.3. How are OCF Return Codes Interpreted?
B.4. OCF Return Codes
C. Installing
C.1. Installing the Software
C.2. Enabling Pacemaker
C.2.1. Enabling Pacemaker For Corosync 2.x
C.2.2. Enabling Pacemaker For Corosync 1.x
C.2.3. Enabling Pacemaker For Heartbeat
D. Upgrading
D.1. Upgrading Cluster Software
D.1.1. Complete Cluster Shutdown
D.1.2. Rolling (node by node)
D.1.3. Detach and Reattach
D.2. Upgrading the Configuration
D.3. What Changed in 1.0
D.3.1. New
D.3.2. Changed
D.3.3. Removed
E. Init Script LSB Compliance
F. Sample Configurations
F.1. Empty
F.2. Simple
F.3. Advanced Configuration
G. Further Reading
H. Revision History
Index

List of Figures

1.1. The Pacemaker Stack
1.2. Internal Components
1.3. Active/Passive Redundancy
1.4. Shared Failover
1.5. N to N Redundancy
6.1. Visual representation of the four resources' start order for the above constraints
6.2. Visual representation of the start order for two ordered sets of unordered resources
6.3. Visual representation of the start order for the three sets defined above
6.4. Visual representation of the above example (resources to the left are placed first)

List of Tables

3.1. CIB Properties
3.2. Cluster Options
5.1. Properties of a Primitive Resource
5.2. Meta-attributes of a Primitive Resource
5.3. Properties of an Operation
6.1. Properties of a rsc_location Constraint
6.2. Properties of a rsc_order Constraint
6.3. Properties of a rsc_colocation Constraint
6.4. Properties of a resource_set
7.1. Meta-Attributes of an Alert
7.2. Environment variables passed to alert agents
8.1. Properties of a Rule
8.2. Properties of an Expression
8.3. Built-in node attributes
8.4. Properties of a Date Expression
8.5. Properties of a Date Specification
9.1. Environment Variables Used to Connect to Remote Instances of the CIB
9.2. Extra top-level CIB properties for remote access
9.3. Common Options for a ping Resource
9.4. Allowed Values for Node Health Attributes
9.5. Node Health Strategies
10.1. Properties of a Group Resource
10.2. Properties of a Clone Resource
10.3. Clone-specific configuration options
10.4. Environment variables supplied with Clone notify actions
10.5. Properties of a Multi-State Resource
10.6. Multi-state-specific resource configuration options
10.7. Additional colocation constraint options for multi-state resources
10.8. Additional colocation set options relevant to multi-state resources
10.9. Additional ordered set options relevant to multi-state resources
10.10. Role implications of OCF return codes
10.11. Environment variables supplied with multi-state notify actions
10.12. Properties of a Bundle
10.13. Properties of a Bundle’s Docker Element
10.14. Properties of a Bundle’s rkt Element
10.15. Properties of a Bundle’s Network Element
10.16. Properties of a Bundle’s Port-Mapping Element
10.17. Properties of a Bundle’s Storage-Mapping Element
13.1. Additional Properties of Fencing Resources
13.2. Properties of Fencing Levels
14.1. Authoritative Sources for State Information
14.2. Node Status Fields
14.3. Contents of an lrm_rsc_op job
B.1. Required Actions for OCF Agents
B.2. Optional Actions for OCF Resource Agents
B.3. Types of recovery performed by the cluster
B.4. OCF Return Codes and their Recovery Types
D.1. Upgrade Methods
D.2. Version Compatibility Table

List of Examples

2.1. An empty configuration
2.2. Sample output from crm_mon
2.3. Sample output from crm_mon -n
2.4. Safely using an editor to modify the cluster configuration
2.5. Safely using an editor to modify only the resources section
2.6. Searching for STONITH-related configuration items
2.7. Creating and displaying the active sandbox
2.8. Use sandbox to make multiple changes all at once, discard them, and verify real configuration is untouched
3.1. Attributes set for a cib object
3.2. Deleting an option that is listed twice
4.1. Example Heartbeat cluster node entry
4.2. Example Corosync cluster node entry
4.3. Result of using crm_attribute to specify which kernel pcmk-1 is running
5.1. A system resource definition
5.2. An OCF resource definition
5.3. An LSB resource with cluster options
5.4. An example OCF resource with instance attributes
5.5. Displaying the metadata for the Dummy resource agent template
5.6. An OCF resource with a recurring health check
5.7. An OCF resource with custom timeouts for its implicit actions
5.8. An OCF resource with two recurring health checks, performing different levels of checks specified via OCF_CHECK_LEVEL.
5.9. Example of an OCF resource with a disabled health check
6.1. Opt-in location constraints for two resources
6.2. Opt-out location constraints for two resources
6.3. Constraints where a resource prefers two nodes equally
6.4. Optional and mandatory ordering constraints
6.5. Mandatory colocation constraint for two resources
6.6. Mandatory anti-colocation constraint for two resources
6.7. Advisory colocation constraint for two resources
6.8. A set of 3 resources
6.9. A chain of ordered resources
6.10. A chain of ordered resources expressed as a set
6.11. Ordered sets of unordered resources
6.12. Advanced use of set ordering - Three ordered sets, two of which are internally unordered
6.13. Resource Set "OR" logic: Three ordered sets, where the first set is internally unordered with "OR" logic
6.14. Chain of colocated resources
6.15. Equivalent colocation chain expressed using resource_set
6.16. Using colocated sets to specify a common peer
6.17. Colocation chain in which the members of the middle set have no interdependencies, and the last listed set (which the cluster places first) is restricted to instances in master status.
7.1. Simple alert configuration
7.2. Alert configuration with recipient
7.3. Alert configuration with meta-attributes
7.4. Alert configuration with instance attributes
7.5. Alert configuration to receive only node events and fencing events
7.6. Alert configuration to be called when certain node attributes change
7.7. Sending cluster events as SNMP traps
7.8. Sending cluster events as e-mails
8.1. True if now is any time in the year 2005
8.2. Equivalent expression
8.3. 9am-5pm Monday-Friday
8.4. 9am-6pm Monday through Friday or anytime Saturday
8.5. 9am-5pm or 9pm-12am Monday through Friday
8.6. Mondays in March 2005
8.7. A full moon on Friday the 13th
8.8. Prevent myApacheRsc from running on c001n03
8.9. Prevent myApacheRsc from running on c001n03 - expanded version
8.10. A sample nodes section for use with score-attribute
8.11. Defining different resource options based on the node name
8.12. Change resource-stickiness during working hours
9.1. Specifying a Base for Recurring Action Intervals
9.2. An example ping cluster resource that checks node connectivity once every minute
9.3. Don’t run a resource on unconnected nodes
9.4. Run only on nodes connected to three or more ping targets.
9.5. Prefer the node with the most connected ping nodes
9.6. How the cluster translates the above location constraint
9.7. A more complex example of choosing a location based on connectivity
9.8. The DRBD agent’s logic for supporting reload
9.9. The DRBD Agent Advertising Support for the reload Operation
9.10. Parameter that can be changed using reload
10.1. A group of two primitive resources
10.2. How the cluster sees a group resource
10.3. Some constraints involving groups
10.4. A clone of an LSB resource
10.5. Some constraints involving clones
10.6. Notification variables
10.7. Monitoring both states of a multi-state resource
10.8. Constraints involving multi-state resources
10.9. Colocate C and D with A’s and B’s master instances
10.10. Start C and D after first promoting A and B
10.11. Explicitly preferring node1 to be promoted to master
10.12. A bundle for a containerized web server
11.1. Resource template for a migratable Xen virtual machine
11.2. Xen primitive resource using a resource template
11.3. Equivalent Xen primitive resource not using a resource template
11.4. Xen resource overriding template values
11.5. Referencing rules from other constraints
11.6. Referencing attributes, options, and operations from other resources
11.7. Tag referencing three resources
11.8. Constraint using a tag
11.9. Equivalent constraints without tags
12.1. Specifying CPU and RAM capacities of two nodes
12.2. Specifying CPU and RAM consumed by several resources
13.1. Obtaining a list of STONITH Parameters
13.2. An IPMI-based STONITH Resource
13.3. Fencing topology with different devices for different nodes
14.1. A bare-bones status entry for a healthy node cl-virt-1
14.2. A set of transient node attributes for node cl-virt-1
14.3. A record of the apcstonith resource
14.4. A monitor operation (determines current state of the apcstonith resource)
14.5. Resource history of a pingd clone with multiple jobs
15.1. Constraint that fences node if ticketA is revoked
15.2. Constraint that demotes rsc1 if ticketA is revoked
15.3. Ticket constraint for multiple resources
C.1. Corosync 2.x configuration file for two nodes myhost1 and myhost2
C.2. Corosync 2.x configuration file for three nodes myhost1, myhost2 and myhost3
C.3. Corosync 1.x configuration file for a cluster with all nodes on the 192.0.2.0/24 network
C.4. Corosync 1.x configuration fragment to enable Pacemaker plugin
C.5. Heartbeat configuration fragment to enable Pacemaker
F.1. An Empty Configuration
F.2. A simple configuration with two nodes, some cluster options and a resource
F.3. An advanced configuration with groups, clones and STONITH