Cluster options are grouped into sets within the crm_config section, and, in advanced configurations, there may be more than one set. (This will be described later, in Chapter 8, Rules, where we will show how to have the cluster use different sets of options during working hours than during weekends.) For now, we will describe the simple case where each option is present at most once.

You can obtain an up-to-date list of cluster options, including their default values, by running the man pacemaker-schedulerd and man pacemaker-controld commands.
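
To make the structure concrete, the sketch below shows how cluster options appear in the CIB: each option is an nvpair inside a cluster_property_set within the crm_config section. This is an illustrative fragment rather than a complete configuration; the id values (and the conventional cib-bootstrap-options set name) are examples, not requirements.

```xml
<crm_config>
  <!-- A simple configuration has a single property set; advanced
       configurations may define several and select between them with rules -->
  <cluster_property_set id="cib-bootstrap-options">
    <!-- Each cluster option is a single name/value pair -->
    <nvpair id="option-no-quorum-policy" name="no-quorum-policy" value="stop"/>
    <nvpair id="option-cluster-name" name="cluster-name" value="example-cluster"/>
  </cluster_property_set>
</crm_config>
```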
Table 2.2. Cluster Options
Option | Default | Description |
---|---|---|
cluster-name | | An (optional) name for the cluster as a whole. This is mostly for users' convenience for use as desired in administration, but this can be used in the Pacemaker configuration in rules (as the #cluster-name node attribute). It may also be used by higher-level tools when displaying cluster information, and by certain resource agents (for example, the ocf:heartbeat:GFS2 agent stores the cluster name in filesystem meta-data). |
dc-version | | Version of Pacemaker on the cluster's DC. Determined automatically by the cluster. |
cluster-infrastructure | | The messaging stack on which Pacemaker is currently running. Determined automatically by the cluster. |
no-quorum-policy | stop | What to do when the cluster does not have quorum. Allowed values are stop, freeze, ignore, and suicide. |
batch-limit | 0 | The maximum number of actions that the cluster may execute in parallel across all nodes. The ideal value will depend on the speed and load of your network and cluster nodes. A value of 0 means the cluster will impose a dynamically calculated limit only when any node has high load. |
migration-limit | -1 | The number of live migration actions that the cluster is allowed to execute in parallel on a node. A value of -1 means unlimited. |
symmetric-cluster | TRUE | Whether resources can run on any node by default. |
stop-all-resources | FALSE | Whether the cluster should stop all active resources. |
stop-orphan-resources | TRUE | Whether resources that have been deleted from the configuration should be stopped. |
stop-orphan-actions | TRUE | Whether recurring actions that have been deleted from the configuration should be cancelled. |
start-failure-is-fatal | TRUE | Should a failure to start a resource on a particular node prevent further start attempts on that node? If FALSE, the cluster will decide whether the same node is still eligible based on the resource’s current failure count and migration-threshold (see Section 9.2, “Handling Resource Failure”). |
enable-startup-probes | TRUE | Whether the cluster should check for active resources during start-up. |
maintenance-mode | FALSE | Whether the cluster should refrain from monitoring, starting, and stopping resources. |
stonith-enabled | TRUE | Should failed nodes and nodes with resources that can’t be stopped be shot? If you value your data, set up a STONITH device and enable this. If true, or unset, the cluster will refuse to start resources unless one or more STONITH resources have been configured. If false, unresponsive nodes are immediately assumed to be running no resources, and resource takeover to online nodes starts without any further protection (which means data loss if the unresponsive node still accesses shared storage, for example). See also the requires meta-attribute in Section 4.4, “Resource Options”. |
stonith-action | reboot | Action the cluster should send to the fence device when a node must be fenced. Allowed values are reboot and off. |
stonith-timeout | 60s | How long to wait for STONITH (fencing) actions to complete. |
stonith-max-attempts | 10 | How many times fencing can fail for a target before the cluster will no longer immediately re-attempt it. |
stonith-watchdog-timeout | 0 | If nonzero, and the cluster has automatically set have-watchdog to true, watchdog-based self-fencing is performed via SBD when fencing is required, without requiring an explicitly configured fencing resource. If stonith-watchdog-timeout is set to a positive value, unseen nodes are assumed to self-fence within this much time. WARNING: this value must be larger than the SBD_WATCHDOG_TIMEOUT environment variable on all nodes; Pacemaker verifies the settings individually on all nodes and prevents startup (or shuts down a running node) if configured wrongly. It is strongly recommended that SBD_WATCHDOG_TIMEOUT be set to the same value on all nodes. If stonith-watchdog-timeout is set to a negative value, and SBD_WATCHDOG_TIMEOUT is set, twice the SBD_WATCHDOG_TIMEOUT value will be used. WARNING: in this case, it is essential (and currently not verified by Pacemaker) that SBD_WATCHDOG_TIMEOUT is set to the same value on all nodes. |
concurrent-fencing | FALSE | Whether the cluster is allowed to initiate multiple fence actions concurrently. |
fence-reaction | stop | How should a cluster node react if notified of its own fencing? A cluster node may receive notification of its own fencing if fencing is misconfigured, or if fabric fencing is in use that doesn’t cut cluster communication. Allowed values are stop to attempt to immediately stop pacemaker and stay stopped, or panic to attempt to immediately reboot the local node, falling back to stop on failure. The default is likely to be changed to panic in a future release. (since 2.0.3) |
priority-fencing-delay | 0 | Apply the specified delay to fencing actions targeting lost nodes with the highest total resource priority, in case our cluster partition does not hold the majority of nodes. This gives the more significant nodes a better chance of winning any fencing match, which is especially meaningful in a split-brain situation in a two-node cluster. A promoted resource instance counts as its base priority plus 1 when calculating priorities, if the base priority is not 0. Any static or random delays introduced by pcmk_delay_base/max configured for the corresponding fencing resources will be added to this delay. This delay should be significantly greater than (safely, twice) the maximum pcmk_delay_base/max. By default, priority fencing delay is disabled. (since 2.0.4) |
cluster-delay | 60s | Estimated maximum round-trip delay over the network (excluding action execution). If the DC requires an action to be executed on another node, it will consider the action failed if it does not get a response from the other node in this time (after considering the action’s own timeout). The "correct" value will depend on the speed and load of your network and cluster nodes. |
dc-deadtime | 20s | How long to wait for a response from other nodes during start-up. The "correct" value will depend on the speed/load of your network and the type of switches used. |
cluster-ipc-limit | 500 | The maximum IPC message backlog before one cluster daemon will disconnect another. This is of use in large clusters, for which a good value is the number of resources in the cluster multiplied by the number of nodes. The default of 500 is also the minimum. Raise this if you see "Evicting client" messages for cluster daemon PIDs in the logs. |
pe-error-series-max | -1 | The number of scheduler inputs resulting in errors to save. These inputs can be helpful when reporting problems. A value of -1 means unlimited (save all). |
pe-warn-series-max | -1 | The number of scheduler inputs resulting in warnings to save. A value of -1 means unlimited (save all). |
pe-input-series-max | -1 | The number of other (normal) scheduler inputs to save. A value of -1 means unlimited (save all). |
placement-strategy | default | How the cluster should allocate resources to nodes (see Chapter 12, Utilization and Placement Strategy). Allowed values are default, utilization, balanced, and minimal. |
node-health-strategy | none | How the cluster should react to node health attributes (see Section 9.4, “Tracking Node Health”). Allowed values are none, migrate-on-red, only-green, progressive, and custom. |
enable-acl | FALSE | Whether access control lists (ACLs) (see Chapter 13, ACLs) can be used to authorize modifications to the CIB. |
node-health-base | 0 | The base health score assigned to a node. Only used when node-health-strategy is progressive. |
node-health-green | 0 | The score to use for a node health attribute whose value is green. Only used when node-health-strategy is progressive or custom. |
node-health-yellow | 0 | The score to use for a node health attribute whose value is yellow. Only used when node-health-strategy is progressive or custom. |
node-health-red | 0 | The score to use for a node health attribute whose value is red. Only used when node-health-strategy is progressive or custom. |
cluster-recheck-interval | 15min | Pacemaker is primarily event-driven, and looks ahead to know when to recheck the cluster for failure timeouts and most time-based rules. However, it will also recheck the cluster after this amount of inactivity. This has two goals: rules with date_spec are only guaranteed to be checked this often, and it also serves as a fail-safe for certain classes of scheduler bugs. A value of 0 disables this polling; positive values are a time interval. |
shutdown-lock | false | The default of false allows active resources to be recovered elsewhere when their node is cleanly shut down, which is what the vast majority of users will want. However, some users prefer to make resources highly available only for failures, with no recovery for clean shutdowns. If this option is true, resources active on a node when it is cleanly shut down are kept "locked" to that node (not allowed to run elsewhere) until they start again on that node after it rejoins (or for at most shutdown-lock-limit, if set). Stonith resources and Pacemaker Remote connections are never locked. Clone and bundle instances and the master role of promotable clones are currently never locked, though support could be added in a future release. Locks may be manually cleared using the --refresh option of crm_resource (both the resource and node must be specified; this works with remote nodes if their connection resource’s target-role is set to Stopped, but not if Pacemaker Remote is stopped on the remote node without disabling the connection resource). (since 2.0.4) See the example configuration after this table. |
shutdown-lock-limit | 0 | If shutdown-lock is true, and this is set to a nonzero time duration, locked resources will be allowed to start after this much time has passed since the node shutdown was initiated, even if the node has not rejoined. (This works with remote nodes only if their connection resource’s target-role is set to Stopped.) (since 2.0.4) |
remove-after-stop | FALSE | Advanced use only: whether the cluster should remove stopped resources from the executor. Values other than the default are, at best, poorly tested and potentially dangerous. |
startup-fencing | TRUE | Advanced use only: whether the cluster should fence unseen nodes at start-up. Not using the default is very unsafe, because the unseen nodes could still be running resources. |
election-timeout | 2min | Advanced use only: declare an election failed if it is not decided within this much time. If you need to adjust this value, it probably indicates the presence of a bug. |
shutdown-escalation | 20min | Advanced use only: exit immediately if shutdown does not complete within this much time. If you need to adjust this value, it probably indicates the presence of a bug. |
join-integration-timeout | 3min | Advanced use only: if you need to adjust this value, it probably indicates the presence of a bug. |
join-finalization-timeout | 30min | Advanced use only: if you need to adjust this value, it probably indicates the presence of a bug. |
transition-delay | 0s | Advanced use only: delay cluster recovery for the configured interval to allow for additional or related events to occur. This can be useful if your configuration is sensitive to the order in which ping updates arrive. |
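
As a worked example of how rows in the table above translate into configuration, the sketch below combines a few of the options discussed (fencing, watchdog-based self-fencing, and shutdown locking). The values are illustrative assumptions rather than recommendations, and the id attributes are arbitrary.

```xml
<crm_config>
  <cluster_property_set id="cib-bootstrap-options">
    <!-- Keep fencing enabled; a failed fence target is retried up to 10 times -->
    <nvpair id="opt-stonith-enabled" name="stonith-enabled" value="true"/>
    <nvpair id="opt-stonith-max-attempts" name="stonith-max-attempts" value="10"/>
    <!-- Negative value: use twice the SBD_WATCHDOG_TIMEOUT set on the nodes -->
    <nvpair id="opt-stonith-watchdog-timeout" name="stonith-watchdog-timeout" value="-1"/>
    <!-- Keep resources locked to a cleanly shut-down node for at most 30 minutes -->
    <nvpair id="opt-shutdown-lock" name="shutdown-lock" value="true"/>
    <nvpair id="opt-shutdown-lock-limit" name="shutdown-lock-limit" value="30min"/>
  </cluster_property_set>
</crm_config>
```

Because several of these options interact (for example, shutdown-lock with shutdown-lock-limit, and stonith-watchdog-timeout with SBD_WATCHDOG_TIMEOUT), it is worth validating such changes, for example with crm_verify, before relying on them in a live cluster.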