
4.5. Resource Operations

Operations are actions the cluster can perform on a resource by calling the resource agent. Resource agents must support certain common operations such as start, stop, and monitor, and may implement any others.
Operations may be explicitly configured for two purposes: to override defaults for options (such as timeout) that the cluster will use whenever it initiates the operation, and to run an operation on a recurring basis (for example, to monitor the resource for failure).

Example 4.6. An OCF resource with a non-default start timeout

<primitive id="Public-IP" class="ocf" type="IPaddr" provider="heartbeat">
  <operations>
    <op id="Public-IP-start" name="start" timeout="60s"/>
  </operations>
  <instance_attributes id="params-public-ip">
    <nvpair id="public-ip-addr" name="ip" value="192.0.2.2"/>
  </instance_attributes>
</primitive>
Pacemaker identifies operations by a combination of name and interval, so this combination must be unique for each resource. That is, you should not configure two operations for the same resource with the same name and interval.
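As an illustration of the uniqueness rule, the following sketch gives the Public-IP resource from Example 4.6 two recurring monitor operations that share a name but differ in interval. The operation ids, intervals, and timeouts are assumptions chosen for this example only.

```xml
<primitive id="Public-IP" class="ocf" type="IPaddr" provider="heartbeat">
  <operations>
    <!-- Same name, different intervals: each name/interval pair is distinct -->
    <op id="Public-IP-monitor-10s" name="monitor" interval="10s" timeout="20s"/>
    <op id="Public-IP-monitor-60s" name="monitor" interval="60s" timeout="40s"/>
  </operations>
  <instance_attributes id="params-public-ip">
    <nvpair id="public-ip-addr" name="ip" value="192.0.2.2"/>
  </instance_attributes>
</primitive>
```

Adding a third op with name="monitor" and interval="10s" would repeat an existing name/interval pair and should be avoided.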

4.5.1. Operation Properties

Operation properties may be specified directly in the op element as XML attributes, or in a separate meta_attributes block as nvpair elements. XML attributes take precedence over nvpair elements if both are specified.
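A minimal sketch of the two forms side by side, assuming a recurring monitor on the Public-IP resource from Example 4.6. The ids and values are hypothetical; the timeout is deliberately given in both places to show which one applies.

```xml
<op id="Public-IP-monitor" name="monitor" interval="10s" timeout="30s">
  <meta_attributes id="Public-IP-monitor-meta">
    <!-- Also set as an XML attribute above, so the attribute (30s) takes precedence -->
    <nvpair id="Public-IP-monitor-meta-timeout" name="timeout" value="90s"/>
    <!-- Set only here, so this nvpair value is the one used -->
    <nvpair id="Public-IP-monitor-meta-on-fail" name="on-fail" value="restart"/>
  </meta_attributes>
</op>
```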

Table 4.3. Properties of an Operation

Each entry lists the field, its default (where one is given), and its description.

Field: id
Description: A unique name for the operation.

Field: name
Description: The action to perform. This can be any action supported by the agent; common values include monitor, start, and stop.

Field: interval
Default: 0
Description: How frequently (in seconds) to perform the operation. A value of 0 means "when needed". A positive value defines a recurring action, which is typically used with monitor.

Field: timeout
Description: How long to wait before declaring that the action has failed.

Field: on-fail
Default: Varies by action:
  • stop: fence if stonith-enabled is true, or block otherwise
  • demote: the on-fail value of the monitor action with role set to Master, if that action is present, enabled, and configured to a value other than demote; restart otherwise
  • all other actions: restart
Description: The action to take if this action ever fails. Allowed values:
  • ignore: Pretend the resource did not fail.
  • block: Don’t perform any further operations on the resource.
  • stop: Stop the resource and do not start it elsewhere.
  • demote: Demote the resource, without a full restart. This is valid only for promote actions, and for monitor actions with both a nonzero interval and role set to Master; for any other action, a configuration error will be logged, and the default behavior will be used.
  • restart: Stop the resource and start it again (possibly on a different node).
  • fence: STONITH the node on which the resource failed.
  • standby: Move all resources away from the node on which the resource failed.

Field: enabled
Default: TRUE
Description: If false, ignore this operation definition. This is typically used to pause a particular recurring monitor operation; for instance, it can complement the respective resource being unmanaged (is-managed=false), as this alone will not block any configured monitoring. Disabling the operation does not suppress all actions of the given type. Allowed values: true, false.

Field: record-pending
Default: TRUE
Description: If true, the intention to perform the operation is recorded so that GUIs and CLI tools can indicate that an operation is in progress. This is best set as an operation default (see Section 4.5.4, “Setting Global Defaults for Operations”). Allowed values: true, false.

Field: role
Description: Run the operation only on node(s) that the cluster thinks should be in the specified role. This only makes sense for recurring monitor operations. Allowed (case-sensitive) values: Stopped, Started, and in the case of promotable clone resources, Slave and Master.
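
The fragment below sketches how the on-fail and enabled properties from Table 4.3 might be set on individual operations. It assumes the Public-IP resource from Example 4.6; the ids, interval, and timeouts are hypothetical.

```xml
<operations>
  <!-- interval is omitted, so it defaults to 0 ("when needed").
       A failed stop blocks further operations on the resource
       instead of applying the default for stop. -->
  <op id="Public-IP-stop" name="stop" timeout="40s" on-fail="block"/>
  <!-- Recurring monitor whose definition is currently ignored by the cluster -->
  <op id="Public-IP-monitor" name="monitor" interval="30s" timeout="20s" enabled="false"/>
</operations>
```

Setting enabled back to true (or removing the attribute, since the default is TRUE) would resume the recurring monitor.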

Note

When on-fail is set to demote, recovery from failure by a successful demote causes the cluster to recalculate whether and where a new instance should be promoted. The node with the failure is eligible, so if master scores have not changed, it will be promoted again.
There is no direct equivalent of migration-threshold for the master role, but the same effect can be achieved with a location constraint using a rule with a node attribute expression for the resource’s fail count.
For example, to immediately ban the master role from a node with any failed promote or master monitor:
<rsc_location id="loc1" rsc="my_primitive">
  <rule id="rule1" score="-INFINITY" role="Master" boolean-op="or">
    <expression id="expr1" attribute="fail-count-my_primitive#promote_0"
      operation="gte" value="1"/>
    <expression id="expr2" attribute="fail-count-my_primitive#monitor_10000"
      operation="gte" value="1"/>
  </rule>
</rsc_location>
This example assumes that my_primitive is the primitive inside a promotable clone (the rule uses the primitive name, not the clone name) and that a recurring monitor with a 10-second interval is configured for the master role. Fail count attributes specify the interval in milliseconds, which is why the attribute name ends in monitor_10000.
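
For reference, a resource definition matching those assumptions might look like the sketch below. The clone id, the choice of the ocf:pacemaker:Stateful agent, the operation ids, and the 11-second interval for the Slave monitor are all assumptions made for illustration; only the primitive name and the 10-second master monitor come from the rule above.

```xml
<clone id="my_clone">
  <meta_attributes id="my_clone-meta">
    <nvpair id="my_clone-promotable" name="promotable" value="true"/>
  </meta_attributes>
  <primitive id="my_primitive" class="ocf" provider="pacemaker" type="Stateful">
    <operations>
      <!-- 10 seconds = 10000 ms, matching fail-count-my_primitive#monitor_10000 -->
      <op id="my_primitive-monitor-master" name="monitor" interval="10s"
        role="Master" on-fail="demote"/>
      <!-- A different interval keeps the name/interval pair unique -->
      <op id="my_primitive-monitor-slave" name="monitor" interval="11s"
        role="Slave"/>
    </operations>
  </primitive>
</clone>
```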