migration-threshold resource meta-attribute. [13]
If you set migration-threshold=N for a resource, it will be banned from the original node after N failures.
Note
migration-threshold is per resource, even though fail counts are tracked per operation. The operation fail counts are added together to compare against the migration-threshold.
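The comparison described in the note can be sketched as a small model. This is only an illustration of the rule, not Pacemaker's implementation; the function name `is_banned` and the dictionary layout are hypothetical.

```python
def is_banned(op_fail_counts, migration_threshold):
    """Return True if a resource would be banned from a node.

    op_fail_counts: hypothetical dict mapping an operation name
    (for example "monitor" or "start") to its fail count on that node.
    migration_threshold: the resource's migration-threshold value N.
    """
    # Fail counts are tracked per operation, but the threshold is per
    # resource, so the per-operation counts are summed before comparing.
    total = sum(op_fail_counts.values())
    return total >= migration_threshold


# One monitor failure plus one start failure reaches a threshold of 2.
print(is_banned({"monitor": 1, "start": 1}, 2))
# A single failure does not.
print(is_banned({"monitor": 1}, 2))
```

Note that no single operation needs to reach the threshold on its own; only the sum matters in this model.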
By default, fail counts remain until they are cleared manually with crm_resource --cleanup or crm_failcount --delete (hopefully after first fixing the failure’s cause). It is possible to have fail counts expire automatically by setting the failure-timeout resource meta-attribute.
Important
For example, setting migration-threshold=2 and failure-timeout=60s would cause the resource to move to a new node after 2 failures, and allow it to move back (depending on stickiness and constraint scores) after one minute.
Note
failure-timeout is measured since the most recent failure. That is, older failures do not individually time out and lower the fail count. Instead, all failures are timed out simultaneously (and the fail count is reset to 0) if there is no new failure for the timeout period.
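The expiry rule in the note above can be modelled roughly as follows. This is a simplified sketch under stated assumptions: the class `FailTracker` and its methods are hypothetical, times are plain seconds, and the model ignores when the cluster actually re-evaluates the timeout.

```python
class FailTracker:
    """Toy model of one resource's fail count on one node."""

    def __init__(self, failure_timeout):
        self.failure_timeout = failure_timeout  # seconds
        self.count = 0
        self.last_failure = None  # time of the most recent failure

    def _expire(self, now):
        # All failures expire together, and only when the timeout has
        # elapsed since the MOST RECENT failure.
        if (self.last_failure is not None
                and now - self.last_failure >= self.failure_timeout):
            self.count = 0
            self.last_failure = None

    def record_failure(self, now):
        self._expire(now)
        self.count += 1
        self.last_failure = now

    def fail_count(self, now):
        self._expire(now)
        return self.count


tracker = FailTracker(failure_timeout=60)
tracker.record_failure(now=0)
tracker.record_failure(now=50)
# At t=100 the first failure is 100s old, but the most recent one is only
# 50s old, so nothing has expired yet.
print(tracker.fail_count(now=100))
# At t=110 a full 60s has passed since the last failure, so the whole
# count is reset.
print(tracker.fail_count(now=110))
```

The point of the model is that the first failure does not drop off at t=60: a new failure restarts the clock for every failure recorded so far.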
If the cluster property start-failure-is-fatal is set to true (which is the default), start failures cause the fail count to be set to INFINITY and thus always cause the resource to move immediately.
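The effect of a start failure on the fail count can be sketched in the same toy style. The function name `fail_count_after` is hypothetical, and the numeric value chosen for INFINITY (1,000,000, as used for Pacemaker scores) is an assumption made for illustration.

```python
# Assumed numeric stand-in for INFINITY; for illustration only.
INFINITY = 1_000_000


def fail_count_after(operation, current_count, start_failure_is_fatal=True):
    """Return the new fail count after one failure of `operation`."""
    if operation == "start" and start_failure_is_fatal:
        # A fatal start failure jumps straight to INFINITY, which meets
        # or exceeds any finite migration-threshold.
        return INFINITY
    # Any other failure increments the count, capped at INFINITY.
    return min(current_count + 1, INFINITY)


print(fail_count_after("start", 0) == INFINITY)
print(fail_count_after("monitor", 0))
print(fail_count_after("start", 0, start_failure_is_fatal=False))
```

In this model, a single fatal start failure is enough to exceed migration-threshold=2, whereas two ordinary failures would otherwise be required.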