<div dir="ltr"><div><div>Hi Andrew,<br><br></div>Here is the output of the verbose crm_failcount.<br><br>   trace: set_crm_log_level:     New log level: 8<br>   trace: cib_native_signon_raw:     Connecting cib_rw channel<br>   trace: pick_ipc_buffer:     Using max message size of 524288<br>   debug: qb_rb_open_2:     shm size:524301; real_size:528384; rb-&gt;word_size:132096<br>   debug: qb_rb_open_2:     shm size:524301; real_size:528384; rb-&gt;word_size:132096<br>   debug: qb_rb_open_2:     shm size:524301; real_size:528384; rb-&gt;word_size:132096<br>   trace: mainloop_add_fd:     Added connection 1 for cib_rw[0x1fd79c0].4<br>   trace: pick_ipc_buffer:     Using max message size of 51200<br>   trace: crm_ipc_send:     Sending from client: cib_rw request id: 1 bytes: 131 timeout:-1 msg...<br>   trace: crm_ipc_send:     Recieved response 1, size=140, rc=140, text: &lt;cib_common_callback_worker cib_op=&quot;register&quot; cib_clientid=&quot;f8cfae2d-51e6-4cd7-97f8-2d6f49bf1f17&quot;/&gt;<br>   trace: cib_native_signon_raw:     reg-reply   &lt;cib_common_callback_worker cib_op=&quot;register&quot; cib_clientid=&quot;f8cfae2d-51e6-4cd7-97f8-2d6f49bf1f17&quot;/&gt;<br>   debug: cib_native_signon_raw:     Connection to CIB successful<br>   trace: cib_create_op:     Sending call options: 00001100, 4352<br>   trace: cib_native_perform_op_delegate:     Sending cib_query message to CIB service (timeout=120s)<br>   trace: crm_ipc_send:     Sending from client: cib_rw request id: 2 bytes: 211 timeout:120000 msg...<br>   trace: internal_ipc_get_reply:     client cib_rw waiting on reply to msg id 2<br>   trace: crm_ipc_send:     Recieved response 2, size=944, rc=944, text: &lt;cib-reply t=&quot;cib&quot; cib_op=&quot;cib_query&quot; cib_callid=&quot;2&quot; cib_clientid=&quot;f8cfae2d-51e6-4cd7-97f8-2d6f49bf1f17&quot; cib_callopt=&quot;4352&quot; cib_rc=&quot;0&quot;&gt;&lt;cib_calldata&gt;&lt;nodes&gt;&lt;node uname=&quot;<a href="http://node2.domain.com">node2.domain.com</a>&quot; id=&quot;o<br>   trace: cib_native_perform_op_delegate:     Reply   &lt;cib-reply t=&quot;cib&quot; cib_op=&quot;cib_query&quot; cib_callid=&quot;2&quot; cib_clientid=&quot;f8cfae2d-51e6-4cd7-97f8-2d6f49bf1f17&quot; cib_callopt=&quot;4352&quot; cib_rc=&quot;0&quot;&gt;<br>   trace: cib_native_perform_op_delegate:     Reply     &lt;cib_calldata&gt;<br>   trace: cib_native_perform_op_delegate:     Reply       &lt;nodes&gt;<br>   trace: cib_native_perform_op_delegate:     Reply         &lt;node uname=&quot;<a href="http://node2.domain.com">node2.domain.com</a>&quot; id=&quot;<a href="http://node2.domain.com">node2.domain.com</a>&quot;&gt;<br>   trace: cib_native_perform_op_delegate:     Reply           &lt;instance_attributes id=&quot;<a href="http://nodes-node2.domain.com">nodes-node2.domain.com</a>&quot;&gt;<br>   trace: cib_native_perform_op_delegate:     Reply             &lt;nvpair id=&quot;nodes-node2.domain.com-postgres_msg-data-status&quot; name=&quot;postgres_msg-data-status&quot; value=&quot;STREAMING|SYNC&quot;/&gt;<br>   trace: cib_native_perform_op_delegate:     Reply             &lt;nvpair id=&quot;nodes-node2.domain.com-standby&quot; name=&quot;standby&quot; value=&quot;off&quot;/&gt;<br>   trace: cib_native_perform_op_delegate:     Reply           &lt;/instance_attributes&gt;<br>   trace: cib_native_perform_op_delegate:     Reply         &lt;/node&gt;<br>   trace: cib_native_perform_op_delegate:     Reply         &lt;node uname=&quot;<a href="http://node1.domain.com">node1.domain.com</a>&quot; id=&quot;<a 
href="http://node1.domain.com">node1.domain.com</a>&quot;&gt;<br>   trace: cib_native_perform_op_delegate:     Reply           &lt;instance_attributes id=&quot;<a href="http://nodes-node1.domain.com">nodes-node1.domain.com</a>&quot;&gt;<br>   trace: cib_native_perform_op_delegate:     Reply             &lt;nvpair id=&quot;nodes-node1.domain.com-postgres_msg-data-status&quot; name=&quot;postgres_msg-data-status&quot; value=&quot;LATEST&quot;/&gt;<br>   trace: cib_native_perform_op_delegate:     Reply             &lt;nvpair id=&quot;nodes-node1.domain.com-standby&quot; name=&quot;standby&quot; value=&quot;off&quot;/&gt;<br>   trace: cib_native_perform_op_delegate:     Reply           &lt;/instance_attributes&gt;<br>   trace: cib_native_perform_op_delegate:     Reply         &lt;/node&gt;<br>   trace: cib_native_perform_op_delegate:     Reply       &lt;/nodes&gt;<br>   trace: cib_native_perform_op_delegate:     Reply     &lt;/cib_calldata&gt;<br>   trace: cib_native_perform_op_delegate:     Reply   &lt;/cib-reply&gt;<br>   trace: cib_native_perform_op_delegate:     Syncronous reply 2 received<br>   debug: get_cluster_node_uuid:     Result section   &lt;nodes&gt;<br>   debug: get_cluster_node_uuid:     Result section     &lt;node uname=&quot;<a href="http://node2.domain.com">node2.domain.com</a>&quot; id=&quot;<a href="http://node2.domain.com">node2.domain.com</a>&quot;&gt;<br>   debug: get_cluster_node_uuid:     Result section       &lt;instance_attributes id=&quot;<a href="http://nodes-node2.domain.com">nodes-node2.domain.com</a>&quot;&gt;<br>   debug: get_cluster_node_uuid:     Result section         &lt;nvpair id=&quot;nodes-node2.domain.com-postgres_msg-data-status&quot; name=&quot;postgres_msg-data-status&quot; value=&quot;STREAMING|SYNC&quot;/&gt;<br>   debug: get_cluster_node_uuid:     Result section         &lt;nvpair id=&quot;nodes-node2.domain.com-standby&quot; name=&quot;standby&quot; value=&quot;off&quot;/&gt;<br>   debug: get_cluster_node_uuid:     Result section       &lt;/instance_attributes&gt;<br>   debug: get_cluster_node_uuid:     Result section     &lt;/node&gt;<br>   debug: get_cluster_node_uuid:     Result section     &lt;node uname=&quot;<a href="http://node1.domain.com">node1.domain.com</a>&quot; id=&quot;<a href="http://node1.domain.com">node1.domain.com</a>&quot;&gt;<br>   debug: get_cluster_node_uuid:     Result section       &lt;instance_attributes id=&quot;<a href="http://nodes-node1.domain.com">nodes-node1.domain.com</a>&quot;&gt;<br>   debug: get_cluster_node_uuid:     Result section         &lt;nvpair id=&quot;nodes-node1.domain.com-postgres_msg-data-status&quot; name=&quot;postgres_msg-data-status&quot; value=&quot;LATEST&quot;/&gt;<br>   debug: get_cluster_node_uuid:     Result section         &lt;nvpair id=&quot;nodes-node1.domain.com-standby&quot; name=&quot;standby&quot; value=&quot;off&quot;/&gt;<br>   debug: get_cluster_node_uuid:     Result section       &lt;/instance_attributes&gt;<br>   debug: get_cluster_node_uuid:     Result section     &lt;/node&gt;<br>   debug: get_cluster_node_uuid:     Result section   &lt;/nodes&gt;<br>    info: query_node_uuid:     Mapped <a href="http://node1.domain.com">node1.domain.com</a> to <a href="http://node1.domain.com">node1.domain.com</a><br>   trace: pick_ipc_buffer:     Using max message size of 51200<br>    info: attrd_update_delegate:     Connecting to cluster... 
5 retries remaining<br>   debug: qb_rb_open_2:     shm size:51213; real_size:53248; rb-&gt;word_size:13312<br>   debug: qb_rb_open_2:     shm size:51213; real_size:53248; rb-&gt;word_size:13312<br>   debug: qb_rb_open_2:     shm size:51213; real_size:53248; rb-&gt;word_size:13312<br>   trace: crm_ipc_send:     Sending from client: attrd request id: 3 bytes: 168 timeout:5000 msg...<br>   trace: internal_ipc_get_reply:     client attrd waiting on reply to msg id 3<br>   trace: crm_ipc_send:     Recieved response 3, size=88, rc=88, text: &lt;ack function=&quot;attrd_ipc_dispatch&quot; line=&quot;129&quot;/&gt;<br>   debug: attrd_update_delegate:     Sent update: (null)=(null) for <a href="http://node1.domain.com">node1.domain.com</a><br>    info: main:     Update (null)=&lt;none&gt; sent via attrd<br>   debug: cib_native_signoff:     Signing out of the CIB Service<br>   trace: mainloop_del_fd:     Removing client cib_rw[0x1fd79c0]<br>   trace: mainloop_gio_destroy:     Destroying client cib_rw[0x1fd79c0]<br>   trace: crm_ipc_close:     Disconnecting cib_rw IPC connection 0x1fdb020 (0x1fdb1a0.(nil))<br>   debug: qb_ipcc_disconnect:     qb_ipcc_disconnect()<br>   trace: qb_rb_close:     ENTERING qb_rb_close()<br>   debug: qb_rb_close:     Closing ringbuffer: /dev/shm/qb-cib_rw-request-8347-9344-14-header<br>   trace: qb_rb_close:     ENTERING qb_rb_close()<br>   debug: qb_rb_close:     Closing ringbuffer: /dev/shm/qb-cib_rw-response-8347-9344-14-header<br>   trace: qb_rb_close:     ENTERING qb_rb_close()<br>   debug: qb_rb_close:     Closing ringbuffer: /dev/shm/qb-cib_rw-event-8347-9344-14-header<br>   trace: cib_native_destroy:     destroying 0x1fd7910<br>   trace: crm_ipc_destroy:     Destroying IPC connection to cib_rw: 0x1fdb020<br>   trace: mainloop_gio_destroy:     Destroyed client cib_rw[0x1fd79c0]<br>   trace: crm_exit:     cleaning up libxml<br>    info: crm_xml_cleanup:     Cleaning up memory from libxml2<br>   trace: crm_exit:     exit 0<br><br></div>I hope it helps.<br></div><div class="gmail_extra"><br><div class="gmail_quote">2015-05-20 6:34 GMT+02:00 Andrew Beekhof <span dir="ltr">&lt;<a href="mailto:andrew@beekhof.net" target="_blank">andrew@beekhof.net</a>&gt;</span>:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><span class=""><br>
> On 4 May 2015, at 6:43 pm, Alexandre <alxgomz@gmail.com> wrote:
>
> Hi,
>
> I have a pacemaker / corosync / cman cluster running on Red Hat 6.6.
> Although the cluster is working as expected, I still have traces of old failures (from several months ago) that I can't get rid of.
> Basically I have set cluster-recheck-interval="300" and failure-timeout="600" (in rsc_defaults), as shown below:
>
> property $id="cib-bootstrap-options" \
>     dc-version="1.1.10-14.el6-368c726" \
>     cluster-infrastructure="cman" \
>     expected-quorum-votes="2" \
>     no-quorum-policy="ignore" \
>     stonith-enabled="false" \
>     last-lrm-refresh="1429702408" \
>     maintenance-mode="false" \
>     cluster-recheck-interval="300"
> rsc_defaults $id="rsc-options" \
>     failure-timeout="600"
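For context on how these two settings are meant to interact: a failure becomes eligible for expiry once it is older than failure-timeout, and the expiry is applied at the next cluster recheck, so under the values above a stale failure should clear within at most one recheck interval past the timeout:

   last-failure + failure-timeout (600s)   -> eligible for expiry
   + up to cluster-recheck-interval (300s) -> gone within 600s + 300s = 900s (15 minutes)

That is the expected behaviour at least in versions where expiry also removes the fail-count itself, which is the point addressed below.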
>
> So I would expect old failures to have been purged from the CIB long ago, but I still see the following when issuing crm_mon -frA1.
I think automatic deletion didn't arrive until later.
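In the meantime, the stale values can be removed by hand: fail-counts live as transient attributes in the CIB status section. A minimal sketch, assuming the resource and node names from the summary below; the nvpair id is hypothetical, so check the real one first:

   # list leftover fail-count attributes in the CIB status section
   cibadmin -Q | grep fail-count
   # on the affected node, ask attrd to drop its cached copy
   attrd_updater -D -n fail-count-documents_drbd
   # or delete the CIB entry directly by its id (id shown is illustrative)
   cibadmin --delete --xml-text '<nvpair id="status-host1-fail-count-documents_drbd"/>'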

>
> Migration summary:
> * Node host1:
>    etc_ml_drbd: migration-threshold=1000000 fail-count=244 last-failure='Sat Feb 14 17:04:05 2015'
>    spool_postfix_drbd_msg: migration-threshold=1000000 fail-count=244 last-failure='Sat Feb 14 17:04:05 2015'
>    lib_ml_drbd: migration-threshold=1000000 fail-count=244 last-failure='Sat Feb 14 17:04:05 2015'
>    lib_imap_drbd: migration-threshold=1000000 fail-count=244 last-failure='Sat Feb 14 17:04:05 2015'
>    spool_imap_drbd: migration-threshold=1000000 fail-count=11654 last-failure='Sat Feb 14 17:04:05 2015'
>    spool_ml_drbd: migration-threshold=1000000 fail-count=244 last-failure='Sat Feb 14 17:04:05 2015'
>    documents_drbd: migration-threshold=1000000 fail-count=248 last-failure='Sat Feb 14 17:58:55 2015'
> * Node host2:
>    documents_drbd: migration-threshold=1000000 fail-count=548 last-failure='Sat Feb 14 16:26:33 2015'
>
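As a hedged sketch of how to confirm what the cluster currently has recorded for one of these (names taken from the summary above; older crm_failcount builds take the node as -U/--uname, newer ones as -N/--node):

   crm_failcount -G -r documents_drbd -U host1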
> I have tried crm_failcount -D on the resources and also tried a cleanup... but it's still there!

Oh? Can you re-run with -VVVVVV and show us the result?
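Presumably something along these lines, one resource at a time (resource and node names from the summary above; exact option names vary between pacemaker 1.1 releases):

   crm_failcount -VVVVVV -D -r documents_drbd -U host1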

> How can I get rid of those records (so my monitoring tools stop complaining)?
>
> Regards.

_______________________________________________
Users mailing list: Users@clusterlabs.org
http://clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org