<div class="zcontentRow"> <p>Hi,</p><p><br></p><p>> crm configure show</p><p>+ crm configure show</p><p>node $id="336855579" paas-controller-1</p><p>node $id="336855580" paas-controller-2</p><p>node $id="336855581" paas-controller-3</p><p>primitive apigateway ocf:heartbeat:apigateway \</p><p>&nbsp; &nbsp; &nbsp; &nbsp; op monitor interval="2s" timeout="20s" on-fail="restart" \</p><p>&nbsp; &nbsp; &nbsp; &nbsp; op stop interval="0" timeout="200s" on-fail="restart" \</p><p>&nbsp; &nbsp; &nbsp; &nbsp; op start interval="0" timeout="9999h" on-fail="restart"</p><p>primitive apigateway_vip ocf:heartbeat:IPaddr2 \</p><p>&nbsp; &nbsp; &nbsp; &nbsp; params ip="20.20.2.7" cidr_netmask="24" \</p><p>&nbsp; &nbsp; &nbsp; &nbsp; op start interval="0" timeout="20" \</p><p>&nbsp; &nbsp; &nbsp; &nbsp; op stop interval="0" timeout="20" \</p><p>&nbsp; &nbsp; &nbsp; &nbsp; op monitor timeout="20s" interval="2s" depth="0"</p><p>primitive router ocf:heartbeat:router \</p><p>&nbsp; &nbsp; &nbsp; &nbsp; op monitor interval="2s" timeout="20s" on-fail="restart" \</p><p>&nbsp; &nbsp; &nbsp; &nbsp; op stop interval="0" timeout="200s" on-fail="restart" \</p><p>&nbsp; &nbsp; &nbsp; &nbsp; op start interval="0" timeout="9999h" on-fail="restart"</p><p>primitive router_vip ocf:heartbeat:IPaddr2 \</p><p>&nbsp; &nbsp; &nbsp; &nbsp; params ip="10.10.1.7" cidr_netmask="24" \</p><p>&nbsp; &nbsp; &nbsp; &nbsp; op start interval="0" timeout="20" \</p><p>&nbsp; &nbsp; &nbsp; &nbsp; op stop interval="0" timeout="20" \</p><p>&nbsp; &nbsp; &nbsp; &nbsp; op monitor timeout="20s" interval="2s" depth="0"</p><p>primitive sdclient ocf:heartbeat:sdclient \</p><p>&nbsp; &nbsp; &nbsp; &nbsp; op monitor interval="2s" timeout="20s" on-fail="restart" \</p><p>&nbsp; &nbsp; &nbsp; &nbsp; op stop interval="0" timeout="200s" on-fail="restart" \</p><p>&nbsp; &nbsp; &nbsp; &nbsp; op start interval="0" timeout="9999h" on-fail="restart"</p><p>primitive sdclient_vip ocf:heartbeat:IPaddr2 \</p><p>&nbsp; &nbsp; &nbsp; &nbsp; 
params ip="10.10.1.8" cidr_netmask="24" \</p><p>&nbsp; &nbsp; &nbsp; &nbsp; op start interval="0" timeout="20" \</p><p>&nbsp; &nbsp; &nbsp; &nbsp; op stop interval="0" timeout="20" \</p><p>&nbsp; &nbsp; &nbsp; &nbsp; op monitor timeout="20s" interval="2s" depth="0"</p><p>clone apigateway_rep apigateway</p><p>clone router_rep router</p><p>clone sdclient_rep sdclient</p><p>location apigateway_loc apigateway_vip \</p><p>&nbsp; &nbsp; &nbsp; &nbsp; rule $id="apigateway_loc-rule" +inf: apigateway_workable eq 1</p><p>location router_loc router_vip \</p><p>&nbsp; &nbsp; &nbsp; &nbsp; rule $id="router_loc-rule" +inf: router_workable eq 1</p><p>location sdclient_loc sdclient_vip \</p><p>&nbsp; &nbsp; &nbsp; &nbsp; rule $id="sdclient_loc-rule" +inf: sdclient_workable eq 1</p><p>property $id="cib-bootstrap-options" \</p><p>&nbsp; &nbsp; &nbsp; &nbsp; dc-version="1.1.10-42f2063" \</p><p>&nbsp; &nbsp; &nbsp; &nbsp; cluster-infrastructure="corosync" \</p><p>&nbsp; &nbsp; &nbsp; &nbsp; stonith-enabled="false" \</p><p>&nbsp; &nbsp; &nbsp; &nbsp; no-quorum-policy="ignore" \</p><p>&nbsp; &nbsp; &nbsp; &nbsp; start-failure-is-fatal="false" \</p><p>&nbsp; &nbsp; &nbsp; &nbsp; last-lrm-refresh="1486981647"</p><p>op_defaults $id="op_defaults-options" \</p><p>&nbsp; &nbsp; &nbsp; &nbsp; on-fail="restart"</p><p>-------------------------------------------------------------------------------------------------</p><p><br></p><p>By the way, in the monitor action I use "crm_attribute -N $HOSTNAME -q&nbsp;-l reboot --name &lt;prefix&gt;_workable -v &lt;1 or 0&gt;" to update the transient node attributes that control where each VIP is placed.</p><p>I also found that the VIP resource is not moved if the related clone resource fails to restart.</p><p style="font-family: 宋体; font-size: medium; line-height: normal; widows: 1;"><br></p><span style="line-height: normal; widows: 1; font-size: 7.0px;;color:#58595b;font-size:10px"></span><div><div class="zhistoryRow" style="display:block"><div class="zhistoryDes" style="width: 
100%; height: 28px; line-height: 28px; background-color: #E0E5E9; color: #1388FF; text-align: center;" language-data="HistoryOrgTxt">Original message</div><div id="zwriteHistoryContainer"><div class="control-group zhistoryPanel"><div class="zhistoryHeader" style="padding: 8px; background-color: #F5F6F8;"><div><strong language-data="HistorySenderTxt">From:</strong><span class="zreadUserName"> &lt;kgaillot@redhat.com&gt;;</span></div><div><strong language-data="HistoryTOTxt">To:</strong><span class="zreadUserName" style="display: inline-block;"> &lt;users@clusterlabs.org&gt;;</span></div><div><strong language-data="HistoryDateTxt">Date:</strong><span class="">2017-02-13 23:04</span></div><div><strong language-data="HistorySubjectTxt">Subject:</strong><span class="zreadTitle"><strong>Re: [ClusterLabs] clone resource not get restarted on fail</strong></span></div></div><p class="zhistoryContent"><br></p><div>On&nbsp;02/13/2017&nbsp;07:57&nbsp;AM,&nbsp;he.hailong5@zte.com.cn&nbsp;wrote:<br>>&nbsp;Pacemaker&nbsp;1.1.10<br>>&nbsp;<br>>&nbsp;Corosync&nbsp;2.3.3<br>>&nbsp;<br>>&nbsp;<br>>&nbsp;this&nbsp;is&nbsp;a&nbsp;3-node&nbsp;cluster&nbsp;configured&nbsp;with&nbsp;3&nbsp;clone&nbsp;resources,&nbsp;each<br>>&nbsp;attached&nbsp;with&nbsp;a&nbsp;vip&nbsp;resource&nbsp;of&nbsp;IPaddr2:<br>>&nbsp;<br>>&nbsp;<br>>&nbsp;>crm&nbsp;status<br>>&nbsp;<br>>&nbsp;<br>>&nbsp;Online:&nbsp;[&nbsp;paas-controller-1&nbsp;paas-controller-2&nbsp;paas-controller-3&nbsp;]<br>>&nbsp;<br>>&nbsp;<br>>&nbsp;&nbsp;router_vip&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;(ocf::heartbeat:IPaddr2):&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Started&nbsp;paas-controller-1&nbsp;<br>>&nbsp;<br>>&nbsp;&nbsp;sdclient_vip&nbsp;&nbsp;&nbsp;(ocf::heartbeat:IPaddr2):&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Started&nbsp;paas-controller-3&nbsp;<br>>&nbsp;<br>>&nbsp;&nbsp;apigateway_vip&nbsp;(ocf::heartbeat:IPaddr2):&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Started&nbsp;paas-controller-2&nbsp;<br>>&nbsp;<br>>&nbsp;&nbsp;Clone&nbsp;Set:&nbsp;sdclient_
rep&nbsp;[sdclient]<br>>&nbsp;<br>>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Started:&nbsp;[&nbsp;paas-controller-1&nbsp;paas-controller-2&nbsp;paas-controller-3&nbsp;]<br>>&nbsp;<br>>&nbsp;&nbsp;Clone&nbsp;Set:&nbsp;router_rep&nbsp;[router]<br>>&nbsp;<br>>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Started:&nbsp;[&nbsp;paas-controller-1&nbsp;paas-controller-2&nbsp;paas-controller-3&nbsp;]<br>>&nbsp;<br>>&nbsp;&nbsp;Clone&nbsp;Set:&nbsp;apigateway_rep&nbsp;[apigateway]<br>>&nbsp;<br>>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Started:&nbsp;[&nbsp;paas-controller-1&nbsp;paas-controller-2&nbsp;paas-controller-3&nbsp;]<br>>&nbsp;<br>>&nbsp;<br>>&nbsp;It&nbsp;is&nbsp;observed&nbsp;that&nbsp;sometimes&nbsp;the&nbsp;clone&nbsp;resource&nbsp;is&nbsp;stuck&nbsp;to&nbsp;monitor<br>>&nbsp;when&nbsp;the&nbsp;service&nbsp;fails:<br>>&nbsp;<br>>&nbsp;<br>>&nbsp;&nbsp;router_vip&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;(ocf::heartbeat:IPaddr2):&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Started&nbsp;paas-controller-1&nbsp;<br>>&nbsp;<br>>&nbsp;&nbsp;sdclient_vip&nbsp;&nbsp;&nbsp;(ocf::heartbeat:IPaddr2):&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Started&nbsp;paas-controller-2&nbsp;<br>>&nbsp;<br>>&nbsp;&nbsp;apigateway_vip&nbsp;(ocf::heartbeat:IPaddr2):&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Started&nbsp;paas-controller-3&nbsp;<br>>&nbsp;<br>>&nbsp;&nbsp;Clone&nbsp;Set:&nbsp;sdclient_rep&nbsp;[sdclient]<br>>&nbsp;<br>>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Started:&nbsp;[&nbsp;paas-controller-1&nbsp;paas-controller-2&nbsp;]<br>>&nbsp;<br>>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Stopped:&nbsp;[&nbsp;paas-controller-3&nbsp;]<br>>&nbsp;<br>>&nbsp;&nbsp;Clone&nbsp;Set:&nbsp;router_rep&nbsp;[router]<br>>&nbsp;<br>>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;router&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;(ocf::heartbeat:router):&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Started<br>>&nbsp;paas-controller-3&nbsp;FAILED&nbsp;<br>>&nbsp;<br>>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Started:&nbsp;[&nbsp;paas-controller-1&nbsp;paas-controller-2&nbsp;
]<br>>&nbsp;<br>>&nbsp;&nbsp;Clone&nbsp;Set:&nbsp;apigateway_rep&nbsp;[apigateway]<br>>&nbsp;<br>>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;apigateway&nbsp;(ocf::heartbeat:apigateway):&nbsp;&nbsp;&nbsp;&nbsp;Started<br>>&nbsp;paas-controller-3&nbsp;FAILED&nbsp;<br>>&nbsp;<br>>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Started:&nbsp;[&nbsp;paas-controller-1&nbsp;paas-controller-2&nbsp;]<br>>&nbsp;<br>>&nbsp;<br>>&nbsp;in&nbsp;the&nbsp;example&nbsp;above.&nbsp;the&nbsp;sdclient_rep&nbsp;get&nbsp;restarted&nbsp;on&nbsp;node&nbsp;3,&nbsp;while<br>>&nbsp;the&nbsp;other&nbsp;two&nbsp;hang&nbsp;at&nbsp;monitoring&nbsp;on&nbsp;node&nbsp;3,&nbsp;here&nbsp;are&nbsp;the&nbsp;ocf&nbsp;logs:<br>>&nbsp;<br>>&nbsp;<br>>&nbsp;abnormal&nbsp;(apigateway_rep):<br>>&nbsp;<br>>&nbsp;2017-02-13&nbsp;18:27:53&nbsp;[23586]===print_log&nbsp;test_monitor&nbsp;run_func&nbsp;main===<br>>&nbsp;Starting&nbsp;health&nbsp;check.<br>>&nbsp;<br>>&nbsp;2017-02-13&nbsp;18:27:53&nbsp;[23586]===print_log&nbsp;test_monitor&nbsp;run_func&nbsp;main===<br>>&nbsp;health&nbsp;check&nbsp;succeed.<br>>&nbsp;<br>>&nbsp;2017-02-13&nbsp;18:27:55&nbsp;[24010]===print_log&nbsp;test_monitor&nbsp;run_func&nbsp;main===<br>>&nbsp;Starting&nbsp;health&nbsp;check.<br>>&nbsp;<br>>&nbsp;2017-02-13&nbsp;18:27:55&nbsp;[24010]===print_log&nbsp;test_monitor&nbsp;run_func&nbsp;main===<br>>&nbsp;Failed:&nbsp;docker&nbsp;daemon&nbsp;is&nbsp;not&nbsp;running.<br>>&nbsp;<br>>&nbsp;2017-02-13&nbsp;18:27:57&nbsp;[24095]===print_log&nbsp;test_monitor&nbsp;run_func&nbsp;main===<br>>&nbsp;Starting&nbsp;health&nbsp;check.<br>>&nbsp;<br>>&nbsp;2017-02-13&nbsp;18:27:57&nbsp;[24095]===print_log&nbsp;test_monitor&nbsp;run_func&nbsp;main===<br>>&nbsp;Failed:&nbsp;docker&nbsp;daemon&nbsp;is&nbsp;not&nbsp;running.<br>>&nbsp;<br>>&nbsp;2017-02-13&nbsp;18:27:59&nbsp;[24159]===print_log&nbsp;test_monitor&nbsp;run_func&nbsp;main===<br>>&nbsp;Starting&nbsp;health&nbsp;check.<br>>&nbsp;<br>>&nbsp;2017-02-13&nbsp;18:27:59&nbsp;[24159]===print_log&nbsp;test_monitor&
nbsp;run_func&nbsp;main===<br>>&nbsp;Failed:&nbsp;docker&nbsp;daemon&nbsp;is&nbsp;not&nbsp;running.<br>>&nbsp;<br>>&nbsp;<br>>&nbsp;normal&nbsp;(sdclient_rep):<br>>&nbsp;<br>>&nbsp;2017-02-13&nbsp;18:27:52&nbsp;[23507]===print_log&nbsp;sdclient_monitor&nbsp;run_func<br>>&nbsp;main===&nbsp;health&nbsp;check&nbsp;succeed.<br>>&nbsp;<br>>&nbsp;2017-02-13&nbsp;18:27:54&nbsp;[23630]===print_log&nbsp;sdclient_monitor&nbsp;run_func<br>>&nbsp;main===&nbsp;Starting&nbsp;health&nbsp;check.<br>>&nbsp;<br>>&nbsp;2017-02-13&nbsp;18:27:54&nbsp;[23630]===print_log&nbsp;sdclient_monitor&nbsp;run_func<br>>&nbsp;main===&nbsp;Failed:&nbsp;docker&nbsp;daemon&nbsp;is&nbsp;not&nbsp;running.<br>>&nbsp;<br>>&nbsp;2017-02-13&nbsp;18:27:55&nbsp;[23710]===print_log&nbsp;sdclient_stop&nbsp;run_func&nbsp;main===<br>>&nbsp;Starting&nbsp;stop&nbsp;the&nbsp;container.<br>>&nbsp;<br>>&nbsp;2017-02-13&nbsp;18:27:55&nbsp;[23710]===print_log&nbsp;sdclient_stop&nbsp;run_func&nbsp;main===<br>>&nbsp;docker&nbsp;daemon&nbsp;lost,&nbsp;pretend&nbsp;stop&nbsp;succeed.<br>>&nbsp;<br>>&nbsp;2017-02-13&nbsp;18:27:55&nbsp;[23763]===print_log&nbsp;sdclient_start&nbsp;run_func&nbsp;main===<br>>&nbsp;Starting&nbsp;run&nbsp;the&nbsp;container.<br>>&nbsp;<br>>&nbsp;2017-02-13&nbsp;18:27:55&nbsp;[23763]===print_log&nbsp;sdclient_start&nbsp;run_func&nbsp;main===<br>>&nbsp;docker&nbsp;daemon&nbsp;lost,&nbsp;try&nbsp;again&nbsp;in&nbsp;5&nbsp;secs.<br>>&nbsp;<br>>&nbsp;2017-02-13&nbsp;18:28:00&nbsp;[23763]===print_log&nbsp;sdclient_start&nbsp;run_func&nbsp;main===<br>>&nbsp;docker&nbsp;daemon&nbsp;lost,&nbsp;try&nbsp;again&nbsp;in&nbsp;5&nbsp;secs.<br>>&nbsp;<br>>&nbsp;2017-02-13&nbsp;18:28:05&nbsp;[23763]===print_log&nbsp;sdclient_start&nbsp;run_func&nbsp;main===<br>>&nbsp;docker&nbsp;daemon&nbsp;lost,&nbsp;try&nbsp;again&nbsp;in&nbsp;5&nbsp;secs.<br>>&nbsp;<br>>&nbsp;<br>>&nbsp;If&nbsp;I&nbsp;disable&nbsp;2&nbsp;clone&nbsp;resource,&nbsp;the&nbsp;switch&nbsp;over&nbsp;test&nbsp;for&nbsp;one&nbsp;clone<br>>&nbsp;resour
ce&nbsp;works&nbsp;as&nbsp;expected:&nbsp;fail&nbsp;the&nbsp;service&nbsp;->&nbsp;monitor&nbsp;fails&nbsp;->&nbsp;stop<br>>&nbsp;->&nbsp;start<br>>&nbsp;<br>>&nbsp;<br>>&nbsp;Online:&nbsp;[&nbsp;paas-controller-1&nbsp;paas-controller-2&nbsp;paas-controller-3&nbsp;]<br>>&nbsp;<br>>&nbsp;<br>>&nbsp;&nbsp;sdclient_vip&nbsp;&nbsp;&nbsp;(ocf::heartbeat:IPaddr2):&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Started&nbsp;paas-controller-2&nbsp;<br>>&nbsp;<br>>&nbsp;&nbsp;Clone&nbsp;Set:&nbsp;sdclient_rep&nbsp;[sdclient]<br>>&nbsp;<br>>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Started:&nbsp;[&nbsp;paas-controller-1&nbsp;paas-controller-2&nbsp;]<br>>&nbsp;<br>>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Stopped:&nbsp;[&nbsp;paas-controller-3&nbsp;]<br>>&nbsp;<br>>&nbsp;<br>>&nbsp;what's&nbsp;the&nbsp;reason&nbsp;behind????&nbsp;<br><br>Can&nbsp;you&nbsp;show&nbsp;the&nbsp;configuration&nbsp;of&nbsp;the&nbsp;three&nbsp;clones,&nbsp;their&nbsp;operations,<br>and&nbsp;any&nbsp;constraints?<br><br>Normally,&nbsp;the&nbsp;response&nbsp;is&nbsp;controlled&nbsp;by&nbsp;the&nbsp;monitor&nbsp;operation's&nbsp;on-fail<br>attribute&nbsp;(which&nbsp;defaults&nbsp;to&nbsp;restart).<br></div><p><br></p></div></div></div></div><p><br></p> </div>
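<p>The attribute mechanism described in the reply at the top (the monitor action publishing &lt;prefix&gt;_workable, with the location rules keying on that attribute) could be sketched roughly as below. This is only an illustration, not the actual resource agent: the function names publish_workable and monitor_and_publish, the CRM_ATTRIBUTE override variable, and the choice of return code 7 (OCF_NOT_RUNNING) on failure are assumptions. Only the crm_attribute invocation itself is taken from the message.</p>

```shell
#!/bin/sh
# Hypothetical sketch: helper names and the CRM_ATTRIBUTE override are
# illustrative and do not come from the real agents in this thread.

# Command used to write the node attribute; overridable for dry runs.
CRM_ATTRIBUTE="${CRM_ATTRIBUTE:-crm_attribute}"

# publish_workable PREFIX VALUE
# Sets the transient (reboot-lifetime) attribute <PREFIX>_workable on this
# node, using the same options quoted in the message.
publish_workable() {
    prefix="$1"
    value="$2"
    node="${HOSTNAME:-$(uname -n)}"
    "$CRM_ATTRIBUTE" -N "$node" -q -l reboot --name "${prefix}_workable" -v "$value"
}

# monitor_and_publish PREFIX CHECK_CMD [ARGS...]
# Runs the health check, publishes 1 (healthy) or 0 (failed), and returns
# an OCF-style exit code (the failure code chosen here is an assumption).
monitor_and_publish() {
    prefix="$1"
    shift
    if "$@"; then
        publish_workable "$prefix" 1
        return 0    # OCF_SUCCESS
    fi
    publish_workable "$prefix" 0
    return 7        # OCF_NOT_RUNNING
}
```

<p>For example, a router agent's monitor action might call <code>monitor_and_publish router some_health_check</code>, where some_health_check stands in for whatever command the agent uses to probe the service. A rule such as "+inf: router_workable eq 1" then prefers the nodes where the attribute is currently 1.</p>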