<div dir="ltr"><div dir="ltr"><br></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Tue, Jan 30, 2024 at 2:21 PM Walker, Chris <<a href="mailto:christopher.walker@hpe.com">christopher.walker@hpe.com</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div class="msg4093602634382639174">


<div lang="EN-US" style="overflow-wrap: break-word;">

<div class="m_-7491632143547205549WordSection1">

<p class="MsoNormal"><span style="color:rgb(33,33,33)">>>> However, now it seems to wait that amount of time before it elects a<br>

>>> DC, even when quorum is acquired earlier.  In my log snippet below,<br>

>>> with dc-deadtime 300s,<br>

>><br>

>> The dc-deadtime is not waiting for quorum, but for another DC to show<br>

>> up. If all nodes show up, it can proceed, but otherwise it has to wait.<br>

<br>

> I believe all the nodes showed up by 14:17:04, but it still waited until 14:19:26 to elect a DC:<br>

<br>

> Jan 29 14:14:25 gopher12 pacemaker-controld  [123697] (peer_update_callback)    info: Cluster node gopher12 is now member  (was in unknown state)<br>

> Jan 29 14:17:04 gopher12 pacemaker-controld  [123697] (peer_update_callback)    info: Cluster node gopher11 is now member  (was in unknown state)<br>

> Jan 29 14:17:04 gopher12 pacemaker-controld  [123697] (quorum_notification_cb)  notice: Quorum acquired | membership=54 members=2<br>

> Jan 29 14:19:26 gopher12 pacemaker-controld  [123697] (do_log)  info: Input I_ELECTION_DC received in state S_ELECTION from election_win_cb<br>

<br>

> This is a cluster with 2 nodes, gopher11 and gopher12.<br>

<br>

This is our experience with dc-deadtime too: even if both nodes in the cluster show up, dc-deadtime must elapse before the cluster starts.  This was discussed on this list a while back (<a href="https://www.mail-archive.com/users@clusterlabs.org/msg03897.html" target="_blank">https://www.mail-archive.com/users@clusterlabs.org/msg03897.html</a></span>)<span style="color:rgb(33,33,33)">

 and an RFE came out of it (<a href="https://bugs.clusterlabs.org/show_bug.cgi?id=5310" target="_blank">https://bugs.clusterlabs.org/show_bug.cgi?id=5310</a></span>). 

</p>

<p class="MsoNormal"><u></u> <u></u></p>

<p class="MsoNormal">I’ve worked around this by having an ExecStartPre directive for Corosync that does essentially:</p>

<p class="MsoNormal"><u></u> <u></u></p>

<p class="MsoNormal">while ! systemctl -H ${peer} is-active corosync; do sleep 5; done</p>
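<p class="MsoNormal">A minimal sketch of the same wait loop in Python, with a timeout added so that a node does not wait forever if its peer never comes up. Only the systemctl -H ... is-active corosync call comes from the one-liner above; the function names, the timeout and interval values, and the injectable check/sleep/clock parameters are assumptions for illustration, not the contents of the actual unit file.</p>

<pre>
```python
import subprocess
import time


def peer_corosync_active(peer):
    """Return True if `systemctl -H peer is-active corosync` exits 0."""
    result = subprocess.run(
        ["systemctl", "-H", peer, "is-active", "corosync"],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


def wait_for_peer(peer, timeout=300.0, interval=5.0,
                  check=peer_corosync_active, sleep=time.sleep,
                  clock=time.monotonic):
    """Poll until the peer's corosync is active or `timeout` seconds pass.

    Returns True if the peer was seen, False if we gave up waiting.
    The timeout is an assumption; the original loop has none.
    """
    deadline = clock() + timeout
    while True:
        if check(peer):
            return True
        if clock() >= deadline:
            return False
        sleep(interval)
```
</pre>

<p class="MsoNormal">A wrapper script could call wait_for_peer("gopher11") from ExecStartPre and decide, based on the return value, whether to let Corosync start anyway once the timeout has passed.</p>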

<p class="MsoNormal"><u></u> <u></u></p>

<p class="MsoNormal">With this in place, the nodes wait for each other before starting Corosync and Pacemaker.  We can then use the default 20s dc-deadtime so that the DC election happens quickly once both nodes are up.</p></div></div></div></blockquote><div><br></div><div>Actually, wait-for-all, which is enabled by default with 2-node, should delay quorum until both nodes have shown up.</div><div>And if we make the cluster not ignore quorum, it shouldn't start fencing before it sees the peer, right?</div><div>Running a 2-node cluster that ignores quorum, or without wait-for-all, is a delicate thing anyway, I would say,</div><div>and shouldn't be expected to work in the general case. I'm not saying that is the issue here; there just isn't enough</div><div>info about the cluster to say.</div><div>So you shouldn't need the raised dc-deadtime, and thus wouldn't experience long startup delays.</div><div><br></div><div>Regards,</div><div>Klaus</div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div class="msg4093602634382639174"><div lang="EN-US" style="overflow-wrap: break-word;"><div class="m_-7491632143547205549WordSection1">

<p class="MsoNormal">Thanks,</p>

<p class="MsoNormal">Chris</p>

<p class="MsoNormal"><u></u> <u></u></p>

<div style="border-right:none;border-bottom:none;border-left:none;border-top:1pt solid rgb(181,196,223);padding:3pt 0in 0in">

<p class="MsoNormal" style="margin-right:0in;margin-bottom:12pt;margin-left:0.5in">

<b><span style="font-size:12pt;color:black">From: </span></b><span style="font-size:12pt;color:black">Users <<a href="mailto:users-bounces@clusterlabs.org" target="_blank">users-bounces@clusterlabs.org</a>> on behalf of Faaland, Olaf P. via Users <<a href="mailto:users@clusterlabs.org" target="_blank">users@clusterlabs.org</a>><br>

<b>Date: </b>Monday, January 29, 2024 at 7:46 PM<br>

<b>To: </b>Ken Gaillot <<a href="mailto:kgaillot@redhat.com" target="_blank">kgaillot@redhat.com</a>>, Cluster Labs - All topics related to open-source clustering welcomed <<a href="mailto:users@clusterlabs.org" target="_blank">users@clusterlabs.org</a>><br>

<b>Cc: </b>Faaland, Olaf P. <<a href="mailto:faaland1@llnl.gov" target="_blank">faaland1@llnl.gov</a>><br>

<b>Subject: </b>Re: [ClusterLabs] controlling cluster behavior on startup<u></u><u></u></span></p>

</div>

<div>

<p class="MsoNormal" style="margin-left:0.5in">>> However, now it seems to wait that amount of time before it elects a<br>

>> DC, even when quorum is acquired earlier.  In my log snippet below,<br>

>> with dc-deadtime 300s,<br>

><br>

> The dc-deadtime is not waiting for quorum, but for another DC to show<br>

> up. If all nodes show up, it can proceed, but otherwise it has to wait.<br>

<br>

I believe all the nodes showed up by 14:17:04, but it still waited until 14:19:26 to elect a DC:<br>

<br>

Jan 29 14:14:25 gopher12 pacemaker-controld  [123697] (peer_update_callback)    info: Cluster node gopher12 is now member  (was in unknown state)<br>

Jan 29 14:17:04 gopher12 pacemaker-controld  [123697] (peer_update_callback)    info: Cluster node gopher11 is now member  (was in unknown state)<br>

Jan 29 14:17:04 gopher12 pacemaker-controld  [123697] (quorum_notification_cb)  notice: Quorum acquired | membership=54 members=2<br>

Jan 29 14:19:26 gopher12 pacemaker-controld  [123697] (do_log)  info: Input I_ELECTION_DC received in state S_ELECTION from election_win_cb<br>

<br>

This is a cluster with 2 nodes, gopher11 and gopher12.<br>

<br>

Am I misreading that?<br>

<br>

thanks,<br>

Olaf<br>

<br>

________________________________________<br>

From: Ken Gaillot <<a href="mailto:kgaillot@redhat.com" target="_blank">kgaillot@redhat.com</a>><br>

Sent: Monday, January 29, 2024 3:49 PM<br>

To: Faaland, Olaf P.; Cluster Labs - All topics related to open-source clustering welcomed<br>

Subject: Re: [ClusterLabs] controlling cluster behavior on startup<br>

<br>

On Mon, 2024-01-29 at 22:48 +0000, Faaland, Olaf P. wrote:<br>

> Thank you, Ken.<br>

><br>

> I changed my configuration management system to put an initial<br>

> cib.xml into /var/lib/pacemaker/cib/, which sets all the property<br>

> values I was setting via pcs commands, including dc-deadtime.  I<br>

> removed those "pcs property set" commands from the ones that are run<br>

> at startup time.<br>

><br>

> That worked in the sense that after Pacemaker start, the node waits<br>

> my newly specified dc-deadtime of 300s before giving up on the<br>

> partner node and fencing it, if the partner never appears as a<br>

> member.<br>

><br>

> However, now it seems to wait that amount of time before it elects a<br>

> DC, even when quorum is acquired earlier.  In my log snippet below,<br>

> with dc-deadtime 300s,<br>

<br>

The dc-deadtime is not waiting for quorum, but for another DC to show<br>

up. If all nodes show up, it can proceed, but otherwise it has to wait.<br>

<br>

><br>

> 14:14:24 Pacemaker starts on gopher12<br>

> 14:17:04 quorum is acquired<br>

> 14:19:26 Election Trigger just popped (start time + dc-deadtime<br>

> seconds)<br>

> 14:19:26 gopher12 wins the election<br>

><br>

> Is there other configuration that needs to be present in the cib at<br>

> startup time?<br>

><br>

> thanks,<br>

> Olaf<br>

><br>

> === log extract using new system of installing partial cib.xml before<br>

> startup<br>

> Jan 29 14:14:24 gopher12 pacemakerd          [123690]<br>

> (main)    notice: Starting Pacemaker 2.1.7-1.t4 | build=2.1.7<br>

> features:agent-manpages ascii-docs compat-2.0 corosync-ge-2 default-<br>

> concurrent-fencing generated-manpages monotonic nagios ncurses remote<br>

> systemd<br>

> Jan 29 14:14:25 gopher12 pacemaker-attrd     [123695]<br>

> (attrd_start_election_if_needed)  info: Starting an election to<br>

> determine the writer<br>

> Jan 29 14:14:25 gopher12 pacemaker-attrd     [123695]<br>

> (election_check)  info: election-attrd won by local node<br>

> Jan 29 14:14:25 gopher12 pacemaker-controld  [123697]<br>

> (peer_update_callback)    info: Cluster node gopher12 is now member<br>

> (was in unknown state)<br>

> Jan 29 14:17:04 gopher12 pacemaker-controld  [123697]<br>

> (quorum_notification_cb)  notice: Quorum acquired | membership=54<br>

> members=2<br>

> Jan 29 14:19:26 gopher12 pacemaker-controld  [123697]<br>

> (crm_timer_popped)        info: Election Trigger just popped |<br>

> input=I_DC_TIMEOUT time=300000ms<br>

> Jan 29 14:19:26 gopher12 pacemaker-controld  [123697]<br>

> (do_log)  warning: Input I_DC_TIMEOUT received in state S_PENDING<br>

> from crm_timer_popped<br>

> Jan 29 14:19:26 gopher12 pacemaker-controld  [123697]<br>

> (do_state_transition)     info: State transition S_PENDING -><br>

> S_ELECTION | input=I_DC_TIMEOUT cause=C_TIMER_POPPED<br>

> origin=crm_timer_popped<br>

> Jan 29 14:19:26 gopher12 pacemaker-controld  [123697]<br>

> (election_check)  info: election-DC won by local node<br>

> Jan 29 14:19:26 gopher12 pacemaker-controld  [123697] (do_log)  info:<br>

> Input I_ELECTION_DC received in state S_ELECTION from election_win_cb<br>

> Jan 29 14:19:26 gopher12 pacemaker-controld  [123697]<br>

> (do_state_transition)     notice: State transition S_ELECTION -><br>

> S_INTEGRATION | input=I_ELECTION_DC cause=C_FSA_INTERNAL<br>

> origin=election_win_cb<br>

> Jan 29 14:19:26 gopher12 pacemaker-schedulerd[123696]<br>

> (recurring_op_for_active)         info: Start 10s-interval monitor<br>

> for gopher11_zpool on gopher11<br>

> Jan 29 14:19:26 gopher12 pacemaker-schedulerd[123696]<br>

> (recurring_op_for_active)         info: Start 10s-interval monitor<br>

> for gopher12_zpool on gopher12<br>

><br>

><br>

> === initial cib.xml contents<br>

> <cib crm_feature_set="3.19.0" validate-with="pacemaker-3.9" epoch="9"<br>

> num_updates="0" admin_epoch="0" cib-last-written="Mon Jan 29 11:07:06<br>

> 2024" update-origin="gopher12" update-client="root" update-<br>

> user="root" have-quorum="0" dc-uuid="2"><br>

>   <configuration><br>

>     <crm_config><br>

>       <cluster_property_set id="cib-bootstrap-options"><br>

>         <nvpair id="cib-bootstrap-options-stonith-action"<br>

> name="stonith-action" value="off"/><br>

>         <nvpair id="cib-bootstrap-options-have-watchdog" name="have-<br>

> watchdog" value="false"/><br>

>         <nvpair id="cib-bootstrap-options-dc-version" name="dc-<br>

> version" value="2.1.7-1.t4-2.1.7"/><br>

>         <nvpair id="cib-bootstrap-options-cluster-infrastructure"<br>

> name="cluster-infrastructure" value="corosync"/><br>

>         <nvpair id="cib-bootstrap-options-cluster-name"<br>

> name="cluster-name" value="gopher11"/><br>

>         <nvpair id="cib-bootstrap-options-cluster-recheck-inte"<br>

> name="cluster-recheck-interval" value="60"/><br>

>         <nvpair id="cib-bootstrap-options-start-failure-is-fat"<br>

> name="start-failure-is-fatal" value="false"/><br>

>         <nvpair id="cib-bootstrap-options-dc-deadtime" name="dc-<br>

> deadtime" value="300"/><br>

>       </cluster_property_set><br>

>     </crm_config><br>

>     <nodes><br>

>       <node id="1" uname="gopher11"/><br>

>       <node id="2" uname="gopher12"/><br>

>     </nodes><br>

>     <resources/><br>

>     <constraints/><br>

>   </configuration><br>

> </cib><br>

><br>

> ________________________________________<br>

> From: Ken Gaillot <<a href="mailto:kgaillot@redhat.com" target="_blank">kgaillot@redhat.com</a>><br>

> Sent: Monday, January 29, 2024 10:51 AM<br>

> To: Cluster Labs - All topics related to open-source clustering<br>

> welcomed<br>

> Cc: Faaland, Olaf P.<br>

> Subject: Re: [ClusterLabs] controlling cluster behavior on startup<br>

><br>

> On Mon, 2024-01-29 at 18:05 +0000, Faaland, Olaf P. via Users wrote:<br>

> > Hi,<br>

> ><br>

> > I have configured clusters of node pairs, so each cluster has 2<br>

> > nodes.  The cluster members are statically defined in corosync.conf<br>

> > before corosync or pacemaker is started, and quorum {two_node: 1}<br>

> > is<br>

> > set.<br>

> ><br>

> > When both nodes are powered off and I power them on, they do not<br>

> > start pacemaker at exactly the same time.  The time difference may<br>

> > be<br>

> > a few minutes depending on other factors outside the nodes.<br>

> ><br>

> > My goals are (I call the first node to start pacemaker "node1"):<br>

> > 1) I want to control how long pacemaker on node1 waits before<br>

> > fencing<br>

> > node2 if node2 does not start pacemaker.<br>

> > 2) If node1 is part-way through that waiting period, and node2<br>

> > starts<br>

> > pacemaker so they detect each other, I would like them to proceed<br>

> > immediately to probing resource state and starting resources which<br>

> > are down, not wait until the end of that "grace period".<br>

> ><br>

> > It looks from the documentation like dc-deadtime is how #1 is<br>

> > controlled, and #2 is expected normal behavior.  However, I'm<br>

> > seeing<br>

> > fence actions before dc-deadtime has passed.<br>

> ><br>

> > Am I misunderstanding Pacemaker's expected behavior and/or how dc-<br>

> > deadtime should be used?<br>

><br>

> You have everything right. The problem is that you're starting with<br>

> an<br>

> empty configuration every time, so the default dc-deadtime is being<br>

> used for the first election (before you can set the desired value).<br>

><br>

> I can't think of anything you can do to get around that, since the<br>

> controller starts the timer as soon as it starts up. Would it be<br>

> possible to bake an initial configuration into the PXE image?<br>

><br>

> When the timer value changes, we could stop the existing timer and<br>

> restart it. There's a risk that some external automation could make<br>

> repeated changes to the timeout, thus never letting it expire, but<br>

> that<br>

> seems preferable to your problem. I've created an issue for that:<br>

><br>

><br>

> <a href="https://projects.clusterlabs.org/T764" target="_blank">https://projects.clusterlabs.org/T764</a>

<br>

><br>

> BTW there's also election-timeout. I'm not sure offhand how that<br>

> interacts; it might be necessary to raise that one as well.<br>

><br>

> > One possibly unusual aspect of this cluster is that these two nodes<br>

> > are stateless - they PXE boot from an image on another server - and<br>

> > I<br>

> > build the cluster configuration at boot time with a series of pcs<br>

> > commands, because the nodes have no local storage for this<br>

> > purpose.  The commands are:<br>

> ><br>

> > ['pcs', 'cluster', 'start']<br>

> > ['pcs', 'property', 'set', 'stonith-action=off']<br>

> > ['pcs', 'property', 'set', 'cluster-recheck-interval=60']<br>

> > ['pcs', 'property', 'set', 'start-failure-is-fatal=false']<br>

> > ['pcs', 'property', 'set', 'dc-deadtime=300']<br>

> > ['pcs', 'stonith', 'create', 'fence_gopher11', 'fence_powerman',<br>

> > 'ip=192.168.64.65', 'pcmk_host_check=static-list',<br>

> > 'pcmk_host_list=gopher11,gopher12']<br>

> > ['pcs', 'stonith', 'create', 'fence_gopher12', 'fence_powerman',<br>

> > 'ip=192.168.64.65', 'pcmk_host_check=static-list',<br>

> > 'pcmk_host_list=gopher11,gopher12']<br>

> > ['pcs', 'resource', 'create', 'gopher11_zpool', 'ocf:llnl:zpool',<br>

> > 'import_options="-f -N -d /dev/disk/by-vdev"', 'pool=gopher11',<br>

> > 'op',<br>

> > 'start', 'timeout=805']<br>

> > ...<br>

> > ['pcs', 'property', 'set', 'no-quorum-policy=ignore']<br>

><br>

> BTW you don't need to change no-quorum-policy when you're using<br>

> two_node with Corosync.<br>

><br>

> > I could, instead, generate a CIB so that when Pacemaker is started,<br>

> > it has a full config.  Is that better?<br>

> ><br>

> > thanks,<br>

> > Olaf<br>

> ><br>

> > === corosync.conf:<br>

> > totem {<br>

> >     version: 2<br>

> >     cluster_name: gopher11<br>

> >     secauth: off<br>

> >     transport: udpu<br>

> > }<br>

> > nodelist {<br>

> >     node {<br>

> >         ring0_addr: gopher11<br>

> >         name: gopher11<br>

> >         nodeid: 1<br>

> >     }<br>

> >     node {<br>

> >         ring0_addr: gopher12<br>

> >         name: gopher12<br>

> >         nodeid: 2<br>

> >     }<br>

> > }<br>

> > quorum {<br>

> >     provider: corosync_votequorum<br>

> >     two_node: 1<br>

> > }<br>

> ><br>

> > === Log excerpt<br>

> ><br>

> > Here's an excerpt from Pacemaker logs that reflect what I'm<br>

> > seeing.  These are from gopher12, the node that came up first.  The<br>

> > other node, which is not yet up, is gopher11.<br>

> ><br>

> > Jan 25 17:55:38 gopher12 pacemakerd          [116033]<br>

> > (main)    notice: Starting Pacemaker 2.1.7-1.t4 | build=2.1.7<br>

> > features:agent-manpages ascii-docs compat-2.0 corosync-ge-2<br>

> > default-<br>

> > concurrent-fencing generated-manpages monotonic nagios ncurses<br>

> > remote<br>

> > systemd<br>

> > Jan 25 17:55:39 gopher12 pacemaker-controld  [116040]<br>

> > (peer_update_callback)    info: Cluster node gopher12 is now member<br>

> > (was in unknown state)<br>

> > Jan 25 17:55:43 gopher12 pacemaker-based     [116035]<br>

> > (cib_perform_op)  info: ++<br>

> > /cib/configuration/crm_config/cluster_property_set[@id='cib-<br>

> > bootstrap-options']:  <nvpair id="cib-bootstrap-options-dc-<br>

> > deadtime"<br>

> > name="dc-deadtime" value="300"/><br>

> > Jan 25 17:56:00 gopher12 pacemaker-controld  [116040]<br>

> > (crm_timer_popped)        info: Election Trigger just popped |<br>

> > input=I_DC_TIMEOUT time=300000ms<br>

> > Jan 25 17:56:01 gopher12 pacemaker-based     [116035]<br>

> > (cib_perform_op)  info: ++<br>

> > /cib/configuration/crm_config/cluster_property_set[@id='cib-<br>

> > bootstrap-options']:  <nvpair id="cib-bootstrap-options-no-quorum-<br>

> > policy" name="no-quorum-policy" value="ignore"/><br>

> > Jan 25 17:56:01 gopher12 pacemaker-controld  [116040]<br>

> > (abort_transition_graph)  info: Transition 0 aborted by cib-<br>

> > bootstrap-options-no-quorum-policy doing create no-quorum-<br>

> > policy=ignore: Configuration change | cib=0.26.0<br>

> > source=te_update_diff_v2:464<br>

> > path=/cib/configuration/crm_config/cluster_property_set[@id='cib-<br>

> > bootstrap-options'] complete=true<br>

> > Jan 25 17:56:01 gopher12 pacemaker-controld  [116040]<br>

> > (controld_execute_fence_action)   notice: Requesting fencing (off)<br>

> > targeting node gopher11 | action=11 timeout=60<br>

> ><br>

> ><br>

> > _______________________________________________<br>

> > Manage your subscription:<br>

> > <a href="https://lists.clusterlabs.org/mailman/listinfo/users" target="_blank">https://lists.clusterlabs.org/mailman/listinfo/users</a>

<br>

> ><br>

> > ClusterLabs home:<br>

> > <a href="https://www.clusterlabs.org/" target="_blank">https://www.clusterlabs.org/</a>

<br>

> ><br>

> --<br>

> Ken Gaillot <<a href="mailto:kgaillot@redhat.com" target="_blank">kgaillot@redhat.com</a>><br>

><br>

--<br>

Ken Gaillot <<a href="mailto:kgaillot@redhat.com" target="_blank">kgaillot@redhat.com</a>><br>

<br>


</p>
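<p class="MsoNormal" style="margin-left:0.5in">As a rough check of the timeline quoted above, the timestamps from the log extract can be subtracted directly. The sketch below assumes the three times and the 300000ms timer value exactly as they appear in the log; the variable names are only illustrative.</p>

<pre>
```python
from datetime import datetime, timedelta

FMT = "%H:%M:%S"

# Values copied from the quoted log extract (Jan 29, node gopher12).
pacemaker_start = datetime.strptime("14:14:24", FMT)   # pacemakerd "Starting Pacemaker"
quorum_acquired = datetime.strptime("14:17:04", FMT)   # "Quorum acquired"
trigger_popped = datetime.strptime("14:19:26", FMT)    # "Election Trigger just popped"
dc_deadtime = timedelta(milliseconds=300000)           # time=300000ms in the log

# When the election timer must have been started, working backwards.
timer_started = trigger_popped - dc_deadtime

# How long the node kept waiting after quorum was already acquired.
wait_after_quorum = trigger_popped - quorum_acquired

print("timer started:", timer_started.strftime(FMT))
print("seconds waited after quorum:", int(wait_after_quorum.total_seconds()))
```
</pre>

<p class="MsoNormal" style="margin-left:0.5in">This puts the timer start at 14:14:26, about two seconds after pacemakerd started, and shows the node waited a further 142 seconds after quorum was acquired before the election trigger popped.</p>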

</div>

</div>

</div>



</div></blockquote></div></div>