[ClusterLabs] Quorum when reducing cluster from 3 nodes to 2 nodes

Hayden,Robert RHAYDEN at CERNER.COM
Mon May 31 14:55:13 EDT 2021



> -----Original Message-----
> From: Users <users-bounces at clusterlabs.org> On Behalf Of Tomas Jelinek
> Sent: Monday, May 31, 2021 6:29 AM
> To: users at clusterlabs.org
> Subject: Re: [ClusterLabs] Quorum when reducing cluster from 3 nodes to 2
> nodes
>
> Hi Robert,
>
> Can you share your /etc/corosync/corosync.conf file? Also check if it's
> the same on all nodes.
>

I verified that the corosync.conf file is the same across the nodes. As part of the troubleshooting, I manually ran "crm_node --remove=app3 --force" intending to remove the third node from the corosync configuration. My concern is why the quorum value did not automatically drop to 1, especially since we run with the "last_man_standing" flag. I suspect the issue lies in the two-node special case; that is, if I were removing a node from a cluster of four or more nodes, I would not have had this issue.
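
For reference, the kind of checks I have been using to compare the on-disk quorum settings with what the running corosync actually loaded look roughly like the following. This is only a sketch from my side; the cmap key names and grep patterns are what I see on corosync 2.x and may differ on other versions.

# corosync-cmapctl | grep -E '^(quorum|runtime\.votequorum)'   # quorum keys the running daemon has loaded
# corosync-quorumtool -s                                       # runtime votequorum view (same data as pcs quorum status)
# grep -A 5 '^quorum' /etc/corosync/corosync.conf              # on-disk settings for comparison

If the runtime keys do not show two_node while the file does, that would point at the running daemon never re-reading the changed configuration rather than at the configuration itself.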

Here is the information you requested, slightly redacted for security.

root:@app1:/root
#20:45:02 # cat /etc/corosync/corosync.conf
totem {
    version: 2
    cluster_name: XXXX_app_2
    secauth: off
    transport: udpu
    token: 61000
}

nodelist {
    node {
        ring0_addr: app1
        nodeid: 1
    }

    node {
        ring0_addr: app2
        nodeid: 3
    }
}

quorum {
    provider: corosync_votequorum
    wait_for_all: 1
    last_man_standing: 1
    two_node: 1
}

logging {
    to_logfile: yes
    logfile: /var/log/cluster/corosync.log
    to_syslog: yes
}
root:@app1:/root
#20:45:12 # ssh app2 md5sum /etc/corosync/corosync.conf
d69b80cd821ff44224b56ae71c5d731c  /etc/corosync/corosync.conf
root:@app1:/root
#20:45:30 # md5sum /etc/corosync/corosync.conf
d69b80cd821ff44224b56ae71c5d731c  /etc/corosync/corosync.conf
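
Since the corosync.conf above already lists only the two remaining nodes and has two_node: 1, my working theory is that the running corosync never re-read the updated file, which would explain why the runtime flags still show only WaitForAll and LastManStanding. Assuming that is the case, something along these lines should push the change without a full outage; I have not run it here yet, and whether two_node can be picked up on a reload (rather than a restart of corosync) may depend on the corosync version:

# pcs cluster sync                 # make sure corosync.conf is identical on all nodes
# pcs cluster reload corosync      # ask the running corosync to re-read its configuration
# pcs quorum status                # check whether Flags picks up 2Node and Quorum drops to 1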

Thanks
Robert

> Dne 26. 05. 21 v 17:48 Hayden,Robert napsal(a):
> > I had a SysAdmin reduce the number of nodes in an OL 7.9 cluster from
> > three nodes to two nodes.
> >
> > From internal testing, I found the following commands would work and
> > the 2Node attribute would be automatically added.  The other cluster
> > parameters we use are WaitForAll and LastManStanding.
> >
> > pcs resource disable res_app03
> > pcs resource delete res_app03
> > pcs cluster node remove app03
> > pcs stonith delete fence_app03
> >
> > Unfortunately, the real world didn't go as planned. I am unsure whether
> > the commands were run out of order or something else was going on (e.g.
> > unexpected location constraints). When I got involved, I noticed that
> > pcs status showed the app3 node in an OFFLINE state, even though the
> > pcs cluster node remove app03 command had been successful. I also noticed
> > some leftover location constraints from past "moves" of resources. I
> > manually removed those constraints and ended up removing the app03 node
> > from the corosync configuration with the "crm_node --remove=app3 --force"
> > command.
>
> This removes the node from pacemaker config, not from corosync config.
>
> Regards,
> Tomas
>
> >     Now pcs status no longer shows any information for app3 and crm_node
> > -l does not show app3.
> >
> > My concern is with quorum. From the pcs quorum status output below, I
> > still see Quorum set at 2 (I expected 1) and the 2Node attribute was
> > not added. Am I stuck in this state until the next full cluster
> > downtime, or is there a way to manipulate the expected quorum votes in
> > the running cluster?
> >
> > #17:25:08 # pcs quorum status
> >
> > Quorum information
> > ------------------
> > Date:             Wed May 26 17:25:16 2021
> > Quorum provider:  corosync_votequorum
> > Nodes:            2
> > Node ID:          3
> > Ring ID:          1/85
> > Quorate:          Yes
> >
> > Votequorum information
> > ----------------------
> > Expected votes:   2
> > Highest expected: 2
> > Total votes:      2
> > Quorum:           2
> > Flags:            Quorate WaitForAll LastManStanding
> >
> > Membership information
> > ----------------------
> >      Nodeid      Votes    Qdevice Name
> >           1          1         NR app1
> >           3          1         NR app2 (local)
> >

