[ClusterLabs] Problem with high load (IO)

Ken Gaillot kgaillot at redhat.com
Mon Sep 27 13:41:25 EDT 2021


On Mon, 2021-09-27 at 17:12 +0000, Strahil Nikolov wrote:
> Hey Ken,
> 
> how should someone set the maintenance via pcs?
> 
> Best Regards,
> Strahil Nikolov

pcs only supports rules for operation defaults, resource defaults, and
location rules (and implicitly for move/ban lifetimes).
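
Since the maintenance-mode cluster property can't get a rule through
pcs, one workaround is to toggle it from the job itself. A minimal
sketch, assuming a wrapper around the nightly copy; the function name
with_maintenance and the PCS variable are made up for illustration,
only "pcs property set maintenance-mode=..." is the real command:

```shell
#!/bin/sh
# Hypothetical wrapper; PCS and with_maintenance are illustrative names.
# Set PCS="echo pcs" to print the commands instead of running them.
PCS="${PCS:-pcs}"

with_maintenance() {
    # Put the whole cluster into maintenance mode; give up if that fails.
    $PCS property set maintenance-mode=true || return 1
    # Run the wrapped command (e.g. the nightly copy job).
    "$@"
    rc=$?
    # Always leave maintenance mode again, even if the job failed.
    $PCS property set maintenance-mode=false
    # Report the exit status of the wrapped command.
    return "$rc"
}
```

Called as "with_maintenance <your copy script>", this keeps the window
exactly as long as the job runs. Keep in mind that the cluster does not
respond to anything while maintenance mode is on.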

So, you could lengthen the default timeout for all operations that
don't have explicit timeouts configured, between the hours of 9 and 11
p.m., with

pcs resource op defaults set create score=INFINITY \
   meta timeout=30m rule date-spec hours=21-23
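
As far as I understand the date-spec semantics, the range is compared
against the hour field and is inclusive at both ends, so 21-23 keeps
matching until 23:59:59. A small sketch that models only that
comparison (it is not Pacemaker code, and hour_in_range is a made-up
helper name):

```shell
# Models the inclusive hour comparison only; not how Pacemaker is implemented.
hour_in_range() {
    hour=$1 first=$2 last=$3
    # True when first <= hour <= last.
    [ "$hour" -ge "$first" ] && [ "$hour" -le "$last" ]
}
```

For example, hour_in_range "$(date +%H)" 21 23 succeeds at 23:30 and
fails at 20:59. If the window should end at 11 p.m. sharp, hours=21-22
would be the range to use.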

If you want to lengthen the timeout for an operation that already has
an explicit timeout, you'll have to use "pcs cluster edit" or cibadmin
with the raw XML.

> > On Mon, Sep 27, 2021 at 19:56, Ken Gaillot
> > <kgaillot at redhat.com> wrote:
> > On Mon, 2021-09-27 at 12:37 +0200, Lentes, Bernd wrote:
> > > Hi,
> > > 
> > > I have a two-node cluster running on SLES 12SP5 with two HP
> > > servers and a common FC SAN.
> > > Most of my resources are virtual domains offering databases and
> > > web pages.
> > > The disks of the domains reside on an OCFS2 volume on an FC SAN.
> > > Each night at 9pm all domains are snapshotted with the OCFS2 tool
> > > reflink.
> > > After the snapshot is created, the disks of the domains are copied
> > > to a NAS while the domains are still running.
> > > The copy procedure uses CPU and IO intensively: the copy occupies
> > > about 90% of IO, and the CPU sometimes has an I/O wait of about
> > > 50%.
> > > Because of that the domains aren't responsive, so the monitor
> > > operation from the RA sometimes fails.
> > > In the worst case a domain is fenced.
> > > What would you do in such a situation?
> > > I'm thinking of making the cp procedure nicer, with nice. Maybe
> > > about 10.
> > > 
> > > More ideas ?
> > > 
> > > 
> > > Bernd
> > 
> > This is a classic use case for rules:
> > 
> > https://clusterlabs.org/pacemaker/doc/2.1/Pacemaker_Explained/singlehtml/index.html#using-rules-to-control-cluster-options
> > 
> > You can put the cluster into maintenance mode for the window, or
> > disable the monitor for the window. Of course that also disables any
> > cluster response. You could instead lengthen operation timeouts
> > during the window.
> > -- 
> > Ken Gaillot <kgaillot at redhat.com>
-- 
Ken Gaillot <kgaillot at redhat.com>


