[ClusterLabs] DRBD + VDO HowTo?

Eric Robinson eric.robinson at psmnv.com
Mon May 17 14:28:19 EDT 2021


Andrei --

Sorry for the novels. Sometimes it is hard to tell whether people want all the configs, logs, and scripts first, or if they want a description of the problem and what one is trying to accomplish first. I'll send whatever you want. I am very eager to get to the bottom of this.

I'll start with my custom LSB RA. I can send the Pacemaker config a bit later.

[root@ha09a init.d]# ll|grep vdo
lrwxrwxrwx. 1 root root     9 May 16 10:28 vdo0 -> vdo_multi
lrwxrwxrwx. 1 root root     9 May 16 10:28 vdo1 -> vdo_multi
-rwx------. 1 root root  3623 May 16 13:21 vdo_multi

[root@ha09a init.d]# cat vdo_multi
#!/bin/bash

#--custom script for managing vdo volumes

#--functions
function isActivated() {
        R=$(/usr/bin/vdo status -n $VOL 2>&1)
        if [ $? -ne 0 ]; then
                #--error occurred checking vdo status
                echo "$VOL: an error occurred checking activation status on $MY_HOSTNAME"
                return 1
        fi
        R=$(/usr/bin/vdo status -n $VOL|grep Activate|awk '{$1=$1};1'|cut -d" " -f2)
        echo "$R"
        return 0
}

function isOnline() {
        R=$(/usr/bin/vdo status -n $VOL 2>&1)
        if [ $? -ne 0 ]; then
                #--error occurred checking vdo status
                echo "$VOL: an error occurred checking online status on $MY_HOSTNAME"
                return 1
        fi
        R=$(/usr/bin/vdo status -n $VOL|grep "Index status"|awk '{$1=$1};1'|cut -d" " -f3)
        echo "$R"
        return 0
}

#--vars
MY_HOSTNAME=$(hostname -s)

#--get the volume name
VOL=$(basename "$0")

#--get the action
ACTION=$1

#--take the requested action
case $ACTION in

        start)

                #--check current status
                R=$(isOnline "$VOL")
                if [ $? -ne 0 ]; then
                        echo "error occurred checking $VOL status on $MY_HOSTNAME"
                        exit 1 #--lsb: generic or unspecified error
                fi
                if [ "$R"  == "online" ]; then
                        echo "running on $MY_HOSTNAME"
                        exit 0 #--lsb: success
                fi

                #--enter activation loop
                ACTIVATED=no
                TIMER=15
                while [ $TIMER -ge 0 ]; do
                        R=$(isActivated "$VOL")
                        if [ "$R" == "enabled" ]; then
                                ACTIVATED=yes
                                break
                        fi
                        sleep 1
                        TIMER=$(( TIMER-1 ))
                done
                if [ "$ACTIVATED" == "no" ]; then
                        echo "$VOL: not activated on $MY_HOSTNAME"
                        exit 1 #--lsb: generic or unspecified error (5 would mean "program is not installed")
                fi

                #--enter start loop
                /usr/bin/vdo start -n $VOL
                ONLINE=no
                TIMER=15
                while [ $TIMER -ge 0 ]; do
                        R=$(isOnline "$VOL")
                        if [ "$R" == "online" ]; then
                                ONLINE=yes
                                break
                        fi
                        sleep 1
                        TIMER=$(( TIMER-1 ))
                done
                if [ "$ONLINE" == "yes" ]; then
                        echo "$VOL: started on $MY_HOSTNAME"
                        exit 0 #--lsb: success
                else
                        echo "$VOL: not started on $MY_HOSTNAME (unknown problem)"
                        exit 1 #--lsb: generic or unspecified error
                fi
                ;;
        stop)

                #--check current status
                R=$(isOnline "$VOL")
                if [ $? -ne 0 ]; then
                        echo "error occurred checking $VOL status on $MY_HOSTNAME"
                        exit 1 #--lsb: generic or unspecified error
                fi

                if [ "$R" == "not" ]; then
                        echo "not started on $MY_HOSTNAME"
                        exit 0 #--lsb: success
                fi

                #--enter stop loop
                /usr/bin/vdo stop -n $VOL
                ONLINE=yes
                TIMER=15
                while [ $TIMER -ge 0 ]; do
                        R=$(isOnline "$VOL")
                        if [ "$R" == "not" ]; then
                                ONLINE=no
                                break
                        fi
                        sleep 1
                        TIMER=$(( TIMER-1 ))
                done
                if [ "$ONLINE" == "no" ]; then
                        echo "$VOL: stopped on $MY_HOSTNAME"
                        exit 0 #--lsb:success
                else
                        echo "$VOL: failed to stop on $MY_HOSTNAME (unknown problem)"
                        exit 1 #--lsb: generic or unspecified error
                fi
                ;;
        status)
                R=$(isOnline "$VOL")
                if [ $? -ne 0 ]; then
                        echo "error occurred checking $VOL status on $MY_HOSTNAME"
                        exit 4 #--lsb: status unknown
                fi
                if [ "$R"  == "online" ]; then
                        echo "$VOL started on $MY_HOSTNAME"
                        exit 0 #--lsb: success
                else
                        echo "$VOL not started on $MY_HOSTNAME"
                        exit 3 #--lsb: not running
                fi
                ;;

esac
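
The start and stop branches above repeat the same poll-until-timeout loop three times. A minimal sketch of how that loop could be factored into one function follows. wait_for is a hypothetical helper name, not part of vdo or Pacemaker, and it assumes the polled command prints the state as a single word, the way isOnline and isActivated do.

```shell
#!/bin/bash

# wait_for STATE TIMER CMD [ARGS...]
# Hypothetical helper: run CMD once per second until it prints STATE.
# Like the loops in the script, it tries TIMER+1 times in total.
# Returns 0 as soon as the state matches, 1 if it never does.
wait_for() {
        local want=$1 timer=$2
        shift 2
        while [ "$timer" -ge 0 ]; do
                if [ "$("$@" 2>/dev/null)" == "$want" ]; then
                        return 0
                fi
                sleep 1
                timer=$(( timer - 1 ))
        done
        return 1
}
```

With that in place, the start branch could call something like `wait_for online 15 isOnline` after `vdo start`, and exit non-zero when the helper returns 1.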


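For reference, a sketch of the exit codes the LSB spec defines for init scripts, which is what Pacemaker relies on for lsb: resources. The status action should return 0 when the service is running and 3 when it is not, while start and stop should return 0 only on success. The function names below are hypothetical, and mapping an empty state to 4 (status unknown) is an assumption about how an unreadable vdo status might be treated.

```shell
#!/bin/bash

# Hypothetical helper: map the word printed by isOnline to an LSB
# exit code for the "status" action.
#   0 = running, 3 = not running, 4 = status unknown
status_exit_code() {
        case "$1" in
                online) echo 0 ;;
                "")     echo 4 ;;   # assumption: no state could be read
                *)      echo 3 ;;
        esac
}

# Hypothetical helper: exit code for "start" or "stop".
#   $1 = "yes" if the requested state was reached, anything else if not
#   0 = success, 1 = generic or unspecified error
action_exit_code() {
        if [ "$1" == "yes" ]; then
                echo 0
        else
                echo 1
        fi
}
```

The point of the mapping is that a node where the volume is simply not running reports 3 from status rather than an error code, and a start or stop that times out never reports 0.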

> -----Original Message-----
> From: Users <users-bounces at clusterlabs.org> On Behalf Of Andrei
> Borzenkov
> Sent: Monday, May 17, 2021 12:49 PM
> To: users at clusterlabs.org
> Subject: Re: [ClusterLabs] DRBD + VDO HowTo?
>
> On 17.05.2021 18:18, Eric Robinson wrote:
> > To Strahil and Klaus –
> >
> > I created the vdo devices using default parameters, so ‘auto’ mode was
> selected by default. vdostatus shows that the current mode is async. The
> underlying drbd devices are running protocol C, so I assume that vdo should
> be changed to sync mode?
> >
> > The VDO service is disabled and is solely under the control of Pacemaker,
> but I have been unable to get a resource agent to work reliably. I have two
> nodes. Under normal operation, Node A is primary for disk drbd0, and device
> vdo0 rides on top of that. Node B is primary for disk drbd1 and device vdo1
> rides on top of that. In the event of a node failure, the vdo device and the
> underlying drbd disk should migrate to the other node, and then that node
> will be primary for both drbd disks and both vdo devices.
> >
> > The default systemd vdo service does not work because it uses the --all flag
> and starts/stops all vdo devices. I noticed that there is also a vdo-start-by-
> dev.service, but there is no documentation on how to use it. I wrote my own
> vdo-by-dev system service, but that did not work reliably either. Then I
> noticed that there is already an OCF resource agent named vdo-vol, but that
> did not work either. I finally tried writing my own OCF-compliant RA, and
> then I tried writing an LSB-compliant script, but none of those worked very
> well.
> >
>
> You continue to write novels instead of simply showing your resource agent,
> your configuration and logs.
>
> > My big problem is that I don’t understand how Pacemaker uses the
> monitor action. Pacemaker would often fail vdo resources because the
> monitor action received an error when it ran on the standby node. For
> example, when Node A is primary for disk drbd1 and device vdo1, Pacemaker
> would fail device vdo1 because when it ran the monitor action on Node B,
> the RA reported an error. But OF COURSE it would report an error, because
> disk drbd1 is secondary on that node, and is therefore inaccessible to the vdo
> driver. I DON’T UNDERSTAND.
> >
>
> Maybe your definition of "error" does not match Pacemaker's definition of
> "error". It is hard to comment without seeing code.
>
> > -Eric
> >
> >
> >
> > From: Strahil Nikolov <hunter86_bg at yahoo.com>
> > Sent: Monday, May 17, 2021 5:09 AM
> > To: kwenning at redhat.com; Klaus Wenninger <kwenning at redhat.com>;
> > Cluster Labs - All topics related to open-source clustering welcomed
> > <users at clusterlabs.org>; Eric Robinson <eric.robinson at psmnv.com>
> > Subject: Re: [ClusterLabs] DRBD + VDO HowTo?
> >
> > Have you tried to set VDO in async mode?
> >
> > Best Regards,
> > Strahil Nikolov
> > On Mon, May 17, 2021 at 8:57, Klaus Wenninger
> > <kwenning at redhat.com> wrote:
> > Did you try VDO in sync-mode for the case the flush-fua stuff isn't
> > working through the layers?
> > Did you check that VDO-service is disabled and solely under
> > pacemaker-control and that the dependencies are set correctly?
> >
> > Klaus
> >
> > On 5/17/21 6:17 AM, Eric Robinson wrote:
> >
> > Yes, DRBD is working fine.
> >
> >
> >
> > From: Strahil Nikolov <hunter86_bg at yahoo.com>
> > Sent: Sunday, May 16, 2021 6:06 PM
> > To: Eric Robinson <eric.robinson at psmnv.com>; Cluster Labs - All
> > topics related to open-source clustering welcomed
> > <users at clusterlabs.org>
> > Subject: RE: [ClusterLabs] DRBD + VDO HowTo?
> >
> >
> >
> > Are you sure that DRBD is working properly?
> >
> >
> >
> > Best Regards,
> >
> > Strahil Nikolov
> >
> > On Mon, May 17, 2021 at 0:32, Eric Robinson
> > <eric.robinson at psmnv.com> wrote:
> >
> > Okay, it turns out I was wrong. I thought I had it working, but I keep running
> into problems. Sometimes when I demote a DRBD resource on Node A and
> promote it on Node B, and I try to mount the filesystem, the system
> complains that it cannot read the superblock. But when I move the DRBD
> primary back to Node A, the file system is mountable again. Also, I have
> problems with filesystems not mounting because the vdo devices are not
> present. All kinds of issues.
> >
> >
> >
> >
> >
> > From: Users <users-bounces at clusterlabs.org> On Behalf Of Eric Robinson
> > Sent: Friday, May 14, 2021 3:55 PM
> > To: Strahil Nikolov <hunter86_bg at yahoo.com>; Cluster Labs - All
> > topics related to open-source clustering welcomed
> > <users at clusterlabs.org>
> > Subject: Re: [ClusterLabs] DRBD + VDO HowTo?
> >
> >
> >
> >
> >
> > Okay, I have it working now. The default systemd service definitions did
> not work, so I created my own.
> >
> >
> >
> >
> >
> > From: Strahil Nikolov <hunter86_bg at yahoo.com>
> > Sent: Friday, May 14, 2021 3:41 AM
> > To: Eric Robinson <eric.robinson at psmnv.com>; Cluster Labs - All
> > topics related to open-source clustering welcomed
> > <users at clusterlabs.org>
> > Subject: RE: [ClusterLabs] DRBD + VDO HowTo?
> >
> >
> >
> > To my knowledge there is no VDO RA, but you can use a systemd
> > service as a resource.
> >
> >
> >
> > Yet, the VDO service that comes with the OS is a generic one and controls
> > all VDOs, so you need to create your own vdo service.
> >
> >
> >
> > Best Regards,
> >
> > Strahil Nikolov
> >
> > On Fri, May 14, 2021 at 6:55, Eric Robinson
> > <eric.robinson at psmnv.com> wrote:
> >
> > I created the VDO volumes fine on the drbd devices, formatted them as xfs
> > filesystems, created cluster filesystem resources, and the cluster is using
> > them. But the cluster won’t fail over. Is there a VDO cluster RA out there
> > somewhere already?
> >
> >
> >
> >
> >
> > From: Strahil Nikolov <hunter86_bg at yahoo.com>
> > Sent: Thursday, May 13, 2021 10:07 PM
> > To: Cluster Labs - All topics related to open-source clustering
> > welcomed <users at clusterlabs.org>; Eric Robinson
> > <eric.robinson at psmnv.com>
> > Subject: Re: [ClusterLabs] DRBD + VDO HowTo?
> >
> >
> >
> > For DRBD there is enough info, so let's focus on VDO.
> >
> > There is a systemd service that starts all VDOs on the system. You can
> > create the VDO once drbd is open for writes and then you can create your
> > own systemd '.service' file which can be used as a cluster resource.
> >
> >
> > Best Regards,
> >
> > Strahil Nikolov
> >
> >
> >
> > On Fri, May 14, 2021 at 2:33, Eric Robinson
> > <eric.robinson at psmnv.com> wrote:
> >
> > Can anyone point to a document on how to use VDO de-duplication with
> DRBD? Linbit has a blog page about it, but it was last updated 6 years ago and
> the embedded links are dead.
> >
> >
> >
> > https://linbit.com/blog/albireo-virtual-data-optimizer-vdo-on-drbd/
> >
> >
> >
> > -Eric
> >
> >
> >
> >
> >
> >
> >
> >
> >
> > Disclaimer : This email and any files transmitted with it are confidential and
> intended solely for intended recipients. If you are not the named addressee
> you should not disseminate, distribute, copy or alter this email. Any views or
> opinions presented in this email are solely those of the author and might not
> represent those of Physician Select Management. Warning: Although
> Physician Select Management has taken reasonable precautions to ensure
> no viruses are present in this email, the company cannot accept responsibility
> for any loss or damage arising from the use of this email or attachments.
> >
> > _______________________________________________
> > Manage your subscription:
> > https://lists.clusterlabs.org/mailman/listinfo/users
> >
> > ClusterLabs home: https://www.clusterlabs.org/

