MySQL高可用方案设计与测试(上篇)

阿凡达2018-07-13 09:21

一 测试内容

基于keepalived设计MySQL高可用方案,测试方案的可行性。

方案目标

1.实现在以下两层上检查状态,服务不可用时切换

a)网络不可用或服务器宕机。

b)mysql服务不可用。

 

2.原主恢复后不自动升级为master,防止因网络问题导致频繁切换

 

3.保障keepalived切换时,从机可用(复制延迟在可接受范围),否则不提供服务。

方案设计

本方案实现双机热备(HA)功能。基本设计思路:通过设置 keepalived ,使两个mysql节点属于一个集群(共享一个VIPVRIDVIP绑定在主机的网卡上)。外部通过VIP访问mysql,当前主机不可用时,从机会抢占VIP。因此外部可以通过VIP来访问mysql以实现高可用。网络层的服务不可用的判定与切换由keepalived实现(基于VRRP协议)。

应用层的服务不可用的判定由自定义脚本实现。判定服务不可用后通过关闭本节点keepalived,释放VIP触发重新选主。此外,还需实现原主恢复后不自动升级为master功能。keepalived自带nopreempt参数实现不抢占功能,但当新主服务再挂掉后由于原主带nopreempt参数,即使原主优先级高仍无法完成切换。故现在结合自定义脚本实现类似功能,备机节点脚本只有当自身mysql可用且主机mysql不可用时才触发切换。

MySql复制有可能出现延迟、复制出错等情况,造成从机MySql不可用,因此在切换时首先判定从机MySql复制状态。初始时从机设成readonly,当从机要升级成主时,先判断复制状态、延迟时间(延迟时间由平时运行的check_run脚本收集),符合条件后再关闭readonly。如果从不符合升级条件就直接把从的vip也关闭,整个服务设为不可用。

方案实施:

安装keepalived

1.在主备两台服务器上安装keepalived

2.设置keepalived的配置文件、添加相应的监控与切换检测脚本。

3.设置两台服务器上mysql主主复制。

配置keepalived.conf

node-1keepalived.conf配置:

! Configuration File for keepalived 

 

         global_defs { 

     router_id MySQL-ha 

     }     

         vrrp_script check_run {

         script "/home/ddb/keep_check/keepalived_check_mysql.sh"

         interval 10  # check every 10 seconds

         weight -10   # if failed, decrease 10 of the priority

    fall   3     # require 2 failures for failures

    rise   1     # require 1 sucesses for ok

         }

    vrrp_instance VI_1 {  

     state BACKUP    #两台配置此处均是BACKUP 

     interface eth1 

     virtual_router_id 51  

     priority 100   #优先级

     advert_int 1  

          notify_backup "/home/ddb/keep_check/keepalived_to_backup.sh"

          notify_master "/home/ddb/keep_check/keepalived_to_master.sh"

     nopreempt  #不抢占,只在优先级高的机器上设置

     authentication { 

     auth_type PASS 

     auth_pass 1111 

     }  

         track_script {

      check_run

         }

 

node-2keepalived.conf配置

 ! Configuration File for keepalived  

         global_defs { 

     router_id MySQL-ha 

     }  

        vrrp_script check_run {

         script "/home/ddb/keep_check/keepalived_check_mysql.sh"

         interval 10  # check every 10 seconds

         weight -10   # if failed, decrease 40 of the priority

    fall   3     # require 2 failures for failures

    rise   1     # require 1 sucesses for ok

         }

         vrrp_instance VI_1 { 

     state BACKUP    

     interface eth1

     virtual_router_id 51 

     priority 50  

     advert_int 1 

          notify_backup "/home/ddb/keep_check/keepalived_to_backup.sh"

          notify_master "/home/ddb/keep_check/keepalived_to_master.sh"

    authentication { 

     auth_type PASS 

     auth_pass 1111 

     } 

        track_script {

      check_run

         }

     virtual_ipaddress { 

     10.130.10.43/23

     } 

     }  

node-1/node-2 mysql基本配置

[mysqld]

bind-address=0.0.0.0

port=5331

mysql状态检测脚本

vi /home/pw/mysql/keepalived_check_mysql.sh

chmod +x /home/pw/mysql/keepalived_check_mysql.sh

keepalived_check_mysql.sh

#!/bin/bash

#监测mysql状态,主mysql不可用时才触发切换。监测当前主从复制状态,如果复制失败输出标志到本地,供keepalived_to_master脚本使用

VIP=10.130.10.43

MYSQL=/home/ddb/mysql/xtradb-5.1.47_netease/bin/mysql

MYSQL_HOST=127.0.0.1

MYSQL_PORT=5331

MYSQL_USER=root

MYSQL_PASSWORD=123

PDIR=(dirname0)

Seconds_Behind=30       #复制延迟时间报警阀值

Email_interval=30       #发送报警的时间间隔

echo>/tmp/mysql_status.txt

MYSQLhMYSQL_HOST -PMYSQLPORTuMYSQL_USER -p${MYSQL_PASSWORD} -e "show slave status\G"  >/tmp/mysql_status.txt 2>&1   

MYSQL_STATUS=$?

#master:0 backup:1

ip a|grep $VIP

IS_MASTER=$?

#判断mysql实例是否可用,不可用时释放VIP,确定VIP被别人抢占后再重新开启keepalived

if [ IS_MASTER
-eq 0 ] && [
MYSQL_STATUS -eq 1 ] ;then

        sudo /etc/init.d/keepalived stop

        while ip a|grep $VIP ; do

        sleep 1

        done

        echo "1">>$PDIR/keepalived.log

        if ping $VIP -w 30 >/dev/null ;then

                sudo /etc/init.d/keepalived start

        fi

        echo "2">>$PDIR/keepalived.log

        msg="[`date +\"%Y-%m-%d %T \"`] MySql on `hostname -i`:$MYSQL_PORT HALT ! "

        echo msg>>PDIR/keepalived.log

        cd PDIR && python sendmail.py
"
msg"

        cd PDIR && python message.py
"
msg"

        exit 0

#判断复制是否可用,供keepalived_to_master脚本使用

isOK=1

while read line ; do

        if echo $line|grep -i 'Seconds_Behind_Master' ;then

                status=(echoline|cut -d':' -f2)

                if ! echo status|grep
"NULL" &&  [
status -gt $Seconds_Behind ] ; then

                        isOK=0

                fi

        fi

        if echo $line|grep -i 'Slave_SQL_Running' ;then

                status=(echoline|cut -d':' -f2)

                if echo $status|grep "No"; then

                        isOK=0 

                fi

        fi

  if echo $line|grep -i 'Master_Host' ;then

                Master_Host=(echoline|cut -d':' -f2)

        fi

        if echo $line|grep -i 'Master_Port' ;then

                Master_Port=(echoline|cut -d':' -f2)

        fi

        if echo $line|grep -i 'Empty set' ;then

                isOK=0

        fi

done</tmp/mysql_status.txt

 

if [ $isOK -eq 0 ] ; then

        if [ -f  PDIR/repl ] && grep "fail"PDIR/repl ; then

                lasttime=(lslPDIR/repl --time-|awk '{print $6}')

                curtime=$(date "+%s")

                cur_interval=`expr curtimelasttime`

                echo curinterval>>PDIR/keepalived.log

                if [ curintervalltEmail_interval ] ;then

                        exit 0

                fi

        fi

        echo "fail">$PDIR/repl

        msg="[`date +\"%Y-%m-%d %T \"`] MySql on `hostname -i`:MYSQLPORT==>MasterHost:Master_Port replication failed"

        echo msg>>PDIR/keepalived.log

        cd PDIR && python sendmail.py
"
msg"

else

相关阅读:MySQL高可用方案设计与测试(下篇)

本文来自网易实践者社区,经作者潘威授权发布。