mysql 高可用MHA

133 阅读9分钟

MHA Master High Availability

MHA工作原理和架构

  1. MHA利用 SELECT 1 As Value 指令判断master服务器的健康性,一旦master 宕机,MHA 从宕机崩溃的master保存二进制日志事件(binlog events)
  2. 识别含有最新更新的slave
  3. 应用差异的中继日志(relay log)到其他的slave
  4. 应用从master保存的二进制日志事件(binlog events)
  5. 提升一个slave为新的master
  6. 使其他的slave连接新的master进行复制

MHA软件

MHA软件由两部分组成,Manager工具包和Node工具包

Manager工具包主要包括以下几个工具:

masterha_check_ssh       检查MHA的SSH配置状况
masterha_check_repl     检查MySQL复制状况
masterha_manger         启动MHA
masterha_check_status   检测当前MHA运行状态
masterha_master_monitor 检测master是否宕机
masterha_master_switch 故障转移(自动或手动)
masterha_conf_host     添加或删除配置的server信息
masterha_stop  --conf=app1.cnf 停止MHA
masterha_secondary_check 两个或多个网络线路检查MySQL主服务器的可用

Node工具包:这些工具通常由MHA Manager的脚本触发,无需人为操作)主要包括以下几个工具:

save_binary_logs     #保存和复制master的二进制日志
apply_diff_relay_logs   #识别差异的中继日志事件并将其差异的事件应用于其他的slave
filter_mysqlbinlog   #去除不必要的ROLLBACK事件(MHA已不再使用此工具)
purge_relay_logs #清除中继日志(不会阻塞SQL线程)

MHA配置文件:

global配置,为各application提供默认配置,默认文件路径 /etc/masterha_default.cnf
application配置:为每个主从复制集群

实现mha

MHA 实验

机器作用
7-4 192.168.10.30MHA管理节点
7-1 192.168.10.1
7-2 192.168.10.10
7-3 192.168.10.20

准备文件

7-4 需要客户端和服务端,其余只需要客户端

1关闭防火墙selinux

systemctl disable --now firewalld
setenforce 0

2.主节点(7-4)安装 管理和客户端工具

[root@localhost opt]#yum install epel-release.noarch -y
#有依赖性用yum安装  需要先安装  epel源
​
[root@localhost data]#ls
mha4mysql-manager-0.58-0.el7.centos.noarch.rpm  mha4mysql-node-0.58-0.el7.centos.noarch.rpm
​
[root@localhost opt]#yum -y install mha4mysql-*.rpm

3.其余所有节点(7-1,7-2,7-3)安装客户端

[root@localhost opt]#yum install epel-release.noarch -y
[root@localhost data]# yum install  mha4mysql-node-0.58-0.el7.centos.noarch.rpm -y

4. 所有节点 基于key验证

[root@localhost data]#ssh-keygen 
[root@localhost data]#ssh-copy-id  127.0.0.1
#自己和自己连生成 秘钥
​
[root@localhost data]#cd
​
[root@localhost data]#rsync -a .ssh   192.168.10.10:/root/
[root@localhost data]#rsync -a .ssh   192.168.91.1:/root/
[root@localhost data]#rsync -a .ssh   192.168.91.20:/root/
#注意.ssh 后不能加/    -a  保留属性

5 主节点(7-4)建立mha文件夹和配置文件

[root@localhost ~]#mkdir /etc/mastermha
[root@localhost ~]#vim /etc/mastermha/app1.cnf
[server default]
user=mhauser
password=Admin@123
manager_workdir=/data/mastermha/app1/
manager_log=/data/mastermha/app1/manager.log
remote_workdir=/data/mastermha/app1/
ssh_user=root
repl_user=test
repl_password=Admin@123
ping_interval=1
master_ip_failover_script=/usr/local/bin/master_ip_failover
check_repl_delay=0
master_binlog_dir=/data/mysql/
​
​
[server1]
hostname=192.168.10.1
candidate_master=1[server2]
hostname=192.168.10.10
candidate_master=1[server3]
hostname=192.168.10.20

6 准备切换脚本

​
​
[root@localhost ~]#vim  master_ip_failover
#!/usr/bin/env perl
use strict;
use warnings FATAL => 'all';
use Getopt::Long;
my (
$command, $ssh_user, $orig_master_host, $orig_master_ip,
$orig_master_port, $new_master_host, $new_master_ip, $new_master_port
);
my $vip = '192.168.10.188/24';
my $gateway = '192.168.10.2';
my $interface = 'ens33';
my $key = "1";
my $ssh_start_vip = "/sbin/ifconfig $interface:$key $vip;/sbin/arping -I $interface -c 3 -s $vip $gateway >/dev/null 2>&1";
my $ssh_stop_vip = "/sbin/ifconfig $interface:$key down";
GetOptions(
'command=s' => $command,
'ssh_user=s' => $ssh_user,
'orig_master_host=s' => $orig_master_host,
'orig_master_ip=s' => $orig_master_ip,
'orig_master_port=i' => $orig_master_port,
'new_master_host=s' => $new_master_host,
'new_master_ip=s' => $new_master_ip,
'new_master_port=i' => $new_master_port,
);
exit &main();
sub main {
print "\n\nIN SCRIPT TEST====$ssh_stop_vip==$ssh_start_vip===\n\n";
if ( $command eq "stop" || $command eq "stopssh" ) {
# $orig_master_host, $orig_master_ip, $orig_master_port are passed.
# If you manage master ip address at global catalog database,
# invalidate orig_master_ip here.
my $exit_code = 1;
eval {
print "Disabling the VIP on old master: $orig_master_host \n";
&stop_vip();
$exit_code = 0;
};
if ($@) {
warn "Got Error: $@\n";
exit $exit_code;
}
exit $exit_code;
}
elsif ( $command eq "start" ) {
# all arguments are passed.
# If you manage master ip address at global catalog database,
# activate new_master_ip here.
# You can also grant write access (create user, set read_only=0, etc) here.
my $exit_code = 10;
eval {
print "Enabling the VIP - $vip on the new master - $new_master_host \n";
&start_vip();
$exit_code = 0;
};
if ($@) {
warn $@;
exit $exit_code;
}
exit $exit_code;
}
elsif ( $command eq "status" ) {
print "Checking the Status of the script.. OK \n";
`ssh $ssh_user@$orig_master_host " $ssh_start_vip "`;
exit 0;
}
else {
&usage();
exit 1;
}
}
# A simple system call that enable the VIP on the new master
sub start_vip() {
`ssh $ssh_user@$new_master_host " $ssh_start_vip "`;
}
# A simple system call that disable the VIP on the old_master
sub stop_vip() {
`ssh $ssh_user@$orig_master_host " $ssh_stop_vip "`;
}
sub usage {
print
"Usage: master_ip_failover --command=start|stop|stopssh|status --orig_master_host=host --orig_master_ip=ip --orig_master_port=port --new_master_host=host --new_master_ip=ip --new_master_port=port\n";
}
​
​
​
​
[root@localhost ~]#cp master_ip_failover   /usr/local/bin/
# 移动文件到对应的地方 之前的配置文件中规定了地方
[root@localhost ~]#chmod +x  /usr/local/bin/master_ip_failover 
#加上执行权限

7实现主从复制

7.1主服务器操作(7-1)

[root@localhost ~]#vim  /etc/my.cnf
#修改文件
[mysqld]
server_id=100
log-bin=/data/mysql/mysql-bin
skip_name_resolve=1
general_log 
#通用日志[root@localhost ~]#mkdir /data/mysql/   -p
#建立文件夹
[root@localhost ~]#chown mysql.mysql /data/ -R
#注意修改权限
[root@localhost ~]#systemctl restart mysqld
​
​
​
​
[root@localhost data]#mysql -uroot -pabc123
mysql> show master status;
+------------------+----------+--------------+------------------+-------------------+
| File             | Position | Binlog_Do_DB | Binlog_Ignore_DB | Executed_Gtid_Set |
+------------------+----------+--------------+------------------+-------------------+
| mysql-bin.000001 |      154 |              |                  |                   |
+------------------+----------+--------------+------------------+-------------------+
1 row in set (0.00 sec)
​
​
mysql> grant replication slave on *.* to test@'192.168.10.%' identified by 'Admin@123';
#建立复制用户
Query OK, 0 rows affected, 1 warning (0.00 sec)
​
mysql> grant all on *.* to mhauser@'192.168.91.%' identified by 'Admin@123';
#建立  mha管理账户
Query OK, 0 rows affected, 1 warning (0.00 sec)
​

7.2从服务器设置(7-2,7-3)

7.2.1 服务器7-2配置

[root@localhost ~]#vim  /etc/my.cnf
#修改文件
server_id=101
log-bin=/data/mysql/mysql-bin
read_only
relay_log_purge=0
skip_name_resolve=1 
general_log 
​
​
​
[root@localhost ~]#mkdir /data/mysql/   -p
#建立文件夹
[root@localhost ~]#chown mysql.mysql /data/ -R
#注意修改权限
[root@localhost ~]#systemctl restart mysqld
​
​
[root@localhost data]#mysql -uroot -pabc123
CHANGE MASTER TO
  MASTER_HOST='192.168.10.1',
  MASTER_USER='test',
  MASTER_PASSWORD='Admin@123',
  MASTER_PORT=3306,
  MASTER_LOG_FILE='mysql-bin.000001',
  MASTER_LOG_POS=154;
  注意最后分号
  
mysql> start slave;
mysql> show slave status\G;

7.2.2服务器7-3配置

[root@localhost ~]#vim  /etc/my.cnf
#修改文件
server_id=102
log-bin=/data/mysql/mysql-bin
read_only
relay_log_purge=0
skip_name_resolve=1 
general_log 
​
​
​
[root@localhost ~]#mkdir /data/mysql/   -p
#建立文件夹
[root@localhost ~]#chown mysql.mysql /data/ -R
#注意修改权限
[root@localhost ~]#systemctl restart mysqld
​
​
[root@localhost data]#mysql -uroot -pabc123
CHANGE MASTER TO
  MASTER_HOST='192.168.91.100',
  MASTER_USER='test',
  MASTER_PASSWORD='Admin@123',
  MASTER_PORT=3306,
  MASTER_LOG_FILE='mysql-bin.000001',
  MASTER_LOG_POS=154;
  注意最后分号
  
mysql> start slave;
mysql> show slave status\G;

8.设置虚拟地址

在 mysql 主节点上配置 虚拟地址 也就是7-1

[root@localhost ~]#ifconfig ens33:1 192.168.10.188/24

9在运行前需要先检测环境是否符合

在管理节点 7-4 上执行

9.1 检测 ssh 免密登录是否成功

[root@localhost ~]#masterha_check_ssh --conf=/etc/mastermha/app1.cnf
Thu Jul  4 23:55:53 2024 - [warning] Global configuration file /etc/masterha_default.cnf not found. Skipping.
Thu Jul  4 23:55:53 2024 - [info] Reading application default configuration from /etc/mastermha/app1.cnf..
Thu Jul  4 23:55:53 2024 - [info] Reading server configuration from /etc/mastermha/app1.cnf..
Thu Jul  4 23:55:53 2024 - [info] Starting SSH connection tests..
Thu Jul  4 23:55:55 2024 - [debug] 
Thu Jul  4 23:55:53 2024 - [debug]  Connecting via SSH from root@192.168.91.100(192.168.91.100:22) to root@192.168.91.101(192.168.91.101:22)..
Thu Jul  4 23:55:54 2024 - [debug]   ok.
Thu Jul  4 23:55:54 2024 - [debug]  Connecting via SSH from root@192.168.91.100(192.168.91.100:22) to root@192.168.91.102(192.168.91.102:22)..
Thu Jul  4 23:55:55 2024 - [debug]   ok.
Thu Jul  4 23:55:56 2024 - [debug] 
Thu Jul  4 23:55:54 2024 - [debug]  Connecting via SSH from root@192.168.91.102(192.168.91.102:22) to root@192.168.91.100(192.168.91.100:22)..
Thu Jul  4 23:55:55 2024 - [debug]   ok.
Thu Jul  4 23:55:55 2024 - [debug]  Connecting via SSH from root@192.168.91.102(192.168.91.102:22) to root@192.168.91.101(192.168.91.101:22)..
Thu Jul  4 23:55:56 2024 - [debug]   ok.
Thu Jul  4 23:55:56 2024 - [debug] 
Thu Jul  4 23:55:54 2024 - [debug]  Connecting via SSH from root@192.168.91.101(192.168.91.101:22) to root@192.168.91.100(192.168.91.100:22)..
Thu Jul  4 23:55:55 2024 - [debug]   ok.
Thu Jul  4 23:55:55 2024 - [debug]  Connecting via SSH from root@192.168.91.101(192.168.91.101:22) to root@192.168.91.102(192.168.91.102:22)..
Thu Jul  4 23:55:55 2024 - [debug]   ok.
Thu Jul  4 23:55:56 2024 - [info] All SSH connection tests passed successfully.
​

9.2 检测主从复制 是否可以

[root@localhost /]#masterha_check_repl --conf=/etc/mastermha/app1.cnf
#  --conf=/etc/mastermha/app1.cnf  指明配置文件
.........................................................
.........................................................
.........................................................
Checking the Status of the script.. OK 
Fri Jul  5 00:03:44 2024 - [info]  OK.
Fri Jul  5 00:03:44 2024 - [warning] shutdown_script is not defined.
Fri Jul  5 00:03:44 2024 - [info] Got exit code 0 (Not master dead).
​
MySQL Replication Health is OK.
​
​
​
​

报错

[root@mha-manager ~]#masterha_check_repl --conf=/etc/mastermha/app1.cnf
#如果设置了默认字符集起不来    在  /etc/my.cnf  文件中
unknown variable 'default-character-set=utf8'

9.3 查看状态未开启

[root@localhost /]#masterha_check_status --conf=/etc/mastermha/app1.cnf
app1 is stopped(2:NOT_RUNNING).
​

10 开启mha

#开启MHA,默认是前台运行,生产环境一般为后台执行
nohup masterha_manager --conf=/etc/mastermha/app1.cnf &> /dev/null 
#非后台
masterha_manager --conf=/etc/mastermha/app1.cnf 
​
#查看状态
masterha_check_status --conf=/etc/mastermha/app1.cnf  

11测试

11.1 mha 如何发现主节点宕机

通过发送 SELECT 1 As Value 指令 把1 设置成 value 给主, 主无法执行就认为他死了

[root@localhost mysql]#tail -f   /var/lib/mysql/localhost.log 
2024-07-04T16:11:14.137683Z     9 Query SELECT 1 As Value
2024-07-04T16:11:15.137991Z     9 Query SELECT 1 As Value
2024-07-04T16:11:16.137965Z     9 Query SELECT 1 As Value
2024-07-04T16:11:17.138401Z     9 Query SELECT 1 As Value
2024-07-04T16:11:18.138703Z     9 Query SELECT 1 As Value
2024-07-04T16:11:19.138877Z     9 Query SELECT 1 As Value
2024-07-04T16:11:20.139094Z     9 Query SELECT 1 As Value
2024-07-04T16:11:21.139400Z     9 Query SELECT 1 As Value
2024-07-04T16:11:22.140600Z     9 Query SELECT 1 As Value
2024-07-04T16:11:23.140507Z     9 Query SELECT 1 As Value
2024-07-04T16:11:24.141510Z     9 Query SELECT 1 As Value
2024-07-04T16:11:25.141256Z     9 Query SELECT 1 As Value
​

11.2 查看 mha 服务的日志

[root@localhost ~]#tail  -f  /data/mastermha/app1/manager.log 
​
IN SCRIPT TEST====/sbin/ifconfig ens33:1 down==/sbin/ifconfig ens33:1 192.168.91.188/24;/sbin/arping -I ens33 -c 3 -s 192.168.91.188/24 192.168.91.2 >/dev/null 2>&1===
​
Checking the Status of the script.. OK 
Fri Jul  5 00:08:27 2024 - [info]  OK.
Fri Jul  5 00:08:27 2024 - [warning] shutdown_script is not defined.
Fri Jul  5 00:08:27 2024 - [info] Set master ping interval 1 seconds.
Fri Jul  5 00:08:27 2024 - [warning] secondary_check_script is not defined. It is highly recommended setting it to check master reachability from two or more routes.
Fri Jul  5 00:08:27 2024 - [info] Starting ping health check on 192.168.91.100(192.168.91.100:3306)..
## Fri Jul  5 00:08:27 2024 - [info] Ping(SELECT) succeeded, waiting until MySQL doesn't respond..

11.3 模拟 mysql 主节点故障

切换的过程 会将 从服务器的 readonly 指令改成可写

mysql> select @@read_only;
+-------------+
| @@read_only |
+-------------+
|           0 |
+-------------+
1 row in set (0.00 sec)
​
切换成我们指定的 101

在从节点 102 上查看slave 信息 可以看到指向 新的主

mysql> show slave status\G;

查看日志

[root@localhost ~]#tail  -f  /data/mastermha/app1/manager.log 
----- Failover Report -----
​
app1: MySQL Master failover 192.168.10.1(192.168.10.1:3306) to 192.168.10.10(192.168.10.10:3306) succeeded
​
Master 192.168.10.1(192.168.10.1:3306) is down!
​
Check MHA Manager logs at localhost.localdomain:/data/mastermha/app1/manager.log for details.
​
Started automated(non-interactive) failover.
Invalidated master IP address on 192.168.10.1(192.168.10.1:3306)
The latest slave 192.168.10.10(192.168.10.10:3306) has all relay logs for recovery.
Selected 192.168.10.10(192.168.10.10:3306) as a new master.
192.168.10.10(192.168.10.10:3306): OK: Applying all logs succeeded.
192.168.10.10(192.168.10.10:3306): OK: Activated master IP address.
192.168.10.20(192.168.10.20:3306): This host has the latest relay log events.
Generating relay diff files from the latest slave succeeded.
192.168.10.20(192.168.10.20:3306): OK: Applying all logs succeeded. Slave started, replicating from 192.168.10.10(192.168.10.10:3306)
192.168.10.10(192.168.10.10:3306): Resetting slave info succeeded.
## Master failover to 192.168.10.10(192.168.10.10:3306) completed successfully.
​
​

并且虚拟ip也在101 服务器上出现

ifconfig