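High Availability for MariaDB with MMM
I. MMM
1. Introduction
MMM (Master-Master Replication Manager for MySQL) is a flexible set of scripts for monitoring, failover, and management of MySQL master-master replication setups, in which only one node is writable at any given time. The suite can also balance reads across any number of slaves in a standard master/slave setup, so you can use it to move virtual IPs around a group of replicating servers. It also ships scripts for data backup and for resynchronizing nodes.
MySQL itself does not provide a replication failover solution. MMM adds server failover on top of replication and thereby gives MySQL high availability.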
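2. MMM components
MMM's main functionality is provided by the following three scripts:
mmm_mond: the monitoring daemon. It does all the monitoring work and makes decisions such as removing a node.
mmm_agentd: the agent daemon that runs on each MySQL server and exposes a simple set of remote services to the monitoring node.
mmm_control: a command-line tool for managing the mmm_mond process.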
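3. Pros, cons, and use cases of MMM
Pros: safe and stable, and it scales well. When the active master fails, the other master takes over immediately and the slaves switch to the new master automatically, with no manual intervention.
Cons: it needs at least three nodes, so there is a minimum host count, and it requires read/write splitting, which can be hard to retrofit into an application. It is also sensitive to replication lag between the two masters and between master and slave, so it is not a good fit where data safety requirements are very strict.
Use cases: high traffic, fast business growth, and a requirement for read/write splitting.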
II. MMM architecture diagram
III. Resource layout
1. Server list
Server | IP | Hostname | server id |
monitoring host | 172.16.7.100 | monitor | - |
master 1 | 172.16.7.200 | db1 | 1 |
master 2 | 172.16.7.201 | db2 | 2 |
slave | 172.16.7.202 | db3 | 3 |
2. Virtual IP list
IP | role | description |
172.16.7.1 | write | VIP that applications connect to for writes |
172.16.7.2 | read | VIP that applications connect to for reads |
172.16.7.3 | read | |
IV. Implementing MMM
1. Configure master 1
(1) Edit the /etc/my.cnf configuration file
server-id = 1
datadir = /mydata/data
log-bin = /mydata/binglogs/master-bin
relay_log = /mydata/relaylogs/relay
binlog_format=mixed
thread_concurrency = 4
log-slave-updates
sync_binlog=1
auto_increment_increment=2
auto_increment-offset=1
(2) Grant the replication user to master 2 and the slave
MariaDB [(none)]> grant replication slave,replication client on *.* to 'repluser'@'172.16.7.201' identified by 'repluser';
MariaDB [(none)]> grant replication slave,replication client on *.* to 'repluser'@'172.16.7.202' identified by 'repluser';
(3) Check the master status; a slave needs the file name and position when it connects to this master
MariaDB [(none)]> show master status;
+-------------------+----------+--------------+------------------+
| File              | Position | Binlog_Do_DB | Binlog_Ignore_DB |
+-------------------+----------+--------------+------------------+
| master-bin.000001 |      755 |              |                  |
+-------------------+----------+--------------+------------------+
2. Configure master 2
(1) Edit the /etc/my.cnf configuration file
server-id = 2
datadir = /mydata/data
log-bin = /mydata/binglogs/master-bin
relay_log = /mydata/relaylogs/relay
binlog_format=mixed
thread_concurrency = 4
log-slave-updates
sync_binlog=1
auto_increment_increment=2
auto_increment-offset=2
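The auto_increment_increment and auto_increment_offset settings are what keep the two masters from handing out the same AUTO_INCREMENT value. The following is a rough Python sketch of that idea only. It assumes an empty table and ignores MariaDB's exact rules for continuing after existing rows; the function name auto_increment_values is hypothetical.

```python
from itertools import islice


def auto_increment_values(offset, increment):
    """Yield the ids a server would hand out, starting at `offset`
    and stepping by `increment` (simplified model, empty table)."""
    value = offset
    while True:
        yield value
        value += increment


# master 1 uses offset 1, master 2 uses offset 2,
# and both use auto_increment_increment=2.
master1_ids = list(islice(auto_increment_values(1, 2), 5))
master2_ids = list(islice(auto_increment_values(2, 2), 5))

print(master1_ids)  # [1, 3, 5, 7, 9]
print(master2_ids)  # [2, 4, 6, 8, 10]
print(set(master1_ids) & set(master2_ids))  # set()
```

Because the two sequences never overlap, rows inserted on either master during a failover window should not collide on an auto-increment primary key.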
(2) Grant the replication user to master 1 and the slave
MariaDB [(none)]> grant replication slave,replication client on *.* to 'repluser'@'172.16.7.200' identified by 'repluser';
MariaDB [(none)]> grant replication slave,replication client on *.* to 'repluser'@'172.16.7.202' identified by 'repluser';
(3) Check the master status; a slave needs the file name and position when it connects to this master
MariaDB [(none)]> show master status;
+-------------------+----------+--------------+------------------+
| File              | Position | Binlog_Do_DB | Binlog_Ignore_DB |
+-------------------+----------+--------------+------------------+
| master-bin.000001 |      755 |              |                  |
+-------------------+----------+--------------+------------------+
(4) Connect master 2 to master 1
MariaDB [(none)]> change master to master_host='172.16.7.200',master_user='repluser',master_password='repluser',master_log_file='master-bin.000001',master_log_pos=755;
MariaDB [(none)]> start slave;
Query OK, 0 rows affected (0.00 sec)
MariaDB [(none)]> show slave status\G
*************************** 1. row ***************************
Slave_IO_State: Waiting for master to send event
Master_Host: 172.16.7.200
Master_User: repluser
Master_Port: 3306
Connect_Retry: 60
Master_Log_File: master-bin.000001
Read_Master_Log_Pos: 755
Relay_Log_File: relay.000003
Relay_Log_Pos: 536
Relay_Master_Log_File: master-bin.000001
Slave_IO_Running: Yes
Slave_SQL_Running: Yes
Replicate_Do_DB:
Replicate_Ignore_DB:
Replicate_Do_Table:
Replicate_Ignore_Table:
Replicate_Wild_Do_Table:
Replicate_Wild_Ignore_Table:
Last_Errno: 0
Last_Error:
Skip_Counter: 0
Exec_Master_Log_Pos: 755
Relay_Log_Space: 823
Until_Condition: None
Until_Log_File:
Until_Log_Pos: 0
Master_SSL_Allowed: No
Master_SSL_CA_File:
Master_SSL_CA_Path:
Master_SSL_Cert:
Master_SSL_Cipher:
Master_SSL_Key:
Seconds_Behind_Master: 0
Master_SSL_Verify_Server_Cert: No
Last_IO_Errno: 0
Last_IO_Error:
Last_SQL_Errno: 0
Last_SQL_Error:
Replicate_Ignore_Server_Ids:
Master_Server_Id: 1
Master_SSL_Crl:
Master_SSL_Crlpath:
Using_Gtid: No
Gtid_IO_Pos:
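In output like this, the fields that matter most are Slave_IO_Running, Slave_SQL_Running, and Seconds_Behind_Master. As an illustration, here is a small Python sketch that checks them in pasted text. It does not connect to a server, and the helper names parse_vertical_output and replication_healthy are made up for this example.

```python
def parse_vertical_output(text):
    """Turn `Field: value` lines from \\G-style output into a dict."""
    fields = {}
    for line in text.splitlines():
        # Skip the row separator and anything that is not a key/value line.
        if ":" not in line or line.lstrip().startswith("*"):
            continue
        key, _, value = line.partition(":")
        fields[key.strip()] = value.strip()
    return fields


def replication_healthy(fields):
    """Both threads must be running and the lag must be a known number."""
    return (
        fields.get("Slave_IO_Running") == "Yes"
        and fields.get("Slave_SQL_Running") == "Yes"
        and fields.get("Seconds_Behind_Master", "NULL").isdigit()
    )


sample = """\
*************************** 1. row ***************************
Slave_IO_State: Waiting for master to send event
Master_Host: 172.16.7.200
Slave_IO_Running: Yes
Slave_SQL_Running: Yes
Last_Error:
Seconds_Behind_Master: 0
"""

status = parse_vertical_output(sample)
print(status["Master_Host"])        # 172.16.7.200
print(replication_healthy(status))  # True
```

If either thread shows No, Last_IO_Error and Last_SQL_Error are the next fields to read.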
(5) Connect master 1 to master 2
MariaDB [(none)]> change master to master_host='172.16.7.201',master_user='repluser',master_password='repluser',master_log_file='master-bin.000001',master_log_pos=755;
Query OK, 0 rows affected (0.02 sec)
MariaDB [(none)]> start slave;
Query OK, 0 rows affected (0.02 sec)
MariaDB [(none)]> show slave status\G
*************************** 1. row ***************************
Slave_IO_State: Waiting for master to send event
Master_Host: 172.16.7.201
Master_User: repluser
Master_Port: 3306
Connect_Retry: 60
Master_Log_File: master-bin.000002
Read_Master_Log_Pos: 327
Relay_Log_File: db1-relay-bin.000003
Relay_Log_Pos: 615
Relay_Master_Log_File: master-bin.000002
Slave_IO_Running: Yes
Slave_SQL_Running: Yes
Replicate_Do_DB:
Replicate_Ignore_DB:
Replicate_Do_Table:
Replicate_Ignore_Table:
Replicate_Wild_Do_Table:
Replicate_Wild_Ignore_Table:
Last_Errno: 0
Last_Error:
Skip_Counter: 0
Exec_Master_Log_Pos: 327
Relay_Log_Space: 954
Until_Condition: None
Until_Log_File:
Until_Log_Pos: 0
Master_SSL_Allowed: No
Master_SSL_CA_File:
Master_SSL_CA_Path:
Master_SSL_Cert:
Master_SSL_Cipher:
Master_SSL_Key:
Seconds_Behind_Master: 0
Master_SSL_Verify_Server_Cert: No
Last_IO_Errno: 0
Last_IO_Error:
Last_SQL_Errno: 0
Last_SQL_Error:
Replicate_Ignore_Server_Ids:
Master_Server_Id: 2
Master_SSL_Crl:
Master_SSL_Crlpath:
Using_Gtid: No
Gtid_IO_Pos:
(6) Test that changes on master 1 replicate to master 2
=========== master 1 ================
MariaDB [(none)]> show databases;
+--------------------+
| Database           |
+--------------------+
| information_schema |
| mysql              |
| performance_schema |
| test               |
+--------------------+
4 rows in set (0.00 sec)
MariaDB [(none)]> create database hlbr;
Query OK, 1 row affected (0.00 sec)
MariaDB [(none)]> show databases;
+--------------------+
| Database           |
+--------------------+
| hlbr               |
| information_schema |
| mysql              |
| performance_schema |
| test               |
+--------------------+
5 rows in set (0.00 sec)
=========== master 2 ================
MariaDB [(none)]> show databases;
+--------------------+
| Database           |
+--------------------+
| hlbr               |   # replicated successfully
| information_schema |
| mysql              |
| performance_schema |
| test               |
+--------------------+
5 rows in set (0.02 sec)
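(7) Test that changes on master 2 replicate to master 1
I created a database named bynr on master 2 and it replicated to master 1. The output is omitted here to save space.
The tests in (6) and (7) show that master-master replication is working.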
3. Configure the slave
(1) Edit the /etc/my.cnf configuration file
server-id = 3
datadir = /mydata/data
log-bin = /mydata/binglogs/master-bin
relay_log = /mydata/relaylogs/relay
binlog_format=mixed
thread_concurrency = 4
log-slave-updates
sync_binlog=1
(2) Connect the slave to master 1
MariaDB [(none)]> change master to master_host='172.16.7.200',master_user='repluser',master_password='repluser',master_log_file='master-bin.000001',master_log_pos=755;
Query OK, 0 rows affected (0.05 sec)
MariaDB [(none)]> start slave;
Query OK, 0 rows affected (0.01 sec)
MariaDB [(none)]> show slave status\G
*************************** 1. row ***************************
Slave_IO_State: Waiting for master to send event
Master_Host: 172.16.7.200
Master_User: repluser
Master_Port: 3306
Connect_Retry: 60
Master_Log_File: master-bin.000001
Read_Master_Log_Pos: 1261
Relay_Log_File: relay.000002
Relay_Log_Pos: 1042
Relay_Master_Log_File: master-bin.000001
Slave_IO_Running: Yes
Slave_SQL_Running: Yes
Replicate_Do_DB:
Replicate_Ignore_DB:
Replicate_Do_Table:
Replicate_Ignore_Table:
Replicate_Wild_Do_Table:
Replicate_Wild_Ignore_Table:
Last_Errno: 0
Last_Error:
Skip_Counter: 0
Exec_Master_Log_Pos: 1261
Relay_Log_Space: 1329
Until_Condition: None
Until_Log_File:
Until_Log_Pos: 0
Master_SSL_Allowed: No
Master_SSL_CA_File:
Master_SSL_CA_Path:
Master_SSL_Cert:
Master_SSL_Cipher:
Master_SSL_Key:
Seconds_Behind_Master: 0
Master_SSL_Verify_Server_Cert: No
Last_IO_Errno: 0
Last_IO_Error:
Last_SQL_Errno: 0
Last_SQL_Error:
Replicate_Ignore_Server_Ids:
Master_Server_Id: 1
Master_SSL_Crl:
Master_SSL_Crlpath:
Using_Gtid: No
Gtid_IO_Pos:
(3) Test again: create a database on master 1 and look for it on the slave
====================== master 1 =====================
MariaDB [(none)]> create database hhht;
Query OK, 1 row affected (0.02 sec)
======================== slave ======================
MariaDB [(none)]> show databases;
+--------------------+
| Database           |
+--------------------+
| bynr               |
| hhht               |
| hlbr               |   # everything has replicated successfully
| information_schema |
| mysql              |
| performance_schema |
| test               |
+--------------------+
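4. Install and configure mysql-mmm
The monitor host does all the monitoring work and decides when a node is removed, so it needs privileges on every master and slave. Each database server also needs mysql-mmm-agent installed, which exposes a simple set of remote services to the monitoring node.
(1) Install mysql-mmm-agent on the three MySQL servers
The package has many dependencies, so install it with yum.
[root@db1 ~]# yum -y install mysql-mmm-agent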
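(2) Grant privileges for the monitor on the three MySQL servers
User | description | privileges |
monitor user | used by the MMM monitor to check the health of the MySQL servers | replication client |
agent user | used by the MMM agent to switch read-only mode, change the replication master, and so on | super,replication client,process |
replication user | used for replication | replication slave |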
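Next, grant the users in the table above access from the monitoring host. Master-master and master-slave replication are already set up and the replication user already has its privileges, so the third row needs no further work. The remaining two users only need to be granted on the active master, because the grants replicate to the other two MySQL servers automatically. Note that mmm_agentd runs on the database servers and connects to MySQL from there, so if the agent cannot log in, the host part of the agent grant may need to cover the database servers' addresses (for example '172.16.7.%') instead of only the monitor's address.
MariaDB [(none)]> grant replication client on *.* to 'monitor'@'172.16.7.100' identified by 'monitor';
MariaDB [(none)]> grant super,replication client,process on *.* to 'agent'@'172.16.7.100' identified by 'agent';
MariaDB [(none)]> flush privileges;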
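If you keep the user table in one place, the GRANT statements can be generated from it, which makes it easy to change the host part for every user at once. A minimal Python sketch for illustration only: the MMM_USERS list and the grant_statement helper are hypothetical, and the sketch does no escaping, so it is not suitable for untrusted input.

```python
# (user, password, privileges, host): values taken from the statements above.
# The host column is what you would change to widen or narrow access.
MMM_USERS = [
    ("monitor", "monitor", ["replication client"], "172.16.7.100"),
    ("agent", "agent", ["super", "replication client", "process"], "172.16.7.100"),
]


def grant_statement(user, password, privileges, host):
    """Build one GRANT ... IDENTIFIED BY statement as a string."""
    privs = ",".join(privileges)
    return f"grant {privs} on *.* to '{user}'@'{host}' identified by '{password}';"


statements = [grant_statement(*row) for row in MMM_USERS]
for stmt in statements:
    print(stmt)
```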
(3) Install MMM on the monitor node
[root@monitor ~]# yum -y install mysql-mmm*
① Configure /etc/mysql-mmm/mmm_common.conf
[root@monitor ~]# vim /etc/mysql-mmm/mmm_common.conf
active_master_role      writer

<host default>
    cluster_interface       eth0
    pid_path                /var/run/mysql-mmm/mmm_agentd.pid
    bin_path                /usr/libexec/mysql-mmm/
    replication_user        repluser    # replication user
    replication_password    repluser    # its password
    agent_user              agent       # agent user
    agent_password          agent       # agent user's password
</host>

<host db1>
    ip      172.16.7.200
    mode    master      # a master
    peer    db2         # the peer is the other master, not this host
</host>

<host db2>
    ip      172.16.7.201
    mode    master
    peer    db1
</host>

<host db3>
    ip      172.16.7.202
    mode    slave       # a slave
</host>

<role writer>
    hosts   db1, db2
    ips     172.16.7.1
    mode    exclusive   # exclusive: only one host holds this role at a time
</role>

<role reader>
    hosts   db2, db3
    ips     172.16.7.2, 172.16.7.3
    mode    balanced    # balanced: the IPs are spread across the hosts
</role>
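One easy mistake in this file is the peer setting: each master's peer should name the other master. The following Python sketch reads the host blocks and flags peers that do not point back at each other. It is a simplified stand-in written for this example, not MMM's own parser, and parse_hosts and peer_problems are hypothetical names.

```python
import re

CONFIG = """\
<host db1>
    ip      172.16.7.200
    mode    master
    peer    db2
</host>
<host db2>
    ip      172.16.7.201
    mode    master
    peer    db1
</host>
<host db3>
    ip      172.16.7.202
    mode    slave
</host>
"""


def parse_hosts(text):
    """Collect the key/value pairs of every <host NAME> ... </host> block."""
    hosts, current = {}, None
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop trailing comments
        if not line:
            continue
        opened = re.fullmatch(r"<host\s+(\S+)>", line)
        if opened:
            current = hosts.setdefault(opened.group(1), {})
        elif line == "</host>":
            current = None
        elif current is not None:
            parts = line.split(None, 1)
            current[parts[0]] = parts[1] if len(parts) > 1 else ""
    return hosts


def peer_problems(hosts):
    """Return a message for every master whose peer is not reciprocal."""
    problems = []
    for name, opts in hosts.items():
        if opts.get("mode") != "master":
            continue
        peer = opts.get("peer")
        if peer == name or hosts.get(peer, {}).get("peer") != name:
            problems.append(f"{name}: peer {peer!r} does not point back")
    return problems


hosts = parse_hosts(CONFIG)
print(hosts["db1"]["peer"])  # db2
print(peer_problems(hosts))  # []
```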
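② Copy this file to the three MySQL servers
This file is not database data, so replication will not carry it over; it has to be copied by hand.
[root@monitor ~]# scp /etc/mysql-mmm/mmm_common.conf root@172.16.7.202:/etc/mysql-mmm/    # repeat for all three nodes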
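Rather than typing the command three times, the copies can be scripted. A minimal Python sketch that only builds and prints the commands, assuming root SSH access to each database server as in the scp line above; the scp_command helper is a hypothetical name.

```python
import shlex

DB_HOSTS = ["172.16.7.200", "172.16.7.201", "172.16.7.202"]
CONF = "/etc/mysql-mmm/mmm_common.conf"


def scp_command(host, path=CONF, user="root"):
    """Return the argv list for copying `path` into /etc/mysql-mmm/ on `host`."""
    return ["scp", path, f"{user}@{host}:/etc/mysql-mmm/"]


commands = [scp_command(h) for h in DB_HOSTS]
for argv in commands:
    # To actually run a copy, pass argv to subprocess.run(argv, check=True).
    print(shlex.join(argv))
```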
③ Edit mmm_mon.conf on the monitor
[root@monitor ~]# vim /etc/mysql-mmm/mmm_mon.conf
include mmm_common.conf

<monitor>
    ip                  172.16.7.100    # IP of the monitoring host
    pid_path            /var/run/mysql-mmm/mmm_mond.pid
    bin_path            /usr/libexec/mysql-mmm
    status_path         /var/lib/mysql-mmm/mmm_mond.status
    ping_ips            172.16.7.200, 172.16.7.201, 172.16.7.202    # IPs of the database servers
    auto_set_online     60

    # The kill_host_bin does not exist by default, though the monitor will
    # throw a warning about it missing. See the section 5.10 "Kill Host
    # Functionality" in the PDF documentation.
    #
    # kill_host_bin     /usr/libexec/mysql-mmm/monitor/kill_host
    #
</monitor>

<host default>
    monitor_user        monitor     # monitor user name
    monitor_password    monitor     # its password
</host>

debug 0     # if monitoring does not work properly, set debug 1 to troubleshoot
(4) Edit mmm_agent.conf on each DB server
[root@db1 ~]# vim /etc/mysql-mmm/mmm_agent.conf
include mmm_common.conf     # pull in the shared file

# The 'this' variable refers to this server. Proper operation requires
# that 'this' server (db1 by default), as well as all other servers, have the
# proper IP addresses set in mmm_common.conf.
this db1    # identifies this host by its <host> section name in mmm_common.conf; on db2 use "this db2", on db3 use "this db3"
5. Start MMM and test
(1) Start mysql-mmm-agent
[root@db1 ~]# service mysql-mmm-agent start
Starting MMM Agent Daemon:                                 [  OK  ]
# mysql-mmm-agent must be started on every DB server
(2) Start the monitor daemon
[root@monitor ~]# chkconfig mysql-mmm-monitor on
[root@monitor ~]# service mysql-mmm-monitor start
Starting MMM Monitor Daemon:                               [  OK  ]
(3) Check the monitoring status on the monitor host
[root@monitor ~]# service mysql-mmm-monitor status
mmm_mond (pid 1586) is running...
[root@monitor ~]# mmm_control show
  db1(172.16.7.200) master/ONLINE. Roles: writer(172.16.7.1)
  db2(172.16.7.201) master/ONLINE. Roles: reader(172.16.7.3)
  db3(172.16.7.202) slave/ONLINE. Roles: reader(172.16.7.2)
(4) Set db1 offline and check the status again
[root@monitor ~]# mmm_control set_offline db1    # take db1 offline
OK: State of 'db1' changed to ADMIN_OFFLINE. Now you can wait some time and check all roles!
[root@monitor ~]# mmm_control show
  db1(172.16.7.200) master/ADMIN_OFFLINE. Roles:    # db1 is offline; the writer VIP has moved to db2
  db2(172.16.7.201) master/ONLINE. Roles: reader(172.16.7.3), writer(172.16.7.1)
  db3(172.16.7.202) slave/ONLINE. Roles: reader(172.16.7.2)
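The mmm_control show output is regular enough to check from a script, for example to confirm which host currently holds the writer VIP. A small Python sketch that works on pasted output, assuming the line format shown above; parse_show and writer_host are names invented for this example and are not part of MMM.

```python
import re

# One host per line: name(ip) mode/STATE. Roles: role(ip), role(ip)
LINE = re.compile(r"^\s*(\w+)\(([\d.]+)\)\s+(\w+)/(\w+)\.\s+Roles:\s*(.*)$")
ROLE = re.compile(r"(\w+)\(([\d.]+)\)")


def parse_show(output):
    """Map each host name to its ip, mode, state, and list of (role, vip)."""
    hosts = {}
    for line in output.splitlines():
        match = LINE.match(line)
        if not match:
            continue
        name, ip, mode, state, roles = match.groups()
        hosts[name] = {
            "ip": ip,
            "mode": mode,
            "state": state,
            "roles": ROLE.findall(roles),
        }
    return hosts


def writer_host(hosts):
    """Return the name of the host holding the writer role, or None."""
    for name, info in hosts.items():
        if any(role == "writer" for role, _ in info["roles"]):
            return name
    return None


after_offline = """\
  db1(172.16.7.200) master/ADMIN_OFFLINE. Roles:
  db2(172.16.7.201) master/ONLINE. Roles: reader(172.16.7.3), writer(172.16.7.1)
  db3(172.16.7.202) slave/ONLINE. Roles: reader(172.16.7.2)
"""

hosts = parse_show(after_offline)
print(hosts["db1"]["state"])  # ADMIN_OFFLINE
print(writer_host(hosts))     # db2
```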
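At the start we defined one write IP, 172.16.7.1, and two read IPs, 172.16.7.2 and 172.16.7.3. Steps (3) and (4) show that while all three nodes are healthy the write IP sits on master 1, and when master 1 goes down the write IP floats over to master 2. This removes the single point of failure for writes and gives the MySQL database high availability through MMM.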
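The behaviour of the exclusive writer role can be pictured with a toy model: the role stays with its current holder while that host is online, and otherwise moves to the first online host in the role's hosts list. The Python sketch below implements only this simplified rule. It is not MMM's actual algorithm, which also looks at replication state, and assign_writer is a hypothetical name.

```python
WRITER_HOSTS = ["db1", "db2"]  # the hosts line of the writer role


def assign_writer(online, current=None, candidates=WRITER_HOSTS):
    """Pick the holder of an exclusive role.

    Keep the current holder if it is still online; otherwise fall back to
    the first online candidate. Returns None when no candidate is online.
    """
    if current in online and current in candidates:
        return current
    for host in candidates:
        if host in online:
            return host
    return None


holder = assign_writer(online={"db1", "db2", "db3"})
print(holder)  # db1

# db1 is set offline: the role moves to db2.
holder = assign_writer(online={"db2", "db3"}, current=holder)
print(holder)  # db2

# db1 comes back: in this toy model the role stays where it is.
holder = assign_writer(online={"db1", "db2", "db3"}, current=holder)
print(holder)  # db2
```

db3 never receives the writer role in this model because it is not listed in the role's hosts.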
This article comes from nmshuishui's blog; please keep this source link: http://nmshuishui.blog.51cto.com/1850554/1405197