這個事情怎么產生的
MyCat作為經典的分庫分表中間件,在長時間內被廣泛認為是管理超大MySQL數據庫集合的有效解決方案。近來接到客戶需求,需要將MyCat集群遷移到GreatSQL中,并且在一段時間內需要實時從MyCat中同步數據到GreatSQL中,全量同步數據比較容易操作,增量同步有如下兩個棘手的問題:
- 多個server,不同的庫名字,都要同步到GreatSQL一個庫中,即同步關系如下
server1:db1.tab->gdb:db.tab;
server2:db2.tab->gdb:db.tab;
server3:db3.tab->gdb:db.tab;
- ddl同步多次執行會沖突。當MyCat的表中添加一個索引、添加一個字段時,實際上是后端所有db都會執行這個DDL,同步到GreatSQL時,多次執行DDL,復制會異常中斷。
為了解決上面兩個問題,經過查詢資料,發現有兩個不常用,官方也不建議使用的功能,剛好能夠滿足需求
- 為解決庫名映射問題:需要在配置文件中添加參數
replicate_rewrite_db="channel_1:test_rep1->test_rep"
replicate_rewrite_db="channel_2:test_rep2->test_rep"
replicate_rewrite_db="channel_3:test_rep3->test_rep"
- 為了解決DDL同步后重復執行導致復制中斷問題,在配置文件中添加
slave-skip-errors=ddl_exist_errors
驗證一下
為了簡化問題,MyCat集群咱們就不搭建了,簡化為多源同步復制問題。
1.初始化4個實例,同步關系如下
源端口 |
源DB_NAME |
目標端口 |
目標映射DB |
channel_name |
3306 |
test_rep1 |
3309 |
test_rep |
channel_3306 |
3307 |
test_rep2 |
3309 |
test_rep |
channel_3307 |
3308 |
test_rep3 |
3309 |
test_rep |
channel_3308 |
2.在3309的實例配置文件中,添加庫映射關系配置和DDL沖突忽略參數
replicate_rewrite_db="channel_3306:test_rep1->test_rep"
replicate_rewrite_db="channel_3307:test_rep2->test_rep"
replicate_rewrite_db="channel_3308:test_rep3->test_rep"
slave-skip-errors=ddl_exist_errors
4.在3309實例中,配置三個channel
greatsql> change master to MASTER_HOST = '172.17.137.91',MASTER_USER = 'greatsql',MASTER_PASSword = 'greatsql',MASTER_PORT = 3306,MASTER_AUTO_POSITION = 1 FOR CHANNEL 'channel_3306';
greatsql> change master to MASTER_HOST = '172.17.137.91',MASTER_USER = 'greatsql',MASTER_PASSWORD = 'greatsql',MASTER_PORT = 3307,MASTER_AUTO_POSITION = 1 FOR CHANNEL 'channel_3307';
greatsql> change master to MASTER_HOST = '172.17.137.91',MASTER_USER = 'greatsql',MASTER_PASSWORD = 'greatsql',MASTER_PORT = 3308,MASTER_AUTO_POSITION = 1 FOR CHANNEL 'channel_3308';
greatsql> start slave;
3.檢查channel配置狀態
greatsql> show slave status G
*************************** 1. row ***************************
Slave_IO_State: WAIting for source to send event
Master_Host: 172.17.137.91
Master_User: greatsql
Master_Port: 3306
Connect_Retry: 60
Master_Log_File: binlog.000002
Read_Master_Log_Pos: 1119
Relay_Log_File: relaylog-channel_3306.000007
Relay_Log_Pos: 397
Relay_Master_Log_File: binlog.000002
Slave_IO_Running: Yes
Slave_SQL_Running: Yes
Replicate_Do_DB:
Replicate_Ignore_DB:
Replicate_Do_Table:
Replicate_Ignore_Table:
Replicate_Wild_Do_Table:
Replicate_Wild_Ignore_Table:
Last_Errno: 0
Last_Error:
Skip_Counter: 0
Exec_Master_Log_Pos: 1119
Relay_Log_Space: 606
Until_Condition: None
Until_Log_File:
Until_Log_Pos: 0
Master_SSL_Allowed: No
Master_SSL_CA_File:
Master_SSL_CA_Path:
Master_SSL_Cert:
Master_SSL_Cipher:
Master_SSL_Key:
Seconds_Behind_Master: 0
Master_SSL_Verify_Server_Cert: No
Last_IO_Errno: 0
Last_IO_Error:
Last_SQL_Errno: 0
Last_SQL_Error:
Replicate_Ignore_Server_Ids:
Master_Server_Id: 3306
Master_UUID: 5facacd7-9ed6-11ee-b76b-00163e5af5d6
Master_Info_File: mysql.slave_master_info
SQL_Delay: 0
SQL_Remaining_Delay: NULL
Slave_SQL_Running_State: Replica has read all relay log; waiting for more updates
Master_Retry_Count: 86400
Master_Bind:
Last_IO_Error_Timestamp:
Last_SQL_Error_Timestamp:
Master_SSL_Crl:
Master_SSL_Crlpath:
Retrieved_Gtid_Set:
Executed_Gtid_Set: 49b66af0-9ed6-11ee-ae4f-00163e5af5d6:3,
5a369d0b-9ed6-11ee-b3d1-00163e5af5d6:3,
5facacd7-9ed6-11ee-b76b-00163e5af5d6:1-4,
9a5f9c26-4262-11ee-85fd-00163e5af5d6:1-16571224
Auto_Position: 1
Replicate_Rewrite_DB: (test_rep1,test_rep)
Channel_Name: channel_3306
Master_TLS_Version:
Master_public_key_path:
Get_master_public_key: 0
.NETwork_Namespace:
*************************** 2. row ***************************
Slave_IO_State: Waiting for source to send event
Master_Host: 172.17.137.91
Master_User: greatsql
Master_Port: 3307
Connect_Retry: 60
Master_Log_File: binlog.000002
Read_Master_Log_Pos: 1119
Relay_Log_File: relaylog-channel_3307.000004
Relay_Log_Pos: 1034
Relay_Master_Log_File: binlog.000002
Slave_IO_Running: Yes
Slave_SQL_Running: Yes
Replicate_Do_DB:
Replicate_Ignore_DB:
Replicate_Do_Table:
Replicate_Ignore_Table:
Replicate_Wild_Do_Table:
Replicate_Wild_Ignore_Table:
Last_Errno: 0
Last_Error:
Skip_Counter: 0
Exec_Master_Log_Pos: 1119
Relay_Log_Space: 1243
Until_Condition: None
Until_Log_File:
Until_Log_Pos: 0
Master_SSL_Allowed: No
Master_SSL_CA_File:
Master_SSL_CA_Path:
Master_SSL_Cert:
Master_SSL_Cipher:
Master_SSL_Key:
Seconds_Behind_Master: 0
Master_SSL_Verify_Server_Cert: No
Last_IO_Errno: 0
Last_IO_Error:
Last_SQL_Errno: 0
Last_SQL_Error:
Replicate_Ignore_Server_Ids:
Master_Server_Id: 3307
Master_UUID: 5a369d0b-9ed6-11ee-b3d1-00163e5af5d6
Master_Info_File: mysql.slave_master_info
SQL_Delay: 0
SQL_Remaining_Delay: NULL
Slave_SQL_Running_State: Replica has read all relay log; waiting for more updates
Master_Retry_Count: 86400
Master_Bind:
Last_IO_Error_Timestamp:
Last_SQL_Error_Timestamp:
Master_SSL_Crl:
Master_SSL_Crlpath:
Retrieved_Gtid_Set: 5a369d0b-9ed6-11ee-b3d1-00163e5af5d6:1-2:4
Executed_Gtid_Set: 49b66af0-9ed6-11ee-ae4f-00163e5af5d6:3,
5a369d0b-9ed6-11ee-b3d1-00163e5af5d6:3,
5facacd7-9ed6-11ee-b76b-00163e5af5d6:1-4,
9a5f9c26-4262-11ee-85fd-00163e5af5d6:1-16571224
Auto_Position: 1
Replicate_Rewrite_DB: (test_rep2,test_rep)
Channel_Name: channel_3307
Master_TLS_Version:
Master_public_key_path:
Get_master_public_key: 0
Network_Namespace:
*************************** 3. row ***************************
Slave_IO_State: Waiting for source to send event
Master_Host: 172.17.137.91
Master_User: greatsql
Master_Port: 3308
Connect_Retry: 60
Master_Log_File: binlog.000002
Read_Master_Log_Pos: 1119
Relay_Log_File: relaylog-channel_3308.000004
Relay_Log_Pos: 1034
Relay_Master_Log_File: binlog.000002
Slave_IO_Running: Yes
Slave_SQL_Running: Yes
Replicate_Do_DB:
Replicate_Ignore_DB:
Replicate_Do_Table:
Replicate_Ignore_Table:
Replicate_Wild_Do_Table:
Replicate_Wild_Ignore_Table:
Last_Errno: 0
Last_Error:
Skip_Counter: 0
Exec_Master_Log_Pos: 1119
Relay_Log_Space: 1243
Until_Condition: None
Until_Log_File:
Until_Log_Pos: 0
Master_SSL_Allowed: No
Master_SSL_CA_File:
Master_SSL_CA_Path:
Master_SSL_Cert:
Master_SSL_Cipher:
Master_SSL_Key:
Seconds_Behind_Master: 0
Master_SSL_Verify_Server_Cert: No
Last_IO_Errno: 0
Last_IO_Error:
Last_SQL_Errno: 0
Last_SQL_Error:
Replicate_Ignore_Server_Ids:
Master_Server_Id: 3308
Master_UUID: 49b66af0-9ed6-11ee-ae4f-00163e5af5d6
Master_Info_File: mysql.slave_master_info
SQL_Delay: 0
SQL_Remaining_Delay: NULL
Slave_SQL_Running_State: Replica has read all relay log; waiting for more updates
Master_Retry_Count: 86400
Master_Bind:
Last_IO_Error_Timestamp:
Last_SQL_Error_Timestamp:
Master_SSL_Crl:
Master_SSL_Crlpath:
Retrieved_Gtid_Set: 49b66af0-9ed6-11ee-ae4f-00163e5af5d6:1-2:4
Executed_Gtid_Set: 49b66af0-9ed6-11ee-ae4f-00163e5af5d6:3,
5a369d0b-9ed6-11ee-b3d1-00163e5af5d6:3,
5facacd7-9ed6-11ee-b76b-00163e5af5d6:1-4,
9a5f9c26-4262-11ee-85fd-00163e5af5d6:1-16571224
Auto_Position: 1
Replicate_Rewrite_DB: (test_rep3,test_rep)
Channel_Name: channel_3308
Master_TLS_Version:
Master_public_key_path:
Get_master_public_key: 0
Network_Namespace:
3 rows in set, 1 warning (0.00 sec)
在上面的輸出中,可以重點關注如下字段信息,說明db轉換映射成功
$ MYSQL_PWD=greatsql mysql -ugreatsql -h127.0.0.1 -P3309 -e 'show replica status G'| grep -wE 'Replica_IO_Running|Replica_SQL_Running|Replicate_Rewrite_DB|Channel_Name'
Replica_IO_Running: Yes
Replica_SQL_Running: Yes
Replicate_Rewrite_DB: (test_rep1,test_rep)
Channel_Name: channel_3306
Replica_IO_Running: Yes
Replica_SQL_Running: Yes
Replicate_Rewrite_DB: (test_rep2,test_rep)
Channel_Name: channel_3307
Replica_IO_Running: Yes
Replica_SQL_Running: Yes
Replicate_Rewrite_DB: (test_rep3,test_rep)
Channel_Name: channel_3308
在3309實例中,查詢replica_skip_errors,確認復制異常跳過的錯誤碼,設置為ddl_exist_errors會自動轉換為如下錯誤碼
greatsql> select @@replica_skip_errors;
+---------------------------------------------------+
| @@replica_skip_errors |
+---------------------------------------------------+
| 1007,1008,1050,1051,1054,1060,1061,1068,1091,1146 |
+---------------------------------------------------+
1 row in set (0.00 sec)
5.數據同步驗證
- 在3309庫中,創建database test_rep
這個庫需要手動創建,是測試發現映射關系只對庫下面的表生效,庫不會自動轉換創建。
- 在3306庫中,創建database test_rep1,并且創建tab1表
$ mysql -ugreatsql -h127.0.0.1 -p -P3306 -s
greatsql> create database test_rep1;
greatsql> use test_rep1;
greatsql> create table tab1(id int primary key auto_increment, cname varchar(10), age int);
- 在3307庫中,創建database test_rep2,并且創建tab1表
$ mysql -ugreatsql -h127.0.0.1 -p -P3307 -s
greatsql> create database test_rep2;
greatsql> use test_rep2;
greatsql> create table tab1(id int primary key auto_increment, cname varchar(10), age int);
- 在3308庫中,創建database test_rep3,并且創建tab1表
$ mysql -ugreatsql -h127.0.0.1 -p -P3308 -s
greatsql> create database test_rep3;
greatsql> use test_rep3;
greatsql> create table tab1(id int primary key auto_increment, cname varchar(10), age int);
- 在3309中確認database及表的同步
$ mysql -ugreatsql -h127.0.0.1 -p -P3309 -s
greatsql> show databases;
Database
information_schema
mysql
performance_schema
sys
test_db
test_rep
test_rep1
test_rep2
test_rep3
greatsql> show tables from test_rep;
Tables_in_test_rep
tab1
greatsql> show tables from test_rep1;
greatsql> show tables from test_rep2;
greatsql> show tables from test_rep3;
從上面的信息可以看出,在3309中,3306、3307、3308中創建的庫均按照原有的名字進行了同步,但是表只同步在了3309映射的庫test_rep中。
- 分別在3306、3307、3308中插入一條記錄
3306 : insert into test_rep1.tab1 values(1,'a',10);
3307 : insert into test_rep2.tab1 values(2,'b',20);
3308 : insert into test_rep3.tab1 values(3,'c',30);
然后在各自節點查詢數據插入情況
$ mysql -ugreatsql -h127.0.0.1 -p -P3306 -s test_rep1 -e 'select * from tab1'
id cname age
1 a 10
$ mysql -ugreatsql -h127.0.0.1 -p -P3307 -s test_rep2 -e 'select * from tab1'
id cname age
2 b 20
$ mysql -ugreatsql -h127.0.0.1 -p -P3308 -s test_rep3 -e 'select * from tab1'
id cname age
3 c 30
$ mysql -ugreatsql -h127.0.0.1 -p -P3309 -s test_rep -e 'select * from tab1'
id cname age
1 a 10
2 b 20
3 c 30
$ mysql -ugreatsql -h127.0.0.1 -p -P3309 -s test_rep1 -e 'select * from tab1'
ERROR 1146 (42S02) at line 1: Table 'test_rep1.tab1' doesn't exist
$ mysql -ugreatsql -h127.0.0.1 -p -P3309 -s test_rep2 -e 'select * from tab1'
ERROR 1146 (42S02) at line 1: Table 'test_rep2.tab1' doesn't exist
$ mysql -ugreatsql -h127.0.0.1 -p -P3309 -s test_rep3 -e 'select * from tab1'
ERROR 1146 (42S02) at line 1: Table 'test_rep3.tab1' doesn't exist
從上面的查詢情況可以看出,3306、3307、3308節點中只有一條記錄,并且記錄都被同步到了3309的test_rep.tab1表中,而且在3309的test_rep1、test_rep2、test_rep3中是沒有表存在的。
- 分別在3306、3307、3308給表tab創建一個索引
greatsql> alter table tab1 add index idx_cname(cname);
- 觀察3309中表的索引情況,可以看到索引idx_cname被同步過來了
$ mysql -ugreatsql -h127.0.0.1 -p -P3309 -s
greatsql> use test_rep
greatsql> show create table tab1 G
*************************** 1. row ***************************
Table: tab1
Create Table: CREATE TABLE `tab1` (
`id` int NOT NULL AUTO_INCREMENT,
`cname` varchar(10) COLLATE utf8mb4_unicode_ci DEFAULT NULL,
`age` int DEFAULT NULL,
PRIMARY KEY (`id`),
KEY `idx_cname` (`cname`)
) ENGINE=InnoDB AUTO_INCREMENT=4 DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci
1 row in set (0.00 sec)
- 分別在3306、3307、3308做update、delete操作
greatsql> update test_rep1.tab1 set age=110 where id=1;
greatsql> update test_rep2.tab1 set age=120 where id=2;
greatsql> update test_rep3.tab1 set age=130 where id=3;
greatsql> delete from test_rep1.tab1 where id=1;
greatsql> delete from test_rep2.tab1 where id=1;
greatsql> delete from test_rep3.tab1 where id=1;
查看3309的數據同步情況,確認數據被清理
$ mysql -ugreatsql -h127.0.0.1 -p -P3309 -s
Welcome to the MySQL monitor. Commands end with ; or g.
greatsql> select * from test_rep.tab1;
- 觀察3個channel的同步情況,可以確認三個復制同步均正常
$ mysql -ugreatsql -h127.0.0.1 -p -P3309 -e 'show replica status G'| grep -E 'Replica_IO_Running|Replica_SQL_Running|Channel_Name'
greatsql: [Warning] Using a password on the command line interface can be insecure.
Replica_IO_Running: Yes
Replica_SQL_Running: Yes
Replica_SQL_Running_State: Replica has read all relay log; waiting for more updates
Channel_Name: channel_3306
Replica_IO_Running: Yes
Replica_SQL_Running: Yes
Replica_SQL_Running_State: Replica has read all relay log; waiting for more updates
Channel_Name: channel_3307
Replica_IO_Running: Yes
Replica_SQL_Running: Yes
Replica_SQL_Running_State: Replica has read all relay log; waiting for more updates
Channel_Name: channel_3308
至此dml、ddl同步均驗證。
方案缺陷
- 本方案中,業務訪問MyCat的表名字,和server后端的表名字完全一致,只是庫名字不相同,然后MyCat代理表名和實際server的表名字可以不相同,這種情況下,暫時無法映射處理
- MyCat代理的實際上是多個單獨的庫,如果這些庫之前沒有做自增主鍵步長處理,或者其他一些主鍵不重復策略,同步過程中,會存在主鍵沖突導致數據同步中斷的情況,需要提前準備處理方案。
最后附上參考資料
- https://dev.mysql.com/doc/refman/8.0/en/replication-options-replica.html#option_mysqld_replicate-rewrite-db
- https://dev.mysql.com/doc/refman/8.0/en/replication-options-replica.html#option_mysqld_slave-skip-errors