2020-09-09
MySQL GTID主从复制不一致故障处理
主从数据重复是指,在从库有一条数据,在主库又执行了同一条语句,导致从主库同步的数据传到从库后发生错误,因为从库也有同一条数据,这时处时方法如下:
表结构: show create table test;
+-------+------------------------------------------------------------------------------------------------------------+
| Table | Create Table |
+-------+------------------------------------------------------------------------------------------------------------+
| test | CREATE TABLE `test` (
`id` int(11) NOT NULL,
PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1 |
主库数据:
(root@node230_master) [test]> select * from test;
+------+
| id |
+------+
| 1 |
| 2 |
| 3 |
+------+
从库数据:
(root@node231_slave) [test]> select * from test;
+------+
| id |
+------+
| 1 |
| 2 |
| 3 |
+------+
3 rows in set (0.00 sec)
这时在从库插入一条数据:
(root@node231_slave) [test]> insert into test (id) values(4);
Query OK, 1 row affected (0.10 sec)
(root@node231_slave) [test]> select * from test;
+------+
| id |
+------+
| 1 |
| 2 |
| 3 |
| 4 |
+------+
4 rows in set (0.00 sec)
插入完成后再到主库再插入同一条数据:
(root@node230_master) [test]> insert into test (id) values(4);
Query OK, 1 row affected (0.02 sec)
(root@node230_master) [test]> select * from test;
+------+
| id |
+------+
| 1 |
| 2 |
| 3 |
| 4 |
+------+
4 rows in set (0.00 sec)
这时再看从库时,会发出主从同步异常:
(root@node231_slave) [test]> show slave status\G;
*************************** 1. row ***************************
Slave_IO_State: Waiting for master to send event
Master_Host: 172.16.1.230
Master_User: repl
Master_Port: 3306
Connect_Retry: 60
Master_Log_File: mysql-bin.000008
Read_Master_Log_Pos: 1134
Relay_Log_File: relay-bin.000009
Relay_Log_Pos: 1052
Relay_Master_Log_File: mysql-bin.000008
Slave_IO_Running: Yes
Slave_SQL_Running: No
Replicate_Do_DB:
Replicate_Ignore_DB:
Replicate_Do_Table:
Replicate_Ignore_Table:
Replicate_Wild_Do_Table:
Replicate_Wild_Ignore_Table:
Last_Errno: 1062
Last_Error: Coordinator stopped because there were error(s) in the worker(s). The most recent failure being: Worker 1 failed executing transaction 'd8886781-ae2c-11ea-ab05-000c2922452d:9' at master log mysql-bin.000008, end_log_pos 1103. See error log and/or performance_schema.replication_applier_status_by_worker table for more details about this failure or others, if any.
Skip_Counter: 0
Exec_Master_Log_Pos: 879
Relay_Log_Space: 1508
Until_Condition: None
Until_Log_File:
Until_Log_Pos: 0
Master_SSL_Allowed: No
Master_SSL_CA_File:
Master_SSL_CA_Path:
Master_SSL_Cert:
Master_SSL_Cipher:
Master_SSL_Key:
Seconds_Behind_Master: NULL
Master_SSL_Verify_Server_Cert: No
Last_IO_Errno: 0
Last_IO_Error:
Last_SQL_Errno: 1062
Last_SQL_Error: Coordinator stopped because there were error(s) in the worker(s). The most recent failure being: Worker 1 failed executing transaction 'd8886781-ae2c-11ea-ab05-000c2922452d:9' at master log mysql-bin.000008, end_log_pos 1103. See error log and/or performance_schema.replication_applier_status_by_worker table for more details about this failure or others, if any.
Replicate_Ignore_Server_Ids:
Master_Server_Id: 1
Master_UUID: d8886781-ae2c-11ea-ab05-000c2922452d
Master_Info_File: mysql.slave_master_info
SQL_Delay: 0
SQL_Remaining_Delay: NULL
Slave_SQL_Running_State:
Master_Retry_Count: 86400
Master_Bind:
Last_IO_Error_Timestamp:
Last_SQL_Error_Timestamp: 200909 17:09:36
Master_SSL_Crl:
Master_SSL_Crlpath:
Retrieved_Gtid_Set: d8886781-ae2c-11ea-ab05-000c2922452d:6-9
Executed_Gtid_Set: d8886781-ae2c-11ea-ab05-000c2922452d:1-8,
e1a34bdd-ae2c-11ea-aa29-000c29ed2d17:1-3
Auto_Position: 1
Replicate_Rewrite_DB:
Channel_Name:
Master_TLS_Version:
1 row in set (0.00 sec)
ERROR:
No query specified
查看是什么原因导致了主从复制出错:
(root@node231_slave) [test]> select * from performance_schema.replication_applier_status_by_worker;
| | 1 | NULL | OFF | d8886781-ae2c-11ea-ab05-000c2922452d:9 | 1062 | Worker 1 failed executing transaction 'd8886781-ae2c-11ea-ab05-000c2922452d:9' at master log mysql-bin.000008, end_log_pos 1103; Could not execute Write_rows event on table test.test; Duplicate entry '4' for key 'PRIMARY', Error_code: 1062; handler error HA_ERR_FOUND_DUPP_KEY; the event's master log mysql-bin.000008, end_log_pos 1103 | 2020-09-09 17:09:36 |
以上提示是主键值重复导致了主从复制出错
在主库上通过查看binlog日志,查看具体是那条SQL导致的:
(root@node230_master) [test]> show binlog events in 'mysql-bin.000008';
+------------------+------+----------------+-----------+-------------+-------------------------------------------------------------------+
| Log_name | Pos | Event_type | Server_id | End_log_pos | Info |
+------------------+------+----------------+-----------+-------------+-------------------------------------------------------------------+
| mysql-bin.000008 | 4 | Format_desc | 1 | 123 | Server ver: 5.7.18-log, Binlog ver: 4 |
| mysql-bin.000008 | 123 | Previous_gtids | 1 | 194 | d8886781-ae2c-11ea-ab05-000c2922452d:1-5 |
| mysql-bin.000008 | 194 | Gtid | 1 | 259 | SET @@SESSION.GTID_NEXT= 'd8886781-ae2c-11ea-ab05-000c2922452d:6' |
| mysql-bin.000008 | 259 | Query | 1 | 331 | BEGIN |
| mysql-bin.000008 | 331 | Table_map | 1 | 378 | table_id: 219 (test.test) |
| mysql-bin.000008 | 378 | Write_rows | 1 | 418 | table_id: 219 flags: STMT_END_F |
| mysql-bin.000008 | 418 | Xid | 1 | 449 | COMMIT /* xid=36 */ |
| mysql-bin.000008 | 449 | Gtid | 1 | 514 | SET @@SESSION.GTID_NEXT= 'd8886781-ae2c-11ea-ab05-000c2922452d:7' |
| mysql-bin.000008 | 514 | Query | 1 | 586 | BEGIN |
| mysql-bin.000008 | 586 | Table_map | 1 | 633 | table_id: 219 (test.test) |
| mysql-bin.000008 | 633 | Delete_rows | 1 | 673 | table_id: 219 flags: STMT_END_F |
| mysql-bin.000008 | 673 | Xid | 1 | 704 | COMMIT /* xid=39 */ |
| mysql-bin.000008 | 704 | Gtid | 1 | 769 | SET @@SESSION.GTID_NEXT= 'd8886781-ae2c-11ea-ab05-000c2922452d:8' |
| mysql-bin.000008 | 769 | Query | 1 | 879 | use `test`; alter table test add primary key(id) |
| mysql-bin.000008 | 879 | Gtid | 1 | 944 | SET @@SESSION.GTID_NEXT= 'd8886781-ae2c-11ea-ab05-000c2922452d:9' |
| mysql-bin.000008 | 944 | Query | 1 | 1016 | BEGIN |
| mysql-bin.000008 | 1016 | Table_map | 1 | 1063 | table_id: 221 (test.test) |
| mysql-bin.000008 | 1063 | Write_rows | 1 | 1103 | table_id: 221 flags: STMT_END_F |
| mysql-bin.000008 | 1103 | Xid | 1 | 1134 | COMMIT /* xid=50 */ |
+------------------+------+----------------+-----------+-------------+-------------------------------------------------------------------+
19 rows in set (0.00 sec)
在主库上,通过mysqlbinlog查看具体的SQL语句
[root@node230 mysql]# mysqlbinlog -vv mysql-bin.000008 | grep -C 5 '1103'
BEGIN
/*!*/;
# at 1016
#200909 17:09:36 server id 1 end_log_pos 1063 CRC32 0xf58cd356 Table_map: `test`.`test` mapped to number 221
# at 1063
#200909 17:09:36 server id 1 end_log_pos 1103 CRC32 0xb4afe198 Write_rows: table id 221 flags: STMT_END_F
BINLOG '
0JtYXxMBAAAALwAAACcEAAAAAN0AAAAAAAEABHRlc3QABHRlc3QAAQMAAFbTjPU=
0JtYXx4BAAAAKAAAAE8EAAAAAN0AAAAAAAEAAgAB//4EAAAAmOGvtA==
'/*!*/;
### INSERT INTO `test`.`test`
### SET
### @1=4 /* INT meta=0 nullable=0 is_null=0 */
# at 1103
#200909 17:09:36 server id 1 end_log_pos 1134 CRC32 0xdc700e01 Xid = 50
COMMIT/*!*/;
SET @@SESSION.GTID_NEXT= 'AUTOMATIC' /* added by mysqlbinlog */ /*!*/;
DELIMITER ;
# End of log file
#以上是由于主库插入了id=4的记录,而从库也有ID=4的记录,并且id为这张表的主键,所有导致的主从不一致故障
解决方法:
在从库上操作:
查看从库的UUID:
(root@node231_slave) [test]> show variables like '%uuid%';
+---------------+--------------------------------------+
| Variable_name | Value |
+---------------+--------------------------------------+
| server_uuid | e1a34bdd-ae2c-11ea-aa29-000c29ed2d17 |
+---------------+--------------------------------------+
1 row in set (0.00 sec)
(root@node231_slave) [test]> show slave status\G;
Master_Server_Id: 1
Master_UUID: d8886781-ae2c-11ea-ab05-000c2922452d #主库的UUID
Master_Info_File: mysql.slave_master_info
SQL_Delay: 0
SQL_Remaining_Delay: NULL
Slave_SQL_Running_State:
Master_Retry_Count: 86400
Master_Bind:
Last_IO_Error_Timestamp:
Last_SQL_Error_Timestamp: 200909 17:09:36
Master_SSL_Crl:
Master_SSL_Crlpath:
Retrieved_Gtid_Set: d8886781-ae2c-11ea-ab05-000c2922452d:6-9 #主库要执行的UUID值+事务ID值,现在要执行主库ID=9的事务报错了
Executed_Gtid_Set: d8886781-ae2c-11ea-ab05-000c2922452d:1-8, #已经执行了主库ID=8的事务
e1a34bdd-ae2c-11ea-aa29-000c29ed2d17:1-3
Auto_Position: 1
Replicate_Rewrite_DB:
Channel_Name:
Master_TLS_Version:
修改gtid_next的值:
(root@node231_slave) [test]> select @@gtid_next;
+-------------+
| @@gtid_next |
+-------------+
| AUTOMATIC |
+-------------+
1 row in set (0.00 sec)
配置gtid_next的值为9:
(root@node231_slave) [test]> set gtid_next='d8886781-ae2c-11ea-ab05-000c2922452d:9';
再执行一个空事务:
(root@node231_slave) [test]> begin;
Query OK, 0 rows affected (0.00 sec)
(root@node231_slave) [test]> commit;
Query OK, 0 rows affected (0.00 sec)
再将gtid_next修改成原来的值:
(root@node231_slave) [test]> set gtid_next='AUTOMATIC';
重启复制进程:
(root@node231_slave) [test]> stop slave;
Query OK, 0 rows affected (0.01 sec)
(root@node231_slave) [test]> start slave;
Query OK, 0 rows affected (0.21 sec)
查看复制进程是否正常:
(root@node231_slave) [test]> show slave status\G;
*************************** 1. row ***************************
Slave_IO_State: Waiting for master to send event
Master_Host: 172.16.1.230
Master_User: repl
Master_Port: 3306
Connect_Retry: 60
Master_Log_File: mysql-bin.000008
Read_Master_Log_Pos: 1134
Relay_Log_File: relay-bin.000010
Relay_Log_Pos: 454
Relay_Master_Log_File: mysql-bin.000008
Slave_IO_Running: Yes
Slave_SQL_Running: Yes
Replicate_Do_DB:
Replicate_Ignore_DB:
Replicate_Do_Table:
Replicate_Ignore_Table:
Replicate_Wild_Do_Table:
Replicate_Wild_Ignore_Table:
Last_Errno: 0
Last_Error:
Skip_Counter: 0
Exec_Master_Log_Pos: 1134
Relay_Log_Space: 1808
Until_Condition: None
Until_Log_File:
Until_Log_Pos: 0
Master_SSL_Allowed: No
Master_SSL_CA_File:
Master_SSL_CA_Path:
Master_SSL_Cert:
Master_SSL_Cipher:
Master_SSL_Key:
Seconds_Behind_Master: 0
Master_SSL_Verify_Server_Cert: No
Last_IO_Errno: 0
Last_IO_Error:
Last_SQL_Errno: 0
Last_SQL_Error:
Replicate_Ignore_Server_Ids:
Master_Server_Id: 1
Master_UUID: d8886781-ae2c-11ea-ab05-000c2922452d
Master_Info_File: mysql.slave_master_info
SQL_Delay: 0
SQL_Remaining_Delay: NULL
Slave_SQL_Running_State: Slave has read all relay log; waiting for more updates
Master_Retry_Count: 86400
Master_Bind:
Last_IO_Error_Timestamp:
Last_SQL_Error_Timestamp:
Master_SSL_Crl:
Master_SSL_Crlpath:
Retrieved_Gtid_Set: d8886781-ae2c-11ea-ab05-000c2922452d:6-9
Executed_Gtid_Set: d8886781-ae2c-11ea-ab05-000c2922452d:1-9,
e1a34bdd-ae2c-11ea-aa29-000c29ed2d17:1-3
Auto_Position: 1
Replicate_Rewrite_DB:
Channel_Name:
Master_TLS_Version:
1 row in set (0.00 sec)
ERROR:
No query specified
这时复制正常已恢复正常
主库:
(root@node230_master) [test]> insert into test values(5);
Query OK, 1 row affected (0.12 sec)
(root@node230_master) [test]> select * from test;
+----+
| id |
+----+
| 1 |
| 2 |
| 3 |
| 4 |
| 5 |
+----+
5 rows in set (0.00 sec)
从库
(root@node231_slave) [test]> select * from test;
+----+
| id |
+----+
| 1 |
| 2 |
| 3 |
| 4 |
| 5 |
+----+
5 rows in set (0.00 sec)
主从库同步也正常
评论
发表评论
姓 名: