我们中的许多人,现在的的MySQL DBA使用SHOW REPLICA STATUS中的Seconds_Behind_Source来了解(异步)复制的状态和正确执行。
请注意新的技术。我相信我们都用过旧的术语。
然而,MySQL复制已经发展了很多,复制团队已经努力包含了许多关于MySQL可用的所有复制方式的有用信息。
例如,我们添加了并行复制、组复制……所有这些信息都是从旧的SHOW REPLICA STATUS结果中看不到的。
使用Performance_Schema可以更好地监视和观察复制过程。
当前在Performance_Schema中,有15个表与复制检测有关:
+------------------------------------------------------+
| Tables_in_performance_schema (replication%) |
+------------------------------------------------------+
| replication_applier_configuration |
| replication_applier_filters |
| replication_applier_global_filters |
| replication_applier_status |
| replication_applier_status_by_coordinator |
| replication_applier_status_by_worker |
| replication_asynchronous_connection_failover |
| replication_asynchronous_connection_failover_managed |
| replication_connection_configuration |
| replication_connection_status |
| replication_group_communication_information |
| replication_group_configuration_version |
| replication_group_member_actions |
| replication_group_member_stats |
| replication_group_members |
+------------------------------------------------------+
15 rows in set (0.0038 sec)
但确实,要理解这些指标的含义以及在哪里寻找对我们MySQL DBA重要的有用信息并不总是那么容易:我的复制落后于源库吗?
我创建了一些可以安装在sys中的视图,以使用这些度量中的大多数来获得与我们DBA相关的东西:mysql_8_replication_observability.sql [https://gist.github.com/lefred/1bad64403923664a14e0f20f572d7526]
让我们仔细看看这些观点。
查看复制延迟的SQL:
select * from sys.replication_lag;
+---------------------------+-----------------------+------------------------+
| channel_name | max_lag_from_original | max_lag_from_immediate |
+---------------------------+-----------------------+------------------------+
| clusterset_replication | 00:00:04.963223 | 00:00:04.940782 |
| group_replication_applier | 0 | 0 |
+---------------------------+-----------------------+------------------------+
从上面的输出中,我们可以看到MySQL实例是一个异步副本,但它也是组复制集群的一部分。
事实上,这是InnoDB ClusterSet中DR集群的主要成员。
我们还可以看到,这个复制副本几乎延迟了5秒(滞后)。
然后,我们有了复制通道的名称和与原始提交器和直接源(在级联复制的情况下)的最大延迟/滞后(因为在并行复制的情况中可能有几个工作者)。
在组复制集群(InnoDB cluster)的辅助成员上,我们可以看到以下输出:
select * from sys.replication_lag;
+----------------------------+-----------------------+------------------------+
| channel_name | max_lag_from_original | max_lag_from_immediate |
+----------------------------+-----------------------+------------------------+
| group_replication_recovery | null | null |
| group_replication_applier | 00:00:02.733008 | 00:00:02.733008 |
+----------------------------+-----------------------+------------------------+
我们可以看到,用于恢复的通道(在节点加入组时读取丢失的二进制日志事件、事务)没有被使用,并且组复制的应用程序有点落后。
复制状态
这个视图更完整,每个worders都有一行。
让我们以InnoDB ClusterSet的DR站点的Primary成员为例:
select * from replication_status;
+-------------------------------+----------+----------+---------+-------------------+--------------------+
| channel | io_state | co_state | w_state | lag_from_original | lag_from_immediate |
+-------------------------------+----------+----------+---------+-------------------+--------------------+
| group_replication_applier (1) | ON | ON | ON | none | none |
| group_replication_applier (2) | ON | ON | ON | none | none |
| group_replication_applier (3) | ON | ON | ON | none | none |
| group_replication_applier (4) | ON | ON | ON | none | none |
| clusterset_replication (1) | ON | ON | ON | 00:00:15.395870 | 00:00:15.380884 |
| clusterset_replication (2) | ON | ON | ON | 00:00:15.395686 | 00:00:15.380874 |
| clusterset_replication (3) | ON | ON | ON | 00:00:15.411204 | 00:00:15.388451 |
| clusterset_replication (4) | ON | ON | ON | 00:00:15.406154 | 00:00:15.388434 |
+-------------------------------+----------+----------+---------+-------------------+--------------------+
我们可以看到,来自Primary集群的并行(异步)复制使用了4个并行工作线程。
我们还可以看到,他们正在落后…
您可能已经注意到有3种状态(均为ON)。使用SHOW REPLICA STATUS,我们只能看到:
Replica_IO_Running: Yes
Replica_SQL_Running: Yes
对于并行复制,在应用binlog事件期间,我们有另一个参与复制的线程:协调器线程。
查看完整的复制状态:
select * from sys.replication_status_full\G
*************************** 1. row ***************************
channel: group_replication_applier (1)
host: <NULL>
port: 0
user:
source_uuid: 7b6bf13d-40ed-11ee-bfdd-c8cb9e32df8e
group_name: 7b6bf13d-40ed-11ee-bfdd-c8cb9e32df8e
last_heartbeat_timestamp: 0000-00-00 00:00:00.000000
heartbeat_interval: 30
io_state: ON
io_thread_state: NULL
io_errno: 0
io_errmsg:
io_errtime: 0000-00-00 00:00:00.000000
co_state: ON
co_thread_state: Replica has read all relay log; waiting for more updates
co_errno: 0
co_errmsg:
co_errtime: 0000-00-00 00:00:00.000000
w_state: ON
w_thread_state: Waiting for an event from Coordinator
w_errno: 0
w_errmsg:
w_errtime: 0000-00-00 00:00:00.000000
time_since_last_message: 03:36:40.474223
applier_busy_state: IDLE
lag_from_original: none
lag_from_immediate: none
transport_time: 1.80 us
time_to_relay_log: 12.00 us
apply_time: 784.00 us
last_applied_transaction: 7b6bf4f0-40ed-11ee-bfdd-c8cb9e32df8e:3
last_queued_transaction: 7b6bf4f0-40ed-11ee-bfdd-c8cb9e32df8e:3
queued_gtid_set_to_apply:
*************************** 2. row ***************************
channel: group_replication_applier (2)
host: <NULL>
port: 0
user:
source_uuid: 7b6bf13d-40ed-11ee-bfdd-c8cb9e32df8e
group_name: 7b6bf13d-40ed-11ee-bfdd-c8cb9e32df8e
last_heartbeat_timestamp: 0000-00-00 00:00:00.000000
heartbeat_interval: 30
io_state: ON
io_thread_state: NULL
io_errno: 0
io_errmsg:
io_errtime: 0000-00-00 00:00:00.000000
co_state: ON
co_thread_state: Replica has read all relay log; waiting for more updates
co_errno: 0
co_errmsg:
co_errtime: 0000-00-00 00:00:00.000000
w_state: ON
w_thread_state: Waiting for an event from Coordinator
w_errno: 0
w_errmsg:
w_errtime: 0000-00-00 00:00:00.000000
time_since_last_message: 03:36:40.474223
applier_busy_state: IDLE
lag_from_original: none
lag_from_immediate: none
transport_time: 1.80 us
time_to_relay_log: 12.00 us
apply_time: 0 ps
last_applied_transaction:
last_queued_transaction: 7b6bf4f0-40ed-11ee-bfdd-c8cb9e32df8e:3
queued_gtid_set_to_apply:
*************************** 3. row ***************************
channel: group_replication_applier (3)
host: <NULL>
port: 0
user:
source_uuid: 7b6bf13d-40ed-11ee-bfdd-c8cb9e32df8e
group_name: 7b6bf13d-40ed-11ee-bfdd-c8cb9e32df8e
last_heartbeat_timestamp: 0000-00-00 00:00:00.000000
heartbeat_interval: 30
io_state: ON
io_thread_state: NULL
io_errno: 0
io_errmsg:
io_errtime: 0000-00-00 00:00:00.000000
co_state: ON
co_thread_state: Replica has read all relay log; waiting for more updates
co_errno: 0
co_errmsg:
co_errtime: 0000-00-00 00:00:00.000000
w_state: ON
w_thread_state: Waiting for an event from Coordinator
w_errno: 0
w_errmsg:
w_errtime: 0000-00-00 00:00:00.000000
time_since_last_message: 03:36:40.474223
applier_busy_state: IDLE
lag_from_original: none
lag_from_immediate: none
transport_time: 1.80 us
time_to_relay_log: 12.00 us
apply_time: 0 ps
last_applied_transaction:
last_queued_transaction: 7b6bf4f0-40ed-11ee-bfdd-c8cb9e32df8e:3
queued_gtid_set_to_apply:
*************************** 4. row ***************************
channel: group_replication_applier (4)
host: <NULL>
port: 0
user:
source_uuid: 7b6bf13d-40ed-11ee-bfdd-c8cb9e32df8e
group_name: 7b6bf13d-40ed-11ee-bfdd-c8cb9e32df8e
last_heartbeat_timestamp: 0000-00-00 00:00:00.000000
heartbeat_interval: 30
io_state: ON
io_thread_state: NULL
io_errno: 0
io_errmsg:
io_errtime: 0000-00-00 00:00:00.000000
co_state: ON
co_thread_state: Replica has read all relay log; waiting for more updates
co_errno: 0
co_errmsg:
co_errtime: 0000-00-00 00:00:00.000000
w_state: ON
w_thread_state: Waiting for an event from Coordinator
w_errno: 0
w_errmsg:
w_errtime: 0000-00-00 00:00:00.000000
time_since_last_message: 03:36:40.474223
applier_busy_state: IDLE
lag_from_original: none
lag_from_immediate: none
transport_time: 1.80 us
time_to_relay_log: 12.00 us
apply_time: 0 ps
last_applied_transaction:
last_queued_transaction: 7b6bf4f0-40ed-11ee-bfdd-c8cb9e32df8e:3
queued_gtid_set_to_apply:
*************************** 5. row ***************************
channel: clusterset_replication (1)
host: 127.0.0.1
port: 3310
user: mysql_innodb_cs_b0adbc6c
source_uuid: 2cb77a02-40eb-11ee-83f4-c8cb9e32df8e
group_name:
last_heartbeat_timestamp: 2023-08-22 18:48:41.037817
heartbeat_interval: 30
io_state: ON
io_thread_state: Waiting for source to send event
io_errno: 0
io_errmsg:
io_errtime: 0000-00-00 00:00:00.000000
co_state: ON
co_thread_state: Waiting for replica workers to process their queues
co_errno: 0
co_errmsg:
co_errtime: 0000-00-00 00:00:00.000000
w_state: ON
w_thread_state: waiting for handler commit
w_errno: 0
w_errmsg:
w_errtime: 0000-00-00 00:00:00.000000
time_since_last_message: 00:00:00.001134
applier_busy_state: APPLYING
lag_from_original: 00:00:01.799071
lag_from_immediate: 00:00:01.783404
transport_time: 2.26 ms
time_to_relay_log: 19.00 us
apply_time: 14.63 ms
last_applied_transaction: 54d83026-40eb-11ee-a5d3-c8cb9e32df8e:105180
last_queued_transaction: 54d83026-40eb-11ee-a5d3-c8cb9e32df8e:105547
queued_gtid_set_to_apply: 54d83026-40eb-11ee-a5d3-c8cb9e32df8e:105184-105547
*************************** 6. row ***************************
channel: clusterset_replication (2)
host: 127.0.0.1
port: 3310
user: mysql_innodb_cs_b0adbc6c
source_uuid: 2cb77a02-40eb-11ee-83f4-c8cb9e32df8e
group_name:
last_heartbeat_timestamp: 2023-08-22 18:48:41.037817
heartbeat_interval: 30
io_state: ON
io_thread_state: Waiting for source to send event
io_errno: 0
io_errmsg:
io_errtime: 0000-00-00 00:00:00.000000
co_state: ON
co_thread_state: Waiting for replica workers to process their queues
co_errno: 0
co_errmsg:
co_errtime: 0000-00-00 00:00:00.000000
w_state: ON
w_thread_state: waiting for handler commit
w_errno: 0
w_errmsg:
w_errtime: 0000-00-00 00:00:00.000000
time_since_last_message: 00:00:00.001134
applier_busy_state: APPLYING
lag_from_original: 00:00:01.797743
lag_from_immediate: 00:00:01.783390
transport_time: 2.26 ms
time_to_relay_log: 19.00 us
apply_time: 21.47 ms
last_applied_transaction: 54d83026-40eb-11ee-a5d3-c8cb9e32df8e:105181
last_queued_transaction: 54d83026-40eb-11ee-a5d3-c8cb9e32df8e:105547
queued_gtid_set_to_apply: 54d83026-40eb-11ee-a5d3-c8cb9e32df8e:105184-105547
*************************** 7. row ***************************
channel: clusterset_replication (3)
host: 127.0.0.1
port: 3310
user: mysql_innodb_cs_b0adbc6c
source_uuid: 2cb77a02-40eb-11ee-83f4-c8cb9e32df8e
group_name:
last_heartbeat_timestamp: 2023-08-22 18:48:41.037817
heartbeat_interval: 30
io_state: ON
io_thread_state: Waiting for source to send event
io_errno: 0
io_errmsg:
io_errtime: 0000-00-00 00:00:00.000000
co_state: ON
co_thread_state: Waiting for replica workers to process their queues
co_errno: 0
co_errmsg:
co_errtime: 0000-00-00 00:00:00.000000
w_state: ON
w_thread_state: waiting for handler commit
w_errno: 0
w_errmsg:
w_errtime: 0000-00-00 00:00:00.000000
time_since_last_message: 00:00:00.001134
applier_busy_state: APPLYING
lag_from_original: 00:00:01.786087
lag_from_immediate: 00:00:01.767563
transport_time: 2.26 ms
time_to_relay_log: 19.00 us
apply_time: 21.58 ms
last_applied_transaction: 54d83026-40eb-11ee-a5d3-c8cb9e32df8e:105182
last_queued_transaction: 54d83026-40eb-11ee-a5d3-c8cb9e32df8e:105547
queued_gtid_set_to_apply: 54d83026-40eb-11ee-a5d3-c8cb9e32df8e:105184-105547
*************************** 8. row ***************************
channel: clusterset_replication (4)
host: 127.0.0.1
port: 3310
user: mysql_innodb_cs_b0adbc6c
source_uuid: 2cb77a02-40eb-11ee-83f4-c8cb9e32df8e
group_name:
last_heartbeat_timestamp: 2023-08-22 18:48:41.037817
heartbeat_interval: 30
io_state: ON
io_thread_state: Waiting for source to send event
io_errno: 0
io_errmsg:
io_errtime: 0000-00-00 00:00:00.000000
co_state: ON
co_thread_state: Waiting for replica workers to process their queues
co_errno: 0
co_errmsg:
co_errtime: 0000-00-00 00:00:00.000000
w_state: ON
w_thread_state: waiting for handler commit
w_errno: 0
w_errmsg:
w_errtime: 0000-00-00 00:00:00.000000
time_since_last_message: 00:00:00.001134
applier_busy_state: APPLYING
lag_from_original: 00:00:01.785881
lag_from_immediate: 00:00:01.767550
transport_time: 2.26 ms
time_to_relay_log: 19.00 us
apply_time: 29.59 ms
last_applied_transaction: 54d83026-40eb-11ee-a5d3-c8cb9e32df8e:105183
last_queued_transaction: 54d83026-40eb-11ee-a5d3-c8cb9e32df8e:105547
queued_gtid_set_to_apply: 54d83026-40eb-11ee-a5d3-c8cb9e32df8e:105184-105547
有了这个视图,我们有了更多的细节,例如,包括复制心跳。我们还概述了GTID(排队、应用…)。
我们还可以看到有关传输时间(网络)、写入中继日志的时间以及最后应用的时间的信息。
当然,您可以使用该视图来显示您需要的任何内容,例如:
MySQL InnoDB Cluster, ClusterSet, Read Replicas
如果您使用由Admin API与MySQL Shell管理的良好集成解决方案,那么所有这些信息都已经可以使用status()方法获得。
status()方法可以用3个值进行扩展:
1:包括有关元数据版本、组协议版本、组名称、集群成员UUID、集群成员角色和状态的信息,如组复制报告的,以及系统变量列表;
2:包括有关通过连接处理的交易的信息,以及应用者;
3:包括有关每个复制机制的更详细统计信息集群成员;
让我们来看一个具有扩展选项3的ClusterSet示例:
JS> cs.status({extended:3})
{
"clusters": {
"cluster2": {
"clusterRole": "REPLICA",
"clusterSetReplication": {
"applierQueuedTransactionSet": "54d83026-40eb-11ee-a5d3-c8cb9e32df8e:137385-138500",
"applierQueuedTransactionSetSize": 1116,
"applierState": "ON",
"applierStatus": "APPLYING",
"applierThreadState": "waiting for handler commit",
"applierWorkerThreads": 4,
"coordinatorState": "ON",
"coordinatorThreadState": "Waiting for replica workers to process their queues",
"options": {
"connectRetry": 3,
"delay": 0,
"heartbeatPeriod": 30,
"retryCount": 10
},
"receiver": "127.0.0.1:4420",
"receiverStatus": "ON",
"receiverThreadState": "Waiting for source to send event",
"receiverTimeSinceLastMessage": "00:00:00.002737",
"replicationSsl": null,
"source": "127.0.0.1:3310"
},
"clusterSetReplicationStatus": "OK",
"communicationStack": "MYSQL",
"globalStatus": "OK",
"groupName": "7b6bf13d-40ed-11ee-bfdd-c8cb9e32df8e",
"groupViewChangeUuid": "7b6bf4f0-40ed-11ee-bfdd-c8cb9e32df8e",
"paxosSingleLeader": "OFF",
"receivedTransactionSet": "54d83026-40eb-11ee-a5d3-c8cb9e32df8e:129-138500",
"ssl": "REQUIRED",
"status": "OK_NO_TOLERANCE",
"statusText": "Cluster is NOT tolerant to any failures.",
"topology": {
"127.0.0.1:4420": {
"address": "127.0.0.1:4420",
"applierWorkerThreads": 4,
"fenceSysVars": [
"read_only",
"super_read_only"
],
"memberId": "c3d726ac-40ec-11ee-ab38-c8cb9e32df8e",
"memberRole": "PRIMARY",
"memberState": "ONLINE",
"mode": "R/O",
"readReplicas": {},
"replicationLagFromImmediateSource": "00:00:05.420247",
"replicationLagFromOriginalSource": "00:00:05.433548",
"role": "HA",
"status": "ONLINE",
"version": "8.1.0"
},
"127.0.0.1:4430": {
"address": "127.0.0.1:4430",
"applierWorkerThreads": 4,
"fenceSysVars": [
"read_only",
"super_read_only"
],
"memberId": "709b15ea-40ed-11ee-a9b3-c8cb9e32df8e",
"memberRole": "SECONDARY",
"memberState": "ONLINE",
"mode": "R/O",
"readReplicas": {},
"replicationLagFromImmediateSource": "00:00:00.038075",
"replicationLagFromOriginalSource": "00:00:05.432536",
"role": "HA",
"status": "ONLINE",
"version": "8.1.0"
}
},
"transactionSet": "2cb77a02-40eb-11ee-83f4-c8cb9e32df8e:1-4,54d83026-40eb-11ee-a5d3-c8cb9e32df8e:1-137384,54d8329c-40eb-11ee-a5d3-c8cb9e32df8e:1-5,7b6bf4f0-40ed-11ee-bfdd-c8cb9e32df8e:1-3",
"transactionSetConsistencyStatus": "OK",
"transactionSetErrantGtidSet": "",
"transactionSetMissingGtidSet": "54d83026-40eb-11ee-a5d3-c8cb9e32df8e:137385-138552"
},
"myCluster": {
"clusterRole": "PRIMARY",
"communicationStack": "MYSQL",
"globalStatus": "OK",
"groupName": "54d83026-40eb-11ee-a5d3-c8cb9e32df8e",
"groupViewChangeUuid": "54d8329c-40eb-11ee-a5d3-c8cb9e32df8e",
"paxosSingleLeader": "OFF",
"primary": "127.0.0.1:3310",
"ssl": "REQUIRED",
"status": "OK",
"statusText": "Cluster is ONLINE and can tolerate up to ONE failure.",
"topology": {
"127.0.0.1:3310": {
"address": "127.0.0.1:3310",
"applierWorkerThreads": 4,
"fenceSysVars": [],
"memberId": "2cb77a02-40eb-11ee-83f4-c8cb9e32df8e",
"memberRole": "PRIMARY",
"memberState": "ONLINE",
"mode": "R/W",
"readReplicas": {
"127.0.0.1:4410": {
"address": "127.0.0.1:4410",
"applierStatus": "APPLYING",
"applierThreadState": "waiting for handler commit",
"applierWorkerThreads": 4,
"receiverStatus": "ON",
"receiverThreadState": "Waiting for source to send event",
"replicationSources": [
"PRIMARY"
],
"replicationSsl": "TLS_AES_256_GCM_SHA384 TLSv1.3",
"role": "READ_REPLICA",
"status": "ONLINE",
"version": "8.1.0"
}
},
"role": "HA",
"status": "ONLINE",
"version": "8.1.0"
},
"127.0.0.1:3320": {
"address": "127.0.0.1:3320",
"applierWorkerThreads": 4,
"fenceSysVars": [
"read_only",
"super_read_only"
],
"memberId": "327cb102-40eb-11ee-9904-c8cb9e32df8e",
"memberRole": "SECONDARY",
"memberState": "ONLINE",
"mode": "R/O",
"readReplicas": {},
"replicationLagFromImmediateSource": "00:00:04.536190",
"replicationLagFromOriginalSource": "00:00:04.536190",
"role": "HA",
"status": "ONLINE",
"version": "8.1.0"
},
"127.0.0.1:3330": {
"address": "127.0.0.1:3330",
"applierWorkerThreads": 4,
"fenceSysVars": [
"read_only",
"super_read_only"
],
"memberId": "3d141d7e-40eb-11ee-933b-c8cb9e32df8e",
"memberRole": "SECONDARY",
"memberState": "ONLINE",
"mode": "R/O",
"readReplicas": {},
"replicationLagFromImmediateSource": "00:00:04.652745",
"replicationLagFromOriginalSource": "00:00:04.652745",
"role": "HA",
"status": "ONLINE",
"version": "8.1.0"
}
},
"transactionSet": "2cb77a02-40eb-11ee-83f4-c8cb9e32df8e:1-4,54d83026-40eb-11ee-a5d3-c8cb9e32df8e:1-138552,54d8329c-40eb-11ee-a5d3-c8cb9e32df8e:1-5"
}
},
"domainName": "myClusterSet",
"globalPrimaryInstance": "127.0.0.1:3310",
"metadataServer": "127.0.0.1:3310",
"primaryCluster": "myCluster",
"status": "HEALTHY",
"statusText": "All Clusters available."
MySQL HeatWave
如果你在OCI上的MySQL HeatWave中使用任何类型的复制,你可以使用相同的视图,但你需要在不同的数据库上创建它们,因为sys是写保护的。
因此,如果您使用HA、读取副本或手动复制通道,您也可以使用相同的视图来获得复制的准确概述。
结论:
复制可观察性非常详细,并提供了大量MySQL 8的信息。也许现在是更改查看或监视复制的方式的好时机。
文章来源于mysql博客:https://blogs.oracle.com/mysql/post/mysql-8-and-replication-observability
2023-08-30
MySQL 8 全新查看主从延迟的方法
评论
发表评论
姓 名: