MySQL运维过程中，锁等待和死锁问题的分析处理-4747i站长资讯

前言：

在 MySQL 运维过程中，锁等待和死锁问题是 DBA 和开发人员非常头疼的问题。此类问题的发生会导致业务回滚、冻结等故障，特别是对于业务繁忙的系统，出现死锁问题后影响会更加严重。在这篇文章中，我们来了解一下什么是锁等待和死锁，以及我们应该如何分析和处理此类问题？

1.解锁等待和死锁

锁等待或死锁的原因是访问数据库需要锁。那你可能要问了，我们为什么要加锁呢？这样做是为了保证并发更新场景下数据的正确性，保证数据库事务的隔离。

想象一个场景。如果你想从图书馆借一本《High MySQL》，为了防止有人提前借书，你可以提前预约（锁定）。这个锁怎么加？

锁粒度越细，并发级别越高，实现越复杂。

锁等待也可以称为事务等待。后面执行的事务等待事务释放锁，但是等待时间超过了MySQL的锁等待时间，就会引发这个异常。等待超时后的错误信息是“Lock wait…”。

死锁的原因是两个事务相互等待释放同一个资源的锁，导致死循环。发生死锁后会立即报错“found when to get lock…”。

2.现象复发及治疗

下面我们以MySQL 5.7.23版本（隔离级别为RR）为例，重现上述两种异常现象。

mysql> show create table test_tbG
*************************** 1. row ***************************
       Table: test_tb
Create Table: CREATE TABLE `test_tb` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `col1` varchar(50) NOT NULL DEFAULT '',
  `col2` int(11) NOT NULL DEFAULT '1',
  `col3` varchar(20) NOT NULL DEFAULT '',
  PRIMARY KEY (`id`),
  KEY `idx_col1` (`col1`)
) ENGINE=InnoDB AUTO_INCREMENT=4 DEFAULT CHARSET=utf8
1 row in set (0.00 sec)
mysql> select * from test_tb;
+----+------+------+------+
| id | col1 | col2 | col3 |
+----+------+------+------+
|  1 | fdg  |    1 | abc  |
|  2 | a    |    2 | fg   |
|  3 | ghrv |    2 | rhdv |
+----+------+------+------+
3 rows in set (0.00 sec)
# 事务一首先执行
mysql> begin;
Query OK, 0 rows affected (0.00 sec)
mysql> select * from test_tb where col1 = 'a' for update;
+----+------+------+------+
| id | col1 | col2 | col3 |
+----+------+------+------+
|  2 | a    |    2 | fg   |
+----+------+------+------+
1 row in set (0.00 sec)
# 事务二然后执行
mysql> begin;
Query OK, 0 rows affected (0.01 sec)
mysql> update test_tb set col2 = 1 where col1 = 'a';
ERROR 1205 (HY000): Lock wait timeout exceeded; try restarting transaction

出现上述异常的原因是事务2正在等待事务1的行锁，但是事务1还没有提交，等待超时导致出错。行锁等待超时时间由 eout 参数控制。该参数默认值为50，单位为秒。也就是说，默认情况下mysql 共享锁死锁，事务二将等待 50 秒。如果仍然无法获取行锁，则会报告等待超时异常并回滚该条目。声明。

对于5.7 版本，当发生锁等待时，我们可以检查多个系统表来查询事务状态。

# 锁等待发生时 查看innodb_trx表可以看到所有事务 
# trx_state值为LOCK WAIT 则代表该事务处于等待状态
mysql> select * from information_schema.innodb_trxG
*************************** 1. row ***************************
                    trx_id: 38511
                 trx_state: LOCK WAIT
               trx_started: 2021-03-24 17:20:43
     trx_requested_lock_id: 38511:156:4:2
          trx_wait_started: 2021-03-24 17:20:43
                trx_weight: 2
       trx_mysql_thread_id: 1668447
                 trx_query: update test_tb set col2 = 1 where col1 = 'a'
       trx_operation_state: starting index read
         trx_tables_in_use: 1
         trx_tables_locked: 1
          trx_lock_structs: 2
     trx_lock_memory_bytes: 1136
           trx_rows_locked: 1
         trx_rows_modified: 0
   trx_concurrency_tickets: 0
       trx_isolation_level: REPEATABLE READ
         trx_unique_checks: 1
    trx_foreign_key_checks: 1
trx_last_foreign_key_error: NULL
 trx_adaptive_hash_latched: 0
 trx_adaptive_hash_timeout: 0
          trx_is_read_only: 0
trx_autocommit_non_locking: 0
*************************** 2. row ***************************
                    trx_id: 38510
                 trx_state: RUNNING
               trx_started: 2021-03-24 17:18:54
     trx_requested_lock_id: NULL
          trx_wait_started: NULL
                trx_weight: 4
       trx_mysql_thread_id: 1667530
                 trx_query: NULL
       trx_operation_state: NULL
         trx_tables_in_use: 0
         trx_tables_locked: 1
          trx_lock_structs: 4
     trx_lock_memory_bytes: 1136
           trx_rows_locked: 3
         trx_rows_modified: 0
   trx_concurrency_tickets: 0
       trx_isolation_level: REPEATABLE READ
         trx_unique_checks: 1
    trx_foreign_key_checks: 1
trx_last_foreign_key_error: NULL
 trx_adaptive_hash_latched: 0
 trx_adaptive_hash_timeout: 0
          trx_is_read_only: 0
trx_autocommit_non_locking: 0
2 rows in set (0.00 sec)
# innodb_trx 字段值含义
trx_id：事务ID。
trx_state：事务状态，有以下几种状态：RUNNING、LOCK WAIT、ROLLING BACK 和 COMMITTING。
trx_started：事务开始时间。
trx_requested_lock_id：事务当前正在等待锁的标识，可以和 INNODB_LOCKS 表 JOIN 以得到更多详细信息。
trx_wait_started：事务开始等待的时间。
trx_weight：事务的权重。
trx_mysql_thread_id：事务线程 ID，可以和 PROCESSLIST 表 JOIN。
trx_query：事务正在执行的 SQL 语句。
trx_operation_state：事务当前操作状态。
trx_tables_in_use：当前事务执行的 SQL 中使用的表的个数。
trx_tables_locked：当前执行 SQL 的行锁数量。
trx_lock_structs：事务保留的锁数量。
trx_isolation_level：当前事务的隔离级别。
# sys.innodb_lock_waits 视图也可看到事务等待状况，且给出了杀链接的SQL
mysql> select * from sys.innodb_lock_waitsG
*************************** 1. row ***************************
                wait_started: 2021-03-24 17:20:43
                    wait_age: 00:00:22
               wait_age_secs: 22
                locked_table: `testdb`.`test_tb`
                locked_index: idx_col1
                 locked_type: RECORD
              waiting_trx_id: 38511
         waiting_trx_started: 2021-03-24 17:20:43
             waiting_trx_age: 00:00:22
     waiting_trx_rows_locked: 1
   waiting_trx_rows_modified: 0
                 waiting_pid: 1668447
               waiting_query: update test_tb set col2 = 1 where col1 = 'a'
             waiting_lock_id: 38511:156:4:2
           waiting_lock_mode: X
             blocking_trx_id: 38510
                blocking_pid: 1667530
              blocking_query: NULL
            blocking_lock_id: 38510:156:4:2
          blocking_lock_mode: X
        blocking_trx_started: 2021-03-24 17:18:54
            blocking_trx_age: 00:02:11
    blocking_trx_rows_locked: 3
  blocking_trx_rows_modified: 0
     sql_kill_blocking_query: KILL QUERY 1667530
sql_kill_blocking_connection: KILL 1667530

系统。 view整合了事务的等待状态，还给出了kill语句杀死被阻塞的源。但是，是否杀掉链接还是需要综合考虑的。

死锁与锁等待略有不同mysql 共享锁死锁，我们也来简单复现死锁现象。

# 开启两个事务
# 事务一执行
mysql> update test_tb set col2 = 1 where col1 = 'a';
Query OK, 1 row affected (0.00 sec)
Rows matched: 1  Changed: 1  Warnings: 0
# 事务二执行
mysql> update test_tb set col2 = 1 where id = 3;
Query OK, 1 row affected (0.00 sec)
Rows matched: 1  Changed: 1  Warnings: 0
# 回到事务一执行 回车后 此条语句处于锁等待状态
mysql> update test_tb set col1 = 'abcd' where id = 3;
Query OK, 1 row affected (5.71 sec)
Rows matched: 1  Changed: 1  Warnings: 0
# 回到事务二再执行 此时二者相互等待发生死锁
mysql> update test_tb set col3 = 'gddx' where col1 = 'a';
ERROR 1213 (40001): Deadlock found when trying to get lock; try restarting transaction

发生死锁后，会选择一个事务回滚。如果想找出死锁的原因，可以执行show查看死锁日志。根据死锁日志，结合业务逻辑，进一步定位死锁原因。

在实际应用中，要尽量避免死锁现象的发生，可以从以下几个方面入手：

总结：

本文简单介绍了锁等待和死锁的原因。事实上，在实际业务中很难分析死锁，需要一定的经验。本文仅供初学者参考，希望你能对死锁有个简单的印象。

文章来源:https://zhuanlan.zhihu.com/p/361031330

------本页内容已结束，喜欢请分享------

感谢您的来访，获取更多精彩文章请收藏本站。

© 版权声明 1、本网站名称：4747i站长资讯
2、本站永久网址：https://4747i.com
3、本网站的文章部分内容可能来源于网络，仅供大家学习与参考，如有侵权，请联系站长进行删除处理。
4、本站一切资源不代表本站立场，并不代表本站赞同其观点和对其真实性负责。
5、本站一律禁止以任何方式发布或转载任何违法的相关信息，访客发现请向站长举报
6、本站资源大多存储在云盘，如发现链接失效，请联系我们我们会第一时间更新。

THE END