前言:
我們都知道,當執行 select 查詢語句時,用沒用到索引區別是很大的,若沒用到索引,一條 select 語句可能執行好幾秒或更久,若使用到索引則可能瞬間完成。那么當執行 update 語句時,用沒用到索引有什么區別呢,執行時間相差大嗎?本篇文章我們一起來探究下。
1. update SQL 測試
為了對比出差距,這里筆者創建兩張一樣數據的大表,一張有普通索引,一張無普通索引,我們來對比下二者的差別。
# tb_noidx 表無普通索引
MySQL> show create table tb_noidxG
*************************** 1. row ***************************
Table: tb_noidx
Create Table: CREATE TABLE `tb_noidx` (
`increment_id` int(11) unsigned NOT NULL AUTO_INCREMENT COMMENT '自增主鍵',
`col1` char(32) NOT NULL COMMENT '字段1',
`col2` char(32) NOT NULL COMMENT '字段2',
...
`del` tinyint(4) NOT NULL DEFAULT '0' COMMENT '是否刪除',
) ENGINE=InnoDB AUTO_INCREMENT=3696887 DEFAULT CHARSET=utf8 COMMENT='無索引表'
mysql> select count(*) from tb_noidx;
+----------+
| count(*) |
+----------+
| 3590105 |
+----------+
mysql> select concat(round(sum(data_length/1024/1024),2),'MB') as data_length_MB, concat(round(sum(index_length/1024/1024),2),'MB') as index_length_MB
-> from information_schema.tables where table_schema='testdb' and table_name = 'tb_noidx';
+----------------+-----------------+
| data_length_MB | index_length_MB |
+----------------+-----------------+
| 841.98MB | 0.00MB |
+----------------+-----------------+
# tb_withidx 表有普通索引
mysql> show create table tb_withidxG
*************************** 1. row ***************************
Table: tb_withidx
Create Table: CREATE TABLE `tb_withidx` (
`increment_id` int(11) unsigned NOT NULL AUTO_INCREMENT COMMENT '自增主鍵',
`col1` char(32) NOT NULL COMMENT '字段1',
`col2` char(32) NOT NULL COMMENT '字段2',
...
`del` tinyint(4) NOT NULL DEFAULT '0' COMMENT '是否刪除',
PRIMARY KEY (`increment_id`),
KEY `idx_col1` (`col1`),
KEY `idx_del` (`del`)
) ENGINE=InnoDB AUTO_INCREMENT=3696887 DEFAULT CHARSET=utf8 COMMENT='有索引表'
mysql> select count(*) from tb_withidx;
+----------+
| count(*) |
+----------+
| 3590105 |
+----------+
mysql> select concat(round(sum(data_length/1024/1024),2),'MB') as data_length_MB, concat(round(sum(index_length/1024/1024),2),'MB') as index_length_MB
-> from information_schema.tables where table_schema='testdb' and table_name = 'tb_withidx';
+----------------+-----------------+
| data_length_MB | index_length_MB |
+----------------+-----------------+
| 841.98MB | 210.50MB |
+----------------+-----------------+
這里說明下,tb_noidx 和 tb_withidx 兩張表數據完全相同,表大概有 360W 條數據,約占用 840M 空間。其中 col1 字段區分度較高,del 字段區分度很低,下面我們分別以這兩個字段為篩選條件來執行 update 語句:
# 以 col1 字段為篩選條件 來更新 col2 字段
mysql> explain update tb_withidx set col2 = '48348a10d7794d269ecf10f9e3f20b52' where col1 = '48348a10d7794d269ecf10f9e3f20b52';
+----+-------------+------------+------------+-------+---------------+----------+---------+-------+------+----------+-------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+------------+------------+-------+---------------+----------+---------+-------+------+----------+-------------+
| 1 | UPDATE | tb_withidx | NULL | range | idx_col1 | idx_col1 | 96 | const | 1 | 100.00 | Using where |
+----+-------------+------------+------------+-------+---------------+----------+---------+-------+------+----------+-------------+
1 row in set (0.00 sec)
mysql> update tb_withidx set col2 = '48348a10d7794d269ecf10f9e3f20b52' where col1 = '48348a10d7794d269ecf10f9e3f20b52';
Query OK, 1 row affected (0.01 sec)
Rows matched: 1 Changed: 1 Warnings: 0
mysql> explain update tb_noidx set col2 = '48348a10d7794d269ecf10f9e3f20b52' where col1 = '48348a10d7794d269ecf10f9e3f20b52';
+----+-------------+----------+------------+-------+---------------+---------+---------+------+---------+----------+-------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+----------+------------+-------+---------------+---------+---------+------+---------+----------+-------------+
| 1 | UPDATE | tb_noidx | NULL | index | NULL | PRIMARY | 4 | NULL | 3557131 | 100.00 | Using where |
+----+-------------+----------+------------+-------+---------------+---------+---------+------+---------+----------+-------------+
1 row in set (0.00 sec)
mysql> update tb_noidx set col2 = '48348a10d7794d269ecf10f9e3f20b52' where col1 = '48348a10d7794d269ecf10f9e3f20b52';
Query OK, 1 row affected (13.29 sec)
Rows matched: 1 Changed: 1 Warnings: 0
# 以 col1 字段為篩選條件 來更新 col1 字段
mysql> explain update tb_withidx set col1 = 'col1aac4c0f07449c688af42886465b76b' where col1 = '95aac4c0f07449c688af42886465b76b';
+----+-------------+------------+------------+-------+---------------+----------+---------+-------+------+----------+------------------------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+------------+------------+-------+---------------+----------+---------+-------+------+----------+------------------------------+
| 1 | UPDATE | tb_withidx | NULL | range | idx_col1 | idx_col1 | 96 | const | 1 | 100.00 | Using where; Using temporary |
+----+-------------+------------+------------+-------+---------------+----------+---------+-------+------+----------+------------------------------+
1 row in set (0.01 sec)
mysql> update tb_withidx set col1 = 'col1aac4c0f07449c688af42886465b76b' where col1 = '95aac4c0f07449c688af42886465b76b';
Query OK, 1 row affected, 1 warning (0.01 sec)
Rows matched: 1 Changed: 1 Warnings: 0
mysql> explain update tb_noidx set col1 = 'col1aac4c0f07449c688af42886465b76b' where col1 = '95aac4c0f07449c688af42886465b76b';
+----+-------------+----------+------------+-------+---------------+---------+---------+------+---------+----------+-------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+----------+------------+-------+---------------+---------+---------+------+---------+----------+-------------+
| 1 | UPDATE | tb_noidx | NULL | index | NULL | PRIMARY | 4 | NULL | 3557131 | 100.00 | Using where |
+----+-------------+----------+------------+-------+---------------+---------+---------+------+---------+----------+-------------+
1 row in set (0.01 sec)
mysql> update tb_noidx set col1 = 'col1aac4c0f07449c688af42886465b76b' where col1 = '95aac4c0f07449c688af42886465b76b';
Query OK, 1 row affected, 1 warning (13.15 sec)
Rows matched: 1 Changed: 1 Warnings: 0
# 以 del 字段為篩選條件 來更新 col2 字段
# del為0的大概203W條 del為1的大概155W條
mysql> select del,count(*) from tb_withidx GROUP BY del;
+-----+----------+
| del | count(*) |
+-----+----------+
| 0 | 2033080 |
| 1 | 1557025 |
+-----+----------+
mysql> explain update tb_withidx set col2 = 'col24c0f07449c68af42886465b76' where del = 0;
+----+-------------+------------+------------+-------+---------------+---------+---------+------+---------+----------+-------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+------------+------------+-------+---------------+---------+---------+------+---------+----------+-------------+
| 1 | UPDATE | tb_withidx | NULL | index | idx_del | PRIMARY | 4 | NULL | 3436842 | 100.00 | Using where |
+----+-------------+------------+------------+-------+---------------+---------+---------+------+---------+----------+-------------+
1 row in set (0.00 sec)
mysql> update tb_withidx set col2 = 'col24c0f07449c68af42886465b76' where del = 0;
Query OK, 2033080 rows affected (47.15 sec)
Rows matched: 2033080 Changed: 2033080 Warnings: 0
mysql> explain update tb_noidx set col2 = 'col24c0f07449c68af42886465b76' where del = 0;
+----+-------------+----------+------------+-------+---------------+---------+---------+------+---------+----------+-------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+----------+------------+-------+---------------+---------+---------+------+---------+----------+-------------+
| 1 | UPDATE | tb_noidx | NULL | index | NULL | PRIMARY | 4 | NULL | 3296548 | 100.00 | Using where |
+----+-------------+----------+------------+-------+---------------+---------+---------+------+---------+----------+-------------+
1 row in set (0.00 sec)
mysql> update tb_noidx set col2 = 'col24c0f07449c68af42886465b76' where del = 0;
Query OK, 2033080 rows affected (49.79 sec)
Rows matched: 2033080 Changed: 2033080 Warnings: 0
# 以 del 字段為篩選條件 來更新 del 字段
mysql> explain update tb_withidx set del = 2 where del = 0;
+----+-------------+------------+------------+-------+---------------+---------+---------+------+---------+----------+-------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+------------+------------+-------+---------------+---------+---------+------+---------+----------+-------------+
| 1 | UPDATE | tb_withidx | NULL | index | idx_del | PRIMARY | 4 | NULL | 3436842 | 100.00 | Using where |
+----+-------------+------------+------------+-------+---------------+---------+---------+------+---------+----------+-------------+
1 row in set (0.03 sec)
mysql> update tb_withidx set del = 2 where del = 0;
Query OK, 2033080 rows affected (2 min 34.96 sec)
Rows matched: 2033080 Changed: 2033080 Warnings: 0
mysql> explain update tb_noidx set del = 2 where del = 0;
+----+-------------+----------+------------+-------+---------------+---------+---------+------+---------+----------+-------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+----------+------------+-------+---------------+---------+---------+------+---------+----------+-------------+
| 1 | UPDATE | tb_noidx | NULL | index | NULL | PRIMARY | 4 | NULL | 3296548 | 100.00 | Using where |
+----+-------------+----------+------------+-------+---------------+---------+---------+------+---------+----------+-------------+
1 row in set (0.00 sec)
mysql> update tb_noidx set del = 2 where del = 0;
Query OK, 2033080 rows affected (50.57 sec)
Rows matched: 2033080 Changed: 2033080 Warnings: 0
從以上實驗大致可以看出,是否用到索引,對于 update 語句執行速度影響還是很大的,具體表現如下:
- 若在區分度較高的字段上添加索引,并以該字段為篩選條件進行更新,則無論是更新該字段還是其他字段,用到索引的更新都要快好多。
- 若在區分度很低的字段上添加索引,并以該字段為篩選條件進行更新,當更新其他字段時,有無索引區別不大,當更新這個區分度很低的字段時,用到索引的更新反而更慢。
2.一些經驗總結
我們試著來解釋下以上實驗結果,首先來看下 update SQL 執行流程,大致如下:
- 首先客戶端發送請求到服務端,建立連接。
- 服務端先看下查詢緩存,對于更新某張表的 SQL ,該表的所有查詢緩存都失效。
- 接著來到解析器,進行語法分析,一些系統關鍵字校驗,校驗語法是否合規。
- 然后優化器進行 SQL 優化,比如怎么選擇索引之類,然后生成執行計劃。
- 執行器去存儲引擎查詢需要更新的數據。
- 存儲引擎判斷當前緩沖池中是否存在需要更新的數據,存在就直接返回,否則去從磁盤加載數據。
- 執行器調用存儲引擎 API 去更新數據。
- 存儲器更新數據,同時寫入 undo log 、redo log 信息。
- 執行器寫 binlog ,提交事務,流程結束。
也就是說,執行更新語句首先需要將被更新的記錄查詢出來,這也就不難理解為啥以區分度較高的字段為篩選條件進行更新,有索引的情況下執行更快。
對于區分度很低的字段,用沒用到索引則區別不大,原因是查詢出將被更新的記錄所需時間差別不大,需要掃描的行數差別不大。當更新區分度很低的字段的字段時,因為要維護索引 b+ 樹,所以會拖慢更新速度。
之前也有講過,雖然索引能加速查詢,但索引也是有缺點的,那就是索引需要動態的維護,當對表中的數據進行增加、刪除、修改時,會降低數據的維護速度。本次實驗結果也能論證這個結論。
通過本次實驗,我們也能得到一些索引相關經驗:
- 只為用于搜索、排序、分組、連接的列創建索引。
- 索引盡量建在區分度高的字段上,避免在區分度低的字段上建索引。
- 對經常更新的表避免創建過多的索引。
- 不要有冗余索引,會增加維護成本。