MySQL -- Functions on Indexed Columns
Applying a function to an indexed column can break the ordering of the index values, so the optimizer gives up on the tree search.
Functions on the Condition Column
Trade log table
CREATE TABLE `tradelog` (
  `id` INT(11) NOT NULL,
  `tradeid` VARCHAR(32) DEFAULT NULL,
  `operator` INT(11) DEFAULT NULL,
  `t_modified` DATETIME DEFAULT CURRENT_TIMESTAMP,
  PRIMARY KEY (`id`),
  KEY `tradeid` (`tradeid`),
  KEY `t_modified` (`t_modified`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4;
-- 94608000 = 3 * 365 * 24 * 3600
-- t_modified : 2016-01-01 00:00:00 ~ 2018-12-31 00:00:00 (2016 is a leap year, so 3 * 365 days ends one day before 2019-01-01)
DELIMITER ;;
CREATE PROCEDURE tdata()
BEGIN
  DECLARE i INT;
  SET i=0;
  WHILE i<1000000 DO
    INSERT INTO tradelog VALUES (i,i,i,FROM_UNIXTIME(UNIX_TIMESTAMP('2016-01-01 00:00:00')+FLOOR(0+(RAND()*94608000))));
    SET i=i+1;
  END WHILE;
END;;
DELIMITER ;
CALL tdata();
MONTH function
SELECT COUNT(*) FROM tradelog WHERE MONTH(t_modified)=7;
explain
mysql> EXPLAIN SELECT COUNT(*) FROM tradelog WHERE MONTH(t_modified)=7\G;
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: tradelog
   partitions: NULL
         type: index
possible_keys: NULL
          key: t_modified
      key_len: 6
          ref: NULL
         rows: 998838
     filtered: 100.00
        Extra: Using where; Using index
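- key=t_modified: the optimizer chose to traverse the secondary index t_modified.
- type=index: a full index scan (of the secondary index).
- rows=998,838 ≈ 1,000,000: the statement scans essentially the whole secondary index t_modified.
- Using index: a covering index is used (no lookup back to the clustered index is needed).
- Wrapping the indexed column t_modified in MONTH() forces a full index scan; the tree search cannot be used.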
slowlog
Rows_examined=1,000,000 corroborates the full index scan.
# Time: 2019-02-12T14:25:07.158350+08:00
# User@Host: root[root] @ localhost []  Id:    13
# Query_time: 0.208787  Lock_time: 0.000162 Rows_sent: 1  Rows_examined: 1000000
SET timestamp=1549952707;
SELECT COUNT(*) FROM tradelog WHERE MONTH(t_modified)=7;
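Analysis
- WHERE t_modified='2018-07-01': InnoDB finds the result with a tree search (the green-arrow path in the original post's diagram).
  - This relies on a property of the B+ tree: sibling nodes on the same level are ordered.
- WHERE MONTH(t_modified)=7: already at the first level of the tree there is no way to decide which child to descend into, so the optimizer gives up on the tree search.
  - The optimizer can traverse either the clustered index or the secondary index t_modified.
  - After comparing the index sizes it finds that the secondary index t_modified is smaller, so it ends up traversing t_modified.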
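The ordering argument can be made concrete with a small self-contained sketch. The four dates below are invented sample values, not rows taken from tradelog; they are sorted the way an index on t_modified would store them.

```sql
-- Hypothetical sample dates, returned in the order an index on t_modified keeps them.
SELECT d, MONTH(d) AS m
FROM (
    SELECT '2016-07-15' AS d
    UNION ALL SELECT '2017-01-10'
    UNION ALL SELECT '2017-07-20'
    UNION ALL SELECT '2018-03-05'
) AS sample
ORDER BY d;
-- m comes back as 7, 1, 7, 3: the dates are ordered, the MONTH() values are not,
-- so rows with m = 7 are scattered through the index instead of forming one range.
```

Because the month-7 entries are not contiguous in index order, no single descent of the tree can locate them, which is why every entry has to be examined.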
Optimization
Rewrite the condition as explicit ranges on t_modified, one per year covered by the data:
mysql> SELECT COUNT(*) FROM tradelog WHERE
    -> (t_modified >= '2016-7-1' AND t_modified<'2016-8-1') OR
    -> (t_modified >= '2017-7-1' AND t_modified<'2017-8-1') OR
    -> (t_modified >= '2018-7-1' AND t_modified<'2018-8-1');
explain
mysql> EXPLAIN SELECT COUNT(*) FROM tradelog WHERE
    -> (t_modified >= '2016-7-1' AND t_modified<'2016-8-1') OR
    -> (t_modified >= '2017-7-1' AND t_modified<'2017-8-1') OR
    -> (t_modified >= '2018-7-1' AND t_modified<'2018-8-1')\G;
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: tradelog
   partitions: NULL
         type: range
possible_keys: t_modified
          key: t_modified
      key_len: 6
          ref: NULL
         rows: 180940
     filtered: 100.00
        Extra: Using where; Using index
- type=range: an index range scan (of the secondary index).
- rows=180,940 < 998,838: the estimated number of rows to scan is far lower than with the MONTH function.
slowlog
Rows_examined=84,704 < 1,000,000, and Query_time is only about 25% of what the MONTH version needed (0.051701 s versus 0.208787 s).
# Time: 2019-02-12T14:56:51.727672+08:00
# User@Host: root[root] @ localhost []  Id:    13
# Query_time: 0.051701  Lock_time: 0.000239 Rows_sent: 1  Rows_examined: 84704
SET timestamp=1549954611;
SELECT COUNT(*) FROM tradelog WHERE
(t_modified >= '2016-7-1' AND t_modified<'2016-8-1') OR
(t_modified >= '2017-7-1' AND t_modified<'2017-8-1') OR
(t_modified >= '2018-7-1' AND t_modified<'2018-8-1');
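An alternative that the original post does not cover, shown here only as a sketch: on MySQL 5.7.6 or later, the month can be stored in an indexed generated column, so the query filters on a plain indexed column and no longer has to enumerate one range per year. The column name t_month and the index name idx_t_month are made up for this example.

```sql
-- Assumption: MySQL 5.7.6+ (generated columns). t_month and idx_t_month are hypothetical names.
ALTER TABLE tradelog
    ADD COLUMN t_month TINYINT GENERATED ALWAYS AS (MONTH(t_modified)) VIRTUAL,
    ADD KEY idx_t_month (t_month);

-- The predicate now compares an indexed column with a constant.
SELECT COUNT(*) FROM tradelog WHERE t_month = 7;
```

Adding a column changes the column count of the table, so INSERT statements written without a column list, like the ones in this post, would need an explicit column list afterwards. Whether the extra index is worth its write cost depends on the workload.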
id+1
mysql> EXPLAIN SELECT * FROM tradelog WHERE id+1 = 1000000\G;
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: tradelog
   partitions: NULL
         type: ALL
possible_keys: NULL
          key: NULL
      key_len: NULL
          ref: NULL
         rows: 998838
     filtered: 100.00
        Extra: Using where

mysql> EXPLAIN SELECT * FROM tradelog WHERE id = 999999\G;
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: tradelog
   partitions: NULL
         type: const
possible_keys: PRIMARY
          key: PRIMARY
      key_len: 4
          ref: const
         rows: 1
     filtered: 100.00
        Extra: NULL
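- The optimizer takes the lazy route: although adding 1 does not change the ordering of id, it still treats id+1=1,000,000 as a function applied to the indexed column and falls back to a full table scan.
- id=999,999 uses a tree search on the clustered index; const means a constant lookup (at most one row can match).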
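The practical rule that follows is to keep the indexed column bare and move any arithmetic to the constant side. A minimal sketch against the tradelog table defined above:

```sql
-- Arithmetic on the indexed column: this is the form that produced type=ALL above.
EXPLAIN SELECT * FROM tradelog WHERE id + 1 = 1000000;

-- Same condition with the arithmetic moved to the constant side.
-- 1000000 - 1 is a constant expression, so the predicate is a plain comparison on id
-- and should be planned like the id = 999999 query above.
EXPLAIN SELECT * FROM tradelog WHERE id = 1000000 - 1;
```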
Implicit Type Conversion
String -> number
In MySQL, when a string is compared with a number, the string is first converted to a number.
mysql> SELECT '10' > 9;
+----------+
| '10' > 9 |
+----------+
|        1 |
+----------+
tradeid
explain
mysql> EXPLAIN SELECT * FROM tradelog WHERE tradeid=625912\G;
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: tradelog
   partitions: NULL
         type: ALL
possible_keys: tradeid
          key: NULL
      key_len: NULL
          ref: NULL
         rows: 998838
     filtered: 10.00
        Extra: Using where
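- type=ALL: a full table scan.
- rows=998,838 ≈ 1,000,000
- Equivalent to SELECT * FROM tradelog WHERE CAST(tradeid AS SIGNED INT)=625912;
- The implicit type conversion amounts to a function applied to the indexed column, so the optimizer gives up on the tree search.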
slowlog
Rows_examined is still 1,000,000.
# Time: 2019-02-12T15:30:09.033772+08:00
# User@Host: root[root] @ localhost []  Id:    13
# Query_time: 0.312170  Lock_time: 0.000114 Rows_sent: 1  Rows_examined: 1000000
SET timestamp=1549956609;
SELECT * FROM tradelog WHERE tradeid=625912;
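A short sketch of why the conversion has to happen on the column side, and of the usual fix. The three string literals are illustrative values chosen for this example, not data from the table.

```sql
-- Several different strings convert to the same number, so each comparison yields 1.
-- A lookup by the number 625912 therefore cannot be mapped to one string key in the index.
SELECT '625912'   = 625912 AS plain,
       '0625912'  = 625912 AS leading_zero,
       '625912.0' = 625912 AS decimal_form;

-- Comparing the VARCHAR column with a string literal needs no conversion on the column,
-- so the secondary index tradeid is available for a tree search.
EXPLAIN SELECT * FROM tradelog WHERE tradeid = '625912';
```

Quoting the literal so that it matches the column type is usually the simplest fix when the application is the one building the query.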
id
explain
mysql> EXPLAIN SELECT * FROM tradelog WHERE id='625912'\G;
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: tradelog
   partitions: NULL
         type: const
possible_keys: PRIMARY
          key: PRIMARY
      key_len: 4
          ref: const
         rows: 1
     filtered: 100.00
        Extra: NULL
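- type=const: a constant lookup.
- key=PRIMARY: a tree search on the clustered index.
- rows=1: only one row needs to be scanned.
- Equivalent to SELECT * FROM tradelog WHERE id=CAST('625912' AS SIGNED INT);
- The implicit type conversion is applied only to the input parameter, not to the indexed column, so the tree search on the clustered index still works.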
slowlog
Rows_examined=1: only one row is scanned.
# Time: 2019-02-12T15:45:38.222760+08:00
# User@Host: root[root] @ localhost []  Id:    13
# Query_time: 0.000476  Lock_time: 0.000210 Rows_sent: 1  Rows_examined: 1
SET timestamp=1549957538;
SELECT * FROM tradelog WHERE id='625912';
Implicit Character Set Conversion
Trade detail table
-- tradelog uses utf8mb4, trade_detail uses utf8
CREATE TABLE `trade_detail` (
  `id` INT(11) NOT NULL,
  `tradeid` VARCHAR(32) DEFAULT NULL,
  `trade_step` INT(11) DEFAULT NULL,
  `step_info` VARCHAR(32) DEFAULT NULL,
  PRIMARY KEY (`id`),
  KEY `tradeid` (`tradeid`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
INSERT INTO tradelog VALUES (1, 'aaaaaaaa', 1000, NOW());
INSERT INTO tradelog VALUES (2, 'aaaaaaab', 1000, NOW());
INSERT INTO tradelog VALUES (3, 'aaaaaaac', 1000, NOW());
INSERT INTO trade_detail VALUES (1, 'aaaaaaaa', 1, 'add');
INSERT INTO trade_detail VALUES (2, 'aaaaaaaa', 2, 'update');
INSERT INTO trade_detail VALUES (3, 'aaaaaaaa', 3, 'commit');
INSERT INTO trade_detail VALUES (4, 'aaaaaaab', 1, 'add');
INSERT INTO trade_detail VALUES (5, 'aaaaaaab', 2, 'update');
INSERT INTO trade_detail VALUES (6, 'aaaaaaab', 3, 'update again');
INSERT INTO trade_detail VALUES (7, 'aaaaaaab', 4, 'commit');
INSERT INTO trade_detail VALUES (8, 'aaaaaaac', 1, 'add');
INSERT INTO trade_detail VALUES (9, 'aaaaaaac', 2, 'update');
INSERT INTO trade_detail VALUES (10, 'aaaaaaac', 3, 'update again');
INSERT INTO trade_detail VALUES (11, 'aaaaaaac', 4, 'commit');
Function Applied to the Secondary Index
explain
mysql> EXPLAIN SELECT d.* FROM tradelog l, trade_detail d WHERE d.tradeid=l.tradeid AND l.id=2;
+----+-------------+-------+------------+-------+-----------------+---------+---------+-------+------+----------+-------------+
| id | select_type | table | partitions | type  | possible_keys   | key     | key_len | ref   | rows | filtered | Extra       |
+----+-------------+-------+------------+-------+-----------------+---------+---------+-------+------+----------+-------------+
|  1 | SIMPLE      | l     | NULL       | const | PRIMARY,tradeid | PRIMARY | 4       | const |    1 |   100.00 | NULL        |
|  1 | SIMPLE      | d     | NULL       | ALL   | NULL            | NULL    | NULL    | NULL  |   11 |   100.00 | Using where |
+----+-------------+-------+------------+-------+-----------------+---------+---------+-------+------+----------+-------------+
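- tradelog is the driving table, trade_detail is the driven table, and tradeid is the join column. Rule of thumb: the smaller table drives the larger one.
- The optimizer first looks up the row with id=2 in tradelog through the clustered index, scans a single row, and reads tradeid='aaaaaaab'.
- It then looks up the rows with tradeid='aaaaaaab' in trade_detail, but instead of the secondary index tradeid it chooses a full table scan.
  - type=ALL is not what we expected; we wanted a tree search on the secondary index tradeid.
- Cause: the two tables use different character sets.
  - tradelog is encoded in utf8mb4 and trade_detail in utf8.
  - utf8mb4 is a superset of utf8 (see "Differences between utf8 and utf8mb4 in MySQL").
  - To evaluate d.tradeid=l.tradeid, the utf8 string must first be converted to a utf8mb4 string.
  - So the tradeid column of the driven table trade_detail has to be converted to utf8mb4 before it is compared with L2 (the tradelog row with id=2).
- Equivalent to SELECT * FROM trade_detail WHERE CONVERT(tradeid USING utf8mb4)=$L2.tradeid.value;
  - The implicit character set conversion applies a function to the secondary index tradeid, so the optimizer gives up on the tree search.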
slowlog
# Time: 2019-02-12T16:45:14.841502+08:00
# User@Host: root[root] @ localhost []  Id:    13
# Query_time: 0.000470  Lock_time: 0.000202 Rows_sent: 4  Rows_examined: 11
SET timestamp=1549961114;
SELECT d.* FROM tradelog l, trade_detail d WHERE d.tradeid=l.tradeid AND l.id=2;
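A mismatch like this is easy to miss because both columns are declared as VARCHAR(32). One way to check, sketched here under the assumption that both tables live in the currently selected database, is to read the column metadata from information_schema:

```sql
-- Compare the character set and collation of the join column in both tables.
SELECT TABLE_NAME, COLUMN_NAME, CHARACTER_SET_NAME, COLLATION_NAME
FROM information_schema.COLUMNS
WHERE TABLE_SCHEMA = DATABASE()
  AND TABLE_NAME IN ('tradelog', 'trade_detail')
  AND COLUMN_NAME = 'tradeid';
```

If the two rows report different values in CHARACTER_SET_NAME, a join on that column involves the implicit conversion described above.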
Function Applied to the Input Parameter
explain
mysql> EXPLAIN SELECT l.* FROM tradelog l, trade_detail d WHERE d.tradeid=l.tradeid AND d.id=4;
+----+-------------+-------+------------+-------+---------------+---------+---------+-------+------+----------+-------+
| id | select_type | table | partitions | type  | possible_keys | key     | key_len | ref   | rows | filtered | Extra |
+----+-------------+-------+------------+-------+---------------+---------+---------+-------+------+----------+-------+
|  1 | SIMPLE      | d     | NULL       | const | PRIMARY       | PRIMARY | 4       | const |    1 |   100.00 | NULL  |
|  1 | SIMPLE      | l     | NULL       | ref   | tradeid       | tradeid | 131     | const |    1 |   100.00 | NULL  |
+----+-------------+-------+------------+-------+---------------+---------+---------+-------+------+----------+-------+
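- trade_detail is the driving table, tradelog is the driven table, and tradeid is the join column.
- The driven table tradelog is encoded in utf8mb4 and the driving table trade_detail in utf8.
- Equivalent to SELECT * FROM tradelog WHERE tradeid = CONVERT($R4.tradeid.value USING utf8mb4); where $R4 is the trade_detail row with id=4.
  - The function is applied to the input parameter, not to the secondary index tradeid, so the tree search can be used (key=tradeid and rows=1).
  - type=ref: in a join, the driven table is accessed through an index lookup on the referenced value.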
slowlog
# Time: 2019-02-12T17:31:50.553151+08:00
# User@Host: root[root] @ localhost []  Id:    13
# Query_time: 0.004090  Lock_time: 0.001874 Rows_sent: 1  Rows_examined: 1
SET timestamp=1549963910;
SELECT l.* FROM tradelog l, trade_detail d WHERE d.tradeid=l.tradeid AND d.id=4;
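Optimization
- Common fix: change the character set of trade_detail.tradeid to utf8mb4.
  - ALTER TABLE trade_detail MODIFY tradeid VARCHAR(32) CHARACTER SET utf8mb4 DEFAULT NULL;
- Rewrite the SQL (for cases where the table is large or the DDL cannot be run for now).
  - Explicitly convert l.tradeid to utf8, which avoids the implicit character set conversion on the driven table.
  - SELECT d.* FROM tradelog l, trade_detail d WHERE d.tradeid=CONVERT(l.tradeid USING utf8) AND l.id=2;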
mysql> EXPLAIN SELECT d.* FROM tradelog l, trade_detail d WHERE d.tradeid=CONVERT(l.tradeid USING utf8) AND l.id=2;
+----+-------------+-------+------------+-------+---------------+---------+---------+-------+------+----------+-------+
| id | select_type | table | partitions | type  | possible_keys | key     | key_len | ref   | rows | filtered | Extra |
+----+-------------+-------+------------+-------+---------------+---------+---------+-------+------+----------+-------+
|  1 | SIMPLE      | l     | NULL       | const | PRIMARY       | PRIMARY | 4       | const |    1 |   100.00 | NULL  |
|  1 | SIMPLE      | d     | NULL       | ref   | tradeid       | tradeid | 99      | const |    4 |   100.00 | NULL  |
+----+-------------+-------+------------+-------+---------------+---------+---------+-------+------+----------+-------+
# Time: 2019-02-12T17:50:29.844772+08:00
# User@Host: root[root] @ localhost []  Id:    13
# Query_time: 0.000504  Lock_time: 0.000206 Rows_sent: 4  Rows_examined: 4
SET timestamp=1549965029;
SELECT d.* FROM tradelog l, trade_detail d WHERE d.tradeid=CONVERT(l.tradeid USING utf8) AND l.id=2;
References
《MySQL實戰45講》 (MySQL in Practice: 45 Lectures)
Please credit the source when reposting: http://zhongmingmao.me/2019/02/12/mysql-index-function/