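Optimizing Correlated and Non-Correlated Subqueries (Part 3)
5.1 Non-correlated subqueries: several optimization cases
Example 1: an aggregate non-correlated subquery; the subquery is not eliminated, but it is optimized to execute only once
When an aggregate function is applied inside a non-correlated subquery, the execution plan is as follows: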
mysql> EXPLAIN SELECT * FROM t1 WHERE t1.a1>(SELECT MIN(t2.a2) FROM t2);
+----+-------------+-------+------------+------+------------------------------+
| id | select_type | table | partitions | type | Extra                        |
+----+-------------+-------+------------+------+------------------------------+
|  1 | PRIMARY     | t1    | NULL       | ALL  | Using where                  |
|  2 | SUBQUERY    | NULL  | NULL       | NULL | Select tables optimized away |
+----+-------------+-------+------------+------+------------------------------+
2 rows in set, 1 warning (0.00 sec)
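The execution plan shows that the non-correlated subquery is still present (the row with id 2 has select_type SUBQUERY). It was not eliminated, and there is no need to eliminate it, because executing it once is enough to obtain the result value.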
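To make the "executed only once" behavior concrete, the sketch below writes the same query as two explicit steps. The table definitions are an assumption: only the columns a1, b1, c1, a2, b2 and the index names i1 and i2 appear in this article, so the column types and the c2 column are hypothetical.

```sql
-- Hypothetical schema: types and the c2 column are assumptions;
-- i1 and i2 are the index names shown in the "key" column of the plans.
CREATE TABLE t1 (a1 INT, b1 INT, c1 INT, KEY i1 (a1));
CREATE TABLE t2 (a2 INT, b2 INT, c2 INT, KEY i2 (a2));

-- Step 1: evaluate the aggregate subquery once and keep the scalar result.
SET @min_a2 = (SELECT MIN(t2.a2) FROM t2);

-- Step 2: the outer query compares every t1 row against that one value.
SELECT * FROM t1 WHERE t1.a1 > @min_a2;
```

This is roughly what the optimizer does internally: the subquery does not depend on any column of t1, so its result can be computed up front and reused for every outer row.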
Example 2: a non-correlated subquery expressed with the IN predicate
For a non-correlated IN subquery, the execution plan is as follows:
mysql> EXPLAIN SELECT * FROM t1 WHERE t1.a1 IN (SELECT a2 FROM t2 WHERE t2.a2>5);
+----+-------------+-------+------+------+-----------------------------+
| id | select_type | table | type | key  | Extra                       |
+----+-------------+-------+------+------+-----------------------------+
|  1 | SIMPLE      | t1    | ALL  | NULL | Using where                 |
|  1 | SIMPLE      | t2    | ref  | i2   | Using index; FirstMatch(t1) |
+----+-------------+-------+------+------+-----------------------------+
2 rows in set, 1 warning (0.00 sec)
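The execution plan shows that the subquery is gone: t1 and t2 are joined directly. The FirstMatch strategy pulls the subquery up into the parent query and implements the non-correlated IN subquery as a join.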
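One way to see what FirstMatch contributes is to switch it off and compare plans. The following is a minimal sketch using the optimizer_switch system variable, assuming the t1 and t2 tables from this article. The resulting plans depend on the MySQL version, the indexes, and the data, so no expected output is shown.

```sql
-- Save the current optimizer settings so they can be restored afterwards.
SET @saved_switch = @@optimizer_switch;

-- Disable only the FirstMatch strategy; other semi-join strategies remain available.
SET optimizer_switch = 'firstmatch=off';
EXPLAIN SELECT * FROM t1 WHERE t1.a1 IN (SELECT a2 FROM t2 WHERE t2.a2 > 5);

-- Disable the semi-join transformation entirely, so the subquery is not pulled up.
SET optimizer_switch = 'semijoin=off';
EXPLAIN SELECT * FROM t1 WHERE t1.a1 IN (SELECT a2 FROM t2 WHERE t2.a2 > 5);

-- Restore the original settings.
SET optimizer_switch = @saved_switch;
```

Comparing the three plans side by side helps separate the effect of pulling the subquery up from the effect of the specific join strategy chosen.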
For another non-correlated IN subquery, the execution plan is as follows:
mysql> EXPLAIN SELECT * FROM t1 WHERE t1.a1 IN (SELECT a2 FROM t2 WHERE t2.a2=5);
+----+-------------+-------+------+------+-----------------------------+
| id | select_type | table | type | key  | Extra                       |
+----+-------------+-------+------+------+-----------------------------+
|  1 | SIMPLE      | t1    | ref  | i1   | NULL                        |
|  1 | SIMPLE      | t2    | ref  | i2   | Using index; FirstMatch(t1) |
+----+-------------+-------+------+------+-----------------------------+
2 rows in set, 1 warning (0.00 sec)
After optimization, the statement is rewritten as:
/* select#1 */ select `test`.`t1`.`a1` AS `a1`,`test`.`t1`.`b1` AS `b1`,`test`.`t1`.`c1` AS `c1`
from `test`.`t1` semi join (`test`.`t2`)
where ((`test`.`t1`.`a1` = 5) and (`test`.`t2`.`a2` = 5))
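The execution plan shows that the subquery is gone: t1 and t2 are joined directly with a semi-join, that is, the subquery is pulled up into the parent query and the IN operation is implemented as a semi-join. Pulling the subquery up also adds the join condition "a1=a2", and constant propagation combines it with the original condition "a2=5" to give "a1=a2=5". As a result, the two index scans in the plan use the conditions a1 = 5 and a2 = 5.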
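The rewritten statement quoted above can be retrieved by running SHOW WARNINGS right after EXPLAIN. The sketch below shows that step, followed by a hand-written query with the constant already propagated. The hand-written form is an illustration of the reasoning, not the optimizer's literal output, and it assumes the t1 and t2 tables from this article.

```sql
-- EXPLAIN records the optimizer's rewritten statement as a Note-level message.
EXPLAIN SELECT * FROM t1 WHERE t1.a1 IN (SELECT a2 FROM t2 WHERE t2.a2 = 5);
SHOW WARNINGS;

-- Hand-written equivalent: a1 = a2 together with a2 = 5 implies a1 = 5,
-- so each table can be probed through its own index with a constant.
SELECT *
FROM t1
WHERE t1.a1 = 5
  AND EXISTS (SELECT 1 FROM t2 WHERE t2.a2 = 5);
```

Both forms should return the same rows: a row of t1 qualifies only if its a1 is 5 and at least one row of t2 has a2 equal to 5.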
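5.2 Correlated subqueries: several optimization cases
For comparison, consider a correlated IN subquery. Here the subquery is optimized, and the execution plan is as follows: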
mysql> EXPLAIN SELECT * FROM t1 WHERE t1.a1 IN (SELECT a2 FROM t2 WHERE t1.a1=5);
+----+-------------+-------+------+------+-----------------------------+
| id | select_type | table | type | key  | Extra                       |
+----+-------------+-------+------+------+-----------------------------+
|  1 | SIMPLE      | t1    | ref  | i1   | NULL                        |
|  1 | SIMPLE      | t2    | ref  | i2   | Using index; FirstMatch(t1) |
+----+-------------+-------+------+------+-----------------------------+
2 rows in set, 2 warnings (0.00 sec)
After optimization, the statement is rewritten as:
/* select#1 */ select `test`.`t1`.`a1` AS `a1`,`test`.`t1`.`b1` AS `b1`,`test`.`t1`.`c1` AS `c1`
from `test`.`t1` semi join (`test`.`t2`)
where ((`test`.`t1`.`a1` = 5) and (`test`.`t2`.`a2` = 5))
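The execution plan shows that the subquery is gone: t1 and t2 are joined directly with a semi-join, that is, the subquery is pulled up into the parent query and the IN operation is implemented as a semi-join. Pulling the subquery up also adds the join condition "a1=a2", and constant propagation combines it with the original condition "a1=5" to give "a1=a2=5". As a result, the two index scans in the plan use the conditions a1 = 5 and a2 = 5. This is better than PostgreSQL, which does not optimize this case.
Another correlated subquery example, in which the subquery is not optimized: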
mysql> EXPLAIN SELECT * FROM t1 WHERE EXISTS (SELECT 1 FROM t2 WHERE t1.b1= t2.b2 AND t1.a1=5);
+----+--------------------+-------+------+------+-------------+
| id | select_type        | table | type | key  | Extra       |
+----+--------------------+-------+------+------+-------------+
|  1 | PRIMARY            | t1    | ALL  | NULL | Using where |
|  2 | DEPENDENT SUBQUERY | t2    | ALL  | NULL | Using where |
+----+--------------------+-------+------+------+-------------+
2 rows in set, 3 warnings (0.00 sec)
After optimization, the statement is rewritten as:
/* select#1 */ select `test`.`t1`.`a1` AS `a1`,`test`.`t1`.`b1` AS `b1`,`test`.`t1`.`c1` AS `c1`
from `test`.`t1`
where
  exists(
    /* select#2 */ select 1
    from `test`.`t2`
    where ((`test`.`t1`.`b1` = `test`.`t2`.`b2`) and (`test`.`t1`.`a1` = 5))
  )
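The execution plan shows that the subquery is still present (its select_type is DEPENDENT SUBQUERY). MySQL does not optimize this kind of correlated subquery. In this respect it does less well than PostgreSQL.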
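On a server that behaves as shown above, one possible workaround is to rewrite the EXISTS query by hand into an IN form that the semi-join transformation can handle. The sketch below assumes the t1 and t2 tables from this article. Whether the plan actually improves depends on the MySQL version and the available indexes (the plans here show no index on b1 or b2). According to the MySQL reference manual, versions from 8.0.16 onward apply semi-join transformations to EXISTS subqueries as well.

```sql
-- Original form: reported as DEPENDENT SUBQUERY in the plan above.
EXPLAIN SELECT * FROM t1
WHERE EXISTS (SELECT 1 FROM t2 WHERE t1.b1 = t2.b2 AND t1.a1 = 5);

-- Hand-written rewrite: the predicate on t1.a1 does not involve t2,
-- so it moves to the outer query, and the correlation becomes an IN predicate.
EXPLAIN SELECT * FROM t1
WHERE t1.a1 = 5
  AND t1.b1 IN (SELECT t2.b2 FROM t2);
```

In a WHERE clause the two forms select the same rows: a t1 row qualifies only if a1 is 5 and some t2 row has b2 equal to its b1.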
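Judging from these examples, MySQL's subquery optimization does not follow a clear rule that depends on whether the subquery is correlated or non-correlated.
So neither of the two major open-source databases draws a clear line between correlated and non-correlated subqueries when optimizing them. For aggregate non-correlated subqueries, however, both evaluate the subquery once, which optimizes this class of subquery.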