SQL Optimization Seen Through select count(*) After Multi-Table Joins
阿新 • Published: 2018-09-01
A friend asked me whether the following SQL can simply be rewritten as select count(*) from a.
SELECT COUNT(*) FROM a LEFT JOIN b ON a.a1 = b.b1 LEFT JOIN c ON b.b1 = c.c1
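Without further ado, let's go straight to the experiment.
1. Prepare the data
Create the test tables a, b, c and d and insert data: a contains duplicate values (and NULLs), b contains only unique values, c contains unique non-NULL values plus two NULLs, and d contains duplicate values.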
1) Create table a
    create table a (a1 int);
    insert into a select 1;
    insert into a select 2;
    insert into a select 3;
    insert into a select 1;
    insert into a select 2;
    insert into a select 3;
    insert into a values(null);
    insert into a values(null);
    insert into a values(null);
    insert into a values(null);
2) Create table b
    create table b (b1 int);
    insert into b select 1;
    insert into b select 2;
    insert into b select 3;
    insert into b select 4;
    insert into b select 5;
3) Create table c
    create table c (c1 int);
    insert into c select 7;
    insert into c select 8;
    insert into c select 9;
    insert into c values(null);
    insert into c values(null);
4) Create table d
    create table d (d1 int);
    insert into d select 1;
    insert into d select 1;
    insert into d select 1;
    insert into d select 1;
    insert into d select 1;
    insert into d select 1;
2. Inspect the data
Table a | Table b | Table c | Table d |
---|---|---|---|
1 | 1 | 7 | 1 |
2 | 2 | 8 | 1 |
3 | 3 | 9 | 1 |
1 | 4 | null | 1 |
2 | 5 | null | 1 |
3 | | | 1 |
null | | | |
null | | | |
null | | | |
null | | | |
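3. SQL examples
3.1 Table a joined to table b, then to table c (an N:1:1 relationship)
The join column of table a contains duplicate values, while the join columns of tables b and c contain only unique values.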
    SELECT COUNT(*) FROM a LEFT JOIN b ON a.a1 = b.b1 LEFT JOIN c ON b.b1 = c.c1;
    +----------+
    | COUNT(*) |
    +----------+
    |       10 |
    +----------+
    1 row in set (0.00 sec)
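10 rows are returned.
The query returns exactly the rows of a: every row of a matches at most one row of b, which in turn matches at most one row of c, so the joins neither add nor drop rows. In this case the SQL can be rewritten as: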
    mysql> select count(*) from a;
    +----------+
    | count(*) |
    +----------+
    |       10 |
    +----------+
    1 row in set (0.00 sec)
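Whether this rewrite is safe depends on b.b1 and c.c1 really being unique. A minimal sketch of how one might check that before dropping the joins, using only the test tables above (NULLs are filtered out on the assumption that they can never satisfy an equality join condition):

```sql
-- Any row returned here is a join-column value that occurs more than once
-- and would therefore multiply the rows of a in the LEFT JOIN.
SELECT b1, COUNT(*) AS cnt
FROM b
WHERE b1 IS NOT NULL
GROUP BY b1
HAVING COUNT(*) > 1;

SELECT c1, COUNT(*) AS cnt
FROM c
WHERE c1 IS NOT NULL
GROUP BY c1
HAVING COUNT(*) > 1;
```

With the test data both queries return an empty set, which is consistent with the count staying at 10.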
3.2 Table b joined to table a, then to table c (a 1:N:1 relationship)
    SELECT count(*) FROM b LEFT JOIN a ON b.b1 = a.a1 LEFT JOIN c ON a.a1 = c.c1;
    +----------+
    | count(*) |
    +----------+
    |        8 |
    +----------+
    1 row in set (0.00 sec)
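Table b has only 5 rows, but after the left joins there are 8, so this query cannot be rewritten as select count(*) from b. Selecting all columns from the same join shows the actual rows: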
    +------+------+------+
    | b1   | a1   | c1   |
    +------+------+------+
    |    1 |    1 | NULL |
    |    2 |    2 | NULL |
    |    3 |    3 | NULL |
    |    1 |    1 | NULL |
    |    2 |    2 | NULL |
    |    3 |    3 | NULL |
    |    4 | NULL | NULL |
    |    5 | NULL | NULL |
    +------+------+------+
    8 rows in set (0.00 sec)
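The duplicate values in a cause the matching rows of b to appear more than once. When c is joined to a, no values are equal (and NULL is not equal to NULL), so the c1 column is NULL in every row.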
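The NULL behaviour mentioned above can be checked directly. A small sketch, assuming MySQL (the `<=>` NULL-safe equality operator is MySQL-specific):

```sql
-- "=" yields NULL (not true) when either side is NULL, so a join condition
-- such as a.a1 = c.c1 never matches two NULLs.
SELECT NULL = NULL   AS plain_equal,      -- NULL
       NULL <=> NULL AS null_safe_equal;  -- 1
```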
Since the join to c neither adds nor removes rows, the query is equivalent to the following:
    SELECT count(*) FROM b LEFT JOIN a ON b.b1 = a.a1;
    +----------+
    | count(*) |
    +----------+
    |        8 |
    +----------+
    1 row in set (0.00 sec)
3.3 Table a joined to table d (an N:N relationship)
    SELECT * FROM a LEFT JOIN d ON a.a1 = d.d1;
    +------+------+
    | a1   | d1   |
    +------+------+
    |    1 |    1 |
    |    1 |    1 |
    |    1 |    1 |
    |    1 |    1 |
    |    1 |    1 |
    |    1 |    1 |
    |    1 |    1 |
    |    1 |    1 |
    |    1 |    1 |
    |    1 |    1 |
    |    1 |    1 |
    |    1 |    1 |
    |    2 | NULL |
    |    3 | NULL |
    |    2 | NULL |
    |    3 | NULL |
    | NULL | NULL |
    | NULL | NULL |
    | NULL | NULL |
    | NULL | NULL |
    +------+------+
    20 rows in set (0.00 sec)
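The result breaks down as follows: the 2 rows of a with value 1 × the 6 rows of d with value 1 = 12 rows, plus the other 8 rows of a (which have no match in d and are returned once each), for 20 rows in total.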
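The same arithmetic can be done in SQL without running the join itself. A sketch, assuming the test tables above: each row of a contributes as many result rows as it has matches in d, or exactly one row when it has none.

```sql
SELECT SUM(COALESCE(m.cnt, 1)) AS predicted_rows
FROM a
LEFT JOIN (
    SELECT d1, COUNT(*) AS cnt   -- number of rows per join value in d
    FROM d
    GROUP BY d1
) AS m ON a.a1 = m.d1;           -- m.d1 is unique, so this join cannot fan out
```

For the test data this gives 2 × 6 + 8 × 1 = 20, matching the row count of the join.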
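4. Summary
These experiments generalize: when the join column has very low cardinality, a left join behaves much like a Cartesian product.
So when optimizing SQL, pay particular attention to the cardinality of the join columns and to the relationships between the tables.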
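One simple way to look at join-column cardinality is to compare the total row count with the number of distinct values. A sketch against the test tables (what ratio counts as "low" is a judgment call, not something the experiments above define):

```sql
-- COUNT(DISTINCT ...) ignores NULLs.
SELECT COUNT(*) AS total_rows, COUNT(DISTINCT a1) AS distinct_values FROM a;  -- 10, 3
SELECT COUNT(*) AS total_rows, COUNT(DISTINCT d1) AS distinct_values FROM d;  -- 6, 1
```

Table d has a single distinct value across 6 rows, which is why every matching row of a is multiplied by 6.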