1. 程式人生 > >Postgresql 資料在一張表中存在,另一張表不滿足完整性的查詢

Postgresql 資料在一張表中存在,另一張表不滿足完整性的查詢

有兩張表T1,T2,表結構和資料如下:
create table t1 (
  id int
);

create table t2 (
  id int,
  finished int
);

insert into t1 values (1);
insert into t1 values (2);
insert into t1 values (3);
insert into t1 values (4);
insert into t1 values (5);
insert into t1 values (6);

insert into t2 values (1,1);
insert into t2 values (2,0);
insert into t2 values (3,1);
insert into t2 values (4,0);
SQL Fiddle 可以測試SQL語句的執行。
想要實現T1中存在前提下,T2中不存在或者finished=0,也就是查詢結果是:2, 4, 5, 6.

一、效能測試:

1. not in:
explain (analyze,verbose,costs,buffers) select ID from T1 where ID not in (select ID from T2 where finished=1);
Total runtime: 0.128 ms

2. not exists:
explain (analyze,verbose,costs,buffers) select ID from T1 where not exists (select 1 from T2 where T1.ID=T2.ID and T2.finished=1);
Total runtime: 0.105 ms

3. left join:
explain (analyze,verbose,costs,buffers) select T1.ID from T1 left join T2 on T1.ID=T2.ID and T2.finished=1 where T2.ID is null;
Total runtime: 0.096 ms

4. 網上還看到一種更快方法,但測試下來此方法不對,所以不討論:
select ID from T2 where (select count(1) from T1 where T1.ID=T2.ID) = 0; 這條語句查詢結果為空


因此在postgresql 9.3 上語句執行速度 left join > not exists > not in

當T1和T2表中ID出現null時,not in 語句會有不同的表現,所以推薦總是用not exists 代替 not in.



二、大資料量效能測試:

在大量資料的時候,not in有嚴重效能下降的問題,下面是我在i5 2.4GHz MAC pro 13吋上的測試。
department(T1) 為59280條資料,資料長度29字元;dept(T2) 為23633條資料,資料長度29字元。

1. explain analyze select department.id from department where department.id not in (select id from dept where finished=true);
Total runtime: 447073.065 ms

2. explain analyze select department.id from department where not exists (select 1 from dept where department.id=dept.id and finished=true);
Total runtime: 325.732 ms

3. explain analyze select department.id from department left join dept on department.id=dept.id and dept.finished=true where dept.id is null;
Total runtime: 319.869 ms


三、總結:

在Postgresql 9.3上:
not in 不僅效能差,而且邏輯可能有問題。
not exists 效能不錯,思考起來比較容易。
left join 效能最好,但總體跟not exists 比也快不了多少,思考稍微有點繞。


下面是一張網上的left join 的圖,但找不到出處,有助於理解 left join 的過程: