Testing the driving table under Oracle hash join and nested loops

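The Oracle driving table

An Oracle driving table, also called the outer table, is the table that is scanned first in a multi-table join. For each row of the driving table, Oracle looks up the matching rows in the other table and then computes the final result.

The concept of a driving table applies only to nested loops and hash joins.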

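Principles:

1. The driving table is usually the smaller table, but not always; see below.

2. The driving table is usually the table with fewer rows left after the WHERE conditions are applied.

3. A table whose rows are very long, each spanning several data blocks, can also be a suitable driving table.

4. The CBO and the RBO choose the driving table differently: the CBO calculates its choice from statistics, while the RBO follows fixed rules.

5. Under the RBO, the rightmost table in the FROM clause is the driving table (tables in FROM are processed from right to left, and WHERE conditions from bottom to top).

6. For queries that involve a driving table, indexes on the join condition matter a great deal. The driving table's join column can go without an index, but the driven table is scanned once for every row of the driving table that survives filtering, so an index on the driven table's join column is very important.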
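Principle 6 can be illustrated with a minimal Python sketch of a nested loops join. This is not how Oracle implements the join; the function names, the list-of-dicts tables, and the dictionary standing in for an index are all assumptions made for illustration. It only counts how many driven-table rows are examined with and without an index.

```python
from collections import defaultdict

def build_index(rows, key):
    """Hypothetical stand-in for an index: join-key value -> list of rows."""
    index = defaultdict(list)
    for row in rows:
        index[row[key]].append(row)
    return index

def nested_loop_join(driving_rows, driven_rows, key, driven_index=None):
    """Join on `key`; return (matches, number of driven rows examined)."""
    matches, examined = [], 0
    for outer in driving_rows:                 # each surviving driving row is visited once
        if driven_index is not None:
            candidates = driven_index.get(outer[key], [])  # index lookup
        else:
            candidates = driven_rows           # full scan of the driven table
        for inner in candidates:
            examined += 1
            if inner[key] == outer[key]:
                matches.append((outer, inner))
    return matches, examined

# Illustrative data only: a small table a and a larger table b with duplicate ids.
a = [{"id": i} for i in range(10)]
b = [{"id": i % 500} for i in range(1000)]

driving = [row for row in a if row["id"] == 3]   # WHERE filter applied to the driving table
no_index = nested_loop_join(driving, b, "id")
with_index = nested_loop_join(driving, b, "id", build_index(b, "id"))

print(len(no_index[0]), no_index[1])       # 2 matches, 1000 driven rows examined
print(len(with_index[0]), with_index[1])   # 2 matches, 2 driven rows examined
```

Each additional row left in the driving table repeats the inner lookup, which is why the full-scan variant grows so quickly while the indexed variant stays cheap.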

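Analysis:

Suppose table a has 10 rows and table b has 1000 rows. Both tables have an id column, and the query joins them on id:

Select * from a,b where a.id=b.id and a.id=100;

Table A is the better choice of driving table. Suppose a.id=100 matches only 1 row; even a full scan of a touches only a few blocks. Assume table a occupies 10 blocks.

Suppose id in table B is not unique. If b's id column is indexed, table b occupies 100 blocks with 10 rows per block, the index on id occupies 10 blocks, and there are 2 rows with id 100, located in two different blocks,

then the cost of this statement (measured in blocks, here and below) is:

Table A (10 blocks) * index on b (10 blocks) + the 2 blocks of b holding id 100 = 102 blocks

If table b has no index, the cost is:

Table A (10 blocks) * table b (100 blocks) = 1000 blocks

If neither a nor b has an index, the cost of the statement is the same no matter which table is the driving table.

If the id columns of both a and b are indexed, and the index on a's id column occupies 2 blocks, the cost is:

Index on a's id (2 blocks) * index on b's id (10 blocks) + the 2 blocks of b holding id 100 = 22 blocks

The case where table B's rows are very long, so that B could serve as the driving table, is more complicated; readers can work out a suitable scenario for themselves.

This shows how important an index on the join column is.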
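The block arithmetic above can be rechecked with a few lines of Python. The function and constant names are hypothetical, and the formula is only the simplified model used in this analysis (outer blocks times inner blocks read per probe, plus the table blocks holding the matching rows), not the optimizer's real cost formula.

```python
def join_cost(outer_blocks, inner_blocks_per_probe, matching_table_blocks=0):
    """Simplified block cost from the analysis above (not Oracle's cost model)."""
    return outer_blocks * inner_blocks_per_probe + matching_table_blocks

# Block counts assumed in the text.
A_TABLE, A_INDEX = 10, 2
B_TABLE, B_INDEX, B_MATCHING = 100, 10, 2

cost_b_indexed = join_cost(A_TABLE, B_INDEX, B_MATCHING)     # index on b only
cost_no_index = join_cost(A_TABLE, B_TABLE)                  # no index, a drives
cost_no_index_swapped = join_cost(B_TABLE, A_TABLE)          # no index, b drives
cost_both_indexed = join_cost(A_INDEX, B_INDEX, B_MATCHING)  # indexes on a and b

print(cost_b_indexed, cost_no_index, cost_no_index_swapped, cost_both_indexed)
# 102 1000 1000 22
```

Under this model the unindexed cost is a plain product, so swapping the two tables cannot change it, which matches the observation above.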

Experiments

SQL> create table a(id,name) as select object_id,object_name from all_objects where rownum < 200;

Table created.

SQL> create table b as select * from all_objects;

Table created.

SQL> select count(*) from a;

  COUNT(*)
----------
       199

SQL> select count(*) from b;

  COUNT(*)
----------
     89083

SQL> exec dbms_stats.gather_table_stats('TEST','A');

PL/SQL procedure successfully completed.

SQL> exec dbms_stats.gather_table_stats('TEST','B');

PL/SQL procedure successfully completed.

Neither table has an index

Select count(*) from a,b where a.id=b.object_id
And a.id=53

Execution plan (table B drives):

SQL> Select count(*) from a,b where a.id=b.object_id
  2  And a.id=53
  3  /

  COUNT(*)
----------
         1

Execution Plan
----------------------------------------------------------
Plan hash value: 319234518

----------------------------------------------------------------------------
| Id  | Operation           | Name | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------
|   0 | SELECT STATEMENT    |      |     1 |     9 |   420   (1)| 00:00:01 |
|   1 |  SORT AGGREGATE     |      |     1 |     9 |            |          |
|*  2 |   HASH JOIN         |      |     1 |     9 |   420   (1)| 00:00:01 |
|*  3 |    TABLE ACCESS FULL| B    |     1 |     5 |   417   (1)| 00:00:01 |
|*  4 |    TABLE ACCESS FULL| A    |     1 |     4 |     3   (0)| 00:00:01 |
----------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("A"."ID"="B"."OBJECT_ID")
   3 - filter("B"."OBJECT_ID"=53)
   4 - filter("A"."ID"=53)

Statistics
----------------------------------------------------------
         1  recursive calls
         0  db block gets
      1506  consistent gets
         0  physical reads
         0  redo size
       542  bytes sent via SQL*Net to client
       543  bytes received via SQL*Net from client
         2  SQL*Net roundtrips to/from client
         0  sorts (memory)
         0  sorts (disk)
         1  rows processed

Table A as the driving table

SQL> Select /*+ ordered use_nl(a)  */ count(*) from a,b where a.id=b.object_id
  2  And a.id=53;

  COUNT(*)
----------
         1

1 row selected.

Execution Plan
----------------------------------------------------------
Plan hash value: 1397777030

----------------------------------------------------------------------------
| Id  | Operation           | Name | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------
|   0 | SELECT STATEMENT    |      |     1 |     9 |   420   (1)| 00:00:01 |
|   1 |  SORT AGGREGATE     |      |     1 |     9 |            |          |
|*  2 |   HASH JOIN         |      |     1 |     9 |   420   (1)| 00:00:01 |
|*  3 |    TABLE ACCESS FULL| A    |     1 |     4 |     3   (0)| 00:00:01 |
|*  4 |    TABLE ACCESS FULL| B    |     1 |     5 |   417   (1)| 00:00:01 |
----------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("A"."ID"="B"."OBJECT_ID")
   3 - filter("A"."ID"=53)
   4 - filter("B"."OBJECT_ID"=53)

Statistics
----------------------------------------------------------
         1  recursive calls
         0  db block gets
      1506  consistent gets
         0  physical reads
         0  redo size
       542  bytes sent via SQL*Net to client
       543  bytes received via SQL*Net from client
         2  SQL*Net roundtrips to/from client
         0  sorts (memory)
         0  sorts (disk)
         1  rows processed

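The two statements above turn out to have the same cost.

/*+ Ordered use_nl(table_name) */   -- this hint is meant to force a table to be the driving table. Note that although use_nl was specified here, the plan still used a hash join, which suggests that without an index the Oracle optimizer prefers a hash join. (USE_NL names the inner table of a nested loops join; because ORDERED already makes a the outer table, use_nl(a) cannot be honored as written, which may also explain the result.)

In the execution plan, the first table listed under HASH JOIN is the driving table; here it is table A.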
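One plausible reason the two plans report identical cost and consistent gets is that a hash join reads each input once, whichever table comes first. The Python sketch below is a simplified in-memory model, not Oracle's implementation; the function name, the row counts for b, and the omission of the id=53 filter are assumptions made for illustration. The first argument plays the role of the first table under HASH JOIN in the plan.

```python
from collections import defaultdict

def hash_join(build_rows, probe_rows, build_key, probe_key):
    """Return (joined pairs, total rows read, rows stored in the hash table)."""
    table = defaultdict(list)
    rows_read = 0
    for row in build_rows:                 # build phase: hash the first input
        rows_read += 1
        table[row[build_key]].append(row)
    pairs = []
    for row in probe_rows:                 # probe phase: one lookup per row of the second input
        rows_read += 1
        for match in table.get(row[probe_key], []):
            pairs.append((match, row))
    stored = sum(len(bucket) for bucket in table.values())
    return pairs, rows_read, stored

a = [{"id": i} for i in range(1, 200)]            # 199 rows, like table a
b = [{"object_id": i} for i in range(1, 5001)]    # hypothetical 5000 rows standing in for b

a_first = hash_join(a, b, "id", "object_id")
b_first = hash_join(b, a, "object_id", "id")

print(len(a_first[0]), a_first[1], a_first[2])    # 199 5199 199
print(len(b_first[0]), b_first[1], b_first[2])    # 199 5199 5000
```

In this model the rows read are the same in both orders; what changes is how many rows have to be held in the hash table, which is why the smaller input is normally preferred as the first table.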

Table B with an index on the object_id column

SQL> create index id_b_object_id on b(object_id);

Index created.

SQL> exec dbms_stats.gather_table_stats(ownname => 'TEST',TABNAME => 'B',CASCADE => TRUE);

PL/SQL procedure successfully completed.

Execution plan:

SQL> Select count(*) from a,b where a.id=b.object_id
  2  And a.id=53;

  COUNT(*)
----------
         1

1 row selected.

Execution Plan
----------------------------------------------------------
Plan hash value: 3168189658

----------------------------------------------------------------------------------------
| Id  | Operation             | Name           | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT      |                |     1 |     9 |     4   (0)| 00:00:01 |
|   1 |  SORT AGGREGATE       |                |     1 |     9 |            |          |
|   2 |   MERGE JOIN CARTESIAN|                |     1 |     9 |     4   (0)| 00:00:01 |
|*  3 |    TABLE ACCESS FULL  | A              |     1 |     4 |     3   (0)| 00:00:01 |
|   4 |    BUFFER SORT        |                |     1 |     5 |     1   (0)| 00:00:01 |
|*  5 |     INDEX RANGE SCAN  | ID_B_OBJECT_ID |     1 |     5 |     1   (0)| 00:00:01 |
----------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   3 - filter("A"."ID"=53)
   5 - access("B"."OBJECT_ID"=53)

Statistics
----------------------------------------------------------
        92  recursive calls
         0  db block gets
       134  consistent gets
        23  physical reads
         0  redo size
       542  bytes sent via SQL*Net to client
       543  bytes received via SQL*Net from client
         2  SQL*Net roundtrips to/from client
        12  sorts (memory)
         0  sorts (disk)
         1  rows processed

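The execution plan used neither nested loops nor a hash join, but once the index is used the cost drops noticeably. The merge join adds a BUFFER SORT step, which is fine when it fits in memory but becomes expensive when it does not.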
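A likely reading of the MERGE JOIN CARTESIAN step: the predicate section shows both "A"."ID"=53 and "B"."OBJECT_ID"=53, so each side is already restricted to the same constant and the join condition filters nothing further. The Python sketch below checks that equivalence on made-up data; the table contents are assumptions and the code says nothing about how Oracle executes the plan.

```python
from itertools import product

# Hypothetical data: a has unique ids, b has each object_id twice.
a = [{"id": i} for i in range(1, 200)]
b = [{"object_id": i // 2} for i in range(2000)]

CONST = 53

# The query as written: equi-join plus a constant filter on a.id.
equi_join = [(x, y) for x in a for y in b
             if x["id"] == y["object_id"] and x["id"] == CONST]

# The same result as a Cartesian product of the two filtered inputs.
a_filtered = [x for x in a if x["id"] == CONST]
b_filtered = [y for y in b if y["object_id"] == CONST]
cartesian = list(product(a_filtered, b_filtered))

print(len(equi_join), len(cartesian))   # 2 2
print(equi_join == cartesian)           # True
```

When both filtered inputs are tiny, as the Rows column estimates here, the Cartesian product is cheap, which is consistent with the low cost shown in the plan.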

Forcing a hash join

Table A drives

SQL> Select /*+ use_hash(a,b)  */ count(*) from a,b where a.id=b.object_id
  2  And a.id=53;

  COUNT(*)
----------
         1

1 row selected.

Execution Plan
----------------------------------------------------------
Plan hash value: 895278611

--------------------------------------------------------------------------------------
| Id  | Operation           | Name           | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT    |                |     1 |     9 |     4   (0)| 00:00:01 |
|   1 |  SORT AGGREGATE     |                |     1 |     9 |            |          |
|*  2 |   HASH JOIN         |                |     1 |     9 |     4   (0)| 00:00:01 |
|*  3 |    TABLE ACCESS FULL| A              |     1 |     4 |     3   (0)| 00:00:01 |
|*  4 |    INDEX RANGE SCAN | ID_B_OBJECT_ID |     1 |     5 |     1   (0)| 00:00:01 |
--------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("A"."ID"="B"."OBJECT_ID")
   3 - filter("A"."ID"=53)
   4 - access("B"."OBJECT_ID"=53)

Statistics
----------------------------------------------------------
         1  recursive calls
         0  db block gets
         5  consistent gets
         0  physical reads
         0  redo size
       542  bytes sent via SQL*Net to client
       543  bytes received via SQL*Net from client
         2  SQL*Net roundtrips to/from client
         0  sorts (memory)
         0  sorts (disk)
         1  rows processed

-- With a hash join forced, table a becomes the driving table by default, and the cost is very low, as desired.

Table B drives

SQL> Select /*+ ordered use_hash(b)  */ count(*) from a,b where a.id=b.object_id
  2  And a.id=53;

  COUNT(*)
----------
         1

1 row selected.

Execution Plan
----------------------------------------------------------
Plan hash value: 895278611

--------------------------------------------------------------------------------------
| Id  | Operation           | Name           | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT    |                |     1 |     9 |     4   (0)| 00:00:01 |
|   1 |  SORT AGGREGATE     |                |     1 |     9 |            |          |
|*  2 |   HASH JOIN         |                |     1 |     9 |     4   (0)| 00:00:01 |
|*  3 |    TABLE ACCESS FULL| A              |     1 |     4 |     3   (0)| 00:00:01 |
|*  4 |    INDEX RANGE SCAN | ID_B_OBJECT_ID |     1 |     5 |     1   (0)| 00:00:01 |
--------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("A"."ID"="B"."OBJECT_ID")
   3 - filter("A"."ID"=53)
   4 - access("B"."OBJECT_ID"=53)

Statistics
----------------------------------------------------------
         1  recursive calls
         0  db block gets
         5  consistent gets
         0  physical reads
         0  redo size
       542  bytes sent via SQL*Net to client
       543  bytes received via SQL*Net from client
         2  SQL*Net roundtrips to/from client
         0  sorts (memory)
         0  sorts (disk)
         1  rows processed

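With an index and statistics in place, table B could not be forced to be the driving table; Oracle appears to have ignored the hint. (Note that ORDERED joins tables in the order they appear in the FROM clause, and a is listed first here, so the hint as written still asks for a to come first. LEADING(b), or listing b first in FROM, would be the usual way to request b as the first table.)

Let's try deleting the statistics:

SQL> EXEC dbms_stats.delete_table_stats(user,'B',cascade_parts => TRUE);

PL/SQL procedure successfully completed

SQL> EXEC dbms_stats.delete_table_stats(user,'A',cascade_parts => TRUE);

PL/SQL procedure successfully completed

-- Testing shows that B still cannot be made the driving table, so change optimizer_mode to rule:

alter session set optimizer_mode=rule;

SQL> Select /*+ ordered use_nl(b)  */ count(*) from a,b where a.id=b.object_id
  2  And object_id=53;

-- B still cannot be made the driving table.

Forcing nested loops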

SQL> Select /*+ ordered use_nl(b)  */ count(*) from a,b where a.id=b.object_id
  2  And object_id=53;

  COUNT(*)
----------
         1

1 row selected.

Execution Plan
----------------------------------------------------------
Plan hash value: 1183094437

--------------------------------------------------------------------------------------
| Id  | Operation           | Name           | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT    |                |     1 |    26 |     4   (0)| 00:00:01 |
|   1 |  SORT AGGREGATE     |                |     1 |    26 |            |          |
|   2 |   NESTED LOOPS      |                |     1 |    26 |     4   (0)| 00:00:01 |
|*  3 |    TABLE ACCESS FULL| A              |     1 |    13 |     3   (0)| 00:00:01 |
|*  4 |    INDEX RANGE SCAN | ID_B_OBJECT_ID |     1 |    13 |     1   (0)| 00:00:01 |
--------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   3 - filter("A"."ID"=53)
   4 - access("OBJECT_ID"=53)

Note
-----
   - dynamic statistics used: dynamic sampling (level=2)

Statistics
----------------------------------------------------------
        10  recursive calls
         0  db block gets
        73  consistent gets
         1  physical reads
         0  redo size
       542  bytes sent via SQL*Net to client
       543  bytes received via SQL*Net from client
         2  SQL*Net roundtrips to/from client
         0  sorts (memory)
         0  sorts (disk)
         1  rows processed

-- The cost is about the same as with the hash join. Also, even when trying to force table B to be the driving table, B still does not become the driving table.

Both tables indexed

SQL> create index id_a_id on a(id);

Index created.

SQL> exec dbms_stats.gather_table_stats(user,'A',CASCADE=>TRUE);

PL/SQL procedure successfully completed.

SQL> exec dbms_stats.gather_table_stats(user,'B',cascade => true);

PL/SQL procedure successfully completed.

SQL> Select /*+ ordered use_nl(b)  */ count(*) from a,b where a.id=b.object_id
  2  And object_id=53;

  COUNT(*)
----------
         1

1 row selected.

Elapsed: 00:00:00.01

Execution Plan
----------------------------------------------------------
Plan hash value: 2751652919

-------------------------------------------------------------------------------------
| Id  | Operation          | Name           | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |                |     1 |     9 |     2   (0)| 00:00:01 |
|   1 |  SORT AGGREGATE    |                |     1 |     9 |            |          |
|   2 |   NESTED LOOPS     |                |     1 |     9 |     2   (0)| 00:00:01 |
|*  3 |    INDEX RANGE SCAN| ID_A_ID        |     1 |     4 |     1   (0)| 00:00:01 |
|*  4 |    INDEX RANGE SCAN| ID_B_OBJECT_ID |     1 |     5 |     1   (0)| 00:00:01 |
-------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   3 - access("A"."ID"=53)
   4 - access("OBJECT_ID"=53)

Statistics
----------------------------------------------------------
         1  recursive calls
         0  db block gets
         3  consistent gets
         0  physical reads
         0  redo size
       542  bytes sent via SQL*Net to client
       543  bytes received via SQL*Net from client
         2  SQL*Net roundtrips to/from client
         0  sorts (memory)
         0  sorts (disk)
         1  rows processed

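-- Even with the hint, table B cannot be forced to be the driving table.

The cost drops noticeably again, halving from 4 to 2, which shows how much indexes matter.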
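To put the runs side by side, the short Python snippet below tabulates the consistent gets reported in the Statistics sections above and prints each as a ratio to the unindexed baseline. The labels are informal descriptions rather than Oracle terminology, and the 134 and 73 figures come from runs with extra recursive calls (92 and 10), so they are not strictly comparable with the others.

```python
# Consistent gets copied from the autotrace Statistics sections above.
runs = [
    ("no index, hash join, B first", 1506),
    ("no index, hash join, A first", 1506),
    ("index on b, merge join cartesian", 134),
    ("index on b, hash join", 5),
    ("index on b, nested loops, stats deleted", 73),
    ("index on a and b, nested loops", 3),
]

baseline = runs[0][1]
for label, gets in runs:
    # Ratio of the unindexed baseline to this run's consistent gets.
    print(f"{label:42s} {gets:5d}  {baseline / gets:6.1f}x")
```

The last line works out to 1506 / 3 = 502, that is, about 500 times fewer consistent gets than the unindexed join for this query.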

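I ran these tests on a 12c database. 12c really does seem to produce more accurate execution plans, and hints look less and less necessary as a supporting tool. Is this a sign that DBAs are about to be out of work, or is it just lightening their load?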