Comparing B-tree and Bitmap Indexes

1. B-tree index


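We create table t1, whose object_id values contain no duplicates, and table t2, whose object_id values are highly repetitive.
We then build an ordinary (B-tree) index on the object_id column of both tables to show that an ordinary index suits columns with few duplicate values.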


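Pros: suited to columns whose key values are rarely duplicated.
     A B-tree index works like the table of contents of a book: it leads straight to the rowid and from there to the row we want, which cuts I/O and speeds up the query. A notable property is that lookup cost depends on the height of the index, which grows only logarithmically with the row count, so it is almost independent of the amount of data in the table.
Cons: not suited to columns whose key values are heavily duplicated.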
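
The effect of key duplication on a range scan can be illustrated outside the database. The following is a minimal Python sketch, not Oracle's implementation: it assumes the leaf level of a B-tree can be approximated by a sorted list of (key, rowid) pairs, and the names build_index and range_scan are hypothetical. It counts how many index entries a lookup for key 1 has to visit when keys are unique (like t1.object_id) and when they take only two values (like mod(object_id,2) in t2).

```python
from bisect import bisect_left, bisect_right

def build_index(values):
    """Sorted (key, rowid) pairs, a stand-in for B-tree leaf entries."""
    return sorted((key, rowid) for rowid, key in enumerate(values))

def range_scan(index, key):
    """Binary-search the first and last entry for `key`.

    Returns (rowids, entries_visited).
    """
    lo = bisect_left(index, (key, -1))
    hi = bisect_right(index, (key, float("inf")))
    rowids = [rowid for _, rowid in index[lo:hi]]
    return rowids, hi - lo

n = 10_000                                # hypothetical row count
unique_keys = list(range(n))              # no duplicates, like t1.object_id
binary_keys = [i % 2 for i in range(n)]   # two values, like t2.object_id

_, visited_unique = range_scan(build_index(unique_keys), 1)
_, visited_binary = range_scan(build_index(binary_keys), 1)
print(visited_unique, visited_binary)     # 1 5000
```

With unique keys the scan touches a single entry; with two distinct values it touches half of the index, which is in line with the optimizer choosing INDEX FAST FULL SCAN for t2 in the transcript that follows.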

SQL> create table t1 as select object_id,object_name from dba_objects;
Table created.
SQL> create table t2 as select mod(object_id,2) object_id,object_name from dba_objects;
Table created.
SQL> create index ind_t1 on t1(object_id);
Index created.
SQL> create index ind_t2 on t2(object_id);
Index created.


Gather statistics (in SQL*Plus each PL/SQL block must be terminated with a slash):
BEGIN
  DBMS_STATS.GATHER_TABLE_STATS(
    ownname          => 'scott',
    tabname          => 't1',
    estimate_percent => 100,
    method_opt       => 'for all columns size 1',
    degree           => 8,
    cascade          => TRUE
  );
END;
/
BEGIN
  DBMS_STATS.GATHER_TABLE_STATS(
    ownname          => 'scott',
    tabname          => 't2',
    estimate_percent => 100,
    method_opt       => 'for all columns size 1',
    degree           => 8,
    cascade          => TRUE
  );
END;
/


SQL> select count(*) from t1 where object_id=1;


  COUNT(*)
----------
	 0

Execution Plan
----------------------------------------------------------
Plan hash value: 2587783732


----------------------------------------------------------------------------
| Id  | Operation	  | Name   | Rows  | Bytes | Cost (%CPU)| Time	   |
----------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |	   |	 1 |	 5 |	 1   (0)| 00:00:01 |
|   1 |  SORT AGGREGATE   |	   |	 1 |	 5 |		|	   |
|*  2 |   INDEX RANGE SCAN| IND_T1 |	 1 |	 5 |	 1   (0)| 00:00:01 |
----------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------


   2 - access("OBJECT_ID"=1)
Statistics
----------------------------------------------------------
	  1  recursive calls
	  0  db block gets
	  2  consistent gets
	  0  physical reads
	  0  redo size
	525  bytes sent via SQL*Net to client
	523  bytes received via SQL*Net from client
	  2  SQL*Net roundtrips to/from client
	  0  sorts (memory)
	  0  sorts (disk)
	  1  rows processed


SQL> select count(*) from t2 where object_id=1;


  COUNT(*)
----------
     36200
Execution Plan
----------------------------------------------------------
Plan hash value: 2800912005


--------------------------------------------------------------------------------
| Id  | Operation	      | Name   | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------
|   0 | SELECT STATEMENT      |        |     1 |     3 |    38	 (0)| 00:00:01 |
|   1 |  SORT AGGREGATE       |        |     1 |     3 |	    |	       |
|*  2 |   INDEX FAST FULL SCAN| IND_T2 | 36086 |   105K|    38	 (0)| 00:00:01 |
--------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
   2 - filter("OBJECT_ID"=1)
Statistics
----------------------------------------------------------
	  1  recursive calls
	  0  db block gets
	144  consistent gets
	  0  physical reads
	  0  redo size
	527  bytes sent via SQL*Net to client
	523  bytes received via SQL*Net from client
	  2  SQL*Net roundtrips to/from client
	  0  sorts (memory)
	  0  sorts (disk)
	  1  rows processed

SQL> drop index ind_t1;
Index dropped.


SQL> drop index ind_t2;
Index dropped.






2. Bitmap index


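A bitmap index suits columns with low cardinality: a small, enumerable set of values, many duplicates, and data that is rarely updated. Because one key value maps to many rows (rowids), updating an indexed key value locks the bitmap entry, so the other rows covered by that entry cannot be modified and concurrent sessions block.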
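
To make the locking concern concrete, here is a minimal Python sketch under a strong simplifying assumption: each distinct key has exactly one bitmap entry covering every row with that key. A real bitmap index splits entries by rowid range, so the numbers are only illustrative, and the names build_bitmap_index and rows_covered_by_update are hypothetical.

```python
from collections import defaultdict

def build_bitmap_index(values):
    """Map each distinct key to the set of rowids that carry it.

    Simplification: one entry per key, covering all of its rows.
    """
    index = defaultdict(set)
    for rowid, key in enumerate(values):
        index[key].add(rowid)
    return index

def rows_covered_by_update(index, old_key, new_key):
    """Rowids covered by the entries that change when ONE row moves
    from old_key to new_key (both entries must be rewritten)."""
    return index[old_key] | index[new_key]

values = [i % 2 for i in range(1000)]     # two distinct keys, 1000 rows
index = build_bitmap_index(values)
covered = rows_covered_by_update(index, 0, 1)
print(len(covered))                       # 1000
```

In this toy model, changing a single row from key 0 to key 1 touches entries that together cover all 1000 rows, which is why concurrent DML on a bitmap-indexed column tends to serialize.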


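Pros: a good fit for OLAP workloads such as reporting databases, for highly repetitive data, and for certain query types such as count, OR and AND, because the result can be obtained with bitwise operations alone.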


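Cons: not suited to columns with few duplicates, nor to columns under frequent DML (insert, update, delete), because locking on a bitmap index is very expensive: modifying one bitmap piece affects the whole piece. For example, changing one key value affects the many rows that share that key, so bitmap indexes are essentially unsuitable for OLTP systems.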
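
The claim that count, OR and AND reduce to bit operations can be sketched in a few lines of Python. This is an illustration only, assuming one uncompressed bitmap per distinct key stored as a Python integer; the status column and the function names are hypothetical and do not appear in the experiment.

```python
def build_bitmaps(values):
    """One integer per distinct key; bit i is set when row i holds that key."""
    bitmaps = {}
    for rowid, key in enumerate(values):
        bitmaps[key] = bitmaps.get(key, 0) | (1 << rowid)
    return bitmaps

def popcount(bitmap):
    """Number of set bits, i.e. number of matching rows."""
    return bin(bitmap).count("1")

# Six hypothetical rows with two low-cardinality columns.
object_type = ["TABLE", "INDEX", "TABLE", "VIEW", "INDEX", "TABLE"]
status      = ["VALID", "VALID", "INVALID", "VALID", "VALID", "VALID"]

type_bm = build_bitmaps(object_type)
status_bm = build_bitmaps(status)

# count(*) where object_type = 'TABLE'
count_table = popcount(type_bm["TABLE"])                         # 3
# count(*) where object_type in ('TABLE', 'INDEX')  -> bitwise OR
table_or_index = popcount(type_bm["TABLE"] | type_bm["INDEX"])   # 5
# count(*) where object_type = 'TABLE' and status = 'VALID' -> bitwise AND
valid_tables = popcount(type_bm["TABLE"] & status_bm["VALID"])   # 2
print(count_table, table_or_index, valid_tables)
```

None of the three counts needs to visit the table rows; the answer comes from the bitmaps alone, which mirrors the BITMAP CONVERSION COUNT step seen in the execution plans later in this post.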


Continuing the experiment above, create bitmap indexes on t1 and t2:
SQL> create bitmap index ind_t1 on t1(object_id);
Index created.


SQL> create bitmap index ind_t2 on t2(object_id);
Index created.


SQL>  select segment_name,bytes from user_segments where segment_name like '%T1%' OR  SEGMENT_NAME LIKE '%T2%';
SEGMENT_NAME									       BYTES
--------------------------------------------------------------------------------- ----------
T1										     3145728
T2										     3145728
IND_T1										     3145728
IND_T2										       65536

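The object_id column of t1 has no duplicates, while the object_id column of t2 has many. As the segment sizes show (3145728 bytes for IND_T1 versus 65536 bytes for IND_T2), the more duplicates a column has, the smaller its bitmap index.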
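
One way to see why the size gap is so large: a bitmap index needs at least one entry per distinct key. The sketch below simply counts distinct keys and rows per key for the two column shapes; the row count of 72,000 is a hypothetical round figure and entry_stats is an illustrative helper, not an Oracle function. It does not model Oracle's actual storage format or compression.

```python
def entry_stats(values):
    """Return (number of distinct keys, average rows per key)."""
    distinct = len(set(values))
    return distinct, len(values) / distinct

n = 72_000                                         # hypothetical row count
unique_stats = entry_stats(list(range(n)))         # like t1.object_id
binary_stats = entry_stats([i % 2 for i in range(n)])  # like t2.object_id
print(unique_stats)   # (72000, 1.0)
print(binary_stats)   # (2, 36000.0)
```

With unique keys every entry describes a single row, so the per-entry overhead (key plus rowid range) is paid once per row; with two keys the same overhead is paid only twice.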

SQL> drop table t1;
SQL> drop table t2;




Next, let us compare the efficiency of a bitmap index and a B-tree index when the duplication rate is high:


create table t1 as select object_id,object_type from dba_objects;
create table t2 as select object_id,object_type from dba_objects;
create index ind_t1 on t1(object_type);
create bitmap index ind_t2 on t2(object_type);

SQL> select count(*) from t1 where object_type='TABLE';

Execution Plan
----------------------------------------------------------
Plan hash value: 2587783732


----------------------------------------------------------------------------
| Id  | Operation	  | Name   | Rows  | Bytes | Cost (%CPU)| Time	   |
----------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |	   |	 1 |	 9 |	 5   (0)| 00:00:01 |
|   1 |  SORT AGGREGATE   |	   |	 1 |	 9 |		|	   |
|*  2 |   INDEX RANGE SCAN| IND_T1 |  1678 | 15102 |	 5   (0)| 00:00:01 |
----------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
   2 - access("OBJECT_TYPE"='TABLE')

Statistics
----------------------------------------------------------
	  1  recursive calls
	  0  db block gets
	  8  consistent gets
	  0  physical reads
	  0  redo size
	527  bytes sent via SQL*Net to client
	523  bytes received via SQL*Net from client
	  2  SQL*Net roundtrips to/from client
	  0  sorts (memory)
	  0  sorts (disk)
	  1  rows processed


SQL> select count(*) from t2 where object_type='TABLE';
Execution Plan
----------------------------------------------------------
Plan hash value: 2032664525


--------------------------------------------------------------------------------------
| Id  | Operation		    | Name   | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT	    |	     |	   1 |	   9 |	   1   (0)| 00:00:01 |
|   1 |  SORT AGGREGATE 	    |	     |	   1 |	   9 |		  |	     |
|   2 |   BITMAP CONVERSION COUNT   |	     |	1678 | 15102 |	   1   (0)| 00:00:01 |
|*  3 |    BITMAP INDEX SINGLE VALUE| IND_T2 |	     |	     |		  |	     |
--------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
   3 - access("OBJECT_TYPE"='TABLE')
Statistics
----------------------------------------------------------
	  1  recursive calls
	  0  db block gets
	  2  consistent gets
	  0  physical reads
	  0  redo size
	526  bytes sent via SQL*Net to client
	523  bytes received via SQL*Net from client
	  2  SQL*Net roundtrips to/from client
	  0  sorts (memory)
	  0  sorts (disk)
	  1  rows processed

SQL> select * from t1 where object_type='TABLE';
2799 rows selected.

Execution Plan
----------------------------------------------------------
Plan hash value: 634656657


--------------------------------------------------------------------------------------
| Id  | Operation		    | Name   | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT	    |	     |	1678 | 23492 |	  26   (0)| 00:00:01 |
|   1 |  TABLE ACCESS BY INDEX ROWID| T1     |	1678 | 23492 |	  26   (0)| 00:00:01 |
|*  2 |   INDEX RANGE SCAN	    | IND_T1 |	1678 |	     |	   5   (0)| 00:00:01 |
--------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
   2 - access("OBJECT_TYPE"='TABLE')
Statistics
----------------------------------------------------------
	  1  recursive calls
	  0  db block gets
	427  consistent gets
	  0  physical reads
	  0  redo size
      79004  bytes sent via SQL*Net to client
       2569  bytes received via SQL*Net from client
	188  SQL*Net roundtrips to/from client
	  0  sorts (memory)
	  0  sorts (disk)
       2799  rows processed


SQL> select * from t2 where object_type='TABLE';
2800 rows selected.
Execution Plan
----------------------------------------------------------
Plan hash value: 2737179948
---------------------------------------------------------------------------------------
| Id  | Operation		     | Name   | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT	     |	      |  1678 | 23492 |    47	(0)| 00:00:01 |
|   1 |  TABLE ACCESS BY INDEX ROWID | T2     |  1678 | 23492 |    47	(0)| 00:00:01 |
|   2 |   BITMAP CONVERSION TO ROWIDS|	      |       |       | 	   |	      |
|*  3 |    BITMAP INDEX SINGLE VALUE | IND_T2 |       |       | 	   |	      |
---------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
   3 - access("OBJECT_TYPE"='TABLE')
Statistics
----------------------------------------------------------
	  1  recursive calls
	  0  db block gets
	235  consistent gets
	  0  physical reads
	  0  redo size
      79020  bytes sent via SQL*Net to client
       2569  bytes received via SQL*Net from client
	188  SQL*Net roundtrips to/from client
	  0  sorts (memory)
	  0  sorts (disk)
       2800  rows processed
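For these equality queries the bitmap index is again more efficient than the B-tree index when measured by consistent gets: 2 versus 8 for the count, and 235 versus 427 for fetching the rows.
The experiment above is based on http://www.itpub.net/thread-1700144-1-1.html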
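For the next experiment, drop t1 and t2 and recreate them with an extra low-cardinality id column, then build a B-tree index on each column of t1 and a bitmap index on the same columns of t2:
drop table t1;
drop table t2;
create table t1 as select object_id,mod(object_id,2) id,object_name,object_type from dba_objects;
create table t2 as select object_id,mod(object_id,2) id,object_name,object_type from dba_objects;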


create index ind_type_t1 on t1(object_type);
create bitmap index ind_type_t2 on t2(object_type);

create index ind_object_id_t1 on t1(object_id);
create bitmap index ind_object_id_t2 on t2(object_id);

create index ind_id_t1 on t1(id);
create bitmap index ind_id_t2 on t2(id);


SQL> select * from t1 where object_id in (1,2,10,20,30,50,60,70,40);
8 rows selected.
Execution Plan
----------------------------------------------------------
Plan hash value: 1020377091

-------------------------------------------------------------------------------------------------
| Id  | Operation		     | Name		| Rows	| Bytes | Cost (%CPU)| Time	|
-------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT	     |			|     9 |   369 |    10   (0)| 00:00:01 |
|   1 |  INLIST ITERATOR	     |			|	|	|	     |		|
|   2 |   TABLE ACCESS BY INDEX ROWID| T1		|     9 |   369 |    10   (0)| 00:00:01 |
|*  3 |    INDEX RANGE SCAN	     | IND_OBJECT_ID_T1 |     9 |	|     9   (0)| 00:00:01 |
-------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
   3 - access("OBJECT_ID"=1 OR "OBJECT_ID"=2 OR "OBJECT_ID"=10 OR "OBJECT_ID"=20 OR
	      "OBJECT_ID"=30 OR "OBJECT_ID"=40 OR "OBJECT_ID"=50 OR "OBJECT_ID"=60 OR "OBJECT_ID"=70)

Statistics
----------------------------------------------------------
	  1  recursive calls
	  0  db block gets
	 12  consistent gets
	  1  physical reads
	  0  redo size
	980  bytes sent via SQL*Net to client
	523  bytes received via SQL*Net from client
	  2  SQL*Net roundtrips to/from client
	  0  sorts (memory)
	  0  sorts (disk)
	  8  rows processed


SQL> select * from t2 where object_id in (1,2,10,20,30,50,60,70,40);
8 rows selected.
Execution Plan
----------------------------------------------------------
Plan hash value: 3310774432

--------------------------------------------------------------------------------------------------
| Id  | Operation		      | Name		 | Rows  | Bytes | Cost (%CPU)| Time	 |
--------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT	      | 		 |     9 |   369 |    11   (0)| 00:00:01 |
|   1 |  INLIST ITERATOR	      | 		 |	 |	 |	      | 	 |
|   2 |   TABLE ACCESS BY INDEX ROWID | T2		 |     9 |   369 |    11   (0)| 00:00:01 |
|   3 |    BITMAP CONVERSION TO ROWIDS| 		 |	 |	 |	      | 	 |
|*  4 |     BITMAP INDEX SINGLE VALUE | IND_OBJECT_ID_T2 |	 |	 |	      | 	 |
--------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
   4 - access("OBJECT_ID"=1 OR "OBJECT_ID"=2 OR "OBJECT_ID"=10 OR "OBJECT_ID"=20 OR
	      "OBJECT_ID"=30 OR "OBJECT_ID"=40 OR "OBJECT_ID"=50 OR "OBJECT_ID"=60 OR "OBJECT_ID"=70)
Statistics
----------------------------------------------------------
	  1  recursive calls
	  0  db block gets
	 15  consistent gets
	  1  physical reads
	  0  redo size
	980  bytes sent via SQL*Net to client
	523  bytes received via SQL*Net from client
	  2  SQL*Net roundtrips to/from client
	  0  sorts (memory)
	  0  sorts (disk)
	  8  rows processed


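Conclusion: with OR conditions (here an IN list) on object_id, where the duplication rate is low, the B-tree index is still somewhat more efficient (12 versus 15 consistent gets).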

SQL> select * From t2 where object_type in ('INDEX','CLUSTER');
3805 rows selected.
Execution Plan
----------------------------------------------------------
Plan hash value: 879057093
---------------------------------------------------------------------------------------------
| Id  | Operation		      | Name	    | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT	      | 	    |  3357 |	134K|	127   (0)| 00:00:02 |
|   1 |  INLIST ITERATOR	      | 	    |	    |	    |		 |	    |
|   2 |   TABLE ACCESS BY INDEX ROWID | T2	    |  3357 |	134K|	127   (0)| 00:00:02 |
|   3 |    BITMAP CONVERSION TO ROWIDS| 	    |	    |	    |		 |	    |
|*  4 |     BITMAP INDEX SINGLE VALUE | IND_TYPE_T2 |	    |	    |		 |	    |
---------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
   4 - access("OBJECT_TYPE"='CLUSTER' OR "OBJECT_TYPE"='INDEX')
Statistics
----------------------------------------------------------
	  1  recursive calls
	  0  db block gets
	349  consistent gets
	  1  physical reads
	  0  redo size
     198248  bytes sent via SQL*Net to client
       3306  bytes received via SQL*Net from client
	255  SQL*Net roundtrips to/from client
	  0  sorts (memory)
	  0  sorts (disk)
       3805  rows processed

SQL> select * From t1 where object_type in ('INDEX','CLUSTER');
3805 rows selected.
Execution Plan
----------------------------------------------------------
Plan hash value: 918902357
--------------------------------------------------------------------------------------------
| Id  | Operation		     | Name	   | Rows  | Bytes | Cost (%CPU)| Time	   |
--------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT	     |		   |  3357 |   134K|	87   (0)| 00:00:02 |
|   1 |  INLIST ITERATOR	     |		   |	   |	   |		|	   |
|   2 |   TABLE ACCESS BY INDEX ROWID| T1	   |  3357 |   134K|	87   (0)| 00:00:02 |
|*  3 |    INDEX RANGE SCAN	     | IND_TYPE_T1 |  3357 |	   |	11   (0)| 00:00:01 |
--------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
   3 - access("OBJECT_TYPE"='CLUSTER' OR "OBJECT_TYPE"='INDEX')
Statistics
----------------------------------------------------------
	  1  recursive calls
	  0  db block gets
	610  consistent gets
	  0  physical reads
	  0  redo size
     198248  bytes sent via SQL*Net to client
       3306  bytes received via SQL*Net from client
	255  SQL*Net roundtrips to/from client
	  0  sorts (memory)
	  0  sorts (disk)
       3805  rows processed


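Conclusion: with OR conditions (here an IN list) on object_type, where the duplication rate is high, the bitmap index is more efficient (349 versus 610 consistent gets).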