PostgreSQL 11 New Features: Hash Partitioning for Partitioned Tables

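One of the major new features in PostgreSQL 11 is a substantial improvement to partitioned tables, including support for hash-partitioned (HASH) tables. PostgreSQL therefore now supports three partitioning methods: range (RANGE), list (LIST), and hash (HASH). This article gives a brief demonstration of hash-partitioned tables.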

Hash Partitioning

The table is partitioned by specifying a modulus and a remainder for each partition. Each partition will hold the rows for which the hash value of the partition key divided by the specified modulus will produce the specified remainder.

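The definition of a hash partition has two attributes:

  • modulus: the divisor applied to the hash value of the partition key. When every partition uses the same modulus, it equals the number of hash partitions.
  • remainder: the remainder left when the hash value of the partition key is divided by the modulus. Rows that produce this remainder are stored in the partition.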

Syntax for creating a partitioned table

CREATE TABLE table_name ( ... )
[ PARTITION BY { RANGE | LIST | HASH } ( { column_name | ( expression ) } [, ... ] ) ]

CREATE TABLE table_name
PARTITION OF parent_table [ ( ... ) ]
FOR VALUES partition_bound_spec
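To make partition_bound_spec concrete, the sketch below shows one partition definition for each of the three partitioning methods. The demo_* table and column names are hypothetical and are not used elsewhere in this article.

```sql
-- Range: the lower bound is inclusive, the upper bound is exclusive.
CREATE TABLE demo_range (id int, created date) PARTITION BY RANGE (created);
CREATE TABLE demo_range_2018 PARTITION OF demo_range
    FOR VALUES FROM ('2018-01-01') TO ('2019-01-01');

-- List: the partition holds rows whose key is one of the listed values.
CREATE TABLE demo_list (id int, region text) PARTITION BY LIST (region);
CREATE TABLE demo_list_east PARTITION OF demo_list
    FOR VALUES IN ('east', 'north');

-- Hash: the partition holds rows whose key hash leaves the given remainder.
CREATE TABLE demo_hash (id int, name text) PARTITION BY HASH (id);
CREATE TABLE demo_hash_p0 PARTITION OF demo_hash
    FOR VALUES WITH (MODULUS 2, REMAINDER 0);
CREATE TABLE demo_hash_p1 PARTITION OF demo_hash
    FOR VALUES WITH (MODULUS 2, REMAINDER 1);
```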

Creating the data-generation functions

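To make it easy to generate test data, create the following two functions, which together produce random strings of a given length. First, create random_range(int4, int4):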

CREATE OR REPLACE FUNCTION random_range(int4, int4)
RETURNS int4
LANGUAGE SQL
AS 
$$

    SELECT ($1 + FLOOR(($2 - $1 + 1) * random() ))::int4;

$$
;
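Because random() returns a value in [0, 1), the expression FLOOR(($2 - $1 + 1) * random()) yields an integer from 0 to $2 - $1, so the function result should fall between $1 and $2 inclusive. A quick, informal way to check this is to draw many values and look at the observed range; the sample size of 10,000 is an arbitrary choice.

```sql
-- Draw 10,000 values from random_range(1, 6) and summarize them.
-- With this many draws, lowest and highest should almost always be 1 and 6.
SELECT min(v)            AS lowest,
       max(v)            AS highest,
       count(DISTINCT v) AS distinct_values
FROM (
    SELECT random_range(1, 6) AS v
    FROM generate_series(1, 10000)
) AS t;
```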

Next, create random_text_simple(length int4), which calls random_range(int4, int4).

CREATE OR REPLACE FUNCTION random_text_simple(length int4)
RETURNS text
LANGUAGE PLPGSQL
AS 
$$

DECLARE
    possible_chars text := '0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ';
    output text := '';
    i int4;
    pos int4;
BEGIN

    FOR i IN 1..length LOOP
        pos := random_range(1, length(possible_chars));
        output := output || substr(possible_chars, pos, 1);
    END LOOP;

    RETURN output;
END;

$$
;
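For comparison, the same idea can be written as a single set-based query instead of a PL/pgSQL loop. This is only an alternative sketch, not the function used in the rest of the article; it relies on random_range() from above and hard-codes the length of the 36-character alphabet.

```sql
-- Build a 6-character random string: generate_series yields one row per
-- character, and string_agg concatenates the picked characters.
SELECT string_agg(
           substr('0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ',
                  random_range(1, 36), 1),
           '') AS random_text
FROM generate_series(1, 6);
```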

random_text_simple(length int4) generates a random string of the given length. For example, the following generates a random three-character string:

mydb=> SELECT random_text_simple(3);
 random_text_simple 
--------------------
 LL9
(1 row)

The following generates a random six-character string:

mydb=> SELECT random_text_simple(6);
 random_text_simple 
--------------------
 B81BPW
(1 row)

This function is used later to generate test data.

Creating the hash-partitioned parent table

CREATE TABLE student (
 stuname text ,
 ctime   timestamp(6) without time zone
) PARTITION BY HASH(stuname);

Creating the index

CREATE INDEX idx_stuendt_stuname on student using btree(stuname);

Creating the partitions

CREATE TABLE student_p0 PARTITION OF student FOR VALUES WITH(MODULUS 4, REMAINDER 0);
CREATE TABLE student_p1 PARTITION OF student FOR VALUES WITH(MODULUS 4, REMAINDER 1);
CREATE TABLE student_p2 PARTITION OF student FOR VALUES WITH(MODULUS 4, REMAINDER 2);
CREATE TABLE student_p3 PARTITION OF student FOR VALUES WITH(MODULUS 4, REMAINDER 3);
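The partitions do not all have to share one modulus: PostgreSQL only requires that each modulus in use is a factor of the next larger one, which allows the partition count to grow incrementally. The sketch below splits the key space of student_p0 into two modulus-8 partitions. The names student_p0a and student_p0b are hypothetical, and the transaction ends with ROLLBACK so that the rest of this walkthrough is unaffected.

```sql
BEGIN;

-- Take the modulus-4, remainder-0 partition out of the parent table.
ALTER TABLE student DETACH PARTITION student_p0;

-- Hashes with (hash % 4 = 0) are exactly those with (hash % 8) in (0, 4),
-- so these two partitions cover the same portion of the key space.
CREATE TABLE student_p0a PARTITION OF student
    FOR VALUES WITH (MODULUS 8, REMAINDER 0);
CREATE TABLE student_p0b PARTITION OF student
    FOR VALUES WITH (MODULUS 8, REMAINDER 4);

-- Re-route any rows of the detached table through the parent.
INSERT INTO student SELECT * FROM student_p0;
DROP TABLE student_p0;

ROLLBACK;  -- undo the experiment; use COMMIT instead to keep the split
```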

Viewing the partitioned table definition

francs=> \d+ student
                                              Table "francs.student"
 Column  |              Type              | Collation | Nullable | Default | Storage  | Stats target | Description 
---------+--------------------------------+-----------+----------+---------+----------+--------------+-------------
 stuname | text                           |           |          |         | extended |              | 
 ctime   | timestamp(6) without time zone |           |          |         | plain    |              | 
Partition key: HASH (stuname)
Indexes:
    "idx_stuendt_stuname" btree (stuname)
Partitions: student_p0 FOR VALUES WITH (modulus 4, remainder 0),
            student_p1 FOR VALUES WITH (modulus 4, remainder 1),
            student_p2 FOR VALUES WITH (modulus 4, remainder 2),
            student_p3 FOR VALUES WITH (modulus 4, remainder 3)

The output shows the table student and its four partitions.
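The same information can also be read from the system catalogs, which is convenient in scripts where the psql \d+ command is not available. This is a minimal sketch using pg_inherits and pg_class.relpartbound.

```sql
-- List each partition of student together with its bound definition.
SELECT c.oid::regclass                    AS partition_name,
       pg_get_expr(c.relpartbound, c.oid) AS partition_bound
FROM pg_inherits AS i
JOIN pg_class    AS c ON c.oid = i.inhrelid
WHERE i.inhparent = 'student'::regclass
ORDER BY c.relname;
```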

Inserting test data

Use the random_text_simple() function created earlier to generate one million rows of test data:

INSERT INTO student(stuname,ctime) SELECT random_text_simple(6),clock_timestamp() FROM generate_series(1,1000000);            

Viewing the partitioned table data

The table data looks like this:

francs=> SELECT * FROM student LIMIT 3;
 stuname |        ctime        
---------+---------------------
 4JJOPN  | 2018-09-20 10:45:06
 NHQONC  | 2018-09-20 10:45:06
 8V5BGH  | 2018-09-20 10:45:06
(3 rows)

Counting rows per partition

francs=> SELECT tableoid::regclass,count(*) from student group by 1 order by 1;
  tableoid  | count  
------------+--------
 student_p0 | 250510
 student_p1 | 249448
 student_p2 | 249620
 student_p3 | 250422
(4 rows)

The rows are distributed evenly across the four partitions, with roughly 250,000 rows in each.
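To judge the balance at a glance, the count query can be extended with each partition's share of the total. This is a small variation on the query above; with four partitions, an even distribution means each share is close to 25 percent.

```sql
-- Row count and percentage share per partition.
-- sum(count(*)) OVER () is the grand total across all groups.
SELECT tableoid::regclass AS partition_name,
       count(*)           AS row_count,
       round(100.0 * count(*) / sum(count(*)) OVER (), 2) AS pct_of_total
FROM student
GROUP BY tableoid
ORDER BY 1;
```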

Querying by the partition key

francs=> EXPLAIN ANALYZE SELECT * FROM student WHERE stuname='3LXBEV';
                                                                QUERY PLAN                                                          
      
------------------------------------------------------------------------------------------------------------------------------------

 Append  (cost=0.42..8.44 rows=1 width=15) (actual time=0.017..0.018 rows=1 loops=1)
   ->  Index Scan using student_p3_stuname_idx on student_p3  (cost=0.42..8.44 rows=1 width=15) (actual time=0.017..0.017 rows=1 loops=1)
         Index Cond: (stuname = '3LXBEV'::text)
 Planning Time: 0.198 ms
 Execution Time: 0.042 ms
(5 rows)

A query on the partition key stuname scans only partition student_p3 (the other partitions are pruned) and uses the index.
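Two informal checks can help show why only one partition is scanned. The first calls satisfies_hash_partition(), the internal function that PostgreSQL 11 uses in hash partition constraints, to test the key against each remainder; given the plan above, remainder 3 is the one expected to match. The second compares plans with the enable_partition_pruning setting turned off and on. Both are sketches for exploration, and the exact plans depend on your data and configuration.

```sql
-- Which of the four remainders accepts this key?
SELECT r AS remainder,
       satisfies_hash_partition('student'::regclass, 4, r, '3LXBEV'::text)
           AS accepts_key
FROM generate_series(0, 3) AS r;

-- Plan with partition pruning disabled for this session.
SET enable_partition_pruning = off;
EXPLAIN SELECT * FROM student WHERE stuname = '3LXBEV';

-- Restore the default and compare.
RESET enable_partition_pruning;
EXPLAIN SELECT * FROM student WHERE stuname = '3LXBEV';
```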

Querying by a non-partition-key column

francs=> EXPLAIN ANALYZE SELECT * FROM student WHERE ctime='2018-09-20 10:53:55.48392';
                                                          QUERY PLAN                                                           
-------------------------------------------------------------------------------------------------------------------------------
 Gather  (cost=1000.00..13761.36 rows=4 width=15) (actual time=37.891..39.183 rows=1 loops=1)
   Workers Planned: 2
   Workers Launched: 2
   ->  Parallel Append  (cost=0.00..12760.96 rows=4 width=15) (actual time=23.753..35.006 rows=0 loops=3)
         ->  Parallel Seq Scan on student_p0  (cost=0.00..3196.99 rows=1 width=15) (actual time=0.014..28.550 rows=1 loops=1)
               Filter: (ctime = '2018-09-20 10:53:55.48392'::timestamp without time zone)
               Rows Removed by Filter: 250509
         ->  Parallel Seq Scan on student_p3  (cost=0.00..3195.34 rows=1 width=15) (actual time=29.543..29.543 rows=0 loops=1)
               Filter: (ctime = '2018-09-20 10:53:55.48392'::timestamp without time zone)
               Rows Removed by Filter: 250422
         ->  Parallel Seq Scan on student_p2  (cost=0.00..3185.44 rows=1 width=15) (actual time=8.260..8.260 rows=0 loops=3)
               Filter: (ctime = '2018-09-20 10:53:55.48392'::timestamp without time zone)
               Rows Removed by Filter: 83207
         ->  Parallel Seq Scan on student_p1  (cost=0.00..3183.18 rows=1 width=15) (actual time=22.135..22.135 rows=0 loops=1)
               Filter: (ctime = '2018-09-20 10:53:55.48392'::timestamp without time zone)
               Rows Removed by Filter: 249448
 Planning Time: 0.183 ms
 Execution Time: 39.219 ms
(18 rows)

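A query on the non-partition-key column ctime scans every partition of the table, because the value of ctime says nothing about which partition a row is stored in.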
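If queries on ctime are common, one option is an index on that column. In PostgreSQL 11 an index created on the partitioned parent is also created on each partition. Every partition still has to be visited, but each visit can then use an index instead of a sequential scan. The index name idx_student_ctime is hypothetical, and whether the planner chooses the index depends on statistics and data volume.

```sql
-- Create the index on the parent; matching indexes are created on
-- student_p0 .. student_p3 automatically.
CREATE INDEX idx_student_ctime ON student USING btree (ctime);

-- Re-check the plan for the non-partition-key query.
EXPLAIN SELECT * FROM student WHERE ctime = '2018-09-20 10:53:55.48392';
```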

Summary

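This article demonstrated how to create a PostgreSQL hash-partitioned table, generate and load test data, and read the resulting query plans. Later posts will cover other partitioned-table enhancements.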

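New book recommendation

Finally, I recommend 《PostgreSQL實戰》, which I co-wrote with 張文升. The book is based on PostgreSQL 10 and has 18 chapters. It focuses on advanced SQL features, parallel query, partitioned tables, physical replication, logical replication, backup and recovery, high availability, performance tuning, PostGIS, and more, and includes many practical examples.

Purchase link: https://item.jd.com/12405774.html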