Row-oriented vs. column-oriented storage: a comparison

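Advantages of row-oriented storage:

All columns of a row are stored together in the same block, so select * from table_name; can return complete rows directly, with no reassembly;

INSERT/UPDATE is straightforward, because a row is written in one place.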

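Disadvantages of row-oriented storage:

Values of different data types are mixed in the same block, so the data compresses poorly;

A column query such as select id,name from table_name; still has to read every column of every row; the columns it does not need cannot be skipped.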
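The cost of that column query can be made concrete with a small sketch. It does not model how any particular database lays out its blocks: the fixed-width record (id, name, payload), the field widths, and all function names are assumptions made up for the illustration. It simply counts how many bytes each layout has to read to answer select id,name, using a column layout (one byte string per column) for comparison.

```python
import struct

# Hypothetical fixed-width schema: id (int32), name (16 bytes), payload (64 bytes).
ROW = struct.Struct("<i16s64s")
ID_W, NAME_W = 4, 16


def make_rows(n):
    """Generate n illustrative rows; the values themselves are arbitrary."""
    return [(i, f"user{i}".encode(), b"x" * 64) for i in range(n)]


def to_row_store(rows):
    """Row layout: every record is stored contiguously, field after field."""
    return b"".join(ROW.pack(*r) for r in rows)


def to_column_store(rows):
    """Column layout: each column is stored as its own contiguous byte string."""
    return {
        "id": b"".join(struct.pack("<i", r[0]) for r in rows),
        "name": b"".join(struct.pack("16s", r[1]) for r in rows),
        "payload": b"".join(r[2] for r in rows),
    }


def select_id_name_rows(data):
    """Answer `select id, name` from the row layout; returns (result, bytes_read)."""
    result, bytes_read = [], 0
    for off in range(0, len(data), ROW.size):
        record = data[off:off + ROW.size]  # the whole record has to be read
        bytes_read += len(record)
        rid, name, _payload = ROW.unpack(record)
        result.append((rid, name.rstrip(b"\0")))
    return result, bytes_read


def select_id_name_columns(cols):
    """Answer `select id, name` from the column layout; returns (result, bytes_read)."""
    ids, names = cols["id"], cols["name"]  # the payload column is never touched
    n = len(ids) // ID_W
    result = [
        (struct.unpack_from("<i", ids, k * ID_W)[0],
         names[k * NAME_W:(k + 1) * NAME_W].rstrip(b"\0"))
        for k in range(n)
    ]
    return result, len(ids) + len(names)


rows = make_rows(1000)
row_result, row_bytes = select_id_name_rows(to_row_store(rows))
col_result, col_bytes = select_id_name_columns(to_column_store(rows))
print(row_bytes, col_bytes)  # prints: 84000 20000
```

Both layouts return the same result, but with these assumed widths the row layout reads all 84 bytes of every record while the column layout reads only the 20 bytes that belong to id and name. The wider the unused columns, the larger the gap.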

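Advantages of column-oriented storage:

Values of the same type are stored together in the same block, so the data compresses well;

Each column is stored separately, so a query can read only the columns it needs, and any column can play a role similar to an index.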
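The compression claim can be checked with a rough experiment. The sketch below is only an illustration under stated assumptions: the four-column page-view rows are randomly generated, not real data, and plain zlib over text blocks stands in for the much more elaborate encodings that formats such as ORC and Parquet use. It compresses the same values once in a row layout and once column by column.

```python
import random
import zlib

rng = random.Random(42)  # fixed seed so the run is repeatable
COUNTRIES = ["CN", "US", "JP"]
METHODS = ["GET", "POST"]

# Hypothetical page-view rows: (user_id, country, latency_ms, method).
rows = [
    (str(rng.randrange(10_000_000, 100_000_000)),
     rng.choice(COUNTRIES),
     str(rng.randrange(1000, 10000)),
     rng.choice(METHODS))
    for _ in range(5000)
]

# Row layout: one tab-separated line per row, so values of different columns are interleaved.
row_block = "".join("\t".join(r) + "\n" for r in rows).encode()

# Column layout: one newline-separated block per column, so similar values sit together.
col_blocks = ["".join(v + "\n" for v in col).encode() for col in zip(*rows)]

row_size = len(zlib.compress(row_block))
col_size = sum(len(zlib.compress(b)) for b in col_blocks)

print(f"raw bytes:           {len(row_block)}")
print(f"row layout, zlib:    {row_size}")
print(f"column layout, zlib: {col_size}")
```

Both layouts hold exactly the same number of raw bytes. The column layout should come out smaller because the low-cardinality columns (country, method) shrink to almost nothing once they are no longer interleaved with the high-entropy ones. The exact figures depend on the data and on the zlib version.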

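Disadvantages of column-oriented storage:

A full-row query such as select * from table_name; has to reassemble each row from the separately stored columns;

INSERT/UPDATE is more expensive, because a single row touches the storage of every column.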

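-- ORC table compressed with ZLIB, created from page_views with CTAS.
-- ZLIB is the default ORC codec, so the TBLPROPERTIES clause is optional here.
-- ORC is a binary format, so the ROW FORMAT DELIMITED clause should have no effect on the stored files.
create table page_views_orc_zlib
ROW FORMAT DELIMITED FIELDS TERMINATED BY "\t"
STORED AS ORC
TBLPROPERTIES("orc.compress"="ZLIB")
as select * from page_views;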

-- ORC table compressed with SNAPPY instead of the default ZLIB.
create table page_views_orc_snappy
ROW FORMAT DELIMITED FIELDS TERMINATED BY "\t"
STORED AS ORC
TBLPROPERTIES("orc.compress"="SNAPPY")
as select * from page_views;


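-- Parquet table with no codec specified; whatever Parquet compression the session currently has configured is used.
create table page_views_parquet
ROW FORMAT DELIMITED FIELDS TERMINATED BY "\t"
STORED AS PARQUET
as select * from page_views;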


-- Parquet table compressed with GZIP; the codec is set for the session before the CTAS runs.
set parquet.compression=GZIP;
create table page_views_parquet_gzip
ROW FORMAT DELIMITED FIELDS TERMINATED BY "\t"
STORED AS PARQUET
as select * from page_views;

[Source: @若澤大資料]