Difference between ORC and Parquet format

References:

https://www.cnblogs.com/ITtangtang/p/7677912.html

https://blog.csdn.net/yu616568/article/details/51868447

https://hortonworks.com/blog/orcfile-in-hdp-2-better-compression-better-performance/

 

Summary

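Both are columnar storage formats, and both store precomputed statistics (such as per-column min/max values) alongside the data. Parquet's encoding of nested data is modeled on the data format of Google's Dremel, while ORC grew out of Hive's RCFile.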
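To make the "precomputed statistics" point concrete, here is a minimal sketch in plain Python of how per-chunk min/max values let a reader skip data without scanning it. It is not the actual ORC or Parquet file layout; `ColumnChunk`, `build_chunks`, and `scan_greater_than` are hypothetical names made up for this illustration.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class ColumnChunk:
    """One column's values for a block of rows, plus statistics computed at write time."""
    values: List[int]
    min_value: int
    max_value: int


def build_chunks(column: List[int], rows_per_chunk: int) -> List[ColumnChunk]:
    """Split a column into fixed-size chunks and record min/max for each chunk."""
    chunks = []
    for start in range(0, len(column), rows_per_chunk):
        block = column[start:start + rows_per_chunk]
        chunks.append(ColumnChunk(block, min(block), max(block)))
    return chunks


def scan_greater_than(chunks: List[ColumnChunk], threshold: int) -> Tuple[List[int], int]:
    """Return the values > threshold and the number of chunks that had to be read."""
    matches: List[int] = []
    chunks_read = 0
    for chunk in chunks:
        if chunk.max_value <= threshold:
            # The statistics prove that no value in this chunk can match, so skip it.
            continue
        chunks_read += 1
        matches.extend(v for v in chunk.values if v > threshold)
    return matches, chunks_read


ages = [21, 25, 23, 30, 34, 31, 52, 48, 60]
chunks = build_chunks(ages, rows_per_chunk=3)
matches, chunks_read = scan_greater_than(chunks, 40)
print(matches, chunks_read)
```

With this sample data only the last of the three chunks has a maximum above 40, so the other two are skipped based on their statistics alone.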

The main difference is that Parquet has better support for nested data (nested or complex types such as struct).
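As a rough illustration of what storing nested data in columns involves, the sketch below flattens nested records into one column per leaf field, addressed by a dotted path. It is a simplification under stated assumptions: the helper names `leaf_paths` and `shred` are hypothetical, repeated fields (lists) are not handled, and a missing field is simply stored as `None`. Real Parquet additionally records Dremel-style repetition and definition levels for each value, which this sketch leaves out.

```python
from typing import Any, Dict, Iterator, List, Tuple


def leaf_paths(record: Dict[str, Any], prefix: str = "") -> Iterator[Tuple[str, Any]]:
    """Yield (dotted_path, value) for every leaf field of a nested dict."""
    for key, value in record.items():
        path = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            # A struct: descend and keep extending the column path.
            yield from leaf_paths(value, path)
        else:
            yield path, value


def shred(records: List[Dict[str, Any]]) -> Dict[str, List[Any]]:
    """Turn a list of nested records into one value list per leaf column."""
    rows = [dict(leaf_paths(r)) for r in records]
    # Collect every leaf path seen in any record, in first-seen order.
    paths: List[str] = []
    for row in rows:
        for path in row:
            if path not in paths:
                paths.append(path)
    # A record that lacks a field contributes None to that column.
    return {path: [row.get(path) for row in rows] for path in paths}


records = [
    {"name": "a", "address": {"city": "Taipei", "zip": "100"}},
    {"name": "b", "address": {"city": "Tokyo"}},
]
columns = shred(records)
for path, values in columns.items():
    print(path, values)
```

Each leaf of the struct (`address.city`, `address.zip`) ends up as its own column, so a query that only needs the city never has to read the other fields.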

In other respects, ORC tends to perform somewhat better.

Cloudera promotes Parquet, while Hortonworks promotes ORC.