Migrating partitioned tables between Hadoop clusters

The partitioned tables discussed here are Hive/Impala tables stored in Parquet format.

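Migration here simply means copying the table's files. The walkthrough below uses a single table as an example. If a large number of tables has to be migrated, a Java program could drive the copy with multiple threads.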
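
For the many-tables case, the text suggests a multithreaded Java program. As a rough alternative that stays with the commands used in this post, the same idea can be sketched in shell: print one download command per table, then let xargs run several at a time. The names get_cmd, plan_downloads, WAREHOUSE and LOCAL_DIR are hypothetical, and the default paths are simply taken from the example below.

```shell
#!/usr/bin/env bash
# Sketch only: plan one "hadoop fs -get" per table.
# WAREHOUSE and LOCAL_DIR are assumptions; adjust them to your clusters.
WAREHOUSE="${WAREHOUSE:-/user/hive/warehouse/prestat.db}"
LOCAL_DIR="${LOCAL_DIR:-/root}"

# Print the download command for a single table directory.
get_cmd() {
    local table="$1"
    printf 'hadoop fs -get %s/%s %s\n' "$WAREHOUSE" "$table" "$LOCAL_DIR"
}

# Read table names from stdin (one per line, blank lines skipped)
# and print one download command per table. Nothing is executed here.
plan_downloads() {
    local table
    while IFS= read -r table || [ -n "$table" ]; do
        if [ -n "$table" ]; then
            get_cmd "$table"
        fi
    done
}
```

Because the functions only print commands, the plan can be reviewed first. Piping it into xargs, for example `plan_downloads < tables.txt | xargs -P 4 -I CMD sh -c CMD`, would then run up to four downloads in parallel (tables.txt is a hypothetical file with one table name per line).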

1. Check the table's location on the source cluster

# hadoop fs -du -h /user/hive/warehouse/prestat.db/dt_differ_users_pre_xdr
299.0 K  896.9 K  /user/hive/warehouse/prestat.db/dt_differ_users_pre_xdr/day=20170601

2. Download the files from the source cluster's HDFS to the source server

# hadoop fs -get /user/hive/warehouse/prestat.db/dt_differ_users_pre_xdr/day=20170601/minute=0000 /root

3. Transfer the files to the target cluster, either by downloading them to a local machine over FTP or by scp, then upload them to HDFS on the target cluster and check the result. The upload assumes the local copy keeps the table's directory layout (/root/dt_differ_users_pre_xdr/day=.../minute=...).

# hadoop fs -put /root/dt_differ_users_pre_xdr/ /user/hive/warehouse/prestat.db/
# hadoop fs -du -h /user/hive/warehouse/prestat.db/dt_differ_users_pre_xdr
159.7 M  159.7 M  /user/hive/warehouse/prestat.db/dt_differ_users_pre_xdr/day=20170901
0        0        /user/hive/warehouse/prestat.db/dt_differ_users_pre_xdr/day=20171001
0        0        /user/hive/warehouse/prestat.db/dt_differ_users_pre_xdr/day=20171101
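
Picking out the directories that actually hold data, as done by eye above, can be scripted when there are many partitions. The following is a minimal sketch, assuming the first column of the `hadoop fs -du` output is the size and the last column is the path; nonempty_paths is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Sketch only: filter "hadoop fs -du <dir>" output read from stdin.
# Assumes column 1 is the size and the last column is the path.
# Prints the path of every entry whose size is greater than zero.
nonempty_paths() {
    awk '$1 + 0 > 0 { print $NF }'
}
```

A possible use is `hadoop fs -du /user/hive/warehouse/prestat.db/dt_differ_users_pre_xdr | nonempty_paths`, which for the listing above would be expected to leave only the day=20170901 directory.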
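4. Create the table's partitions on the target cluster

Only day=20170901 contains data, so we create that partition. (I ran the statement directly in Impala, so that Impala's metadata would not be left out of sync.)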

[slave01:21000] >  alter table prestat.dt_differ_users_pre_xdr add IF NOT EXISTS partition(day=20170901,minute=cast('0000' as char(4)));
Query: alter table prestat.dt_differ_users_pre_xdr add IF NOT EXISTS partition(day=20170901,minute=cast('0000' as char(4)))
Fetched 0 row(s) in 1.63s
[slave01:21000] > show partitions prestat.dt_differ_users_pre_xdr ;
Query: show partitions prestat.dt_differ_users_pre_xdr
+----------+--------+-------+--------+----------+--------------+-------------------+---------+-------------------+---------------------------------------------------------------------------------------------+
| day      | minute | #Rows | #Files | Size     | Bytes Cached | Cache Replication | Format  | Incremental stats | Location                                                                                    |
+----------+--------+-------+--------+----------+--------------+-------------------+---------+-------------------+---------------------------------------------------------------------------------------------+
| 20170901 | 0000   | -1    | 2      | 159.72MB | NOT CACHED   | NOT CACHED        | PARQUET | false             | hdfs://myha/user/hive/warehouse/prestat.db/dt_differ_users_pre_xdr/day=20170901/minute=0000 |
| Total    |        | -1    | 2      | 159.72MB | 0B           |                   |         |                   |                                                                                             |
+----------+--------+-------+--------+----------+--------------+-------------------+---------+-------------------+---------------------------------------------------------------------------------------------+
Fetched 2 row(s) in 0.03s
[slave01:21000] > 
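
When several partitions have to be added, the statement above can be generated from each partition directory instead of being typed by hand. This is a minimal sketch that assumes this table's layout (a path ending in day=<digits>/minute=<value>, with minute cast to char(4) as in the statement above); add_partition_sql is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Sketch only: build the ADD PARTITION statement from a partition path.
# Assumes the path ends in day=<digits>/minute=<value>, as in this table.
add_partition_sql() {
    local table="$1" path="$2" day minute
    # Extract the two partition key values from the path.
    day="$(printf '%s\n' "$path" | sed -n 's|.*/day=\([0-9]*\)/minute=.*|\1|p')"
    minute="$(printf '%s\n' "$path" | sed -n 's|.*/minute=\([^/]*\)/*$|\1|p')"
    if [ -z "$day" ] || [ -z "$minute" ]; then
        echo "unrecognized partition path: $path" >&2
        return 1
    fi
    printf "alter table %s add IF NOT EXISTS partition(day=%s,minute=cast('%s' as char(4)));\n" \
        "$table" "$day" "$minute"
}
```

Calling it with prestat.dt_differ_users_pre_xdr and the day=20170901/minute=0000 directory reproduces the statement shown in this step. The result can be pasted into impala-shell or passed to it with the -q option.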

5. Check the data
[slave01:21000] > select day,minute,count(*) from prestat.dt_differ_users_pre_xdr group by day,minute;
Query: select day,minute,count(*) from prestat.dt_differ_users_pre_xdr group by day,minute
Query submitted at: 2017-12-18 16:07:34 (Coordinator: http://slave01:25000)
Query progress can be monitored at: http://slave01:25000/query_plan?query_id=124365f0388d9b90:2af6db6900000000
+----------+--------+----------+
| day      | minute | count(*) |
+----------+--------+----------+
| 20170901 | 0000   | 3235946  |
+----------+--------+----------+
Fetched 1 row(s) in 0.14s
[slave01:21000] > 

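6. If the query returns no data, run invalidate metadata xxx; and then refresh xxx; where xxx is the table name. The files were placed into HDFS outside of Impala, so Impala's cached metadata for the table may not include them until it is reloaded.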
[slave01:21000] > invalidate metadata  prestat.dt_differ_users_pre_xdr;
Query: invalidate metadata  prestat.dt_differ_users_pre_xdr
Query submitted at: 2017-12-18 16:09:17 (Coordinator: http://slave01:25000)
Query progress can be monitored at: http://slave01:25000/query_plan?query_id=2242885efbd4c93d:a0ea048500000000
Fetched 0 row(s) in 0.42s
[slave01:21000] > refresh prestat.dt_differ_users_pre_xdr;
Query: refresh prestat.dt_differ_users_pre_xdr
Query submitted at: 2017-12-18 16:09:24 (Coordinator: http://slave01:25000)
Query progress can be monitored at: http://slave01:25000/query_plan?query_id=f5483ec91c70dd59:fc2a589f00000000
Fetched 0 row(s) in 0.67s
[slave01:21000] >