Packing and archiving HDFS files

2012-07-27

Archive command:

hadoop archive -archiveName NAME -p <parent path> <src>* <dest>

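The parent directory may be followed by any number of subdirectories, or by none at all, in which case the entire parent directory is archived. For example: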

hadoop archive -archiveName foo.har -p /user/hadoop dir1/dir2 dir3 /user/zoo/

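Here dir1/dir2 and dir3 are both subdirectories of /user/hadoop, so only part of the parent directory is archived. The last argument, /user/zoo/, is the destination directory in which foo.har is created.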
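The argument order (archive name, parent, optional sources, destination last) is easy to get wrong, so here is a minimal sketch of a dry-run helper that only prints the command line. The function name `build_archive_cmd` and its argument order are assumptions made for illustration; it never calls hadoop.

```shell
# Hypothetical helper: print the `hadoop archive` command for the given
# archive name, parent directory, destination, and optional source paths.
# The body runs in a subshell "( ... )" so its variables do not leak.
build_archive_cmd() (
    name=$1
    parent=$2
    dest=$3
    shift 3   # whatever remains are source paths relative to the parent
    printf 'hadoop archive -archiveName %s -p %s' "$name" "$parent"
    for src in "$@"; do
        printf ' %s' "$src"
    done
    printf ' %s\n' "$dest"   # the destination always comes last
)

# Archive only dir1/dir2 and dir3 under /user/hadoop:
build_archive_cmd foo.har /user/hadoop /user/zoo/ dir1/dir2 dir3
# Archive the whole parent directory (no sources listed):
build_archive_cmd output1.har /user/zhouhh/output1 /user/zhouhh/
```

The first call prints the same command as the example above; the second prints a command with no source paths, which archives everything under the parent.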

In practice, this is the directory to archive:

$ hadoop fs -lsr output1
drwxr-xr-x - zhouhh supergroup 0 2012-06-04 11:05 /user/zhouhh/output1/_logs
drwxr-xr-x - zhouhh supergroup 0 2012-06-04 11:05 /user/zhouhh/output1/_logs/history
-rw-r--r-- 3 zhouhh supergroup 16856 2012-06-04 11:05 /user/zhouhh/output1/_logs/history/job_201205231824_0007_1338779151666_zhouhh_wordcount.py+%281%2F1%29
-rw-r--r-- 3 zhouhh supergroup 22357 2012-06-04 11:05 /user/zhouhh/output1/_logs/history/job_201205231824_0007_conf.xml

Create the archive:

$ hadoop archive -archiveName output1.har -p /user/zhouhh/output1 /user/zhouhh/
$ hadoop fs -lsr output1.har
-rw-r--r--   3 zhouhh supergroup          0 2012-07-27 15:30 /user/zhouhh/output1.har/_SUCCESS
-rw-r--r--   5 zhouhh supergroup        555 2012-07-27 15:30 /user/zhouhh/output1.har/_index
-rw-r--r--   5 zhouhh supergroup         23 2012-07-27 15:30 /user/zhouhh/output1.har/_masterindex
-rw-r--r--   3 zhouhh supergroup      39213 2012-07-27 15:30 /user/zhouhh/output1.har/part-0
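A quick sanity check on this listing: the part-0 file holds the archived file contents concatenated together, so its size should equal the sum of the sizes of the original files. The sketch below redoes that arithmetic with awk on the two file lines copied from the earlier `-lsr output1` listing; it does not need a running cluster.

```shell
# The two regular-file lines from the `hadoop fs -lsr output1` listing.
listing='-rw-r--r-- 3 zhouhh supergroup 16856 2012-06-04 11:05 /user/zhouhh/output1/_logs/history/job_201205231824_0007_1338779151666_zhouhh_wordcount.py+%281%2F1%29
-rw-r--r-- 3 zhouhh supergroup 22357 2012-06-04 11:05 /user/zhouhh/output1/_logs/history/job_201205231824_0007_conf.xml'

# Sum the size column (field 5) of lines describing regular files,
# i.e. lines whose mode string starts with "-".
total=$(printf '%s\n' "$listing" | awk '/^-/ { sum += $5 } END { print sum }')

echo "$total"   # 39213, matching the size of part-0 in the listing
```

On a live cluster the same awk filter could be fed from `hadoop fs -lsr output1` directly, assuming the listing keeps this column layout.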

The archive was created successfully. List the files inside it:

$ hadoop fs -lsr har:///user/zhouhh/output1.har
drwxr-xr-x   - zhouhh supergroup          0 2012-06-04 11:05 /user/zhouhh/output1.har/_logs
drwxr-xr-x   - zhouhh supergroup          0 2012-06-04 11:05 /user/zhouhh/output1.har/_logs/history
-rw-r--r--   3 zhouhh supergroup      22357 2012-06-04 11:05 /user/zhouhh/output1.har/_logs/history/job_201205231824_0007_conf.xml
-rw-r--r--   3 zhouhh supergroup      16856 2012-06-04 11:05 /user/zhouhh/output1.har/_logs/history/job_201205231824_0007_1338779151666_zhouhh_wordcount.py+%281%2F1%29

Or, with a fully qualified URI:

$ hadoop fs -lsr har://hdfs-Hadoop48:54310/user/zhouhh/output1.har
drwxr-xr-x   - zhouhh supergroup          0 2012-06-04 11:05 /user/zhouhh/output1.har/_logs
drwxr-xr-x   - zhouhh supergroup          0 2012-06-04 11:05 /user/zhouhh/output1.har/_logs/history
-rw-r--r--   3 zhouhh supergroup      22357 2012-06-04 11:05 /user/zhouhh/output1.har/_logs/history/job_201205231824_0007_conf.xml
-rw-r--r--   3 zhouhh supergroup      16856 2012-06-04 11:05 /user/zhouhh/output1.har/_logs/history/job_201205231824_0007_1338779151666_zhouhh_wordcount.py+%281%2F1%29
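The fully qualified form follows the pattern har://<scheme>-<host>:<port>/<path to archive>, where the scheme of the underlying file system is joined to the host with a hyphen. As a rough sketch, the conversion from an hdfs:// URI can be done with plain shell parameter expansion; the function name `to_har_uri` is hypothetical and not part of Hadoop.

```shell
# Hypothetical helper: rewrite <scheme>://<host>:<port>/<path> as
# har://<scheme>-<host>:<port>/<path>.
to_har_uri() {
    case "$1" in
        *://*) ;;   # looks like a URI, carry on
        *) echo "expected a URI such as hdfs://host:port/path" >&2
           return 1 ;;
    esac
    uri_scheme=${1%%://*}   # text before "://", e.g. hdfs
    uri_rest=${1#*://}      # text after "://", e.g. Hadoop48:54310/user/...
    printf 'har://%s-%s\n' "$uri_scheme" "$uri_rest"
}

to_har_uri hdfs://Hadoop48:54310/user/zhouhh/output1.har
# prints har://hdfs-Hadoop48:54310/user/zhouhh/output1.har
```

The result can then be passed to `hadoop fs -lsr` exactly as in the command above.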

The 54310 in the URI is the HDFS port that my Hadoop setup configures in core-site.xml.

Delete the archive:

$ hadoop fs -rmr output1.har
Deleted hdfs://Hadoop48:54310/user/zhouhh/output1.har
