Hadoop Basics: Recursively Listing an HDFS File System with listStatus (FileStatus) and listFiles

阿新 • Published: 2018-05-26


Author: 尹正傑

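This post shows two ways to list an HDFS directory tree. fs.listStatus returns a FileStatus array describing the direct children of a path, both files and directories, so walking the whole tree means recursing by hand. fs.listFiles returns an iterator of LocatedFileStatus and can recurse on its own. LocatedFileStatus extends FileStatus, but listFiles yields only files: directories are descended into, not returned. Let's look at how each method is used.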

1. The listStatus method

/*
@author :yinzhengjie
Blog:http://www.cnblogs.com/yinzhengjie/tag/Hadoop%E8%BF%9B%E9%98%B6%E4%B9%8B%E8%B7%AF/
EMAIL:[email protected]
*/
package cn.org.yinzhengjie.day01.note1;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

import java.io.IOException;

public class HdfsDemo2 {
    public static void main(String[] args) throws IOException {
        list();
        System.out.println("======  separator  ========");
        tree("/shell");
    }

    // Print the tree under the given path, similar to the Linux tree command.
    private static void tree(String srcPath) throws IOException {
        // The root directory of my fully distributed Hadoop cluster grants write permission
        // only to the user yinzhengjie (not even to root, because this is HDFS, not Linux),
        // so the current user is set by hand through the HADOOP_USER_NAME property.
        System.setProperty("HADOOP_USER_NAME", "yinzhengjie");
        // A new Configuration automatically loads fs.defaultFS from the local core-site.xml
        // (placing that file in the project's resources directory is enough).
        Configuration conf = new Configuration();
        // Entry point: obtain the HDFS FileSystem for the fs.defaultFS read above.
        FileSystem fs = FileSystem.get(conf);
        // The path to list. The full form "hdfs://s101:8020/shell" also works, but since
        // core-site.xml already supplies the "hdfs://s101:8020" prefix, it can be omitted.
        Path path = new Path(srcPath);
        // listStatus returns a FileStatus array with one entry per direct child of the path.
        FileStatus[] fileStatuses = fs.listStatus(path);
        for (FileStatus fileStatus : fileStatuses) {
            // Is the current entry a directory?
            if (fileStatus.isDirectory()) {
                String dirPath = fileStatus.getPath().toString();
                System.out.println("Directory:" + fileStatus.getPath());
                // listStatus does not recurse, so descend into the directory ourselves.
                tree(dirPath);
            } else {
                System.out.println("File:" + fileStatus.getPath());
            }
        }
    }

    // List the direct children of a fixed path.
    private static void list() throws IOException {
        // Same user, configuration and file system setup as in tree() above.
        System.setProperty("HADOOP_USER_NAME", "yinzhengjie");
        Configuration conf = new Configuration();
        FileSystem fs = FileSystem.get(conf);
        // The scheme and authority come from fs.defaultFS in core-site.xml.
        Path path = new Path("/shell");
        // listStatus returns a FileStatus array with one entry per direct child of the path.
        FileStatus[] fileStatuses = fs.listStatus(path);
        for (FileStatus fileStatus : fileStatuses) {
            // Is the current entry a directory?
            boolean isDir = fileStatus.isDirectory();
            // Fully qualified path of the current entry.
            String fullPath = fileStatus.getPath().toString();
            System.out.println("isDir:" + isDir + ",Path:" + fullPath);
        }
    }
}

/*
Running the code above produces the following output:
isDir:true,Path:hdfs://s101:8020/shell/awk
isDir:true,Path:hdfs://s101:8020/shell/grep
isDir:true,Path:hdfs://s101:8020/shell/sed
isDir:false,Path:hdfs://s101:8020/shell/yinzhengjie.sh
======  separator  ========
Directory:hdfs://s101:8020/shell/awk
File:hdfs://s101:8020/shell/awk/keepalive.sh
File:hdfs://s101:8020/shell/awk/nginx.conf
Directory:hdfs://s101:8020/shell/grep
File:hdfs://s101:8020/shell/grep/1.txt
File:hdfs://s101:8020/shell/grep/2.txt
Directory:hdfs://s101:8020/shell/sed
File:hdfs://s101:8020/shell/sed/nagios.sh
File:hdfs://s101:8020/shell/sed/zabbix.sql
File:hdfs://s101:8020/shell/yinzhengjie.sh
 */
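
The tree() method above prints every entry flush left, so the nesting depth is not visible in the output. As a minimal sketch of how a depth parameter could add indentation, the following uses java.nio.file on the local file system rather than the Hadoop API, so that it runs without a cluster. LocalTreeSketch, tree and walk are hypothetical names, and Files.newDirectoryStream stands in here for fs.listStatus: both list only the direct children of a directory.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Local-file-system analog of tree(); this is NOT the Hadoop API.
class LocalTreeSketch {

    // Returns one line per entry under dir, indented two spaces per nesting level.
    static List<String> tree(Path dir) throws IOException {
        List<String> lines = new ArrayList<>();
        walk(dir, 0, lines);
        return lines;
    }

    private static void walk(Path dir, int depth, List<String> lines) throws IOException {
        // Collect only the direct children, as listStatus does for an HDFS directory.
        List<Path> children = new ArrayList<>();
        try (DirectoryStream<Path> stream = Files.newDirectoryStream(dir)) {
            for (Path child : stream) {
                children.add(child);
            }
        }
        // Directory streams have no guaranteed order, so sort for stable output.
        Collections.sort(children);

        String indent = String.join("", Collections.nCopies(depth, "  "));
        for (Path child : children) {
            if (Files.isDirectory(child)) {
                lines.add(indent + child.getFileName() + "/");
                // Descend by hand, passing the increased depth along.
                walk(child, depth + 1, lines);
            } else {
                lines.add(indent + child.getFileName());
            }
        }
    }

    public static void main(String[] args) throws IOException {
        // Build a tiny throwaway directory tree to print.
        Path root = Files.createTempDirectory("shell");
        Files.createDirectories(root.resolve("awk"));
        Files.createFile(root.resolve("awk").resolve("nginx.conf"));
        Files.createFile(root.resolve("yinzhengjie.sh"));
        for (String line : tree(root)) {
            System.out.println(line);
        }
    }
}
```

The same shape should carry over to HDFS by passing the FileSystem and a depth counter into the recursive call, which would also avoid building a new Configuration on every level.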

2. The listFiles method

/*
@author :yinzhengjie
Blog:http://www.cnblogs.com/yinzhengjie/tag/Hadoop%E8%BF%9B%E9%98%B6%E4%B9%8B%E8%B7%AF/
EMAIL:[email protected]
*/
package cn.org.yinzhengjie.day01.note1;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.LocatedFileStatus;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.fs.RemoteIterator;

import java.io.IOException;

public class HdfsDemo3 {
    public static void main(String[] args) throws IOException {
        autoList("/shell");
    }

    // Recursively list every file under the given path.
    private static void autoList(String path) throws IOException {
        // A new Configuration automatically loads fs.defaultFS from the local core-site.xml
        // (placing that file in the project's resources directory is enough).
        Configuration conf = new Configuration();
        // Entry point: obtain the HDFS FileSystem for the fs.defaultFS read above.
        FileSystem fs = FileSystem.get(conf);
        // listFiles recurses on its own and returns files only, as a remote iterator.
        // First argument: the path on the server; second argument: whether to recurse.
        RemoteIterator<LocatedFileStatus> iterator = fs.listFiles(new Path(path), true);
        while (iterator.hasNext()) {
            LocatedFileStatus fileStatus = iterator.next();
            Path fullPath = fileStatus.getPath();
            System.out.println(fullPath);
        }
    }
}

/*
Running the code above produces the following output:
hdfs://s101:8020/shell/awk/keepalive.sh
hdfs://s101:8020/shell/awk/nginx.conf
hdfs://s101:8020/shell/grep/1.txt
hdfs://s101:8020/shell/grep/2.txt
hdfs://s101:8020/shell/sed/nagios.sh
hdfs://s101:8020/shell/sed/zabbix.sql
hdfs://s101:8020/shell/yinzhengjie.sh
 */
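
The code above walks the result with an explicit while loop because RemoteIterator is not a java.util.Iterator: its hasNext() and next() both declare IOException, so it cannot be used in a for-each loop. The sketch below shows one way such an iterator could be drained into a List. To compile without Hadoop on the classpath it declares a stand-in interface, RemoteIteratorLike, with the same two methods. That interface, drain, and fromList are hypothetical names introduced for illustration only.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

class RemoteIteratorSketch {

    // Stand-in with the same two methods as org.apache.hadoop.fs.RemoteIterator,
    // declared here only so the sketch compiles without Hadoop.
    interface RemoteIteratorLike<E> {
        boolean hasNext() throws IOException;
        E next() throws IOException;
    }

    // Drains the iterator into a list; an IOException from either call propagates.
    static <E> List<E> drain(RemoteIteratorLike<E> it) throws IOException {
        List<E> out = new ArrayList<>();
        while (it.hasNext()) {
            out.add(it.next());
        }
        return out;
    }

    // Hypothetical helper for the demo: wraps an in-memory list as a RemoteIteratorLike.
    static <E> RemoteIteratorLike<E> fromList(List<E> items) {
        final Iterator<E> inner = items.iterator();
        return new RemoteIteratorLike<E>() {
            @Override public boolean hasNext() { return inner.hasNext(); }
            @Override public E next() { return inner.next(); }
        };
    }

    public static void main(String[] args) throws IOException {
        List<String> paths = drain(fromList(Arrays.asList(
                "/shell/awk/keepalive.sh", "/shell/yinzhengjie.sh")));
        for (String p : paths) {
            System.out.println(p);
        }
    }
}
```

With Hadoop available, the same loop could take a RemoteIterator<LocatedFileStatus> directly. Collecting into a list gives up the lazy, batched fetching of the iterator, so it is probably best kept for directories of modest size.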
