Hadoop Source Code Explained: The FileInputFormat Class

By 阿新 • Published: 2018-12-27

(This post on the `FileInputFormat` class is still being updated.)

1. Class description

A base class for file-based InputFormats.

FileInputFormat is the base class for all file-based InputFormats. This provides a generic implementation of getSplits(JobContext). Implementations of FileInputFormat can also override the isSplitable(JobContext, Path) method to prevent input files from being split-up in certain situations. Implementations that may deal with non-splittable files must override this method, since the default implementation assumes splitting is always possible.
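In short: `FileInputFormat` is the base class for all file-based InputFormats, and subclasses inherit its generic `getSplits(JobContext)` implementation. A subclass can override `isSplitable(JobContext, Path)` to stop input files from being split in certain situations. Any subclass that may read files which cannot be split must override this method and return `false`, because the default implementation assumes that every file can be split.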
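To make the relationship between `getSplits` and `isSplitable` concrete, here is a minimal toy model that uses only the Java standard library. It is not Hadoop's code: the class names `ToyFileInputFormat` and `WholeFileFormat`, the simplified method signatures, and the fixed-size splitting rule are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;

public class SplitSketch {

    /** Hypothetical stand-in for a file-based input format (not a Hadoop class). */
    static class ToyFileInputFormat {

        /** Like the documented default: assume every file can be split. */
        protected boolean isSplitable(String path) {
            return true;
        }

        /** Returns {offset, length} ranges that together cover the whole file. */
        public List<long[]> getSplits(String path, long fileLength, long splitSize) {
            List<long[]> splits = new ArrayList<>();
            if (!isSplitable(path)) {
                // A non-splittable file becomes a single split.
                splits.add(new long[] {0L, fileLength});
                return splits;
            }
            for (long offset = 0; offset < fileLength; offset += splitSize) {
                long length = Math.min(splitSize, fileLength - offset);
                splits.add(new long[] {offset, length});
            }
            return splits;
        }
    }

    /** Hypothetical subclass for files that must be read as a whole. */
    static class WholeFileFormat extends ToyFileInputFormat {
        @Override
        protected boolean isSplitable(String path) {
            return false;
        }
    }

    public static void main(String[] args) {
        long fileLength = 300L;
        long splitSize = 128L;
        // Default behaviour: ranges starting at offsets 0, 128 and 256.
        System.out.println(
            new ToyFileInputFormat().getSplits("/data/in.txt", fileLength, splitSize).size()); // prints 3
        // Overridden behaviour: the whole file is one split.
        System.out.println(
            new WholeFileFormat().getSplits("/data/in.gz", fileLength, splitSize).size()); // prints 1
    }
}
```

In real Hadoop code the method to override has the signature `protected boolean isSplitable(JobContext context, Path filename)`, as quoted in the class description above. Hadoop's actual split computation is more involved than this fixed-size loop.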

2. Class source

public abstract class FileInputFormat<K, V> extends InputFormat<K, V> {
...
}

3. Method details

3.1 The `setInputPaths()` method

static void 	setInputPaths(Job job, String commaSeparatedPaths)
Sets the given comma separated paths as the list of inputs for the map-reduce job.

static void 	setInputPaths(Job job, Path... inputPaths)
Set the array of Paths as the list of inputs for the map-reduce job.

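Note that when you call this method, the parameter name of the `String` overload starts with `commaSeparate` (it is `commaSeparatedPaths` in the Hadoop source). This indicates that the argument may be a comma-separated list of file paths.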
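As a rough illustration of what "comma-separated paths" means, the sketch below splits such a string into individual path strings using only the Java standard library. It is not Hadoop's implementation: the class `CommaSeparatedPathsSketch` and the method `splitPaths` are hypothetical names, and the rule that commas inside `{...}` glob groups do not separate paths is a simplified rendering of how Hadoop appears to treat glob patterns.

```java
import java.util.ArrayList;
import java.util.List;

public class CommaSeparatedPathsSketch {

    /**
     * Splits a comma-separated path string into individual paths.
     * Commas inside curly-brace glob groups such as {jan,feb} are kept.
     * Hypothetical helper for illustration only.
     */
    static List<String> splitPaths(String commaSeparatedPaths) {
        List<String> paths = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        int braceDepth = 0;
        for (char ch : commaSeparatedPaths.toCharArray()) {
            if (ch == '{') {
                braceDepth++;
            } else if (ch == '}' && braceDepth > 0) {
                braceDepth--;
            }
            if (ch == ',' && braceDepth == 0) {
                // A top-level comma ends the current path.
                paths.add(current.toString());
                current.setLength(0);
            } else {
                current.append(ch);
            }
        }
        paths.add(current.toString());
        return paths;
    }

    public static void main(String[] args) {
        // Two plain paths separated by a comma.
        System.out.println(splitPaths("/data/2018/a.txt,/data/2018/b.txt"));
        // The comma inside the glob group does not split the first path.
        System.out.println(splitPaths("/logs/{jan,feb}/part-*,/logs/mar"));
    }
}
```

With Hadoop on the classpath, the two overloads shown above would be called as `FileInputFormat.setInputPaths(job, "/data/a,/data/b")` or `FileInputFormat.setInputPaths(job, new Path("/data/a"), new Path("/data/b"))`. The paths here are made up for illustration.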
