Setting Up a Hadoop Environment in Eclipse
Adding Hadoop support to Eclipse
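1. Download and install Eclipse.
2. You need the hadoop-eclipse-plugin-2.6.0.jar plugin. The most reliable way to get it is to download the source from https://github.com/winghc/hadoop2x-eclipse-plugin and build it yourself. A prebuilt plugin also works.
3. Copy the built hadoop-eclipse-plugin-2.6.0.jar into the Eclipse plugin directory (the plugins directory), as shown in the figure.
Restart Eclipse.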
4. Configure the Hadoop installation directory in Eclipse.
Window -> Preferences -> Hadoop Map/Reduce -> Hadoop installation directory: specify the Hadoop installation directory here.
Click Apply, then click OK to confirm.
5. Configure the Map/Reduce views.
Window -> Open Perspective -> Other -> Map/Reduce -> click "OK"
Window -> Show View -> Other -> Map/Reduce Locations -> click "OK"
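6. On the "Map/Reduce Locations" tab, click the <elephant +> icon, or right-click in the blank area and choose "New Hadoop location...". The "New Hadoop location..." dialog opens; fill in the settings as follows.
Location name can be anything. Host is the IP address or host name of the machine that runs the master node of the Hadoop cluster. The MR Master Port must match the value in the mapred-site.xml configuration file (10020 here), and the DFS Master Port must match the value in the core-site.xml configuration file (9000 here). User name is root (the user that installed the Hadoop cluster). Then click Finish. When the Location name you just created (hadoop here) appears under DFS Locations in Eclipse, Eclipse has connected to the Hadoop cluster, as shown in the figure.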
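If the new location does not appear or cannot be expanded, it can help to confirm that the two ports are reachable from the machine running Eclipse. The following is a minimal sketch using only the Java standard library; the class name PortCheck, the default host "localhost", and the 2-second timeout are assumptions for illustration and are not part of the plugin or of Hadoop. It only tests that a TCP connection can be opened, not that the service behind the port is healthy.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

public class PortCheck {

    // Returns true if a TCP connection to host:port succeeds within timeoutMs.
    public static boolean canConnect(String host, int port, int timeoutMs) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMs);
            return true;
        } catch (IOException e) {
            // Connection refused, timed out, or host not resolvable.
            return false;
        }
    }

    public static void main(String[] args) {
        // Assumption: pass the master node's host name or IP as the first argument.
        String host = args.length > 0 ? args[0] : "localhost";
        System.out.println("DFS Master (9000):  " + canConnect(host, 9000, 2000));
        System.out.println("MR Master (10020): " + canConnect(host, 10020, 2000));
    }
}
```

If either line prints false, check the port values in core-site.xml and mapred-site.xml and any firewall between the two machines before changing the Eclipse settings.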
7. Open Project Explorer to browse the HDFS file system, as shown in the figure.
8. Create a new Map/Reduce job.
The Hadoop services must be running first.
File -> New -> Project -> Map/Reduce Project -> Next
Enter a project name.
Write the WordCount class:
package test;

import java.io.IOException;
import java.util.StringTokenizer;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.util.GenericOptionsParser;

public class WordCount {

    // Mapper: emits (word, 1) for every whitespace-separated token in a line.
    public static class MyMap extends Mapper<Object, Text, Text, IntWritable> {
        private final static IntWritable one = new IntWritable(1);
        private Text word = new Text();

        @Override
        public void map(Object key, Text value, Context context)
                throws IOException, InterruptedException {
            StringTokenizer itr = new StringTokenizer(value.toString());
            while (itr.hasMoreTokens()) {
                word.set(itr.nextToken());
                context.write(word, one);
            }
        }
    }

    // Reducer: sums the counts emitted for each word.
    public static class MyReduce extends Reducer<Text, IntWritable, Text, IntWritable> {
        private IntWritable result = new IntWritable();

        @Override
        public void reduce(Text key, Iterable<IntWritable> values, Context context)
                throws IOException, InterruptedException {
            int sum = 0;
            for (IntWritable val : values) {
                sum += val.get();
            }
            result.set(sum);
            context.write(key, result);
        }
    }

    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // Strip generic Hadoop options; what remains is <in> <out>.
        String[] otherArgs = new GenericOptionsParser(conf, args).getRemainingArgs();
        if (otherArgs.length != 2) {
            System.err.println("Usage: wordcount <in> <out>");
            System.exit(2);
        }
        // Job.getInstance replaces the deprecated Job(Configuration, String) constructor.
        Job job = Job.getInstance(conf, "word count");
        job.setJarByClass(WordCount.class);
        job.setMapperClass(MyMap.class);
        job.setReducerClass(MyReduce.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        FileInputFormat.addInputPath(job, new Path(otherArgs[0]));
        FileOutputFormat.setOutputPath(job, new Path(otherArgs[1]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
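To see what the mapper and reducer in the WordCount class compute without starting a cluster, the same logic can be sketched in plain Java. This is only an illustration under stated assumptions: the class name LocalWordCount and the sample sentence are made up, no Hadoop classes are involved, and the map and reduce phases are collapsed into one in-memory pass.

```java
import java.util.Map;
import java.util.StringTokenizer;
import java.util.TreeMap;

public class LocalWordCount {

    // Mirrors the map step (split on whitespace, emit 1 per token) and the
    // reduce step (sum the 1s per word) in a single in-memory pass.
    public static Map<String, Integer> count(String text) {
        Map<String, Integer> counts = new TreeMap<>();
        StringTokenizer itr = new StringTokenizer(text);
        while (itr.hasMoreTokens()) {
            counts.merge(itr.nextToken(), 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        // Hypothetical sample input.
        System.out.println(count("hello hadoop hello eclipse"));
        // prints {eclipse=1, hadoop=1, hello=2}
    }
}
```

The TreeMap keeps the words sorted, which resembles the key-sorted order in which a reducer's output is written for this job.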
Run the WordCount program:
Right-click and choose Run As -> Run Configurations
Select Java Application -> WordCount (the class to run) -> Arguments
In Program arguments, enter the input path and the output path, then click Run.
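The two program arguments are typically HDFS locations on the cluster configured earlier. As a rough sketch of how such arguments might be composed, the helper below builds hdfs:// URIs from a host, the DFS Master port, and a path. The class name HdfsArgs, the host name "master", and the directories /input and /output are all hypothetical; substitute the values of your own cluster.

```java
import java.net.URI;
import java.net.URISyntaxException;

public class HdfsArgs {

    // Builds an hdfs:// URI string from a host, a port, and an absolute path.
    public static String hdfsUri(String host, int port, String path) {
        try {
            return new URI("hdfs", null, host, port, path, null, null).toString();
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException("Invalid HDFS location: " + path, e);
        }
    }

    public static void main(String[] args) {
        // Hypothetical host and directories; 9000 is the DFS Master port used above.
        String in = hdfsUri("master", 9000, "/input");
        String out = hdfsUri("master", 9000, "/output");
        // The two values, separated by a space, go into Program arguments.
        System.out.println(in + " " + out);
    }
}
```

Note that the output directory must not already exist when the job starts; FileOutputFormat refuses to overwrite an existing directory, so delete it or pick a new name before rerunning.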