Setting Up a Hadoop Environment in Eclipse

Setting up Hadoop environment support in Eclipse:

1. Download and install Eclipse.

2. You need the hadoop-eclipse-plugin-2.6.0.jar plugin. The most thorough approach is to download the source from https://github.com/winghc/hadoop2x-eclipse-plugin and compile it yourself; a pre-built plugin also works. A build sketch follows.
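
The plugin is built with Apache Ant from the repository's src/contrib/eclipse-plugin directory. The sketch below is an assumption based on that repository's usual build; the Eclipse and Hadoop paths are placeholders, and the exact property names should be verified against the repository's README.

cd hadoop2x-eclipse-plugin/src/contrib/eclipse-plugin
# the version should match your Hadoop release; both paths below are examples only
ant jar -Dversion=2.6.0 -Declipse.home=/opt/eclipse -Dhadoop.home=/usr/local/hadoop-2.6.0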

3. Copy the compiled hadoop-eclipse-plugin-2.6.0.jar into Eclipse's plugins directory.
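
On Linux this is a single copy from the shell; the jar and Eclipse paths below are placeholders for your own locations.

# adjust both paths to where the jar was built and where Eclipse is installed
cp hadoop-eclipse-plugin-2.6.0.jar /opt/eclipse/plugins/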

Restart Eclipse.

4. Configure the Hadoop installation directory in Eclipse.

Window -> Preferences -> Hadoop Map/Reduce -> Hadoop installation directory: specify the Hadoop installation directory here.

Click Apply, then OK.

5. Configure the Map/Reduce perspective and view.

Window -> Open Perspective -> Other -> Map/Reduce -> click OK

Window -> Show View -> Other -> Map/Reduce Locations -> click OK

6. In the "Map/Reduce Locations" tab, click the elephant icon with the plus sign, or right-click the blank area and choose "New Hadoop location...". In the "New Hadoop location..." dialog that opens, fill in the settings.

Location name can be anything. Host is the IP address or hostname of the Hadoop cluster's master node. The MR Master port must match mapred-site.xml (10020 here), and the DFS Master port must match core-site.xml (9000 here). User name is root (the user the Hadoop cluster was installed under). Click Finish. The newly created location (named hadoop here) appears under DFS Locations in Eclipse, which means Eclipse has connected to the Hadoop cluster successfully. The configuration entries these two ports come from are sketched below.
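
For reference, a minimal sketch of the relevant configuration entries, assuming the master node's hostname is "master" (a placeholder) and assuming the MR Master port corresponds to the job history server address:

<!-- core-site.xml: the DFS Master port (9000) comes from fs.defaultFS -->
<property>
  <name>fs.defaultFS</name>
  <value>hdfs://master:9000</value>
</property>

<!-- mapred-site.xml: the MR Master port (10020) is assumed here to be the job history server address -->
<property>
  <name>mapreduce.jobhistory.address</name>
  <value>master:10020</value>
</property>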

7. Open the Project Explorer to browse the HDFS file system.

8. Create a new Map/Reduce project and job

The Hadoop services must be started first; a command sketch follows.
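
A sketch of the start-up commands, assuming a Hadoop 2.x cluster whose sbin directory is on the PATH:

start-dfs.sh                                  # start the NameNode and DataNodes
start-yarn.sh                                 # start the ResourceManager and NodeManagers
mr-jobhistory-daemon.sh start historyserver   # start the job history server (port 10020 above)
jps                                           # verify that the daemons are running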

File -> New -> Project -> Map/Reduce Project -> Next

Enter the project name.

Write the WordCount class:

package test;

import java.io.IOException;
import java.util.StringTokenizer;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.util.GenericOptionsParser;
public class WordCount {
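	// Mapper: tokenizes each input line and emits (word, 1) for every token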
	public static class MyMap extends Mapper<Object, Text, Text, IntWritable> {
		private final static IntWritable one = new IntWritable(1);
		private Text word = new Text();
		@Override
		public void map(Object key, Text value, Context context)
				throws IOException, InterruptedException {
			StringTokenizer itr = new StringTokenizer(value.toString());
			while (itr.hasMoreTokens()) {
				word.set(itr.nextToken());
				context.write(word, one);
			}
		}
	}

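	// Reducer: sums all the counts emitted for each word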
	public static class MyReduce extends
			Reducer<Text, IntWritable, Text, IntWritable> {
		private IntWritable result = new IntWritable();
		@Override
		public void reduce(Text key, Iterable<IntWritable> values, Context context)
				throws IOException, InterruptedException {
			int sum = 0;
			for (IntWritable val : values) {
				sum += val.get();
			}
			result.set(sum);
			context.write(key, result);
		}
	}

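	// Driver: parses the input/output paths, configures the job, and submits it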
	public static void main(String[] args) throws Exception {
		Configuration conf = new Configuration();
		String[] otherArgs = new GenericOptionsParser(conf, args).getRemainingArgs();
		if (otherArgs.length != 2) {
			System.err.println("Usage: wordcount <in> <out>");
			System.exit(2);
		}
		Job job = new Job(conf, "word count");
		job.setJarByClass(WordCount.class);
		job.setMapperClass(MyMap.class);
		job.setReducerClass(MyReduce.class);
		job.setOutputKeyClass(Text.class);
		job.setOutputValueClass(IntWritable.class);
		FileInputFormat.addInputPath(job, new Path(otherArgs[0]));
		FileOutputFormat.setOutputPath(job, new Path(otherArgs[1]));
		System.exit(job.waitForCompletion(true) ? 0 : 1);
	}
}

Run the WordCount program:

Right-click the class and choose Run As -> Run Configurations

Select Java Application -> WordCount (the class to run) -> Arguments

Fill in the input and output paths under Program arguments, then click Run. Note that the output directory must not already exist; an example of preparing the input data and of the arguments is sketched below.
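
A sketch, assuming the fs.defaultFS from step 6 and placeholder HDFS paths; upload some text files to HDFS first:

# the local and HDFS paths are examples only
hadoop fs -mkdir -p /user/root/input
hadoop fs -put ./*.txt /user/root/input

Then in Program arguments:

hdfs://master:9000/user/root/input hdfs://master:9000/user/root/output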
