Running a Hadoop MapReduce Program Locally on Windows

阿新 • Published: 2018-11-29

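1. Download Hadoop and install it locally on Windows

Download URL: https://archive.apache.org/dist/hadoop/core/hadoop-2.6.0/hadoop-2.6.0.tar.gz

2. Extract the archive, then set the environment variables

Create a new variable HADOOP_HOME with the value D:\software\hadoop-2.6.0

Add %HADOOP_HOME%\bin and %HADOOP_HOME%\sbin to Path

3. Sample code

Project structure: the project consists of the files listed below.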

pom.xml

<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
	xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
	<modelVersion>4.0.0</modelVersion>

	<groupId>com.xiaobao</groupId>
	<artifactId>hadooptest</artifactId>
	<version>0.0.1-SNAPSHOT</version>
	<packaging>jar</packaging>


	<dependencies>
		<dependency>
			<groupId>org.apache.hadoop</groupId>
			<artifactId>hadoop-client</artifactId>
			<version>2.6.0</version>
		</dependency>
	</dependencies>


</project>

log4j.properties

log4j.rootLogger = debug,stdout

log4j.appender.stdout = org.apache.log4j.ConsoleAppender
log4j.appender.stdout.Target = System.out
log4j.appender.stdout.layout = org.apache.log4j.PatternLayout
log4j.appender.stdout.layout.ConversionPattern = [%-5p] %d{yyyy-MM-dd HH:mm:ss,SSS} method:%l%n%m%n

WordCountMapper.java

package com.xiaobao.wordcount;

import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

import java.io.IOException;

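public class WordCountMapper extends Mapper<LongWritable, Text, Text, IntWritable> {

    // Reused across calls to avoid allocating two objects per word.
    private static final IntWritable ONE = new IntWritable(1);
    private final Text word = new Text();

    @Override
    protected void map(LongWritable key, Text value, Context context) throws IOException, InterruptedException {

        // key is the byte offset of the line; value is the line itself.
        String[] tokens = value.toString().split("\\s+");

        for (String token : tokens) {
            // A blank line or leading whitespace yields an empty token; skip it.
            if (token.isEmpty()) {
                continue;
            }
            word.set(token);
            context.write(word, ONE);
        }
    }
}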

WordCountReducer.java

package com.xiaobao.wordcount;

import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Reducer;

import java.io.IOException;

public class WordCountReducer extends Reducer<Text, IntWritable, Text, IntWritable> {

    @Override
    protected void reduce(Text key, Iterable<IntWritable> values, Context context) throws IOException, InterruptedException {
        // All counts emitted for the same word arrive together; add them up.
        int sum = 0;
        for (IntWritable value : values) {
            sum += value.get();
        }
        // Emit the word with its total count.
        context.write(key, new IntWritable(sum));
    }
}

RunJob.java

package com.xiaobao.wordcount;


import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class RunJob {

    public static void main(String[] args) throws Exception {

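        // With no core-site.xml on the classpath, fs.defaultFS keeps its default (file:///),
        // so all paths below refer to the local file system.
        Configuration configuration = new Configuration();

        FileSystem fs = FileSystem.get(configuration);

        Job job = Job.getInstance(configuration);
        job.setJarByClass(RunJob.class);
        job.setJobName("wordCount");

        job.setMapperClass(WordCountMapper.class);
        job.setReducerClass(WordCountReducer.class);

        job.setMapOutputKeyClass(Text.class);
        job.setMapOutputValueClass(IntWritable.class);
        // Declare the reducer's output types as well.
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);

        FileInputFormat.addInputPath(job, new Path("/wordcount/input"));
        Path outPath = new Path("/wordcount/output");
        // The job fails if the output directory already exists, so remove it first.
        if (fs.exists(outPath)) {
            fs.delete(outPath, true);
        }
        FileOutputFormat.setOutputPath(job, outPath);

        boolean completion = job.waitForCompletion(true);
        if (completion) {
            System.out.println("Job completed");
        }
    }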
}

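4. Create the folders D:\wordcount\input and D:\wordcount\output at the root of the drive that holds the project. The paths /wordcount/input and /wordcount/output in RunJob carry no drive letter, so they resolve against the drive the program is started from (D: here). RunJob deletes an existing output folder before each run. Put a file named words.txt in the input folder with the following content: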

hello world
hello hadoop
hadoop yes
hadoop no
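
The input file from step 4 can also be created programmatically instead of by hand. The following is a minimal sketch using only the JDK; the class name CreateWordCountInput and its two helper methods are hypothetical names introduced for illustration, and the default folder D:\wordcount\input is taken from step 4.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of the original project.
public class CreateWordCountInput {

    // The four sample lines used in this article.
    public static List<String> sampleLines() {
        return Arrays.asList("hello world", "hello hadoop", "hadoop yes", "hadoop no");
    }

    // Creates the folder if needed and writes words.txt into it.
    public static Path writeInput(Path inputDir) throws IOException {
        Files.createDirectories(inputDir);
        Path file = inputDir.resolve("words.txt");
        Files.write(file, sampleLines(), StandardCharsets.UTF_8);
        return file;
    }

    public static void main(String[] args) throws IOException {
        // Assumption: the input folder from step 4, unless another one is passed in.
        Path dir = Paths.get(args.length > 0 ? args[0] : "D:\\wordcount\\input");
        System.out.println("Wrote " + writeInput(dir));
    }
}
```

Run it once before starting RunJob; pass a different folder as the first argument if the project lives on another drive.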

5. Run the main method of RunJob. It fails with the following error:

[ERROR] 2018-11-13 10:12:43,981 method:org.apache.hadoop.util.Shell.getWinUtilsPath(Shell.java:373)
Failed to locate the winutils binary in the hadoop binary path
java.io.IOException: Could not locate executable null\bin\winutils.exe in the Hadoop binaries.
	at org.apache.hadoop.util.Shell.getQualifiedBinPath(Shell.java:355)
	at org.apache.hadoop.util.Shell.getWinUtilsPath(Shell.java:370)
	at org.apache.hadoop.util.Shell.<clinit>(Shell.java:363)
	at org.apache.hadoop.util.StringUtils.<clinit>(StringUtils.java:79)
	at org.apache.hadoop.security.Groups.parseStaticMapping(Groups.java:104)
	at org.apache.hadoop.security.Groups.<init>(Groups.java:86)
	at org.apache.hadoop.security.Groups.<init>(Groups.java:66)
	at org.apache.hadoop.security.Groups.getUserToGroupsMappingService(Groups.java:280)
	at org.apache.hadoop.security.UserGroupInformation.initialize(UserGroupInformation.java:271)
	at org.apache.hadoop.security.UserGroupInformation.ensureInitialized(UserGroupInformation.java:248)
	at org.apache.hadoop.security.UserGroupInformation.loginUserFromSubject(UserGroupInformation.java:763)
	at org.apache.hadoop.security.UserGroupInformation.getLoginUser(UserGroupInformation.java:748)
	at org.apache.hadoop.security.UserGroupInformation.getCurrentUser(UserGroupInformation.java:621)
	at org.apache.hadoop.fs.FileSystem$Cache$Key.<init>(FileSystem.java:2753)
	at org.apache.hadoop.fs.FileSystem$Cache$Key.<init>(FileSystem.java:2745)
	at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2611)
	at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:370)
	at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:169)
	at com.xiaobao.wordcount.RunJob.main(RunJob.java:24)
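
This error can have two causes: either the JVM does not see a Hadoop home directory at all (which is what the null in null\bin\winutils.exe suggests), or the directory is known but bin\winutils.exe is missing from it. A small pre-flight check along the following lines can tell the two apart. It is only a sketch: the class WinutilsPreflight and its methods are hypothetical, and the lookup order (the hadoop.home.dir system property first, then the HADOOP_HOME environment variable) reflects my understanding of Hadoop 2.x rather than a documented contract.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical diagnostic, not part of Hadoop or of the original project.
public class WinutilsPreflight {

    // Assumed lookup order: system property first, then environment variable.
    public static String resolveHadoopHome(String systemProperty, String environmentVariable) {
        if (systemProperty != null) {
            return systemProperty;
        }
        return environmentVariable; // may be null when neither is set
    }

    // Returns a one-line description of what is wrong, or "OK: <path>".
    public static String diagnose(String hadoopHome) {
        if (hadoopHome == null) {
            return "HADOOP_HOME and hadoop.home.dir are both unset in this JVM";
        }
        Path winutils = Paths.get(hadoopHome, "bin", "winutils.exe");
        if (!Files.isRegularFile(winutils)) {
            return "winutils.exe not found at " + winutils;
        }
        return "OK: " + winutils;
    }

    public static void main(String[] args) {
        String home = resolveHadoopHome(
                System.getProperty("hadoop.home.dir"),
                System.getenv("HADOOP_HOME"));
        System.out.println(diagnose(home));
    }
}
```

If the first message appears even though HADOOP_HOME was set in step 2, the IDE or terminal was probably started before the variable existed and needs to be restarted.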

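6. According to the log, the winutils.exe file cannot be found. The null at the start of null\bin\winutils.exe also indicates that the JVM did not see HADOOP_HOME, for example because the IDE was started before the variable was set in step 2. Download winutils.exe and make sure its version matches your Hadoop version; otherwise the error persists.

Download URL: https://github.com/steveloughran/winutils. Download the archive from GitHub, then copy the files under the bin directory of the matching version into the bin directory of your local Hadoop installation, overwriting the existing files.

7. Run the main method again; this time it succeeds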

D:\wordcount\output now contains the files _SUCCESS and part-r-00000, among others. _SUCCESS indicates that the job succeeded; part-r-00000 holds the word counts:

hadoop	3
hello	2
no	1
world	1
yes	1
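
The result above can be cross-checked without Hadoop. The sketch below reproduces the same computation in plain Java: splitting each line on whitespace plays the role of the mapper, and summing per word plays the role of the reducer. The class LocalWordCount is a hypothetical name, and the TreeMap only imitates the sorted key order of the job output (for ASCII words like these the two orderings agree).

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical stand-alone check, not part of the original project.
public class LocalWordCount {

    // Counts words across all lines; keys come back in sorted order.
    public static Map<String, Integer> count(List<String> lines) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String line : lines) {
            for (String word : line.split("\\s+")) {
                if (word.isEmpty()) {
                    continue; // blank lines and leading whitespace produce empty tokens
                }
                counts.merge(word, 1, Integer::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList(
                "hello world", "hello hadoop", "hadoop yes", "hadoop no");
        // Same tab-separated layout as part-r-00000.
        count(lines).forEach((word, n) -> System.out.println(word + "\t" + n));
    }
}
```

For the four sample lines this prints the same five rows as part-r-00000.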
