
[Linux Study Notes] Hadoop: Installation, Standalone Test, and Pseudo-Distributed Mode

Hadoop

1. Installation

Running Hadoop as root is not recommended, so create a dedicated user first.

[[email protected] ~]# useradd -u 1005 wpf

## Switch to the new user
[[email protected] ~]# su - wpf
Extract Hadoop
## Extract the archive
[[email protected] ~]$ tar zxf hadoop-2.7.3.tar.gz 

## Change into the configuration directory
[[email protected] ~]$ cd hadoop-2.7.3/
[[email protected] hadoop-2.7.3]$ cd etc/hadoop/
Set the JDK path
## Set JAVA_HOME in hadoop-env.sh
[[email protected] hadoop]$ vim hadoop-env.sh 
# The java implementation to use.
export JAVA_HOME=/usr/java/jdk1.8.0_121/
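
A wrong JAVA_HOME is easier to spot before Hadoop is started. The following is a minimal sketch, assuming a helper named check_java_home (hypothetical, not part of Hadoop) that only tests whether the given directory contains an executable bin/java.

```shell
#!/bin/sh
# check_java_home is a hypothetical helper, not part of Hadoop.
# It succeeds only if DIR/bin/java exists and is executable.
# Usage: check_java_home DIR
check_java_home() {
    dir=${1%/}                      # drop one trailing slash, if any
    if [ -x "$dir/bin/java" ]; then
        echo "OK: $dir/bin/java found"
    else
        echo "ERROR: no executable $dir/bin/java" >&2
        return 1
    fi
}

# Example call with the path written to hadoop-env.sh:
# check_java_home /usr/java/jdk1.8.0_121/
```

The function returns a non-zero status on failure, so it can guard later steps, for example `check_java_home "$JAVA_HOME" && sbin/start-dfs.sh`.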

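2. Standalone test

Standalone mode queries data on the local machine. Big-data tools such as Hadoop are normally used to query very large data sets.

## Create an input directory and copy every XML file from etc/hadoop into it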
[[email protected] hadoop-2.7.3]$ mkdir input/
[[email protected] hadoop-2.7.3]$ cp etc/hadoop/*.xml input
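Query the data by running the matching example jar
## List the available jars (the path is left incomplete here; Tab completion or ls shows the files)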
[[email protected] hadoop-2.7.3]$ bin/hadoop jar share/hadoop/mapreduce/

## Run the grep example: count the strings in input that match 'dfs[a-z.]+' and write the result to output
[[email protected] hadoop-2.7.3]$ bin/hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.3.jar grep input output 'dfs[a-z.]+'

## View the filtered result
[[email protected] hadoop-2.7.3]$ cat output/*
1	dfsadmin
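
To make clear what the example job computes, the same count can be reproduced on local files with standard tools. This is only a sketch: count_matches is a hypothetical helper name, and the -o and -h options of grep are assumed to be available (they are in GNU and BSD grep).

```shell
#!/bin/sh
# count_matches is a hypothetical helper that mimics, on local files, what the
# MapReduce grep example computes: how often each regex match occurs,
# most frequent match first.
# Usage: count_matches DIR REGEX
count_matches() {
    # -o prints only the matched text, -h hides file names, -E enables '+'
    grep -ohE "$2" "$1"/* | sort | uniq -c | sort -rn |
        awk '{ print $1 "\t" $2 }'   # normalise to "count<TAB>match"
}

# Example: count_matches input 'dfs[a-z.]+'
```

Unlike the Hadoop job, this runs in a single process, so it is useful for checking a regular expression but not for large inputs.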

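3. Pseudo-distributed mode

A real distributed deployment needs several hosts. Since those are not available here, all daemons run on a single host, which is called pseudo-distributed mode.

Edit two configuration files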
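
As a scripted alternative to editing the two files by hand, they can be written from a shell function. This is a sketch under stated assumptions: write_pseudo_conf is a hypothetical helper, it overwrites any existing copies, and it sets dfs.replication to the digit 1 because a single DataNode can hold only one replica of each block.

```shell
#!/bin/sh
# write_pseudo_conf is a hypothetical helper, not a Hadoop tool.
# It writes minimal core-site.xml and hdfs-site.xml files into CONF_DIR,
# overwriting any existing copies.
# Usage: write_pseudo_conf CONF_DIR HOST_IP
write_pseudo_conf() {
    conf_dir=$1
    host_ip=$2
    cat > "$conf_dir/core-site.xml" <<EOF
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://$host_ip:9000</value>
  </property>
</configuration>
EOF
    cat > "$conf_dir/hdfs-site.xml" <<EOF
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
</configuration>
EOF
}

# Example: write_pseudo_conf etc/hadoop 172.25.254.112
```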

[[email protected] hadoop]$ vim core-site.xml 
<configuration>
 <property>
   <name>fs.defaultFS</name>
   <value>hdfs://172.25.254.112:9000</value>	<!-- IP address of this host -->
 </property>
</configuration>

## Set the replication factor
[[email protected] hadoop]$ vim hdfs-site.xml 
<configuration>
  <property>
   <name>dfs.replication</name>
   <value>1</value>
  </property>
</configuration>
Generate an SSH key pair
[[email protected] ~]$ ssh-keygen
Generating public/private rsa key pair.
Enter file in which to save the key (/home/wpf/.ssh/id_rsa): 
Enter passphrase (empty for no passphrase): 
Enter same passphrase again: 
Your identification has been saved in /home/wpf/.ssh/id_rsa.
Your public key has been saved in /home/wpf/.ssh/id_rsa.pub.
The key fingerprint is:
a7:f9:2b:0c:b9:be:10:d1:cd:b8:4e:13:81:96:37:00 [email protected]
The key's randomart image is:
+--[ RSA 2048]----+
|  E..+.          |
|    +.o=         |
|   ...+.o        |
|     . o         |
|    . +.S .      |
|     +o. +       |
|    . .+o        |
|     .. o.       |
|     .o. .o.     |
+-----------------+
Set the worker IP address
[[email protected] ~]$ cd -
/home/wpf/hadoop-2.7.3/etc/hadoop

## Replace the contents of slaves with the host's IP address
[[email protected] hadoop]$ vim slaves
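Connect to the host at this IP address. Before doing so, set a password for the user as root (for example with passwd wpf).
## Connect to the IP address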
[[email protected] hadoop]$ ssh 172.25.254.112
wpf@172.25.254.112's password: 
Last login: Thu Jan 11 03:48:29 2018

[[email protected] ~]$ ssh-copy-id 172.25.254.112
/usr/bin/ssh-copy-id: INFO: attempting to log in with the new key(s), to filter out any that are already installed
/usr/bin/ssh-copy-id: INFO: 1 key(s) remain to be installed -- if you are prompted now it is to install the new keys
wpf@172.25.254.112's password: 

Number of key(s) added: 1

Now try logging into the machine, with:   "ssh '172.25.254.112'"
and check to make sure that only the key(s) you wanted were added.
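
Because everything runs on one host, the key can also be installed without a network round trip. The sketch below assumes a hypothetical helper named install_pubkey that appends a public key to an authorized_keys file only if that exact line is not there yet, which is roughly what ssh-copy-id does on the remote side.

```shell
#!/bin/sh
# install_pubkey is a hypothetical helper. It appends the public key in
# PUBKEY_FILE to AUTHORIZED_KEYS_FILE unless that exact line already exists.
# Usage: install_pubkey PUBKEY_FILE AUTHORIZED_KEYS_FILE
install_pubkey() {
    pub=$1
    auth=$2
    mkdir -p "$(dirname "$auth")"
    touch "$auth"
    # -F fixed string, -x whole line, -q quiet, -e marks the pattern
    if ! grep -qxF -e "$(cat "$pub")" "$auth"; then
        cat "$pub" >> "$auth"
    fi
    chmod 600 "$auth"               # keep the file private to its owner
}

# Example, after a non-interactive key generation:
# ssh-keygen -t rsa -N '' -f ~/.ssh/id_rsa
# install_pubkey ~/.ssh/id_rsa.pub ~/.ssh/authorized_keys
```

Running the function twice leaves a single entry, so it is safe to call from a setup script that may be repeated.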
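Start HDFS with start-dfs.sh. On a fresh installation the NameNode has to be formatted once beforehand with bin/hdfs namenode -format.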
[[email protected] hadoop-2.7.3]$ sbin/start-dfs.sh
172.25.254.112: starting namenode, logging to /home/wpf/hadoop-2.7.3/logs/hadoop-wpf-namenode-localhost.out
172.25.254.112: starting datanode, logging to /home/wpf/hadoop-2.7.3/logs/hadoop-wpf-datanode-localhost.out
Starting secondary namenodes [0.0.0.0]
The authenticity of host '0.0.0.0 (0.0.0.0)' can't be established.
ECDSA key fingerprint is eb:24:0e:07:96:26:b1:04:c2:37:0c:78:2d:bc:b0:08.
Are you sure you want to continue connecting (yes/no)? yes
0.0.0.0: Warning: Permanently added '0.0.0.0' (ECDSA) to the list of known hosts.
0.0.0.0: starting secondarynamenode, logging to /home/wpf/hadoop-2.7.3/logs/hadoop-wpf-secondarynamenode-localhost.out

## Check the running Java processes with jps
[[email protected] hadoop-2.7.3]$ jps
4072 DataNode
4394 Jps

## Location of the jps binary
[[email protected] hadoop-2.7.3]$ which jps
/usr/bin/jps
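
Reading jps output by eye is error-prone, so the check can be scripted. The sketch below assumes a hypothetical helper named missing_daemons and assumes that a working pseudo-distributed HDFS shows NameNode, DataNode and SecondaryNameNode in jps.

```shell
#!/bin/sh
# missing_daemons is a hypothetical helper. It reads `jps` output on stdin
# and prints each expected HDFS daemon that is not listed.
missing_daemons() {
    awk '
        { seen[$2] = 1 }             # column 2 of jps output is the class name
        END {
            n = split("NameNode DataNode SecondaryNameNode", want, " ")
            for (i = 1; i <= n; i++)
                if (!(want[i] in seen)) print want[i]
        }'
}

# Example: jps | missing_daemons
```

Comparing whole names avoids counting SecondaryNameNode as a NameNode. If the function prints anything, the corresponding files under logs/ are the place to look.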
Check the result in a browser, or print a report on the command line (shown as dual-core)
[[email protected] hadoop-2.7.3]$ bin/hdfs dfsadmin -report
Configured Capacity: 10725273600 (9.99 GB)
Present Capacity: 5367533568 (5.00 GB)
DFS Remaining: 5367529472 (5.00 GB)
DFS Used: 4096 (4 KB)
DFS Used%: 0.00%
Under replicated blocks: 0
Blocks with corrupt replicas: 0
Missing blocks: 0
Missing blocks (with replication factor 1): 0

-------------------------------------------------
Live datanodes (1):

Name: 172.25.254.193:50010 (www.westos.org)
Hostname: www.westos.org
Decommission Status : Normal
Configured Capacity: 10725273600 (9.99 GB)
DFS Used: 4096 (4 KB)
Non DFS Used: 5357740032 (4.99 GB)
DFS Remaining: 5367529472 (5.00 GB)
DFS Used%: 0.00%
DFS Remaining%: 50.05%
Configured Cache Capacity: 0 (0 B)
Cache Used: 0 (0 B)
Cache Remaining: 0 (0 B)
Cache Used%: 100.00%
Cache Remaining%: 0.00%
Xceivers: 1
Last contact: Thu Jan 11 04:20:08 EST 2018
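
The percentages in the report follow directly from the byte counts, which is handy when monitoring free space from a script. A minimal sketch, assuming two hypothetical helpers, report_bytes and remaining_percent, that read the report text from standard input:

```shell
#!/bin/sh
# report_bytes is a hypothetical helper. It reads `hdfs dfsadmin -report`
# output on stdin and prints the byte count of the first line whose label
# equals LABEL.
# Usage: report_bytes LABEL
report_bytes() {
    awk -F': ' -v label="$1" \
        '$1 == label { split($2, v, " "); print v[1]; exit }'
}

# remaining_percent prints DFS Remaining as a percentage of
# Configured Capacity, using the first occurrence of each line.
remaining_percent() {
    report=$(cat)
    cap=$(printf '%s\n' "$report" | report_bytes "Configured Capacity")
    rem=$(printf '%s\n' "$report" | report_bytes "DFS Remaining")
    [ -n "$cap" ] && [ -n "$rem" ] || return 1
    awk -v c="$cap" -v r="$rem" 'BEGIN { printf "%.2f%%\n", 100 * r / c }'
}

# Example: bin/hdfs dfsadmin -report | remaining_percent
```

With the figures above, 5367529472 of 10725273600 bytes remain, which gives the 50.05% shown as DFS Remaining%.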