Big Data Study Notes: Deploying a CDH Cluster

1. Software environment and IP plan

Operating system: RHEL6

Software:

jdk-8u45
apache-maven-3.3.9
hive-1.1.0-cdh5.7.1-src.tar.gz
hadoop-2.8.1.tar.gz
mysql-connector-java-6.0.6.tar.gz
cloudera-manager-el6-cm5.9.3_x86_64.tar
mysql-5.7
CDH-5.9.3-1.cdh5.9.3.p0.4-el6

IP address      Role                      Hostname
172.16.18.133   NN && SN && JobTracker    hadoop01
172.16.18.134   DN && TaskTracker         hadoop02
172.16.18.136   DN && TaskTracker         hadoop03
172.16.18.143   DN && TaskTracker         hadoop04
172.16.18.145   DN && TaskTracker         hadoop05

NN = NameNode, SN = SecondaryNameNode, DN = DataNode

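Cluster overview:

There are three main free Hadoop distributions, all from outside China: Apache (the original version, on which every other distribution is based), Cloudera (Cloudera's Distribution Including Apache Hadoop, abbreviated CDH), and Hortonworks (Hortonworks Data Platform, abbreviated HDP). In China, the large majority of users choose CDH.

CDH is one of the many Hadoop branches. It is maintained by Cloudera, built on a stable Apache Hadoop release, and integrates many patches, so it can be used directly in production.

Cloudera Manager is a component for installing, monitoring, and managing Hadoop and related big-data services across a cluster. It greatly simplifies installing and configuring the hosts and services such as Hadoop, Hive, and Spark.

Cluster installation: this article uses the offline installation of CDH.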

https://www.cloudera.com/downloads/cdh/5-9-0.html 

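The page above is the official description of CDH, including the list of supported operating system, JDK, and database versions.

The official site lists JDK 1.8 as supported, but some 1.8 builds cause errors, so we use JDK 1.7.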

2. Preparing the software packages

Cloudera Manager package:

http://archive.cloudera.com/cm5/cm/5/cloudera-manager-el6-cm5.9.3_x86_64.tar.gz

CDH parcel (download the one that matches your Linux version):

http://archive.cloudera.com/cdh5/parcels/5.9.3/CDH-5.9.3-1.cdh5.9.3.p0.4-el6.parcel

http://archive.cloudera.com/cdh5/parcels/5.9.3/CDH-5.9.3-1.cdh5.9.3.p0.4-el6.parcel.sha1

MySQL JDBC driver:

http://download.softagency.net/MySQL/Downloads/Connector-J/mysql-connector-java-6.0.6.tar.gz

3. System configuration

Apply the same settings on every host: install the JDK, disable SELinux and iptables, configure /etc/hosts, and configure yum.

# vim /etc/profile
export JAVA_HOME=/usr/java/jdk1.7.0_79
export PATH=$JAVA_HOME/bin:$ORACLE_HOME/bin:$R_HOME/bin:$PATH

# getenforce
Disabled

# iptables -L
Chain INPUT (policy ACCEPT)
target     prot opt source               destination

Chain FORWARD (policy ACCEPT)
target     prot opt source               destination

Chain OUTPUT (policy ACCEPT)
target     prot opt source               destination

# cat /etc/hosts
127.0.0.1        localhost
172.16.18.133    hadoop01
172.16.18.134    hadoop02
172.16.18.136    hadoop03
172.16.18.143    hadoop04
172.16.18.145    hadoop05

# hostname
hadoop01

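4. Configuring passwordless SSH trust

Follow the same SSH trust setup used for a pseudo-distributed installation.

useradd hadoop    # create the user

Then verify from every node that each host can be reached without typing "yes" or a password:

ssh hadoop01 date
ssh hadoop02 date
ssh hadoop03 date
ssh hadoop04 date
ssh hadoop05 date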
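
The per-host check can be scripted so that a missing key or an unanswered host-key question fails immediately instead of waiting for input. This is only a minimal sketch: it assumes the five hostnames from the IP plan, and `check_ssh_trust` and the `SSH_BIN` override are hypothetical names introduced here, not part of any tool. `BatchMode=yes` tells ssh to fail rather than prompt.

```shell
#!/usr/bin/env bash
# Hosts from the IP plan in section 1.
HOSTS="hadoop01 hadoop02 hadoop03 hadoop04 hadoop05"

# SSH_BIN is a hypothetical override so the loop can be dry-run (e.g. SSH_BIN=echo).
SSH_BIN="${SSH_BIN:-ssh}"

# Run `date` on every host without allowing any interactive prompt.
# Prints one status line per host and returns non-zero if any host failed.
check_ssh_trust() {
    local host failed=0
    for host in $HOSTS; do
        if "$SSH_BIN" -n -o BatchMode=yes -o ConnectTimeout=5 "$host" date >/dev/null 2>&1; then
            echo "OK   $host"
        else
            echo "FAIL $host"
            failed=1
        fi
    done
    return "$failed"
}
```

Calling `check_ssh_trust` on each node in turn shows which direction of trust, if any, is still missing.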

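5. Setting swappiness to 0

cat /proc/sys/vm/swappiness    # show the current value

Either of the following sets the value for the running kernel:

sysctl vm.swappiness=0

echo 0 > /proc/sys/vm/swappiness

To silence the transparent hugepage warning: echo never > /sys/kernel/mm/transparent_hugepage/defrag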
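
Both swappiness commands only change the running kernel, so the value is lost at the next reboot. One way to keep it is to record it in a sysctl configuration file. The sketch below is an assumption about how you might do that idempotently; `set_sysctl_conf` is a hypothetical helper name, and it takes the file path as an argument so it can be tried on a scratch copy first.

```shell
#!/usr/bin/env bash
# set_sysctl_conf FILE KEY VALUE
# Ensure FILE ends up with exactly one "KEY = VALUE" line.
# Existing lines for KEY are dropped; comments and other keys are kept.
set_sysctl_conf() {
    local file="$1" key="$2" value="$3" tmp
    [ -f "$file" ] || : > "$file" || return 1
    tmp=$(mktemp) || return 1
    # Compare the text before "=" (spaces and tabs removed) with KEY exactly.
    awk -F= -v k="$key" '{ name = $1; gsub(/[ \t]/, "", name); if (name != k) print }' "$file" > "$tmp"
    echo "$key = $value" >> "$tmp"
    # Write back through cat so the file keeps its owner and mode.
    cat "$tmp" > "$file"
    rm -f "$tmp"
}
```

As root, `set_sysctl_conf /etc/sysctl.conf vm.swappiness 0` followed by `sysctl -p` would apply and persist the setting.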

6. Configuring the NTP server

First choose one host as the time master; all other servers synchronise against it.

# hwclock -w

Enable ntpd at boot:

# chkconfig ntpd on
# chkconfig --list ntpd

On the master, edit /etc/ntp.conf:

# vi /etc/ntp.conf

    (Find this line, uncomment the restrict entry, and change the address to match your network.)

    # Hosts on local network are less restricted.
    restrict 192.168.128.1 mask 255.255.255.0 nomodify notrap

    (Find these lines and comment out the server entries.)

    # Please consider joining the pool (http://www.pool.ntp.org/join.html).
    #server 0.rhel.pool.ntp.org iburst
    #server 1.rhel.pool.ntp.org iburst
    #server 2.rhel.pool.ntp.org iburst
    #server 3.rhel.pool.ntp.org iburst

    Add the following two lines so that the master serves its own local clock:

    server  127.127.1.0     # local clock
    fudge   127.127.1.0 stratum 10

Configure the other servers:

# vi /etc/ntp.conf

    # Hosts on local network are less restricted.
    restrict 192.168.128.1 nomodify notrap noquery

    (Comment out the server entries below.)

    # Please consider joining the pool (http://www.pool.ntp.org/join.html).
    #server 0.rhel.pool.ntp.org iburst
    #server 1.rhel.pool.ntp.org iburst
    #server 2.rhel.pool.ntp.org iburst
    #server 3.rhel.pool.ntp.org iburst

    Point the client at the time master (use the master's own address here):

    server 192.168.128.51

Restart the ntp service on all hosts:

# service ntpd restart

# ntpstat
synchronised to NTP server (172.16.18.33) at stratum 12
   time correct to within 18 ms
   polling server every 64 s

# date
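
With five hosts, reading each ntpstat report by eye gets tedious. The following sketch pulls the accuracy figure out of ntpstat's text output and compares it with a limit. It assumes the output format shown above; `ntp_accuracy_ms` and `ntp_within` are hypothetical helper names introduced for illustration.

```shell
#!/usr/bin/env bash
# Read ntpstat output on stdin and print the N from "time correct to within N ms".
# Prints nothing when that line is absent (for example, when unsynchronised).
ntp_accuracy_ms() {
    sed -n 's/.*time correct to within \([0-9][0-9]*\) ms.*/\1/p'
}

# ntp_within LIMIT_MS
# Succeed only if the ntpstat output on stdin reports an accuracy <= LIMIT_MS.
ntp_within() {
    local limit="$1" accuracy
    accuracy=$(ntp_accuracy_ms)
    [ -n "$accuracy" ] && [ "$accuracy" -le "$limit" ]
}
```

For example, `ssh hadoop02 ntpstat | ntp_within 100 || echo "hadoop02 clock is off"` flags a host whose reported accuracy is worse than 100 ms.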

7. Disabling IPv6 and transparent huge pages

# echo "alias ipv6 off" >> /etc/modprobe.d/dist.conf
# echo "alias net-pf-10 off" >> /etc/modprobe.d/dist.conf

# echo never > /sys/kernel/mm/transparent_hugepage/defrag
# echo 'echo never > /sys/kernel/mm/transparent_hugepage/defrag' >> /etc/rc.local
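
To confirm that the transparent huge page setting took effect, read the sysfs file back: the kernel lists every option and wraps the active one in square brackets, for example `always madvise [never]`. The sketch below extracts that bracketed word. `thp_active` is a hypothetical helper name, and the path is passed in because on some RHEL 6 kernels the directory is named redhat_transparent_hugepage instead.

```shell
#!/usr/bin/env bash
# thp_active FILE
# Print the currently selected option from a THP sysfs file, that is, the
# word inside the square brackets of a line such as "always madvise [never]".
thp_active() {
    sed -n 's/.*\[\([^]]*\)\].*/\1/p' "$1"
}
```

Running `thp_active /sys/kernel/mm/transparent_hugepage/defrag` after the echo commands should print `never`.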

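8. Preparing the MySQL database

Adjust the MySQL privileges so that root can connect from any host. The delete statement edits the grant table directly, so flush privileges comes last:

GRANT ALL PRIVILEGES ON *.* TO 'root'@'%' IDENTIFIED BY '123' WITH GRANT OPTION;

delete from mysql.user where host != '%';

flush privileges;

Then confirm that a login over the network works:

# mysql -h 172.16.18.133 -uroot -p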

############################### Preparation ########################################

hadoop01 Server || Agent
hadoop02 Agent
hadoop03 Agent

CDH is deployed on three servers; the remaining two are reserved for adding nodes to the cluster later.

########################################################################

10. Installing CM

Install the Cloudera Manager Server and Agent. Every node in the CDH cluster needs the software unpacked and the service account created.

Master node:

# pwd
/opt/software

# ls cloudera-manager-el6-cm5.9.3_x86_64.tar.gz
cloudera-manager-el6-cm5.9.3_x86_64.tar.gz

# mkdir /opt/cloudera-manager

# tar zxvf cloudera-manager-el6-cm5.9.3_x86_64.tar.gz -C /opt/cloudera-manager/

Agent configuration, in this file:

/opt/cloudera-manager/cm-5.9.3/etc/cloudera-scm-agent/config.ini

server_host=hadoop01    # hostname of the CM Server
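
If you prefer not to edit config.ini by hand, the server_host line can be rewritten with a small function. This is a sketch under the assumption that the file already contains a line starting with `server_host=`; `set_server_host` is a hypothetical helper name. It goes through a temporary file rather than relying on in-place editing options.

```shell
#!/usr/bin/env bash
# set_server_host FILE HOST
# Rewrite the server_host= line of an agent config.ini to point at HOST.
# Fails without touching FILE if no server_host= line exists.
set_server_host() {
    local file="$1" host="$2" tmp
    grep -q '^server_host=' "$file" || { echo "no server_host line in $file" >&2; return 1; }
    tmp=$(mktemp) || return 1
    sed "s/^server_host=.*/server_host=${host}/" "$file" > "$tmp" && cat "$tmp" > "$file"
    local rc=$?
    rm -f "$tmp"
    return "$rc"
}
```

For this cluster the call would be `set_server_host /opt/cloudera-manager/cm-5.9.3/etc/cloudera-scm-agent/config.ini hadoop01`. Doing it on the master before copying the directory to the agents means every copy already carries the right value.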

# useradd --system --no-create-home --shell=/bin/false --comment "Cloudera SCM User" cloudera-scm

# id cloudera-scm
uid=495(cloudera-scm) gid=492(cloudera-scm) groups=492(cloudera-scm)

On hadoop02, hadoop03, and every other agent node:

useradd --system --no-create-home --shell=/bin/false --comment "Cloudera SCM User" cloudera-scm

mkdir /opt/cloudera-manager

Then copy the unpacked directory from the master to each agent:

# scp -r /opt/cloudera-manager/cm-5.9.3  hadoop02:/opt/cloudera-manager/

# scp -r /opt/cloudera-manager/cm-5.9.3  hadoop03:/opt/cloudera-manager/

11. Configuring the CM Server database

Now prepare the MySQL database for Cloudera Manager.

# mysql -h172.16.18.133 -uroot -p

mysql> GRANT ALL PRIVILEGES ON *.* TO 'root'@'%' IDENTIFIED BY '123' WITH GRANT OPTION;

mysql> flush privileges;

# pwd
/opt/cloudera-manager/cm-5.9.3/share/cmf/schema

# ./scm_prepare_database.sh mysql -hhadoop01 -uroot -p123 --scm-host hadoop01 cmdb root 123

JAVA_HOME=/usr/java/jdk1.7.0_79

Verifying that we can write to /opt/cloudera-manager/cm-5.9.3/etc/cloudera-scm-server

Sat Apr 28 14:20:38 CST 2018 WARN: Establishing SSL connection without server's identity verification is not recommended. According to MySQL 5.5.45+, 5.6.26+ and 5.7.6+ requirements SSL connection must be established by default if explicit option isn't set. For compliance with existing applications not using SSL the verifyServerCertificate property is set to 'false'. You need either to explicitly disable SSL by setting useSSL=false, or set useSSL=true and provide truststore for server certificate verification.

Creating SCM configuration file in /opt/cloudera-manager/cm-5.9.3/etc/cloudera-scm-server

Executing:  /usr/java/jdk1.7.0_79/bin/java -cp /usr/share/java/mysql-connector-java.jar:/usr/share/java/oracle-connector-java.jar:/opt/cloudera-manager/cm-5.9.3/share/cmf/schema/../lib/* com.cloudera.enterprise.dbutil.DbCommandExecutor /opt/cloudera-manager/cm-5.9.3/etc/cloudera-scm-server/db.properties com.cloudera.cmf.db.

Sat Apr 28 14:20:39 CST 2018 WARN: Establishing SSL connection without server's identity verification is not recommended. According to MySQL 5.5.45+, 5.6.26+ and 5.7.6+ requirements SSL connection must be established by default if explicit option isn't set. For compliance with existing applications not using SSL the verifyServerCertificate property is set to 'false'. You need either to explicitly disable SSL by setting useSSL=false, or set useSSL=true and provide truststore for server certificate verification.

[                          main] DbCommandExecutor              INFO  Successfully connected to database.

All done, your SCM database is configured correctly!

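Notes: this script creates and configures the database that the Cloudera Manager Server (CMS) needs. Its arguments are:

mysql: the database type is MySQL; if Oracle is used for the installation, pass oracle instead.

-hhadoop01: the database lives on host hadoop01, that is, the master node.

-uroot: connect to MySQL as root.

-p123: the MySQL root password is 123.

--scm-host hadoop01: the host running the CMS, which is usually the same host that runs MySQL.

The last three arguments are the database name, the database user name, and the database password.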
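
Because the script mixes options with three positional arguments, it is easy to put a value in the wrong slot. The sketch below names each value and prints the resulting command line for review instead of running it. The variable names and `build_prepare_cmd` are hypothetical, chosen here for illustration; the values are the ones used in this section.

```shell
#!/usr/bin/env bash
# Values taken from the command in this section.
DB_TYPE=mysql
DB_HOST=hadoop01        # host running MySQL (-h)
DB_ADMIN_USER=root      # account used to create the database (-u)
DB_ADMIN_PASS=123       # its password (-p)
SCM_HOST=hadoop01       # host running the Cloudera Manager Server
SCM_DB_NAME=cmdb        # positional 1: database name
SCM_DB_USER=root        # positional 2: database user name
SCM_DB_PASS=123         # positional 3: database password

# Print the command line instead of running it, so it can be checked first.
build_prepare_cmd() {
    printf './scm_prepare_database.sh %s -h%s -u%s -p%s --scm-host %s %s %s %s\n' \
        "$DB_TYPE" "$DB_HOST" "$DB_ADMIN_USER" "$DB_ADMIN_PASS" \
        "$SCM_HOST" "$SCM_DB_NAME" "$SCM_DB_USER" "$SCM_DB_PASS"
}

build_prepare_cmd
```

Once the printed line looks right, it can be pasted into a shell in the schema directory.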

12. Creating the local CDH parcel repository

Server node:

mkdir -p /opt/cloudera/parcel-repo

chown cloudera-scm:cloudera-scm /opt/cloudera/parcel-repo

Agent nodes:

mkdir -p /opt/cloudera/parcels

chown cloudera-scm:cloudera-scm /opt/cloudera/parcels

Upload CDH-5.9.3-1.cdh5.9.3.p0.4-el6.parcel and manifest.json to /opt/cloudera/parcel-repo/ on the master node:

# cd /opt/cloudera/parcel-repo/

# ls
CDH-5.9.3-1.cdh5.9.3.p0.4-el6.parcel   manifest.json

# mv manifest.json CDH-5.9.3-1.cdh5.9.3.p0.4-el6.parcel.sha

The renamed file carries the same name as your parcel with a .sha suffix appended. Note that Cloudera Manager reads the .sha file as the parcel's SHA-1 hash, so the file that is normally renamed here is the CDH-5.9.3-1.cdh5.9.3.p0.4-el6.parcel.sha1 downloaded in step 2, with manifest.json left under its own name.
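
Cloudera Manager compares the parcel against the hash stored in the .sha file, so it can save a failed distribution to check the pair by hand first. A minimal sketch, assuming the .sha file holds the hex SHA-1 as its first field; `verify_parcel` is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# verify_parcel PARCEL
# Compare the SHA-1 of PARCEL with the first field of PARCEL.sha.
# Returns 0 on a match, 1 on a mismatch, 2 if either file is missing.
verify_parcel() {
    local parcel="$1" expected actual
    [ -f "$parcel" ] && [ -f "$parcel.sha" ] || { echo "missing $parcel or $parcel.sha" >&2; return 2; }
    expected=$(awk 'NR == 1 { print $1 }' "$parcel.sha")
    actual=$(sha1sum "$parcel" | awk '{ print $1 }')
    [ "$expected" = "$actual" ]
}
```

For example: `verify_parcel /opt/cloudera/parcel-repo/CDH-5.9.3-1.cdh5.9.3.p0.4-el6.parcel && echo "hash matches"`.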

13. Starting the services

Make sure MySQL is running first.

Server: hadoop01

# pwd
/opt/cloudera-manager/cm-5.9.3/etc/init.d

# ./cloudera-scm-server start
Starting cloudera-scm-server:

Agents: hadoop01 hadoop02 hadoop03

In /opt/cloudera-manager/cm-5.9.3/etc/init.d:

./cloudera-scm-agent  start
Starting cloudera-scm-agent:                              [  OK  ]

The server has started successfully once its log shows lines like these:

2018-04-28 14:43:37,022 INFO WebServerImpl:org.mortbay.log: jetty-6.1.26.cloudera.4
2018-04-28 14:43:37,024 INFO WebServerImpl:org.mortbay.log: Started SelectChannelConnector@0.0.0.0:7180
2018-04-28 14:43:37,024 INFO WebServerImpl:com.cloudera.server.cmf.WebServerImpl: Started Jetty server.
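
The server can take a while to come up, so rather than re-reading the log by hand you can poll it for the "Started Jetty server" message. This is a sketch only: `cm_server_started` and `wait_for_cm` are hypothetical helper names, and the log path in the usage note is an assumption about where a tarball installation writes its server log.

```shell
#!/usr/bin/env bash
# cm_server_started LOGFILE
# Succeed once LOGFILE contains the Jetty start-up message.
cm_server_started() {
    grep -q 'Started Jetty server' "$1" 2>/dev/null
}

# wait_for_cm LOGFILE [TRIES] [DELAY_SECONDS]
# Poll LOGFILE up to TRIES times, sleeping DELAY_SECONDS between attempts.
# Returns 0 as soon as the message appears, 1 if it never does.
wait_for_cm() {
    local log="$1" tries="${2:-60}" delay="${3:-5}" i=0
    while [ "$i" -lt "$tries" ]; do
        cm_server_started "$log" && return 0
        sleep "$delay"
        i=$((i + 1))
    done
    return 1
}
```

Assuming the log lives under /opt/cloudera-manager/cm-5.9.3/log/cloudera-scm-server/cloudera-scm-server.log, `wait_for_cm` on that file waits up to five minutes with the defaults.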

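14. Web UI access

Open http://hadoop01:7180 in a browser; 7180 is the port reported in the server log above.

Troubleshooting:

Problem 1: MySQL JDBC driver not found

# ./scm_prepare_database.sh mysql cmdb -h hadoop01 -uroot -p123456 --scm-host hadoop01 scm scm scm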

JAVA_HOME=/usr/java/jdk1.7.0_79

Verifying that we can write to /opt/cloudera-manager/cm-5.9.3/etc/cloudera-scm-server

[                          main] DbProvisioner                  ERROR Unable to find the MySQL JDBC driver. Please make sure that you have installed it as per instruction in the installation guide.

[                          main] DbProvisioner                  ERROR Stack Trace:

java.lang.ClassNotFoundException: com.mysql.jdbc.Driver

    at java.net.URLClassLoader$1.run(URLClassLoader.java:366)[:1.7.0_79]

    at java.net.URLClassLoader$1.run(URLClassLoader.java:355)[:1.7.0_79]

    at java.security.AccessController.doPrivileged(Native Method)[:1.7.0_79]

    at java.net.URLClassLoader.findClass(URLClassLoader.java:354)[:1.7.0_79]

Solution: unpack the Connector/J archive and install the jar as /usr/share/java/mysql-connector-java.jar, the path the script puts on its classpath.

# ls mysql-connector-java-5.1.46.zip
mysql-connector-java-5.1.46.zip

# unzip mysql-connector-java-5.1.46.zip

# ls mysql-connector-java-5.1.46/
build.xml                            mysql-connector-java-5.1.46-bin.jar  README.txt
CHANGES                              mysql-connector-java-5.1.46.jar      src/
COPYING                              README

# cp mysql-connector-java-5.1.46/mysql-connector-java-5.1.46.jar  /usr/share/java/

# mv /usr/share/java/mysql-connector-java-5.1.46.jar /usr/share/java/mysql-connector-java.jar

Problem 2: MySQL access denied

jdbc url 'jdbc:mysql://hadoop01/?useUnicode=true&characterEncoding=UTF-8'

java.sql.SQLException: Access denied for user 'root'@'hadoop01' (using password: YES)

    at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:965)[mysql-connector-java.jar:5.1.46]

Solution: check that the password passed with -p is the actual MySQL root password and that root is allowed to connect from hadoop01, for example through the GRANT statement in step 8.