
Hadoop Ecosystem - Upgrading the Default Spark Version in CDH5.15.1


                                          Author: 尹正傑

Copyright notice: this is an original work and may not be reproduced. Violators will be held legally liable.

 

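  In my CDH5.11 cluster, the Spark installed by default is version 1.6. The developers complained to me that the previous big data platform (hosted on UCloud, a cloud service) also ran Spark 1.6, that many Java APIs were unavailable there, and that a lot of advanced features cannot be used on 1.6. So we had to upgrade Spark, and they asked for version 2.3.0 or later. After reading the relevant documentation and a number of posts from helpful users, I put together these notes on deploying Spark 2.3.0. You can also refer to the official documentation: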
https://www.cloudera.com/documentation/spark2/latest/topics/spark2_installing.html
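  If you have already deployed Kafka through CDH, upgrading Spark should be a piece of cake, because the two follow essentially the same procedure. If you are on the free edition of CDH, though, I do not recommend integrating Kafka through CDH, because some rather odd pitfalls are waiting for you there.

  I. Download the Spark 2.3 CSD jar

  As with integrating Kafka into CDH, installing this Spark version requires downloading the matching CSD jar first. Download location: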
http://archive.cloudera.com/spark2/csd/
1>. Choose the CSD version
2>. Install the download tool (wget)
[[email protected] ~]# yum -y install wget
Loaded plugins: fastestmirror
Loading mirror speeds from cached hostfile
10gen                                                    | 2.5 kB  00:00:00
base                                                     | 3.6 kB  00:00:00
centosplus                                               | 3.4 kB  00:00:00
epel                                                     | 3.2 kB  00:00:00
extras                                                   | 3.4 kB  00:00:00
mysql-connectors-community                               | 2.5 kB  00:00:00
mysql-tools-community                                    | 2.5 kB  00:00:00
mysql56-community                                        | 2.5 kB  00:00:00
updates                                                  | 3.4 kB  00:00:00
(1/3): epel/x86_64/updateinfo                            | 933 kB  00:00:00
(2/3): epel/x86_64/primary                               | 3.6 MB  00:00:01
(3/3): updates/7/x86_64/primary_db                       | 6.0 MB  00:00:01
epel                                                                12756/12756
Resolving Dependencies
--> Running transaction check
---> Package wget.x86_64 0:1.14-15.el7_4.1 will be installed
--> Finished Dependency Resolution

Dependencies Resolved

================================================================================
 Package        Arch             Version                   Repository      Size
================================================================================
Installing:
 wget           x86_64           1.14-15.el7_4.1           base           547 k

Transaction Summary
================================================================================
Install  1 Package

Total download size: 547 k
Installed size: 2.0 M
Downloading packages:
wget-1.14-15.el7_4.1.x86_64.rpm                          | 547 kB  00:00:00
Running transaction check
Running transaction test
Transaction test succeeded
Running transaction
  Installing : wget-1.14-15.el7_4.1.x86_64                                  1/1
  Verifying  : wget-1.14-15.el7_4.1.x86_64                                  1/1

Installed:
  wget.x86_64 0:1.14-15.el7_4.1

Complete!
[[email protected] ~]#
3>. Download the CSD jar
[[email protected] ~]# mkdir /opt/cloudera/csd && cd /opt/cloudera/csd 
[[email protected] csd]# 
[[email protected] csd]# wget http://archive.cloudera.com/spark2/csd/SPARK2_ON_YARN-2.3.0.cloudera4.jar
--2018-10-31 00:17:57--  http://archive.cloudera.com/spark2/csd/SPARK2_ON_YARN-2.3.0.cloudera4.jar
Connecting to 10.9.137.250:3888... connected.
Proxy request sent, awaiting response... 200 OK
Length: 19037 (19K) [application/java-archive]
Saving to: ‘SPARK2_ON_YARN-2.3.0.cloudera4.jar’

100%[=========================================================================================================================================================================>] 19,037      --.-K/s   in 0.002s  

2018-10-31 00:17:57 (10.4 MB/s) - ‘SPARK2_ON_YARN-2.3.0.cloudera4.jar’ saved [19037/19037]

[[email protected] csd]# 
[[email protected] csd]# ll
total 20
-rw-r--r--. 1 root root 19037 Oct  5 05:45 SPARK2_ON_YARN-2.3.0.cloudera4.jar
[[email protected] csd]# 
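If you expect to repeat this download for another CSD release, the step above can be wrapped in a small script. The following is only a sketch: the base URL and directory are the ones used in this post, while the function names (csd_jar_name, csd_jar_url, fetch_csd) are hypothetical helpers of my own, and the file name pattern is assumed from the single jar downloaded above.

```shell
#!/usr/bin/env bash
# Sketch: build the CSD jar URL from a version string and download it only if missing.
# CSD_BASE_URL and CSD_DIR are the values used in this post; the function names are hypothetical.

CSD_BASE_URL="http://archive.cloudera.com/spark2/csd"
CSD_DIR="/opt/cloudera/csd"

# Print the jar file name for a given Spark 2 CSD version, e.g. 2.3.0.cloudera4
# (pattern assumed from the jar downloaded in this post).
csd_jar_name() {
    printf 'SPARK2_ON_YARN-%s.jar\n' "$1"
}

# Print the full download URL for a given version.
csd_jar_url() {
    printf '%s/%s\n' "$CSD_BASE_URL" "$(csd_jar_name "$1")"
}

# Download the jar into CSD_DIR unless a file with that name is already there.
fetch_csd() {
    local version="$1" jar
    jar="$(csd_jar_name "$version")"
    mkdir -p "$CSD_DIR"
    if [ -f "$CSD_DIR/$jar" ]; then
        echo "already present: $CSD_DIR/$jar"
    else
        wget -P "$CSD_DIR" "$(csd_jar_url "$version")"
    fi
}

# Usage (as root on the Cloudera Manager host):
#   fetch_csd 2.3.0.cloudera4
```

Using mkdir -p here means the script does not fail when the directory already exists, which the plain mkdir in the transcript would.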
4>. Change the ownership so that the jar belongs to the cloudera-scm user
[[email protected] csd]# ll
total 20
-rw-r--r--. 1 root root 19037 Oct  5 05:45 SPARK2_ON_YARN-2.3.0.cloudera4.jar
[[email protected] csd]# 
[[email protected] csd]# 
[[email protected] csd]# id cloudera-scm
uid=997(cloudera-scm) gid=995(cloudera-scm) groups=995(cloudera-scm)
[[email protected] csd]# 
[[email protected] csd]# 
[[email protected] csd]# chown cloudera-scm:cloudera-scm SPARK2_ON_YARN-2.3.0.cloudera4.jar 
[[email protected] csd]# 
[[email protected] csd]# ll
total 20
-rw-r--r--. 1 cloudera-scm cloudera-scm 19037 Oct  5 05:45 SPARK2_ON_YARN-2.3.0.cloudera4.jar
[[email protected] csd]# 
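The ownership check and fix above can also be scripted so that it covers every jar in the directory. This is a minimal sketch, assuming GNU coreutils (stat -c) as on the CentOS 7 host used here; the function names (file_owner, needs_chown, fix_csd_ownership) are hypothetical and not part of any Cloudera tooling.

```shell
#!/usr/bin/env bash
# Sketch: chown every CSD jar that is not yet owned by cloudera-scm.
# Function names are hypothetical; stat -c is the GNU coreutils form.

CSD_DIR="/opt/cloudera/csd"
CSD_OWNER="cloudera-scm:cloudera-scm"

# Print "user:group" for a file.
file_owner() {
    stat -c '%U:%G' "$1"
}

# Return 0 when the file's owner differs from the expected "user:group".
needs_chown() {
    [ "$(file_owner "$1")" != "$2" ]
}

# Fix the ownership of all jars in CSD_DIR; this must run as root.
fix_csd_ownership() {
    local jar
    for jar in "$CSD_DIR"/*.jar; do
        [ -e "$jar" ] || continue   # the glob matched nothing
        if needs_chown "$jar" "$CSD_OWNER"; then
            chown "$CSD_OWNER" "$jar"
            echo "fixed: $jar"
        fi
    done
}

# Usage (as root):
#   fix_csd_ownership
```

Checking before calling chown keeps the output quiet on repeated runs, so the script only reports jars whose ownership it actually changed.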

 

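  II. Download the Spark 2.3 parcel

  As with integrating Kafka into CDH, installing this Spark version also requires downloading the matching parcel. Download location:
http://archive.cloudera.com/spark2/parcels/
1>. Choose the Spark version. It must match the CSD version chosen above and, of course, your operating system version.
2>.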