Ubuntu 18.04 Flink 1.9.0 Standalone Cluster Setup
Cluster Planning
|         | Master JobManager | Standby JobManager | Task Manager | Zookeeper |
| ------- | ----------------- | ------------------ | ------------ | --------- |
| flink01 | √                 |                    |              | √         |
| flink02 |                   | √                  |              | √         |
| flink03 |                   |                    | √            | √         |
| flink04 |                   |                    | √            | √         |

Preliminary Preparation
Clone four virtual machines
Network configuration
vim /etc/netplan/01-network-manager-all.yaml to edit the configuration file
The cluster network configuration is as follows:
```yaml
# flink01
network:
  version: 2
  renderer: NetworkManager
  ethernets:
    ens33:
      dhcp4: no
      dhcp6: no
      addresses: [192.168.180.160/24]
      gateway4: 192.168.180.2
      nameservers:
        addresses: [114.114.114.114, 8.8.8.8]

# flink02
network:
  version: 2
  renderer: NetworkManager
  ethernets:
    ens33:
      dhcp4: no
      dhcp6: no
      addresses: [192.168.180.161/24]
      gateway4: 192.168.180.2
      nameservers:
        addresses: [114.114.114.114, 8.8.8.8]

# flink03
network:
  version: 2
  renderer: NetworkManager
  ethernets:
    ens33:
      dhcp4: no
      dhcp6: no
      addresses: [192.168.180.162/24]
      gateway4: 192.168.180.2
      nameservers:
        addresses: [114.114.114.114, 8.8.8.8]

# flink04
network:
  version: 2
  renderer: NetworkManager
  ethernets:
    ens33:
      dhcp4: no
      dhcp6: no
      addresses: [192.168.180.163/24]
      gateway4: 192.168.180.2
      nameservers:
        addresses: [114.114.114.114, 8.8.8.8]
```
It is enough that each machine can ping baidu and accept an Xshell connection. If a machine cannot be reached, it is probably a firewall issue; Ubuntu uses ufw as its firewall and SELinux is disabled by default, so normally there should be no problem connecting.
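To apply the new netplan configuration without rebooting and to confirm connectivity, something like the following should work on each host (a quick sketch):

```shell
# apply the netplan configuration written above
netplan apply
# verify outbound connectivity
ping -c 3 www.baidu.com
```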
Modify the hostname and hosts files
vim /etc/hostname and set the hostname to flink01 (and flink02/flink03/flink04 on the other machines)
vim /etc/hosts to edit the hosts file
```
127.0.0.1       localhost
192.168.180.160 flink01
192.168.180.161 flink02
192.168.180.162 flink03
192.168.180.163 flink04

# The following lines are desirable for IPv6 capable hosts
::1     ip6-localhost ip6-loopback
fe00::0 ip6-localnet
ff00::0 ip6-mcastprefix
ff02::1 ip6-allnodes
ff02::2 ip6-allrouters
```
- JDK configuration
- If the JDK has not been configured yet, see https://www.cnblogs.com/ronnieyuan/p/11461377.html
- If a different JDK version was installed before, extract the JDK tarball to /usr/lib/jvm, then update the configuration file and select that JDK (a sketch follows this list).
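As a rough sketch of the second case (the archive name below is a placeholder for whatever JDK 8 build was downloaded; /usr/lib/jvm/jdk1.8 matches the JAVA_HOME used in the Hadoop configuration later on):

```shell
mkdir -p /usr/lib/jvm
tar -zxvf jdk-8u*-linux-x64.tar.gz -C /usr/lib/jvm/
mv /usr/lib/jvm/jdk1.8.0_* /usr/lib/jvm/jdk1.8   # rename so JAVA_HOME=/usr/lib/jvm/jdk1.8 works
# then add to ~/.bashrc and run: source ~/.bashrc
# export JAVA_HOME=/usr/lib/jvm/jdk1.8
# export PATH=$JAVA_HOME/bin:$PATH
```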
Passwordless SSH login
```shell
root@flink01:~# ssh-keygen -t rsa -P ""
Generating public/private rsa key pair.
Enter file in which to save the key (/root/.ssh/id_rsa):
Created directory '/root/.ssh'.
Your identification has been saved in /root/.ssh/id_rsa.
Your public key has been saved in /root/.ssh/id_rsa.pub.
The key fingerprint is:
SHA256:2P9NCcuEljhcWBndG8FBoAxOeGbZDIHIdw06r3bbsug root@flink01
The key's randomart image is:
+---[RSA 2048]----+
|  . . o*Oo+.=+o  |
|   o ++*==.. +   |
|    .o=o + o     |
|     * o o .     |
|    . S + o      |
|   . + o o .     |
|  o . . o o      |
|   . o.o . o     |
|  .E oo. . .     |
+----[SHA256]-----+
```
In the home directory (~), run vim .ssh/authorized_keys:
Put the public keys of all four virtual machines into this file, so that authorized_keys is identical on every machine:
```
ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQDZGR9+3bTIq6kZ5c+1O2mvyPBM790/0QUuVcjPSREK1pMpyyKRu6CUaBlTOM8JJzkCnEe7DVKY8lV2q7Zv7VVXTLI2dKuUmo0oMSGo5ABvfjGUJfFWdNkqgrPgT+Opl/1kIqr6wTGGCJsRx+Nfkic31WhnPij2IM/Zpqu88kiXmUbaNwBDM5jaRqB/nk7DW4aMwF5oeX6LvEuI1SVmY3DH0w6Cf1EDtOyYG1f9Vof8ao88JOwRZazTNbBdsVcPKDbtvovs/lP+CrtwmAcGFGZSjA22I0dc9ek0puQ5pNUgindpb9egJJFoGhtduut6OfmmvbB8u9PjdBkHcp3Vof4j root@flink01
ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQDMgNUIzT1YRTGX5E4MiWyQ9UxfQ1pRwCSMy3wqJOuyqv/kW5VsL0BS1VAwtSS9FqCBon3lT0zfgPEkscDdKugMtpgSeREjbQQJSDYPQGyCsHcXQgne5dfaWbvLvaFbNO1G5wqW6E3zC5zy85mSdg9qg8NmqYwyz0O8WSRottuAMHfhoNkHemdrIHhVBBZeKxFbsiXxiWi1EQJUYhYWU1LAqcpiIgFhjAGJkH91qULmm8FxkbB3ytcMlkFBNbcr3UkN5EK0VGV/Ly1qgs9UmXtS7xb5Lw5RlqjUMvwc28oHZYSfPRS1TL/oaFBZSIIGPjudke5wsBwNnzHGE44sTWUt root@flink02
ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQDRYkED/hA9nqmCBClIByAIcrT94edjno1HXn/KIgrCXy0aoIfsgY5FDr/3iZgaEdvnDz1woyKyj5gqkYxdQw0vLiJGOefgSC4yx7j6vKaK+2aSxJD+DLFB6toFAlyfWziTnyoPj3hPhlpSdG1P4YSEpMub5p1gVsEX9+vRvyazIabsbY9YsCsv/WDs3aAYyalfuvN+AXTrXI8di+AGgsjbwmo0/VQWQfItHY+WdTGcfBQQ8Ad9UX2wVRQPXq7gHh3LlBWKZnB+nZxavnI/G9Lo9w9MODSBgxfCfKNu6fpnpcvh8u+AqmT6rNKZg5NuZWW16siGvLNReSnWn7KwHzzD root@flink03
ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQDkBQxFdKvg2MzGl1eFeDO1iqL2+fdJ9b4ktU9fjO/APEkumbftfrDTgJAnQhcXMsU7RjCJ+Zwt/dpeGhql3o7hXuhjQiUXf+8GXYre8cw9+xaYmaVhCZYbpcFiSrd/8TK3qv+gf21YIT7iQEFEl384qxPbyoU4Lm+M/1aY3m4gHvELv3jfg8oVBypxgaAJpJj9ZnnA0zN470cod1E67yVbfIkoSTy8BXd8UhVedYODD1ddaFX8MF53acUdJgLCGzh4axxivlGqdXWB7lhjbBvjllt2LYb7hW+O1qyCFNiRx/+zR9dSu5ZSW3QNg2EP/ljcCjDWa7AHgTvw6BYkdpj1 root@flink04
```
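One possible way to collect and distribute the keys from flink01, as a sketch (it assumes ssh-keygen has already been run on every host; password prompts will appear until the keys are in place):

```shell
# append every host's public key to flink01's authorized_keys
for host in flink01 flink02 flink03 flink04; do
    ssh root@$host cat /root/.ssh/id_rsa.pub >> /root/.ssh/authorized_keys
done
# push the combined file to the other hosts
for host in flink02 flink03 flink04; do
    scp /root/.ssh/authorized_keys root@$host:/root/.ssh/
done
```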
Test passwordless login to each host in turn.
A successful login looks like this:
```shell
root@flink01:~# ssh [email protected]
Welcome to Ubuntu 18.04.2 LTS (GNU/Linux 4.18.0-17-generic x86_64)

 * Documentation:  https://help.ubuntu.com
 * Management:     https://landscape.canonical.com
 * Support:        https://ubuntu.com/advantage

 * Canonical Livepatch is available for installation.
   - Reduce system reboots and improve kernel security. Activate at:
     https://ubuntu.com/livepatch

273 packages can be updated.
272 updates are security updates.

Your Hardware Enablement Stack (HWE) is supported until April 2023.
Last login: Sun Oct 27 11:11:26 2019 from 192.168.180.160
root@flink02:~#
```
Cluster Setup
Zookeeper Setup
Upload the tar package, extract it to the target directory, and rename the extracted directory.
```shell
tar -zxvf apache-zookeeper-3.5.5-bin.tar.gz -C /opt/ronnie/
cd /opt/ronnie
mv apache-zookeeper-3.5.5-bin/ zookeeper-3.5.5
```
Create and edit the Zookeeper configuration file
cd zookeeper-3.5.5/conf/ to enter the Zookeeper configuration directory
cp zoo_sample.cfg zoo.cfg to copy zoo_sample.cfg to zoo.cfg
vim zoo.cfg to edit the configuration file
```
# The number of milliseconds of each tick
tickTime=2000
# The number of ticks that the initial
# synchronization phase can take
initLimit=10
# The number of ticks that can pass between
# sending a request and getting an acknowledgement
syncLimit=5
# the directory where the snapshot is stored.
# do not use /tmp for storage, /tmp here is just
# example sakes.
dataDir=/opt/ronnie/zookeeper-3.5.5/data/zk
dataLogDir=/opt/ronnie/zookeeper-3.5.5/data/log
# the port at which the clients will connect
clientPort=2181
# the maximum number of client connections.
# increase this if you need to handle more clients
#maxClientCnxns=60
#
# Be sure to read the maintenance section of the
# administrator guide before turning on autopurge.
#
# http://zookeeper.apache.org/doc/current/zookeeperAdmin.html#sc_maintenance
#
# The number of snapshots to retain in dataDir
#autopurge.snapRetainCount=3
# Purge task interval in hours
# Set to "0" to disable auto purge feature
#autopurge.purgeInterval=1
server.1=flink01:2888:3888
server.2=flink02:2888:3888
server.3=flink03:2888:3888
server.4=flink04:2888:3888
```
Create the data directory:
mkdir -p /opt/ronnie/zookeeper-3.5.5/data
Create the log directory under data:
mkdir -p /opt/ronnie/zookeeper-3.5.5/data/log
Create the zk directory under data:
mkdir -p /opt/ronnie/zookeeper-3.5.5/data/zk
cd zk/ and create the myid file; the four virtual machines flink01, flink02, flink03, flink04 use myid 1, 2, 3, 4 respectively.
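For example, on flink01:

```shell
echo 1 > /opt/ronnie/zookeeper-3.5.5/data/zk/myid
```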
Add environment variables
vim ~/.bashrc, then run source ~/.bashrc when done
```shell
# Zookeeper
export ZOOKEEPER_HOME=/opt/ronnie/zookeeper-3.5.5
export PATH=$ZOOKEEPER_HOME/bin:$PATH
```
Send the Zookeeper directory to the other virtual machines (remember to change myid on each of them).
```shell
cd /opt/ronnie
scp -r zookeeper-3.5.5/ [email protected]:/opt/ronnie/
scp -r zookeeper-3.5.5/ [email protected]:/opt/ronnie/
scp -r zookeeper-3.5.5/ [email protected]:/opt/ronnie/
```
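A quick way to fix the copied myid values from flink01, as a sketch:

```shell
ssh root@flink02 "echo 2 > /opt/ronnie/zookeeper-3.5.5/data/zk/myid"
ssh root@flink03 "echo 3 > /opt/ronnie/zookeeper-3.5.5/data/zk/myid"
ssh root@flink04 "echo 4 > /opt/ronnie/zookeeper-3.5.5/data/zk/myid"
```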
Start Zookeeper: zkServer.sh start
Check the Zookeeper status: zkServer.sh status
If the node's mode is displayed, the startup succeeded.
```shell
root@flink01:~# zkServer.sh status
ZooKeeper JMX enabled by default
Using config: /opt/ronnie/zookeeper-3.5.5/bin/../conf/zoo.cfg
Client port found: 2181. Client address: localhost.
Mode: follower
```
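zkServer.sh status only reports on the local node; to check all four hosts at once, a loop like this sketch can be used (it assumes the install path above and that java is resolvable in a non-interactive SSH session):

```shell
for host in flink01 flink02 flink03 flink04; do
    echo "== $host =="
    ssh root@$host /opt/ronnie/zookeeper-3.5.5/bin/zkServer.sh status
done
```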
If it reports an error, check the logs to locate the problem.
Hadoop Cluster Setup (mainly for HDFS; Flink on YARN will be covered later)
Cluster role assignment
|         | NN-1 | NN-2 | DN | ZK | ZKFC | JN |
| ------- | ---- | ---- | -- | -- | ---- | -- |
| flink01 | *    |      | *  | *  | *    | *  |
| flink02 |      | *    | *  | *  | *    | *  |
| flink03 |      |      | *  | *  |      | *  |
| flink04 |      |      | *  | *  |      | *  |
Upload the tar package and extract it to the target path:
tar -zxvf hadoop-3.1.2.tar.gz -C /opt/ronnie
vim /opt/ronnie/hadoop-3.1.2/etc/hadoop/hadoop-env.sh and replace the JDK path in the environment file with the local JDK path
export JAVA_HOME=/usr/lib/jvm/jdk1.8
vim /opt/ronnie/hadoop-3.1.2/etc/hadoop/mapred-env.sh and add:
export JAVA_HOME=/usr/lib/jvm/jdk1.8
vim /opt/ronnie/hadoop-3.1.2/etc/hadoop/yarn-env.sh and add:
export JAVA_HOME=/usr/lib/jvm/jdk1.8
vim /opt/ronnie/hadoop-3.1.2/etc/hadoop/core-site.xml to edit the core-site file
```xml
<configuration>
    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://ronnie</value>
    </property>
    <property>
        <name>ha.zookeeper.quorum</name>
        <value>flink01:2181,flink02:2181,flink03:2181,flink04:2181</value>
    </property>
    <property>
        <name>hadoop.tmp.dir</name>
        <value>/var/ronnie/hadoop/ha</value>
    </property>
</configuration>
```
vim /opt/ronnie/hadoop-3.1.2/etc/hadoop/hdfs-site.xml
```xml
<configuration>
    <property>
        <name>dfs.nameservices</name>
        <value>ronnie</value>
    </property>
    <property>
        <name>dfs.ha.namenodes.ronnie</name>
        <value>nn1,nn2</value>
    </property>
    <property>
        <name>dfs.namenode.rpc-address.ronnie.nn1</name>
        <value>flink01:8020</value>
    </property>
    <property>
        <name>dfs.namenode.rpc-address.ronnie.nn2</name>
        <value>flink02:8020</value>
    </property>
    <property>
        <name>dfs.namenode.http-address.ronnie.nn1</name>
        <value>flink01:50070</value>
    </property>
    <property>
        <name>dfs.namenode.http-address.ronnie.nn2</name>
        <value>flink02:50070</value>
    </property>
    <property>
        <name>dfs.namenode.shared.edits.dir</name>
        <value>qjournal://flink01:8485;flink02:8485;flink03:8485;flink04:8485/ronnie</value>
    </property>
    <property>
        <name>dfs.journalnode.edits.dir</name>
        <value>/var/ronnie/hadoop/ha/jn</value>
    </property>
    <property>
        <name>dfs.client.failover.proxy.provider.ronnie</name>
        <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
    </property>
    <property>
        <name>dfs.ha.fencing.methods</name>
        <value>sshfence</value>
    </property>
    <property>
        <!-- must match the private key generated earlier with ssh-keygen -t rsa -->
        <name>dfs.ha.fencing.ssh.private-key-files</name>
        <value>/root/.ssh/id_rsa</value>
    </property>
    <property>
        <name>dfs.ha.automatic-failover.enabled</name>
        <value>true</value>
    </property>
</configuration>
```
vim /opt/ronnie/hadoop-3.1.2/etc/hadoop/workers to edit the workers file
```
flink01
flink02
flink03
flink04
```
vim /opt/ronnie/hadoop-3.1.2/sbin/start-dfs.sh
vim /opt/ronnie/hadoop-3.1.2/sbin/stop-dfs.sh
At the top of the file, right after the header comments, add:
```shell
HDFS_DATANODE_USER=root
HADOOP_SECURE_DN_USER=hdfs
HDFS_NAMENODE_USER=root
HDFS_SECONDARYNAMENODE_USER=root
HDFS_JOURNALNODE_USER=root
HDFS_ZKFC_USER=root
```
vim /opt/ronnie/hadoop-3.1.2/sbin/start-yarn.sh
vim /opt/ronnie/hadoop-3.1.2/sbin/stop-yarn.sh
At the top of the file, right after the header comments, add:
```shell
YARN_RESOURCEMANAGER_USER=root
HADOOP_SECURE_DN_USER=yarn
YARN_NODEMANAGER_USER=root
```
Send the files to the other three virtual machines:
```shell
scp -r /opt/ronnie/hadoop-3.1.2/ root@flink02:/opt/ronnie/
scp -r /opt/ronnie/hadoop-3.1.2/ root@flink03:/opt/ronnie/
scp -r /opt/ronnie/hadoop-3.1.2/ root@flink04:/opt/ronnie/
```
vim ~/.bashrc to add the Hadoop paths
```shell
#HADOOP VARIABLES
export HADOOP_HOME=/opt/ronnie/hadoop-3.1.2
export PATH=$PATH:$HADOOP_HOME/bin
export PATH=$PATH:$HADOOP_HOME/sbin
export HADOOP_MAPRED_HOME=$HADOOP_HOME
export HADOOP_COMMON_HOME=$HADOOP_HOME
export HADOOP_HDFS_HOME=$HADOOP_HOME
export YARN_HOME=$HADOOP_HOME
export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_HOME/lib/native
export HADOOP_OPTS="-Djava.library.path=$HADOOP_HOME/lib"
```
source ~/.bashrc to apply the new configuration (remember to do this on every machine).
Run hadoop version to check the version; output like the following means the Hadoop path is configured correctly:
```shell
root@flink01:~# hadoop version
Hadoop 3.1.2
Source code repository https://github.com/apache/hadoop.git -r 1019dde65bcf12e05ef48ac71e84550d589e5d9a
Compiled by sunilg on 2019-01-29T01:39Z
Compiled with protoc 2.5.0
From source with checksum 64b8bdd4ca6e77cce75a93eb09ab2a9
This command was run using /opt/ronnie/hadoop-3.1.2/share/hadoop/common/hadoop-common-3.1.2.jar
```
Start the cluster:
Start Zookeeper
zkServer.sh start
Start the JournalNode on flink01, flink02, flink03, and flink04
hadoop-daemon.sh start journalnode
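Running the command on each host by hand works; as a convenience, a loop like this sketch can start all of them from flink01 (hadoop-env.sh already sets JAVA_HOME, so a non-interactive shell is fine):

```shell
for host in flink01 flink02 flink03 flink04; do
    ssh root@$host /opt/ronnie/hadoop-3.1.2/sbin/hadoop-daemon.sh start journalnode
done
```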
Check the processes with jps
```shell
root@flink01:~# jps
2842 QuorumPeerMain
3068 Jps
3004 JournalNode
```
Pick one of the two NameNodes and format it (flink01 is used here)
- hdfs namenode -format
- hdfs --daemon start namenode to start the NameNode
On the other NameNode, sync over the metadata from the freshly formatted one
- hdfs namenode -bootstrapStandby
Format the ZKFC (initialize the HA state in Zookeeper)
- hdfs zkfc -formatZK
Start the HDFS cluster
- start-dfs.sh
Open the NameNode web UI on port 50070 (flink01:50070):
- HDFS is now set up successfully
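An optional sanity check, as a sketch (nn1 and nn2 are the NameNode ids defined in hdfs-site.xml above):

```shell
# one NameNode should report "active", the other "standby"
hdfs haadmin -getServiceState nn1
hdfs haadmin -getServiceState nn2
```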
Flink Cluster Setup
Upload the tar package and extract it to the target directory
A problem came up this time: uploading with rz -E corrupted the file. The fix was to upload with rz -be instead.
Extract to the target directory
tar zxvf flink-1.9.0-bin-scala_2.12.tgz -C /opt/ronnie/
vim /opt/ronnie/flink-1.9.0/conf/flink-conf.yaml to edit the Flink configuration file:
```yaml
# The RPC address of the JobManager
jobmanager.rpc.address: flink01

# The RPC port where the JobManager is reachable.
jobmanager.rpc.port: 6123

# The heap size for the JobManager JVM
jobmanager.heap.size: 1024m

# The heap size for the TaskManager JVM
taskmanager.heap.size: 1024m

# The number of task slots that each TaskManager offers. Each slot runs one parallel pipeline.
taskmanager.numberOfTaskSlots: 1

# The parallelism used for programs that did not specify any other parallelism.
parallelism.default: 1

# High-availability configuration
high-availability: zookeeper
high-availability.zookeeper.quorum: flink01:2181,flink02:2181,flink03:2181,flink04:2181
high-availability.zookeeper.path.root: /flink
high-availability.cluster-id: /default_one
# This must use the NameNode's port 8020
high-availability.zookeeper.storageDir: hdfs://flink01:8020/flink/ha
```
vim /opt/ronnie/flink-1.9.0/conf/masters
```
flink01:8081
flink02:8081
```
vim /opt/ronnie/flink-1.9.0/conf/slaves
```
flink03
flink04
```
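Since start-cluster.sh starts the JobManagers and TaskManagers on the other hosts over SSH, the flink-1.9.0 directory presumably has to exist on every node as well; one way to distribute it (a sketch):

```shell
scp -r /opt/ronnie/flink-1.9.0/ root@flink02:/opt/ronnie/
scp -r /opt/ronnie/flink-1.9.0/ root@flink03:/opt/ronnie/
scp -r /opt/ronnie/flink-1.9.0/ root@flink04:/opt/ronnie/
```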
After running start-cluster.sh, jps showed no JobManager process; the Flink logs contained this error:
UnsupportedFileSystemSchemeException: Hadoop is not in the classpath/dependencies.
You need to download the Pre-bundled Hadoop jar from the official Flink download page.
Upload the jar to the lib directory under the Flink directory
cd /opt/ronnie/flink-1.9.0/lib
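If uploading with rz is inconvenient, the jar can presumably also be fetched directly from Maven Central; the URL below is an assumption derived from the coordinates org.apache.flink:flink-shaded-hadoop-2-uber:2.8.3-7.0, so verify the link on the Flink downloads page:

```shell
# assumed download location; double-check against the official downloads page
wget https://repo.maven.apache.org/maven2/org/apache/flink/flink-shaded-hadoop-2-uber/2.8.3-7.0/flink-shaded-hadoop-2-uber-2.8.3-7.0.jar
```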
Send the jar to the other three virtual machines
```shell
scp flink-shaded-hadoop-2-uber-2.8.3-7.0.jar root@flink02:`pwd`
scp flink-shaded-hadoop-2-uber-2.8.3-7.0.jar root@flink03:`pwd`
scp flink-shaded-hadoop-2-uber-2.8.3-7.0.jar root@flink04:`pwd`
```
Run start-cluster.sh to start the Flink cluster. Once jps shows the StandaloneSessionClusterEntrypoint process, open the web UI on port 8081.
Run the bundled WordCount example as a test
Run it from the Flink directory:

```shell
cd /opt/ronnie/flink-1.9.0
flink run examples/streaming/WordCount.jar
```

The result:
```
Starting execution of program
Executing WordCount example with default input data set.
Use --input to specify file input.
Printing result to stdout. Use --output to specify output path.
Program execution finished
Job with JobID 2262c186bb5072917e79e2d081e9fab3 has finished.
Job Runtime: 1311 ms
```
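As the output notes, --input and --output can point the example at real files; the HDFS paths below are only placeholders:

```shell
# hypothetical input/output paths on the HDFS cluster set up above
flink run examples/streaming/WordCount.jar \
    --input hdfs://flink01:8020/flink/wordcount/input.txt \
    --output hdfs://flink01:8020/flink/wordcount/result
```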
The web UI: