
Storm Installation, Deployment, and Application (1)

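I have been working with the Storm real-time computation framework recently, so here is a summary of what I have learned. What follows is my personal view; if you spot any mistakes, please correct me.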

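What is Storm for? Storm is a stream-oriented, real-time computation framework. What does "stream-oriented" mean? Put simply, it works like an assembly line: each record is handed from one stage to the next, one after another. What does "real-time" mean? Storm can process a single record within milliseconds (note: the actual processing time depends on the business logic and on server performance).

Now for the practical part.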

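1. Installing and deploying Storm (cluster)

a: The first step is to install ZooKeeper. Download the ZooKeeper release archive from the Apache site and upload it to the server.

# tar -zxvf zookeeper.x.xx.tar.gz

Then go into ZooKeeper's conf directory and copy zoo_sample.cfg to zoo.cfg.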

# cd zookeeper.x.xx/conf/
# cp zoo_sample.cfg  zoo.cfg

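Then edit the settings in zoo.cfg (note: configure one machine; the configuration file is identical on the other machines, only the myid file differs).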

# vim zoo.cfg
# The number of milliseconds of each tick
tickTime=2000
# The number of ticks that the initial
# synchronization phase can take
initLimit=10
# The number of ticks that can pass between
# sending a request and getting an acknowledgement
syncLimit=5
# the directory where the snapshot is stored.
# do not use /tmp for storage, /tmp here is just
# example sakes.
#dataDir=/tmp/zookeeper
dataDir=/home/storm/zookeeper/data
# the port at which the clients will connect
clientPort=2181
server.1=xx.xx.xx.01:2888:3888
server.2=xx.xx.xx.02:2888:3888
server.3=xx.xx.xx.03:2888:3888
# the maximum number of client connections.
# increase this if you need to handle more clients
#maxClientCnxns=60
#
# Be sure to read the maintenance section of the
# administrator guide before turning on autopurge.
#
# http://zookeeper.apache.org/doc/current/zookeeperAdmin.html#sc_maintenance
#
# The number of snapshots to retain in dataDir
autopurge.snapRetainCount=3
# Purge task interval in hours
# Set to "0" to disable auto purge feature
autopurge.purgeInterval=1
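Note: the entries shown in red in the original post need to be modified or added (the highlighting is lost here). Compared with the stock zoo_sample.cfg, these are dataDir, the server.N lines, and the two autopurge settings.

dataDir=/home/storm/zookeeper/data: this directory must be created by hand, and the user that runs ZooKeeper needs read and write permission on it.

autopurge.snapRetainCount=3 and autopurge.purgeInterval=1: these two settings make ZooKeeper purge old snapshots and their transaction logs automatically, running the purge task every hour and keeping only the three most recent snapshots.

Create a myid file in the dataDir directory. Its content is the number N from that machine's server.N line; for example, xx.xx.xx.01 is server.1, so its myid contains 1. Then start ZooKeeper: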

# nohup ./bin/zkServer.sh start &
# jps

If a QuorumPeerMain process is listed, ZooKeeper has started successfully.
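
The dataDir and myid steps are easy to get wrong when done by hand on several machines, so each node can instead derive its ID from the shared zoo.cfg. The following is only a minimal sketch: myid_for_host is a hypothetical helper (not part of ZooKeeper), the demo runs against a throwaway copy of the config in a temporary directory, and it assumes the server.N=host:port:port layout used in this article.

```shell
#!/bin/sh
# Sketch: look up a host's ZooKeeper ID in zoo.cfg and write it to myid.
# myid_for_host is a hypothetical helper, not part of ZooKeeper.

# Print the N from the "server.N=HOST:..." line whose HOST matches $2.
myid_for_host() {
    cfg=$1
    host=$2
    awk -F'[=:]' -v h="$host" '
        /^server\.[0-9]+=/ {
            id = $1; sub(/^server\./, "", id)
            gsub(/[ \t]/, "", $2)          # tolerate "server.2= host"
            if ($2 == h) { print id; exit }
        }' "$cfg"
}

# Demo against a throwaway config; the addresses are the article's placeholders.
demo_dir=$(mktemp -d)
cat > "$demo_dir/zoo.cfg" <<'EOF'
dataDir=/home/storm/zookeeper/data
clientPort=2181
server.1=xx.xx.xx.01:2888:3888
server.2= xx.xx.xx.02:2888:3888
server.3=xx.xx.xx.03:2888:3888
EOF

mkdir -p "$demo_dir/data"
myid_for_host "$demo_dir/zoo.cfg" xx.xx.xx.02 > "$demo_dir/data/myid"
cat "$demo_dir/data/myid"
```

On a real node the same function would be pointed at the actual zoo.cfg and the node's own address, with the output redirected to the myid file under dataDir.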

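2. Installing and configuring Storm

The configuration is the same on every server, so configure one machine and copy the result to the others. I use three machines here.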
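
Once one machine has been configured as described in the steps that follow, the copy to the other machines might be sketched like this. The login user storm, the target directory /home/storm/, and the helper name copy_cmds are assumptions made for illustration, and the script only prints the scp commands rather than running them.

```shell
#!/bin/sh
# Sketch: print the commands that would copy a configured Storm directory
# to the other nodes. User, target path, and hosts are placeholders.

storm_dir="apache-storm-x.x.x"          # placeholder name from the article
other_nodes="xx.xx.xx.02 xx.xx.xx.03"   # placeholder addresses

copy_cmds() {
    for node in $other_nodes; do
        # Printed rather than executed; remove the echo to really copy.
        echo scp -r "$storm_dir" "storm@$node:/home/storm/"
    done
}

copy_cmds
```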

a: Download the Storm release archive, upload it to the server, and unpack it.

# tar -zxvf apache-storm-x.x.x.tar.gz

Go into Storm's configuration directory and back up storm.yaml before editing it:
# cd apache-storm-x.x.x/conf/
# cp storm.yaml  storm_bak.yaml
# vim storm.yaml
# Licensed to the Apache Software Foundation (ASF) under one
# or more contributor license agreements.  See the NOTICE file
# distributed with this work for additional information
# regarding copyright ownership.  The ASF licenses this file
# to you under the Apache License, Version 2.0 (the
# "License"); you may not use this file except in compliance
# with the License.  You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

########### These MUST be filled in for a storm configuration
storm.zookeeper.servers:
    - "xx.xx.xx.01"
    - "xx.xx.xx.02"
    - "xx.xx.xx.03"

nimbus.host: "xx.xx.xx.01"
# nimbus.host: "nimbus"
#
storm.local.dir: "/home/storm_data"
supervisor.slots.ports:
    - 6700
    - 6701
    - 6702
    - 6703
#  supervisor.childopts: "-Xmx1024m"
worker.childopts: "-Xmx2048m"
#  topology.state.synchronization.timeout.secs: 60
topology.message.timeout.secs: 150
#  topology.enable.message.timeouts: true
topology.max.spout.pending: 8000
#  topology.ackers: 0
#


# ##### These may optionally be filled in:
#
## List of custom serializations
# topology.kryo.register:
#     - org.mycompany.MyType
#     - org.mycompany.MyType2: org.mycompany.MyType2Serializer
#
## List of custom kryo decorators
# topology.kryo.decorators:
#     - org.mycompany.MyDecorator
#
## Locations of the drpc servers
# drpc.servers:
#     - "server1"
#     - "server2"

## Metrics Consumers
# topology.metrics.consumer.register:
#   - class: "backtype.storm.metric.LoggingMetricsConsumer"
#     parallelism.hint: 1
#   - class: "org.mycompany.MyMetricsConsumer"
#     parallelism.hint: 1
#     argument:
#       - endpoint: "metrics-collector.mycompany.org"

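Notes:

storm.zookeeper.servers: the addresses of the ZooKeeper nodes.

storm.local.dir: "/home/storm_data": this directory must be created by hand.

supervisor.slots.ports: 6700, 6701, 6702, 6703: a fixed list in which each port is one worker slot, so with these four ports a machine runs at most four worker processes, each on its own port.

worker.childopts: "-Xmx2048m": the JVM options for each worker process; here the maximum heap is 2048 MB.

topology.message.timeout.secs: 150: a message that has not been acked within 150 seconds is treated as failed, so that the spout can replay it.

topology.max.spout.pending: 8000: spout throttling. This is the maximum number of messages per spout task that may be pending, that is, emitted but not yet acked or failed.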
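
To see what these values imply together, here is a rough back-of-the-envelope sketch in shell. It assumes that all three machines run a supervisor (the article only says there are three servers) and that a spout task sits at its pending limit; the variable names are purely illustrative.

```shell
#!/bin/sh
# Rough sizing from the storm.yaml values above; illustrative only.
nodes=3                       # assumption: every server runs a supervisor
slot_ports="6700 6701 6702 6703"
worker_heap_mb=2048           # from worker.childopts -Xmx2048m
timeout_secs=150              # topology.message.timeout.secs
max_pending=8000              # topology.max.spout.pending

# One worker slot per configured port.
set -- $slot_ports
slots_per_node=$#

total_slots=$((nodes * slots_per_node))
heap_per_node_mb=$((slots_per_node * worker_heap_mb))

# If a spout task keeps max_pending tuples in flight, they must complete
# at least at this rate (rounded up) or the oldest ones hit the timeout.
min_rate=$(( (max_pending + timeout_secs - 1) / timeout_secs ))

echo "worker slots per node:       $slots_per_node"
echo "worker slots in cluster:     $total_slots"
echo "max worker heap per node:    $heap_per_node_mb MB"
echo "min tuples/s per spout task: $min_rate"
```

Under these assumptions the workers alone can claim up to 8192 MB of heap on one machine, and a saturated spout task needs the topology to complete roughly 54 tuples per second to stay inside the 150-second timeout.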

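Start Storm with the following commands (nimbus and ui only need to run on the nimbus.host machine; supervisor runs on every worker machine):

# nohup storm nimbus > myout_nimbus.file 2>&1 &
# nohup storm supervisor > myout_sup.file 2>&1 &
# nohup storm ui > myout_ui.file 2>&1 &

When the processes for these daemons show up (for example in jps output), the installation has succeeded.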
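
One way to check for the expected processes without reading the list by eye is to filter jps output. This is a hedged sketch: missing_procs is a hypothetical helper, and the daemon names passed to it (nimbus, supervisor, core) are assumptions that differ between Storm versions, so adjust them to what jps actually prints on your install. The sample output is canned so that the script runs without a cluster.

```shell
#!/bin/sh
# Sketch: report which expected daemons are absent from jps output.
# missing_procs is a hypothetical helper; the daemon names passed to it
# are assumptions that vary between Storm versions.

# Reads "PID NAME" lines on stdin; prints each expected NAME not found.
missing_procs() {
    jps_out=$(cat)
    for name in "$@"; do
        printf '%s\n' "$jps_out" |
            awk -v n="$name" 'tolower($2) == tolower(n) { found = 1 }
                              END { exit !found }' ||
            echo "$name"
    done
}

# Canned sample so the sketch runs anywhere; on a real node use:
#   jps | missing_procs QuorumPeerMain nimbus supervisor core
sample='2101 QuorumPeerMain
2245 nimbus
2398 supervisor
2502 Jps'

# The sample has no UI process, so this prints: core
printf '%s\n' "$sample" | missing_procs QuorumPeerMain nimbus supervisor core
```

An empty result means every name that was asked for appeared in the listing.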