
Flume Installation, Configuration, and Debugging

Installation package: apache-flume-1.6.0-bin.tar.gz

1. Linux virtual machine running CentOS 7.0; server CPU: dual-core i5 or better; memory: 2 GB or more

2. JDK 1.7.0 or later, Hadoop 2.7.1

3. Host name / IP address / installed software:

Master1  192.168.114.38
Slave1   192.168.114.39
Slave2   192.168.114.40

1. Unpack the software:

Place the package in the /data directory and extract it to /soft; also copy the cluster's Hadoop installation files into /soft.
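For example (assuming the tarball sits in /data):

mkdir -p /soft
tar -zxvf /data/apache-flume-1.6.0-bin.tar.gz -C /soft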

2. Add environment variables:

On all three machines, run vim /etc/profile and append the following variables:

export HADOOP_HOME=/soft/hadoop-2.7.1
export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin
export FLUME_HOME=/soft/apache-flume-1.6.0-bin/
export PATH=$PATH:$FLUME_HOME/bin

Save and exit, then make the changes take effect:
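source /etc/profile
flume-ng version     # optional: confirms the new PATH entries work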

3. Copy Flume to the other nodes:

cd /soft/
scp -r apache-flume-1.6.0-bin root@192.168.114.39:/soft
scp -r apache-flume-1.6.0-bin root@192.168.114.40:/soft

(root is assumed here; use whatever account owns /soft on the slave nodes)

4. Configure the agent startup files:

On the Master1 node, rename flume-conf.properties.template in the Flume conf directory to agent0.conf and open it:
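For example:

cd /soft/apache-flume-1.6.0-bin/conf
mv flume-conf.properties.template agent0.conf
vim agent0.conf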

Edit it to contain the following:

agent0.sources = source1
agent0.channels = memoryChannel
agent0.sinks = sink1

agent0.sources.source1.type = avro
agent0.sources.source1.bind = 192.168.114.38
agent0.sources.source1.port = 23004
agent0.sources.source1.channels = memoryChannel
agent0.sources.source1.interceptors = i1
agent0.sources.source1.interceptors.i1.type = timestamp

agent0.channels.memoryChannel.type = memory
agent0.channels.memoryChannel.capacity = 2000
agent0.channels.memoryChannel.keep-alive = 100

agent0.sinks.sink1.type = hdfs
agent0.sinks.sink1.channel = memoryChannel
agent0.sinks.sink1.hdfs.path = hdfs://192.168.114.20:8020/input/%y-%m-%d
agent0.sinks.sink1.hdfs.filePrefix = events-
agent0.sinks.sink1.hdfs.fileType = DataStream
agent0.sinks.sink1.hdfs.writeFormat = TEXT
agent0.sinks.sink1.hdfs.rollInterval = 1
agent0.sinks.sink1.hdfs.rollCount = 100
agent0.sinks.sink1.hdfs.rollSize = 400000


(Note: if rollCount is left unset, it defaults to 10 events, so an upload with more than 10 lines of content gets split into multiple files; increase it as appropriate. Also check that the source1 port is free before using it: netstat -tunlp | grep 23004)

On the Slave1 node, rename flume-conf.properties.template in the Flume conf directory to agent1.conf and open it:

vim agent1.conf

Edit it to contain the following:

agent1.sources = source1
agent1.channels = Channel1
agent1.sinks = sink1

# exec source tailing a log file (the alternative spooldir source is left commented out)
#agent1.sources.source1.type = spooldir
#agent1.sources.source1.spoolDir = /usr/local/flumelog
agent1.sources.source1.type = exec
agent1.sources.source1.command = tail -F /usr/local/flumelog/flume_test1.txt
agent1.sources.source1.channels = Channel1

agent1.channels.Channel1.type = file
agent1.channels.Channel1.checkpointDir = /usr/local/tmp/checkpoint
agent1.channels.Channel1.dataDirs = /usr/local/tmp/datadir

agent1.sinks.sink1.type = avro
agent1.sinks.sink1.hostname = 192.168.114.38
agent1.sinks.sink1.port = 23004
agent1.sinks.sink1.channel = Channel1
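Before starting this agent, make sure the paths it references exist (Flume creates the file-channel directories itself, but the tailed file should exist so tail -F picks up appends right away):

mkdir -p /usr/local/flumelog
touch /usr/local/flumelog/flume_test1.txt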


On the Slave2 node, rename flume-conf.properties.template in the Flume conf directory to agent2.conf and open it:

vim agent2.conf

Edit it to contain the following:

agent2.sources = source1
agent2.channels = Channel1
agent2.sinks = sink1

agent2.sources.source1.type = spooldir
agent2.sources.source1.spoolDir = /usr/local/flumelog
agent2.sources.source1.channels = Channel1

agent2.channels.Channel1.type = file
agent2.channels.Channel1.checkpointDir = /usr/local/tmp/checkpoint
agent2.channels.Channel1.dataDirs = /usr/local/tmp/datadir

agent2.sinks.sink1.type = avro
agent2.sinks.sink1.hostname = 192.168.114.38
agent2.sinks.sink1.port = 23004
agent2.sinks.sink1.channel = Channel1
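The spooling-directory source will not start if its directory is missing, so create it on Slave2 first:

mkdir -p /usr/local/flumelog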


5. Start Flume

Go to the Flume directory on each machine to start the agents. Start the one on Master1 first, otherwise the slave agents' avro sinks have nothing to connect to and will throw exceptions; also make sure Hadoop is already running.
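A quick way to confirm the HDFS daemons are up is jps on the Hadoop machines (process names vary with your deployment):

jps
# expect NameNode / DataNode among the listed processes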

Start each agent from /soft/apache-flume-1.6.0-bin/:

master1: flume-ng agent --conf ./conf/ -f ./conf/agent0.conf -n agent0 -Dflume.root.logger=INFO,console
slave1:  flume-ng agent --conf ./conf/ -f ./conf/agent1.conf -n agent1 -Dflume.root.logger=INFO,console
slave2:  flume-ng agent --conf ./conf/ -f ./conf/agent2.conf -n agent2 -Dflume.root.logger=INFO,console

(Note: the -n agent0 argument must match the agent name used inside agent0.conf, and likewise on the other nodes.)

Once all three agents are running, append lines to the tailed file on Slave1 or drop new files into the configured spooldir on Slave2; the Flume agents collect the data, forward it to Master1, and write it into HDFS.
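A minimal end-to-end check (file names and contents here are placeholders):

On Slave2, drop a file into the spooldir (Flume renames it with a .COMPLETED suffix once consumed):

echo "hello flume" > /usr/local/flumelog/test1.log

On Slave1, append to the tailed file:

echo "hello flume" >> /usr/local/flumelog/flume_test1.txt

Then verify the data arrived in HDFS (the %y-%m-%d path in agent0.conf matches date +%y-%m-%d):

hdfs dfs -ls /input/$(date +%y-%m-%d)
hdfs dfs -cat "/input/$(date +%y-%m-%d)/events-*"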