
How to use Maxwell, Flume, and Kafka to sync MySQL data to HDFS in real time

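Introduction to Maxwell

Maxwell is a daemon, an application that reads MySQL binlogs and emits the parsed row changes as JSON. It can write its output to Kafka, and it supports filtering by table and by database.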
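To make the JSON output concrete, here is a minimal Python sketch of reading one change event. The event follows the shape shown in Maxwell's documentation for a row insert (`database`, `table`, `type`, `ts`, `xid`, `commit`, `data`); the database, table, and column values are made up for illustration, and `summarize_event` is a hypothetical helper, not part of Maxwell.

```python
import json

# Sample event in the JSON shape Maxwell documents for a row insert.
# The database, table, and column values are invented for illustration.
SAMPLE_EVENT = (
    '{"database": "test", "table": "users", "type": "insert", '
    '"ts": 1449786310, "xid": 940752, "commit": true, '
    '"data": {"id": 1, "name": "alice"}}'
)


def summarize_event(raw: str) -> str:
    """Return a one-line summary of a single Maxwell change event."""
    event = json.loads(raw)
    target = f'{event["database"]}.{event["table"]}'
    return f'{event["type"]} on {target}: {event["data"]}'
```

A downstream consumer reading from Kafka would typically apply this kind of parsing to each message value before deciding how to store the row.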

Configuring MySQL -> Maxwell -> Kafka -> Flume -> HDFS

1) MySQL configuration requirements

Configuration settings

    [mysqld]
    server-id=1
    log-bin=master
    binlog_format=row   # change this from mixed to row
    binlog_row_image=FULL
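Since a wrong `binlog_format` is the easiest of these settings to miss, the following Python sketch checks a my.cnf fragment for them. It is an illustration under stated assumptions rather than a complete validator: `check_mysqld_settings` is a hypothetical helper, it only looks at the `[mysqld]` section, and it does not handle MySQL's habit of accepting `-` and `_` interchangeably in option names.

```python
import configparser
from typing import List

# Values the text above requires under [mysqld] (compared case-insensitively).
REQUIRED = {"binlog_format": "row", "binlog_row_image": "full"}


def check_mysqld_settings(cnf_text: str) -> List[str]:
    """Return a list of problems found in the [mysqld] section."""
    parser = configparser.ConfigParser(
        allow_no_value=True,             # my.cnf permits bare flags
        inline_comment_prefixes=("#",),  # strip trailing "# ..." comments
        interpolation=None,
    )
    parser.read_string(cnf_text)
    if not parser.has_section("mysqld"):
        return ["missing [mysqld] section"]
    section = parser["mysqld"]
    problems = []
    for key, expected in REQUIRED.items():
        actual = (section.get(key) or "").lower()
        if actual != expected:
            problems.append(
                f"{key} should be {expected}, found {actual or 'nothing'}"
            )
    # These two only need to be present; any value is accepted here.
    for key in ("server-id", "log-bin"):
        if key not in section:
            problems.append(f"{key} is not set")
    return problems
```

An empty list means the fragment satisfies the requirements listed above; MySQL still has to be restarted for changes to my.cnf to take effect.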

Privilege requirements

    GRANT ALL ON maxwell.* TO 'maxwell'@'%' IDENTIFIED BY 'maxwell';
    GRANT ALL ON maxwell.* TO 'maxwell'@'localhost' IDENTIFIED BY 'maxwell';
    GRANT SELECT, REPLICATION CLIENT, REPLICATION SLAVE ON *.* TO 'maxwell'@'%';
    GRANT SELECT, REPLICATION CLIENT, REPLICATION SLAVE ON *.* TO 'maxwell'@'localhost';

2) Install and configure Kafka

Make sure a Java runtime is installed; after that, Kafka can be used as soon as the archive is extracted.

    $ tar xvf kafka_2.10-0.10.2.1.tgz -C /mnt

After extracting, edit the configuration file:

    $ cat /mnt/kafka_2.10-0.10.2.1/config/server.properties
    ############################# Server Basics #############################
    broker.id=0
    delete.topic.enable=true
    ############################# Socket Server Settings #############################
    listeners=PLAINTEXT://0.0.0.0:9092
    num.network.threads=3
    num.io.threads=8
    socket.send.buffer.bytes=102400
    socket.receive.buffer.bytes=102400
    socket.request.max.bytes=104857600
    ############################# Log Basics #############################
    log.dirs=/tmp/kafka-logs
    num.partitions=1
    num.recovery.threads.per.data.dir=1
    ############################# Log Flush Policy #############################
    log.flush.interval.messages=10000
    log.flush.interval.ms=1000
    ############################# Log Retention Policy #############################
    log.retention.hours=168
    log.segment.bytes=1073741824
    log.retention.check.interval.ms=300000
    ############################# Zookeeper #############################
    zookeeper.connect=localhost:2181
    zookeeper.connection.timeout.ms=6000
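The broker configuration above is a flat list of `key=value` pairs, so it is easy to read back and sanity-check. The Python sketch below is one way to do that, not something the original setup includes. `parse_properties` and `review_broker_config` are hypothetical helpers, the parser only understands `key=value` lines and `#` comments (not the full Java properties syntax), and the `/tmp` warning assumes a system that may clear `/tmp`, for example at boot.

```python
from typing import Dict, List


def parse_properties(text: str) -> Dict[str, str]:
    """Parse simple key=value lines, skipping blanks and # comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        # Split on the first "=" only, so values such as URLs stay intact.
        key, sep, value = line.partition("=")
        if sep:
            props[key.strip()] = value.strip()
    return props


def review_broker_config(props: Dict[str, str]) -> List[str]:
    """Return human-readable notes about settings worth a second look."""
    notes = []
    if props.get("log.dirs", "").startswith("/tmp"):
        notes.append("log.dirs is under /tmp; data may be lost if /tmp is cleared")
    if "zookeeper.connect" not in props:
        notes.append("zookeeper.connect is not set")
    return notes
```

Run against the file shown above, this would flag only `log.dirs=/tmp/kafka-logs`, which is fine for a test machine but worth moving to a persistent directory otherwise.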

Kafka depends on ZooKeeper, so start ZooKeeper first.

    $ nohup /mnt/kafka_2.10-0.10.2.1/bin/zookeeper-server-start.sh /mnt/kafka_2.10-0.10.2.1/config/zookeeper.properties &

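Start the Kafka server. (Setting the JMX_PORT environment variable before running this command exposes JMX metrics, which Kafka-manager can use to collect statistics.)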

    $ nohup /mnt/kafka_2.10-0.10.2.1/bin/kafka-server-start.sh /mnt/kafka_2.10-0.10.2.1/config/server.properties &
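Because both services are started in the background with `nohup`, the shell returns before they are actually listening. A small Python sketch like the following can confirm that the ports from the configuration above (2181 for ZooKeeper, 9092 for Kafka) accept connections. `wait_for_port` is a hypothetical helper, and a successful TCP connection only shows that something is listening, not that the broker is healthy.

```python
import socket
import time


def wait_for_port(host: str, port: int, timeout: float = 30.0) -> bool:
    """Poll until a TCP connection to host:port succeeds or timeout expires."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            # Each attempt waits at most one second for the connection.
            with socket.create_connection((host, port), timeout=1.0):
                return True
        except OSError:
            # Not listening yet (or unreachable); pause briefly and retry.
            time.sleep(0.5)
    return False
```

For example, `wait_for_port("localhost", 2181)` can be called before starting Kafka, and `wait_for_port("localhost", 9092)` before starting anything that produces to it.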

3) Install and configure Flume

Download the Flume binary package from the Apache website, then extract it.

    tar xvf apache-flume-1.7.0-bin.tar.gz -C /usr/local/
    ln -sv /usr/local/apache-flume-1.7.0-bin/ /usr/local/flume

Set the environment variables

    $ cat /etc/profile.d/flume.sh
    export FLUME_HOME=/usr/local/flume
    export FLUME_CONF_DIR=$FLUME_HOME/conf
    export PATH=$PATH:$FLUME_HOME/bin

Check the Flume version