Loading Data from a Log Directory into HDFS with Flume

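Flume is a distributed, reliable, and available service for efficiently collecting, aggregating, and moving large amounts of log data. It has a simple, scalable architecture based on data flows. It is robust and fault tolerant, with reliability mechanisms as well as failover and recovery mechanisms.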
vi corp_base_info.conf

a1.sources = r1
a1.sinks = k1
a1.channels = c1

# Describe/configure the source
a1.sources.r1.type = spooldir
a1.sources.r1.spoolDir=/home/flume/testdata/test
a1.sources.r1.includePattern=^AUEIC.C_CONS([0-9a-zA-Z]|[._-])*$
a1.sources.r1.ignorePattern=^.*COMPLETED$
a1.sources.r1.inputCharset=UTF-8
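# Poll for new files every 5 minutes (300000 ms). The comment sits on its own line because a properties file would treat a trailing "#..." as part of the value.
a1.sources.r1.pollDelay=300000
Note: the properties originally marked in bold require Flume 1.7 or later.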
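To see which files the two patterns above would let through, here is a minimal Python sketch. It assumes the source accepts a file only when the whole file name matches includePattern and does not match ignorePattern. The file names are hypothetical, and Python's `re` module stands in for Java's regex engine, which should behave the same for patterns this simple.

```python
import re

# Patterns copied from the source configuration above.
INCLUDE_PATTERN = re.compile(r"^AUEIC.C_CONS([0-9a-zA-Z]|[._-])*$")
IGNORE_PATTERN = re.compile(r"^.*COMPLETED$")


def would_be_ingested(file_name: str) -> bool:
    """Approximate the filter: the full name must match the include
    pattern and must not match the ignore pattern."""
    included = INCLUDE_PATTERN.fullmatch(file_name) is not None
    ignored = IGNORE_PATTERN.fullmatch(file_name) is not None
    return included and not ignored


# Hypothetical file names, for illustration only.
for name in [
    "AUEIC.C_CONS_20180101.txt",
    "AUEIC.C_CONS_20180101.txt.COMPLETED",
    "OTHER_TABLE.txt",
]:
    print(name, would_be_ingested(name))
```

Note that the `.` in `AUEIC.C_CONS` is unescaped, so it matches any single character, not only a literal dot. If only a literal dot is intended, `AUEIC\.C_CONS` would be stricter.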

#Use a channel which buffers events in memory

a1.channels=c1
a1.channels.c1.capacity=1000000
a1.channels.c1.transactionCapacity=1000000
a1.channels.c1.type=memory

#Describe the sink
a1.sinks=k1
a1.sinks.k1.channel=c1
a1.sinks.k1.hdfs.fileType=DataStream
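a1.sinks.k1.hdfs.path=hdfs://mynameservice/apps/hive/warehouse/flume.db/corp_base_info/ymd=%Y%m%d
# The %Y%m%d escapes need a timestamp. The spooldir source does not add a timestamp header by itself, so use the agent's local time.
a1.sinks.k1.hdfs.useLocalTimeStamp=true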
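# Roll files by size only: a count or interval of 0 disables that trigger
a1.sinks.k1.hdfs.rollCount=0
a1.sinks.k1.hdfs.rollInterval=0
# Start a new file after roughly 10 MB (the value is in bytes)
a1.sinks.k1.hdfs.rollSize=10240000
# Close a file that has received no writes for 60 seconds
a1.sinks.k1.hdfs.idleTimeout=60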
a1.sinks.k1.hdfs.writeFormat=Text
a1.sinks.k1.type=hdfs

# Bind the source and sink to the channel
a1.sources.r1.channels = c1
a1.sinks.k1.channel = c1
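The `ymd=%Y%m%d` suffix in hdfs.path is what places each day's data in its own partition directory. The Python sketch below only illustrates that substitution and is not Flume's own code: `resolve_path` is a hypothetical helper and the event time is made up. In Flume the timestamp comes from the event's timestamp header, or from the agent's clock when hdfs.useLocalTimeStamp is enabled.

```python
from datetime import datetime

# hdfs.path value copied from the sink configuration above.
HDFS_PATH = "hdfs://mynameservice/apps/hive/warehouse/flume.db/corp_base_info/ymd=%Y%m%d"


def resolve_path(template: str, event_time: datetime) -> str:
    """Replace the %Y, %m and %d escapes with the zero-padded year, month and day."""
    return (
        template
        .replace("%Y", f"{event_time.year:04d}")
        .replace("%m", f"{event_time.month:02d}")
        .replace("%d", f"{event_time.day:02d}")
    )


# Hypothetical event time, for illustration only.
print(resolve_path(HDFS_PATH, datetime(2018, 3, 7, 14, 30)))
# prints hdfs://mynameservice/apps/hive/warehouse/flume.db/corp_base_info/ymd=20180307
```

Because the path sits under the Hive warehouse directory, a table partitioned by `ymd` could pick up each day's directory as a partition, assuming the partition is registered in the metastore.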