Log Architecture Optimization for Massive Log Volumes: filebeat + logstash + kafka + ELK

Preface:

Lab requirements

In the earlier log-collection posts, the architecture was always filebeat + ELK. But if the business generates a massive volume of logs every day, logstash and elasticsearch can become performance bottlenecks. The fix is to move to filebeat + logstash + kafka + ELK:
the buffering of raw events shifts from elasticsearch to a message middleware (kafka), which absorbs the flood of data, avoids outages, and reduces the pressure on elasticsearch. elasticsearch then focuses on analyzing and indexing the data, which kibana presents in its UI.

Architecture diagram:

(figure: overall architecture — filebeat → logstash → kafka → logstash → elasticsearch → kibana)

Deployment topology:

The whole flow: filebeat collects the local logs — a logstash instance (or cluster) filters and processes them — sends them to kafka (or a cluster) for buffering — a second logstash, part of the ELK toolset, fetches the data from kafka — hands it to elasticsearch for analysis and indexing — and kibana displays it.
The two logstash instances deployed here play different roles.
Because the lab machines are VMs with little memory, four machines are used, distributed as follows (each VM should ideally have at least 4 GB of RAM):

test101 (10.0.0.101): jdk + tomcat + filebeat
test103 (10.0.0.103): jdk + zookeeper + kafka + logstash (forwarder)
test102 (10.0.0.102): jdk + logstash (consumer) + elasticsearch
test104 (10.0.0.104): kibana

Steps

1. Deploy tomcat on test101 and generate JSON-format logs

1.1 Install jdk + apache tomcat on test101

JDK installation is omitted; for tomcat, download the package and unpack it — that's all.

1.2 Modify the tomcat configuration to produce JSON-format logs

Edit the tomcat configuration file /usr/local/apache-tomcat-9.0.14/conf/server.xml and comment out the original AccessLogValve (around line 160):


<!--        <Valve className="org.apache.catalina.valves.AccessLogValve" directory="logs"
               prefix="localhost_access_log" suffix=".txt"
               pattern="%h %l %u %t &quot;%r&quot; %s %b" />
-->

Then add the new Valve. Note that the inner double quotes must be written as &quot; because the pattern sits inside an XML attribute:

<Valve className="org.apache.catalina.valves.AccessLogValve" directory="logs"
       prefix="tomcat_access_log" suffix=".log"
       pattern="{&quot;clientip&quot;:&quot;%h&quot;,&quot;ClientUser&quot;:&quot;%l&quot;,&quot;authenticated&quot;:&quot;%u&quot;,&quot;AccessTime&quot;:&quot;%t&quot;,&quot;method&quot;:&quot;%r&quot;,&quot;status&quot;:&quot;%s&quot;,&quot;SendBytes&quot;:&quot;%b&quot;,&quot;Query?string&quot;:&quot;%q&quot;,&quot;partner&quot;:&quot;%{Referer}i&quot;,&quot;AgentVersion&quot;:&quot;%{User-Agent}i&quot;}" />

1.3 Restart tomcat and visit 10.0.0.101:8080

The log has switched to JSON format:

[root@test101 logs]# tailf tomcat_access_log.2018-12-23.log

{"clientip":"10.0.0.1","ClientUser":"-","authenticated":" -","AccessTime":"[23/Dec/2018:16:01:35 -0500]","method":"GET / HTTP/1.1","status":"200","SendBytes":" 11286","Query?string":"","partner":"-","AgentVersion":"Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/71.0.3578.98 Safari/537.36"}

1.4 Create the elk yum repo file and install filebeat

[root@test101 ~]# cat /etc/yum.repos.d/elk.repo
[elastic-6.x]
name=Elastic repository for 6.x packages
baseurl=https://artifacts.elastic.co/packages/6.x/yum
gpgcheck=1
gpgkey=https://artifacts.elastic.co/GPG-KEY-elasticsearch
enabled=1
autorefresh=1
type=rpm-md
[root@test101 ~]#

[root@test101 ~]# yum -y install filebeat

Modify the configuration file /etc/filebeat/filebeat.yml as follows (with the commented-out lines removed, this is what remains).
Important: edit the existing configuration by hand — do not wipe the file and paste in the block below!

#=========================== Filebeat inputs =============================
filebeat.inputs:
- type: log
  enabled: true
  paths:
    - /usr/local/apache-tomcat-9.0.14/logs/tomcat_access_log*     # log path
  json.keys_under_root: true     # these two lines ensure the JSON logs are shipped as structured fields
  json.overwrite_keys: true
#============================= Filebeat modules ===============================
filebeat.config.modules:
  path: ${path.config}/modules.d/*.yml
  reload.enabled: false
#==================== Elasticsearch template setting ==========================
setup.template.settings:
  index.number_of_shards: 3
#============================== Kibana =====================================
setup.kibana:
#----------------------------- Logstash output --------------------------------
output.logstash:
  hosts: ["10.0.0.103:5044"]
#================================ Processors =====================================
processors:
  - add_host_metadata: ~
  - add_cloud_metadata: ~

Start filebeat:

[root@test101 ~]# systemctl start filebeat
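
Filebeat's own CLI can validate the setup; these checks are optional and my own addition:

# Validate the syntax of /etc/filebeat/filebeat.yml
filebeat test config
# Confirm filebeat can reach the logstash endpoint at 10.0.0.103:5044
filebeat test output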

2. Deploy logstash + kafka on test103

2.1 Deploy jdk + zookeeper + kafka

1) JDK deployment omitted

2) Install zookeeper:

[root@test103 ~]# tar xf zookeeper-3.4.13.tar.gz -C /usr/local/
[root@test103 ~]# cd /usr/local/zookeeper-3.4.13/conf/
[root@test103 conf]# mv zoo_sample.cfg zoo.cfg
[root@test103 conf]# cd ../bin/
[root@test103 bin]# ./zkServer.sh start
ZooKeeper JMX enabled by default
Using config: /usr/local/zookeeper-3.4.13/bin/../conf/zoo.cfg
Starting zookeeper ... STARTED
[root@test103 bin]# netstat -tlunp|grep 2181
tcp6       0      0 :::2181                 :::*                    LISTEN      18106/java
[root@test103 bin]#
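
Optionally (my addition), confirm zookeeper is actually serving before moving on; from the same bin/ directory:

# With a single node this should report "Mode: standalone"
./zkServer.sh status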

3) Install kafka:

[root@test103 ~]# tar xf kafka_2.12-2.1.0.tgz
[root@test103 ~]# mv kafka_2.12-2.1.0 /usr/local/kafka
[root@test103 ~]# cd /usr/local/kafka/config/

Edit server.properties; two settings change:

listeners=PLAINTEXT://10.0.0.103:9092
zookeeper.connect=10.0.0.103:2181

Start kafka:

[root@test103 config]# nohup /usr/local/kafka/bin/kafka-server-start.sh /usr/local/kafka/config/server.properties >/dev/null 2>&1 &
[root@test103 config]# netstat -tlunp|grep 9092
tcp6       0      0 10.0.0.103:9092         :::*                    LISTEN      17123/java
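
The topic logstash will publish to ("crystal") can be pre-created; with kafka's default auto.create.topics.enable=true it would also be created automatically on first write, so this step is my own addition and optional:

# Optional: pre-create the topic used later by logstash
/usr/local/kafka/bin/kafka-topics.sh --create --zookeeper 10.0.0.103:2181 \
    --replication-factor 1 --partitions 1 --topic crystal
# List topics to confirm the broker answers
/usr/local/kafka/bin/kafka-topics.sh --list --zookeeper 10.0.0.103:2181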

2.2 Deploy logstash

1) Create the same elk yum repo file as on test101:

[root@test103 ~]# cat /etc/yum.repos.d/elk.repo
[elastic-6.x]
name=Elastic repository for 6.x packages
baseurl=https://artifacts.elastic.co/packages/6.x/yum
gpgcheck=1
gpgkey=https://artifacts.elastic.co/GPG-KEY-elasticsearch
enabled=1
autorefresh=1
type=rpm-md
[root@test103 ~]#

2) Install the service and modify its configuration

[root@test103 ~]# yum -y install logstash

Modify the following items in /etc/logstash/logstash.yml:

path.data: /var/lib/logstash
path.config: /etc/logstash/conf.d
http.host: "10.0.0.103"    # this host's IP
path.logs: /var/log/logstash

Create the log-collection pipeline file:

[root@test103 ~]# cd /etc/logstash/conf.d/

Create the configuration file logstash-kafka.conf; this pipeline receives the data pushed over by filebeat and forwards it to kafka:

[root@test103 conf.d]# cat logstash-kafka.conf
input {
  beats {
    port => 5044
  }
}

output {
  kafka {
    bootstrap_servers => "10.0.0.103:9092"    # kafka broker address
    topic_id => "crystal"
    compression_type => "snappy"
    codec => json
  }
}
[root@test103 conf.d]#

3) Test and start logstash

[root@test103 ~]# /usr/share/logstash/bin/logstash -f /etc/logstash/conf.d/logstash-kafka.conf -t
WARNING: Could not find logstash.yml which is typically located in $LS_HOME/config or /etc/logstash. You can specify the path using --path.settings. Continuing using the defaults
Could not find log4j2 configuration at path /usr/share/logstash/config/log4j2.properties. Using default config which logs errors to the console
[WARN ] 2018-12-23 14:02:59.870 [LogStash::Runner] multilocal - Ignoring the 'pipelines.yml' file because modules or command line options are specified
Configuration OK
[INFO ] 2018-12-23 14:03:06.277 [LogStash::Runner] runner - Using config.test_and_exit mode. Config Validation Result: OK. Exiting Logstash
[root@test103 ~]#

The test passes; start logstash:

[root@test103 ~]# nohup /usr/share/logstash/bin/logstash -f /etc/logstash/conf.d/logstash-kafka.conf >/dev/null 2>&1 &
[2] 18200
[root@test103 ~]# netstat -tlunp|grep 18200    # check the listening ports: OK
tcp6       0      0 :::5044                 :::*                    LISTEN      18200/java
tcp6       0      0 127.0.0.1:9600          :::*                    LISTEN      18200/java
[root@test103 ~]#
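
Once filebeat on test101 has shipped something, the messages should be visible on the topic. A quick check (my addition) with kafka's bundled console consumer:

# Tail the "crystal" topic; each tomcat access should show up as one JSON message
/usr/local/kafka/bin/kafka-console-consumer.sh \
    --bootstrap-server 10.0.0.103:9092 --topic crystal --from-beginning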

3. Set up the ELK tools

3.1 Set up jdk + logstash + elasticsearch on test102

JDK deployment omitted.

3.2 Install logstash on test102

1) Install logstash via yum and modify the following items in /etc/logstash/logstash.yml:

path.data: /var/lib/logstash
path.config: /etc/logstash/conf.d
http.host: "10.0.0.102"
path.logs: /var/log/logstash

2) Create the pipeline configuration file; this one pulls the data out of kafka and hands it to elasticsearch for analysis:

[root@test102 logstash]# cat /etc/logstash/conf.d/logstash-es.conf
input {
  kafka {
    bootstrap_servers => "10.0.0.103:9092"
    topics => "crystal"
    codec => "json"
    consumer_threads => 5
    decorate_events => true
  }
}
output {
  elasticsearch {
    hosts => [ "10.0.0.102:9200" ]
    index => "tomcat-log-%{+YYYY-MM-DD}"    # note: Joda-time "DD" is day-of-year; "dd" would give day-of-month
    codec => "json"
  }
}
[root@test102 logstash]#
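
As in section 2, the pipeline can be syntax-checked before launching (my addition):

# Validate the pipeline configuration without starting the service
/usr/share/logstash/bin/logstash -f /etc/logstash/conf.d/logstash-es.conf -t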

Start the service:

[root@test102 ~]# nohup /usr/share/logstash/bin/logstash -f /etc/logstash/conf.d/logstash-es.conf >/dev/null 2>&1 &

3.3 Install elasticsearch on test102

Install elasticsearch via yum and modify the following lines in /etc/elasticsearch/elasticsearch.yml:

path.data: /var/lib/elasticsearch
path.logs: /var/log/elasticsearch
network.host: 10.0.0.102
http.port: 9200

Start the service:

[root@test102 config]# systemctl start elasticsearch
[root@test102 config]# netstat -tlunp|grep 9200
tcp6       0      0 10.0.0.102:9200         :::*                    LISTEN      7109/java
[root@test102 config]#
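
Optional checks (my addition) against elasticsearch's standard REST API:

# Cluster health: "green" or "yellow" (normal for a single node) means it is up
curl http://10.0.0.102:9200/_cluster/health?pretty
# List indices; a tomcat-log-* index appears once the first events arrive
curl "http://10.0.0.102:9200/_cat/indices?v"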

3.4 Install kibana on test104

Install kibana via yum and modify the following lines in /etc/kibana/kibana.yml:

server.port: 5601
server.host: "10.0.0.104"
elasticsearch.url: "http://10.0.0.102:9200"
kibana.index: ".kibana"

Start the service:

[root@test104 kibana]# systemctl start kibana
[root@test104 kibana]# netstat -tlunp|grep 5601
tcp        0      0 10.0.0.104:5601         0.0.0.0:*               LISTEN      11600/node
[root@test104 kibana]#
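
A quick way (my addition) to confirm kibana is answering before opening the browser:

# Kibana's status endpoint returns JSON; an overall state of "green" means ready
curl http://10.0.0.104:5601/api/status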

4. Log collection test

4.1 Visit tomcat at 10.0.0.101:8080 to generate logs

After visiting 10.0.0.101:8080, kibana's index-creation page already shows the index tomcat-log-2018-12-357 (the "357" is a day-of-year, a consequence of the DD in the index pattern noted above). Create the index pattern, choose "I don't want to use the Time Filter", and the data page then shows the logs, in JSON format:

(screenshot: kibana showing the JSON log entries)

This confirms the whole pipeline works.

Following the earlier post "ELK collects Apache JSON access logs and charts them by status code", create a pie chart and add it to a Dashboard:

(screenshot: status-code pie chart on the Dashboard)

Refresh 10.0.0.101:8080/dsfsdsd (the page does not exist, so it generates 404 status codes) and the pie chart updates dynamically:

(screenshot: pie chart now including the 404 responses)

With that, the filebeat + logstash + kafka + ELK architecture deployment is complete.