Monitoring server traffic with Nagios (on Linux)

阿新 • Published: 2019-01-25

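There are three approaches in total.

A. Use check_mrtgtraf, a plugin bundled with Nagios, to monitor NIC traffic
This approach depends on MRTG data, and the conversion between bytes and bits is awkward in practice, so I do not recommend it.

Only a brief introduction to check_mrtgtraf here: it periodically reads the MRTG log file to obtain the current traffic.

An example follows. I personally find this plugin simple and limited, and have stopped using it.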

/u/nagios/libexec/check_mrtgtraf -F /var/www/html/mrtg/192.168.0.21_2.log -a AVG -w 300000,300000 -c 400000,400000 -e 1
Traffic WARNING - Avg. In = 295.2 KB/s, Avg. Out = 58.7 KB/s|in=295.211914KB/s;300000.000000;400000.000000;0.000000 in=58.667969KB/s;300000.000000;400000.000000;0.000000
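
The bytes-versus-bits confusion is easier to see with numbers. The sketch below assumes (the source does not state this) that the -w/-c thresholds are in bytes per second and that the plugin's KB means 1024 bytes; the helper names kb_to_bytes and bytes_to_bits are made up for illustration. Under those assumptions the inbound rate in the sample lands just above the 300000 warning threshold, which would explain the WARNING state.

```shell
#!/bin/sh
# Hypothetical helper: convert a KB/s figure from the plugin output
# into bytes/s, ASSUMING 1 KB = 1024 bytes.
kb_to_bytes() {
    awk -v kb="$1" 'BEGIN { printf "%.0f\n", kb * 1024 }'
}

# Hypothetical helper: bytes/s to bits/s, for comparison with
# figures that are expressed in bits per second.
bytes_to_bits() {
    echo $(( $1 * 8 ))
}

# Inbound average taken from the sample output above.
in_bytes=$(kb_to_bytes 295.211914)
echo "in: ${in_bytes} bytes/s, $(bytes_to_bits "$in_bytes") bits/s"
```

Doing this conversion by hand every time a threshold is set is the inconvenience described above.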

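B. Other traffic-monitoring scripts circulating online. I tried several and in the end still found them inconvenient.

Do not feel misled: I only settled on the approach below after trying the two above. It is convenient, quick, and capable.

C. Use the check_snmp_int.pl plugin to monitor the network
Recommended. It is simple, convenient, and feature-rich, and its traffic figures are fairly accurate (I compared them against MRTG and Cacti pages monitoring the same hosts).
Reference page: http://nagios.manubulon.com/snmp_int.html
Download: http://nagios.manubulon.com/check_snmp_int.pl

Prerequisite: the host to be monitored must also expose the SNMP service.

Environment: both the Nagios monitoring server and the monitored servers run Linux.

1. Download the plugin to the Nagios monitoring server
First make sure the SNMP and Perl packages are installed on the monitoring server, then run the following command to check that it returns a valid value.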

perl check_snmp_int.pl -H 192.168.0.21 -C zjhcsoft -n eth1 -k -Y -B -w 200,400 -c 0,800
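The options mean:
-H 192.168.0.21: the server to monitor
-C zjhcsoft: the SNMP community string
-n eth1: the interface to check
-k: check the interface's input/output bandwidth
-Y -B: used together, report the traffic in bits/s
-w and -c: the warning and critical thresholds, each given as in,out
Confirm that a manual run returns a valid result like the one below. In this sample the inbound rate is over the -w threshold and the outbound rate is over the -c threshold, so the overall state is CRITICAL.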
eth1:UP (WARN 5095.5Kbps/CRIT 37443.9Kbps):(1 UP): CRITICAL

2. Once the plugin is confirmed to work, configure Nagios
Edit commands.cfg in the objects directory and define a local command:
# vi commands.cfg
# 'check_snmp_int_iftraffic' command definition
define command{
command_name check_snmp_int_iftraffic
command_line $USER1$/check_snmp_int.pl -H $HOSTADDRESS$ -C $ARG1$ -n $ARG2$ -k -Y -B -w $ARG3$ -c $ARG4$
}
Create the service check by editing the configuration file:
# vi network_interface_service.cfg
define service{
use generic-service ; Inherit values from a template
host_name web1,web2,web3,web4,web6
service_description Output Interface Bandwidth Usage
check_command check_snmp_int_iftraffic!zjhcsoft!eth2!1200,5000!2000,10000
notifications_enabled 0
}
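
To make the link between the service's check_command and the command definition explicit: Nagios splits the check_command value on '!' and substitutes the pieces for $ARG1$ through $ARG4$. The shell sketch below imitates that substitution. expand_check_command is a hypothetical helper written only for illustration, and <HOSTADDRESS> is left as a placeholder because Nagios fills it in from the host definition.

```shell
#!/bin/sh
# Hypothetical helper: show which command line a check_command value
# would expand to, given the command definition used in this article.
expand_check_command() {
    # Split "name!arg1!arg2!arg3!arg4" on "!".
    old_ifs=$IFS
    IFS='!'
    set -- $1
    IFS=$old_ifs
    # $1 is the command name; $2..$5 stand in for $ARG1$..$ARG4$.
    echo "check_snmp_int.pl -H <HOSTADDRESS> -C $2 -n $3 -k -Y -B -w $4 -c $5"
}

expand_check_command 'check_snmp_int_iftraffic!zjhcsoft!eth2!1200,5000!2000,10000'
```

Reading the result, zjhcsoft is the community string, eth2 the interface, 1200,5000 the warning pair and 2000,10000 the critical pair.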

Verify that the configuration is valid:
$ /u/nagios/bin/nagios -v /u/nagios/etc/nagios.cfg
Restart Nagios:
# service nagios restart
3. Check the Nagios web interface and confirm that the page shows valid monitoring data

Note: the setup above uses SNMP v1. If the device speaks SNMP v2c, add the '-2' option and define a separate v2 local command in Nagios:

# 'check_snmp_int_iftraffic_v2' command definition
define command{
command_name check_snmp_int_iftraffic_v2
command_line $USER1$/check_snmp_int.pl -H $HOSTADDRESS$ -C $ARG1$ -2 -n $ARG2$ -k -Y -B -w $ARG3$ -c $ARG4$
}
Monitor a network device that uses SNMP v2:
define service{
use generic-service ; Inherit values from a template
host_name Netscreen ISG 2000
service_description Output Interface Bandwidth Usage
check_command check_snmp_int_iftraffic_v2!zjhcsoft!ethernet1/1!1200,5000!2000,10000
notifications_enabled 0
}

4. Problems encountered

# perl check_snmp_int.pl -H 192.168.0.21 -C zjhcsoft -n eth1 -k -w 200,400 -c 0,800
eth1:UP No usable data on file (1 rows) :(1 UP): UNKNOWN
# perl check_snmp_int.pl -H 192.168.0.21 -C zjhcsoft -n eth1 -k -w 200,400 -c 0,800
eth1:UP No usable data on file (2 rows) :(1 UP): UNKNOWN

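Explanation from the plugin's website:
(In short: the check needs to have been running for more than about 5 minutes before it can return a valid result. I left the default value unchanged; interested readers can dig deeper.)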
No usable data on file (X rows)

Scripts like check_snmp_int need to store data when they get a SNMP counter so they can output readable data like bandwidth, cpu, etc.

For example, to output a bandwidth with an octet counter, check_snmp_int will store data every time it is run. It will also read the previous data, and try to get data old enough to make a correct average. By default, it needs data which was produced 5 minutes ago.

So, when you first run the script, or if you last ran it a long time ago, it won't be able to get data old enough and will report an error (UNKNOWN status) saying there is "no usable data on file (X rows)".
If you leave the 5 minutes default delta value, the script will need data which is:
- At least 4 minutes and 30 seconds old (5 min - 10%)
- At most 15 minutes old (3 * 5 min)

You can change this 5 minutes value using the '-d <sec>' option. The script will then look for data which is at least <sec>-10% old and at most 3*<sec>.

This option only tells the script to average over <sec> seconds. You can run the service every minute with Nagios, and it will always pick the newest value which is at least <sec>-10% old.
The only thing you must check is that your service runs at least every 15 minutes, or the script will always output "unknown" because the value will be too old for it.
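
The age window described above can be worked out for any -d value. The sketch below only does the arithmetic from the quoted explanation (minimum age is <sec> minus 10%, maximum age is 3 times <sec>); delta_window is a hypothetical helper for illustration, not part of the plugin.

```shell
#!/bin/sh
# Hypothetical helper: print the minimum and maximum age (in seconds)
# of a stored sample that would be usable for a given -d <sec> value.
delta_window() {
    delta=$1
    min_age=$(( delta - delta / 10 ))   # <sec> minus 10%
    max_age=$(( delta * 3 ))            # 3 * <sec>
    echo "$min_age $max_age"
}

delta_window 300   # the default delta of 5 minutes
delta_window 60    # a shorter averaging period, as with '-d 60'
```

With the default of 300 seconds this gives 270 and 900 seconds, matching the 4 minutes 30 seconds and 15 minutes quoted above.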
