
[Repost] Zabbix Performance Tuning

Original article: http://www.mamicode.com/info-detail-1435046.html

Alerts being addressed:

Too many processes on 

zabbix poller processes more than 75% busy

zabbix unreachable poller processes more than 75% busy

 

 

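1. A device whose data is collected through the Zabbix agent is in the Monitored state, but the machine has gone down, or the agent has died for some other reason, so the server gets no data. In this case unreachable poller usage rises.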

 

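2. A device whose data is collected through the Zabbix agent is in the Monitored state, but the server's requests to the agent take too long and often exceed the Timeout configured on the server. In this case unreachable poller usage also rises.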

 

 

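Tuning approach:

1. Make sure the performance of Zabbix's internal components is itself being monitored (the basis of all tuning!)

2. Use a server with sufficiently powerful hardware

3. Separate the roles and give each its own server

4. Use active mode

5. Put zabbixtmp on a tmpfs file system

6. Use a distributed deployment

7. Tune MySQL

8. Tune Zabbix's own configuration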

 

 

 

Tuning steps:

1. Measure Zabbix performance

Zabbix performance is measured by NVPS (new values processed per second); the Zabbix dashboard shows a rough estimate of it.
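To make the NVPS figure concrete, here is a minimal sketch of how such an estimate can be approximated by hand: each enabled item contributes one value per update interval. The function name and the item counts are made up for illustration, and the dashboard's own calculation may differ in detail.

```python
def estimate_nvps(item_intervals_seconds):
    """Rough new-values-per-second estimate: each item adds 1/interval."""
    return sum(1.0 / interval for interval in item_intervals_seconds if interval > 0)


# Hypothetical inventory: 3000 items polled every 60 s, 600 items every 300 s.
intervals = [60] * 3000 + [300] * 600
print(f"estimated NVPS: {estimate_nvps(intervals):.1f}")
```

Halving the update interval of an item doubles its contribution, which is why long intervals on low-value items are a cheap way to reduce load.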

 

 

2. Get the working state of Zabbix's internal components
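Zabbix exposes the busy percentage of its own processes through internal items such as zabbix[process,poller,avg,busy]. The sketch below assumes those percentages have already been collected into a dictionary (the sample values are invented) and simply flags the process types above the 75% level used by the alerts at the top of this article.

```python
BUSY_THRESHOLD = 75.0  # percent, matching the "more than 75% busy" alerts


def overloaded_processes(busy_percent_by_type, threshold=BUSY_THRESHOLD):
    """Return the process types whose busy percentage exceeds the threshold."""
    return sorted(
        name for name, busy in busy_percent_by_type.items() if busy > threshold
    )


# Invented sample readings, keyed by Zabbix process type.
samples = {
    "poller": 82.5,
    "unreachable poller": 91.0,
    "trapper": 12.3,
    "history syncer": 40.0,
}
print(overloaded_processes(samples))
```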

 

 

 

3. Use a tmpfs file system

cd / 

mkdir zabbixtmp 

chown mysql:mysql zabbixtmp 

vi /etc/fstab   # edit /etc/fstab and add the line below

tmpfs /zabbixtmp tmpfs rw,size=400m,nr_inodes=10k,mode=0700,uid=mysql,gid=mysql 0 0

 

When editing /etc/fstab, pay attention to the size setting; it is generally set to 8%-10% of physical memory.
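As a small worked example of the 8%-10% rule, the sketch below derives a tmpfs size from the amount of physical memory and prints the matching /etc/fstab entry. The helper names are hypothetical; the mount options are the ones used above, with mode= being the tmpfs option that sets directory permissions.

```python
def tmpfs_size_mb(physical_mem_mb, percent=10):
    """Size the tmpfs as a whole-number percentage of physical memory."""
    if not 1 <= percent <= 100:
        raise ValueError("percent must be between 1 and 100")
    return physical_mem_mb * percent // 100


def fstab_line(mount_point, size_mb, owner="mysql"):
    """Build the /etc/fstab entry for a tmpfs owned by the MySQL user."""
    options = f"rw,size={size_mb}m,nr_inodes=10k,mode=0700,uid={owner},gid={owner}"
    return f"tmpfs {mount_point} tmpfs {options} 0 0"


# Assumed example: a server with 4000 MB of RAM, using the upper 10% bound.
print(fstab_line("/zabbixtmp", tmpfs_size_mb(4000, percent=10)))
```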

 

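4. Use active mode and proxy-based distributed monitoring

When zabbix_server has too many hosts and the server itself polls them for data, Zabbix runs into serious performance problems, mainly:

(1) Once the number of monitored hosts reaches a certain scale, the web frontend becomes sluggish and 502 errors are common

(2) Graphs show gaps

(3) Too many poller processes are started; even if the number of items is reduced, adding more machines later brings the problem back

Directions to consider:

a. Add proxy nodes, or use Node mode, for distributed monitoring

b. Switch agentd to active mode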

 

Agent-side zabbix_agentd.conf settings

vim zabbix_agentd.conf

LogFile = /tmp/zabbix_agentd.log

StartAgents=0

ServerActive=ip

Hostname=

RefreshActiveChecks=1800

BufferSize=200

Timeout=10
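Before rolling a configuration like this out to many hosts, it can help to check that a file really is active-only: StartAgents=0 turns off the passive listener, and ServerActive must name a server. The sketch below is one possible check, not part of Zabbix; the parser and function names are hypothetical, and the address 192.0.2.10 and hostname web01 are placeholders.

```python
def parse_conf(text):
    """Parse key=value lines, skipping blanks and # comments."""
    settings = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        settings[key.strip()] = value.strip()
    return settings


def is_active_only(settings):
    """True when passive checks are off and an active server is configured."""
    return settings.get("StartAgents") == "0" and bool(settings.get("ServerActive"))


sample = """
LogFile = /tmp/zabbix_agentd.log
StartAgents=0
ServerActive=192.0.2.10
Hostname=web01
RefreshActiveChecks=1800
"""
print(is_active_only(parse_conf(sample)))
```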

 

Server-side zabbix_server.conf adjustments

StartPollers=100

StartTrappers=200

 

In the Zabbix templates, bulk-change the item type to Zabbix agent (active)

 

5. MySQL tuning for Zabbix

[mysqld] 

datadir=/var/lib/mysql 

socket=/var/lib/mysql/mysql.sock 

user=mysql 

 

# Disabling symbolic-links is recommended to prevent assorted security risks 

tmpdir=/zabbixtmp 

#network 

connect_timeout =60 

wait_timeout =5000 

max_connections =400 

max_allowed_packet =16M 

max_connect_errors =400 

#limits 

tmp_table_size =256M 

max_heap_table_size =64M 

table_cache =256 

#logs 

slow_query_log_file =/var/log/slowquery.log 

 

log_error =/var/log/mysql-error.log 

long_query_time =10 

slow_query_log =1 

#innodb 

 

#innodb_data_file_path =ibdata1:128M;ibdata2:128M:autoextend:max:4096M 

innodb_file_per_table =1     # one file per table

innodb_status_file =1 

 

innodb_additional_mem_pool_size =128M 

innodb_buffer_pool_size =2800M  # usually 70%-80% of the server's physical memory

innodb_flush_method =O_DIRECT 

#innodb_io_capacity =1000 

innodb_support_xa =0 

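innodb_log_file_size =64M  # the Zabbix database is write-heavy, so a larger value keeps MySQL from constantly flushing the log files to the tables.

# One side effect is that starting and stopping the database becomes a little slower.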

innodb_log_buffer_size =32M 

symbolic-links=0 

#log-queries-not-using-indexes 

thread_cache_size=4  # this value seems to affect the Threads_created-per-Connection hit rate seen in show global status

# With it set to 4, there were 3228483 Connections and 5840 Threads_created, a hit rate of about 99.8%. The lower Threads_created is, the better.

query_cache_size=128M 

#join_buffer_size=512K 

join_buffer_size=128M 

read_buffer_size=128M 

read_rnd_buffer_size=128M 

key_buffer=128M 

innodb_flush_log_at_trx_commit=2 

[mysqld_safe] 

log-error=/var/log/mysqld.log 

pid-file=/var/run/mysqld/mysqld.pid 

#DisableHousekeeping=1  # zabbix_server.conf setting: turn off the housekeeper when using partitioned tables
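The hit rate mentioned in the thread_cache_size comment can be recomputed from two counters reported by show global status, Connections and Threads_created. A minimal sketch, using the two figures quoted in that comment (the function name is hypothetical):

```python
def thread_cache_hit_rate(connections, threads_created):
    """Percentage of connections served by a cached thread."""
    if connections <= 0:
        raise ValueError("connections must be positive")
    return 100.0 * (1.0 - threads_created / connections)


# Counters quoted in the thread_cache_size comment above.
rate = thread_cache_hit_rate(connections=3228483, threads_created=5840)
print(f"thread cache hit rate: {rate:.2f}%")
```

If the rate falls noticeably while the connection count keeps climbing, raising thread_cache_size is the usual first thing to try.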

 

6. Adjust the number of Zabbix worker processes

vim zabbix_server.conf

StartPollers=90

StartPingers=10

StartPollersUnreachable=80

StartIPMIPollers=10

StartTrappers=20

StartDBSyncers=8

LogSlowQueries=1000

 

7. Zabbix DB partitioning

 

Step 1. Prepare the tables

ALTER TABLE `acknowledges` DROP PRIMARY KEY, ADD KEY `acknowledgedid` (`acknowledgeid`);

ALTER TABLE `alerts` DROP PRIMARY KEY, ADD KEY `alertid` (`alertid`);

ALTER TABLE `auditlog` DROP PRIMARY KEY, ADD KEY `auditid` (`auditid`);

ALTER TABLE `events` DROP PRIMARY KEY, ADD KEY `eventid` (`eventid`);

ALTER TABLE `service_alarms` DROP PRIMARY KEY, ADD KEY `servicealarmid` (`servicealarmid`);

ALTER TABLE `history_log` DROP PRIMARY KEY, ADD PRIMARY KEY (`itemid`,`id`,`clock`);

ALTER TABLE `history_log` DROP KEY `history_log_2`;

ALTER TABLE `history_text` DROP PRIMARY KEY, ADD PRIMARY KEY (`itemid`,`id`,`clock`);

ALTER TABLE `history_text` DROP KEY `history_text_2`;

 

 

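Step 2. Set up monthly partitions

Repeat the following for each of the tables from step 1. The example below creates monthly partitions for the events table from 2011-05 through 2011-12.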

ALTER TABLE `events` PARTITION BY RANGE( clock ) (

PARTITION p201105 VALUES LESS THAN (UNIX_TIMESTAMP("2011-06-01 00:00:00")),

PARTITION p201106 VALUES LESS THAN (UNIX_TIMESTAMP("2011-07-01 00:00:00")),

PARTITION p201107 VALUES LESS THAN (UNIX_TIMESTAMP("2011-08-01 00:00:00")),

PARTITION p201108 VALUES LESS THAN (UNIX_TIMESTAMP("2011-09-01 00:00:00")),

PARTITION p201109 VALUES LESS THAN (UNIX_TIMESTAMP("2011-10-01 00:00:00")),

PARTITION p201110 VALUES LESS THAN (UNIX_TIMESTAMP("2011-11-01 00:00:00")),

PARTITION p201111 VALUES LESS THAN (UNIX_TIMESTAMP("2011-12-01 00:00:00")),

PARTITION p201112 VALUES LESS THAN (UNIX_TIMESTAMP("2012-01-01 00:00:00"))

);
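Writing these partition lists by hand for every table is tedious and error-prone. The sketch below generates the same kind of statement for any table and starting month; the function names are hypothetical, and the output is only text to review before running it in MySQL.

```python
from datetime import date


def next_month(d):
    """First day of the month following d."""
    return date(d.year + (1 if d.month == 12 else 0), d.month % 12 + 1, 1)


def monthly_partitions(table, first, count):
    """Build an ALTER TABLE statement with `count` monthly partitions."""
    parts = []
    month = date(first.year, first.month, 1)
    for _ in range(count):
        upper = next_month(month)
        parts.append(
            f"PARTITION p{month:%Y%m} VALUES LESS THAN "
            f'(UNIX_TIMESTAMP("{upper:%Y-%m-%d} 00:00:00"))'
        )
        month = upper
    body = ",\n".join(parts)
    return f"ALTER TABLE `{table}` PARTITION BY RANGE( clock ) (\n{body}\n);"


# Reproduces the events example: May through December 2011.
print(monthly_partitions("events", date(2011, 5, 1), 8))
```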

 

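Step 3. Set up daily partitions

Repeat the following for each of the tables from step 1. The example below creates daily partitions for the history_uint table from 2011-05-15 through 2011-05-22.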

ALTER TABLE `history_uint` PARTITION BY RANGE( clock ) (

PARTITION p20110515 VALUES LESS THAN (UNIX_TIMESTAMP("2011-05-16 00:00:00")),

PARTITION p20110516 VALUES LESS THAN (UNIX_TIMESTAMP("2011-05-17 00:00:00")),

PARTITION p20110517 VALUES LESS THAN (UNIX_TIMESTAMP("2011-05-18 00:00:00")),

PARTITION p20110518 VALUES LESS THAN (UNIX_TIMESTAMP("2011-05-19 00:00:00")),

PARTITION p20110519 VALUES LESS THAN (UNIX_TIMESTAMP("2011-05-20 00:00:00")),

PARTITION p20110520 VALUES LESS THAN (UNIX_TIMESTAMP("2011-05-21 00:00:00")),

PARTITION p20110521 VALUES LESS THAN (UNIX_TIMESTAMP("2011-05-22 00:00:00")),

PARTITION p20110522 VALUES LESS THAN (UNIX_TIMESTAMP("2011-05-23 00:00:00"))

);
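The daily variant follows the same pattern: each partition is named after its day and bounded by midnight of the following day. A minimal sketch, with a hypothetical function name, that emits the statement as text for review:

```python
from datetime import date, timedelta


def daily_partitions(table, first_day, days):
    """Build an ALTER TABLE statement with one partition per day."""
    parts = []
    for offset in range(days):
        day = first_day + timedelta(days=offset)
        upper = day + timedelta(days=1)  # exclusive upper bound: next midnight
        parts.append(
            f"PARTITION p{day:%Y%m%d} VALUES LESS THAN "
            f'(UNIX_TIMESTAMP("{upper:%Y-%m-%d} 00:00:00"))'
        )
    body = ",\n".join(parts)
    return f"ALTER TABLE `{table}` PARTITION BY RANGE( clock ) (\n{body}\n);"


# Reproduces the history_uint example: 2011-05-15 through 2011-05-22.
print(daily_partitions("history_uint", date(2011, 5, 15), 8))
```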

 

 

Maintaining partitions by hand:

Add a new partition

ALTER TABLE `history_uint` ADD PARTITION (

PARTITION p20110523 VALUES LESS THAN (UNIX_TIMESTAMP("2011-05-24 00:00:00"))

);

 

Drop a partition (housekeeping)

ALTER TABLE `history_uint` DROP PARTITION p20110515;

 

 

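Step 4. Automatic daily partitioning

Make sure the partitions for the history tables were created correctly in step 3.

The script below drops and creates daily partitions automatically. By default it keeps only the most recent 3 days; to keep more, change the @mindays variable.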

 

 

Don't forget to add this command to your cron!

mysql -B -h localhost -u zabbix -pPASSWORD zabbix -e "CALL create_zabbix_partitions();"

 

 

Script for automatic partition creation:

https://github.com/xsbr/zabbixzone/blob/master/zabbix-mysql-autopartitioning.sql

 

 

DELIMITER //

DROP PROCEDURE IF EXISTS `zabbix`.`create_zabbix_partitions` //

CREATE PROCEDURE `zabbix`.`create_zabbix_partitions` ()

BEGIN

CALL zabbix.create_next_partitions("zabbix","history");

CALL zabbix.create_next_partitions("zabbix","history_log");

CALL zabbix.create_next_partitions("zabbix","history_str");

CALL zabbix.create_next_partitions("zabbix","history_text");

CALL zabbix.create_next_partitions("zabbix","history_uint");

CALL zabbix.drop_old_partitions("zabbix","history");

CALL zabbix.drop_old_partitions("zabbix","history_log");

CALL zabbix.drop_old_partitions("zabbix","history_str");

CALL zabbix.drop_old_partitions("zabbix","history_text");

CALL zabbix.drop_old_partitions("zabbix","history_uint");

END //

DROP PROCEDURE IF EXISTS `zabbix`.`create_next_partitions` //

CREATE PROCEDURE `zabbix`.`create_next_partitions` (SCHEMANAME varchar(64), TABLENAME varchar(64))

BEGIN

DECLARE NEXTCLOCK timestamp;

DECLARE PARTITIONNAME varchar(16);

DECLARE CLOCK int;

SET @totaldays = 7;

SET @i = 1;

createloop: LOOP

SET NEXTCLOCK = DATE_ADD(NOW(),INTERVAL @i DAY);

SET PARTITIONNAME = DATE_FORMAT( NEXTCLOCK, 'p%Y%m%d' );

SET CLOCK = UNIX_TIMESTAMP(DATE_FORMAT(DATE_ADD( NEXTCLOCK ,INTERVAL 1 DAY),'%Y-%m-%d 00:00:00'));

CALL zabbix.create_partition( SCHEMANAME, TABLENAME, PARTITIONNAME, CLOCK );

SET @i=@i+1;

IF @i > @totaldays THEN

LEAVE createloop;

END IF;

END LOOP;

END //

DROP PROCEDURE IF EXISTS `zabbix`.`drop_old_partitions` //

CREATE PROCEDURE `zabbix`.`drop_old_partitions` (SCHEMANAME varchar(64), TABLENAME varchar(64))

BEGIN

DECLARE OLDCLOCK timestamp;

DECLARE PARTITIONNAME varchar(16);

DECLARE CLOCK int;

SET @mindays = 3;

SET @maxdays = @mindays+4;

SET @i = @maxdays;

droploop: LOOP

SET OLDCLOCK = DATE_SUB(NOW(),INTERVAL @i DAY);

SET PARTITIONNAME = DATE_FORMAT( OLDCLOCK, 'p%Y%m%d' );

CALL zabbix.drop_partition( SCHEMANAME, TABLENAME, PARTITIONNAME );

SET @i=@i-1;

IF @i <= @mindays THEN

LEAVE droploop;

END IF;

END LOOP;

END //

DROP PROCEDURE IF EXISTS `zabbix`.`create_partition` //

CREATE PROCEDURE `zabbix`.`create_partition` (SCHEMANAME varchar(64), TABLENAME varchar(64), PARTITIONNAME varchar(64), CLOCK int)

BEGIN

DECLARE RETROWS int;

SELECT COUNT(1) INTO RETROWS

FROM `information_schema`.`partitions`

WHERE `table_schema` = SCHEMANAME AND `table_name` = TABLENAME AND `partition_name` = PARTITIONNAME;

 

IF RETROWS = 0 THEN

SELECT CONCAT( "create_partition(", SCHEMANAME, ",", TABLENAME, ",", PARTITIONNAME, ",", CLOCK, ")" ) AS msg;

SET @sql = CONCAT( 'ALTER TABLE `', SCHEMANAME, '`.`', TABLENAME, '`',

' ADD PARTITION (PARTITION ', PARTITIONNAME, ' VALUES LESS THAN (', CLOCK, '));' );

PREPARE STMT FROM @sql;

EXECUTE STMT;

DEALLOCATE PREPARE STMT;

END IF;

END //

DROP PROCEDURE IF EXISTS `zabbix`.`drop_partition` //

CREATE PROCEDURE `zabbix`.`drop_partition` (SCHEMANAME varchar(64), TABLENAME varchar(64), PARTITIONNAME varchar(64))

BEGIN

DECLARE RETROWS int;

SELECT COUNT(1) INTO RETROWS

FROM `information_schema`.`partitions`

WHERE `table_schema` = SCHEMANAME AND `table_name` = TABLENAME AND `partition_name` = PARTITIONNAME;

 

IF RETROWS = 1 THEN

SELECT CONCAT( "drop_partition(", SCHEMANAME, ",", TABLENAME, ",", PARTITIONNAME, ")" ) AS msg;

SET @sql = CONCAT( 'ALTER TABLE `', SCHEMANAME, '`.`', TABLENAME, '`',

' DROP PARTITION ', PARTITIONNAME, ';' );

PREPARE STMT FROM @sql;

EXECUTE STMT;

DEALLOCATE PREPARE STMT;

END IF;

END //

DELIMITER ;
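To see what one run of create_zabbix_partitions() touches, the sketch below mirrors the two loops in Python: the create loop covers the next @totaldays days, and the drop loop covers the days from @mindays+4 days ago down to @mindays+1 days ago. The function names are hypothetical and nothing here talks to MySQL; it only lists partition names.

```python
from datetime import date, timedelta


def partitions_to_create(today, total_days=7):
    """Names the create loop asks for: tomorrow through today+total_days."""
    names = []
    for i in range(1, total_days + 1):
        day = today + timedelta(days=i)
        names.append(f"p{day:%Y%m%d}")
    return names


def partitions_to_drop(today, min_days=3):
    """Names the drop loop asks for: min_days+4 days ago down to min_days+1."""
    max_days = min_days + 4
    names = []
    for i in range(max_days, min_days, -1):
        day = today - timedelta(days=i)
        names.append(f"p{day:%Y%m%d}")
    return names


today = date(2011, 5, 23)  # example date taken from the partitions above
print("create:", partitions_to_create(today))
print("drop:  ", partitions_to_drop(today))
```

Because the drop loop only looks back four days past @mindays, partitions older than that survive if the cron job is skipped for several days, and they then have to be dropped by hand.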

 

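Summary: as the number of monitored machines keeps growing, the tuning approach is to

1. Increase the number of Zabbix worker processes

2. Use active mode, so that the agent pushes the data itself

3. Use proxies for distributed monitoring

4. Tune MySQL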

 

References:

http://www.centoscn.com/zabbix/2014/0508/2936.html

http://caiguangguang.blog.51cto.com/1652935/1354093

http://waringid.blog.51cto.com/65148/1156013/

http://blog.sina.com.cn/s/blog_4cbf97060101fcfw.html

http://www.linuxidc.com/Linux/2015-08/121799.htm