1. 程式人生 > >用ganglia監控hadoop

用ganglia監控hadoop

2012-09-19

周海漢/文

http://abloz.com

2012.9.9

ganglia可以通過JMX來監控hadoop,並生成圖形。cacti也可以通過外掛,利用JMX,用於監控hadoop。只是目前cacti的監控外掛很久沒有更新,執行起來很困難,需要多處修改。就算執行起來了,還是和現有hadoop相關產品很多指標不一致,所以不適合使用。 ganglia則在hadoop中原生支援。

[[email protected] conf]$ pwd /home/zhouhh/hadoop-1.0.3/conf

設定使用者許可權

[[email protected] conf]$ cat jmxremote.access
monitorRole readonly
controlRole readwrite
[
[email protected]
conf]$ cat jmxremote.password monitorRole 123 controlRole 123 [[email protected] conf]$ vi hadoop-env.sh #add by zhh for monitor #因為認證有問題,所以認證禁止 export HADOOP_JMX_BASE="-Dcom.sun.management.jmxremote.ssl=false -Dcom.sun.management.jmxremote.authenticate=false" export HADOOP_JMX_BASE="$HADOOP_JMX_BASE -Dcom.sun.management.jmxremote.password.file=${HADOOP_INSTALL}/conf/jmxremote.password" export HADOOP_JMX_BASE="$HADOOP_JMX_BASE -Dcom.sun.management.jmxremote.access.file=${HADOOP_INSTALL}/conf/jmxremote.access" # Command specific options appended to HADOOP_OPTS when specified #zhh for monitor #設定jmx埠號 export HADOOP_NAMENODE_OPTS="$HADOOP_JMX_BASE -Dcom.sun.management.jmxremote.port=10001" export HADOOP_SECONDARYNAMENODE_OPTS="$HADOOP_JMX_BASE -Dcom.sun.management.jmxremote.port=10002" export HADOOP_DATANODE_OPTS="$HADOOP_JMX_BASE -Dcom.sun.management.jmxremote.port=10003" export HADOOP_BALANCER_OPTS="$HADOOP_JMX_BASE -Dcom.sun.management.jmxremote.port=10004" export HADOOP_JOBTRACKER_OPTS="$HADOOP_JMX_BASE -Dcom.sun.management.jmxremote.port=10005" export HADOOP_NAMENODE_OPTS="-Dcom.sun.management.jmxremote $HADOOP_NAMENODE_OPTS" export HADOOP_SECONDARYNAMENODE_OPTS="-Dcom.sun.management.jmxremote $HADOOP_SECONDARYNAMENODE_OPTS" export HADOOP_DATANODE_OPTS="-Dcom.sun.management.jmxremote $HADOOP_DATANODE_OPTS" export HADOOP_BALANCER_OPTS="-Dcom.sun.management.jmxremote $HADOOP_BALANCER_OPTS" export HADOOP_JOBTRACKER_OPTS="-Dcom.sun.management.jmxremote $HADOOP_JOBTRACKER_OPTS"

修改hadoop-metrics2.properties

[[email protected] conf]$ cat hadoop-metrics2.properties
# syntax: [prefix].[source|sink|jmx].[instance].[options]
# See package.html for org.apache.hadoop.metrics2 for details

*.sink.file.class=org.apache.hadoop.metrics2.sink.FileSink

#namenode.sink.file.filename=namenode-metrics.out

#datanode.sink.file.filename=datanode-metrics.out

#jobtracker.sink.file.filename=jobtracker-metrics.out

#tasktracker.sink.file.filename=tasktracker-metrics.out

#maptask.sink.file.filename=maptask-metrics.out

#reducetask.sink.file.filename=reducetask-metrics.out
#
# Below are for sending metrics to Ganglia
#
# for Ganglia 3.0 support
# *.sink.ganglia.class=org.apache.hadoop.metrics2.sink.ganglia.GangliaSink30
#
# for Ganglia 3.1 support

# 大於0.20以後的版本用ganglia31
*.sink.ganglia.class=org.apache.hadoop.metrics2.sink.ganglia.GangliaSink31

*.sink.ganglia.period=10

# default for supportsparse is false
*.sink.ganglia.supportsparse=true

*.sink.ganglia.slope=jvm.metrics.gcCount=zero,jvm.metrics.memHeapUsedM=both
*.sink.ganglia.dmax=jvm.metrics.threadsBlocked=70,jvm.metrics.memHeapUsedM=40

#修改廣播IP地址,這是預設的,統一設該值

namenode.sink.ganglia.servers=239.2.11.71:8649

datanode.sink.ganglia.servers=239.2.11.71:8649

jobtracker.sink.ganglia.servers=239.2.11.71:8649

tasktracker.sink.ganglia.servers=239.2.11.71:8649

maptask.sink.ganglia.servers=239.2.11.71:8649

reducetask.sink.ganglia.servers=239.2.11.71:8649

#dfs.class=org.apache.hadoop.metrics.spi.NullContextWithUpdateThread
#zhh add
dfs.class=org.apache.hadoop.metrics.ganglia.GangliaContext31
dfs.period=10
dfs.servers=239.2.11.71:8649

mapred.class=org.apache.hadoop.metrics.ganglia.GangliaContext31
mapred.period=10
mapred.servers=239.2.11.71:8649

#jvm.class=org.apache.hadoop.metrics.spi.NullContextWithUpdateThread
#jvm.period=300

jvm.class=org.apache.hadoop.metrics.ganglia.GangliaContext31
jvm.period=10
jvm.servers=239.2.11.71:8649

修改hbase配置

[[email protected] conf]$ pwd
/home/zhouhh/hbase-0.94.0/conf
[[email protected] conf]$ vi hbase-env.sh

#zhh
JMX_OPTS="-Dcom.sun.management.jmxremote -Dcom.sun.management.jmxremote.ssl=false"
JMX_OPTS="$JMX_OPTS -Dcom.sun.management.jmxremote.password.file=$HADOOP_HOME/conf/jmxremote.passwd"
JMX_OPTS="$JMX_OPTS -Dcom.sun.management.jmxremote.access.file=$HADOOP_HOME/conf/jmxremote.access"

export HBASE_JMX_BASE="$JMX_OPTS -Dcom.sun.management.jmxremote.ssl=false -Dcom.sun.management.jmxremote.authenticate=false"

export HBASE_MASTER_OPTS="$HBASE_MASTER_OPTS $HBASE_JMX_BASE -Dcom.sun.management.jmxremote.port=10101"
export HBASE_REGIONSERVER_OPTS="$HBASE_REGIONSERVER_OPTS $HBASE_JMX_BASE -Dcom.sun.management.jmxremote.port=10102"
export HBASE_THRIFT_OPTS="$HBASE_THRIFT_OPTS $HBASE_JMX_BASE -Dcom.sun.management.jmxremote.port=10103"
export HBASE_ZOOKEEPER_OPTS="$HBASE_ZOOKEEPER_OPTS $HBASE_JMX_BASE -Dcom.sun.management.jmxremote.port=10104"

修改hbase下的hadoop-metrics.properties

[[email protected] conf]$ cat hadoop-metrics.properties
# See http://wiki.apache.org/hadoop/GangliaMetrics
# Make sure you know whether you are using ganglia 3.0 or 3.1.
# If 3.1, you will have to patch your hadoop instance with HADOOP-4675
# And, yes, this file is named hadoop-metrics.properties rather than
# hbase-metrics.properties because we're leveraging the hadoop metrics
# package and hadoop-metrics.properties is an hardcoded-name, at least
# for the moment.
#
# See also http://hadoop.apache.org/hbase/docs/current/metrics.html
# GMETADHOST_IP is the hostname (or) IP address of the server on which the ganglia
# meta daemon (gmetad) service is running

# Configuration of the "hbase" context for null
#hbase.class=org.apache.hadoop.metrics.spi.NullContext
hbase.class=org.apache.hadoop.metrics.ganglia.GangliaContext31

# Configuration of the "hbase" context for file
# hbase.class=org.apache.hadoop.hbase.metrics.file.TimeStampingFileContext
# hbase.period=10
# hbase.fileName=/tmp/metrics_hbase.log

# HBase-specific configuration to reset long-running stats (e.g. compactions)
# If this variable is left out, then the default is no expiration.
hbase.extendedperiod = 3600

# Configuration of the "hbase" context for ganglia
# Pick one: Ganglia 3.0 (former) or Ganglia 3.1 (latter)
#zhh
#hbase.class=org.apache.hadoop.metrics.ganglia.GangliaContext
hbase.class=org.apache.hadoop.metrics.ganglia.GangliaContext31
hbase.period=10
#hbase.servers=GMETADHOST_IP:8649
hbase.servers=239.2.11.71:8649

# Configuration of the "jvm" context for null
#jvm.class=org.apache.hadoop.metrics.spi.NullContext

# Configuration of the "jvm" context for file
# jvm.class=org.apache.hadoop.hbase.metrics.file.TimeStampingFileContext
# jvm.period=10
# jvm.fileName=/tmp/metrics_jvm.log

# Configuration of the "jvm" context for ganglia
# Pick one: Ganglia 3.0 (former) or Ganglia 3.1 (latter)
# jvm.class=org.apache.hadoop.metrics.ganglia.GangliaContext
jvm.class=org.apache.hadoop.metrics.ganglia.GangliaContext31
jvm.period=10
jvm.servers=239.2.11.71:8649

# Configuration of the "rpc" context for null
#rpc.class=org.apache.hadoop.metrics.spi.NullContext

# Configuration of the "rpc" context for file
# rpc.class=org.apache.hadoop.hbase.metrics.file.TimeStampingFileContext
# rpc.period=10
# rpc.fileName=/tmp/metrics_rpc.log

# Configuration of the "rpc" context for ganglia
# Pick one: Ganglia 3.0 (former) or Ganglia 3.1 (latter)
# rpc.class=org.apache.hadoop.metrics.ganglia.GangliaContext
rpc.class=org.apache.hadoop.metrics.ganglia.GangliaContext31
rpc.period=10
rpc.servers=239.2.11.71:8649

# Configuration of the "rest" context for ganglia
# Pick one: Ganglia 3.0 (former) or Ganglia 3.1 (latter)
# rest.class=org.apache.hadoop.metrics.ganglia.GangliaContext
rest.class=org.apache.hadoop.metrics.ganglia.GangliaContext31
rest.period=10
rest.servers=239.2.11.71:8649

重啟hbase和hadoop,gmond,gmetad 再用瀏覽器訪問,可以看到hadoop相關的度量。

如圖:

如非註明轉載, 均為原創. 本站遵循知識共享CC協議,轉載請註明來源