SQL Server Automated Operations Series: A Performance Counter Monitoring Script (PowerShell)

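Requirements

In a production environment you often need to check performance counter values automatically and raise an early warning, for example by email, when something looks abnormal. This article shows how to implement that kind of monitoring with PowerShell.

Counters to Monitor

From experience, a DBA generally needs to monitor the following system performance counters: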

  CPU:

    \Processor(_Total)\% Processor Time
    \Processor(_Total)\% Privileged Time
    \SQLServer:SQL Statistics\Batch Requests/sec
    \SQLServer:SQL Statistics\SQL Compilations/sec
    \SQLServer:SQL Statistics\SQL Re-Compilations/sec
    \System\Processor Queue Length
    \System\Context Switches/sec

  Memory:

    \Memory\Available Bytes
    \Memory\Pages/sec
    \Memory\Page Faults/sec
    \Memory\Pages Input/sec
    \Memory\Pages Output/sec
    \Process(sqlservr)\Private Bytes
    \SQLServer:Buffer Manager\Buffer cache hit ratio
    \SQLServer:Buffer Manager\Page life expectancy
    \SQLServer:Buffer Manager\Lazy writes/sec
    \SQLServer:Memory Manager\Memory Grants Pending
    \SQLServer:Memory Manager\Target Server Memory (KB)
    \SQLServer:Memory Manager\Total Server Memory (KB)

  Disk:

    \PhysicalDisk(_Total)\% Disk Time
    \PhysicalDisk(_Total)\Current Disk Queue Length
    \PhysicalDisk(_Total)\Avg. Disk Queue Length
    \PhysicalDisk(_Total)\Disk Transfers/sec
    \PhysicalDisk(_Total)\Disk Bytes/sec
    \PhysicalDisk(_Total)\Avg. Disk sec/Read
    \PhysicalDisk(_Total)\Avg. Disk sec/Write

  SQL Server:

    \SQLServer:Access Methods\FreeSpace Scans/sec
    \SQLServer:Access Methods\Full Scans/sec
    \SQLServer:Access Methods\Table Lock Escalations/sec
    \SQLServer:Access Methods\Worktables Created/sec
    \SQLServer:General Statistics\Processes blocked
    \SQLServer:General Statistics\User Connections
    \SQLServer:Latches\Total Latch Wait Time (ms)
    \SQLServer:Locks(_Total)\Lock Timeouts (timeout > 0)/sec
    \SQLServer:Locks(_Total)\Lock Wait Time (ms)
    \SQLServer:Locks(_Total)\Number of Deadlocks/sec
    \SQLServer:SQL Statistics\Batch Requests/sec
    \SQLServer:SQL Statistics\SQL Re-Compilations/sec

Monitoring Script

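# Connection settings for the SQL Server instance that sends the alert mail
$server = "(local)"
$uid = "sa"
$db="master"
# Note: $pwd is also the name of PowerShell's automatic current-directory variable
$pwd="password"
# Database Mail profile, recipient, and subject of the alert
$mailprfname = "SendEmail"
$recipients = "[email protected]"
$subject = "Database performance counters are abnormal!"
# Configuration files: monitored computers and counter thresholds
$computernamexml = "f:\computername.xml"
$alter_cpuxml = "f:\alter_cpu.xml"
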
function GetServerName($xmlpath)
{
    # Return the trimmed name of every <computername> element in the file.
    $xml = [xml] (Get-Content $xmlpath)
    $return = New-Object Collections.Generic.List[string]
    # @() makes a single element behave like a one-item array.
    foreach ($name in @($xml.computernames.computername))
    {
        $return.Add(([string]$name).Trim())
    }
    $return
}

function GetAlterCounter($xmlpath)
{
    # Return the <Counter> elements. Each one carries the counter path as its
    # text plus the "alter" (threshold) and "operator" attributes.
    $xml = [xml] (Get-Content $xmlpath)
    $xml.counters.Counter
}

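function CreateAlter($message)
{
    # Send the alert through Database Mail (msdb.dbo.sp_send_dbmail).
    $SqlConnection = New-Object System.Data.SqlClient.SqlConnection
    $SqlConnection.ConnectionString = "Server = $server; Database = $db;User Id = $uid; Password = $pwd"
    $cc = $SqlConnection.CreateCommand()

    # Call the procedure with parameters so that quotes in the report
    # cannot break the statement.
    $cc.CommandType = [System.Data.CommandType]::StoredProcedure
    $cc.CommandText = "msdb.dbo.sp_send_dbmail"
    $cc.Parameters.AddWithValue("@profile_name", $mailprfname) | Out-Null
    $cc.Parameters.AddWithValue("@recipients", $recipients) | Out-Null
    $cc.Parameters.AddWithValue("@body", $message) | Out-Null
    $cc.Parameters.AddWithValue("@subject", $subject) | Out-Null

    try
    {
        $SqlConnection.Open()
        $cc.ExecuteNonQuery() | Out-Null
    }
    finally
    {
        # Close the connection even if sending fails.
        $SqlConnection.Close()
    }
}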

$names = GetServerName $computernamexml
$pfcounters = GetAlterCounter $alter_cpuxml
foreach($cp in $names)
{
    # Build the full counter paths (\\computer\object\counter) for this machine.
    $p = New-Object Collections.Generic.List[string]
    $report = ""
    foreach ($pfc in $pfcounters)
    {
        $counter ="\\"+$cp+$pfc.get_InnerText().Trim()
        $p.Add($counter)
    }
    # Take one sample of every counter. The samples are assumed to come back
    # in the same order as the configured counters.
    $count = Get-Counter $p
    for ($i = 0; $i -lt $count.CounterSamples.Count; $i++)
    {
        $v = $count.CounterSamples[$i].CookedValue
        $pfc = $pfcounters[$i]
        $b = ""
        $lg = ""
        if($pfc.operator -eq "lt")
        {
            # "lt": alert when the threshold is less than or equal to the value.
            if ($v -ge [double]$pfc.alter)
            {
                $b = "alter"
                $lg = "Greater Than"
            }
        }
        elseif ($pfc.operator -eq "gt")
        {
            # "gt": alert when the threshold is greater than or equal to the value.
            if ($v -le [double]$pfc.alter)
            {
                $b = "alter"
                $lg = "Less Than"
            }
        }
        if($b -eq "alter")
        {
            $path = "\\"+$cp+$pfc.get_InnerText().Trim()
            $item = "{0}:{1};{2} Threshold:{3}" -f $path,$v.ToString(),$lg,$pfc.alter.Trim()
            $report += $item + "`n"
        }
    }
    if($report -ne "")
    {
        # Send the alert. The report lists counter, current value, and threshold.
        CreateAlter $report
    }
}

The script relies on two configuration files, computernamexml and alter_cpuxml, shown below in that order:

<computernames>
        <computername>
                wuxuelei-pc
        </computername>
</computernames>
<Counters>
        <Counter alter = "10" operator = "gt" >\Processor(_Total)\% Processor Time</Counter>
        <Counter alter = "10" operator = "gt" >\Processor(_Total)\% Privileged Time</Counter>
        <Counter alter = "10" operator = "gt" >\SQLServer:SQL Statistics\Batch Requests/sec</Counter>
        <Counter alter = "10" operator = "gt" >\SQLServer:SQL Statistics\SQL Compilations/sec</Counter>
        <Counter alter = "10" operator = "gt" >\SQLServer:SQL Statistics\SQL Re-Compilations/sec</Counter>
        <Counter alter = "10" operator = "lt" >\System\Processor Queue Length</Counter>
        <Counter alter = "10" operator = "lt" >\System\Context Switches/sec</Counter>
</Counters>

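Here alter is the threshold. Taking the first entry as an example: with operator "gt", an alert is sent when the threshold is greater than or equal to the counter value. With operator "lt", an alert is sent when the threshold is less than or equal to the counter value.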
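
The comparison rule can be tried in isolation, without reading real performance counters. The following is a minimal sketch under stated assumptions: `Test-CounterAlert` is a hypothetical helper name that does not appear in the script above, and the sample values are made up for illustration. It assumes the same semantics as the script's main loop.

```powershell
# Hypothetical helper that mirrors the comparison logic of the monitoring script.
function Test-CounterAlert {
    param(
        [double]$Value,       # sampled counter value
        [double]$Threshold,   # the "alter" attribute
        [string]$Operator     # the "operator" attribute: "gt" or "lt"
    )
    switch ($Operator) {
        'gt' { return ($Value -le $Threshold) }   # threshold >= value
        'lt' { return ($Value -ge $Threshold) }   # threshold <= value
        default { return $false }                 # unknown operator: no alert
    }
}

# Counter definitions in the same shape as alter_cpu.xml.
$config = [xml]@'
<Counters>
    <Counter alter = "10" operator = "gt" >\Processor(_Total)\% Processor Time</Counter>
    <Counter alter = "10" operator = "lt" >\System\Processor Queue Length</Counter>
</Counters>
'@

# Made-up sample values, one per counter, in the same order as the config.
$samples = 3.5, 4.0

$results = for ($i = 0; $i -lt $samples.Count; $i++) {
    $c = $config.Counters.Counter[$i]
    [pscustomobject]@{
        Path      = $c.InnerText.Trim()
        Value     = $samples[$i]
        Threshold = [double]$c.alter
        Operator  = $c.operator
        Alert     = Test-CounterAlert -Value $samples[$i] -Threshold ([double]$c.alter) -Operator $c.operator
    }
}
$results | Format-Table -AutoSize
```

With these sample values only the first counter raises an alert, because 3.5 is below its "gt" threshold of 10, while 4.0 has not reached the "lt" threshold of 10.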

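This kind of custom configuration makes for a flexible, adaptable automated monitoring standard. For example, it can be used to:

1. Check free disk space

2. Detect peak load conditions

3. Periodically adjust the thresholds used in production based on historical values, which is the so-called performance baseline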
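
Point 3, deriving a threshold from historical values, could be sketched as follows. This shows only one possible baseline rule (the mean plus a multiple of the standard deviation), chosen here for illustration; the article does not prescribe one. `Get-BaselineThreshold` and the history values are hypothetical.

```powershell
# Hypothetical helper: suggest a threshold from past samples of one counter.
function Get-BaselineThreshold {
    param(
        [double[]]$History,   # past samples of one counter
        [double]$Sigma = 3    # number of standard deviations above the mean
    )
    if ($History.Count -eq 0) { throw 'History must contain at least one sample.' }

    $mean = ($History | Measure-Object -Average).Average

    # Population standard deviation of the samples.
    $sumSq = 0.0
    foreach ($x in $History) { $sumSq += [math]::Pow($x - $mean, 2) }
    $stdDev = [math]::Sqrt($sumSq / $History.Count)

    return [math]::Round($mean + $Sigma * $stdDev, 2)
}

# Made-up history of "% Processor Time" samples.
$history = 20, 22, 18, 25, 21, 19, 23
$threshold = Get-BaselineThreshold -History $history -Sigma 3
"Suggested threshold: $threshold"
```

The resulting value could then be written to the alter attribute of the matching Counter element, so that the monitoring script picks it up on its next run.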

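Ways to Run the Alerting

1. Configure a SQL Agent job

2. Use a Windows scheduled task

Either option can be adapted to your needs, and both are fairly simple to set up. If you are not familiar with them, a quick search will turn up the details. If you would rather not touch the production system itself, you can add a separate server dedicated to automated monitoring and run the checks remotely from there.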
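
For the scheduled-task option, a registration along the following lines may work on Windows versions that ship the ScheduledTasks module. This is a sketch only: the task name, the script path f:\monitor.ps1, and the five-minute interval are assumptions for illustration and do not come from the article.

```powershell
# Assumed location of the monitoring script (hypothetical path).
$scriptPath = 'f:\monitor.ps1'

# Run the script with Windows PowerShell, without loading a user profile.
$action = New-ScheduledTaskAction -Execute 'powershell.exe' `
    -Argument "-NoProfile -File `"$scriptPath`""

# Start now and repeat every 5 minutes (assumed interval).
$trigger = New-ScheduledTaskTrigger -Once -At (Get-Date) `
    -RepetitionInterval (New-TimeSpan -Minutes 5)

# Register the task under a hypothetical name; this needs sufficient privileges.
Register-ScheduledTask -TaskName 'SqlCounterMonitor' -Action $action -Trigger $trigger
```

Some older Windows versions may also expect an explicit -RepetitionDuration on the trigger, so check the behaviour on the target server before relying on it.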

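Later articles in this series will cover remote invocation with PowerShell, including automatically capturing a screenshot of the system state at the moment of an incident and sending it by email. That gives the DBA first-hand evidence from the scene and makes diagnosis easier.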

Result: [screenshot]

The above only shows one way to implement this. Adapt and extend it to fit your own needs.