Linux Real-Time Performance Test Tool: Using and Analyzing Cyclictest

  Cyclictest is a high-resolution test program written by Thomas Gleixner (tglx) and maintained by Clark Williams and John Kacur.

Documentation

Installation

  Get the latest sources with git clone git://git.kernel.org/pub/scm/utils/rt-tests/rt-tests.git, or fetch a released tarball from the archive and untar it into a directory of your choice. Then run make in the source directory. To cross-compile, set CROSS_COMPILE to your toolchain prefix (for example make CROSS_COMPILE=arm-v4t-linux-gnueabi-).
  You can run the resulting binary from there or install it.

lgs@f11#> git clone git://git.kernel.org/pub/scm/utils/rt-tests/rt-tests.git 
lgs@f11#> cd rt-tests
lgs@f11#> make all
lgs@f11#> cp ./cyclictest /usr/bin/
lgs@f11#> cyclictest --help

NOTE!
libnuma is required to build cyclictest. It is usually safe to have libnuma installed on non-NUMA systems as well, but if you don’t want to install the NUMA libraries (e.g. in an embedded environment), compile with make NUMA=0.

Run it

Make sure to be root or use sudo to run cyclictest.
Without parameters cyclictest creates one thread with a 1ms interval timer.
cyclictest --help prints help text for the various options (in the version shown here, -h selects the histogram).

#> ./cyclictest --help
cyclictest V 0.42
Usage:
cyclictest <options>

-a [NUM]  --affinity         run thread #N on processor #N, if possible
                             with NUM pin all threads to the processor NUM
-b USEC   --breaktrace=USEC  send break trace command when latency > USEC
-B        --preemptirqs      both preempt and irqsoff tracing (used with -b)
-c CLOCK  --clock=CLOCK      select clock
                             0 = CLOCK_MONOTONIC (default)
                             1 = CLOCK_REALTIME
-C        --context          context switch tracing (used with -b)
-d DIST   --distance=DIST    distance of thread intervals in us default=500
-E        --event            event tracing (used with -b)
-f        --ftrace           function trace (when -b is active)
-i INTV   --interval=INTV    base interval of thread in us default=1000
-I        --irqsoff          Irqsoff tracing (used with -b)
-l LOOPS  --loops=LOOPS      number of loops: default=0(endless)
-m        --mlockall         lock current and future memory allocations
-n        --nanosleep        use clock_nanosleep
-N        --nsecs            print results in ns instead of ms (default ms)
-o RED    --oscope=RED       oscilloscope mode, reduce verbose output by RED
-O TOPT   --traceopt=TOPT    trace option
-p PRIO   --prio=PRIO        priority of highest prio thread
-P        --preemptoff       Preempt off tracing (used with -b)
-q        --quiet            print only a summary on exit
-r        --relative         use relative timer instead of absolute
-s        --system           use sys_nanosleep and sys_setitimer
-T TRACE  --tracer=TRACER    set tracing function
                             configured tracers: unavailable (debugfs not mounted)
-t        --threads          one thread per available processor
-t [NUM]  --threads=NUM      number of threads:
                             without NUM, threads = max_cpus
                             without -t default = 1
-v        --verbose          output values on stdout for statistics
                             format: n:c:v n=tasknum c=count v=value in us
-D        --duration=t       specify a length for the test run
                             default is in seconds, but 'm', 'h', or 'd' maybe added
                             to modify value to minutes, hours or days
-h        --histogram=US     dump a latency histogram to stdout after the run
                             US is the max time to be be tracked in microseconds
-w        --wakeup           task wakeup tracing (used with -b)
-W        --wakeuprt         rt task wakeup tracing (used with -b)

-b is a debugging option that controls the latency tracer in the realtime preemption patch.
It is useful for tracking down unexpectedly large latencies on a system. The option works only when the following kernel configuration options are enabled:

  • CONFIG_PREEMPT_RT=y
  • CONFIG_WAKEUP_TIMING=y
  • CONFIG_LATENCY_TRACE=y
  • CONFIG_CRITICAL_PREEMPT_TIMING=y
  • CONFIG_CRITICAL_IRQSOFF_TIMING=y

The USEC parameter to the -b option defines a maximum latency value, which is compared against the actual latencies of the test. Once a measured latency exceeds this maximum, the kernel tracer and cyclictest are stopped. The trace can be read from /proc/latency_trace:
mybox# cat /proc/latency_trace >trace.log
Please be aware that the tracer adds significant overhead to the kernel, so the latencies will be much higher than on a kernel with latency tracing disabled.
-c CLOCK selects the clock to use:

  • 0 selects CLOCK_MONOTONIC, the monotonically increasing system
    time. This is the default selection.
  • 1 selects CLOCK_REALTIME, the time-of-day clock.

CLOCK_REALTIME can be set by settimeofday, while CLOCK_MONOTONIC cannot be modified by the user.
This option has no effect when the -s option is given.
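
The two clocks can be read side by side to see the difference. The following is only a minimal Python sketch, assuming a Linux system; cyclictest itself is a C program, and the helper name read_clocks is made up for this illustration.

```python
import time

def read_clocks():
    """Return (CLOCK_MONOTONIC, CLOCK_REALTIME) readings in seconds."""
    return (time.clock_gettime(time.CLOCK_MONOTONIC),
            time.clock_gettime(time.CLOCK_REALTIME))

mono, real = read_clocks()
# CLOCK_MONOTONIC counts from an unspecified starting point and cannot be set.
print(f"CLOCK_MONOTONIC: {mono:.6f} s")
# CLOCK_REALTIME is wall-clock time since the Unix epoch and can be changed.
print(f"CLOCK_REALTIME:  {real:.6f} s")
```

Because CLOCK_REALTIME can jump when the time of day is set, the monotonic clock is the safer basis for measuring intervals.
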
-d DIST set the distance of thread intervals in microseconds (default is 500us)
When cyclictest is called with the -t option and more than one thread is created, this distance is added to the interval of each successive thread:
Interval(thread N) = Interval(thread N-1) + DIST
-i INTV set the base interval of the thread(s) in microseconds (default is 1000us)
This sets the interval of the first thread. See also -d.
-l LOOPS set the number of loops (default = 0(endless))
This option is useful for automated tests with a given number of test cycles. cyclictest is stopped once the number of timer intervals has been reached.
-n use clock_nanosleep instead of POSIX interval timers
With this option the test threads sleep with clock_nanosleep rather than waiting on POSIX interval timers.
-p PRIO set the priority of the first thread
The given priority is assigned to the first test thread. Each further thread gets a priority one lower than the previous one, down to a minimum of 0:
Priority(Thread N) = max(Priority(Thread N-1) - 1, 0)
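
The per-thread interval and priority rules for -i, -d and -p can be tabulated with a few lines of Python. This is only a sketch: the function name thread_layout is hypothetical, and it assumes that the priority drops by one per thread and never goes below 0, which matches the ps output shown in the FAQ for -t5 -p 80.

```python
def thread_layout(num_threads, base_interval_us=1000, distance_us=500, top_prio=80):
    """Return a list of (thread index, interval in us, priority) tuples."""
    layout = []
    interval, prio = base_interval_us, top_prio
    for n in range(num_threads):
        layout.append((n, interval, prio))
        interval += distance_us   # Interval(N) = Interval(N-1) + DIST
        prio = max(prio - 1, 0)   # assumed: one step lower per thread, floor at 0
    return layout

# Parameters corresponding to: cyclictest -t5 -p 80 -i 10000 (default -d 500)
for n, interval, prio in thread_layout(5, base_interval_us=10000, top_prio=80):
    print(f"thread {n}: interval={interval}us prio={prio}")
```
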
-q run the tests quietly and print only a summary on exit
Useful for automated tests, where only the summary output needs to be captured.
-r use relative timers instead of absolute
The default behaviour of the tests is to use absolute timers. This option is there for completeness and should not be used for reproducible tests.
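
Why absolute timers give more reproducible results can be seen in a simplified measurement loop. The sketch below is not how cyclictest is implemented (cyclictest is written in C and uses clock_nanosleep or POSIX timers); the function name measure and the use of time.sleep as a stand-in for the real sleep call are assumptions made for illustration.

```python
import time

def measure(interval_us=1000, loops=100, absolute=True):
    """Sleep until successive deadlines and return the wakeup latencies in us."""
    interval_ns = interval_us * 1000
    deadline = time.monotonic_ns() + interval_ns
    latencies = []
    for _ in range(loops):
        remaining = deadline - time.monotonic_ns()
        if remaining > 0:
            time.sleep(remaining / 1e9)
        now = time.monotonic_ns()
        latencies.append((now - deadline) // 1000)  # how late the wakeup was
        if absolute:
            deadline += interval_ns       # stay on the original time grid
        else:
            deadline = now + interval_ns  # every wakeup delay shifts the grid
    return latencies

lat = measure(interval_us=1000, loops=200)
print(f"min={min(lat)} max={max(lat)} avg={sum(lat) / len(lat):.1f} (us)")
```

In absolute mode each deadline is derived from the previous deadline, so a late wakeup does not move later deadlines. In relative mode the delay of every wakeup accumulates into the schedule.
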
-s use sys_nanosleep and sys_setitimer instead of POSIX timers
Note that -s can only be used with one thread, because itimers are per process and not per thread. Combined with -n, -s uses the nanosleep syscall and is not restricted to one thread.
-t NUM set the number of test threads (default is 1). -t without an argument creates one thread per CPU.
Create NUM test threads. See -d, -i and -p for further information.
-v output values on stdout for statistics
This option is used to gather statistical information about the latency distribution. The output is sent to stdout in the format
n:c:v
where n = task number, c = count and v = latency value in us.
Use this option in combination with -l.
The OSADL Realtime LiveCD project provides a script to plot the latency distribution.
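
The n:c:v records are easy to post-process. The following Python sketch (not the OSADL script; the function name summarize and the sample values are invented for illustration) computes min, max and average latency per task and skips any line that is not in n:c:v form.

```python
from collections import defaultdict

def summarize(lines):
    """Return {task: (min, max, avg)} from 'n:c:v' lines; other lines are skipped."""
    values = defaultdict(list)
    for line in lines:
        parts = line.strip().split(":")
        if len(parts) != 3:
            continue
        try:
            task, _count, latency = (int(p) for p in parts)
        except ValueError:
            continue  # header or summary line, not a sample
        values[task].append(latency)
    return {t: (min(v), max(v), sum(v) / len(v)) for t, v in values.items()}

# Made-up sample data in n:c:v form, not real measurements.
sample = ["0:0:12", "0:1:9", "1:0:20", "0:2:15", "1:1:22"]
for task, (lo, hi, avg) in sorted(summarize(sample).items()):
    print(f"task {task}: min={lo} max={hi} avg={avg:.1f}")
```

In practice the lines would come from a captured run, for example the redirected output of cyclictest -v -l with a fixed loop count.
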

Expected Results

tglx’s reference machine

  All tests have been run on a Pentium III 400MHz based PC.
  The tables show comparisons of vanilla Linux 2.6.16, Linux-2.6.16-hrt5 and Linux-2.6.16-rt12. The tests for intervals less than the jiffy resolution have not been run on vanilla Linux 2.6.16. The test thread runs in all cases with SCHED_FIFO and priority 80. All numbers are in microseconds.

  • Test case: clock_nanosleep(TIMER_ABSTIME), interval 10000
    microseconds, 10000 loops, no load.

Commandline: cyclictest -t1 -p 80 -n -i 10000 -l 10000
Kernel        min   max   avg
2.6.16         24  4043  1989
2.6.16-hrt5    12    94    20
2.6.16-rt12     6    40    10

  • Test case: clock_nanosleep(TIMER_ABSTIME), interval 10000
    microseconds, 10000 loops, 100% load.

Commandline: cyclictest -t1 -p 80 -n -i 10000 -l 10000
Kernel        min   max   avg
2.6.16         55  4280  2198
2.6.16-hrt5    11   458    55
2.6.16-rt12     6    67    29

  • Test case: POSIX interval timer, interval 10000 microseconds, 10000
    loops, no load.

Commandline: cyclictest -t1 -p 80 -i 10000 -l 10000
Kernel        min   max   avg
2.6.16         21  4073  2098
2.6.16-hrt5    22   120    35
2.6.16-rt12    20    60    31

  • Test case: POSIX interval timer, interval 10000 microseconds, 10000
    loops, 100% load.

Commandline: cyclictest -t1 -p 80 -i 10000 -l 10000
Kernel        min   max   avg
2.6.16         82  4271  2089
2.6.16-hrt5    31   458    53
2.6.16-rt12    21    70    35

  • Test case: clock_nanosleep(TIMER_ABSTIME), interval 500
    microseconds, 100000 loops, no load.

Commandline: cyclictest -t1 -p 80 -i 500 -n -l 100000
Kernel        min   max   avg
2.6.16-hrt5     5   108    24
2.6.16-rt12     5    48     7

  • Test case: clock_nanosleep(TIMER_ABSTIME), interval 500
    microseconds, 100000 loops, 100% load.

Commandline: cyclictest -t1 -p 80 -i 500 -n -l 100000
Kernel        min   max   avg
2.6.16-hrt5     9   684    56
2.6.16-rt12    10    60    22

  • Test case: POSIX interval timer, interval 500 microseconds, 100000
    loops, no load.

Commandline: cyclictest -t1 -p 80 -i 500 -l 100000
Kernel        min   max   avg
2.6.16-hrt5     8   119    22
2.6.16-rt12    12    78    16

  • Test case: POSIX interval timer, interval 500 microseconds, 100000
    loops, 100% load.

Commandline: cyclictest -t1 -p 80 -i 500 -l 100000
Kernel        min   max   avg
2.6.16-hrt5    16   489    58
2.6.16-rt12    12    95    29

FAQ

ps shows the wrong scheduling class SCHED_OTHER

  Each cyclictest task consists of one or more threads. ps -ce shows only the main process, not its threads. ps -eLc | grep cyclic shows the main process and its threads, with the correct scheduling class SCHED_FIFO for the test threads.

#>./cyclictest -t5 -p 80 -n -i 10000

#> ps -cLe | grep cyclic
 4764  4764 TS   19 pts/1    00:00:01 cyclictest
 4764  4765 FF  120 pts/1    00:00:00 cyclictest
 4764  4766 FF  119 pts/1    00:00:00 cyclictest
 4764  4767 FF  118 pts/1    00:00:00 cyclictest
 4764  4768 FF  117 pts/1    00:00:00 cyclictest
 4764  4769 FF  116 pts/1    00:00:00 cyclictest

chrt shows the wrong scheduling class SCHED_OTHER

  Don’t pass chrt the PID of the main process; use the ID of one of its threads instead. The threads are listed by ps -cLe | grep cyclic.

#> chrt -p 4766
pid 4766's current scheduling policy: SCHED_FIFO
pid 4766's current scheduling priority: 79
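
The same per-thread information can be read programmatically. The sketch below assumes a Linux system with /proc mounted; the function name thread_policies is hypothetical. It walks /proc/PID/task and queries each thread ID, which is why it reports the test threads rather than only the main process.

```python
import os

# Map policy constants to names; only constants present on this platform are used.
POLICIES = {getattr(os, name): name
            for name in ("SCHED_OTHER", "SCHED_FIFO", "SCHED_RR",
                         "SCHED_BATCH", "SCHED_IDLE")
            if hasattr(os, name)}

def thread_policies(pid):
    """Return {tid: (policy name, realtime priority)} for every thread of pid."""
    result = {}
    for tid in sorted(int(t) for t in os.listdir(f"/proc/{pid}/task")):
        policy = os.sched_getscheduler(tid)
        prio = os.sched_getparam(tid).sched_priority
        result[tid] = (POLICIES.get(policy, str(policy)), prio)
    return result

# Demonstrated on the current process; pass the cyclictest PID instead.
for tid, (policy, prio) in thread_policies(os.getpid()).items():
    print(f"tid {tid}: {policy} prio {prio}")
```
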

taskset for CPU affinity

  The taskset command was written by Robert M. Love. SMP operating systems have choices when it comes to scheduling processes: a new or newly rescheduled process can run on any available CPU. While it shouldn’t matter where a new process runs, an existing process should go back to the CPU it was last running on, because that CPU’s cache may still hold data belonging to the process. This is particularly likely if the process is a thread: the other threads of the same program probably have data in the CPU cache that is of interest to their siblings (though this also diminishes the performance gain that might be seen from multithreading). For these reasons, scheduling algorithms pay attention to CPU affinity and try to keep it constant.
  It is possible to force a process to run only on certain CPUs, either with the Linux system calls sched_setaffinity and sched_getaffinity or with the taskset command-line tool.

lgs@f11#> taskset -c 3 top
lgs@f11#> taskset -p [pid]
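
The sched_setaffinity and sched_getaffinity system calls mentioned above are also exposed by Python's os module on Linux. The following is a small sketch, with pin_to_cpu as a made-up helper name, that pins the calling process to one CPU and then restores the original mask.

```python
import os

def pin_to_cpu(cpu, pid=0):
    """Restrict pid (0 = calling process) to one CPU; return the new affinity set."""
    os.sched_setaffinity(pid, {cpu})
    return os.sched_getaffinity(pid)

allowed = os.sched_getaffinity(0)   # CPUs this process may currently run on
first = min(allowed)
print("before:", sorted(allowed))
print("after: ", sorted(pin_to_cpu(first)))
os.sched_setaffinity(0, allowed)    # restore the original affinity mask
```
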

Compile failure because numa.h can’t be found

make
cc -D VERSION_STRING=0.85 -c src/cyclictest/cyclictest.c -Wall -Wno-nonnull -O2 -DNUMA -D_GNU_SOURCE -Isrc/include
In file included from src/cyclictest/cyclictest.c:37:0:
src/cyclictest/rt_numa.h:23:18: fatal error: numa.h: No such file or directory
compilation terminated.
make: *** [cyclictest.o] Error 1

  Simply install your distribution’s NUMA development package. On Fedora this is numactl-devel:

su -c 'yum install numactl-devel'

  This is only required for building and does not affect the way the test runs on non-NUMA machines.
