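On Linux we always need to store data somewhere, either in a file system (such as ext3) or on a raw device. We normally access that data through the file abstraction: the operating system hands us the data we ask for, and we rarely have to deal with the block device directly.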

As the figure (not reproduced here) makes clear, IO is a deeply layered subsystem with complicated data flow paths.

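How the operating system stores and retrieves this data is a complete black box to us, and that is usually not a problem. But when a workload is IO-intensive, we need to understand how IO actually works, so that we avoid abusing IO and introducing design problems.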

That is when you need a tool like blktrace.

blktrace is a block layer IO tracing mechanism which provides detailed information about request queue operations up to user space.

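Its author, Jens Axboe, is the maintainer of the kernel's IO subsystem and, at the time of writing, works at FusionIO. He is a really nice guy, and he is also the author of fio, the well-known IO benchmarking tool.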

blktrace is available on most Linux distributions, so it is easy to install and use:

$ sudo yum install blktrace 

$ sudo  blktrace /dev/sda5 -o - | blkparse -i -    

  8,5    2        1     0.000000000     0  C   W 40247824 + 8 [0]
  0,0    2        2     0.000040884  4271  A   W 31105920 + 8 <- (8,3) 132600
  8,5    2        3     0.000041214  4271  Q   W 31105920 + 8 [(null)]
  8,5    2        4     0.000045947  4271  G   W 31105920 + 8 [(null)]
  8,5    2        5     0.000046707  4271  P   N [(null)]
  8,5    2        6     0.000047073  4271  I   W 31105920 + 8 [(null)]
  0,0    2        7     0.000048282  4271  A   W 31105928 + 8 <- (8,3) 132608
  8,5    2        8     0.000048357  4271  Q   W 31105928 + 8 [(null)]
  8,5    2        9     0.000049137  4271  M   W 31105928 + 8 [(null)]
  0,0    2       10     0.000050167  4271  A   W 31105936 + 8 <- (8,3) 132616
  8,5    2       11     0.000050241  4271  Q   W 31105936 + 8 [(null)]
  8,5    2       12     0.000050417  4271  M   W 31105936 + 8 [(null)]
  0,0    2       13     0.000050984  4271  A   W 31105944 + 8 <- (8,3) 132624
  8,5    2       14     0.000051047  4271  Q   W 31105944 + 8 [(null)]
  8,5    2       15     0.000051258  4271  M   W 31105944 + 8 [(null)]
  8,5    2       16     0.000051829  4271  U   N [(null)] 1
  8,5    2       17     0.000052699  4271  D   W 31105920 + 32 [(null)]
  8,5    2       18     0.000108292     0  C   W 31105920 + 32 [0]
  0,0    2       19     0.000127791  4271  A   W 31105952 + 8 <- (8,3) 132632
  8,5    2       20     0.000128001  4271  Q   W 31105952 + 8 [(null)]
  8,5    2       21     0.000128874  4271  G   W 31105952 + 8 [(null)]
  8,5    2       22     0.000129373  4271  P   N [(null)]
  8,5    2       23     0.000129706  4271  I   W 31105952 + 8 [(null)]
  8,5    2       24     0.000130551  4271  U   N [(null)] 1
  8,5    2       25     0.000131330  4271  D   W 31105952 + 8 [(null)]
  8,5    2       26     0.000172705     0  C   W 31105952 + 8 [0]
  0,0   13        1 1266874889.709337223  4271  A   W 40247824 + 8 <- (8,3) 9274504
  8,5   13        2 1266874889.709338011  4271  Q   W 40247824 + 8 [kjournald]
  8,5   13        3 1266874889.709343974  4271  G   W 40247824 + 8 [kjournald]
  8,5   13        4 1266874889.709346653  4271  P   N [kjournald]
  8,5   13        5 1266874889.709347728  4271  I   W 40247824 + 8 [kjournald]
  8,5   13        6 1266874889.709350795  4271  U   N [kjournald] 1
  8,5   13        7 1266874889.709355396  4271  D   W 40247824 + 8 [kjournald]
  0,0   21        1     0.504685570  4267  A   W 92640335 + 8 <- (8,6) 234392
  8,5   21        2     0.504686212  4267  Q   W 92640335 + 8 [kjournald]
  8,5   21        3     0.504690614  4267  G   W 92640335 + 8 [kjournald]
  8,5   21        4     0.504691826  4267  P   N [kjournald]
  8,5   21        5     0.504692896  4267  I   W 92640335 + 8 [kjournald]
  0,0   21        6     0.504694268  4267  A   W 92640343 + 8 <- (8,6) 234400
  8,5   21        7     0.504694448  4267  Q   W 92640343 + 8 [kjournald]
  8,5   21        8     0.504695115  4267  M   W 92640343 + 8 [kjournald]
  0,0   21        9     0.504696227  4267  A   W 92640351 + 8 <- (8,6) 234408
  8,5   21       10     0.504696357  4267  Q   W 92640351 + 8 [kjournald]
  8,5   21       11     0.504696615  4267  M   W 92640351 + 8 [kjournald]
  0,0   21       12     0.504697422  4267  A   W 92640359 + 8 <- (8,6) 234416
  8,5   21       13     0.504697565  4267  Q   W 92640359 + 8 [kjournald]
  8,5   21       14     0.504697787  4267  M   W 92640359 + 8 [kjournald]
  0,0   21       15     0.504698549  4267  A   W 92640367 + 8 <- (8,6) 234424
  8,5   21       16     0.504698677  4267  Q   W 92640367 + 8 [kjournald]
  8,5   21       17     0.504698939  4267  M   W 92640367 + 8 [kjournald]
  8,5   21       18     0.504699954  4267  U   N [kjournald] 1
  8,5   21       19     0.504704050  4267  D   W 92640335 + 40 [kjournald]
  8,5    2       27     0.504810390     0  C   W 92640335 + 40 [0]
  0,0    2       28     0.504842324  4267  A   W 92640375 + 8 <- (8,6) 234432
  8,5    2       29     0.504842594  4267  Q   W 92640375 + 8 [kjournald]
  8,5    2       30     0.504844133  4267  G   W 92640375 + 8 [kjournald]
  8,5    2       31     0.504845233  4267  P   N [kjournald]
  8,5    2       32     0.504845703  4267  I   W 92640375 + 8 [kjournald]
  8,5    2       33     0.504846958  4267  U   N [kjournald] 1
  8,5    2       34     0.504848547  4267  D   W 92640375 + 8 [kjournald]
  8,5    2       35     0.504879109     0  C   W 92640375 + 8 [0]
CPU2 (8,5):
 Reads Queued:           0,        0KiB  Writes Queued:           6,       24KiB
 Read Dispatches:        0,        0KiB  Write Dispatches:        3,       24KiB
 Reads Requeued:         0               Writes Requeued:         0
 Reads Completed:        0,        0KiB  Writes Completed:        5,       48KiB
 Read Merges:            0,        0KiB  Write Merges:            3,       12KiB
 Read depth:             0               Write depth:             2
 IO unplugs:             3               Timer unplugs:           0
CPU13 (8,5):
 Reads Queued:           0,        0KiB  Writes Queued:           1,        4KiB
 Read Dispatches:        0,        0KiB  Write Dispatches:        1,        4KiB
 Reads Requeued:         0               Writes Requeued:         0
 Reads Completed:        0,        0KiB  Writes Completed:        0,        0KiB
 Read Merges:            0,        0KiB  Write Merges:            0,        0KiB
 Read depth:             0               Write depth:             2
 IO unplugs:             1               Timer unplugs:           0
CPU21 (8,5):
 Reads Queued:           0,        0KiB  Writes Queued:           5,       20KiB
 Read Dispatches:        0,        0KiB  Write Dispatches:        1,       20KiB
 Reads Requeued:         0               Writes Requeued:         0
 Reads Completed:        0,        0KiB  Writes Completed:        0,        0KiB
 Read Merges:            0,        0KiB  Write Merges:            4,       16KiB
 Read depth:             0               Write depth:             2
 IO unplugs:             1               Timer unplugs:           0

Total (8,5):
 Reads Queued:           0,        0KiB  Writes Queued:          12,       48KiB
 Read Dispatches:        0,        0KiB  Write Dispatches:        5,       48KiB
 Reads Requeued:         0               Writes Requeued:         0
 Reads Completed:        0,        0KiB  Writes Completed:        5,       48KiB
 Read Merges:            0,        0KiB  Write Merges:            7,       28KiB
 IO unplugs:             5               Timer unplugs:           0

Throughput (R/W): 0KiB/s / 95KiB/s
Events (8,5): 61 entries
Skips: 0 forward (0 -   0.0%)
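
To make the trace above a little easier to read, here is a minimal Python sketch that tallies events by action letter. It assumes the default blkparse output format shown above, and that the action letters mean what `man blkparse` documents (Q = queued, G = get request, I = inserted, M = back merge, D = issued to the driver, C = completed, A = remap). The function names `parse_event` and `summarize` are made up for this illustration and are not part of blktrace.

```python
from collections import Counter

# Events copied verbatim from the trace above (CPU 2, sequence numbers 2-18).
SAMPLE = """\
  0,0    2        2     0.000040884  4271  A   W 31105920 + 8 <- (8,3) 132600
  8,5    2        3     0.000041214  4271  Q   W 31105920 + 8 [(null)]
  8,5    2        4     0.000045947  4271  G   W 31105920 + 8 [(null)]
  8,5    2        5     0.000046707  4271  P   N [(null)]
  8,5    2        6     0.000047073  4271  I   W 31105920 + 8 [(null)]
  0,0    2        7     0.000048282  4271  A   W 31105928 + 8 <- (8,3) 132608
  8,5    2        8     0.000048357  4271  Q   W 31105928 + 8 [(null)]
  8,5    2        9     0.000049137  4271  M   W 31105928 + 8 [(null)]
  0,0    2       10     0.000050167  4271  A   W 31105936 + 8 <- (8,3) 132616
  8,5    2       11     0.000050241  4271  Q   W 31105936 + 8 [(null)]
  8,5    2       12     0.000050417  4271  M   W 31105936 + 8 [(null)]
  0,0    2       13     0.000050984  4271  A   W 31105944 + 8 <- (8,3) 132624
  8,5    2       14     0.000051047  4271  Q   W 31105944 + 8 [(null)]
  8,5    2       15     0.000051258  4271  M   W 31105944 + 8 [(null)]
  8,5    2       16     0.000051829  4271  U   N [(null)] 1
  8,5    2       17     0.000052699  4271  D   W 31105920 + 32 [(null)]
  8,5    2       18     0.000108292     0  C   W 31105920 + 32 [0]
"""


def parse_event(line):
    """Parse one default-format blkparse event line; return None for other lines."""
    parts = line.split()
    # An event line has at least: dev, cpu, seq, time, pid, action, rwbs.
    if len(parts) < 7 or "," not in parts[0]:
        return None
    try:
        event = {
            "dev": parts[0],
            "cpu": int(parts[1]),
            "seq": int(parts[2]),
            "time": float(parts[3]),
            "pid": int(parts[4]),
            "action": parts[5],
            "rwbs": parts[6],
        }
    except ValueError:
        return None
    # Plug/unplug events carry no "<sector> + <count>" part.
    if len(parts) >= 10 and parts[8] == "+":
        event["sector"] = int(parts[7])
        event["nsectors"] = int(parts[9])
    return event


def summarize(text, dev="8,5"):
    """Count events and sectors per action letter for one device."""
    counts = Counter()
    sectors = Counter()
    for line in text.splitlines():
        event = parse_event(line)
        if event is None or event["dev"] != dev:
            continue
        counts[event["action"]] += 1
        sectors[event["action"]] += event.get("nsectors", 0)
    return counts, sectors


counts, sectors = summarize(SAMPLE)
print("queued:", counts["Q"], "merged:", counts["M"], "dispatched:", counts["D"])
print("sectors queued:", sectors["Q"], "sectors dispatched:", sectors["D"])
```

On this sample, four 8-sector writes are queued, three of them are merged into the first, and a single 32-sector request is dispatched. That is the same relationship the per-device totals report: 12 writes queued, 7 merged, 5 dispatched.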

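With this information we can see exactly what the IO device is doing and how much time each step takes, and through it understand how the system behaves. The manual explains in detail how to read the output: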
$ man blkparse
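
As one example of reading the trace for timing, the sketch below pairs each D (issued to the driver) event with the matching C (completed) event and prints the time in between, which is roughly how long the device took to service the request. It is only an illustration: the regular expression assumes the default blkparse output format, matching on device, start sector and length is a simplification, and `dispatch_to_complete` is a made-up name. The sample lines are copied from the trace above.

```python
import re

# Matches default-format event lines that carry "<sector> + <count>".
EVENT_RE = re.compile(
    r"^\s*(?P<dev>\d+,\d+)\s+(?P<cpu>\d+)\s+(?P<seq>\d+)\s+(?P<time>\d+\.\d+)"
    r"\s+(?P<pid>\d+)\s+(?P<action>[A-Z])\s+(?P<rwbs>\w+)"
    r"\s+(?P<sector>\d+)\s+\+\s+(?P<count>\d+)"
)

# D and C events copied verbatim from the trace above.
SAMPLE = """\
  8,5    2        1     0.000000000     0  C   W 40247824 + 8 [0]
  8,5    2       17     0.000052699  4271  D   W 31105920 + 32 [(null)]
  8,5    2       18     0.000108292     0  C   W 31105920 + 32 [0]
  8,5    2       25     0.000131330  4271  D   W 31105952 + 8 [(null)]
  8,5    2       26     0.000172705     0  C   W 31105952 + 8 [0]
  8,5   21       19     0.504704050  4267  D   W 92640335 + 40 [kjournald]
  8,5    2       27     0.504810390     0  C   W 92640335 + 40 [0]
"""


def dispatch_to_complete(text):
    """Return (sector, nsectors, microseconds) for each D event with a matching C."""
    pending = {}   # (dev, sector, nsectors) -> timestamp of the D event
    results = []
    for line in text.splitlines():
        m = EVENT_RE.match(line)
        if m is None:
            continue
        key = (m["dev"], int(m["sector"]), int(m["count"]))
        t = float(m["time"])
        if m["action"] == "D":
            pending[key] = t
        elif m["action"] == "C" and key in pending:
            # A completion whose dispatch happened before tracing started is skipped.
            results.append((key[1], key[2], (t - pending.pop(key)) * 1e6))
    return results


latencies = dispatch_to_complete(SAMPLE)
for sector, nsectors, usec in latencies:
    print(f"{sector} + {nsectors}: {usec:.1f} us")
```

For the three requests in the sample this gives roughly 55.6, 41.4 and 106.3 microseconds. The first C event has no D event in the trace (the request was issued before tracing began), so it is ignored. btt reports the same interval as D2C, along with many other statistics.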

If this output feels too raw, tools such as btt and seekwatcher build on blktrace data to dig deeper into system behavior, and they are easier to use.

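In our day-to-day work we have used blktrace to track down many problems, such as fsync latency issues and IO scheduler issues. It really is a very practical tool.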

Have fun!
