1. 程式人生 > >Android卡頓分析中常見的log

Android卡頓分析中常見的log

1 看記憶體
bugreport 開始的時候有pss的資訊 並且進行排序 , 之後會寫一個解析和計算的

2 找system log中關鍵部分

一般裝置hang 住的時候使用者會瘋狂按keycode ,可以找相關log

3keyCode:3 down:true eventTime:831405 下面是常見的keycode

KEYCODE_CALL 撥號鍵 5
KEYCODE_ENDCALL 掛機鍵 6
KEYCODE_HOME 按鍵Home 3
KEYCODE_MENU 選單鍵 82
KEYCODE_BACK 返回鍵 4
KEYCODE_SEARCH 搜尋鍵 84
KEYCODE_CAMERA 拍照鍵 27
KEYCODE_FOCUS 拍照對焦鍵 80
KEYCODE_POWER 電源鍵 26
KEYCODE_NOTIFICATION 通知鍵 83
KEYCODE_MUTE 話筒靜音鍵 91
KEYCODE_VOLUME_MUTE 揚聲器靜音鍵 164
KEYCODE_VOLUME_UP 音量增加鍵 24
KEYCODE_VOLUME_DOWN 音量減小鍵 25

3 lmk log

4 bugreport 啟動位置
搜尋:Starts to dump Bugreport 關鍵字,可以檢視開始生產bugreport的時間

10-27 16:32:14.361 2108 6613 W XSpaceManagerService: checkXSpaceControl, from:com.miui.home, to:com.miui.bugreport, with act:android.intent.action.MAIN, callingUserId:0, toUserId:0
10-27 16:32:14.368 2108 6613 I ActivityManager: START u0 {act=android.intent.action.MAIN cat=[android.intent.category.LAUNCHER] flg=0x10200000 cmp=com.miui.bugreport/.ui.MainTabActivity bnds=[487,664][609,786] (has extras)} from uid 10027 on display 0
10-27 16:32:14.387 4399 4545 I WtProcessController: mCurTask:85
10-27 16:32:14.397 2108 2138 I ActiveServicesInjector: RESTART Low priority start of: ServiceRecord{10fc538 u10 com.google.android.gms/com.google.android.location.internal.GoogleLocationManagerService}
10-27 16:32:14.398 2108 2138 W ActivityManager: Scheduling restart of crashed service com.google.android.gms/com.google.android.location.internal.GoogleLocationManagerService in 192456ms
10-27 16:32:14.444 2108 6613 I ActivityManager: Start proc 11744:com.miui.bugreport/1000 for activity com.miui.bugreport/.ui.MainTabActivity caller=com.miui.home
10-27 16:32:14.481 11674 11714 E ActivityThread: Failed to find provider info for com.google.android.gms.common.phenotype
10-27 16:32:14.505 4399 4545 I WtProcessController: MOVE TO FOREGROUND: com.miui.bugreport 1000
10-27 16:32:14.506 4399 4545 I StatusController: Last foreground:com.miui.home uid:10027 Current foreground:com.miui.bugreport uid:1000
10-27 16:32:14.506 4399 4545 I WtProcessController: FOREGROUND INFO: name=com.miui.bugreport uid=1000 pid=11744 TaskId:85
10-27 16:32:14.538 2108 2138 W ActivityManager: Scheduling restart of crashed service com.google.android.gms/com.google.android.contextmanager.service.ContextManagerService in 1000ms
10-27 16:32:14.548 2108 2138 I ActiveServicesInjector: Low priority start of: ServiceRecord{671a411 u10 com.google.android.gms/com.google.location.nearby.direct.service.NearbyDirectService}
10-27 16:32:14.570 2108 27407 I ActivityManager: Process com.qualcomm.qti.services.secureui:sui_service (pid 11100) has died

5 cpu資訊
—— DUMPSYS CPUINFO (dumpsys -t 10 cpuinfo -a) ——
Load: 25.52 / 21.97 / 14.3
CPU usage from 59342ms to 492ms ago (2017-11-23 19:28:38.401 to 2017-11-23 19:29:37.251):
65% 1643/system_server: 38% user + 27% kernel / faults: 23803 minor
57% 3505/android.process.acore: 30% user + 27% kernel / faults: 33751 minor 4 major
25% 11225/net.one97.paytm: 18% user + 6.9% kernel / faults: 43578 minor
8.1% 3160/com.facebook.katana: 7.2% user + 0.9% kernel / faults: 12876 minor 16 major
6.6% 732/surfaceflinger: 3.7% user + 2.8% kernel / faults: 336 minor
6.2% 3829/com.lbe.security.miui: 3.3% user + 2.8% kernel / faults: 3573 minor
5.2% 14218/android.process.media: 4.9% user + 0.2% kernel / faults: 5833 minor 1 major
5% 16004/com.android.systemui:screenshot: 3.9% user + 1.1% kernel / faults: 60453 minor 1 major
4.6% 16019/com.miui.gallery: 4.1% user + 0.5% kernel / faults: 21898 minor 8 major
3.9% 7096/com.imo.android.imoim: 3.8% user + 0.1% kernel / faults: 2909 minor

如上述log 搜尋 DUMPSYS CPUINFO 可以看到 當時系統的平均負載很高, 按照8核cpu來算,負載也很高,一般負載不應該超過cpu核心數*1.5

另外從cpu使用率上來看 kernel消耗了大量cpu時間 顯然不是正常情況,高吞吐率的系統,kernel應該使用較少cpu, 並且當時記憶體佔用不高 iowait不高的情況下 很可能是程序上下文頻繁切換引起

6 kernel重啟
—— POWERUP_REASON (/sys/bootinfo/powerup_reason) ——
kpanic

7 WatchDog資訊
watchdog發現系統卡頓的時候說明問題已經非常嚴重,有30s不響應就會列印traces.txt呼叫棧

超過一分鐘不想贏則會列印 類似如下的log,和儲存 名字如races_SystemServer_WDT_${time}.txt的檔案,並重啟system_server

Blocked in handler on foreground thread (android.fg), Blocked in handler on main thread (main), Blocked in handler on display thread (android.display), Blocked in handler on ActivityManager (ActivityManager)

“Blocked in monitor ” + mCurrentMonitor.getClass().getName() ” on ” + mName + ” (” + getThread().getName() + “)”

如果monkey過程中發生超過1s的超時,不會重啟system_server,但是會儲存一個anr_watchdog${time}.txt的bugreport檔案

8 等鎖的資訊
12-19 00:24:34.836 6795 6795 I dvm_lock_sample: [com.android.commands.monkey,0,main,1429523,PrintStream.java,823,-,823,0]
12-19 00:24:34.837 1547 1547 I dvm_lock_sample: [system_server,0,main,1459069,ActivityManagerService.java,19053,ActivityStarter.java,855,0]
12-19 00:24:34.839 20217 20217 I binder_sample: [android.app.IActivityManager,3,1459614,android.process.mediaUI,100]
12-19 00:24:34.840 1547 1833 I dvm_lock_sample: [system_server,1,Binder:1547_4,1459612,ActivityManagerService.java,18001,ActivityStarter.java,855,0]
12-19 00:24:34.842 1547 2690 I dvm_lock_sample: [system_server,1,Binder:1547_C,1459614,ActivityManagerService.java,7480,ActivityStarter.java,855,0]
12-19 00:24:34.843 1547 2822 I dvm_lock_sample: [system_server,1,Binder:1547_11,1459560,UserController.java,1507,ActivityStarter.java,855,0]
12-19 00:24:34.843 1547 1726 I dvm_lock_sample: [system_server,0,InputDispatcher,1459561,ActivityManagerService.java,6136,ActivityStarter.java,855,0]
12-19 00:24:34.844 1547 1596 I dvm_lock_sample: [system_server,0,ActivityManager,1459543,ActivityManagerService.java,2161,ActivityStarter.java,855,0]
12-19 00:24:34.845 1547 1788 I dvm_lock_sample: [system_server,0,LazyTaskWriterThread,1458865,TaskPersister.java,547,ActivityStarter.java,855,0]
12-19 00:24:34.845 1547 1602 I dvm_lock_sample: [system_server,0,android.display,1454632,ActivityManagerService.java,22610,ActivityStarter.java,855,0]
12-19 00:24:34.846 1547 1600 I dvm_lock_sample: [system_server,0,android.fg,1455409,ActivityManagerService.java,19053,ActivityStarter.java,855,0]
12-19 00:24:34.846 1547 3486 I dvm_lock_sample: [system_server,1,Binder:1547_20,1393949,ActivityManagerService.java,17756,ActivityStarter.java,855,0]
12-19 00:24:34.847 1547 1725 I dvm_lock_sample: [system_server,0,AlarmManager,1443453,ActivityManagerService.java,7780,ActivityStarter.java,855,0]
12-19 00:24:34.847 1547 3444 I dvm_lock_sample: [system_server,1,Binder:1547_1F,1253238,ActivityManagerService.java,14429,ActivityStarter.java,855,0]
12-19 00:24:34.847 1809 1809 I binder_sample: [android.app.IActivityManager,145,1459563,com.android.systemui,100]
12-19 00:24:34.847 1547 3440 I dvm_lock_sample: [system_server,1,Binder:1547_1C,815593,ActivityManagerService.java,6078,ActivityStarter.java,855,0]

格式

程序名,主執行緒?,執行緒名字,鎖等待時間,當前持有者的檔名,行號,上一個持有者檔名,行號,鎖等待的百分比

列印此條LOG的時候,上一個鎖的持有者已經釋放鎖,本執行緒為下一個鎖的持有者,即將獲取鎖
上一個持有者的檔名,可能是“-”,表示與持有者是同一個檔案。
經過最近分析,一般出現問題都是上一個只有者引起的該問題. 不過需要分析art原始碼確認這一點

9 art其他監控資訊(感覺像是斥鎖資訊)
W art : Long monitor contention with owner Binder:1547_E (2818) at android.content.Intent com.android.server.am.ExtraActivityManagerService.checkStartActivityPermission(android.content.Context, com.android.server.am.ActivityManagerService, android.app.IApplicationThread, android.content.pm.ActivityInfo, android.content.Intent, java.lang.String, boolean, int, boolean, int, int, java.lang.String, android.os.Bundle)(ExtraActivityManagerService.java:213) waiters=8 in boolean com.android.server.am.ActivityManagerService.unbindService(android.app.IServiceConnection) for 1420.999s

程式碼在art/runtime/monitor.cc 的void Monitor::Lock(Thread* self) 函式裡面 ,主要就是嘗試獲取鎖獲取不到的時候列印,如果開了ATRACE,也會有相應的trace資訊

通過std::string Monitor::PrettyContentionInfo(const std::string& owner_name,
pid_t owner_tid,
ArtMethod* owners_method,
uint32_t owners_dex_pc,
size_t num_waiters) 函式列印

std::ostringstream oss;
oss << “monitor contention with owner ” << owner_name << ” (” << owner_tid << “)”;
if (owners_method != nullptr) {
oss << ” at ” << PrettyMethod(owners_method);
oss << “(” << owners_filename << “:” << owners_line_number << “)”;
}
oss << ” waiters=” << num_waiters;
該log在獲取鎖並且超過閥值後列印,閥值使用-Xlockprofthreshold:_設定,log中引數的意義如下

W art :monitor contention with owner ( {tid}) a t {持鎖的方法}( : {行數}) waiters= i n {等鎖的方法} for ${等鎖的時間}

10 miui的一些監控資訊 (需要深入分析)
1 MIUI-BLOCK-MONITOR: 監控一些呼叫卡頓的資訊

2 12-19 02:42:26.460 2753 2753 I Timeline: Timeline: Activity_launch_request time:33148551 intent:Intent { act=com.mi.android.globalpersonalassistant.CommonSettingDetailActivity flg=0x10008000 (has extras) } 應該是Activity從啟動到可見過程的timeline 需要具體分析

如何提升ANR問題分析效率 參考這篇文章

11 原生binder監控機制 目前d2a上是關閉的
關鍵字:binder_sample 在小米手機java層的binder呼叫超過300ms必打這個log

原理 frameworks/base/core/jni/android_util_Binder.cpp:conditionally_log_binder_call 函式列印event log

binder_sample: [android.hardware.input.IInputManager,5,30002,com.android.commands.monkey,100]

binder_sample:[binder的desc,transaction code,耗時,程序名,相對於閥值的百分比]

12 Binder log分析
binder如何分析binder執行緒池耗盡? 程序在開啟binder驅動的時候會設定該程序最大的binder執行緒個數

result = ioctl(fd, BINDER_SET_MAX_THREADS, &maxThreads);
這裡在7.0的手機上預設的個數是15個. 15並不代表該執行緒所能建立的binder執行緒的最大個數.

只是代表由binder驅動被動請求建立執行緒的個數,當我們檢視binder執行緒池是否耗盡的時候要區分執行緒是被動建立的還是主動建立的.

分析log的時候搜尋關鍵字requested threads 就能找到被動建立的binder執行緒數量

—— 0.044s was the duration of ‘BINDER TRANSACTIONS’ ——
—— BINDER STATS (/sys/kernel/debug/binder/stats) ——

proc 12720
context binder
threads: 4 程序包含的binder執行緒個數
requested threads: 0+1/15 是否正在請求建立binder執行緒中+由binder驅動被動請求建立的binder個數/最大可以被動建立的binder執行緒個數
ready threads 2 空閒的執行緒個數
free async space 520192
nodes: 7 binder服務的個數
refs: 20 s 20 w 20 請求服務的個數 強引用個數 弱引用個數
buffers: 1
pages: 1:2:251 活躍的:分配未使用的:未分配的
pending transactions: 0 還沒有執行的工作項
BC_TRANSACTION: 60 發起的BC_TRANSACTION請求次數
BC_FREE_BUFFER: 62 請求釋放buffer次數
BC_INCREFS: 20 請求增加強引用
BC_ACQUIRE: 20
BC_INCREFS_DONE: 7
BC_ACQUIRE_DONE: 7
BC_REGISTER_LOOPER: 1 被動準備的執行緒 進入就緒狀態的次數
BC_ENTER_LOOPER: 1 主動
BR_TRANSACTION: 3 收到的BR_TRANSACTION請求
BR_REPLY: 60 收到的結果(入過每次BC_TRANSACTION都有返回的話是一致的)
BR_TRANSACTION_COMPLETE: 60 TRANSACTION被驅動接收到的次數
BR_INCREFS: 7
BR_ACQUIRE: 7
BR_SPAWN_LOOPER: 1 請求建立binder執行緒次數

13 跳幀

當生產者的速度超過消費者,在開發術語中稱作背壓,通過systrace可以很容易看到這種情況,另外在使用進行合成的程式碼中也會有如下log輸出

01-18 20:31:57.285 572 572 D SurfaceFlinger: Backpressure trigger, skipping transaction & refresh!
01-18 20:31:57.352 572 572 D SurfaceFlinger: Backpressure trigger, skipping transaction & refresh!
01-18 20:31:58.705 572 572 D SurfaceFlinger: Backpressure trigger, skipping transaction & refresh!

在systrace中的表現如下,在Surfaceflinger的ui執行緒中應該執行INVALIDATE的地方什麼都沒有做就返回了.說明有跳幀

這裡寫圖片描述

如下面這種情況在INVALIDATE的時候執行了hanleMessageInvalidate而沒有執行頁面翻轉是正常的,說明這段時間沒有提交事務

這裡寫圖片描述

這個值在hwc2 的裝置中生效

14 Choreographer的跳幀資訊

TAG, “Skipped ” + skippedFrames + ” frames! ”
+ “The application may be doing too much work on its main thread.”) 這裡是收到vsync和處理這個vsync的時間太長超過大概30ms就會列印,說明應用主執行緒比較忙,TAG是Choreographer

Log.w(TAG, “Already have a pending vsync event. There should only be ”
+ “one at a time.”);
還有個vsync沒有處理又新來了一個vsync

15 cpu使用資訊 在trace中

state=S schedstat=( 116304198556 13649702167 189147 ) utm=9873 stm=1757 core=6 HZ=100

schedstat=( 116304198556 13649702167 189147 ) 中的三個值分別代表 :

1 累計執行的物理時間時間

2 /累計在就緒佇列裡的等待時間

3主動切換和被動切換的累計次數

utm/stm: 使用者態/核心態的CPU時間(單位是jiffies)

core=6 最後執行的核心

HZ=100 頻率 1/HZ - jiffies

17 SurfaceFlinger 統計資訊

EventLog中

11-23 21:44:37.631 734 734 I sf_frame_dur: [naukriApp.appModules.login/com.naukri.jobsforyou.RecommendedJobsContainer,76,0,1,0,0,0,1]
11-23 21:44:37.631 734 734 I sf_frame_dur: [com.facebook.katana/com.facebook.katana.LoginActivity,1371,1387,40,25,12,5,3]
11-23 21:44:37.631 734 734 I sf_frame_dur: [com.facebook.katana/com.facebook.katana.LoginActivity,528,25,3,2,2,1,0]
11-23 21:44:37.631 734 734 I sf_frame_dur: [com.android.contacts/com.android.contacts.activities.TwelveKeyDialer,4,2,1,3,1,0,0]
11-23 21:44:37.631 734 734 I sf_frame_dur: [com.facebook.orca/com.facebook.orca.auth.StartScreenActivity,283,11,3,4,4,3,1]
11-23 21:44:37.631 734 734 I sf_frame_dur: [com.miui.gallery/com.miui.gallery.activity.HomePageActivity,225,6,8,5,7,4,8]
11-23 21:44:37.631 734 734 I sf_frame_dur: [com.android.settings/com.android.settings.MainSettings,16,0,0,0,0,0,1]
11-23 21:44:37.631 734 734 I sf_frame_dur: [ [com.android.email/com.android.email.activity.Welcome,344,143,9,2,2,2,8]
11-23 21:44:37.631 734 734 I sf_frame_dur: [StatusBar,282,11,13,101,17,55,53]
11-23 21:44:37.631 734 734 I sf_frame_dur: [NavigationBar,302,6,0,2,1,3,12]
11-23 21:44:37.631 734 734 I sf_frame_dur: [RoundCorner,0,0,0,0,0,0,8]
11-23 21:44:37.631 734 734 I sf_frame_dur: [RoundCorner,0,0,0,0,0,0,8]
11-23 21:44:37.632 734 734 I sf_frame_dur: [,9625,532,191,164,127,161,521]com.tencent.mm/com.tencent.mm.ui.LauncherUI,16,1,1,1,1,1,0]

定義

dur代表上幀被合成到當前幀被合成的時間.

DisplayPeriod= (1/fps)
durPeriod= (dur+DisplayPeriod/2)/DisplayPeriod

如上SF中layer被銷燬後會列印該layer的一些統計資訊 log的格式為 [window名稱,<2durPeriod,(2-3)durPeriod,(4-7)durPeriod,(8-15)durPeriod,(16-31)durPeriod,(23-63)durPeriod,>128durPeriod]

該log雖然可以列印延遲,但是由於有時候介面並無變化,兩幀之間差值太大並無什麼意義

18 binder 執行緒池耗盡 執行緒池耗盡很容易引起anr

11-23 19:29:57.905 3505 9248 E IPCThreadState: binder thread pool (15 threads) starved for 1070 ms

19 Looper訊息處理時間過長

Dispatch took 144ms on android.ui, h=Handler (android.view.Choreographer$FrameHandler) {a32d925} cb=android.view.Choreographe

public void setSlowDispatchThresholdMs(long slowDispatchThresholdMs) {
mSlowDispatchThresholdMs = slowDispatchThresholdMs;
}
設定looper的一個閥值,超過則列印該system log

20 AMS中會列印如下log在關鍵時刻

Slow operation:
private void checkTime(long startTime, String where) {
long now = SystemClock.uptimeMillis();
if ((now-startTime) > 50) {
// If we are taking more than 50ms, log about it.
Slog.w(TAG, “Slow operation: ” + (now-startTime) + “ms so far, now at ” + where);
}
}
20 Window 處理input時間太長

02-26 17:27:29.849 2583 3171 I InputDispatcher: Window ‘Window{8fd109b u0 InputMethod}

’ spent 3238.7ms processing the last input event: MotionEvent(deviceId=9, source=0x00001002, action=1, actionButton=0x00000000, flags=0x00000000, metaState=0x00000000, buttonState=0x00000000, edgeFlags=0x00000000, xPrecision=1.0, yPrecision=1.0, displayId=0, pointers=[0: (183.8, 1685.1)]), policyFlags=0x62000000

21 WtBlockMonitor: Package name :com.google.android.apps.maps MTK平臺特有

03-19 14:23:46.538038 1702 1702 W WtBlockMonitor: Package name :com.google.android.apps.maps
03-19 14:23:46.538038 1702 1702 W WtBlockMonitor: The Message { when= what=114 obj=CreateServiceData{[email protected] className=com.google.android.apps.gmm.offline.update.OfflineAutoUpdateJobService packageName=com.google.android.apps.maps intent=null} target=android.app.ActivityThread$H } took 9995 ms

03-24 12:56:29.097 1000 2433 2473 I am_activity_launch_time: [0,184420527,com.google.android.dialer/com.google.android.apps.dialer.extensions.GoogleDialtactsActivity,738,738]