This article analyzes how Android handles application crashes, based on the Android 6.0 source code.
/frameworks/base/core/java/com/android/internal/os/RuntimeInit.java
/frameworks/base/core/java/android/app/ActivityManagerNative.java (including the inner class AMP)
/frameworks/base/core/java/android/app/ApplicationErrorReport.java
/frameworks/base/services/core/java/com/android/server/
- am/ActivityManagerService.java
- am/ProcessRecord.java
- am/ActivityRecord.java
- am/ActivityStackSupervisor.java
- am/ActivityStack.java
- am/BroadcastQueue.java
- wm/WindowManagerService.java
/libcore/libart/src/main/java/java/lang/Thread.java
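I. Overview
An app crash is short for application crash. Crashes can be divided into native crashes and framework crashes (the latter including app crashes). Most app developers have run into crashes, so when does a crash occur in the upper layers, and how does the system handle it? For example, app code commonly uses try...catch statements; if an exception is not effectively caught, the application crashes. Whenever an exception goes uncaught, the system catches it and enters the crash flow.
As described in the Android system startup series, all upper-layer applications are forked from Zygote, and are divided into the system_server process and the various app processes. When each of these processes is created, an uncaught-exception handler is installed; when an uncaught exception is thrown, it is ultimately handed to that handler.
- For the system_server process (see the article "Android System Startup: SystemServer, Part 1"): during system_server startup, the commonInit method of RuntimeInit.java installs an UncaughtHandler to handle uncaught exceptions;
- For ordinary app processes (see the article "Understanding the Android Process Creation Flow"): during process creation, the commonInit method of RuntimeInit.java is likewise called to install an UncaughtHandler.
1.1 Crash call chain
The method call relationships of the crash flow are as follows: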
AMP.handleApplicationCrash
    AMS.handleApplicationCrash
        AMS.findAppProcess
        AMS.handleApplicationCrashInner
            AMS.addErrorToDropBox
            AMS.crashApplication
                AMS.makeAppCrashingLocked
                    AMS.startAppProblemLocked
                    ProcessRecord.stopFreezingAllLocked
                        ActivityRecord.stopFreezingScreenLocked
                            WMS.stopFreezingScreenLocked
                                WMS.stopFreezingDisplayLocked
                    AMS.handleAppCrashLocked
                mUiHandler.sendMessage(SHOW_ERROR_MSG)
Process.killProcess(Process.myPid());
System.exit(10);
The rest of this article walks through this process.
II. Crash handling flow
The walkthrough starts from the commonInit() method.
1. RuntimeInit.commonInit
public class RuntimeInit {
    ...
    private static final void commonInit() {
        // Install the default uncaught-exception handler; for UncaughtHandler see Section 2
        Thread.setDefaultUncaughtExceptionHandler(new UncaughtHandler());
        ...
    }
}
setDefaultUncaughtExceptionHandler() simply assigns the handler object to a static member of Thread, i.e. Thread.defaultUncaughtHandler = new UncaughtHandler(). Next, look at the UncaughtHandler object itself.
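The effect of a process-wide default handler can be seen in plain Java, outside Android. The following is a minimal sketch, not the framework code: the class name DefaultHandlerDemo and the thread name are made up for illustration, and the handler only records what it saw instead of reporting a crash.

```java
import java.util.concurrent.atomic.AtomicReference;

final class DefaultHandlerDemo {
    /** Runs a thread that throws, and returns what the default handler observed. */
    static String runAndCapture() {
        final AtomicReference<String> seen = new AtomicReference<>();
        Thread.UncaughtExceptionHandler previous = Thread.getDefaultUncaughtExceptionHandler();
        // Install a process-wide default handler, in the same way commonInit() installs UncaughtHandler.
        Thread.setDefaultUncaughtExceptionHandler(
                (thread, error) -> seen.set(thread.getName() + ": " + error.getMessage()));
        try {
            Thread worker = new Thread(() -> {
                throw new IllegalStateException("boom"); // nobody catches this
            }, "worker-1");
            worker.start();
            worker.join(); // the handler runs on the dying thread before join() returns
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        } finally {
            Thread.setDefaultUncaughtExceptionHandler(previous); // restore the old handler
        }
        return seen.get();
    }

    public static void main(String[] args) {
        System.out.println(runAndCapture());
    }
}
```

Because the handler is stored in a static field of Thread, it applies to every thread in the process that has no handler of its own, which is why a single call in commonInit() is enough.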
2. UncaughtHandler
[-> RuntimeInit.java]
private static class UncaughtHandler implements Thread.UncaughtExceptionHandler {
    // Overrides the interface method
    public void uncaughtException(Thread t, Throwable e) {
        try {
            // Make sure crash handling is never re-entered
            if (mCrashing) return;
            mCrashing = true;
            if (mApplicationObject == null) {
                // system_server process
                Clog_e(TAG, "*** FATAL EXCEPTION IN SYSTEM PROCESS: " + t.getName(), e);
            } else {
                // Ordinary app process
                StringBuilder message = new StringBuilder();
                message.append("FATAL EXCEPTION: ").append(t.getName()).append("\n");
                final String processName = ActivityThread.currentProcessName();
                if (processName != null) {
                    message.append("Process: ").append(processName).append(", ");
                }
                message.append("PID: ").append(Process.myPid());
                Clog_e(TAG, message.toString(), e);
            }
            // Bring up the crash dialog and wait until it is handled [see Sections 2.1 and 3]
            ActivityManagerNative.getDefault().handleApplicationCrash(
                    mApplicationObject, new ApplicationErrorReport.CrashInfo(e));
        } catch (Throwable t2) {
            ...
        } finally {
            // Make sure the current process is killed for good [see Section 11]
            Process.killProcess(Process.myPid());
            System.exit(10);
        }
    }
}
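- When the system process crashes, the log:
  - begins with *** FATAL EXCEPTION IN SYSTEM PROCESS: [thread name];
  - then prints the call stack at the time of the crash.
- When an app process crashes, the log:
  - begins with FATAL EXCEPTION: [thread name];
  - continues with Process: [process name], PID: [process id];
  - and ends with the call stack at the time of the crash.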
This means that to find crash information in a log, it is enough to search for the keyword FATAL EXCEPTION. To narrow the search to system crashes only, several keywords work, for example *** FATAL EXCEPTION.
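The keyword search described above can be sketched as a small filter. This is an illustrative helper rather than an Android tool: the class name CrashLogFilter is hypothetical, and the sample log lines in main are invented for the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class CrashLogFilter {
    static final String ANY_CRASH = "FATAL EXCEPTION";
    static final String SYSTEM_CRASH = "*** FATAL EXCEPTION IN SYSTEM PROCESS";

    /** Returns the lines that contain the given keyword, in their original order. */
    static List<String> grep(List<String> lines, String keyword) {
        List<String> hits = new ArrayList<>();
        for (String line : lines) {
            if (line.contains(keyword)) {
                hits.add(line);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        // Invented sample lines, shaped like the messages built in uncaughtException()
        List<String> log = Arrays.asList(
                "E AndroidRuntime: FATAL EXCEPTION: main",
                "E AndroidRuntime: *** FATAL EXCEPTION IN SYSTEM PROCESS: android.ui",
                "I ActivityManager: some unrelated line");
        System.out.println(grep(log, ANY_CRASH).size());
        System.out.println(grep(log, SYSTEM_CRASH).size());
    }
}
```

The app-crash keyword is a substring of the system-crash header, so the broader search returns both kinds of crash.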
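Writing the crash information to logcat is only the beginning of the crash flow; the next step is to show the crash dialog. ActivityManagerNative.getDefault() returns an ActivityManagerProxy (AMP for short). Through a binder call, AMP ultimately hands the work to the corresponding method in ActivityManagerService (AMS for short), so the next call is AMS.handleApplicationCrash().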
2.1 CrashInfo
[-> ApplicationErrorReport.java]
public class ApplicationErrorReport implements Parcelable {
    ...
    public static class CrashInfo {
        public CrashInfo(Throwable tr) {
            StringWriter sw = new StringWriter();
            PrintWriter pw = new FastPrintWriter(sw, false, 256);
            tr.printStackTrace(pw); // Write out the stack trace
            pw.flush();
            stackTrace = sw.toString();
            exceptionMessage = tr.getMessage();
            // Walk the cause chain to find the root cause and its message
            Throwable rootTr = tr;
            while (tr.getCause() != null) {
                tr = tr.getCause();
                if (tr.getStackTrace() != null && tr.getStackTrace().length > 0) {
                    rootTr = tr;
                }
                String msg = tr.getMessage();
                if (msg != null && msg.length() > 0) {
                    exceptionMessage = msg;
                }
            }
            exceptionClassName = rootTr.getClass().getName();
            if (rootTr.getStackTrace().length > 0) {
                StackTraceElement trace = rootTr.getStackTrace()[0];
                throwFileName = trace.getFileName();
                throwClassName = trace.getClassName();
                throwMethodName = trace.getMethodName();
                throwLineNumber = trace.getLineNumber();
            } else {
                throwFileName = "unknown";
                throwClassName = "unknown";
                throwMethodName = "unknown";
                throwLineNumber = 0;
            }
        }
        ...
    }
}
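The file name, class name, method name, and line number where the crash was thrown, together with the exception message, are all packaged into the CrashInfo object.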
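The cause-chain walk in the CrashInfo constructor can be tried in isolation. The sketch below copies only that loop into two plain-Java helpers; the class name RootCauseDemo and the sample exceptions are illustrative and not part of the framework.

```java
import java.io.IOException;

final class RootCauseDemo {
    /** Same loop as in CrashInfo: the deepest cause that carries a stack trace. */
    static Throwable rootWithTrace(Throwable tr) {
        Throwable root = tr;
        while (tr.getCause() != null) {
            tr = tr.getCause();
            if (tr.getStackTrace() != null && tr.getStackTrace().length > 0) {
                root = tr;
            }
        }
        return root;
    }

    /** Same rule as for exceptionMessage: the last non-empty message along the chain. */
    static String deepestMessage(Throwable tr) {
        String message = tr.getMessage();
        while (tr.getCause() != null) {
            tr = tr.getCause();
            String msg = tr.getMessage();
            if (msg != null && msg.length() > 0) {
                message = msg;
            }
        }
        return message;
    }

    public static void main(String[] args) {
        Throwable inner = new IOException("disk full");
        Throwable outer = new RuntimeException("wrapper", inner);
        System.out.println(rootWithTrace(outer).getClass().getName());
        System.out.println(deepestMessage(outer));
    }
}
```

The consequence is that the class name and throw location reported for a crash describe the innermost cause, not the outermost wrapper exception.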
3. handleApplicationCrash
[-> ActivityManagerService.java]
public void handleApplicationCrash(IBinder app, ApplicationErrorReport.CrashInfo crashInfo) {
    // Look up the process record [see Section 3.1]
    ProcessRecord r = findAppProcess(app, "Crash");
    final String processName = app == null ? "system_server"
            : (r == null ? "unknown" : r.processName);
    // [see Section 4]
    handleApplicationCrashInner("crash", r, processName, crashInfo);
}
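About the process name (processName):
- when the remote IBinder object is null, the process name is system_server;
- when the remote IBinder object is not null but the ProcessRecord is null, the process name is unknown;
- when neither the remote IBinder object nor the ProcessRecord is null, the process name is the one stored in the ProcessRecord.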
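The three cases can be restated as one small function. This is only a restatement of the ternary expression in handleApplicationCrash; the class name ProcessNames is hypothetical, and a plain Object stands in for the IBinder token.

```java
final class ProcessNames {
    /**
     * token stands in for the caller's IBinder; recordName is the name stored in the
     * ProcessRecord, or null when no record was found.
     */
    static String resolve(Object token, String recordName) {
        if (token == null) {
            return "system_server"; // no application object: the crash is in system_server
        }
        return recordName == null ? "unknown" : recordName;
    }

    public static void main(String[] args) {
        System.out.println(resolve(null, null));
        System.out.println(resolve(new Object(), null));
        System.out.println(resolve(new Object(), "com.example.app"));
    }
}
```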
3.1 findAppProcess
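[-> ActivityManagerService.java]
private ProcessRecord findAppProcess(IBinder app, String reason) {
    if (app == null) {
        return null;
    }
    synchronized (this) {
        final int NP = mProcessNames.getMap().size();
        for (int ip=0; ip<NP; ip++) {
            SparseArray<ProcessRecord> apps = mProcessNames.getMap().valueAt(ip);
            final int NA = apps.size();
            for (int ia=0; ia<NA; ia++) {
                ProcessRecord p = apps.valueAt(ia);
                // Return as soon as the target process is found
                if (p.thread != null && p.thread.asBinder() == app) {
                    return p;
                }
            }
        }
        // Reaching this point means the process of the application could not be found
        ...
        return null;
    }
}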
Here mProcessNames = new ProcessMap<ProcessRecord>();
The call mProcessNames.getMap() returns mMap, where mMap = new ArrayMap<String, SparseArray<ProcessRecord>>();
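Further reading: SparseArray and ArrayMap are data structures that Android designed specifically for memory efficiency, as replacements for HashMap from the Java API. When the key is an int, use SparseArray, which avoids auto-boxing; for other key types, use ArrayMap. HashMap achieves O(1) lookup and insertion at the cost of considerable memory, whereas SparseArray and ArrayMap, which look keys up by binary search over arrays, are somewhat slower than HashMap but use less memory.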
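To make the trade-off concrete, here is a minimal sketch of the idea behind an int-keyed map: keys live in a sorted int[] and are found by binary search, so no Integer objects or per-entry nodes are allocated. It is a simplified illustration with a hypothetical name (IntKeyMap), not the actual SparseArray implementation, and it omits removal.

```java
import java.util.Arrays;

final class IntKeyMap<V> {
    private int[] keys = new int[4];          // kept sorted in ascending order
    private Object[] values = new Object[4];  // values[i] belongs to keys[i]
    private int size;

    @SuppressWarnings("unchecked")
    V get(int key) {
        int i = Arrays.binarySearch(keys, 0, size, key); // O(log n), no boxing
        return i >= 0 ? (V) values[i] : null;
    }

    void put(int key, V value) {
        int i = Arrays.binarySearch(keys, 0, size, key);
        if (i >= 0) {            // key already present: overwrite
            values[i] = value;
            return;
        }
        int at = -(i + 1);       // insertion point that keeps the keys sorted
        if (size == keys.length) {
            keys = Arrays.copyOf(keys, size * 2);
            values = Arrays.copyOf(values, size * 2);
        }
        // Shift the tail right by one slot; an insert is O(n) in the worst case
        System.arraycopy(keys, at, keys, at + 1, size - at);
        System.arraycopy(values, at, values, at + 1, size - at);
        keys[at] = key;
        values[at] = value;
        size++;
    }

    int size() {
        return size;
    }
}
```

The two parallel arrays are the whole memory footprint, which is the saving the paragraph above refers to; the price is the logarithmic lookup and the array shifting on insert.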
Back to mMap: it is keyed by process name, and each value is itself a structure keyed by uid with a ProcessRecord as its value. Its get() and put() methods are shown below.
// Returns the ProcessRecord stored in mMap for (name, uid)
public ProcessRecord get(String name, int uid) {};
// Adds (name, uid, value) to mMap
public ProcessRecord put(String name, int uid, ProcessRecord value) {};
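The name-then-uid lookup can be sketched with ordinary collections. The class below is an illustration under stated assumptions: TwoLevelMap is a hypothetical name, it uses java.util.HashMap where the framework uses ArrayMap and SparseArray, and the remove method is added for completeness.

```java
import java.util.HashMap;
import java.util.Map;

final class TwoLevelMap<E> {
    // name -> (uid -> value)
    private final Map<String, Map<Integer, E>> map = new HashMap<>();

    /** Returns the value stored for (name, uid), or null if there is none. */
    E get(String name, int uid) {
        Map<Integer, E> uids = map.get(name);
        return uids == null ? null : uids.get(uid);
    }

    /** Stores value under (name, uid), creating the inner map on first use. */
    E put(String name, int uid, E value) {
        map.computeIfAbsent(name, k -> new HashMap<>()).put(uid, value);
        return value;
    }

    /** Removes (name, uid) and drops the inner map once it becomes empty. */
    E remove(String name, int uid) {
        Map<Integer, E> uids = map.get(name);
        if (uids == null) {
            return null;
        }
        E old = uids.remove(uid);
        if (uids.isEmpty()) {
            map.remove(name);
        }
        return old;
    }
}
```

Keying by both name and uid lets two users run a process with the same name without their records colliding.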
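findAppProcess() uses app (of type IBinder) to look up the matching ProcessRecord.
With the process record (ProcessRecord) and the process name (processName) in hand, execution moves on to the crash handling method.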
4. handleApplicationCrashInner
[-> ActivityManagerService.java]
void handleApplicationCrashInner(String eventType, ProcessRecord r, String processName,
        ApplicationErrorReport.CrashInfo crashInfo) {
    // Write the crash information to the event log
    EventLog.writeEvent(EventLogTags.AM_CRASH,...);
    // Add the error information to DropBox
    addErrorToDropBox(eventType, r, processName, null, null, null, null, null, crashInfo);
    // [see Section 5]
    crashApplication(r, crashInfo);
}
addErrorToDropBox writes the crash information to the directory /data/system/dropbox. For example, the dropbox file for system_server is named system_server_crash@xxx.txt (where xxx is a timestamp).
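As a rough illustration of that naming scheme, the helper below builds a file name from its parts. The pattern "processClass_eventType@timestamp.txt" is an assumption extrapolated from the single example above, and the class and parameter names are hypothetical.

```java
final class DropBoxNames {
    /** Builds "<processClass>_<eventType>@<timestampMillis>.txt" (assumed pattern). */
    static String fileName(String processClass, String eventType, long timestampMillis) {
        return processClass + "_" + eventType + "@" + timestampMillis + ".txt";
    }

    public static void main(String[] args) {
        System.out.println(fileName("system_server", "crash", System.currentTimeMillis()));
    }
}
```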
5. crashApplication
[-> ActivityManagerService.java]
private void crashApplication(ProcessRecord r, ApplicationErrorReport.CrashInfo crashInfo) {
    long timeMillis = System.currentTimeMillis();
    String shortMsg = crashInfo.exceptionClassName;
    String longMsg = crashInfo.exceptionMessage;
    String stackTrace = crashInfo.stackTrace;
    if (shortMsg != null && longMsg != null) {
        longMsg = shortMsg + ": " + longMsg;
    } else if (shortMsg != null) {
        longMsg = shortMsg;
    }
    AppErrorResult result = new AppErrorResult();
    synchronized (this) {
        // Clear the remote caller's uid and pid, saving them in origId
        final long origId = Binder.clearCallingIdentity();
        ...
        // [see Section 6]
        if (r == null || !makeAppCrashingLocked(r, shortMsg, longMsg, stackTrace)) {
            Binder.restoreCallingIdentity(origId);
            return;
        }
        Message msg = Message.obtain();
        msg.what = SHOW_ERROR_MSG;
        HashMap<String, Object> data = new HashMap<String, Object>();
        data.put("result", result);
        data.put("app", r);
        msg.obj = data;
        // Send SHOW_ERROR_MSG to show the crash dialog and wait for the user's choice [see Section 10]
        mUiHandler.sendMessage(msg);
        // Restore the remote caller's uid and pid
        Binder.restoreCallingIdentity(origId);
    }
    // Block until the user chooses "exit" or "exit and report" in the crash dialog
    int res = result.get();
    Intent appErrorIntent = null;
    synchronized (this) {
        if (r != null && !r.isolated) {
            // Record the crash time of the process in mProcessCrashTimes
            mProcessCrashTimes.put(r.info.processName, r.uid,
                    SystemClock.uptimeMillis());
        }
        if (res == AppErrorDialog.FORCE_QUIT_AND_REPORT) {
            // Create an Intent with action "android.intent.action.APP_ERROR" and component r.errorReportReceiver
            appErrorIntent = createAppErrorIntentLocked(r, timeMillis, crashInfo);
        }
    }
    if (appErrorIntent != null) {
        try {
            // Start the Activity that matches appErrorIntent
            mContext.startActivityAsUser(appErrorIntent, new UserHandle(r.userId));
        } catch (ActivityNotFoundException e) {
            Slog.w(TAG, "bug report receiver dissappeared", e);
        }
    }
}
This method mainly does two things:
- it calls makeAppCrashingLocked to continue the crash flow;
- it sends the message SHOW_ERROR_MSG to show the crash dialog, then waits for the user's choice.
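The "send a message, then block on result.get()" handshake can be sketched with wait/notifyAll. The class below is an assumption about how a holder such as AppErrorResult might behave, written from the way crashApplication uses it; the name BlockingResult is hypothetical and this is not the framework class.

```java
final class BlockingResult {
    private boolean hasResult;
    private int result;

    /** Called by the thread that knows the answer, for example the UI thread showing the dialog. */
    synchronized void set(int value) {
        hasResult = true;
        result = value;
        notifyAll(); // wake every thread blocked in get()
    }

    /** Blocks the calling thread until set() has been called, then returns the value. */
    synchronized int get() {
        boolean interrupted = false;
        while (!hasResult) {   // loop guards against spurious wakeups
            try {
                wait();
            } catch (InterruptedException e) {
                interrupted = true; // keep waiting, remember the interrupt
            }
        }
        if (interrupted) {
            Thread.currentThread().interrupt();
        }
        return result;
    }

    public static void main(String[] args) {
        BlockingResult choice = new BlockingResult();
        new Thread(() -> choice.set(1)).start(); // stands in for the user's dialog choice
        System.out.println(choice.get());
    }
}
```

In crashApplication the wait happens outside the synchronized (this) block, so the service lock is not held while the binder thread is parked waiting for the dialog.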
6. makeAppCrashingLocked
[-> ActivityManagerService.java]
private boolean makeAppCrashingLocked(ProcessRecord app,
        String shortMsg, String longMsg, String stackTrace) {
    app.crashing = true;
    // Package the crash information into the crashingReport object
    app.crashingReport = generateProcessError(app,
            ActivityManager.ProcessErrorStateInfo.CRASHED, null, shortMsg, longMsg, stackTrace);
    // [see Section 7]
    startAppProblemLocked(app);
    // Stop freezing the screen [see Section 8]
    app.stopFreezingAllLocked();
    // [see Section 9]
    return handleAppCrashLocked(app, "force-crash", shortMsg, longMsg, stackTrace);
}
7. startAppProblemLocked
[-> ActivityManagerService.java]
void startAppProblemLocked(ProcessRecord app) {
    app.errorReportReceiver = null;
    for (int userId : mCurrentProfileIds) {
        if (app.userId == userId) {
            // Get the error receiver of the crashed app for the current user [see Section 7.1]
            app.errorReportReceiver = ApplicationErrorReport.getErrorReportReceiver(
                    mContext, app.info.packageName, app.info.flags);
        }
    }
    // Skip the broadcast the app is currently receiving [see Section 7.2]
    skipCurrentReceiverLocked(app);
}
This method mainly:
- gets the error receiver of the crashed app for the current user;
- skips the broadcast that the app is currently receiving.
7.1 getErrorReportReceiver
[-> ApplicationErrorReport.java]
public static ComponentName getErrorReportReceiver(Context context,
        String packageName, int appFlags) {
    // Check whether error reporting is enabled by "send_action_app_error" in Settings
    int enabled = Settings.Global.getInt(context.getContentResolver(),
            Settings.Global.SEND_ACTION_APP_ERROR, 0);
    if (enabled == 0) {
        // 1. Error reporting is disabled: return immediately
        return null;
    }
    PackageManager pm = context.getPackageManager();
    String candidate = null;
    ComponentName result = null;
    try {
        // Get the package name of the installer of the crashed app
        candidate = pm.getInstallerPackageName(packageName);
    } catch (IllegalArgumentException e) {
    }
    if (candidate != null) {
        result = getErrorReportReceiver(pm, packageName, candidate); // [see below]
        if (result != null) {
            // 2. The installer of the crashed app provides a receiver: return it
            return result;
        }
    }
    if ((appFlags&ApplicationInfo.FLAG_SYSTEM) != 0) {
        // This system property is named "ro.error.receiver.system.apps"
        candidate = SystemProperties.get(SYSTEM_APPS_ERROR_RECEIVER_PROPERTY);
        result = getErrorReportReceiver(pm, packageName, candidate); // [see below]
        if (result != null) {
            // 3. The crashed app is a system app and the property names an error receiver: return it
            return result;
        }
    }
    // This default property is named "ro.error.receiver.default"
    candidate = SystemProperties.get(DEFAULT_ERROR_RECEIVER_PROPERTY);
    // 4. Return the error receiver named by the default property, if any
    return getErrorReportReceiver(pm, packageName, candidate); // [see below]
}
getErrorReportReceiver: this is an overload with the same name but different parameters:
static ComponentName getErrorReportReceiver(PackageManager pm, String errorPackage,
        String receiverPackage) {
    if (receiverPackage == null || receiverPackage.length() == 0) {
        return null;
    }
    // If the crashed package is itself the receiver (for example the installer crashed), return null
    if (receiverPackage.equals(errorPackage)) {
        return null;
    }
    // ACTION_APP_ERROR has the value "android.intent.action.APP_ERROR"
    Intent intent = new Intent(Intent.ACTION_APP_ERROR);
    intent.setPackage(receiverPackage);
    ResolveInfo info = pm.resolveActivity(intent, 0);
    if (info == null || info.activityInfo == null) {
        return null;
    }
    // Create a component whose package name is receiverPackage
    return new ComponentName(receiverPackage, info.activityInfo.name);
}
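The candidate order in the two overloads (installer first, then the system-apps property, then the default property) can be sketched without any Android classes. Everything below is illustrative: ReceiverChooser, simpleResolve and the ".ErrorActivity" component name are hypothetical, the settings check is left out, and a BiFunction stands in for the three-argument overload.

```java
import java.util.function.BiFunction;

final class ReceiverChooser {
    /**
     * resolve(errorPackage, candidate) stands in for the three-argument overload and
     * returns null when the candidate cannot handle the report.
     */
    static String choose(String errorPackage, String installer, boolean isSystemApp,
            String systemAppsProp, String defaultProp,
            BiFunction<String, String, String> resolve) {
        if (installer != null) {
            String r = resolve.apply(errorPackage, installer);
            if (r != null) {
                return r; // the installer handles the report
            }
        }
        if (isSystemApp) {
            String r = resolve.apply(errorPackage, systemAppsProp);
            if (r != null) {
                return r; // receiver configured for system apps
            }
        }
        return resolve.apply(errorPackage, defaultProp); // default receiver, may be null
    }

    /** Stand-in resolver: rejects empty candidates and the crashing package itself. */
    static String simpleResolve(String errorPackage, String candidate) {
        if (candidate == null || candidate.isEmpty()) {
            return null;
        }
        if (candidate.equals(errorPackage)) {
            return null; // a package never receives its own crash report
        }
        return candidate + "/.ErrorActivity"; // hypothetical component name
    }
}
```

For example, choose("com.store", "com.store", true, "com.sys", "com.default", ReceiverChooser::simpleResolve) skips the installer, because it is the crashing package, and falls through to the system-apps candidate.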
7.2 skipCurrentReceiverLocked
[-> ActivityManagerService.java]
void skipCurrentReceiverLocked(ProcessRecord app) {
    for (BroadcastQueue queue : mBroadcastQueues) {
        queue.skipCurrentReceiverLocked(app); // [see Section 7.2.1]
    }
}
7.2.1 skipCurrentReceiverLocked
[-> BroadcastQueue.java]
public void skipCurrentReceiverLocked(ProcessRecord app) {
    BroadcastRecord r = null;
    // Look for a broadcast currently being delivered to the app process
    if (mOrderedBroadcasts.size() > 0) {
        BroadcastRecord br = mOrderedBroadcasts.get(0);
        if (br.curApp == app) {
            r = br;
        }
    }
    if (r == null && mPendingBroadcast != null && mPendingBroadcast.curApp == app) {
        r = mPendingBroadcast;
    }
    if (r != null) {
        // Finish the broadcast delivery for the app process
        finishReceiverLocked(r, r.resultCode, r.resultData,
                r.resultExtras, r.resultAbort, false);
        // Schedule the next broadcast
        scheduleBroadcastsLocked();
    }
}
8. PR.stopFreezingAllLocked
[-> ProcessRecord.java]
public void stopFreezingAllLocked() {
    int i = activities.size();
    while (i > 0) {
        i--;
        activities.get(i).stopFreezingScreenLocked(true); // [see Section 8.1]
    }
}
Here activities is of type ArrayList<ActivityRecord>; the loop stops screen freezing for every Activity in the process.
8.1. AR.stopFreezingScreenLocked
[-> ActivityRecord.java]
public void stopFreezingScreenLocked(boolean force) {
    if (force || frozenBeforeDestroy) {
        frozenBeforeDestroy = false;
        // mWindowManager is of type WMS [see Section 8.1.1]
        service.mWindowManager.stopAppFreezingScreen(appToken, force);
    }
}
Here appToken is of type IApplicationToken.Stub, i.e. the token used by the WindowManager.
8.1.1 WMS.stopFreezingScreenLocked
[-> WindowManagerService.java]
@Override
public void stopFreezingScreen() {
    // Permission check
    if (!checkCallingPermission(android.Manifest.permission.FREEZE_SCREEN,
            "stopFreezingScreen()")) {
        throw new SecurityException("Requires FREEZE_SCREEN permission");
    }
    synchronized(mWindowMap) {
        if (mClientFreezingScreen) {
            mClientFreezingScreen = false;
            mLastFinishedFreezeSource = "client";
            final long origId = Binder.clearCallingIdentity();
            try {
                stopFreezingDisplayLocked(); // [see Section 8.1.1.1]
            } finally {
                Binder.restoreCallingIdentity(origId);
            }
        }
    }
}
8.1.1.1 WMS.stopFreezingDisplayLocked
[-> WindowManagerService.java]
private void stopFreezingDisplayLocked() {
    if (!mDisplayFrozen) {
        return; // The display is not frozen: nothing to do
    }
    // Usually related to screen rotation
    ...
    mDisplayFrozen = false;
    // Total time elapsed since the display was last frozen
    mLastDisplayFreezeDuration = (int)(SystemClock.elapsedRealtime() - mDisplayFreezeTime);
    // Remove the freeze timeout messages
    mH.removeMessages(H.APP_FREEZE_TIMEOUT);
    mH.removeMessages(H.CLIENT_FREEZE_TIMEOUT);
    boolean updateRotation = false;
    // Get the default DisplayContent
    final DisplayContent displayContent = getDefaultDisplayContentLocked();
    final int displayId = displayContent.getDisplayId();
    ScreenRotationAnimation screenRotationAnimation =
            mAnimator.getScreenRotationAnimationLocked(displayId);
    // Screen rotation animation handling
    if (CUSTOM_SCREEN_ROTATION && screenRotationAnimation != null
            && screenRotationAnimation.hasScreenshot()) {
        DisplayInfo displayInfo = displayContent.getDisplayInfo();
        boolean isDimming = displayContent.isDimming();
        if (!mPolicy.validateRotationAnimationLw(mExitAnimId, mEnterAnimId, isDimming)) {
            mExitAnimId = mEnterAnimId = 0;
        }
        // The maximum animation duration is 10s
        if (screenRotationAnimation.dismiss(mFxSession, MAX_ANIMATION_DURATION,
                getTransitionAnimationScaleLocked(), displayInfo.logicalWidth,
                displayInfo.logicalHeight, mExitAnimId, mEnterAnimId)) {
            scheduleAnimationLocked();
        } else {
            screenRotationAnimation.kill();
            mAnimator.setScreenRotationAnimationLocked(displayId, null);
            updateRotation = true;
        }
    } else {
        if (screenRotationAnimation != null) {
            screenRotationAnimation.kill();
            mAnimator.setScreenRotationAnimationLocked(displayId, null);
        }
        updateRotation = true;
    }
    // Through several layers of calls this reaches InputManagerService, which re-enables input event dispatching
    mInputMonitor.thawInputDispatchingLw();
    boolean configChanged;
    // The orientation is not recomputed while the display is frozen, to avoid
    // inconsistent states, so it is re-evaluated here
    configChanged = updateOrientationFromAppTokensLocked(false);
    // Schedule a gc now that the display is no longer frozen
    mH.removeMessages(H.FORCE_GC);
    mH.sendEmptyMessageDelayed(H.FORCE_GC, 2000);
    // mScreenFrozenLock is a PowerManager.WakeLock; release the screen-freeze lock
    mScreenFrozenLock.release();
    if (updateRotation) {
        // Update the current screen orientation
        configChanged |= updateRotationUncheckedLocked(false);
    }
    if (configChanged) {
        // Send a configuration-changed message to mH
        mH.sendEmptyMessage(H.SEND_NEW_CONFIGURATION);
    }
}
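This method mainly:
- handles logic related to screen rotation;
- removes the freeze timeout messages;
- performs the screen rotation animation handling;
- re-enables input event dispatching;
- schedules a gc once the display is unfrozen;
- updates the current screen orientation;
- sends a configuration-changed message to mH.
9. AMS.handleAppCrashLocked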
[-> ActivityManagerService.java]
private boolean handleAppCrashLocked(ProcessRecord app, String reason,
        String shortMsg, String longMsg, String stackTrace) {
    long now = SystemClock.uptimeMillis();
    Long crashTime;
    if (!app.isolated) {
        crashTime = mProcessCrashTimes.get(app.info.processName, app.uid);
    } else {
        crashTime = null;
    }
    // Two crashes of the same process less than 1 minute apart count as crashing too often
    if (crashTime != null && now < crashTime+ProcessList.MIN_CRASH_INTERVAL) {
        EventLog.writeEvent(EventLogTags.AM_PROCESS_CRASHED_TOO_MUCH,
                app.userId, app.info.processName, app.uid);
        // [see Section 9.1]
        mStackSupervisor.handleAppCrashLocked(app);
        if (!app.persistent) {
            // Do not restart a non-persistent process unless the user explicitly asks for it
            EventLog.writeEvent(EventLogTags.AM_PROC_BAD, app.userId, app.uid,
                    app.info.processName);
            if (!app.isolated) {
                // Add the current app to mBadProcesses
                mBadProcesses.put(app.info.processName, app.uid,
                        new BadProcessInfo(now, shortMsg, longMsg, stackTrace));
                mProcessCrashTimes.remove(app.info.processName, app.uid);
            }
            app.bad = true;
            app.removed = true;
            // Remove all services of the process so that it is not restarted [see Section 9.2]
            removeProcessLocked(app, false, false, "crash");
            // Resume the topmost Activity [see Section 9.3]
            mStackSupervisor.resumeTopActivitiesLocked();
            return false;
        }
        mStackSupervisor.resumeTopActivitiesLocked();
    } else {
        // Here reason = "force-crash" [see Section 9.4]
        mStackSupervisor.finishTopRunningActivityLocked(app, reason);
    }
    // Increment the crash count of every service running in this process
    for (int i=app.services.size()-1; i>=0; i--) {
        ServiceRecord sr = app.services.valueAt(i);
        sr.crashCount++;
    }
    // If the crashed home app is a third-party (non-system) app, clear its
    // preferred-activity settings
    final ArrayList<ActivityRecord> activities = app.activities;
    if (app == mHomeProcess && activities.size() > 0
            && (mHomeProcess.info.flags & ApplicationInfo.FLAG_SYSTEM) == 0) {
        for (int activityNdx = activities.size() - 1; activityNdx >= 0; --activityNdx) {
            final ActivityRecord r = activities.get(activityNdx);
            if (r.isHomeActivity()) {
                // Clear the preferred activities
                ActivityThread.getPackageManager()
                        .clearPackagePreferredActivities(r.packageName);
            }
        }
    }
    if (!app.isolated) {
        // Crash times of isolated processes cannot be recorded, because they have no stable identity
        mProcessCrashTimes.put(app.info.processName, app.uid, now);
    }
    // If the app has a crash handler, hand over to it
    if (app.crashHandler != null) mHandler.post(app.crashHandler);
    return true;
}
- When the same process crashes twice in a row less than 1 minute apart:
  - for a non-persistent process:
    - [9.1] mStackSupervisor.handleAppCrashLocked(app);
    - [9.2] removeProcessLocked(app, false, false, "crash");
    - [9.3] mStackSupervisor.resumeTopActivitiesLocked();
  - for a persistent process, only:
    - [9.1] mStackSupervisor.handleAppCrashLocked(app);
    - [9.3] mStackSupervisor.resumeTopActivitiesLocked();
- Otherwise:
  - [9.4] mStackSupervisor.finishTopRunningActivityLocked(app, reason);
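The branch structure of handleAppCrashLocked can be condensed into a pure decision function. This is a sketch, not the service code: CrashPolicy and its Action names are hypothetical, and the constant simply encodes the one-minute window mentioned in the code comment.

```java
final class CrashPolicy {
    // The one-minute window from the comment in handleAppCrashLocked
    static final long MIN_CRASH_INTERVAL_MS = 60 * 1000;

    enum Action {
        FINISH_TOP_ACTIVITY,            // 9.4: first crash, or the last one was long ago
        FINISH_ALL_AND_REMOVE_PROCESS,  // 9.1 + 9.2 + 9.3: frequent crash, non-persistent
        FINISH_ALL_AND_RESUME           // 9.1 + 9.3: frequent crash, persistent
    }

    /** lastCrashTime is null when no earlier crash is recorded for the process. */
    static Action decide(Long lastCrashTime, long now, boolean persistent) {
        boolean tooFrequent = lastCrashTime != null
                && now < lastCrashTime + MIN_CRASH_INTERVAL_MS;
        if (!tooFrequent) {
            return Action.FINISH_TOP_ACTIVITY;
        }
        return persistent
                ? Action.FINISH_ALL_AND_RESUME
                : Action.FINISH_ALL_AND_REMOVE_PROCESS;
    }
}
```

Since isolated processes never get a recorded crash time, they always arrive here with a null lastCrashTime and therefore always take the FINISH_TOP_ACTIVITY path.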
9.1 ASS.handleAppCrashLocked
[-> ActivityStackSupervisor.java]
void handleAppCrashLocked(ProcessRecord app) {
    for (int displayNdx = mActivityDisplays.size() - 1; displayNdx >= 0; --displayNdx) {
        final ArrayList<ActivityStack> stacks = mActivityDisplays.valueAt(displayNdx).mStacks;
        int stackNdx = stacks.size() - 1;
        while (stackNdx >= 0) {
            // Delegate to ActivityStack [see Section 9.1.1]
            stacks.get(stackNdx).handleAppCrashLocked(app);
            stackNdx--;
        }
    }
}
9.1.1 AS.handleAppCrashLocked
[-> ActivityStack.java]
void handleAppCrashLocked(ProcessRecord app) {
    for (int taskNdx = mTaskHistory.size() - 1; taskNdx >= 0; --taskNdx) {
        final ArrayList<ActivityRecord> activities = mTaskHistory.get(taskNdx).mActivities;
        for (int activityNdx = activities.size() - 1; activityNdx >= 0; --activityNdx) {
            final ActivityRecord r = activities.get(activityNdx);
            if (r.app == app) {
                r.app = null;
                // Finish the current activity
                finishCurrentActivityLocked(r, FINISH_IMMEDIATELY, false);
            }
        }
    }
}
Here mTaskHistory is of type ArrayList<TaskRecord> and records the history of all previous background activities. The loop walks all activities, finds every ActivityRecord that belongs to this ProcessRecord, and finishes that Activity.
9.2 AMS.removeProcessLocked
[-> ActivityManagerService.java]
private final boolean removeProcessLocked(ProcessRecord app,
        boolean callerWillRestart, boolean allowRestart, String reason) {
    final String name = app.processName;
    final int uid = app.uid;
    // Remove the process from mProcessNames
    removeProcessNameLocked(name, uid);
    ...
    boolean needRestart = false;
    if (app.pid > 0 && app.pid != MY_PID) {
        int pid = app.pid;
        synchronized (mPidsSelfLocked) {
            mPidsSelfLocked.remove(pid); // Remove this pid
            mHandler.removeMessages(PROC_START_TIMEOUT_MSG, app);
        }
        ...
        boolean willRestart = false;
        // Mark non-isolated persistent processes as restartable
        if (app.persistent && !app.isolated) {
            if (!callerWillRestart) {
                willRestart = true;
            } else {
                needRestart = true;
            }
        }
        // Kill the process [see Section 9.2.1]
        app.kill(reason, true);
        // Remove the process and clean up its activities, services and other components [see Section 9.2.2]
        handleAppDiedLocked(app, willRestart, allowRestart);
        if (willRestart) {
            // Here willRestart = false, so this branch is not taken
            removeLruProcessLocked(app);
            addAppLocked(app.info, false, null /* ABI override */);
        }
    } else {
        mRemovedProcesses.add(app);
    }
    return needRestart;
}
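mProcessNames is of type ProcessMap<ProcessRecord>; it is keyed by process name and records all ProcessRecord objects.
mPidsSelfLocked is of type SparseArray<ProcessRecord>; it is keyed by pid and records all ProcessRecord objects. Access to it is synchronized on the object itself rather than on the global ActivityManager lock.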
9.2.1 app.kill
[-> ProcessRecord.java]
void kill(String reason, boolean noisy) {
    if (!killedByAm) {
        Trace.traceBegin(Trace.TRACE_TAG_ACTIVITY_MANAGER, "kill");
        if (noisy) {
            Slog.i(TAG, "Killing " + toShortString() + " (adj " + setAdj + "): " + reason);
        }
        EventLog.writeEvent(EventLogTags.AM_KILL, userId, pid, processName, setAdj, reason);
        Process.killProcessQuiet(pid); // Kill the process
        Process.killProcessGroup(info.uid, pid); // Kill the process group, including native processes
        if (!persistent) {
            killed = true;
            killedByAm = true;
        }
        Trace.traceEnd(Trace.TRACE_TAG_ACTIVITY_MANAGER);
    }
}
Here reason is "crash". For the details of killing a process, see my other article "Understanding How Killing a Process Is Implemented".
9.2.2 handleAppDiedLocked
[-> ActivityManagerService.java]
private final void handleAppDiedLocked(ProcessRecord app,
        boolean restarting, boolean allowRestart) {
    int pid = app.pid;
    // Clean up the service/receiver/ContentProvider information of the app
    boolean kept = cleanUpApplicationRecordLocked(app, restarting, allowRestart, -1);
    if (!kept && !restarting) {
        removeLruProcessLocked(app);
        if (pid > 0) {
            ProcessList.remove(pid);
        }
    }
    if (mProfileProc == app) {
        clearProfilerLocked();
    }
    // Clean up the activity information of the app
    boolean hasVisibleActivities = mStackSupervisor.handleAppDiedLocked(app);
    app.activities.clear();
    ...
    if (!restarting && hasVisibleActivities && !mStackSupervisor.resumeTopActivitiesLocked()) {
        mStackSupervisor.ensureActivitiesVisibleLocked(null, 0);
    }
}
9.3 ASS.resumeTopActivitiesLocked
[-> ActivityStackSupervisor.java]
boolean resumeTopActivitiesLocked() {
    return resumeTopActivitiesLocked(null, null, null);
}
boolean resumeTopActivitiesLocked(ActivityStack targetStack, ActivityRecord target,
        Bundle targetOptions) {
    if (targetStack == null) {
        targetStack = mFocusedStack;
    }
    boolean result = false;
    if (isFrontStack(targetStack)) {
        // [see Section 9.3.1]
        result = targetStack.resumeTopActivityLocked(target, targetOptions);
    }
    for (int displayNdx = mActivityDisplays.size() - 1; displayNdx >= 0; --displayNdx) {
        final ArrayList<ActivityStack> stacks = mActivityDisplays.valueAt(displayNdx).mStacks;
        for (int stackNdx = stacks.size() - 1; stackNdx >= 0; --stackNdx) {
            final ActivityStack stack = stacks.get(stackNdx);
            if (stack == targetStack) {
                continue; // Already started above
            }
            if (isFrontStack(stack)) {
                stack.resumeTopActivityLocked(null);
            }
        }
    }
    return result;
}
Here mFocusedStack is the ActivityStack that is currently waiting to receive input events or is starting the next activity.
9.3.1 AS.resumeTopActivityLocked
[-> ActivityStack.java]
final boolean resumeTopActivityLocked(ActivityRecord prev, Bundle options) {
    ...
    result = resumeTopActivityInnerLocked(prev, options); // [see Section 9.3.2]
    return result;
}
9.3.2 AS.resumeTopActivityInnerLocked
[-> ActivityStack.java]
private boolean resumeTopActivityInnerLocked(ActivityRecord prev, Bundle options) {
    // Find the first Activity in mTaskHistory that is not finishing
    final ActivityRecord next = topRunningActivityLocked(null);
    if (mResumedActivity == next && next.state == ActivityState.RESUMED &&
            mStackSupervisor.allResumedActivitiesComplete()) {
        // The top activity is already resumed: nothing to do
        return false;
    }
    if (mService.isSleepingOrShuttingDown()
            && mLastPausedActivity == next
            && mStackSupervisor.allPausedActivitiesComplete()) {
        // The device is sleeping and the top activity is paused: nothing to do
        return false;
    }
    // An activity of the app is being started, so make sure the app is not marked as stopped
    AppGlobals.getPackageManager().setPackageStoppedState(
            next.packageName, false, next.userId);
    // Call back into the app's onResume method
    next.app.thread.scheduleResumeActivity(next.appToken, next.app.repProcState,
            mService.isNextTransitionForward(), resumeAnimOptions);
    ...
}
This method is fairly long, so only a few of the more important lines are listed here. Once it has run, the app has completed the resume of its activity.
9.4 finishTopRunningActivityLocked
9.4.1 ASS.finishTopRunningActivityLocked
[-> ActivityStackSupervisor.java]
void finishTopRunningActivityLocked(ProcessRecord app, String reason) {
    for (int displayNdx = mActivityDisplays.size() - 1; displayNdx >= 0; --displayNdx) {
        final ArrayList<ActivityStack> stacks = mActivityDisplays.valueAt(displayNdx).mStacks;
        final int numStacks = stacks.size();
        for (int stackNdx = 0; stackNdx < numStacks; ++stackNdx) {
            final ActivityStack stack = stacks.get(stackNdx);
            // Here reason = "force-crash" [see Section 9.4.2]
            stack.finishTopRunningActivityLocked(app, reason);
        }
    }
}
9.4.2 AS.finishTopRunningActivityLocked
final void finishTopRunningActivityLocked(ProcessRecord app, String reason) {
    // Find the topmost activity that is not finishing
    ActivityRecord r = topRunningActivityLocked(null);
    if (r != null && r.app == app) {
        int taskNdx = mTaskHistory.indexOf(r.task);
        int activityNdx = r.task.mActivities.indexOf(r);
        // [see Section 9.4.3]
        finishActivityLocked(r, Activity.RESULT_CANCELED, null, reason, false);
        // Step down to the activity just below the one that was finished
        --activityNdx;
        if (activityNdx < 0) {
            do {
                --taskNdx;
                if (taskNdx < 0) {
                    break;
                }
                activityNdx = mTaskHistory.get(taskNdx).mActivities.size() - 1;
            } while (activityNdx < 0);
        }
        if (activityNdx >= 0) {
            r = mTaskHistory.get(taskNdx).mActivities.get(activityNdx);
            if (r.state == ActivityState.RESUMED
                    || r.state == ActivityState.PAUSING
                    || r.state == ActivityState.PAUSED) {
                if (!r.isHomeActivity() || mService.mHomeProcess != r.app) {
                    // [see Section 9.4.3]
                    finishActivityLocked(r, Activity.RESULT_CANCELED, null, reason, false);
                }
            }
        }
    }
}
9.4.3 AS.finishActivityLocked
final boolean finishActivityLocked(ActivityRecord r, int resultCode, Intent resultData,
        String reason, boolean oomAdj) {
    if (r.finishing) {
        return false; // Already finishing: return
    }
    // Mark the activity as finishing
    r.makeFinishingLocked();
    // Pause key event dispatching
    r.pauseKeyDispatchingLocked();
    mWindowManager.prepareAppTransition(endTask
            ? AppTransition.TRANSIT_TASK_CLOSE
            : AppTransition.TRANSIT_ACTIVITY_CLOSE, false);
    // Make the finishing activity invisible
    mWindowManager.setAppVisibility(r.appToken, false);
    // Call back into the activity's onPause method
    startPausingLocked(false, false, false, false);
    ...
}
This method eventually calls back into the activity's onPause method.
At this point, return to crashApplication in Section 5: once makeAppCrashingLocked has finished, the message SHOW_ERROR_MSG is sent to show the crash dialog. That process is examined next.
10. UiHandler
The message is sent through mUiHandler with msg.what = SHOW_ERROR_MSG. Next, look at how UiHandler's handleMessage processes it.
[-> ActivityManagerService.java]
final class UiHandler extends Handler {
    public void handleMessage(Message msg) {
        switch (msg.what) {
        case SHOW_ERROR_MSG: {
            HashMap<String, Object> data = (HashMap<String, Object>) msg.obj;
            synchronized (ActivityManagerService.this) {
                ProcessRecord proc = (ProcessRecord)data.get("app");
                AppErrorResult res = (AppErrorResult) data.get("result");
                boolean isBackground = (UserHandle.getAppId(proc.uid)
                        >= Process.FIRST_APPLICATION_UID
                        && proc.pid != MY_PID);
                ...
                if (mShowDialogs && !mSleeping && !mShuttingDown) {
                    // Create the crash dialog and wait for the user's choice, for up to 5 minutes
                    Dialog d = new AppErrorDialog(mContext,
                            ActivityManagerService.this, res, proc);
                    d.show();
                    proc.crashDialog = d;
                } else {
                    // While sleeping (or when dialogs are disabled), choose "exit" by default
                    if (res != null) {
                        res.set(0);
                    }
                }
            }
        } break;
        ...
        }
    }
}
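When a crash occurs, the system by default shows a crash dialog and blocks until the user chooses "exit" or "exit and report". If the user makes no choice, "exit" is chosen by default after a 5-minute timeout; "exit" is also chosen by default when the phone is asleep. This is still not the real end: in uncaughtException (Section 2), the finally block also kills the process.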
11. killProcess
Process.killProcess(Process.myPid());
System.exit(10);
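The finally block guarantees that this code runs and that the crashed process is killed for good; for the details of killing a process, see my other article "Understanding How Killing a Process Is Implemented". Even after the crashed process has been killed, things are not completely over: the Binder death notification flow has yet to run.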
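12. Summary
When a process throws an uncaught exception, the system handles the exception and enters the crash handling flow.
The core of the work is AMS.handleAppCrashLocked (the part marked in red in the flow chart), whose main functions are:
- When the same process crashes twice in a row within 1 minute:
  - for a non-persistent process:
    - ASS.handleAppCrashLocked: immediately finish all activities of the app;
    - AMS.removeProcessLocked: kill the process and all processes in the same process group;
    - ASS.resumeTopActivitiesLocked: resume the topmost activity that is not finishing;
  - for a persistent process, only:
    - ASS.handleAppCrashLocked: immediately finish all activities of the app;
    - ASS.resumeTopActivitiesLocked: resume the topmost activity that is not finishing;
- Otherwise, when the process is not crashing frequently:
  - ASS.finishTopRunningActivityLocked: finish the running activity at the top of the stack.
In addition, the main call chain inside AMS.handleAppCrashLocked is as follows: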
AMS.handleAppCrashLocked
    ASS.handleAppCrashLocked
        AS.handleAppCrashLocked
            AS.finishCurrentActivityLocked
    AMS.removeProcessLocked
        ProcessRecord.kill
        AMS.handleAppDiedLocked
            AMS.cleanUpApplicationRecordLocked
            ASS.handleAppDiedLocked
                AS.handleAppDiedLocked
                    AS.removeHistoryRecordsForAppLocked
    ASS.resumeTopActivitiesLocked
        AS.resumeTopActivityLocked
            AS.resumeTopActivityInnerLocked
    ASS.finishTopRunningActivityLocked
        AS.finishTopRunningActivityLocked
            AS.finishActivityLocked
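III. Binder death notification
The process has been killed. Recall Binder's death callback mechanism: while an app process is being created, the attachApplicationLocked method registers a death notification.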
[-> ActivityManagerService.java]
private final boolean attachApplicationLocked(IApplicationThread thread,
        int pid) {
    try {
        // Register the binder death notification
        AppDeathRecipient adr = new AppDeathRecipient(
                app, pid, thread);
        thread.asBinder().linkToDeath(adr, 0);
        app.deathRecipient = adr;
    } catch (RemoteException e) {
        app.resetPackageList(mProcessStats);
        startProcessLocked(app, "link fail", processName);
        return false;
    }
    ...
}
When the binder server side dies, binder's DeathRecipient notifies AMS so that it can perform the corresponding cleanup. As described earlier, the crashed process is killed; once it has been killed, the binderDied() method is called back.
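The register-then-notify pattern behind linkToDeath can be modeled in plain Java. The sketch below is a simulation under stated assumptions: DeathLinkDemo, FakeRemote and the died() callback are hypothetical names modeled on IBinder.DeathRecipient, and no real binder or process is involved.

```java
import java.util.ArrayList;
import java.util.List;

final class DeathLinkDemo {
    /** Hypothetical callback, modeled on IBinder.DeathRecipient. */
    interface DeathRecipient {
        void died();
    }

    /** Stand-in for a remote object whose owning process can die. */
    static final class FakeRemote {
        private final List<DeathRecipient> recipients = new ArrayList<>();
        private boolean alive = true;

        void linkToDeath(DeathRecipient recipient) {
            if (!alive) {
                throw new IllegalStateException("remote is already dead");
            }
            recipients.add(recipient);
        }

        boolean unlinkToDeath(DeathRecipient recipient) {
            return recipients.remove(recipient);
        }

        /** Simulates the owning process being killed: each linked recipient is told once. */
        void kill() {
            if (!alive) {
                return; // death is only reported once
            }
            alive = false;
            List<DeathRecipient> toNotify = new ArrayList<>(recipients);
            recipients.clear();
            for (DeathRecipient recipient : toNotify) {
                recipient.died();
            }
        }
    }
}
```

In this model AMS plays the role of the code that calls linkToDeath during attachApplicationLocked, and the cleanup it performs in binderDied() corresponds to the body of died().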
1. binderDied
[-> ActivityManagerService.java]
private final class AppDeathRecipient implements IBinder.DeathRecipient {
    public void binderDied() {
        synchronized(ActivityManagerService.this) {
            appDiedLocked(mApp, mPid, mAppThread, true); // [see Section 2]
        }
    }
}
2. appDiedLocked
final void appDiedLocked(ProcessRecord app, int pid, IApplicationThread thread,
        boolean fromBinderDied) {
    ...
    if (!app.killed) {
        if (!fromBinderDied) {
            Process.killProcessQuiet(pid);
        }
        killProcessGroup(app.info.uid, pid);
        app.killed = true;
    }
    // Clean up already done if the process has been re-started.
    if (app.pid == pid && app.thread != null &&
            app.thread.asBinder() == thread.asBinder()) {
        boolean doLowMem = app.instrumentationClass == null;
        boolean doOomAdj = doLowMem;
        if (!app.killedByAm) {
            mAllowLowerMemLevel = true;
        } else {
            mAllowLowerMemLevel = false;
            doLowMem = false;
        }
        // [see Section 3]
        handleAppDiedLocked(app, false, true);
        if (doOomAdj) {
            updateOomAdjLocked();
        }
        if (doLowMem) {
            doLowMemReportIfNeededLocked(app);
        }
    }
    ...
}
3. handleAppDiedLocked
[-> ActivityManagerService.java]
private final void handleAppDiedLocked(ProcessRecord app,
        boolean restarting, boolean allowRestart) {
    int pid = app.pid;
    // Clean up the service, BroadcastReceiver and ContentProvider information of the app [see Section 4]
    boolean kept = cleanUpApplicationRecordLocked(app, restarting, allowRestart, -1);
    if (!kept && !restarting) {
        removeLruProcessLocked(app);
        if (pid > 0) {
            ProcessList.remove(pid);
        }
    }
    // Clean up the activity information
    boolean hasVisibleActivities = mStackSupervisor.handleAppDiedLocked(app);
    app.activities.clear();
    ...
    // Resume the topmost activity that is not finishing
    if (!restarting && hasVisibleActivities && !mStackSupervisor.resumeTopActivitiesLocked()) {
        mStackSupervisor.ensureActivitiesVisibleLocked(null, 0);
    }
}
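4. cleanUpApplicationRecordLocked
This method cleans up the service, BroadcastReceiver, ContentProvider and process information of the app. To make the explanation easier, the method is split into 4 parts.
4.1 Cleaning up services
The parameters are restarting = false, allowRestart = true, index = -1.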
private final boolean cleanUpApplicationRecordLocked(ProcessRecord app,
        boolean restarting, boolean allowRestart, int index) {
    ...
    mProcessesToGc.remove(app);
    mPendingPssProcesses.remove(app);
    // Dismiss the crash/anr/wait dialogs if they exist
    if (app.crashDialog != null && !app.forceCrashReport) {
        app.crashDialog.dismiss();
        app.crashDialog = null;
    }
    if (app.anrDialog != null) {
        app.anrDialog.dismiss();
        app.anrDialog = null;
    }
    if (app.waitDialog != null) {
        app.waitDialog.dismiss();
        app.waitDialog = null;
    }
    app.crashing = false;
    app.notResponding = false;
    app.resetPackageList(mProcessStats);
    app.unlinkDeathRecipient(); // Unregister the app's death notification
    app.makeInactive(mProcessStats);
    app.waitingToKill = null;
    app.forcingToForeground = null;
    // Remove the app from the foreground processes
    updateProcessForegroundLocked(app, false, false);
    app.foregroundActivities = false;
    app.hasShownUi = false;
    app.treatLikeActivity = false;
    app.hasAboveClient = false;
    app.hasClientActivities = false;
    // Clean up the service information; this is fairly involved and is left for a later article
    mServices.killServicesLocked(app, allowRestart);
    boolean restart = false;
}
- mProcessesToGc: the list of processes that should run a gc as soon as possible
- mPendingPssProcesses: the list of processes whose memory information needs to be collected
4.2 清理ContentProvider
private final boolean cleanUpApplicationRecordLocked(...) {
...
for (int i = app.pubProviders.size() - 1; i >= 0; i--) {
// Get each ContentProvider this process has published
ContentProviderRecord cpr = app.pubProviders.valueAt(i);
final boolean always = app.bad || !allowRestart;
// When the process hosting a ContentProvider is killed, its client processes are killed too
boolean inLaunching = removeDyingProviderLocked(app, cpr, always);
if ((inLaunching || always) && cpr.hasConnectionOrHandle()) {
restart = true; // a restart is needed
}
cpr.provider = null;
cpr.proc = null;
}
app.pubProviders.clear();
// Handle ContentProviders that are still launching and have clients waiting on them
if (cleanupAppInLaunchingProvidersLocked(app, false)) {
restart = true;
}
// Unregister the ContentProvider connections this process was holding
if (!app.conProviders.isEmpty()) {
for (int i = app.conProviders.size() - 1; i >= 0; i--) {
ContentProviderConnection conn = app.conProviders.get(i);
conn.provider.connections.remove(conn);
stopAssociationLocked(app.uid, app.processName, conn.provider.uid,
conn.provider.name);
}
app.conProviders.clear();
}
4.3 Cleaning up BroadcastReceivers
private final boolean cleanUpApplicationRecordLocked(...) {
...
skipCurrentReceiverLocked(app);
// Unregister the broadcast receivers this process registered
for (int i = app.receivers.size() - 1; i >= 0; i--) {
removeReceiverLocked(app.receivers.valueAt(i));
}
app.receivers.clear();
}
4.4 Cleaning up the process
private final boolean cleanUpApplicationRecordLocked(...) {
...
// Handle the case where the app is in the middle of a backup
if (mBackupTarget != null && app.pid == mBackupTarget.app.pid) {
...
IBackupManager bm = IBackupManager.Stub.asInterface(
ServiceManager.getService(Context.BACKUP_SERVICE));
bm.agentDisconnected(app.info.packageName);
}
for (int i = mPendingProcessChanges.size() - 1; i >= 0; i--) {
ProcessChangeItem item = mPendingProcessChanges.get(i);
if (item.pid == app.pid) {
mPendingProcessChanges.remove(i);
mAvailProcessChanges.add(item);
}
}
mUiHandler.obtainMessage(DISPATCH_PROCESS_DIED, app.pid, app.info.uid, null).sendToTarget();
if (!app.persistent || app.isolated) {
removeProcessNameLocked(app.processName, app.uid);
if (mHeavyWeightProcess == app) {
mHandler.sendMessage(mHandler.obtainMessage(CANCEL_HEAVY_NOTIFICATION_MSG,
mHeavyWeightProcess.userId, 0));
mHeavyWeightProcess = null;
}
} else if (!app.removed) {
// A persistent app needs to be restarted
if (mPersistentStartingProcesses.indexOf(app) < 0) {
mPersistentStartingProcesses.add(app);
restart = true;
}
}
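// mProcessesOnHold: processes that something tried to start before the system was ready.
// They are not started at that point; they are recorded and started once boot completes.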
mProcessesOnHold.remove(app);
if (app == mHomeProcess) {
mHomeProcess = null;
}
if (app == mPreviousProcess) {
mPreviousProcess = null;
}
if (restart && !app.isolated) {
// Components still need to run in this process, so restart it
if (index < 0) {
ProcessList.remove(app.pid);
}
addProcessNameLocked(app);
startProcessLocked(app, "restart", app.processName);
return true;
} else if (app.pid > 0 && app.pid != MY_PID) {
// Remove the bookkeeping for this process
boolean removed;
synchronized (mPidsSelfLocked) {
mPidsSelfLocked.remove(app.pid);
mHandler.removeMessages(PROC_START_TIMEOUT_MSG, app);
}
app.setPid(0);
}
return false;
}
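A process needs to be restarted in the following cases:
- mLaunchingProviders: records ContentProviders that clients are waiting on while the hosting app is still starting. Once a ContentProvider is published, it is removed from this list. A process that hosts such a ContentProvider needs to be restarted.
- mPersistentStartingProcesses: records persistent apps that are in the process of being started. A persistent process that is not isolated and has not been removed is added to this list and restarted.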
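The restart conditions spread across sections 4.2 and 4.4 can be collapsed into a single predicate. The following is a minimal sketch, not AOSP code: RestartDecisionSketch, AppState, ProviderState, and every parameter name are hypothetical stand-ins for the ProcessRecord and ContentProviderRecord fields read in the excerpts above, and the results of removeDyingProviderLocked and cleanupAppInLaunchingProvidersLocked are passed in as plain booleans.

```java
import java.util.List;

// Hypothetical, simplified model of the restart decision made in
// cleanUpApplicationRecordLocked. None of these types exist in AOSP.
class RestartDecisionSketch {

    /** Stand-in for the ProcessRecord fields the decision reads. */
    static final class AppState {
        boolean persistent;
        boolean isolated;
        boolean removed;
        boolean bad;
    }

    /** Stand-in for one published ContentProviderRecord. */
    static final class ProviderState {
        boolean inLaunching;           // assumed result of removeDyingProviderLocked
        boolean hasConnectionOrHandle; // assumed result of cpr.hasConnectionOrHandle()
    }

    static boolean shouldRestart(AppState app,
                                 List<ProviderState> pubProviders,
                                 boolean allowRestart,
                                 boolean hasWaitingLaunchingProvider,
                                 boolean alreadyInPersistentStartingList) {
        boolean restart = false;

        // 4.2: a published provider that still has clients forces a restart.
        final boolean always = app.bad || !allowRestart;
        for (ProviderState cpr : pubProviders) {
            if ((cpr.inLaunching || always) && cpr.hasConnectionOrHandle) {
                restart = true;
            }
        }

        // 4.2: a provider that is still launching with clients waiting on it.
        if (hasWaitingLaunchingProvider) {
            restart = true;
        }

        // 4.4: a persistent, non-isolated, non-removed app is restarted
        // unless it is already queued in mPersistentStartingProcesses.
        boolean persistentCandidate = app.persistent && !app.isolated && !app.removed;
        if (persistentCandidate && !alreadyInPersistentStartingList) {
            restart = true;
        }

        // 4.4: isolated processes are never restarted.
        return restart && !app.isolated;
    }
}
```

With these assumptions, a persistent app with no providers yields true, an ordinary app with no providers yields false, and an isolated process yields false regardless of its providers.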
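5. Recap
Once the crashing process has run its kill step, the process is dead. What happens next relies on the Binder death-notification mechanism. The crashing process hosts a Binder server object, ApplicationThread. During process creation, attachApplicationLocked() is called to attach the app to the system_server process, and system_server holds the matching Binder client, ApplicationThreadProxy. When the process hosting the Binder server ApplicationThread (that is, the crashing process) dies, the Binder client receives the corresponding death notification and enters the binderDied flow. Binder internals are not covered here; the blog has a dedicated Binder series.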
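The death-notification pattern described above can be simulated in plain Java. This is only a sketch under stated assumptions: on Android the real types are android.os.IBinder and IBinder.DeathRecipient, and the kernel Binder driver delivers the notification. Here RemoteHandle and simulateProcessDeath are hypothetical stand-ins, and linkToDeath is simplified to take no flags argument.

```java
import java.util.ArrayList;
import java.util.List;

// Plain-Java simulation of Binder death notification. The real mechanism
// needs the Android runtime; every class here is a hypothetical stand-in.
class DeathNotificationSketch {

    /** Mirrors the single callback of IBinder.DeathRecipient. */
    interface DeathRecipient {
        void binderDied();
    }

    /** Stand-in for the client-side handle to a remote Binder object. */
    static final class RemoteHandle {
        private final List<DeathRecipient> recipients = new ArrayList<>();
        private boolean alive = true;

        void linkToDeath(DeathRecipient recipient) {
            recipients.add(recipient);
        }

        boolean unlinkToDeath(DeathRecipient recipient) {
            return recipients.remove(recipient);
        }

        /** Simulates the hosting process dying: each linked recipient is notified once. */
        void simulateProcessDeath() {
            if (!alive) {
                return; // already dead, nothing left to notify
            }
            alive = false;
            List<DeathRecipient> snapshot = new ArrayList<>(recipients);
            recipients.clear();
            for (DeathRecipient recipient : snapshot) {
                recipient.binderDied();
            }
        }
    }

    /** Links a recipient, kills the simulated process, and returns the recorded events. */
    static List<String> demo() {
        List<String> events = new ArrayList<>();
        RemoteHandle applicationThread = new RemoteHandle();
        // system_server side: register interest in the app process's death.
        applicationThread.linkToDeath(() -> events.add("binderDied -> appDiedLocked"));
        // App side: the crashing process is killed.
        applicationThread.simulateProcessDeath();
        applicationThread.simulateProcessDeath(); // second call is a no-op
        return events;
    }
}
```

The recipient is cleared after delivery so that a second death signal cannot trigger the cleanup path twice, which is the property the cleanup code in section 4 depends on.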
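IV. Summary
This article has walked through, from the source code's point of view, how the system handles an application crash:
- The crashing process had its defaultUncaughtHandler installed when it was created. The handler deals with the uncaught exception and logs basic information about the crash;
- The handler calls AMP.handleApplicationCrash in the current process, and the call travels over Binder IPC to the system_server process;
- Inside system_server, the Binder server side executes AMS.handleApplicationCrash;
- The target process's ProcessRecord is looked up in mProcessNames, and the crash information is written to the /data/system/dropbox directory;
- makeAppCrashingLocked runs:
  - it creates the error receiver for the crashing app under the current user and skips the app's current broadcast;
  - it stops the WMS screen-freeze messages for every activity in the process and performs some related screen operations;
- handleAppCrashLocked runs next:
  - if the same process has crashed twice in a row within 1 minute and is not persistent, all of the app's activities are finished and the process is killed together with every process in its process group; then the topmost activity that is not finishing is resumed;
  - if the same process has crashed twice in a row within 1 minute and is persistent, only the topmost activity that is not finishing is resumed;
  - if the same process has not crashed twice in a row within 1 minute, the flow that finishes the top running activity is executed;
- mUiHandler sends the SHOW_ERROR_MSG message, which pops up the crash dialog;
- At this point system_server's part is done. Control returns to the crashing process, which kills itself;
- Once the crashing process has been killed, a Binder death notification tells system_server to run appDiedLocked();
- Finally, the records for the app's activity/service/ContentProvider/receiver components are cleaned up.
That is essentially the whole sequence the system runs after an application crash.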
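The three branches of handleAppCrashLocked listed above amount to a small policy keyed on the time of the previous crash. The sketch below illustrates only that policy. CrashPolicySketch, its Action enum, and the per-process-name map are hypothetical, the 60-second constant is taken from the "1 minute" in the text, and the real bookkeeping in AMS (bad-process tracking, per-uid keys) is deliberately left out.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical model of the crash-frequency policy; not AOSP code.
class CrashPolicySketch {

    // The 1-minute window mentioned in the summary above.
    static final long MIN_CRASH_INTERVAL_MS = 60 * 1000;

    enum Action {
        FINISH_TOP_ACTIVITY,          // first crash, or crashes spaced far apart
        KILL_PROCESS_AND_RESUME_TOP,  // repeated crash, non-persistent process
        RESUME_TOP_ONLY               // repeated crash, persistent process
    }

    // Assumption: crashes are tracked per process name only.
    private final Map<String, Long> lastCrashTime = new HashMap<>();

    Action onCrash(String processName, boolean persistent, long nowMs) {
        Long previous = lastCrashTime.get(processName);
        boolean repeated = previous != null && nowMs < previous + MIN_CRASH_INTERVAL_MS;

        Action action;
        if (repeated) {
            action = persistent ? Action.RESUME_TOP_ONLY : Action.KILL_PROCESS_AND_RESUME_TOP;
        } else {
            action = Action.FINISH_TOP_ACTIVITY;
        }

        // Simplification: always remember the latest crash time.
        lastCrashTime.put(processName, nowMs);
        return action;
    }
}
```

For example, a non-persistent process that crashes at t = 0 and again at t = 30 s gets FINISH_TOP_ACTIVITY the first time and KILL_PROCESS_AND_RESUME_TOP the second time, while a third crash several minutes later falls back to FINISH_TOP_ACTIVITY.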
You are welcome to follow my Weibo: Gityuan.