[Framework] Why the Activity onDestroy Lifecycle Callback Is Delayed
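At work I ran into a bug that, in some situations, made our mic-link (co-hosting) feature fail. Our mic-link class is a singleton: starting a session requires calling start, ending it requires calling stop, and at most one session may exist at a time. In other words, once start has been called, starting a new session requires stop first and then start. The stop call lives in the Activity#onDestroy() lifecycle callback. After a lot of verification, the cause turned out to be a delayed Activity#onDestroy(): start was called before stop had run, and the mic-link failed.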
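To make the failure mode concrete, here is a minimal plain-Java sketch of such a contract. MicLinkSession and its methods are hypothetical stand-ins, not our real class, and returning false from start() is an assumption made for illustration; the only point is the ordering of the calls.

```java
// Hypothetical stand-in for the singleton described above; not the real class.
final class MicLinkSession {
    private static final MicLinkSession INSTANCE = new MicLinkSession();
    private boolean active;

    private MicLinkSession() {}

    static MicLinkSession get() {
        return INSTANCE;
    }

    /** Starts a session; returns false if the previous one was never stopped. */
    synchronized boolean start() {
        if (active) {
            return false; // the failure we saw: start() before stop()
        }
        active = true;
        return true;
    }

    /** Ends the current session so that start() can succeed again. */
    synchronized void stop() {
        active = false;
    }
}

final class MicLinkOrderingDemo {
    /** Page B calls start() before page A's late onDestroy() has called stop(). */
    static boolean startBeforeLateStop() {
        MicLinkSession s = MicLinkSession.get();
        s.stop();                   // reset state for the demo
        s.start();                  // page A starts a session
        boolean second = s.start(); // page B starts, A's onDestroy() has not run yet
        s.stop();                   // A's delayed onDestroy() finally arrives
        return second;
    }

    /** Same two pages, but A's stop() arrives before B's start(). */
    static boolean startAfterStop() {
        MicLinkSession s = MicLinkSession.get();
        s.stop();
        s.start();                  // page A
        s.stop();                   // A's onDestroy() arrives in time
        boolean second = s.start(); // page B
        s.stop();
        return second;
    }

    public static void main(String[] args) {
        System.out.println("start before late stop: " + startBeforeLateStop());
        System.out.println("start after stop: " + startAfterStop());
    }
}
```

In this model the second start() fails only when it is called before the first session's stop(), which is the order a delayed onDestroy() produces.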
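The conclusion first: if the main thread is busy while an Activity is being destroyed, its onStop() and onDestroy() callbacks can be delayed by up to 10 s. On Android 11 and later, Google changed this part of the code, so delays of that size no longer occur.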
To reproduce the delayed onDestroy(), run the following code on a phone with Android 10 or lower and then destroy the Activity; onDestroy() will be called late:
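import android.os.Handler
import android.os.Looper

// Handler bound to the main thread's Looper.
val h = Handler(Looper.getMainLooper())

// Call once; it re-posts itself forever, so the main thread's queue never goes idle.
fun doNothing() {
    h.post {
        doNothing()
    }
}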
So why does the callback get delayed? Let's look for the answer in the Activity destruction flow of Android 9.
The Activity destruction flow on Android 9
The app process destroys the Activity through Activity#finish()
public void finish() {
finish(DONT_FINISH_TASK_WITH_ACTIVITY);
}
private void finish(int finishTask) {
// ...
if (ActivityManager.getService()
.finishActivity(mToken, resultCode, resultData, finishTask)) {
mFinished = true;
}
// ...
}
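The object returned by ActivityManager.getService() is actually a binder client. The corresponding binder server is ActivityManagerService (AMS), which runs in the system_server process. The call above therefore goes over binder IPC into AMS's finishActivity() method. If you are not familiar with binder, see my earlier article, "How Android Binder Works".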
The AMS#finishActivity() method
@Override
public final boolean finishActivity(IBinder token, int resultCode, Intent resultData,
int finishTask) {
// ...
synchronized(this) {
ActivityRecord r = ActivityRecord.isInStackLocked(token);
if (r == null) {
return true;
}
// Keep track of the root activity of the task before we finish it
TaskRecord tr = r.getTask();
// ...
final long origId = Binder.clearCallingIdentity();
try {
boolean res;
final boolean finishWithRootActivity =
finishTask == Activity.FINISH_TASK_WITH_ROOT_ACTIVITY;
if (finishTask == Activity.FINISH_TASK_WITH_ACTIVITY
|| (finishWithRootActivity && r == rootR)) {
// ...
} else {
res = tr.getStack().requestFinishActivityLocked(token, resultCode,
resultData, "app-request", true);
if (!res) {
Slog.i(TAG, "Failed to finish by app-request");
}
}
return res;
} finally {
// ...
}
}
}
This calls ActivityStack#requestFinishActivityLocked():
/**
* @return Returns true if the activity is being finished, false if for
* some reason it is being left as-is.
*/
final boolean requestFinishActivityLocked(IBinder token, int resultCode,
Intent resultData, String reason, boolean oomAdj) {
ActivityRecord r = isInStackLocked(token);
if (DEBUG_RESULTS || DEBUG_STATES) Slog.v(TAG_STATES,
"Finishing activity token=" + token + " r="
+ ", result=" + resultCode + ", data=" + resultData
+ ", reason=" + reason);
if (r == null) {
return false;
}
finishActivityLocked(r, resultCode, resultData, reason, oomAdj);
return true;
}
Next comes the key method, finishActivityLocked():
final boolean finishActivityLocked(ActivityRecord r, int resultCode, Intent resultData,
String reason, boolean oomAdj, boolean pauseImmediately) {
// ..
if (mPausingActivity == null) {
if (DEBUG_PAUSE) Slog.v(TAG_PAUSE, "Finish needs to pause: " + r);
if (DEBUG_USER_LEAVING) Slog.v(TAG_USER_LEAVING,
"finish() => pause with userLeaving=false");
startPausingLocked(false, false, null, pauseImmediately);
}
// ..
}
I have omitted a lot of logic here; the key call is startPausingLocked():
final boolean startPausingLocked(boolean userLeaving, boolean uiSleeping,
ActivityRecord resuming, boolean pauseImmediately) {
// ...
mService.getLifecycleManager().scheduleTransaction(prev.app.thread, prev.appToken,
PauseActivityItem.obtain(prev.finishing, userLeaving,
prev.configChangeFlags, pauseImmediately));
// ...
}
This is the important part: it builds a PauseActivityItem object, which you can think of as the task of pausing the Activity, and then calls ClientLifecycleManager#scheduleTransaction().
void scheduleTransaction(@NonNull IApplicationThread client, @NonNull IBinder activityToken,
@NonNull ActivityLifecycleItem stateRequest) throws RemoteException {
final ClientTransaction clientTransaction = transactionWithState(client, activityToken,
stateRequest);
scheduleTransaction(clientTransaction);
}
void scheduleTransaction(ClientTransaction transaction) throws RemoteException {
final IApplicationThread client = transaction.getClient();
transaction.schedule();
if (!(client instanceof Binder)) {
// If client is not an instance of Binder - it's a remote call and at this point it is
// safe to recycle the object. All objects used for local calls will be recycled after
// the transaction is executed on client in ActivityThread.
transaction.recycle();
}
}
public void schedule() throws RemoteException {
mClient.scheduleTransaction(this);
}
mClient is also a binder client; its server is the ApplicationThread inside the ActivityThread of the target app process. In other words, the task of pausing the Activity has now been handed down to the app for processing.
The app process handles pause
@Override
public void scheduleTransaction(ClientTransaction transaction) throws RemoteException {
ActivityThread.this.scheduleTransaction(transaction);
}
ApplicationThread calls straight into ActivityThread's scheduleTransaction() method:
void scheduleTransaction(ClientTransaction transaction) {
transaction.preExecute(this);
sendMessage(ActivityThread.H.EXECUTE_TRANSACTION, transaction);
}
private void sendMessage(int what, Object obj, int arg1, int arg2, boolean async) {
if (DEBUG_MESSAGES) Slog.v(
TAG, "SCHEDULE " + what + " " + mH.codeToString(what)
+ ": " + arg1 + " / " + obj);
Message msg = Message.obtain();
msg.what = what;
msg.obj = obj;
msg.arg1 = arg1;
msg.arg2 = arg2;
if (async) {
msg.setAsynchronous(true);
}
mH.sendMessage(msg);
}
This sends the task to the main thread through a Handler.
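Remember that system_server sent over a PauseActivityItem? That is the task that ends up running on the app's main thread. It is split into three methods, preExecute(), execute() and postExecute(), which run before the task, as the task, and after the task has completed.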
@Override
public void execute(ClientTransactionHandler client, IBinder token,
PendingTransactionActions pendingActions) {
Trace.traceBegin(TRACE_TAG_ACTIVITY_MANAGER, "activityPause");
client.handlePauseActivity(token, mFinished, mUserLeaving, mConfigChanges, pendingActions,
"PAUSE_ACTIVITY_ITEM");
Trace.traceEnd(TRACE_TAG_ACTIVITY_MANAGER);
}
@Override
public void postExecute(ClientTransactionHandler client, IBinder token,
PendingTransactionActions pendingActions) {
if (mDontReport) {
return;
}
try {
// TODO(lifecycler): Use interface callback instead of AMS.
ActivityManager.getService().activityPaused(token);
} catch (RemoteException ex) {
throw ex.rethrowFromSystemServer();
}
}
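execute() calls client.handlePauseActivity(). This client is the ActivityThread, and the call is what triggers the familiar Activity#onPause() callback; I won't analyze that here, it isn't hard. The part to look at is postExecute(): it calls activityPaused() on the object returned by ActivityManager.getService(), telling AMS over IPC that the pause has finished. From there we are back in AMS.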
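The three phases can be sketched in plain Java as follows. This is only a simplified model: LifecycleItemModel, PauseItemModel and the log strings are made up for illustration and are not the framework's classes, and the model assumes, as in the scheduleTransaction() code above, that preExecute() runs when the transaction is scheduled while the other two phases run once the message is handled.

```java
import java.util.ArrayList;
import java.util.List;

// Simplified model of a three-phase lifecycle item; names are illustrative only.
interface LifecycleItemModel {
    void preExecute(List<String> log);
    void execute(List<String> log);
    void postExecute(List<String> log);
}

final class PauseItemModel implements LifecycleItemModel {
    @Override
    public void preExecute(List<String> log) {
        log.add("preExecute");
    }

    @Override
    public void execute(List<String> log) {
        // In the real flow this is where onPause() ends up being called.
        log.add("execute -> onPause()");
    }

    @Override
    public void postExecute(List<String> log) {
        // In the real flow this is the activityPaused() IPC back to AMS.
        log.add("postExecute -> report paused to AMS");
    }
}

final class TransactionModelDemo {
    /** Runs the three phases in order and returns what happened. */
    static List<String> run(LifecycleItemModel item) {
        List<String> log = new ArrayList<>();
        item.preExecute(log);  // when the transaction is scheduled
        // ... the real code posts EXECUTE_TRANSACTION to the main thread here ...
        item.execute(log);     // on the main thread
        item.postExecute(log); // on the main thread, right after execute()
        return log;
    }

    public static void main(String[] args) {
        System.out.println(run(new PauseItemModel()));
    }
}
```

The detail that matters for the rest of the analysis is the last phase: AMS only learns that the pause is complete after the app's main thread has actually run the item.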
AMS handles the paused Activity
@Override
public final void activityPaused(IBinder token) {
final long origId = Binder.clearCallingIdentity();
synchronized(this) {
ActivityStack stack = ActivityRecord.getStackLocked(token);
if (stack != null) {
stack.activityPausedLocked(token, false);
}
}
Binder.restoreCallingIdentity(origId);
}
It then calls ActivityStack#activityPausedLocked():
final void activityPausedLocked(IBinder token, boolean timeout) {
// ...
completePauseLocked(true /* resumeNext */, null /* resumingActivity */);
// ...
}
private void completePauseLocked(boolean resumeNext, ActivityRecord resuming) {
ActivityRecord prev = mPausingActivity;
if (DEBUG_PAUSE) Slog.v(TAG_PAUSE, "Complete pause: " + prev);
if (prev != null) {
prev.setWillCloseOrEnterPip(false);
final boolean wasStopping = prev.isState(STOPPING);
prev.setState(PAUSED, "completePausedLocked");
if (prev.finishing) {
if (DEBUG_PAUSE) Slog.v(TAG_PAUSE, "Executing finish of activity: " + prev);
prev = finishCurrentActivityLocked(prev, FINISH_AFTER_VISIBLE, false,
"completedPausedLocked");
} else if (prev.app != null) {
// ...
} else {
// ...
}
// ...
}
// ...
}
Because our Activity is being finished, its finishing flag is true, so finishCurrentActivityLocked() is executed:
final ActivityRecord finishCurrentActivityLocked(ActivityRecord r, int mode, boolean oomAdj,
String reason) {
// ...
addToStopping(r, false /* scheduleIdle */, false /* idleDelayed */);
// ...
}
void addToStopping(ActivityRecord r, boolean scheduleIdle, boolean idleDelayed) {
// ...
if (!mStackSupervisor.mStoppingActivities.contains(r)) {
mStackSupervisor.mStoppingActivities.add(r);
// ...
}
// ...
mStackSupervisor.scheduleIdleTimeoutLocked(r);
// ...
}
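addToStopping() adds this ActivityRecord to mStoppingActivities, the list of activities waiting to be stopped. It then calls ActivityStackSupervisor#scheduleIdleTimeoutLocked(). This method is important and is the answer to the question in the title. Take a look at it now; we will come back to it later.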
void scheduleIdleTimeoutLocked(ActivityRecord next) {
if (DEBUG_IDLE) Slog.d(TAG_IDLE,
"scheduleIdleTimeoutLocked: Callers=" + Debug.getCallers(4));
Message msg = mHandler.obtainMessage(IDLE_TIMEOUT_MSG, next);
mHandler.sendMessageDelayed(msg, IDLE_TIMEOUT);
}
This is simply a delayed Handler message; IDLE_TIMEOUT is 10 s.
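The "delayed message as a fallback" idea can be modelled without Android. The sketch below uses a toy handler on a virtual clock; VirtualHandler, IdleTimeoutDemo and the message id are assumptions made for illustration, and only the 10 s value comes from the source above. It shows that the stopping list is processed either when an idle report arrives or when the timeout fires, whichever happens first.

```java
import java.util.ArrayList;
import java.util.List;

// Toy Handler on a virtual clock (ms). It only models delayed delivery and removal.
final class VirtualHandler {
    private static final class Msg {
        final int what;
        final Object obj;
        final long when;
        final Runnable action;

        Msg(int what, Object obj, long when, Runnable action) {
            this.what = what;
            this.obj = obj;
            this.when = when;
            this.action = action;
        }
    }

    private final List<Msg> queue = new ArrayList<>();
    private long now;

    long now() {
        return now;
    }

    void sendMessageDelayed(int what, Object obj, long delayMs, Runnable action) {
        queue.add(new Msg(what, obj, now + delayMs, action));
    }

    void removeMessages(int what, Object obj) {
        queue.removeIf(m -> m.what == what && m.obj == obj);
    }

    /** Advances the clock to 'time', running due messages in time order. */
    void advanceTo(long time) {
        while (true) {
            Msg next = null;
            for (Msg m : queue) {
                if (m.when <= time && (next == null || m.when < next.when)) {
                    next = m;
                }
            }
            if (next == null) {
                break;
            }
            queue.remove(next);
            now = next.when;
            next.action.run();
        }
        now = time;
    }
}

final class IdleTimeoutDemo {
    static final int IDLE_TIMEOUT_MSG = 1;   // arbitrary id for this sketch
    static final long IDLE_TIMEOUT = 10_000; // the 10 s fallback

    /** Returns when the finishing activity is processed; idleReportAt < 0 means "never idle". */
    static long destroyTime(long idleReportAt) {
        VirtualHandler h = new VirtualHandler();
        Object record = new Object();
        long[] destroyedAt = {-1};
        Runnable processStopping = () -> {
            if (destroyedAt[0] < 0) {
                destroyedAt[0] = h.now();
            }
        };
        // Arm the fallback when the activity joins the stopping list.
        h.sendMessageDelayed(IDLE_TIMEOUT_MSG, record, IDLE_TIMEOUT, processStopping);
        if (idleReportAt >= 0) {
            h.advanceTo(idleReportAt);
            if (destroyedAt[0] < 0) {
                // The idle report arrived first: drop the timeout and process now.
                h.removeMessages(IDLE_TIMEOUT_MSG, record);
                processStopping.run();
            }
        }
        h.advanceTo(60_000);
        return destroyedAt[0];
    }

    public static void main(String[] args) {
        System.out.println("idle after 300 ms -> " + destroyTime(300));
        System.out.println("never idle        -> " + destroyTime(-1));
    }
}
```

In this model an idle report at 300 ms leads to processing at 300 ms, while a main thread that never goes idle leads to processing at 10000 ms.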
Now let's see what handling IDLE_TIMEOUT_MSG does:
// ...
case IDLE_TIMEOUT_MSG: {
if (DEBUG_IDLE) Slog.d(TAG_IDLE,
"handleMessage: IDLE_TIMEOUT_MSG: r=" + msg.obj);
// We don't at this point know if the activity is fullscreen,
// so we need to be conservative and assume it isn't.
activityIdleInternal((ActivityRecord) msg.obj,
true /* processPausingActivities */);
} break;
// ...
void activityIdleInternal(ActivityRecord r, boolean processPausingActivities) {
synchronized (mService) {
activityIdleInternalLocked(r != null ? r.appToken : null, true /* fromTimeout */,
processPausingActivities, null);
}
}
final ActivityRecord activityIdleInternalLocked(final IBinder token, boolean fromTimeout,
boolean processPausingActivities, Configuration config) {
// ...
// Atomically retrieve all of the other things to do.
final ArrayList<ActivityRecord> stops = processStoppingActivitiesLocked(r,
true /* remove */, processPausingActivities);
NS = stops != null ? stops.size() : 0;
if ((NF = mFinishingActivities.size()) > 0) {
finishes = new ArrayList<>(mFinishingActivities);
mFinishingActivities.clear();
}
if (mStartingUsers.size() > 0) {
startingUsers = new ArrayList<>(mStartingUsers);
mStartingUsers.clear();
}
// Stop any activities that are scheduled to do so but have been
// waiting for the next one to start.
for (int i = 0; i < NS; i++) {
r = stops.get(i);
final ActivityStack stack = r.getStack();
if (stack != null) {
if (r.finishing) {
stack.finishCurrentActivityLocked(r, ActivityStack.FINISH_IMMEDIATELY, false,
"activityIdleInternalLocked");
} else {
stack.stopActivityLocked(r);
}
}
}
// ...
}
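What this does is go through the activities in the stopping list: the finishing ones are destroyed through ActivityStack#finishCurrentActivityLocked(), and the others are stopped. finishCurrentActivityLocked() eventually reaches the app process over IPC; we will analyze it later.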
To recap, AMS#activityPaused() mainly adds our Activity to the stopping list and then starts a 10 s timer. When the timer fires, the activities in the stopping list move on to their stop lifecycle.
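At this point the flow seems to break off. I'll give the conclusion directly: it has to wait for an Activity resume. Once the visible Activity has been finished, the Activity below it in the stack becomes visible and goes through its resume lifecycle. I'll skip how that resume is triggered; if you are interested, you can trace it in the source yourself.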
The app process resumes an Activity
Let's look at ActivityThread#handleResumeActivity():
@Override
public void handleResumeActivity(IBinder token, boolean finalStateRequest, boolean isForward,
String reason) {
// ...
Looper.myQueue().addIdleHandler(new Idler());
}
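I have omitted the part that actually handles the Activity's resume lifecycle; look it up if you are interested. Once that is done, an IdleHandler is added to the Looper's message queue, and it runs only when the thread (here, the main thread) is idle. If you are interested in Handler, see my earlier article, "How Android Handler Works", which covers IdleHandler in detail.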
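Why a busy main thread starves an IdleHandler can be shown with a toy loop. The sketch below is plain Java, not Android's Looper or MessageQueue: ToyLooper and IdleStarvationDemo are hypothetical names, and the model reduces the rule to "idle handlers run only when no message is pending".

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

// Toy single-threaded loop; it only models "idle handlers run when no message is pending".
final class ToyLooper {
    private final Queue<Runnable> messages = new ArrayDeque<>();
    private final List<Runnable> idleHandlers = new ArrayList<>();

    void post(Runnable r) {
        messages.add(r);
    }

    void addIdleHandler(Runnable r) {
        idleHandlers.add(r);
    }

    /** Runs at most maxSteps iterations, or stops early when nothing is left to do. */
    void run(int maxSteps) {
        for (int i = 0; i < maxSteps; i++) {
            Runnable next = messages.poll();
            if (next != null) {
                next.run(); // a pending message always wins over idle handlers
                continue;
            }
            if (idleHandlers.isEmpty()) {
                return;
            }
            List<Runnable> pending = new ArrayList<>(idleHandlers);
            idleHandlers.clear(); // one-shot, like queueIdle() returning false
            for (Runnable r : pending) {
                r.run();
            }
        }
    }
}

final class IdleStarvationDemo {
    /** Returns whether the idle handler got to run within 1000 loop iterations. */
    static boolean idleRan(boolean busy) {
        ToyLooper looper = new ToyLooper();
        boolean[] ran = {false};
        looper.addIdleHandler(() -> ran[0] = true);
        if (busy) {
            Runnable[] self = new Runnable[1];
            self[0] = () -> looper.post(self[0]); // same shape as doNothing() above
            looper.post(self[0]);
        } else {
            looper.post(() -> {});                // one ordinary message
        }
        looper.run(1_000);
        return ran[0];
    }

    public static void main(String[] args) {
        System.out.println("busy queue, idle handler ran: " + idleRan(true));
        System.out.println("quiet queue, idle handler ran: " + idleRan(false));
    }
}
```

With the self-reposting task the idle handler never runs, which mirrors the reproduction code at the top of the article.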
@Override
public final boolean queueIdle() {
ActivityClientRecord a = mNewActivities;
boolean stopProfiling = false;
if (mBoundApplication != null && mProfiler.profileFd != null
&& mProfiler.autoStopProfiler) {
stopProfiling = true;
}
if (a != null) {
mNewActivities = null;
IActivityManager am = ActivityManager.getService();
ActivityClientRecord prev;
do {
if (localLOGV) Slog.v(
TAG, "Reporting idle of " + a +
" finished=" +
(a.activity != null && a.activity.mFinished));
if (a.activity != null && !a.activity.mFinished) {
try {
am.activityIdle(a.token, a.createdConfig, stopProfiling);
a.createdConfig = null;
} catch (RemoteException ex) {
throw ex.rethrowFromSystemServer();
}
}
prev = a;
a = a.nextIdle;
prev.nextIdle = null;
} while (a != null);
}
if (stopProfiling) {
mProfiler.stopProfiling();
}
ensureJitEnabled();
return false;
}
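In this IdleHandler, the app walks its list of newly resumed activities (mNewActivities) and, for each one that is not finished, calls AMS's activityIdle() method over IPC.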
AMS#activityIdle()
@Override
public final void activityIdle(IBinder token, Configuration config, boolean stopProfiling) {
final long origId = Binder.clearCallingIdentity();
synchronized (this) {
ActivityStack stack = ActivityRecord.getStackLocked(token);
if (stack != null) {
ActivityRecord r =
mStackSupervisor.activityIdleInternalLocked(token, false /* fromTimeout */,
false /* processPausingActivities */, config);
if (stopProfiling) {
if ((mProfileProc == r.app) && mProfilerInfo != null) {
clearProfilerLocked();
}
}
}
}
Binder.restoreCallingIdentity(origId);
}
This calls ActivityStackSupervisor#activityIdleInternalLocked(), which we covered above: it moves the activities waiting in the stopping list on to their stop and destroy lifecycles.
The Activity is destroyed through ActivityStack#finishCurrentActivityLocked(), which we also saw earlier, but this time a different branch runs:
final ActivityRecord finishCurrentActivityLocked(ActivityRecord r, int mode, boolean oomAdj,
String reason) {
// ...
if (mode == FINISH_IMMEDIATELY
|| (prevState == PAUSED
&& (mode == FINISH_AFTER_PAUSE || inPinnedWindowingMode()))
|| finishingActivityInNonFocusedStack
|| prevState == STOPPING
|| prevState == STOPPED
|| prevState == ActivityState.INITIALIZING) {
r.makeFinishingLocked();
boolean activityRemoved = destroyActivityLocked(r, true, "finish-imm:" + reason);
// ...
}
// ...
}
This calls destroyActivityLocked():
final boolean destroyActivityLocked(ActivityRecord r, boolean removeFromApp, String reason) {
// ...
cleanUpActivityLocked(r, false, false);
final boolean hadApp = r.app != null;
if (hadApp) {
// ...
try {
if (DEBUG_SWITCH) Slog.i(TAG_SWITCH, "Destroying: " + r);
mService.getLifecycleManager().scheduleTransaction(r.app.thread, r.appToken,
DestroyActivityItem.obtain(r.finishing, r.configChangeFlags));
} catch (Exception e) {
// ...
}
// ...
}
// ...
}
This calls ClientLifecycleManager#scheduleTransaction(), which goes over IPC to the app's ApplicationThread. Note that the argument this time is a DestroyActivityItem, i.e. the task of destroying the Activity.
Just above that call there is another important method, cleanUpActivityLocked(). Remember the 10 s delayed task mentioned earlier? This method removes it. A quick look:
private void cleanUpActivityLocked(ActivityRecord r, boolean cleanServices, boolean setState) {
onActivityRemovedFromStack(r);
// ...
// Get rid of any pending idle timeouts.
removeTimeoutsForActivityLocked(r);
// ...
}
void removeTimeoutsForActivityLocked(ActivityRecord r) {
mStackSupervisor.removeTimeoutsForActivityLocked(r);
mHandler.removeMessages(PAUSE_TIMEOUT_MSG, r);
mHandler.removeMessages(STOP_TIMEOUT_MSG, r);
mHandler.removeMessages(DESTROY_TIMEOUT_MSG, r);
r.finishLaunchTickingLocked();
}
void removeTimeoutsForActivityLocked(ActivityRecord r) {
if (DEBUG_IDLE) Slog.d(TAG_IDLE, "removeTimeoutsForActivity: Callers="
+ Debug.getCallers(4));
mHandler.removeMessages(IDLE_TIMEOUT_MSG, r);
}
The code above is straightforward and needs no further comment. Next, let's see how the app process handles the Activity's destroy lifecycle.
The app process handles the destroy lifecycle
Go straight to ActivityThread#handleDestroyActivity():
@Override
public void handleDestroyActivity(IBinder token, boolean finishing, int configChanges,
boolean getNonConfigInstance, String reason) {
ActivityClientRecord r = performDestroyActivity(token, finishing,
configChanges, getNonConfigInstance, reason);
// ...
if (finishing) {
try {
ActivityManager.getService().activityDestroyed(token);
} catch (RemoteException ex) {
throw ex.rethrowFromSystemServer();
}
}
mSomeActivitiesChanged = true;
}
This calls performDestroyActivity() to do the work and, once it is done, notifies AMS through binder.
ActivityClientRecord performDestroyActivity(IBinder token, boolean finishing,
int configChanges, boolean getNonConfigInstance, String reason) {
ActivityClientRecord r = mActivities.get(token);
Class<? extends Activity> activityClass = null;
if (localLOGV) Slog.v(TAG, "Performing finish of " + r);
if (r != null) {
activityClass = r.activity.getClass();
r.activity.mConfigChangeFlags |= configChanges;
if (finishing) {
r.activity.mFinished = true;
}
performPauseActivityIfNeeded(r, "destroy");
if (!r.stopped) {
callActivityOnStop(r, false /* saveState */, "destroy");
}
if (getNonConfigInstance) {
try {
r.lastNonConfigurationInstances
= r.activity.retainNonConfigurationInstances();
} catch (Exception e) {
if (!mInstrumentation.onException(r.activity, e)) {
throw new RuntimeException(
"Unable to retain activity "
+ r.intent.getComponent().toShortString()
+ ": " + e.toString(), e);
}
}
}
try {
r.activity.mCalled = false;
mInstrumentation.callActivityOnDestroy(r.activity);
if (!r.activity.mCalled) {
throw new SuperNotCalledException(
"Activity " + safeToComponentShortString(r.intent) +
" did not call through to super.onDestroy()");
}
if (r.window != null) {
r.window.closeAllPanels();
}
} catch (SuperNotCalledException e) {
throw e;
} catch (Exception e) {
if (!mInstrumentation.onException(r.activity, e)) {
throw new RuntimeException(
"Unable to destroy activity " + safeToComponentShortString(r.intent)
+ ": " + e.toString(), e);
}
}
r.setState(ON_DESTROY);
}
mActivities.remove(token);
StrictMode.decrementExpectedActivityCount(activityClass);
return r;
}
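This method is simple: if the Activity has not been paused, pause it first; if it has not been stopped, stop it first; then run destroy, and finally remove the Activity from the local records.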
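The catch-up order can be written down as a tiny plain-Java model. DestroyCatchUpModel and its boolean parameters are made up for this sketch; it only records which callbacks would run, in which order, for a given starting state.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative model of the catch-up order; not the framework's code.
final class DestroyCatchUpModel {
    /** Returns the callbacks that run, in order, when destroy reaches an activity. */
    static List<String> destroy(boolean alreadyPaused, boolean alreadyStopped) {
        List<String> calls = new ArrayList<>();
        if (!alreadyPaused) {
            calls.add("onPause");   // not paused yet: pause first
        }
        if (!alreadyStopped) {
            calls.add("onStop");    // not stopped yet: stop first
        }
        calls.add("onDestroy");     // destroy always comes last
        return calls;
    }

    public static void main(String[] args) {
        // The finish() case from this article: paused early, never stopped.
        System.out.println(destroy(true, false));
    }
}
```

For a finished Activity that was paused promptly but never stopped, the model yields onStop followed by onDestroy, which matches the observation that both callbacks arrive late together.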
Summary
That completes the source analysis of the Activity finish flow on Android 9. Here is a brief summary:
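The app process issues the request -> AMS handles the request (starts the pause lifecycle for the Activity) -> the app process runs the pause lifecycle (and notifies AMS when done) -> AMS pre-processes the finishing Activity (adds it to the stopping list and starts a 10 s delayed task) -> AMS resumes a new Activity -> the app process runs the new Activity's resume lifecycle and then registers an IdleHandler (which notifies AMS when it runs) -> AMS processes the Activity in the stopping list (first removing the delayed task, then sending the destroy lifecycle to the app process) -> the app process runs the destroy lifecycle.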
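If this still doesn't make sense, go back and reread the source analysis; I have lost count of how many times I read it myself. Once it clicks, you should be able to answer the question from the beginning of the article.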
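To sum up why the problem from the beginning occurs: after an Activity finishes resuming, an IdleHandler is added whose job is to tell AMS to process the activities in the stopping list. By its nature an IdleHandler runs only when the thread is idle, so if the thread is busy it does not run, AMS never receives the idle message, and the destroy of the activities in the stopping list is not carried out. AMS does have a fallback, though: if no idle message arrives within 10 s, it processes the list anyway. That is how we end up with the problem described at the start, with the destroy lifecycle delayed by up to 10 s.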
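One possible way to avoid depending on the timing of onDestroy(), offered here only as a suggestion and not as the fix we shipped, is to release exclusive resources as soon as a finishing Activity is paused (on Android the check would be Activity#isFinishing() inside onPause()) and keep the onDestroy() call as an idempotent safety net. The sketch below models that idea in plain Java; PageModel and EarlyReleaseDemo are hypothetical names that only mimic the callback order.

```java
// Plain-Java model of the idea; PageModel is hypothetical and only mimics callback order.
final class PageModel {
    private final Runnable release;
    private boolean released;

    PageModel(Runnable release) {
        this.release = release;
    }

    /** Mirrors onPause(); 'finishing' plays the role of Activity#isFinishing(). */
    void onPause(boolean finishing) {
        if (finishing) {
            releaseOnce();
        }
    }

    /** Mirrors onDestroy(); still releases for paths that skipped the early release. */
    void onDestroy() {
        releaseOnce();
    }

    private void releaseOnce() {
        if (released) {
            return; // already released at pause time
        }
        released = true;
        release.run();
    }
}

final class EarlyReleaseDemo {
    /** Returns {release count before the next page starts, release count at the end}. */
    static int[] simulate() {
        int[] count = {0};
        PageModel a = new PageModel(() -> count[0]++);
        a.onPause(true);              // finish() was called; pause arrives promptly
        int beforeNextPage = count[0];
        a.onDestroy();                // may arrive much later on Android 10 and below
        return new int[] {beforeNextPage, count[0]};
    }

    public static void main(String[] args) {
        int[] result = simulate();
        System.out.println("released before next page: " + result[0]
                + ", total releases: " + result[1]);
    }
}
```

In this model the resource is already released before the next page starts, and the late onDestroy() does not release it a second time.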