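This is a signed article first published on the Juejin (稀土掘金) technical community. Reposting is prohibited within 14 days of publication, and prohibited without authorization after that. Infringement will be pursued.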
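Background
Crashes have always been the biggest factor affecting app stability. As a project iterates and grows more complex, unexpected crashes appear for both subjective and objective reasons. Likewise, throughout Android's history, even the Android system itself has carried latent crashes from one release to the next. Crashes are usually divided into Java-layer (virtual machine layer) crashes and native-layer crashes; this article focuses on Java-layer crashes.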
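Where Java-layer crashes come from
We run into all kinds of crashes during development, but how is a crash actually produced? Let's look at how a crash is born.
We all know that in Java the main function is where a program starts (there are in fact some earlier steps). In Android development, the system wraps the creation of the app's main thread inside itself, but however it is wrapped, and however powerful a Java-layer thread may be, it has to be bound to an operating-system-level thread to actually be driven. In other words, what we call a Java thread is really just a user of an underlying native Thread, and in ART each started Java thread is backed by its own native Thread. (To tell them apart, "thread" here means a Java-layer thread and "Thread" means the native-layer Thread. Strictly speaking the native Thread is not the real thread either, only an API the operating system provides, but for simplicity we treat the native thread and the operating-system thread as the same thing.)
Every time a Java-layer thread calls start(), it enters the world of the native-layer Thread: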
    public synchronized void start() {
        // Refuse to start a thread that has already been started.
        if (started)
            throw new IllegalThreadStateException();
        group.add(this);
        started = false;
        try {
            nativeCreate(this, stackSize, daemon);
            started = true;
        } finally {
            try {
                if (!started) {
                    group.threadStartFailed(this);
                }
            } catch (Throwable ignore) {
                /* do nothing. If start0 threw a Throwable then
                   it will be passed up the call stack */
            }
        }
    }
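To see that guard at the top of start() from the caller's side, here is a minimal sketch that runs on a plain JVM. The class name StartOnceDemo and its method are made up for illustration; the only real API used is java.lang.Thread.

```java
// Sketch: a java.lang.Thread object can be started only once.
// StartOnceDemo is a hypothetical name used only for this example.
class StartOnceDemo {
    /** Returns true if a second start() call is rejected. */
    static boolean secondStartIsRejected() {
        Thread worker = new Thread(() -> { /* no work needed */ });
        worker.start();                 // first call: hands off to the native layer
        try {
            worker.join();              // wait until the thread has finished
        } catch (InterruptedException interrupted) {
            Thread.currentThread().interrupt();
            return false;
        }
        try {
            worker.start();             // second call on the same object
            return false;
        } catch (IllegalThreadStateException expected) {
            return true;                // start() refused to run twice
        }
    }

    public static void main(String[] args) {
        System.out.println("second start rejected: " + secondStartIsRejected());
    }
}
```

A finished thread cannot be restarted; if the work has to run again, a new Thread object is needed.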
This ends up calling a JNI method:
    private native static void nativeCreate(Thread t, long stackSize, boolean daemon);
whose native-layer implementation is:
    static void Thread_nativeCreate(JNIEnv* env, jclass, jobject java_thread, jlong stack_size,
                                    jboolean daemon) {
      // There are sections in the zygote that forbid thread creation.
      Runtime* runtime = Runtime::Current();
      if (runtime->IsZygote() && runtime->IsZygoteNoThreadSection()) {
        jclass internal_error = env->FindClass("java/lang/InternalError");
        CHECK(internal_error != nullptr);
        env->ThrowNew(internal_error, "Cannot create threads in zygote");
        return;
      }
      // This is where the thread is actually created.
      Thread::CreateNativeThread(env, java_thread, stack_size, daemon == JNI_TRUE);
    }
After a series of checks, CreateNativeThread finally reaches the point where the thread is really created: it calls pthread_create to create a real Thread.
Inside Thread::CreateNativeThread:
    ...
    pthread_create_result = pthread_create(&new_pthread,
                                           &attr,
                                           Thread::CreateCallback,
                                           child_thread);
    CHECK_PTHREAD_CALL(pthread_attr_destroy, (&attr), "new thread");
    if (pthread_create_result == 0) {
      // pthread_create started the new thread. The child is now responsible for managing the
      // JNIEnvExt we created.
      // Note: we can't check for tmp_jni_env == nullptr, as that would require synchronization
      //       between the threads.
      child_jni_env_ext.release();  // NOLINT pthreads API.
      return;
    }
    ...
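At this point it is clear that a Java-layer thread is really bound to a native-layer Thread. With that in hand, we can return to the topic of crashes. When an exception occurs (that is, when the virtual machine detects an operation that violates its rules), we are still within the virtual machine's control, so it can directly call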
    void Thread::ThrowNewException(const char* exception_class_descriptor,
                                   const char* msg) {
      // Callers should either clear or call ThrowNewWrappedException.
      AssertNoPendingExceptionForNewException(msg);
      ThrowNewWrappedException(exception_class_descriptor, msg);
    }
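to throw the exception. Every Java-layer crash begins with a thrown exception like this, because recognizing the problem is still within the scope of the process that hosts the virtual machine (a native crash is something the virtual machine cannot recognize directly). The common crashes we see every day are examples.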
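As a small illustration of exceptions that the runtime raises on its own, without any explicit throw statement in our code, the sketch below triggers four familiar ones and reports their class names. VmRaisedExceptions and its helper methods are hypothetical names for this example, and it runs on a plain JVM rather than on ART.

```java
// Sketch: exceptions raised by the runtime itself when an operation breaks its rules.
// VmRaisedExceptions is a hypothetical name used only for this example.
class VmRaisedExceptions {
    /** Runs the action and returns the simple class name of whatever it throws, or "none". */
    static String thrownBy(Runnable action) {
        try {
            action.run();
            return "none";
        } catch (Throwable t) {
            return t.getClass().getSimpleName();
        }
    }

    static String nullDereference() {
        String text = null;
        return thrownBy(() -> text.length());          // calling a method on null
    }

    static String arrayOutOfBounds() {
        int[] cells = new int[2];
        return thrownBy(() -> cells[5] = 1);           // index past the end of the array
    }

    static String divisionByZero() {
        int zero = 0;
        return thrownBy(() -> { int unused = 1 / zero; });   // integer division by zero
    }

    static String badCast() {
        Object value = "text";
        return thrownBy(() -> { Integer number = (Integer) value; });   // String is not an Integer
    }

    public static void main(String[] args) {
        System.out.println(nullDereference());
        System.out.println(arrayOutOfBounds());
        System.out.println(divisionByZero());
        System.out.println(badCast());
    }
}
```

In each case the code never writes throw; the runtime detects the illegal operation and creates the exception object for us.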
Control then reaches Thread::ThrowNewWrappedException, which in turn calls Thread::SetException and records the exception that was just raised:
    void Thread::SetException(ObjPtr<mirror::Throwable> new_exception) {
      CHECK(new_exception != nullptr);
      // TODO: DCHECK(!IsExceptionPending());
      tlsPtr_.exception = new_exception.Ptr();
    }
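If no catch block handles the exception and it propagates out of the thread's entry method, the thread exits and Thread::Destroy is called. Inside it, the thread decides how this exception should be handled: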
    void Thread::Destroy() {
      ...
      if (tlsPtr_.opeer != nullptr) {
        ScopedObjectAccess soa(self);
        // We may need to call user-supplied managed code, do this before final clean-up.
        HandleUncaughtExceptions(soa);
        RemoveFromThreadGroup(soa);
        Runtime* runtime = Runtime::Current();
        if (runtime != nullptr) {
          runtime->GetRuntimeCallbacks()->ThreadDeath(self);
        }
        ...
      }
      ...
    }
HandleUncaughtExceptions is the function that does the handling. Let's look at it next:
    void Thread::HandleUncaughtExceptions(ScopedObjectAccessAlreadyRunnable& soa) {
      if (!IsExceptionPending()) {
        return;
      }
      ScopedLocalRef<jobject> peer(tlsPtr_.jni_env, soa.AddLocalReference<jobject>(tlsPtr_.opeer));
      ScopedThreadStateChange tsc(this, ThreadState::kNative);
      // Get and clear the exception.
      ScopedLocalRef<jthrowable> exception(tlsPtr_.jni_env, tlsPtr_.jni_env->ExceptionOccurred());
      tlsPtr_.jni_env->ExceptionClear();
      // Call the Thread instance's dispatchUncaughtException(Throwable)
      // This is the key point: control returns to the Java layer.
      tlsPtr_.jni_env->CallVoidMethod(peer.get(),
                                      WellKnownClasses::java_lang_Thread_dispatchUncaughtException,
                                      exception.get());
      // If the dispatchUncaughtException threw, clear that exception too.
      tlsPtr_.jni_env->ExceptionClear();
    }
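One consequence of this flow is that the handler runs on the very thread that threw, as part of that thread shutting down. The sketch below checks this from the Java side on a plain JVM; DyingThreadDemo is a hypothetical name, and the code uses only java.lang.Thread.

```java
// Sketch: the uncaught-exception handler is invoked on the failing thread itself,
// and the thread is gone once the handler has returned.
// DyingThreadDemo is a hypothetical name used only for this example.
class DyingThreadDemo {
    /** Returns true if the handler ran on the thread that threw and that thread then ended. */
    static boolean handlerRunsOnFailingThread() {
        final boolean[] sameThread = {false};
        Thread worker = new Thread(() -> { throw new IllegalStateException("boom"); });
        worker.setUncaughtExceptionHandler(
                (t, e) -> sameThread[0] = (Thread.currentThread() == t));
        worker.start();
        try {
            worker.join();   // returns only after the handler finished and the thread ended
        } catch (InterruptedException interrupted) {
            Thread.currentThread().interrupt();
            return false;
        }
        return sameThread[0] && !worker.isAlive();
    }

    public static void main(String[] args) {
        System.out.println("handler ran on failing thread: " + handlerRunsOnFailingThread());
    }
}
```

The handler is the thread's last chance to run managed code; returning from it does not bring the thread back to life.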
We are now near the end. The handler goes through JNI back into the Java layer, and the Java method it connects to is dispatchUncaughtException (java_lang_Thread_dispatchUncaughtException):
    public final void dispatchUncaughtException(Throwable e) {
        // BEGIN Android-added: uncaughtExceptionPreHandler for use by platform.
        Thread.UncaughtExceptionHandler initialUeh =
                Thread.getUncaughtExceptionPreHandler();
        if (initialUeh != null) {
            try {
                initialUeh.uncaughtException(this, e);
            } catch (RuntimeException | Error ignored) {
                // Throwables thrown by the initial handler are ignored
            }
        }
        // END Android-added: uncaughtExceptionPreHandler for use by platform.
        getUncaughtExceptionHandler().uncaughtException(this, e);
    }
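The last line calls getUncaughtExceptionHandler(), which prefers a handler set on the thread itself and otherwise falls back through the thread group to the process-wide default. A minimal sketch of that priority on a plain JVM follows (the Android-only pre-handler is not part of it); HandlerPriorityDemo is a hypothetical name.

```java
// Sketch: a per-thread handler takes priority over the default handler.
// HandlerPriorityDemo is a hypothetical name used only for this example.
class HandlerPriorityDemo {
    /** Throws in a new thread and reports which handler saw the exception. */
    static String whoHandles(boolean installPerThreadHandler) {
        final String[] handledBy = {"nobody"};
        Thread.UncaughtExceptionHandler previous = Thread.getDefaultUncaughtExceptionHandler();
        Thread.setDefaultUncaughtExceptionHandler((t, e) -> handledBy[0] = "default");
        try {
            Thread worker = new Thread(() -> { throw new IllegalStateException("boom"); });
            if (installPerThreadHandler) {
                worker.setUncaughtExceptionHandler((t, e) -> handledBy[0] = "per-thread");
            }
            worker.start();
            worker.join();   // the handler has finished once join() returns
        } catch (InterruptedException interrupted) {
            Thread.currentThread().interrupt();
        } finally {
            // Put back whatever default handler was installed before the demo.
            Thread.setDefaultUncaughtExceptionHandler(previous);
        }
        return handledBy[0];
    }

    public static void main(String[] args) {
        System.out.println(whoHandles(true));
        System.out.println(whoHandles(false));
    }
}
```

This is why replacing only the default handler affects every thread that has no handler of its own.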
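We now fully understand how a Java-layer exception is produced.
Why a Java-layer exception leads to a crash
The previous section showed how an exception is produced. Attentive readers may have noticed that I keep saying "exception" rather than "crash", because an exception does not necessarily produce a crash. As we saw, dispatchUncaughtException ultimately tries to call an UncaughtExceptionHandler to handle the exception. So when is that UncaughtExceptionHandler set? The system installs it in advance during initialization, in frameworks/base/core/java/com/android/internal/os/RuntimeInit.java: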
    protected static final void commonInit() {
        if (DEBUG) Slog.d(TAG, "Entered RuntimeInit!");
        /*
         * set handlers; these apply to all threads in the VM. Apps can replace
         * the default handler, but not the pre handler.
         */
        LoggingHandler loggingHandler = new LoggingHandler();
        RuntimeHooks.setUncaughtExceptionPreHandler(loggingHandler);
        Thread.setDefaultUncaughtExceptionHandler(new KillApplicationHandler(loggingHandler));
        /*
         * Install a time zone supplier that uses the Android persistent time zone system property.
         */
        RuntimeHooks.setTimeZoneIdSupplier(() -> SystemProperties.get("persist.sys.timezone"));
        LogManager.getLogManager().reset();
        new AndroidConfig();
        /*
         * Sets the default HTTP User-Agent used by HttpURLConnection.
         */
        String userAgent = getDefaultUserAgent();
        System.setProperty("http.agent", userAgent);
        /*
         * Wire socket tagging to traffic stats.
         */
        TrafficStats.attachSocketTagger();
        initialized = true;
    }
So KillApplicationHandler is the culprit. When an uncaught exception arrives, KillApplicationHandler handles it, and its way of handling it is to kill the app:
    private static class KillApplicationHandler implements Thread.UncaughtExceptionHandler {
        private final LoggingHandler mLoggingHandler;

        public KillApplicationHandler(LoggingHandler loggingHandler) {
            this.mLoggingHandler = Objects.requireNonNull(loggingHandler);
        }

        @Override
        public void uncaughtException(Thread t, Throwable e) {
            try {
                ensureLogging(t, e);
                // Don't re-enter -- avoid infinite loops if crash-reporting crashes.
                if (mCrashing) return;
                mCrashing = true;
                if (ActivityThread.currentActivityThread() != null) {
                    ActivityThread.currentActivityThread().stopProfiling();
                }
                // Bring up crash dialog, wait for it to be dismissed
                ActivityManager.getService().handleApplicationCrash(
                        mApplicationObject, new ApplicationErrorReport.ParcelableCrashInfo(e));
            } catch (Throwable t2) {
                if (t2 instanceof DeadObjectException) {
                    // System process is dead; ignore
                } else {
                    try {
                        Clog_e(TAG, "Error reporting crash", t2);
                    } catch (Throwable t3) {
                        // Even Clog_e() fails!  Oh well.
                    }
                }
            } finally {
                // Try everything to make sure this process goes away.
                Process.killProcess(Process.myPid());
                System.exit(10);
            }
        }

        private void ensureLogging(Thread t, Throwable e) {
            if (!mLoggingHandler.mTriggered) {
                try {
                    mLoggingHandler.uncaughtException(t, e);
                } catch (Throwable loggingThrowable) {
                    // Ignored.
                }
            }
        }
    }
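The shape of uncaughtException here is worth isolating: report inside try, guard against re-entry, and terminate in finally so the process goes away even if reporting fails or returns early. Below is a minimal sketch of that shape for a plain JVM. GuardedCrashHandler, its reporter, and its terminator are hypothetical stand-ins (the real handler talks to ActivityManager and calls Process.killProcess), so this is an illustration of the control flow, not the platform's implementation.

```java
// Sketch of the try / re-entry guard / finally structure used by KillApplicationHandler.
// GuardedCrashHandler, reporter and terminator are hypothetical names for this example.
class GuardedCrashHandler implements Thread.UncaughtExceptionHandler {
    private final Thread.UncaughtExceptionHandler reporter; // stand-in for crash reporting
    private final Runnable terminator;                       // stand-in for killing the process
    private volatile boolean crashing;                       // plays the role of mCrashing

    GuardedCrashHandler(Thread.UncaughtExceptionHandler reporter, Runnable terminator) {
        this.reporter = reporter;
        this.terminator = terminator;
    }

    @Override
    public void uncaughtException(Thread t, Throwable e) {
        try {
            if (crashing) return;        // do not re-enter if reporting itself crashes
            crashing = true;
            reporter.uncaughtException(t, e);
        } catch (Throwable reportingFailure) {
            // Reporting failed; there is nothing more to do except terminate.
        } finally {
            terminator.run();            // runs on every path, including the early return
        }
    }

    public static void main(String[] args) {
        GuardedCrashHandler handler = new GuardedCrashHandler(
                (t, e) -> System.out.println("reporting " + e),
                () -> System.out.println("terminating"));
        handler.uncaughtException(Thread.currentThread(), new RuntimeException("boom"));
    }
}
```

Because termination sits in finally, no outcome of the reporting step can keep the process alive, which is exactly why the default handler always ends in a crash.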
There it is: this is the real source of crashes caused by exceptions.
Catching crashes
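From the above we know that the source of a crash is KillApplicationHandler, because its default behavior is to kill the app. We also notice that it implements UncaughtExceptionHandler. Surfacing an exception promptly so it gets fixed is a good thing, but for some exceptions, such as those caused by the Android SDK itself or others that are not very important, crashing the whole app is not a good response. The designers of the Java virtual machine clearly thought of this too, so they provide the following method
Thread.java
    public static void setDefaultUncaughtExceptionHandler(UncaughtExceptionHandler eh)
to which we pass an instance of our own class that implements the UncaughtExceptionHandler interface: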
    public interface UncaughtExceptionHandler {
        /**
         * Method invoked when the given thread terminates due to the
         * given uncaught exception.
         * <p>Any exception thrown by this method will be ignored by the
         * Java Virtual Machine.
         * @param t the thread
         * @param e the exception
         */
        void uncaughtException(Thread t, Throwable e);
    }
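All we need to do is write a class modeled on KillApplicationHandler, and we have our own exception handler for exceptions in our program (or for exceptions specific to certain Android versions). For example: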
    import android.util.Log

    class MyExceptionHandler : Thread.UncaughtExceptionHandler {
        override fun uncaughtException(t: Thread, e: Throwable) {
            // Custom handling logic goes here
            Log.i("hello", e.toString())
        }
    }
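In practice we usually do not want to swallow everything. A hedged sketch of one common approach, written in plain Java so it can run off-device: tolerate one chosen exception type and pass everything else to the handler that was installed before ours (on Android that would be KillApplicationHandler). SelectiveHandler, ignoredType, and install are hypothetical names for this example, not platform APIs.

```java
// Sketch: ignore one chosen exception type, delegate all others to the previous handler.
// SelectiveHandler, ignoredType and install are hypothetical names for this example.
class SelectiveHandler implements Thread.UncaughtExceptionHandler {
    private final Class<? extends Throwable> ignoredType;    // the type we choose to tolerate
    private final Thread.UncaughtExceptionHandler fallback;  // usually the previous default handler
    private volatile Throwable lastIgnored;

    SelectiveHandler(Class<? extends Throwable> ignoredType,
                     Thread.UncaughtExceptionHandler fallback) {
        this.ignoredType = ignoredType;
        this.fallback = fallback;
    }

    @Override
    public void uncaughtException(Thread t, Throwable e) {
        if (ignoredType.isInstance(e)) {
            lastIgnored = e;             // record or report it instead of crashing
            return;
        }
        if (fallback != null) {
            fallback.uncaughtException(t, e);   // keep the original behavior for the rest
        }
    }

    Throwable lastIgnored() {
        return lastIgnored;
    }

    /** Installs the handler as the process-wide default, chaining to whatever was there. */
    static SelectiveHandler install(Class<? extends Throwable> ignoredType) {
        SelectiveHandler handler =
                new SelectiveHandler(ignoredType, Thread.getDefaultUncaughtExceptionHandler());
        Thread.setDefaultUncaughtExceptionHandler(handler);
        return handler;
    }
}
```

A call such as SelectiveHandler.install(UnsupportedOperationException.class) early in startup would wire it in. Keep in mind that the thread that threw still terminates once the handler returns; the handler only decides whether the process is killed along with it.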
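Summary
We have now seen how a Java crash is produced, and why the familiar UncaughtExceptionHandler can intercept exceptions that we do not want to turn into crashes. More on related topics will follow in the rest of this Android performance optimization series. Thanks for reading.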