Android Thread Creation: Study Notes


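Background

This post summarizes what I learned about thread creation on Android.

  1. Study the source code involved in creating a thread.
  2. Learn how to monitor thread creation.
  3. Learn how to implement a Flipper plugin that monitors thread creation.

Source code walkthrough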

		Thread thread = new Thread(new Runnable() {
			@Override
			public void run() {
                            //do something
			}
		});

Creating the Java Thread object

java/lang/Thread

class Thread implements Runnable {
    /**
     * Reference to the native thread object.
     *
     * <p>Is 0 if the native thread has not yet been created/started, or has been destroyed.
     */
    private volatile long nativePeer;
    // END Android-added: Android specific fields lock, nativePeer.

    private volatile String name;

    /* What will be run. */
    private Runnable target;

    /* The group of this thread */
    private ThreadGroup group;

    /* For autonumbering anonymous threads. */
    private static int threadInitNumber;
    private static synchronized int nextThreadNum() {
        return threadInitNumber++;
    }    

    /* ThreadLocal values pertaining to this thread. This map is maintained
     * by the ThreadLocal class. */
    ThreadLocal.ThreadLocalMap threadLocals = null;    

    /*
     * Thread ID
     */
    private final long tid;

    /* For generating thread ID */
    private static long threadSeqNumber;

    private static synchronized long nextThreadID() {
        return ++threadSeqNumber;
    }

    public Thread() {
        this(null, null, "Thread-" + nextThreadNum(), 0);
    }

    public Thread(Runnable target) {
        this(null, target, "Thread-" + nextThreadNum(), 0);
    }

	public Thread(ThreadGroup group, Runnable target) {
        this(group, target, "Thread-" + nextThreadNum(), 0);
    }

    public Thread(String name) {
        this(null, null, name, 0);
    }

    /** @hide */
    Thread(ThreadGroup group, String name, int priority, boolean daemon) {
        this.group = group;
        this.group.addUnstarted();
        // Must be tolerant of threads without a name.
        if (name == null) {
            name = "Thread-" + nextThreadNum();
        }

        // NOTE: Resist the temptation to call setName() here. This constructor is only called
        // by the runtime to construct peers for threads that have attached via JNI and it's
        // undesirable to clobber their natively set name.
        this.name = name;

        this.priority = priority;
        this.daemon = daemon;
        init2(currentThread(), true);
        this.stackSize = 0;
        this.tid = nextThreadID();
    }    

}
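  1. The Java-level Thread class has several overloaded constructors. An "anonymous thread" is one whose name was not set explicitly in the constructor, so it gets the default name "Thread-" followed by an auto-incremented number.

Anonymous threads make problems harder to analyze. One remedy is ASM bytecode instrumentation that names the anonymous thread pools in a project and replaces every Thread with ShadowThread, as booster-android-instrument-thread does.

  2. The Java-level thread ID: the tid here is assigned by nextThreadID() and is just an auto-incremented number. It is not the same value as the native tid; only the native tid is the real thread ID.
  3. nativePeer is a pointer to the native thread object, so the native thread object's information can be reached through it.
  4. Constructing a Thread object only assigns fields; it does not create a thread. The real work of creating a thread starts only when Thread#start is called.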

Creating the art::Thread object

CreateNativeThread

Internally, start calls the native method nativeCreate, which kicks off the actual thread creation. On the native side, nativeCreate is implemented by Thread::CreateNativeThread in thread.cc.

    public synchronized void start() {
        // Throw if the thread has already been started
        if (started)
            throw new IllegalThreadStateException();
        // Add the thread to its group
        group.add(this);

        started = false;
        try {
             // Call the native function that does the real thread creation
            nativeCreate(this, stackSize, daemon);
            started = true;
        } finally {
            try {
                if (!started) {
                    group.threadStartFailed(this);
                }
            } catch (Throwable ignore) {
                /* do nothing. If start0 threw a Throwable then
                  it will be passed up the call stack */
            }
        }
    }
void Thread::CreateNativeThread(JNIEnv* env, jobject java_peer, size_t stack_size, bool is_daemon) {
  CHECK(java_peer != nullptr);
  Thread* self = static_cast<JNIEnvExt*>(env)->GetSelf();
  // Precondition check: bail out if the runtime is shutting down
  Runtime* runtime = Runtime::Current();
  // Atomically start the birth of the thread ensuring the runtime isn't shutting down.
  bool thread_start_during_shutdown = false;
  {
    MutexLock mu(self, *Locks::runtime_shutdown_lock_);
    if (runtime->IsShuttingDownLocked()) {
      thread_start_during_shutdown = true;
    } else {
      runtime->StartThreadBirth();
    }
  }
  if (thread_start_during_shutdown) {
    ScopedLocalRef<jclass> error_class(env, env->FindClass("java/lang/InternalError"));
    env->ThrowNew(error_class.get(), "Thread starting during runtime shutdown");
    return;
  }
  // Step 1: create the ART runtime's Thread object
  Thread* child_thread = new Thread(is_daemon);
  // Use global JNI ref to hold peer live while child thread starts.
  child_thread->tlsPtr_.jpeer = env->NewGlobalRef(java_peer);
  stack_size = FixStackSize(stack_size);

  // Thread.start is synchronized, so we know that nativePeer is 0, and know that we're not racing
  // to assign it.
  env->SetLongField(java_peer, WellKnownClasses::java_lang_Thread_nativePeer,
                    reinterpret_cast<jlong>(child_thread));

  // Try to allocate a JNIEnvExt for the thread. We do this here as we might be out of memory and
  // do not have a good way to report this on the child's side.
  std::string error_msg;
  std::unique_ptr<JNIEnvExt> child_jni_env_ext(
      JNIEnvExt::Create(child_thread, Runtime::Current()->GetJavaVM(), &error_msg));

  int pthread_create_result = 0;
  if (child_jni_env_ext.get() != nullptr) {
    pthread_t new_pthread;
    pthread_attr_t attr;
    child_thread->tlsPtr_.tmp_jni_env = child_jni_env_ext.get();
    // Step 2: create the underlying Linux thread (initialization of attr is omitted from this excerpt)
    pthread_create_result = pthread_create(&new_pthread,
                                           &attr,
                                           Thread::CreateCallback,
                                           child_thread);
    CHECK_PTHREAD_CALL(pthread_attr_destroy, (&attr), "new thread");

    if (pthread_create_result == 0) {
      // pthread_create started the new thread. The child is now responsible for managing the
      // JNIEnvExt we created.
      child_jni_env_ext.release();  // NOLINT pthreads API.
      return;
    }
  }

  // Either JNIEnvExt::Create or pthread_create(3) failed, so clean up.
  {
    MutexLock mu(self, *Locks::runtime_shutdown_lock_);
    runtime->EndThreadBirth();
  }
  // Manually delete the global reference since Thread::Init will not have been run. Make sure
  // nothing can observe both opeer and jpeer set at the same time.
  child_thread->DeleteJPeer(env);
  delete child_thread;
  child_thread = nullptr;
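  // Step 3 (failure path): reset nativePeer on java_peer to 0; the address itself was written above, before pthread_create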
  env->SetLongField(java_peer, WellKnownClasses::java_lang_Thread_nativePeer, 0);
  {
    std::string msg(child_jni_env_ext.get() == nullptr ?
        StringPrintf("Could not allocate JNI Env: %s", error_msg.c_str()) :
        StringPrintf("pthread_create (%s stack) failed: %s",
                                 PrettySize(stack_size).c_str(), strerror(pthread_create_result)));
    ScopedObjectAccess soa(env);
    soa.Self()->ThrowOutOfMemoryError(msg.c_str());
  }
}
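  1. Step 1: new Thread() constructs the Thread object that represents the thread in the native layer of the ART runtime.

Depending on their type (pointers, 32-bit values, 64-bit values), member variables are grouped into the tlsPtr_, tls32_, and tls64_ structs. Taking tls32_ as an example, it holds the thread state, the suspend count, the tid, the daemon flag, whether an OutOfMemoryError is being thrown, and so on. (The fields in these structs may differ between Android versions.)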

Thread::Thread(bool daemon)
    : tls32_(daemon),
      wait_monitor_(nullptr),
      is_runtime_thread_(false) {
  wait_mutex_ = new Mutex("a thread wait mutex", LockLevel::kThreadWaitLock);
  wait_cond_ = new ConditionVariable("a thread wait condition variable", *wait_mutex_);
  tlsPtr_.mutator_lock = Locks::mutator_lock_;
  tlsPtr_.instrumentation_stack =
      new std::map<uintptr_t, instrumentation::InstrumentationStackFrame>;
  tlsPtr_.name.store(kThreadNameDuringStartup, std::memory_order_relaxed);

  static_assert((sizeof(Thread) % 4) == 0U,
                "art::Thread has a size which is not a multiple of 4.");
  StateAndFlags state_and_flags = StateAndFlags(0u).WithState(ThreadState::kNative);
  tls32_.state_and_flags.store(state_and_flags.GetValue(), std::memory_order_relaxed);
  tls32_.interrupted.store(false, std::memory_order_relaxed);
  // Initialize with no permit; if the java Thread was unparked before being
  // started, it will unpark itself before calling into java code.
  tls32_.park_state_.store(kNoPermit, std::memory_order_relaxed);
  memset(&tlsPtr_.held_mutexes[0], 0, sizeof(tlsPtr_.held_mutexes));
  std::fill(tlsPtr_.rosalloc_runs,
            tlsPtr_.rosalloc_runs + kNumRosAllocThreadLocalSizeBracketsInThread,
            gc::allocator::RosAlloc::GetDedicatedFullRun());
  tlsPtr_.checkpoint_function = nullptr;
  for (uint32_t i = 0; i < kMaxSuspendBarriers; ++i) {
    tlsPtr_.active_suspend_barriers[i] = nullptr;
  }
  tlsPtr_.flip_function = nullptr;
  tlsPtr_.thread_local_mark_stack = nullptr;
  tls32_.is_transitioning_to_runnable = false;
  ResetTlab();
}
   explicit tls_32bit_sized_values(bool is_daemon)
        : state_and_flags(0u),
          suspend_count(0),
          thin_lock_thread_id(0),
          tid(0),
          daemon(is_daemon),
          throwing_OutOfMemoryError(false),
          no_thread_suspension(0),
          thread_exit_check_count(0),
          is_transitioning_to_runnable(false),
          is_gc_marking(false),
          is_deopt_check_required(false),
          weak_ref_access_enabled(WeakRefAccessState::kVisiblyEnabled),
          disable_thread_flip_count(0),
          user_code_suspend_count(0),
          force_interpreter_count(0),
          make_visibly_initialized_counter(0),
          define_class_counter(0),
          num_name_readers(0),
          shared_method_hotness(kSharedMethodHotnessThreshold)
        {}
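  2. Step 2: pthread_create is called to create the operating-system-level thread.
  3. Step 3: the address of the newly created art::Thread object is stored in the nativePeer field of the Java Thread object. In the code above this write happens before pthread_create is called, and the field is reset to 0 if thread creation fails.

So once you hold a Thread object in Java, you can read nativePeer via reflection, which gives you the address of the native thread object.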

    public static final long getNativePeer(Thread t)throws IllegalAccessException{
        try {
            Field nativePeerField = Thread.class.getDeclaredField("nativePeer");
            nativePeerField.setAccessible(true);
            Long nativePeer = (Long) nativePeerField.get(t);
            return nativePeer;
        } catch (NoSuchFieldException e) {
            throw new IllegalAccessException("failed to get nativePeer value");
        } catch (IllegalAccessException e) {
            throw e;
        }
    }

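Creating the system thread

As mentioned above, the function that actually creates the thread is pthread_create, which creates an operating-system-level thread.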

int pthread_create(pthread_t* thread_out, pthread_attr_t const* attr,
                   void* (*start_routine)(void*), void* arg)

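The parameters of pthread_create have the following meanings.

  1. thread_out: a pointer to a pthread_t. On success, the value it points to is set to the pthread_t of the newly created thread.
  2. attr: the attributes of the thread.
  3. start_routine: the entry function; the new thread starts running from it.
  4. arg: the argument passed to start_routine, or null if start_routine takes no input.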
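The four parameters can be seen together in a small standalone call. This is a minimal sketch rather than ART code: Task, DoubleInput, and RunOnNewThread are hypothetical names, and the thread is joined for simplicity instead of being created detached.

```cpp
#include <cassert>
#include <pthread.h>

// Hypothetical payload handed to the new thread through the `arg` parameter.
struct Task {
  int input;
  int output;
};

// start_routine: the first function the new thread executes.
// It receives the `arg` pointer that was passed to pthread_create.
void* DoubleInput(void* arg) {
  Task* task = static_cast<Task*>(arg);
  task->output = task->input * 2;
  return nullptr;
}

// Runs DoubleInput on a new thread and waits for it to finish.
// Returns 0 on success, or the error code reported by pthreads.
int RunOnNewThread(Task* task) {
  pthread_t thread;  // thread_out: filled in by pthread_create on success
  int rc = pthread_create(&thread, /*attr=*/nullptr, DoubleInput, task);
  if (rc != 0) return rc;
  return pthread_join(thread, nullptr);
}
```

Calling RunOnNewThread with a Task whose input is 21 leaves 42 in its output field once the call returns.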

In the pthread_create call shown earlier, the third argument specifies Thread::CreateCallback as the function the thread runs once it starts.

void* Thread::CreateCallback(void* arg) {
  Thread* self = reinterpret_cast<Thread*>(arg);
  Runtime* runtime = Runtime::Current();
  {
    //..
    // Call Thread::Init() to initialize the Thread object
    CHECK(self->Init(runtime->GetThreadList(), runtime->GetJavaVM(), self->tlsPtr_.tmp_jni_env));
    self->tlsPtr_.tmp_jni_env = nullptr;
    Runtime::Current()->EndThreadBirth();
  }
  {
    ScopedObjectAccess soa(self);
    self->InitStringEntryPoints();

    // Copy peer into self, deleting global reference when done.
    CHECK(self->tlsPtr_.jpeer != nullptr);
    self->tlsPtr_.opeer = soa.Decode<mirror::Object>(self->tlsPtr_.jpeer).Ptr();
    // Make sure nothing can observe both opeer and jpeer set at the same time.
    self->DeleteJPeer(self->GetJniEnv());

    // Set the thread name
    self->SetThreadName(self->GetThreadName()->ToModifiedUtf8().c_str());

    ArtField* priorityField = jni::DecodeArtField(WellKnownClasses::java_lang_Thread_priority);
    // Set the thread priority
    self->SetNativePriority(priorityField->GetInt(self->tlsPtr_.opeer));
    
    runtime->GetRuntimeCallbacks()->ThreadStart(self);

    ArtField* unparkedField = jni::DecodeArtField(
        WellKnownClasses::java_lang_Thread_unparkedBeforeStart);
    bool should_unpark = false;
    {
      art::MutexLock mu(soa.Self(), *art::Locks::thread_list_lock_);
      should_unpark = unparkedField->GetBoolean(self->tlsPtr_.opeer) == JNI_TRUE;
    }
    if (should_unpark) {
      self->Unpark();
    }
    // Invoke the 'run' method of our java.lang.Thread.
    ObjPtr<mirror::Object> receiver = self->tlsPtr_.opeer;
    jmethodID mid = WellKnownClasses::java_lang_Thread_run;
    ScopedLocalRef<jobject> ref(soa.Env(), soa.AddLocalReference<jobject>(receiver));
    InvokeVirtualOrInterfaceWithJValues(soa, ref.get(), mid, nullptr);
  }
  // Detach and delete self.
  Runtime::Current()->GetThreadList()->Unregister(self);

  return nullptr;
}

CreateCallback first calls Thread::Init() to initialize parts of the Thread object.

bool Thread::Init(ThreadList* thread_list, JavaVMExt* java_vm, JNIEnvExt* jni_env_ext) {
  //..
  // Set pthread_self_ ahead of pthread_setspecific, that makes Thread::Current function, this
  // avoids pthread_self_ ever being invalid when discovered from Thread::Current().
  tlsPtr_.pthread_self = pthread_self();

  ScopedTrace trace("Thread::Init");

  SetUpAlternateSignalStack();
  if (!InitStackHwm()) {
    return false;
  }
  InitCpu();
  InitTlsEntryPoints();
  RemoveSuspendTrigger();
  InitCardTable();
  InitTid();

#ifdef __BIONIC__
  __get_tls()[TLS_SLOT_ART_THREAD_SELF] = this;
#else
  CHECK_PTHREAD_CALL(pthread_setspecific, (Thread::pthread_key_self_, this), "attach self");
  Thread::self_tls_ = this;
#endif

  tls32_.thin_lock_thread_id = thread_list->AllocThreadId(this);

  if (jni_env_ext != nullptr) {
    DCHECK_EQ(jni_env_ext->GetVm(), java_vm);
    DCHECK_EQ(jni_env_ext->GetSelf(), this);
    tlsPtr_.jni_env = jni_env_ext;
  } else {
    std::string error_msg;
    tlsPtr_.jni_env = JNIEnvExt::Create(this, java_vm, &error_msg);
    if (tlsPtr_.jni_env == nullptr) {
      LOG(ERROR) << "Failed to create JNIEnvExt: " << error_msg;
      return false;
    }
  }
  ScopedTrace trace3("ThreadList::Register");
  thread_list->Register(this);
  return true;
}
void Thread::InitTid() {
  tls32_.tid = ::art::GetTid();
}

uint32_t GetTid() {
#if defined(__APPLE__)
  uint64_t owner;
  CHECK_PTHREAD_CALL(pthread_threadid_np, (nullptr, &owner), __FUNCTION__);  // Requires Mac OS 10.6
  return owner;
#elif defined(__BIONIC__)
  return gettid();
#elif defined(_WIN32)
  return static_cast<pid_t>(::GetCurrentThreadId());
#else
  return syscall(__NR_gettid);
#endif
}

InitTid assigns the tid field of the tls32_ struct.

GetTid calls the platform-specific system function to obtain the operating-system-level thread ID.
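To see that this ID really is per thread, the Linux branch of GetTid can be tried on its own. This is a sketch assuming a Linux or Android target; KernelTid, StoreTid, and TidOfNewThread are hypothetical helper names.

```cpp
#include <cassert>
#include <pthread.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

// Kernel thread id of the calling thread (Linux only).
pid_t KernelTid() {
  return static_cast<pid_t>(syscall(SYS_gettid));
}

// Thread entry point: stores the new thread's own kernel tid into *arg.
void* StoreTid(void* arg) {
  *static_cast<pid_t*>(arg) = KernelTid();
  return nullptr;
}

// Starts a thread, lets it record its own kernel tid, and returns that tid.
// Returns -1 if the thread could not be created.
pid_t TidOfNewThread() {
  pid_t tid = -1;
  pthread_t thread;
  if (pthread_create(&thread, nullptr, StoreTid, &tid) != 0) return -1;
  pthread_join(thread, nullptr);
  return tid;
}
```

The value returned by TidOfNewThread differs from what KernelTid reports on the calling thread, which is the kind of identifier ART stores in tls32_.tid.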

jmethodID mid = WellKnownClasses::java_lang_Thread_run;
ScopedLocalRef<jobject> ref(soa.Env(), soa.AddLocalReference<jobject>(receiver));
InvokeVirtualOrInterfaceWithJValues(soa, ref.get(), mid, nullptr);

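Running the thread's task: the runtime obtains the jmethodID of run() on the Java Thread object. Because run() is a virtual method that subclasses may override, it is invoked through InvokeVirtualOrInterfaceWithJValues, which resolves the implementation for the actual receiver object and calls it.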

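Summary

Java thread creation flow

  • The Java layer initializes the object and sets the thread's basic information
  • nativeCreate is called to enter the native thread creation flow
    • Creating the art::Thread object
      • Call new Thread(is_daemon) to create the art::Thread object
      • Write the address of the art::Thread object back into the nativePeer field of the corresponding Java Thread object
    • Creating the system thread
      • Call pthread_create to create the real operating-system-level thread, with Thread::CreateCallback as the function it runs after creation
      • If pthread_create fails, release the resources and throw an OutOfMemoryError
    • Once the system thread is created, the calling thread's part is done and Thread::CreateCallback runs on the new thread
    • The thread task: Thread::CreateCallback
      • Thread::Init() initializes parts of the Thread object
      • Set the thread name
      • Set the thread priority
      • Look up run() on the Java Thread object and invoke it
  • Finally, the run method of the Java Thread object executes on the new thread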
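The native part of this flow can be condensed into a toy model. This is a simplified sketch, not the ART implementation: JavaPeer and NativeThread are hypothetical stand-ins for java.lang.Thread and art::Thread, and DestroyNativeThread is a cleanup helper added only for the sketch (in ART the exiting thread detaches and deletes itself).

```cpp
#include <cassert>
#include <cstdint>
#include <pthread.h>

// Stand-in for java.lang.Thread: only the fields this sketch needs.
struct JavaPeer {
  intptr_t native_peer = 0;  // mirrors Thread.nativePeer
  bool ran = false;          // set by the stand-in for run()
};

// Stand-in for art::Thread.
struct NativeThread {
  JavaPeer* peer;
};

// Counterpart of Thread::CreateCallback: runs on the new thread.
void* CreateCallback(void* arg) {
  NativeThread* self = static_cast<NativeThread*>(arg);
  self->peer->ran = true;  // stands in for invoking Thread.run()
  return nullptr;
}

// Counterpart of Thread::CreateNativeThread. On success the new pthread is
// stored in *out and true is returned; on failure the peer is rolled back.
bool CreateNativeThread(JavaPeer* peer, pthread_t* out) {
  NativeThread* child = new NativeThread{peer};
  // Publish the native object's address before the thread exists.
  peer->native_peer = reinterpret_cast<intptr_t>(child);
  if (pthread_create(out, nullptr, CreateCallback, child) == 0) {
    return true;
  }
  // Failure path: free the native object and clear the Java-side pointer.
  delete child;
  peer->native_peer = 0;
  return false;
}

// Sketch-only cleanup, to be called after the thread has been joined.
void DestroyNativeThread(JavaPeer* peer) {
  delete reinterpret_cast<NativeThread*>(peer->native_peer);
  peer->native_peer = 0;
}
```

After CreateNativeThread succeeds, native_peer is already non-zero even if the new thread has not run yet; joining the thread and calling DestroyNativeThread returns the peer to its initial state.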

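Implementing the Flipper plugin

Approach: monitor thread creation and display the collected information in Flipper, which makes it easier to analyze how threads are created and faster to locate problems.

The thread information includes the thread tid, the thread name, the stack trace at creation time, and so on. The rest of this section explains how to obtain each piece of data.

Monitoring thread creation

Overall approach: use a native hook to intercept calls to pthread_create and record the stack trace each time a thread is created.

The hook is installed with xhook, and the native implementation follows the KOOM library.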

xhook_register("libart.so", "pthread_create", (void *) HookThreadCreate, nullptr);

int ThreadHooker::HookThreadCreate(pthread_t *tidp, const pthread_attr_t *attr,
                                   void *(*start_rtn)(void *), void *arg) {
    auto time = Util::CurrentTimeNs();
    threadhook::Log::info(thread_tag, "HookThreadCreate");
    auto *hook_arg = new StartRtnArg(arg, Util::CurrentTimeNs(), start_rtn);
    auto *thread_create_arg = hook_arg->thread_create_arg;
    void *thread = threadhook::CallStack::GetCurrentThread();
    if (thread != nullptr) {
        // Capture the Java stack
        threadhook::CallStack::JavaStackTrace(thread,
                                              hook_arg->thread_create_arg->java_stack);
        // Capture the native stack
        threadhook::CallStack::NativeStackTrace(thread_create_arg->pc,
                                                    threadhook::Constant::kMaxCallStackDepth,
                                                    thread_create_arg->native_stack);
    }
 
    thread_create_arg->stack_time = Util::CurrentTimeNs() - time;
    return pthread_create(tidp, attr,
                          reinterpret_cast<void *(*)(void *)>(HookThreadStart),
                          reinterpret_cast<void *>(hook_arg));
}
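The hook above hands pthread_create a replacement start routine (HookThreadStart) and a wrapped argument. The listing does not show that routine, so the following is only a sketch of the general trampoline pattern, not the plugin's or KOOM's code; StartArg, StartTrampoline, CreateWithTrampoline, and Increment are hypothetical names.

```cpp
#include <cassert>
#include <pthread.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

// Hypothetical wrapper argument: remembers what the caller originally asked
// pthread_create to run, plus a slot for data recorded on the new thread.
struct StartArg {
  void* (*original_routine)(void*);
  void* original_arg;
  pid_t* tid_out;  // where the trampoline reports the new thread's kernel tid
};

// Runs first on the new thread, records its tid, then forwards to the
// routine the caller intended.
void* StartTrampoline(void* raw) {
  StartArg* start = static_cast<StartArg*>(raw);
  void* (*routine)(void*) = start->original_routine;
  void* arg = start->original_arg;
  *start->tid_out = static_cast<pid_t>(syscall(SYS_gettid));
  delete start;  // the wrapper is owned by the new thread
  return routine(arg);
}

// Same shape as pthread_create, plus an output slot for the tid.
int CreateWithTrampoline(pthread_t* thread_out, const pthread_attr_t* attr,
                         void* (*routine)(void*), void* arg, pid_t* tid_out) {
  StartArg* start = new StartArg{routine, arg, tid_out};
  int rc = pthread_create(thread_out, attr, StartTrampoline, start);
  if (rc != 0) delete start;  // the thread never started, so clean up here
  return rc;
}

// Example routine a caller might pass in: increments the int behind arg.
void* Increment(void* arg) {
  ++*static_cast<int*>(arg);
  return arg;
}
```

The caller's routine still sees its own argument and its return value still reaches pthread_join, so the interception stays invisible to the code that created the thread.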

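Getting the Java stack

Scenario: after intercepting a thread creation in the native layer, we want the Java stack at that moment.

The first thing that comes to mind is the Java API: Thread.currentThread().stackTrace returns the stack trace.

But we are in the native layer, and calling back into Java from there is somewhat cumbersome, so we take a different approach.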

art::Thread::DumpJavaStack

Following KOOM, we call art::Thread::DumpJavaStack directly from the native layer to obtain the Java stack.

Thread::DumpOrder Thread::DumpJavaStack(std::ostream& os,
                                        bool check_suspended,
                                        bool dump_locks) const {
  // Dumping the Java stack involves the verifier for locks. The verifier operates under the
  // assumption that there is no exception pending on entry. Thus, stash any pending exception.
  // Thread::Current() instead of this in case a thread is dumping the stack of another suspended
  // thread.
  ScopedExceptionStorage ses(Thread::Current());

  std::unique_ptr<Context> context(Context::Create());
  StackDumpVisitor dumper(os, const_cast<Thread*>(this), context.get(),
                          !tls32_.throwing_OutOfMemoryError, check_suspended, dump_locks);
  dumper.WalkStack();
  if (IsJitSensitiveThread()) {
    return DumpOrder::kMain;
  } else if (dumper.num_blocked > 0) {
    return DumpOrder::kBlocked;
  } else if (dumper.num_locked > 0) {
    return DumpOrder::kLocked;
  } else {
    return DumpOrder::kDefault;
  }
}

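dlopen and dlsym

To call the function, we first need a pointer to it.

  • dlopen takes the file name or path of a shared library and returns a handle to it.
  • dlsym takes a shared library handle and looks up the address of a symbol, such as a function, in it.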
void *handle = dlopen("libart.so", RTLD_LAZY | RTLD_LOCAL);
void *dump_java_stack_above_o = dlsym(handle,"_ZNK3art6Thread13DumpJavaStackERNSt3__113basic_ostreamIcNS1_11char_traitsIcEEEEbb");
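The same two calls can be tried against a symbol that is always public. This is a minimal sketch assuming a dynamically linked Linux or Android process; GetPidFn and LookupGetPid are hypothetical names, and getpid stands in for the ART symbol.

```cpp
#include <cassert>
#include <dlfcn.h>
#include <sys/types.h>
#include <unistd.h>

using GetPidFn = pid_t (*)();

// Looks up getpid by name in the running process and returns it as a
// function pointer, or nullptr if the handle or the symbol is unavailable.
GetPidFn LookupGetPid() {
  // A null filename yields a handle for the main program; symbol lookup on
  // that handle also searches the libraries loaded with it, such as libc.
  void* handle = dlopen(nullptr, RTLD_LAZY);
  if (handle == nullptr) return nullptr;
  void* symbol = dlsym(handle, "getpid");
  dlclose(handle);  // libc stays loaded, so the pointer remains valid
  return reinterpret_cast<GetPidFn>(symbol);
}
```

Calling the returned pointer gives the same result as calling getpid directly. The important habit is the null check: dlsym returns nullptr when the symbol cannot be found.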

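Problem: my test device happens to run Android 10, and there dump_java_stack_above_o is null. In other words, the art::Thread::DumpJavaStack symbol cannot be found in libart.so this way.

The system forbids access to non-public NDK libraries

Cause: starting with Android N (SDK >= 24), the system prevents apps from dynamically linking against non-public NDK libraries. Opening a private system library with dlopen, or loading a library that depends on one, therefore fails.

Private system libraries are the libraries located under /system/lib/ and /vendor/lib on an Android system that have no public API in the Android NDK.

Official documentation: link

Working around the restriction

Solution: xDL provides enhanced versions of dlopen() and dlsym() that can get around the Android 7.0+ linker namespace restriction and find the function symbol.

This gives us a function pointer to art::Thread::DumpJavaStack, which we can then call with the appropriate arguments to get the current thread's Java stack.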

void *handle = xdl_open("libart.so", RTLD_LAZY | RTLD_LOCAL);
void *dump_java_stack_above_o = xdl_dsym(handle,"_ZNK3art6Thread13DumpJavaStackERNSt3__113basic_ostreamIcNS1_11char_traitsIcEEEEbb",NULL);

// Capture the Java stack
void CallStack::JavaStackTrace(void *thread, std::ostream &os) {
		//...
		dump_java_stack_above_o(thread, os, true, false);  
}

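Getting the thread name

First, some background on how thread names work.

Native thread names

Setting the thread name

A pthread is renamed with pthread_setname_np. There is a catch: the name, including its terminating NUL, must fit in 16 bytes. A name of 16 or more characters is rejected with ERANGE and the thread name is left unchanged.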

// This value is not exported by kernel headers.
#define MAX_TASK_COMM_LEN 16

int pthread_setname_np(pthread_t t, const char* thread_name) {
  ErrnoRestorer errno_restorer;

  // Compute the length of the name
  size_t thread_name_len = strlen(thread_name);
  if (thread_name_len >= MAX_TASK_COMM_LEN) return ERANGE;

  // Setting our own name is an easy special case.
  if (t == pthread_self()) {
    return prctl(PR_SET_NAME, thread_name) ? errno : 0;
  }

  // We have to set another thread's name.
  int fd = __open_task_comm_fd(t, O_WRONLY, "pthread_setname_np");
  if (fd == -1) return errno;

  ssize_t n = TEMP_FAILURE_RETRY(write(fd, thread_name, thread_name_len));
  close(fd);

  if (n == -1) return errno;
  if (n != static_cast<ssize_t>(thread_name_len)) return EIO;
  return 0;
}

Reading the thread name

As a result, reading the current thread's name as shown below usually yields an incomplete, truncated name.

char thread_name[16];
prctl(PR_GET_NAME, thread_name);
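The length limit is easy to observe directly. This is a small sketch assuming Linux with glibc, or bionic on a recent API level where pthread_getname_np is available; RenameCurrentThread and CurrentThreadName are hypothetical helpers.

```cpp
#include <cassert>
#include <cerrno>
#include <pthread.h>
#include <string>

// Tries to rename the calling thread and returns the pthreads error code.
int RenameCurrentThread(const std::string& name) {
  return pthread_setname_np(pthread_self(), name.c_str());
}

// Reads the calling thread's name back from the kernel.
std::string CurrentThreadName() {
  char buffer[16] = {};  // 15 visible characters plus the terminating NUL
  if (pthread_getname_np(pthread_self(), buffer, sizeof(buffer)) != 0) {
    return "";
  }
  return buffer;
}
```

A 15-character name is accepted and read back unchanged, while a 16-character name makes RenameCurrentThread return ERANGE and leaves the previous name in place. Callers that want a long name to show up at all therefore have to shorten it before calling pthread_setname_np.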

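Java thread names

Setting the thread name

  • When creating a thread in Java, the name can be set explicitly. If no name is given, the system names the thread automatically in the form "Thread-XX".
  • Alternatively, the name can be set with java.lang.Thread#setName.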
public Thread(Runnable target, String name) {
		init(null, target, name, 0);
}
public Thread(ThreadGroup group, Runnable target) {
    init(group, target, "Thread-" + nextThreadNum(), 0);
}

public final synchronized void setName(String name) {
    this.name = name;
    setNativeName(name);
}
private native void setNativeName(String name);

Reading the thread name

  • java.lang.Thread#getName returns the complete name string. There is no length limit, so the name is never truncated.
public final String getName() {
	return name;
}

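Getting the real tid of a Java thread

From the source analysis above:

  1. java.lang.Thread#tid is not the real thread tid. The real tid is stored in the tls32_ struct, and tls32_ sits at the very beginning of the native Thread object.
  2. The nativePeer pointer held by the Java object leads to the underlying Thread object.
  3. By offsetting that pointer by different amounts, the value of each field can be read.
  4. With the tls32_ layout shown earlier, tid is found 3 int-sized slots (12 bytes) past the address in nativePeer, because the three fields before it (state_and_flags, suspend_count, thin_lock_thread_id) are all 32-bit values. The corresponding code is:
int *pInt = reinterpret_cast<int *>(native_peer);
// Advance by 3 ints to reach the native tid.
// The offset may differ between Android versions.
pInt = pInt + 3;
int native_tid = *pInt;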
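The offset arithmetic can be checked against a mock object without touching ART. In this sketch FakeTls32 is a hypothetical mirror of the first four tls32_ fields in the order listed earlier, and ReadTidAtOffset is an illustrative helper; real layouts vary between Android versions.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical mirror of the first four 32-bit fields of tls32_.
struct FakeTls32 {
  uint32_t state_and_flags;
  int32_t suspend_count;
  uint32_t thin_lock_thread_id;
  uint32_t tid;
};

// Reads the tid by skipping three 32-bit slots from the start of the object.
// memcpy is used so the read is well-defined regardless of the pointee type.
uint32_t ReadTidAtOffset(intptr_t native_peer) {
  const char* base = reinterpret_cast<const char*>(native_peer);
  uint32_t tid = 0;
  std::memcpy(&tid, base + 3 * sizeof(uint32_t), sizeof(tid));
  return tid;
}
```

Passing the address of a FakeTls32 whose tid is 4321 returns 4321, and offsetof(FakeTls32, tid) is 12, matching the "3 ints" rule used above.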

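Settling on an approach

As noted above, the name read from the native pthread is incomplete because it gets truncated, so we use the Java Thread#name instead.

Scenario: after intercepting a thread creation in the native layer we already have the native tid, and we want the name of the corresponding Java thread.

Approach:

  1. Get the list of threads from Thread.getAllStackTraces().keys, find the Thread whose native tid matches, and call Thread#getName to get its name.
  2. As a fallback, if step 1 yields no name, use the name obtained in the native layer through PR_GET_NAME.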
var threadName = ""
val thread = Thread.getAllStackTraces().keys.firstOrNull {
    tid == ThreadCreateMonitor.getNativeTid(it)
}
if (thread != null) {
   threadName = thread.name
} else {
   threadName = nativeName
}

Using the plugin

Plugin repository: GitHub link

Add the dependencies

pluginManagement {
    repositories {
        gradlePluginPortal()
        google()
        mavenCentral()
        maven { url 'https://jitpack.io' }
    }
}
dependencyResolutionManagement {
    repositoriesMode.set(RepositoriesMode.FAIL_ON_PROJECT_REPOS)
    repositories {
        google()
        mavenCentral()
        maven { url 'https://jitpack.io' }
    }
}
implementation "com.github.LXD312569496.thread-learning:thread_hook:0.0.3"
implementation "com.github.LXD312569496.thread-learning:thread_hook_flipper:0.0.3"

Initialize it in the Application class: call ThreadCreateMonitor.start() to begin monitoring thread creation.

class MyApplication: Application() {

    override fun attachBaseContext(base: Context?) {
        ThreadCreateMonitor.start()
        super.attachBaseContext(base)
    }
}

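References

Analysis of the Android VM thread startup process and how to get the real thread ID of a Java thread - Juejin

Android performance optimization: deadlock monitoring and the details behind it - Juejin

GitHub - KwaiAppTeam/KOOM: KOOM is an OOM killer on mobile platform by Kwai.

A detailed look at Android's ban on private system libraries in native code - Alibaba Cloud Developer Community

Process names and thread names in Android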