How ANR Logs Are Generated: A Walk Through the Source Code (Part 5)

The previous part ended in debuggerd_trigger_dump() in system/core/debuggerd/client/debuggerd_client.cpp, at the point where the client receives the response returned by the socket server.

5.1

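     Continuing from the code in 3.4: once the response has been received, the client checks whether its status is kRegistered and then calls send_signal() to send SIGQUIT. This is what starts the dump of the log for the pid in which the ANR occurred.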

Next, see how the SIGQUIT signal is handled in [art](http://axr.htc.com/source/xref/Q_Mainline/art/)/[runtime](http://axr.htc.com/source/xref/Q_Mainline/art/runtime/)/[runtime.cc](http://axr.htc.com/source/xref/Q_Mainline/art/runtime/runtime.cc).
bool debuggerd_trigger_dump(pid_t tid, DebuggerdDumpType dump_type, unsigned int timeout_ms,
                            unique_fd output_fd) {
  // Check to make sure we've successfully registered.
  InterceptResponse response;
  // Receive the response from 5.4 and check that its status is kRegistered
  rc = TEMP_FAILURE_RETRY(recv(set_timeout(sockfd.get()), &response, sizeof(response), MSG_TRUNC));
  if (response.status != InterceptStatus::kRegistered) {
    LOG(ERROR) << "libdebuggerd_client: unexpected registration response: "
               << static_cast<int>(response.status);
    return false;
  }
  // Send SIGQUIT; the signal is handled on the ART side, starting from /Q_Mainline/art/runtime/runtime.cc
  if (!send_signal(tid, dump_type)) {
    return false;
  }                                     
}
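
To make the send_signal() step concrete at the POSIX level, here is a minimal sketch. It is not the debuggerd code: `QueueQuitToSelf` is a hypothetical helper, and the sketch assumes a single-threaded Linux process that targets itself. SIGQUIT is blocked first so that its default action (terminate and dump core) does not run; the signal is then queued with `sigqueue()`, observed as pending, and consumed with `sigwait()`.

```cpp
#include <csignal>
#include <unistd.h>

// Hypothetical helper: queue SIGQUIT to the current process and report
// whether it arrived. Assumes a single-threaded process.
bool QueueQuitToSelf() {
  sigset_t set;
  sigemptyset(&set);
  sigaddset(&set, SIGQUIT);
  // Block SIGQUIT so the default action (terminate, core dump) never fires.
  if (sigprocmask(SIG_BLOCK, &set, nullptr) != 0) return false;

  // Sender side: deliver SIGQUIT to a pid (here: ourselves).
  union sigval value;
  value.sival_int = 0;
  if (sigqueue(getpid(), SIGQUIT, value) != 0) return false;

  // Receiver side: a blocked signal stays pending until someone waits for it.
  sigset_t pending;
  if (sigpending(&pending) != 0) return false;
  if (sigismember(&pending, SIGQUIT) != 1) return false;

  // Consume the pending signal so nothing is left queued afterwards.
  int received = 0;
  return sigwait(&set, &received) == 0 && received == SIGQUIT;
}
```

The point of the sketch is the division of labour: the sender only delivers the signal, and what happens next depends entirely on how the target process has arranged to receive it.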

5.2  art/runtime/runtime.cc

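     SIGQUIT is set up in the initialization method Init(): Init() calls BlockSignals(), which adds SIGQUIT to a signal set via signals.Add(SIGQUIT) and then blocks that set, so the signal can later be picked up by a dedicated thread.

     The comment in Init() also describes the intended flow: Runtime::Init() calls LoadNativeBridge(), and Runtime::Start() runs afterwards. So the next step to look at is Runtime::Start().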
bool Runtime::Init(RuntimeArgumentMap&& runtime_options_in) {
    // BlockSignals() adds SIGQUIT to the set of blocked signals
    BlockSignals();

  // Look for a native bridge.
  //
  // The intended flow here is, in the case of a running system:
  //
  // Runtime::Init() (zygote):
  //   LoadNativeBridge -> dlopen from cmd line parameter.
  //  |
  //  V
  // Runtime::Start() (zygote):
  //   No-op wrt native bridge.
  //  |
  //  | start app
  //  V
  // DidForkFromZygote(action)
  //   action = kUnload -> dlclose native bridge.
  //   action = kInitialize -> initialize library
  //
  //
  // The intended flow here is, in the case of a simple dalvikvm call:
  //
  // Runtime::Init():
  //   LoadNativeBridge -> dlopen from cmd line parameter.
  //  |
  //  V
  // Runtime::Start():
  //   DidForkFromZygote(kInitialize) -> try to initialize any native bridge given.
  //   No-op wrt native bridge.
  {
    // Per the comment above, Runtime::Start() runs after Init() has called LoadNativeBridge()
    std::string native_bridge_file_name = runtime_options.ReleaseOrDefault(Opt::NativeBridge);
    is_native_bridge_loaded_ = LoadNativeBridge(native_bridge_file_name);
  }    
}

void Runtime::BlockSignals() {
  SignalSet signals;
  signals.Add(SIGPIPE);
  // SIGQUIT is used to dump the runtime's state (including stack traces).
  signals.Add(SIGQUIT);
  // SIGUSR1 is used to initiate a GC.
  signals.Add(SIGUSR1);
  signals.Block();
}
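
As a rough illustration of what BlockSignals() achieves, the sketch below blocks the same three signals with plain POSIX calls and then queries the mask. It is a sketch under assumptions: `sigprocmask()` stands in for ART's SignalSet wrapper, and `BlockDumpSignals` and `IsBlocked` are hypothetical names.

```cpp
#include <csignal>

// Block the same three signals that Runtime::BlockSignals() adds to its set.
// Plain sigprocmask() stands in for ART's SignalSet wrapper here.
bool BlockDumpSignals() {
  sigset_t signals;
  sigemptyset(&signals);
  sigaddset(&signals, SIGPIPE);
  sigaddset(&signals, SIGQUIT);
  sigaddset(&signals, SIGUSR1);
  return sigprocmask(SIG_BLOCK, &signals, nullptr) == 0;
}

// Query the current mask (set == nullptr means "do not change anything")
// and report whether the given signal is blocked.
bool IsBlocked(int signal_number) {
  sigset_t current;
  sigemptyset(&current);
  if (sigprocmask(SIG_BLOCK, nullptr, &current) != 0) return false;
  return sigismember(&current, signal_number) == 1;
}
```

Threads created after the mask is installed inherit it, which is why blocking early lets one dedicated thread receive SIGQUIT synchronously instead of having it interrupt an arbitrary thread.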

5.3

     Runtime::Start() calls InitNonZygoteOrPostFork().

bool Runtime::Start() {
  if (!is_zygote_) {
    InitNonZygoteOrPostFork(self->GetJniEnv(), /* is_system_server= */ false, action, GetInstructionSetString(kRuntimeISA));
  }
  // (rest of Runtime::Start() omitted)
  return true;
}

5.4

     Runtime::InitNonZygoteOrPostFork() calls StartSignalCatcher().

void Runtime::InitNonZygoteOrPostFork(JNIEnv* env, bool is_system_server, NativeBridgeAction action, const char* isa, bool profile_system_server) {
    StartSignalCatcher();
}

5.5

     Runtime::StartSignalCatcher() creates a new SignalCatcher object. The SignalCatcher constructor lives in art/runtime/signal_catcher.cc.

void Runtime::StartSignalCatcher() {
  if (!is_zygote_) {
    signal_catcher_ = new SignalCatcher();
  }
}

5.6  art/runtime/signal_catcher.cc

     SignalCatcher::SignalCatcher() is the constructor that builds the SignalCatcher object. As its comment says, it creates a raw pthread whose start routine is Run().

SignalCatcher::SignalCatcher(): lock_("SignalCatcher lock"),cond_("SignalCatcher::cond_", lock_),thread_(nullptr) {
  SetHaltFlag(false);
  // Create a raw pthread; its start routine will attach to the runtime.
  CHECK_PTHREAD_CALL(pthread_create, (&pthread_, nullptr, &Run, this), "signal catcher thread");
}

5.7

     SignalCatcher::Run() loops forever: it waits for a signal, switches on the signal number, and for SIGQUIT calls HandleSigQuit().

void* SignalCatcher::Run(void* arg) {
  while (true) {
    int signal_number = signal_catcher->WaitForSignal(self, signals);
    if (signal_catcher->ShouldHalt()) {
      runtime->DetachCurrentThread();
      return nullptr;
    }
    switch (signal_number) {
      case SIGQUIT:
        signal_catcher->HandleSigQuit();
        break;
    }
  }
}
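
The wait-then-dispatch pattern in Run() can be tried out with plain POSIX calls. The following is a simplified sketch, not ART's code: it assumes a single-threaded Linux process, runs the "catcher" on the calling thread instead of a separate pthread, and uses hypothetical names (`CatcherStats`, `WaitAndDispatch`, `RunCatcherDemo`). Counters stand in for HandleSigQuit() and for the SIGUSR1 (GC) branch.

```cpp
#include <csignal>
#include <unistd.h>

struct CatcherStats {
  int quit_dumps = 0;   // how many times the SIGQUIT branch ran
  int gc_requests = 0;  // how many times the SIGUSR1 branch ran
};

// One iteration of the loop: wait for a blocked signal, then dispatch on it.
// Returns the signal number, or -1 if sigwait() failed.
int WaitAndDispatch(const sigset_t& signals, CatcherStats& stats) {
  int signal_number = 0;
  if (sigwait(&signals, &signal_number) != 0) return -1;
  switch (signal_number) {
    case SIGQUIT:
      ++stats.quit_dumps;  // stands in for HandleSigQuit()
      break;
    case SIGUSR1:
      ++stats.gc_requests;  // stands in for the GC request branch
      break;
    default:
      break;
  }
  return signal_number;
}

// Demo driver: block both signals, send each one to ourselves, and let the
// dispatcher consume it. Assumes no other thread has these signals unblocked.
CatcherStats RunCatcherDemo() {
  sigset_t signals;
  sigemptyset(&signals);
  sigaddset(&signals, SIGQUIT);
  sigaddset(&signals, SIGUSR1);
  CatcherStats stats;
  if (sigprocmask(SIG_BLOCK, &signals, nullptr) != 0) return stats;
  if (kill(getpid(), SIGQUIT) == 0) WaitAndDispatch(signals, stats);
  if (kill(getpid(), SIGUSR1) == 0) WaitAndDispatch(signals, stats);
  return stats;
}
```

Because the signals are blocked, they are never delivered asynchronously; `sigwait()` is the only place they are consumed, which is what allows the handler to do heavyweight work such as building a full dump.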

5.8

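     SignalCatcher::HandleSigQuit() creates an std::ostringstream named os and writes everything that needs to be logged into it:

     1. It calls DumpForSigQuit() to collect the different kinds of log.
     2. It calls Output(os.str()) to write the collected log out to a file.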
void SignalCatcher::HandleSigQuit() {
  Runtime* runtime = Runtime::Current();
  std::ostringstream os;
  os << "\n"
      << "----- pid " << getpid() << " at " << GetIsoDate() << " -----\n";
  DumpCmdLine(os);
  // Note: The strings "Build fingerprint:" and "ABI:" are chosen to match the format used by
  // debuggerd. This allows, for example, the stack tool to work.
  std::string fingerprint = runtime->GetFingerprint();
  os << "Build fingerprint: '" << (fingerprint.empty() ? "unknown" : fingerprint) << "'\n";
  os << "ABI: '" << GetInstructionSetString(runtime->GetInstructionSet()) << "'\n";
  os << "Build type: " << (kIsDebugBuild ? "debug" : "optimized") << "\n";

  runtime->DumpForSigQuit(os);

  os << "----- end " << getpid() << " -----\n";
  Output(os.str());
}
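
The framing that HandleSigQuit() puts around the dump is easy to reproduce in isolation. The sketch below mirrors the header and footer lines from the excerpt above; `DumpInfo` and `BuildDumpText` are hypothetical names, and the values are passed in rather than read from the runtime.

```cpp
#include <sstream>
#include <string>

// Hypothetical stand-in for the values HandleSigQuit() reads from the runtime.
struct DumpInfo {
  int pid;
  std::string iso_date;
  std::string fingerprint;  // may be empty
  std::string abi;
};

// Wrap an already collected body in the "----- pid ... -----" framing.
std::string BuildDumpText(const DumpInfo& info, const std::string& body) {
  std::ostringstream os;
  os << "\n"
     << "----- pid " << info.pid << " at " << info.iso_date << " -----\n";
  os << "Build fingerprint: '"
     << (info.fingerprint.empty() ? "unknown" : info.fingerprint) << "'\n";
  os << "ABI: '" << info.abi << "'\n";
  os << body;
  os << "----- end " << info.pid << " -----\n";
  return os.str();
}
```

Collecting everything into one string first means the later write step deals with a single buffer and a single length, which matches the `s.data()` / `s.size()` pair passed on by Output().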

5.9

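     Runtime::DumpForSigQuit() delegates to the DumpForSigQuit() method of each runtime component (class linker, intern table, JavaVM, heap, and so on); each component appends its own part of the log to the stream.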

void Runtime::DumpForSigQuit(std::ostream& os) {
  GetClassLinker()->DumpForSigQuit(os);
  GetInternTable()->DumpForSigQuit(os);
  // Collect the JavaVM log
  GetJavaVM()->DumpForSigQuit(os);
  // Collect memory usage log, for example "Total number of allocations"
  GetHeap()->DumpForSigQuit(os);
  oat_file_manager_->DumpForSigQuit(os);
  if (GetJit() != nullptr) {
    GetJit()->DumpForSigQuit(os);
  } else {
    os << "Running non JIT\n";
  }
  DumpDeoptimizations(os);
  TrackedAllocators::Dump(os);
  os << "\n";

  thread_list_->DumpForSigQuit(os);
  BaseMutex::DumpAll(os);
}
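
The delegation pattern in DumpForSigQuit() can be sketched with a few toy components. Everything here is hypothetical (`FakeClassLinker`, `FakeHeap`, `FakeThreadList`, `FakeRuntime`, and the bracketed placeholder output); the only thing taken from the excerpt is the shape: unrelated classes that each expose a DumpForSigQuit(std::ostream&) and are called in a fixed order.

```cpp
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical, heavily simplified components. They share a method name,
// not a base class, and each appends its own section to the stream.
struct FakeClassLinker {
  void DumpForSigQuit(std::ostream& os) const { os << "[class linker]\n"; }
};
struct FakeHeap {
  void DumpForSigQuit(std::ostream& os) const { os << "[heap]\n"; }
};
struct FakeThreadList {
  void DumpForSigQuit(std::ostream& os) const { os << "[thread list]\n"; }
};

struct FakeRuntime {
  FakeClassLinker class_linker;
  FakeHeap heap;
  FakeThreadList thread_list;

  // Call each component in a fixed order; the order defines the log layout.
  void DumpForSigQuit(std::ostream& os) const {
    class_linker.DumpForSigQuit(os);
    heap.DumpForSigQuit(os);
    os << "\n";
    thread_list.DumpForSigQuit(os);
  }
};

// Convenience wrapper that collects the whole dump into a string.
std::string CollectDump(const FakeRuntime& runtime) {
  std::ostringstream os;
  runtime.DumpForSigQuit(os);
  return os.str();
}
```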

5.10

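     SignalCatcher::Output() receives the string built from the os stream in 5.9 and calls PaletteWriteCrashThreadStacks() to write the log out. As the log messages in the code show, the destination is tombstoned.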

void SignalCatcher::Output(const std::string& s) {
  ScopedThreadStateChange tsc(Thread::Current(), kWaitingForSignalCatcherOutput);
  PaletteStatus status = PaletteWriteCrashThreadStacks(s.data(), s.size());
  if (status == PaletteStatus::kOkay) {
    LOG(INFO) << "Wrote stack traces to tombstoned";
  } else {
    CHECK(status == PaletteStatus::kFailedCheckLog);
    LOG(ERROR) << "Failed to write stack traces to tombstoned";
  }
}

5.11

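     PaletteWriteCrashThreadStacks() creates two unique_fd objects. A unique_fd can be thought of as a smart file descriptor: like a smart pointer, it releases what it owns automatically. The function then calls tombstoned_connect() to connect to a socket server and exchange data with it, which yields an output_fd, and finally calls WriteFully() to write the log from the os stream into output_fd. tombstoned_connect() is the important part, so it is analyzed first.

     Arguments: stacks = s.data(), stacks_len = s.size(), where s is the os.str() passed to Output().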
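
For readers unfamiliar with the "smart file descriptor" idea, here is a minimal sketch of the pattern. It is not android::base::unique_fd; `ScopedFd` and `IsOpen` are hypothetical names, and only the core behaviour is shown: single ownership, move-only transfer, and close on destruction.

```cpp
#include <fcntl.h>
#include <unistd.h>
#include <utility>

// Minimal owning file descriptor: closes the fd when it goes out of scope.
class ScopedFd {
 public:
  ScopedFd() = default;
  explicit ScopedFd(int fd) : fd_(fd) {}
  ScopedFd(const ScopedFd&) = delete;             // no copies: one owner only
  ScopedFd& operator=(const ScopedFd&) = delete;
  ScopedFd(ScopedFd&& other) noexcept : fd_(other.release()) {}
  ScopedFd& operator=(ScopedFd&& other) noexcept {
    if (this != &other) reset(other.release());
    return *this;
  }
  ~ScopedFd() { reset(); }

  int get() const { return fd_; }

  // Give up ownership without closing.
  int release() {
    int fd = fd_;
    fd_ = -1;
    return fd;
  }

  // Close the current fd (if any) and take ownership of a new one.
  void reset(int fd = -1) {
    if (fd_ != -1) close(fd_);
    fd_ = fd;
  }

 private:
  int fd_ = -1;
};

// Returns true if the descriptor number still refers to an open file.
bool IsOpen(int fd) { return fcntl(fd, F_GETFD) != -1; }
```

Moving a `ScopedFd` mirrors the `*output_fd = std::move(tmp_output_fd)` style of hand-off: ownership changes, but the descriptor is closed exactly once, by whichever object holds it last.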

enum PaletteStatus PaletteWriteCrashThreadStacks(/*in*/const char* stacks, size_t stacks_len) {
  android::base::unique_fd tombstone_fd;
  android::base::unique_fd output_fd;
  // On success, output_fd holds the pipe write end (pipe_write.get()) handed back by the socket server
  if (!tombstoned_connect(getpid(), &tombstone_fd, &output_fd, kDebuggerdJavaBacktrace)) {
    // Failure here could be due to file descriptor resource exhaustion
    // so write the stack trace message to the log in case it helps
    // debug that.
    LOG(INFO) << std::string_view(stacks, stacks_len);
    // tombstoned_connect() logs failure reason.
    return PaletteStatus::kFailedCheckLog;
  }

  PaletteStatus status = PaletteStatus::kOkay;
  // Write stacks into output_fd (= pipe_write.get()); debuggerd_trigger_dump() in /Q_Mainline/system/core/debuggerd/client/debuggerd_client.cpp then reads it back through pipe_read.get()
  if (!android::base::WriteFully(output_fd, stacks, stacks_len)) {
    PLOG(ERROR) << "Failed to write tombstoned output";
    TEMP_FAILURE_RETRY(ftruncate(output_fd, 0));
    status = PaletteStatus::kFailedCheckLog;
  }
  // (remaining steps omitted)
  return status;
}
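
The "write everything into the pipe, read it back on the other side" step can be sketched without any Android dependencies. This is an illustration of the idea behind a WriteFully-style helper, not the android::base implementation; `WriteAll` and `RoundTripThroughPipe` are hypothetical names, and the demo assumes the text is small enough to fit in the pipe buffer so that writing before reading cannot block.

```cpp
#include <cerrno>
#include <cstddef>
#include <string>
#include <unistd.h>

// Write the whole buffer, retrying on EINTR and continuing after short writes.
bool WriteAll(int fd, const char* data, size_t len) {
  while (len > 0) {
    ssize_t n = write(fd, data, len);
    if (n == -1) {
      if (errno == EINTR) continue;  // interrupted: try again
      return false;                  // real error
    }
    data += n;
    len -= static_cast<size_t>(n);
  }
  return true;
}

// Demo: push a small string through a pipe and read it back.
std::string RoundTripThroughPipe(const std::string& text) {
  int fds[2];
  if (pipe(fds) != 0) return "";
  bool ok = WriteAll(fds[1], text.data(), text.size());
  close(fds[1]);  // closing the write end gives the reader an EOF

  std::string out;
  char buf[256];
  ssize_t n = 0;
  while (ok && (n = read(fds[0], buf, sizeof(buf))) > 0) {
    out.append(buf, static_cast<size_t>(n));
  }
  close(fds[0]);
  return out;
}
```

In the flow described here the two pipe ends live in different processes: the dumped process writes into the write end, and the debuggerd client reads from the read end.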

5.12

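     tombstoned_connect() is implemented in system/core/debuggerd/tombstoned/tombstoned_client.cpp.

This function:
     1. Calls socket_local_client() to connect to the socket server whose name is given by the constant kTombstonedJavaTraceSocketName. This is the server that was initialized in main() in 4.1.
     2. Calls write() to send a request packet to the socket server, which receives it.
     3. Receives from the socket server (9.4) a file descriptor, output_fd = pipe_write.get(), which the WriteFully() call in PaletteWriteCrashThreadStacks() (5.11 above) then writes into.
     The next step is to look at what the socket server does.
     Arguments: pid = getpid(); tombstoned_socket = &tombstone_fd and output_fd = &output_fd, both still empty at this point; dump_type = kDebuggerdJavaBacktrace.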
bool tombstoned_connect(pid_t pid, unique_fd* tombstoned_socket, unique_fd* output_fd,
                        DebuggerdDumpType dump_type) {
  unique_fd sockfd(
      socket_local_client((dump_type != kDebuggerdJavaBacktrace ? kTombstonedCrashSocketName
                                                                : kTombstonedJavaTraceSocketName),
                          ANDROID_SOCKET_NAMESPACE_RESERVED, SOCK_SEQPACKET));

  TombstonedCrashPacket packet = {};
  packet.packet_type = CrashPacketType::kDumpRequest;
  packet.packet.dump_request.pid = pid;
  packet.packet.dump_request.dump_type = dump_type;
  if (TEMP_FAILURE_RETRY(write(sockfd, &packet, sizeof(packet))) != sizeof(packet)) {
    async_safe_format_log(ANDROID_LOG_ERROR, "libc", "failed to write DumpRequest packet: %s",
                          strerror(errno));
    return false;
  }

  unique_fd tmp_output_fd;
  // As shown in 7.4, the server sends these with SendFileDescriptors():
  // packet = response = {.packet_type = CrashPacketType::kPerformDump}
  // tmp_output_fd = output_fd = pipe_write.get()
  ssize_t rc = ReceiveFileDescriptors(sockfd, &packet, sizeof(packet), &tmp_output_fd);

  *tombstoned_socket = std::move(sockfd);
  // *output_fd now owns tmp_output_fd, i.e. the pipe write end (pipe_write.get())
  *output_fd = std::move(tmp_output_fd);
  return true;
}
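
Receiving a file descriptor over a socket is the least obvious part of this step, so here is a self-contained sketch of the underlying mechanism: SCM_RIGHTS ancillary data on a Unix domain socket. It is an illustration, not the tombstoned code. `SendFd`, `ReceiveFd`, and `PassPipeOverSocket` are hypothetical names, both ends run in one process over a `socketpair()`, and Linux is assumed (AF_UNIX with SOCK_SEQPACKET).

```cpp
#include <cstring>
#include <string>
#include <sys/socket.h>
#include <sys/uio.h>
#include <unistd.h>

// Send one payload byte plus one file descriptor as ancillary data.
bool SendFd(int sock, int fd_to_send) {
  char payload = 'x';
  iovec iov{&payload, 1};
  alignas(cmsghdr) char control[CMSG_SPACE(sizeof(int))] = {};
  msghdr msg{};
  msg.msg_iov = &iov;
  msg.msg_iovlen = 1;
  msg.msg_control = control;
  msg.msg_controllen = sizeof(control);
  cmsghdr* cmsg = CMSG_FIRSTHDR(&msg);
  cmsg->cmsg_level = SOL_SOCKET;
  cmsg->cmsg_type = SCM_RIGHTS;
  cmsg->cmsg_len = CMSG_LEN(sizeof(int));
  std::memcpy(CMSG_DATA(cmsg), &fd_to_send, sizeof(int));
  return sendmsg(sock, &msg, 0) == 1;
}

// Receive the payload and return the descriptor that came with it, or -1.
int ReceiveFd(int sock) {
  char payload = 0;
  iovec iov{&payload, 1};
  alignas(cmsghdr) char control[CMSG_SPACE(sizeof(int))] = {};
  msghdr msg{};
  msg.msg_iov = &iov;
  msg.msg_iovlen = 1;
  msg.msg_control = control;
  msg.msg_controllen = sizeof(control);
  if (recvmsg(sock, &msg, 0) != 1) return -1;
  cmsghdr* cmsg = CMSG_FIRSTHDR(&msg);
  if (cmsg == nullptr || cmsg->cmsg_level != SOL_SOCKET ||
      cmsg->cmsg_type != SCM_RIGHTS) {
    return -1;
  }
  int fd = -1;
  std::memcpy(&fd, CMSG_DATA(cmsg), sizeof(int));
  return fd;
}

// Demo: the "server" end sends a pipe's write end; the "client" end writes
// text through the received descriptor; the text is read from the pipe.
std::string PassPipeOverSocket(const std::string& text) {
  int socks[2];
  int pipe_fds[2];
  if (socketpair(AF_UNIX, SOCK_SEQPACKET, 0, socks) != 0) return "";
  if (pipe(pipe_fds) != 0) {
    close(socks[0]);
    close(socks[1]);
    return "";
  }
  std::string out;
  bool sent = SendFd(socks[0], pipe_fds[1]);
  close(pipe_fds[1]);  // the sender's own copy is no longer needed
  int received = sent ? ReceiveFd(socks[1]) : -1;
  if (received != -1) {
    ssize_t written = write(received, text.data(), text.size());
    close(received);  // last write end closed: the reader will see EOF
    char buf[256];
    ssize_t n = 0;
    while (written >= 0 && (n = read(pipe_fds[0], buf, sizeof(buf))) > 0) {
      out.append(buf, static_cast<size_t>(n));
    }
  }
  close(pipe_fds[0]);
  close(socks[0]);
  close(socks[1]);
  return out;
}
```

The received descriptor is a new number in the receiving process that refers to the same open pipe, which is why data written by the dumped process ends up readable by whoever holds the read end.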