Back to Basics: A Study of OkHttp Timeout Detection


Preface

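While reading the OkHttp source, I found an AsyncTimeout object in the Transmitter class. After going through the code, I learned that this class is used for timeout detection. This article summarizes my study of the AsyncTimeout mechanism.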

This article is based on OkHttp 3.14.9.

GitHub: github.com/square/okht…

Gradle dependency: implementation group: 'com.squareup.okhttp3', name: 'okhttp', version: '3.14.9'

AsyncTimeout

The AsyncTimeout class lives in the Okio library and extends Timeout. Its class-level Javadoc reads:

/**
 * This timeout uses a background thread to take action exactly when the timeout occurs. Use this to
 * implement timeouts where they aren't supported natively, such as to sockets that are blocked on
 * writing.
 *
 * <p>Subclasses should override {@link #timedOut} to take action when a timeout occurs. This method
 * will be invoked by the shared watchdog thread so it should not do any long-running operations.
 * Otherwise we risk starving other timeouts from being triggered.
 *
 * <p>Use {@link #sink} and {@link #source} to apply this timeout to a stream. The returned value
 * will apply the timeout to each operation on the wrapped stream.
 *
 * <p>Callers should call {@link #enter} before doing work that is subject to timeouts, and {@link
 * #exit} afterwards. The return value of {@link #exit} indicates whether a timeout was triggered.
 * Note that the call to {@link #timedOut} is asynchronous, and may be called after {@link #exit}.
 */
public class AsyncTimeout extends Timeout {

This comment tells us several useful things:

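  • It is a tool that detects timeouts on one shared background thread, aimed mainly at classes that have no native timeout support.
  • It provides a timedOut() method, which serves as the callback when a timeout is detected.
  • Its sink() and source() methods apply timeout detection to stream reads and writes, which maps onto the stream I/O of a network request, as covered later.
  • enter() and exit() mark the start and end of timing. In other words, the clock starts when enter() is called.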

Timeout

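So far we have only talked about detecting timeouts, but where does the timeout value itself come from? Timeout contains the following definitions: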

  /**
   * True if {@code deadlineNanoTime} is defined. There is no equivalent to null
   * or 0 for {@link System#nanoTime}.
   */
  private boolean hasDeadline;
  private long deadlineNanoTime;
  private long timeoutNanos;

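Timeout defines deadlineNanoTime, the deadline instant, and timeoutNanos, the timeout duration. Its subclass AsyncTimeout uses timeoutNanos (together with the deadline, when one is set) to compute when a timeout fires.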

AsyncTimeout field definitions

Next, let's look at some of the fields defined in AsyncTimeout:

  private static final int TIMEOUT_WRITE_SIZE = 64 * 1024;

  private static final long IDLE_TIMEOUT_MILLIS = TimeUnit.SECONDS.toMillis(60);
  private static final long IDLE_TIMEOUT_NANOS = TimeUnit.MILLISECONDS.toNanos(IDLE_TIMEOUT_MILLIS);

  static @Nullable AsyncTimeout head;

  private boolean inQueue;

  private @Nullable AsyncTimeout next;

  private long timeoutAt;
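  • timeoutAt: the instant at which the timeout fires, computed as the time when timing starts plus the timeoutNanos described above.
  • head and next: as the AsyncTimeout Javadoc says, timeouts are detected on one shared background thread. head and next form a linked list that chains the AsyncTimeout objects into a queue, which the detection thread walks. This is covered later.
  • inQueue: set to true once the AsyncTimeout object has been added to the list.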
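The timeoutAt calculation described above can be sketched with plain JDK code. This is a simplified model rather than Okio's class: TimeoutAtSketch and its method signatures are names made up for illustration, and the deadline branch is included on the assumption that a deadline set on Timeout takes part in the calculation as well.

```java
/** Simplified, stdlib-only model of how a timeout instant can be derived. Not Okio's actual class. */
final class TimeoutAtSketch {
  private TimeoutAtSketch() {}

  /**
   * Returns the absolute nanoTime value at which the timeout should fire.
   *
   * @param now value of System.nanoTime() when timing starts
   * @param timeoutNanos relative timeout, where 0 means "no timeout"
   * @param hasDeadline whether an absolute deadline is set
   * @param deadlineNanoTime the absolute deadline, ignored unless hasDeadline is true
   */
  static long timeoutAt(long now, long timeoutNanos, boolean hasDeadline, long deadlineNanoTime) {
    if (timeoutNanos != 0 && hasDeadline) {
      // Compare relative durations, not absolute instants, because nanoTime may wrap around.
      return now + Math.min(timeoutNanos, deadlineNanoTime - now);
    } else if (timeoutNanos != 0) {
      return now + timeoutNanos;
    } else if (hasDeadline) {
      return deadlineNanoTime;
    }
    throw new IllegalArgumentException("no timeout and no deadline");
  }

  /** Time left until timeoutAt. Zero or a negative value means it has already expired. */
  static long remainingNanos(long timeoutAt, long now) {
    return timeoutAt - now;
  }
}
```

For example, timeoutAt(100, 50, true, 120) picks the deadline because only 20ns remain until it, which is less than the 50ns timeout. Working with differences is what keeps the arithmetic valid even if nanoTime overflows.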

How AsyncTimeout is used in the network request flow

First, let's see where AsyncTimeout is actually used during a network request.

  • Transmitter has its own AsyncTimeout field. Its timeoutNanos is set in the Transmitter constructor to the callTimeout configured when the OkHttpClient was built. This timeout covers the total time of the whole call.

    private final AsyncTimeout timeout = new AsyncTimeout() {
        @Override protected void timedOut() {
          cancel();
        }
    };
    ...
    public Transmitter(OkHttpClient client, Call call) {
        this.client = client;
        this.connectionPool = Internal.instance.realConnectionPool(client.connectionPool());
        this.call = call;
        this.eventListener = client.eventListenerFactory().create(call);
        this.timeout.timeout(client.callTimeoutMillis(), MILLISECONDS);
    }
    

    callTimeout: the timeout for the entire call. It is usually left unset and defaults to 0, which means no timeout.

  • Establishing a connection goes through RealConnection.connectSocket(). Once the socket is connected, two Okio objects are created: a BufferedSource and a BufferedSink.

    // RealConnection.connectSocket()
    // RealConnection.java, line 275
    source = Okio.buffer(Okio.source(rawSocket));
    sink = Okio.buffer(Okio.sink(rawSocket));
    
    // Okio.java, line 221
    public static Source source(Socket socket) throws IOException {
        if (socket == null) throw new IllegalArgumentException("socket == null");
        if (socket.getInputStream() == null) throw new IOException("socket's input stream == null");
        AsyncTimeout timeout = timeout(socket);
        Source source = source(socket.getInputStream(), timeout);
        return timeout.source(source);
    }
    
    // Okio.java, line 115
    public static Sink sink(Socket socket) throws IOException {
        if (socket == null) throw new IllegalArgumentException("socket == null");
        if (socket.getOutputStream() == null) throw new IOException("socket's output stream == null");
        AsyncTimeout timeout = timeout(socket);
        Sink sink = sink(socket.getOutputStream(), timeout);
        return timeout.sink(sink);
    }
    
    // RealConnection.java, line 542
    ExchangeCodec newCodec(OkHttpClient client, Interceptor.Chain chain) throws SocketException {
        if (http2Connection != null) {
          return new Http2ExchangeCodec(client, this, chain, http2Connection);
        } else {
          socket.setSoTimeout(chain.readTimeoutMillis());
          source.timeout().timeout(chain.readTimeoutMillis(), MILLISECONDS);
          sink.timeout().timeout(chain.writeTimeoutMillis(), MILLISECONDS);
          return new Http1ExchangeCodec(client, this, source, sink);
        }
      }
    

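    Creating the BufferedSource and the BufferedSink each starts by creating an AsyncTimeout, which is then used to wrap the raw source or sink. This is the decorator pattern: it gives source and sink timeout behaviour. Later, when the ExchangeCodec is created (the HTTP/1 branch above), the readTimeout and writeTimeout configured on the OkHttpClient are applied to them, covering reads and writes respectively.

    readTimeout: the read timeout, 10s by default.

    writeTimeout: the write timeout, 10s by default.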

P.S. Because a socket can detect connect timeouts by itself, connectTimeout does not need the AsyncTimeout approach.
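To make the decorator idea concrete, here is a minimal stdlib-only sketch. It does not use Okio: CountingTimeout is a hypothetical stand-in that models only the enter()/exit() contract, TimeoutInputStream is an invented wrapper, and throwing InterruptedIOException on timeout is an assumption made for this example.

```java
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InterruptedIOException;

/** Hypothetical stand-in for AsyncTimeout: only the enter()/exit() contract is modelled. */
class CountingTimeout {
  int enterCount;
  int exitCount;
  boolean simulateTimeout; // set to true to pretend the watchdog fired during the operation

  void enter() {
    enterCount++;
  }

  /** Returns true if the timeout occurred while the operation was running. */
  boolean exit() {
    exitCount++;
    return simulateTimeout;
  }
}

/** Decorator: every bulk read on the wrapped stream is bracketed by enter() and exit(). */
class TimeoutInputStream extends FilterInputStream {
  private final CountingTimeout timeout;

  TimeoutInputStream(InputStream delegate, CountingTimeout timeout) {
    super(delegate);
    this.timeout = timeout;
  }

  @Override public int read(byte[] buffer, int offset, int length) throws IOException {
    boolean timedOut;
    int result;
    timeout.enter(); // start timing this single operation
    try {
      result = in.read(buffer, offset, length);
    } finally {
      timedOut = timeout.exit(); // always stop timing, even if the read failed
    }
    if (timedOut) throw new InterruptedIOException("timeout");
    return result;
  }
}
```

The caller keeps using an ordinary InputStream and never sees the timeout bookkeeping, which is the point of wrapping the stream instead of changing it.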

AsyncTimeout timeout detection

Joining the queue and starting detection

  // AsyncTimeout.java, line 72
  public final void enter() {
    if (inQueue) throw new IllegalStateException("Unbalanced enter/exit");
    long timeoutNanos = timeoutNanos();
    boolean hasDeadline = hasDeadline();
    if (timeoutNanos == 0 && !hasDeadline) {
      return; // No timeout and no deadline? Don't bother with the queue.
    }
    inQueue = true;
    scheduleTimeout(this, timeoutNanos, hasDeadline);
  }

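AsyncTimeout.enter() is shown above; calling it starts timeout detection for this object. The key part is the final call, scheduleTimeout(this, timeoutNanos, hasDeadline). It is a static method. Remember the static field head mentioned earlier? Let's look at this method next.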

  // AsyncTimeout.java, line 83
  private static synchronized void scheduleTimeout(
      AsyncTimeout node, long timeoutNanos, boolean hasDeadline) {
    // Start the watchdog thread and create the head node when the first timeout is scheduled.
    if (head == null) {
      head = new AsyncTimeout();
      new Watchdog().start();
    }

    long now = System.nanoTime();
    if (timeoutNanos != 0 && hasDeadline) {
      // Compute the earliest event; either timeout or deadline. Because nanoTime can wrap around,
      // Math.min() is undefined for absolute values, but meaningful for relative ones.
      node.timeoutAt = now + Math.min(timeoutNanos, node.deadlineNanoTime() - now);
    } else if (timeoutNanos != 0) {
      node.timeoutAt = now + timeoutNanos;
    } else if (hasDeadline) {
      node.timeoutAt = node.deadlineNanoTime();
    } else {
      throw new AssertionError();
    }

    // Insert the node in sorted order.
    long remainingNanos = node.remainingNanos(now);
    for (AsyncTimeout prev = head; true; prev = prev.next) {
      if (prev.next == null || remainingNanos < prev.next.remainingNanos(now)) {
        node.next = prev.next;
        prev.next = node;
        if (prev == head) {
          AsyncTimeout.class.notify(); // Wake up the watchdog when inserting at the front.
        }
        break;
      }
    }
  }

P.S. node.remainingNanos(now) returns the time left until the node times out, that is, timeoutAt - now.

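The method does three main things:

  • If the static field head is null, global detection has not started yet, so the Watchdog thread is started. P.S. head is only a marker for the start of the queue; it does not represent a timeout itself.
  • It computes timeoutAt, the timeout instant of the node being added to the queue.
  • It inserts the node into the global queue at its sorted position, so the queue stays ordered by timeoutAt from earliest to latest. The Watchdog's detection relies on this order. Because the queue is a linked list, insertion only requires re-pointing next references. The order is illustrated in the figure below (a journey chart is used as a stand-in; read it as a time axis):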
[Figure: AsyncTimeout queue order along a time axis (unit: ns): now, timeoutAt, timeoutAt2]
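The sorted insertion can be tried in isolation with a small stdlib-only model. SortedTimeoutQueue, Node and order() are illustrative names, not Okio's; the loop mirrors the insertion logic quoted above, and the boolean return value stands in for the point where the real code calls notify().

```java
/** Stdlib-only sketch of a singly linked queue kept sorted by expiry. Names are illustrative. */
final class SortedTimeoutQueue {
  static final class Node {
    final String name;
    final long timeoutAt;
    Node next;

    Node(String name, long timeoutAt) {
      this.name = name;
      this.timeoutAt = timeoutAt;
    }

    long remainingNanos(long now) {
      return timeoutAt - now;
    }
  }

  /** Sentinel: marks the start of the queue and never times out itself. */
  final Node head = new Node("head", 0L);

  /** Inserts the node in sorted order and returns true if it became the first real node. */
  boolean insert(Node node, long now) {
    long remaining = node.remainingNanos(now);
    for (Node prev = head; true; prev = prev.next) {
      if (prev.next == null || remaining < prev.next.remainingNanos(now)) {
        node.next = prev.next;
        prev.next = node;
        return prev == head; // the real code wakes the watchdog in this case
      }
    }
  }

  /** Returns the node names in queue order, for inspection. */
  String order() {
    StringBuilder sb = new StringBuilder();
    for (Node n = head.next; n != null; n = n.next) {
      if (sb.length() > 0) sb.append(',');
      sb.append(n.name);
    }
    return sb.toString();
  }
}
```

Inserting nodes that expire at now+300, now+100 and now+500 in that order yields the queue a(100), b(300), c(500). Only the first two insertions land at the front, so only they would need to wake the watchdog.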

Watchdog

Watchdog is the detection thread behind the whole AsyncTimeout mechanism.

private static final class Watchdog extends Thread {
    Watchdog() {
      super("Okio Watchdog");
      setDaemon(true);
    }

    public void run() {
      while (true) {
        try {
          AsyncTimeout timedOut;
          synchronized (AsyncTimeout.class) {
            timedOut = awaitTimeout();

            // Didn't find a node to interrupt. Try again.
            if (timedOut == null) continue;

            // The queue is completely empty. Let this thread exit and let another watchdog thread
            // get created on the next call to scheduleTimeout().
            if (timedOut == head) {
              head = null;
              return;
            }
          }

          // Close the timed out node.
          timedOut.timedOut();
        } catch (InterruptedException ignored) {
        }
      }
    }
  }
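  • awaitTimeout() finds an AsyncTimeout that has timed out.
  • If timedOut is null, detection continues with the next loop iteration.
  • If timedOut is head, the list holds no more objects to watch. head is cleared and the thread exits.
  • If timedOut is a valid timed-out object, its timedOut() method is called to notify whoever registered it.
  • Note that the Watchdog constructor calls setDaemon(true), making it a daemon thread (see the article "setDaemon详解" for background). The benefit is that it never keeps the JVM alive: once all non-daemon threads have finished, the JVM exits and the Watchdog stops with it.
  • Because Watchdog is a single thread shared by all timeouts, timedOut() should not do long-running work, otherwise later timeouts are delayed.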

awaitTimeout()

The Watchdog thread calls awaitTimeout() to find an AsyncTimeout that has expired.

static @Nullable AsyncTimeout awaitTimeout() throws InterruptedException {
    // Get the next eligible node.
    AsyncTimeout node = head.next;

    // The queue is empty. Wait until either something is enqueued or the idle timeout elapses.
    if (node == null) {
      long startNanos = System.nanoTime();
      AsyncTimeout.class.wait(IDLE_TIMEOUT_MILLIS);
      return head.next == null && (System.nanoTime() - startNanos) >= IDLE_TIMEOUT_NANOS
          ? head  // The idle timeout elapsed.
          : null; // The situation has changed.
    }

    long waitNanos = node.remainingNanos(System.nanoTime());

    // The head of the queue hasn't timed out yet. Await that.
    if (waitNanos > 0) {
      // Waiting is made complicated by the fact that we work in nanoseconds,
      // but the API wants (millis, nanos) in two arguments.
      long waitMillis = waitNanos / 1000000L;
      waitNanos -= (waitMillis * 1000000L);
      AsyncTimeout.class.wait(waitMillis, (int) waitNanos);
      return null;
    }

    // The head of the queue has timed out. Remove it.
    head.next = node.next;
    node.next = null;
    return node;
  }

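As the code shows, awaitTimeout() only ever inspects head.next:

  • If head.next is null, the thread first waits for up to 60s.
    • If the queue is still empty after the full 60s, the queue is considered drained: head is returned and the Watchdog thread ends. A new one is started the next time a timeout is scheduled.
    • If a node has appeared in the meantime (or the wait ended early), null is returned and the outer loop runs again.
  • If head.next has not timed out yet (its timeout is later than the current time), the thread waits for the difference between the two. After waking up it also returns null, and the outer loop runs again.
  • If head.next has already timed out, it is unlinked from the queue and returned, so that the Watchdog can call its timedOut().
  • awaitTimeout() relies on Java's wait() and notify()/notifyAll() mechanism. scheduleTimeout(this, timeoutNanos, hasDeadline) calls AsyncTimeout.class.notify() when a new node is inserted at the front of the list, so the Watchdog can recompute how long to wait. While nothing has timed out, the thread sleeps in wait() and releases the lock instead of occupying resources.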
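The conversion from a nanosecond duration to the (millis, nanos) pair expected by Object.wait(long, int) is easy to get wrong, so here it is as a standalone sketch. WaitSplitSketch, split() and waitFor() are hypothetical helper names. The guard against non-positive durations is an addition for safety here, since wait(0, 0) would block until notified.

```java
/** Stdlib-only sketch of waiting on a monitor for a duration given in nanoseconds. */
final class WaitSplitSketch {
  private WaitSplitSketch() {}

  /** Splits a nanosecond duration into {millis, leftoverNanos} for Object.wait(long, int). */
  static long[] split(long waitNanos) {
    long millis = waitNanos / 1_000_000L;
    long nanos = waitNanos - millis * 1_000_000L; // always in the range 0..999_999
    return new long[] {millis, nanos};
  }

  /**
   * Waits on the lock for roughly waitNanos, or until notified.
   * Like any wait(), this may return early, so callers should re-check their condition.
   */
  static void waitFor(Object lock, long waitNanos) throws InterruptedException {
    if (waitNanos <= 0) return; // nothing to wait for; avoids wait(0, 0), which waits indefinitely
    long[] parts = split(waitNanos);
    synchronized (lock) {
      lock.wait(parts[0], (int) parts[1]);
    }
  }
}
```

For instance, 2,500,000ns splits into 2ms plus 500,000ns. Because a wait can end early, returning null and looping again, as awaitTimeout() does, is a natural way to re-evaluate the queue after every wake-up.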

Further reading: "Java多线程学习之wait、notify/notifyAll 详解" (a detailed explanation of wait and notify/notifyAll in Java multithreading)

Exiting detection

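When the guarded work is finished, exit() must be called to remove the bound AsyncTimeout from the list. If the node can no longer be found in the list, it has already timed out.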

  /** Returns true if the timeout occurred. */
  public final boolean exit() {
    if (!inQueue) return false;
    inQueue = false;
    return cancelScheduledTimeout(this);
  }

  /** Returns true if the timeout occurred. */
  private static synchronized boolean cancelScheduledTimeout(AsyncTimeout node) {
    // Remove the node from the linked list.
    for (AsyncTimeout prev = head; prev != null; prev = prev.next) {
      if (prev.next == node) {
        prev.next = node.next;
        node.next = null;
        return false;
      }
    }

    // The node wasn't found in the linked list: it must have timed out!
    return true;
  }
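The rule "not found in the list means timed out" can be checked without threads. In the sketch below ExitSketch, Node and pollExpired() are invented names. pollExpired() plays the role of the watchdog unlinking an expired first node, and enter() simply appends, leaving out the sorted insertion and all synchronization for brevity.

```java
/** Stdlib-only, single-threaded sketch of how exit() can tell whether a timeout already fired. */
final class ExitSketch {
  static final class Node {
    Node next;
    boolean inQueue;
  }

  /** Sentinel marking the start of the queue. */
  final Node head = new Node();

  /** Adds the node to the end of the queue (ordering is omitted in this sketch). */
  void enter(Node node) {
    if (node.inQueue) throw new IllegalStateException("Unbalanced enter/exit");
    node.inQueue = true;
    Node prev = head;
    while (prev.next != null) prev = prev.next;
    prev.next = node;
  }

  /** Models the watchdog removing the first node after it expired. Returns null if empty. */
  Node pollExpired() {
    Node node = head.next;
    if (node == null) return null;
    head.next = node.next;
    node.next = null;
    return node;
  }

  /** Returns true if the timeout occurred, meaning the node is no longer linked. */
  boolean exit(Node node) {
    if (!node.inQueue) return false;
    node.inQueue = false;
    for (Node prev = head; prev != null; prev = prev.next) {
      if (prev.next == node) {
        prev.next = node.next; // still queued: unlink it, no timeout happened
        node.next = null;
        return false;
      }
    }
    return true; // someone else already removed it, so it must have timed out
  }
}
```

If two nodes enter and the first is then taken by pollExpired(), exit() returns true for the first and false for the second. A second exit() on either node returns false because inQueue has already been cleared.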