Back to Basics: How the HTTP Caching Mechanism Works

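Preface

HTTP caching improves the efficiency of network requests by avoiding unnecessary round trips, which makes it a useful tool for project optimization. This article summarizes the HTTP caching mechanism and walks through the OkHttp source code to see how OkHttp implements it.

This article is based on OkHttp 3.14.9.

GitHub: github.com/square/okht…

Gradle dependency: implementation group: 'com.squareup.okhttp3', name: 'okhttp', version: '3.14.9'

Reference: 详解Http缓存策略 ("HTTP Caching Strategies Explained")

Okhttp.Cache

Let's start with how to configure and use the cache.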

val okHttpClient = OkHttpClient.Builder()
                    .cache(Cache(this.cacheDir, 20 * 1024 * 1024))
                    .build()

// OkhttpClient.Builder.java
public Builder cache(@Nullable Cache cache) {
      this.cache = cache;
      this.internalCache = null;
      return this;
}                    

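When creating an OkHttpClient, you can supply an okhttp3.Cache. Its constructor takes the directory where cached responses are stored and the maximum cache size in bytes (20 MiB in the example above).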

// Cache.java
final InternalCache internalCache = new InternalCache() {
    @Override public @Nullable Response get(Request request) throws IOException {
      return Cache.this.get(request);
    }

    @Override public @Nullable CacheRequest put(Response response) throws IOException {
      return Cache.this.put(response);
    }

    @Override public void remove(Request request) throws IOException {
      Cache.this.remove(request);
    }

    @Override public void update(Response cached, Response network) {
      Cache.this.update(cached, network);
    }

    @Override public void trackConditionalCacheHit() {
      Cache.this.trackConditionalCacheHit();
    }

    @Override public void trackResponse(CacheStrategy cacheStrategy) {
      Cache.this.trackResponse(cacheStrategy);
    }
  };

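The internalCache field of Cache is the interface through which the rest of OkHttp reads and writes cache entries. Each of its methods simply delegates to the enclosing Cache.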

// OkhttpClient.java
@Nullable InternalCache internalCache() {
    return cache != null ? cache.internalCache : internalCache;
}

OkHttpClient provides a method that returns this internalCache.

// RealCall.java
Response getResponseWithInterceptorChain() throws IOException {
    // Build a full stack of interceptors.
    List<Interceptor> interceptors = new ArrayList<>();
    interceptors.addAll(client.interceptors());
    interceptors.add(new RetryAndFollowUpInterceptor(client));
    interceptors.add(new BridgeInterceptor(client.cookieJar()));
    interceptors.add(new CacheInterceptor(client.internalCache()));
    interceptors.add(new ConnectInterceptor(client));
    // ... (remaining interceptors and the call to chain.proceed() omitted)
}

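When a request is issued, getResponseWithInterceptorChain() builds OkHttp's chain of responsibility. The internalCache object described above is passed to the CacheInterceptor when it is created, so the interceptor can apply the caching strategy later.

OkHttp's cache handling therefore takes place in CacheInterceptor.intercept().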
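The chain-of-responsibility idea can be sketched in plain Java. This is not OkHttp's code: the Interceptor and Chain types below are simplified, hypothetical stand-ins that pass strings instead of Request and Response objects, and the cache is just a Map. The sketch only shows how a caching step placed before the network step can answer a request without the rest of the chain ever running.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Simplified, hypothetical stand-ins for OkHttp's Interceptor and Chain types.
final class ChainSketch {
    interface Interceptor {
        String intercept(Chain chain);
    }

    static final class Chain {
        private final List<Interceptor> interceptors;
        private final int index;
        final String request;

        Chain(List<Interceptor> interceptors, int index, String request) {
            this.interceptors = interceptors;
            this.index = index;
            this.request = request;
        }

        // Hand the request to the interceptor at the current position,
        // giving it a chain that points at the next position.
        String proceed(String request) {
            Chain next = new Chain(interceptors, index + 1, request);
            return interceptors.get(index).intercept(next);
        }
    }

    // Counts how often the pretend "network" step is reached.
    static int networkCalls = 0;

    static Chain newChain(Map<String, String> cache, String request) {
        List<Interceptor> interceptors = new ArrayList<>();
        // Cache step: answer from the map if possible, otherwise continue down the chain.
        interceptors.add(chain -> {
            String cached = cache.get(chain.request);
            if (cached != null) return cached;
            String fresh = chain.proceed(chain.request);
            cache.put(chain.request, fresh);
            return fresh;
        });
        // Last step: pretend to call the server.
        interceptors.add(chain -> {
            networkCalls++;
            return "response for " + chain.request;
        });
        return new Chain(interceptors, 0, request);
    }
}
```

Calling ChainSketch.newChain(cache, "GET /a").proceed("GET /a") twice with the same map reaches the network step only once. The second call is answered by the cache step, which is the role CacheInterceptor plays in the real chain.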

// CacheInterceptor.java
final @Nullable InternalCache cache;

public CacheInterceptor(@Nullable InternalCache cache) {
    this.cache = cache;
}
  
@Override public Response intercept(Chain chain) throws IOException {
    Response cacheCandidate = cache != null
        ? cache.get(chain.request())
        : null;

    long now = System.currentTimeMillis();

    CacheStrategy strategy = new CacheStrategy.Factory(now, chain.request(), cacheCandidate).get();
    Request networkRequest = strategy.networkRequest;
    Response cacheResponse = strategy.cacheResponse;
    // ... (the rest of intercept() is shown in later snippets)
}

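As the snippet shows, when the chain reaches CacheInterceptor, it first looks up a previously cached response for the request. Note that cache.get here is the get method of the internalCache object described above.

It then creates a CacheStrategy.Factory and calls its get() method to obtain a CacheStrategy, which holds the networkRequest to send (if any) and the usable cacheResponse (if any).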

public CacheStrategy get() {
    CacheStrategy candidate = getCandidate();

    if (candidate.networkRequest != null && request.cacheControl().onlyIfCached()) {
      // We're forbidden from using the network and the cache is insufficient.
      return new CacheStrategy(null, null);
    }

    return candidate;
}

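CacheStrategy.Factory.get() calls getCandidate(), so getCandidate() is where the caching policy is actually decided.

The rest of this article uses CacheInterceptor.intercept() and getCandidate() to explain the caching mechanism.

Strong caching

HTTP caching can be roughly divided into strong caching and comparison caching. Let's look at strong caching first.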

sequenceDiagram
Client->>Cache: Is there a cached response that satisfies the request?
Cache->>Client: An unexpired entry exists, return it directly.

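Strong caching is controlled by two header fields, Expires and Cache-Control. Cache-Control takes precedence over Expires.

Expires

This field is straightforward: it gives the absolute time at which the cached response expires. Once that time has passed, the cached copy is considered stale.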
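As a small illustration, the check below parses an Expires value and compares it with the current time. It is a sketch that uses java.time rather than OkHttp's own HttpDate helper, and the class and method names are made up for this example. Treating an unparseable value such as "0" as already expired follows the HTTP caching specification, not anything specific to OkHttp.

```java
import java.time.Instant;
import java.time.ZonedDateTime;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeParseException;

// Hypothetical helper, not part of OkHttp.
final class ExpiresSketch {
    // Returns true if the Expires value is not in the future relative to `now`.
    // An unparseable value (for example "0") is treated as already expired.
    static boolean isExpired(String expiresHeader, Instant now) {
        try {
            Instant expires = ZonedDateTime
                    .parse(expiresHeader, DateTimeFormatter.RFC_1123_DATE_TIME)
                    .toInstant();
            return !now.isBefore(expires);
        } catch (DateTimeParseException e) {
            return true;
        }
    }
}
```

For example, with Expires: Wed, 21 Oct 2015 07:28:00 GMT the response is still fresh at 07:00 GMT that day and expired at 08:00 GMT.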

Cache-Control

Cache-Control often appears in the following form:

cache-control: public, max-age=7200

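You can think of it as a composite value: several directives that together form a caching configuration. OkHttp models it with the CacheControl class. Common Cache-Control directives include:

  • public

    • The response may be stored by any cache (the requesting client, CDNs and other proxy servers, and so on), even if it would normally not be cacheable (for example, when the response has no max-age directive or Expires header).
  • private

    • The response may be stored only for a single user and must not be stored by a shared cache (that is, proxy servers must not cache it). A private cache, such as the client's own, may store it.
  • no-cache

    • The response may be stored locally, but it must be validated with the server before each reuse. The cached copy is used only if the server confirms it is still valid (in other words, comparison caching is required).
  • no-store

    • Nothing from the request or response may be stored in any cache. Every request must go to the server for the content.
  • max-age

    • The maximum time, in seconds, for which the response is considered fresh. After that, the cached copy is treated as expired.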
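To make the "several directives in one header" idea concrete, here is a minimal parsing sketch. It is not OkHttp's CacheControl.parse(): the class and method names are invented for this example, and the naive split on commas would mishandle a quoted argument that itself contains a comma.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical helper, not part of OkHttp.
final class CacheControlSketch {
    // Splits a Cache-Control value into directive -> argument
    // (the argument is "" when the directive has none).
    static Map<String, String> parse(String headerValue) {
        Map<String, String> directives = new LinkedHashMap<>();
        for (String part : headerValue.split(",")) {
            String token = part.trim();
            if (token.isEmpty()) continue;
            int eq = token.indexOf('=');
            String name = (eq == -1 ? token : token.substring(0, eq)).trim().toLowerCase(Locale.US);
            String value = eq == -1 ? "" : token.substring(eq + 1).trim();
            // Strip optional surrounding quotes, e.g. max-age="60".
            if (value.length() >= 2 && value.startsWith("\"") && value.endsWith("\"")) {
                value = value.substring(1, value.length() - 1);
            }
            directives.put(name, value);
        }
        return directives;
    }

    // Returns max-age in seconds, or -1 when it is absent or malformed.
    static int maxAgeSeconds(Map<String, String> directives) {
        String value = directives.get("max-age");
        if (value == null) return -1;
        try {
            return Integer.parseInt(value);
        } catch (NumberFormatException e) {
            return -1;
        }
    }
}
```

Parsing "public, max-age=7200" yields the two directives public and max-age, with maxAgeSeconds returning 7200. The -1 convention for "not present" mirrors the maxAgeSeconds() != -1 checks in the OkHttp snippets in this article.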

How the OkHttp source implements it

// CacheStrategy.Factory
public Factory(long nowMillis, Request request, Response cacheResponse) {
      this.nowMillis = nowMillis;
      this.request = request;
      this.cacheResponse = cacheResponse;

      if (cacheResponse != null) {
        this.sentRequestMillis = cacheResponse.sentRequestAtMillis();
        this.receivedResponseMillis = cacheResponse.receivedResponseAtMillis();
        Headers headers = cacheResponse.headers();
        for (int i = 0, size = headers.size(); i < size; i++) {
          String fieldName = headers.name(i);
          String value = headers.value(i);
          if ("Date".equalsIgnoreCase(fieldName)) {
            servedDate = HttpDate.parse(value);
            servedDateString = value;
          } else if ("Expires".equalsIgnoreCase(fieldName)) {
            expires = HttpDate.parse(value);
          } else if ("Last-Modified".equalsIgnoreCase(fieldName)) {
            lastModified = HttpDate.parse(value);
            lastModifiedString = value;
          } else if ("ETag".equalsIgnoreCase(fieldName)) {
            etag = value;
          } else if ("Age".equalsIgnoreCase(fieldName)) {
            ageSeconds = HttpHeaders.parseSeconds(value, -1);
          }
        }
      }
    }

// getCandidate()
private CacheStrategy getCandidate() {
      // ...
      
      CacheControl responseCaching = cacheResponse.cacheControl();

      long ageMillis = cacheResponseAge();
      long freshMillis = computeFreshnessLifetime();

      if (requestCaching.maxAgeSeconds() != -1) {
        freshMillis = Math.min(freshMillis, SECONDS.toMillis(requestCaching.maxAgeSeconds()));
      }

      long minFreshMillis = 0;
      if (requestCaching.minFreshSeconds() != -1) {
        minFreshMillis = SECONDS.toMillis(requestCaching.minFreshSeconds());
      }

      long maxStaleMillis = 0;
      if (!responseCaching.mustRevalidate() && requestCaching.maxStaleSeconds() != -1) {
        maxStaleMillis = SECONDS.toMillis(requestCaching.maxStaleSeconds());
      }

      if (!responseCaching.noCache() && ageMillis + minFreshMillis < freshMillis + maxStaleMillis) {
        Response.Builder builder = cacheResponse.newBuilder();
        if (ageMillis + minFreshMillis >= freshMillis) {
          builder.addHeader("Warning", "110 HttpURLConnection \"Response is stale\"");
        }
        long oneDayMillis = 24 * 60 * 60 * 1000L;
        if (ageMillis > oneDayMillis && isFreshnessLifetimeHeuristic()) {
          builder.addHeader("Warning", "113 HttpURLConnection \"Heuristic expiration\"");
        }
        return new CacheStrategy(null, builder.build());
      }
      
      // ...

private long cacheResponseAge() {
      long apparentReceivedAge = servedDate != null
          ? Math.max(0, receivedResponseMillis - servedDate.getTime())
          : 0;
      long receivedAge = ageSeconds != -1
          ? Math.max(apparentReceivedAge, SECONDS.toMillis(ageSeconds))
          : apparentReceivedAge;
      long responseDuration = receivedResponseMillis - sentRequestMillis;
      long residentDuration = nowMillis - receivedResponseMillis;
      return receivedAge + responseDuration + residentDuration;
}

private long computeFreshnessLifetime() {
      CacheControl responseCaching = cacheResponse.cacheControl();
      if (responseCaching.maxAgeSeconds() != -1) {
        return SECONDS.toMillis(responseCaching.maxAgeSeconds());
      } else if (expires != null) {
        long servedMillis = servedDate != null
            ? servedDate.getTime()
            : receivedResponseMillis;
        long delta = expires.getTime() - servedMillis;
        return delta > 0 ? delta : 0;
      } else if (lastModified != null
          && cacheResponse.request().url().query() == null) {
        // As recommended by the HTTP RFC and implemented in Firefox, the
        // max age of a document should be defaulted to 10% of the
        // document's age at the time it was served. Default expiration
        // dates aren't used for URIs containing a query.
        long servedMillis = servedDate != null
            ? servedDate.getTime()
            : sentRequestMillis;
        long delta = servedMillis - lastModified.getTime();
        return delta > 0 ? (delta / 10) : 0;
      }
      return 0;
    }

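A rough summary of the code above:

  • The CacheStrategy.Factory constructor receives the current request passed in from CacheInterceptor together with the matching cached response. It reads the Expires field of cacheResponse and the Date on which the response was served, both of which are used to decide whether the strong cache is still valid. It also reads some fields used for comparison caching, which are covered later.
  • getCandidate() ultimately returns a CacheStrategy object, which can be thought of as a pair: the network request to send and the processed cached response, cacheResponse. Either may be null.
  • getCandidate() reads the Cache-Control directives of cacheResponse and computes whether the cached response is still fresh.
  • If getCandidate() decides the cached response is fresh, it returns new CacheStrategy(null, builder.build()). This is the strong caching path: the cached response is returned directly.
  • Notably, computeFreshnessLifetime() checks the max-age of Cache-Control first and falls back to Expires only when max-age is absent, which is where the precedence between the two comes from.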
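The precedence described in the summary can be condensed into a few lines. The sketch below is a simplification under stated assumptions, not a copy of OkHttp's logic: it takes plain millisecond values with -1 meaning "header absent", it ignores the request-side max-age, min-fresh and max-stale directives, and it omits OkHttp's rule that the Last-Modified heuristic applies only to URLs without a query. The class and method names are hypothetical.

```java
// Hypothetical helper, not part of OkHttp.
final class FreshnessSketch {
    // Freshness lifetime in milliseconds. Precedence: max-age, then Expires - Date,
    // then 10% of (Date - Last-Modified). Pass -1 for any value the response lacks.
    static long freshnessLifetimeMillis(long maxAgeSeconds, long expiresMillis,
                                        long servedDateMillis, long lastModifiedMillis) {
        if (maxAgeSeconds != -1) return maxAgeSeconds * 1000L;
        if (expiresMillis != -1 && servedDateMillis != -1) {
            return Math.max(0, expiresMillis - servedDateMillis);
        }
        if (lastModifiedMillis != -1 && servedDateMillis != -1) {
            return Math.max(0, (servedDateMillis - lastModifiedMillis) / 10);
        }
        return 0;
    }

    // A cached response can be served without the network while its age
    // is below its freshness lifetime.
    static boolean isFresh(long ageMillis, long freshnessLifetimeMillis) {
        return ageMillis < freshnessLifetimeMillis;
    }
}
```

With max-age=7200 the lifetime is 7,200,000 ms regardless of Expires. Without max-age, a response served at t=1000 with Expires at t=5000 gets a lifetime of 4000 ms.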

// CacheInterceptor.intercept()
// If we don't need the network, we're done.
    if (networkRequest == null) {
      return cacheResponse.newBuilder()
          .cacheResponse(stripBody(cacheResponse))
          .build();
    }

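Back in CacheInterceptor.intercept(), if the returned networkRequest is null, the cacheResponse can be returned directly and no network request is made.

Comparison caching

Comparison caching is also called negotiated caching (the HTTP specifications describe it as validation with conditional requests). The response carries a field that records the version of the data, and the client asks the server, using that field, whether the cached copy can still be used or whether updated data must be downloaded.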

sequenceDiagram
Client->>Server: Request carrying the cache validator
Server->>Client: Cache still valid, return 304
Client->>Cache: Read the cached entry
Cache->>Client: Cached response

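The diagram above shows the flow. The client sends a validator along with its request, and the server uses it to decide whether the cached copy is still valid. If so, the server returns status code 304 (Not Modified) with no body. Otherwise it returns the latest data with status code 200.

Comparison caching can be implemented in two ways: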

  • Last-Modified / If-Modified-Since
  • Etag / If-None-Match

Last-Modified / If-Modified-Since

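On the first request, the server includes a Last-Modified field in the response that records when the data was last modified. On the next request, the client copies the Last-Modified value from the cached response into the request's If-Modified-Since field so that the server can compare it.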

Etag / If-None-Match

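This is easy to understand: it is the Last-Modified scheme with the timestamp replaced by a tag. In my view, the benefit is that it handles data that changes frequently or changes right at a timestamp boundary. Last-Modified has a resolution of one second, so judging validity by time alone can be wrong when two changes fall within the same second.

PS: ETag / If-None-Match takes precedence over Last-Modified / If-Modified-Since.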
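The server side of this exchange can be sketched as a single decision function. This is an illustrative model only: the names are hypothetical, ETags are compared as exact strings (no weak comparison, no "*" and no comma-separated lists), and dates are passed as millisecond timestamps rather than parsed headers.

```java
// Hypothetical server-side helper, not part of OkHttp.
final class ValidatorSketch {
    // Decides the status code for a conditional GET. Null means "header not sent".
    // If-None-Match is checked first. If-Modified-Since is consulted only when
    // If-None-Match is absent.
    static int statusFor(String ifNoneMatch, Long ifModifiedSinceMillis,
                         String currentEtag, long lastModifiedMillis) {
        if (ifNoneMatch != null) {
            return ifNoneMatch.equals(currentEtag) ? 304 : 200;
        }
        if (ifModifiedSinceMillis != null) {
            return lastModifiedMillis <= ifModifiedSinceMillis ? 304 : 200;
        }
        return 200;
    }
}
```

When both validators are sent and the ETag no longer matches, the function returns 200 even if the timestamp comparison alone would have allowed a 304, which is the precedence noted above.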

OkHttp source implementation

// getCandidate()
      String conditionName;
      String conditionValue;
      if (etag != null) {
        conditionName = "If-None-Match";
        conditionValue = etag;
      } else if (lastModified != null) {
        conditionName = "If-Modified-Since";
        conditionValue = lastModifiedString;
      } else if (servedDate != null) {
        conditionName = "If-Modified-Since";
        conditionValue = servedDateString;
      } else {
        return new CacheStrategy(request, null); // No condition! Make a regular request.
      }

      Headers.Builder conditionalRequestHeaders = request.headers().newBuilder();
      Internal.instance.addLenient(conditionalRequestHeaders, conditionName, conditionValue);

      Request conditionalRequest = request.newBuilder()
          .headers(conditionalRequestHeaders.build())
          .build();
      return new CacheStrategy(conditionalRequest, cacheResponse);

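Back in getCandidate(): once the strong caching check above fails, the method assembles the headers for a conditional request. It checks etag first and falls back to lastModified (and then to the Date of the cached response) only when no ETag exists. This is where ETag / If-None-Match takes precedence over Last-Modified / If-Modified-Since.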

// class CacheInterceptor
    Response networkResponse = null;
    try {
      networkResponse = chain.proceed(networkRequest);
    } finally {
      // If we're crashing on I/O or otherwise, don't leak the cache body.
      if (networkResponse == null && cacheCandidate != null) {
        closeQuietly(cacheCandidate.body());
      }
    }

    // If we have a cache response too, then we're doing a conditional get.
    if (cacheResponse != null) {
      if (networkResponse.code() == HTTP_NOT_MODIFIED) {
        Response response = cacheResponse.newBuilder()
            .headers(combine(cacheResponse.headers(), networkResponse.headers()))
            .sentRequestAtMillis(networkResponse.sentRequestAtMillis())
            .receivedResponseAtMillis(networkResponse.receivedResponseAtMillis())
            .cacheResponse(stripBody(cacheResponse))
            .networkResponse(stripBody(networkResponse))
            .build();
        networkResponse.body().close();

        // Update the cache after combining headers but before stripping the
        // Content-Encoding header (as performed by initContentStream()).
        cache.trackConditionalCacheHit();
        cache.update(cacheResponse, response);
        return response;
      } else {
        closeQuietly(cacheResponse.body());
      }
    }

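Back in CacheInterceptor, after the network request completes, a status code of HTTP_NOT_MODIFIED (304) means the body of cacheResponse is used to build the response returned to the caller, with the headers of the two responses merged.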
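The header merge on a 304 deserves a closer look. The sketch below is loosely modelled on the idea behind the combine(...) call in the snippet above, but it is a simplification with hypothetical names: headers are plain Map entries, and it leaves out the handling of Warning headers and hop-by-hop headers that a full implementation needs.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper, not part of OkHttp.
final class NotModifiedSketch {
    // Headers that describe the stored body. A 304 carries no body,
    // so the cached values are kept for these.
    private static boolean describesBody(String name) {
        return name.equalsIgnoreCase("Content-Length")
                || name.equalsIgnoreCase("Content-Encoding")
                || name.equalsIgnoreCase("Content-Type");
    }

    // Merges headers after a 304: start from the cached headers, then let the
    // 304 response override everything except the body-describing headers.
    // Header names are matched case-insensitively.
    static Map<String, String> mergeHeaders(Map<String, String> cached,
                                            Map<String, String> notModified) {
        Map<String, String> merged = new TreeMap<>(String.CASE_INSENSITIVE_ORDER);
        merged.putAll(cached);
        for (Map.Entry<String, String> header : notModified.entrySet()) {
            if (!describesBody(header.getKey())) {
                merged.put(header.getKey(), header.getValue());
            }
        }
        return merged;
    }
}
```

The point of the merge is that a 304 can refresh metadata such as Date or Cache-Control, which restarts the freshness calculation for the cached entry, while the cached Content-Type and body stay as they were.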

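OkHttp caching flow

Here is a borrowed diagram that illustrates the flow of the caching mechanism:

[image.png: flowchart of the caching mechanism]

  • Check the strong cache first. If it is valid, return the cached response directly.
  • If the strong cache is not valid, check whether cacheResponse carries an ETag field. If it does, add it to the request's If-None-Match header and perform comparison caching.
  • If there is no ETag, check whether cacheResponse carries a Last-Modified field. If it does, add it to the request's If-Modified-Since header and perform comparison caching.
  • All of these steps appear in the getCandidate() method shown above.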