How the iOS Network Protocol Stack Works (Part 5): A Single Request Transfer in the Workhorse Class `_NativeProtocol`


_NativeProtocol is a wrapper around easyHandle

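_NativeProtocol is declared to conform to the protocol _EasyHandleDelegate, and its initializer creates the handle with _EasyHandle(delegate: self). In other words, events that occur while curl transfers a request are handed to _NativeProtocol through the delegate methods. See the code and comments below: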


class _NativeProtocol: _EasyHandleDelegate {
    ...

    // Default initializer; the instance is created when the request is initialized
    public required init(task: URLSessionTask, cachedResponse: CachedURLResponse?, client: URLProtocolClient?) {
        ...
        // When creating the curl handle, set its delegate to self
        self.easyHandle = _EasyHandle(delegate: self)
    }
}

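As shown above, _NativeProtocol holds a key member variable, internal var easyHandle: _EasyHandle!, and conforms to _EasyHandleDelegate; the easyHandle is what actually initiates the network request. The comment in the source explains this clearly: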

/// Minimal wrapper around the [curl easy interface](https://curl.haxx.se/libcurl/c/)
///
/// An *easy handle* manages the state of a transfer inside libcurl.
///
/// As such the easy handle's responsibility is implementing the HTTP
/// protocol while the *multi handle* is in charge of managing sockets and
/// reading from / writing to these sockets.
///
/// An easy handle is added to a multi handle in order to associate it with
/// an actual socket. The multi handle will then feed bytes into the easy
/// handle and read bytes from the easy handle. But this process is opaque
/// to use. It is further worth noting, that with HTTP/1.1 persistent
/// connections and with HTTP/2 there's a 1-to-many relationship between
/// TCP streams and HTTP transfers / easy handles. A single TCP stream and
/// its socket may be shared by multiple easy handles.
///
/// A single HTTP request-response exchange (refered to here as a
/// *transfer*) corresponds directly to an easy handle. Hence anything that
/// needs to be configured for a specific transfer (e.g. the URL) will be
/// configured on an easy handle.
///
/// A single `URLSessionTask` may do multiple, consecutive transfers, and
/// as a result it will have to reconfigure its easy handle between
/// transfers. An easy handle can be re-used once its transfer has
/// completed.

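The comment makes two things clear:

  1. easyHandle only covers the application layer of the HTTP request/response exchange, which the callback methods below also show.
  2. multiHandle takes care of the TCP socket layer on behalf of easyHandle, including the socket connection pool, the epoll model, and so on.
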
internal final class _EasyHandle {
    let rawHandle = CFURLSessionEasyHandleInit()
    weak var delegate: _EasyHandleDelegate? // the owning _NativeProtocol
    fileprivate var headerList: _CurlStringList?
    fileprivate var pauseState: _PauseState = []
    internal var timeoutTimer: _TimeoutSource! // built-in timeout timer, backed by a dispatch timer
    internal lazy var errorBuffer = [UInt8](repeating: 0, count: Int(CFURLSessionEasyErrorSize))
    internal var _config: URLSession._Configuration? = nil
    internal var _url: URL? = nil

    init(delegate: _EasyHandleDelegate) {
        self.delegate = delegate
        setupCallbacks()
    }

    deinit {
        CFURLSessionEasyHandleDeinit(rawHandle)
    }

    ...
}

internal protocol _EasyHandleDelegate: AnyObject {
    /// Called when the easyHandle receives body data.
    /// Handle data read from the network.
    /// - returns: the action to be taken: abort, proceed, or pause.
    func didReceive(data: Data) -> _EasyHandle._Action

    /// Called when a response header is received.
    /// Handle header data read from the network.
    /// - returns: the action to be taken: abort, proceed, or pause.
    func didReceive(headerData data: Data, contentLength: Int64) -> _EasyHandle._Action

    // Called when request body data needs to be sent.
    /// Fill a buffer with data to be sent.
    ///
    /// - parameter data: The buffer to fill
    /// - returns: the number of bytes written to the `data` buffer, or `nil` to stop the current transfer immediately.
    func fill(writeBuffer buffer: UnsafeMutableBufferPointer<Int8>) -> _EasyHandle._WriteBufferResult

    /// Called when the transfer finishes or fails.
    /// The transfer for this handle completed.
    /// - parameter errorCode: An NSURLError code, or `nil` if no error occurred.
    func transferCompleted(withError error: NSError?)

    // When an InputStream supplies the HTTP request body, this method is used to seek it.
    /// Seek the input stream to the given position
    func seekInputStream(to position: UInt64) throws

    // Progress update callback during the transfer.
    /// Gets called during the transfer to update progress.
    func updateProgressMeter(with propgress: _EasyHandle._Progress)
}
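
The ownership arrangement in the two listings above (the protocol object owns the handle, while the handle keeps only a weak reference back to its delegate) can be tried out in isolation. The sketch below is a minimal illustration under that reading; ToyDelegate, ToyHandle and ToyProtocol are hypothetical stand-ins, not the Foundation types. It mainly shows why the callbacks need a branch for a delegate that has already gone away.

```swift
// Hypothetical stand-ins for _EasyHandleDelegate, _EasyHandle and _NativeProtocol.
protocol ToyDelegate: AnyObject {
    func transferCompleted(withError error: String?)
}

final class ToyHandle {
    // Weak, so the handle never keeps its owner alive (no retain cycle).
    weak var delegate: ToyDelegate?

    init(delegate: ToyDelegate) {
        self.delegate = delegate
    }

    /// Forwards completion to the delegate; returns false if the delegate is gone.
    func completedTransfer(withError error: String?) -> Bool {
        guard let delegate = delegate else { return false }
        delegate.transferCompleted(withError: error)
        return true
    }
}

final class ToyProtocol: ToyDelegate {
    // Implicitly unwrapped so that `self` can be passed to the handle in init.
    var handle: ToyHandle!
    private(set) var completions: [String?] = []

    init() {
        handle = ToyHandle(delegate: self)
    }

    func transferCompleted(withError error: String?) {
        completions.append(error)
    }
}
```

Because the back reference is weak, releasing the protocol object is enough to break the link; any callback that arrives afterwards sees a nil delegate and has to bail out.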

Core callback methods of _EasyHandle during an HTTP request/response

The key methods are annotated below. Note that everything wrapped here is a callback at the HTTP protocol layer.

internal final class _EasyHandle {

    /// Called when the transfer ends, whether it completed or failed at any point of the HTTP exchange, to notify the upper layer.
    func completedTransfer(withError error: NSError?) {
        delegate?.transferCompleted(withError: error)
    }

    /// This callback function gets called by libcurl when it receives body
    /// data.
    /// Called each time libcurl receives a chunk of the response body.
    ///
    /// - SeeAlso: <https://curl.haxx.se/libcurl/c/CURLOPT_WRITEFUNCTION.html>
    func didReceive(data: UnsafeMutablePointer<Int8>, size: Int, nmemb: Int) -> Int {
        let d: Int = {
            let buffer = Data(bytes: data, count: size*nmemb)
            switch delegate?.didReceive(data: buffer) {
            case .proceed?: return size * nmemb
            case .abort?: return 0
            case .pause?:
                pauseState.insert(.receivePaused)
                return Int(CFURLSessionWriteFuncPause)
            case nil:
                /* the delegate disappeared */
                return 0
            }
        }()
        return d
    }

    /// This callback function gets called by libcurl when it receives header
    /// data.
    /// Called each time libcurl receives a response header line.
    ///
    /// - SeeAlso: <https://curl.haxx.se/libcurl/c/CURLOPT_HEADERFUNCTION.html>
    func didReceive(headerData data: UnsafeMutablePointer<Int8>, size: Int, nmemb: Int, contentLength: Double) -> Int {
        let buffer = Data(bytes: data, count: size*nmemb)
        let d: Int = {
            switch delegate?.didReceive(headerData: buffer, contentLength: Int64(contentLength)) {
            case .proceed?: return size * nmemb
            case .abort?: return 0
            case .pause?:
                pauseState.insert(.receivePaused)
                return Int(CFURLSessionWriteFuncPause)
            case nil:
                /* the delegate disappeared */
                return 0
            }
        }()
        // While receiving the response header, scan the buffer for cookies
        setCookies(headerData: buffer)
        return d
    }
    
    /// When libcurl receives a response header, parse it and look for a `Set-Cookie` field;
    /// if one is found, build the HTTPCookie values and store them in the configuration's cookie storage.
    func setCookies(headerData data: Data) {
        guard let config = _config, config.httpCookieAcceptPolicy !=  HTTPCookie.AcceptPolicy.never else { return }
        guard let headerData = String(data: data, encoding: String.Encoding.utf8) else { return }
        // Convert headerData from a string to a dictionary.
        // Ignore headers like 'HTTP/1.1 200 OK\r\n' which do not have a key value pair.
        // Value can have colons (ie, date), so only split at the first one, ie header:value
        let headerComponents = headerData.split(separator: ":", maxSplits: 1)
        var headers: [String: String] = [:]
        //Trim the leading and trailing whitespaces (if any) before adding the header information to the dictionary.
        if headerComponents.count > 1 {
            headers[String(headerComponents[0].trimmingCharacters(in: .whitespacesAndNewlines))] = headerComponents[1].trimmingCharacters(in: .whitespacesAndNewlines)
        }
        let cookies = HTTPCookie.cookies(withResponseHeaderFields: headers, for: _url!)
        guard cookies.count > 0 else { return }
        if let cookieStorage = config.httpCookieStorage {
            cookieStorage.setCookies(cookies, for: _url, mainDocumentURL: nil)
        }
    }

    /// Called when libcurl needs HTTP request body data to send.
    /// This callback function gets called by libcurl when it wants to send data
    /// it to the network.
    ///
    /// - SeeAlso: <https://curl.haxx.se/libcurl/c/CURLOPT_READFUNCTION.html>
    func fill(writeBuffer data: UnsafeMutablePointer<Int8>, size: Int, nmemb: Int) -> Int {
        let d: Int = {
            let buffer = UnsafeMutableBufferPointer(start: data, count: size * nmemb)
            switch delegate?.fill(writeBuffer: buffer) {
            case .pause?:
                pauseState.insert(.sendPaused)
                return Int(CFURLSessionReadFuncPause)
            case .abort?:
                return Int(CFURLSessionReadFuncAbort)
            case .bytes(let length)?:
                return length
            case nil:
                /* the delegate disappeared */
                return Int(CFURLSessionReadFuncAbort)
            }
        }()
        return d
    }
}
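
The receive callbacks above all translate the delegate's decision into the integer that libcurl expects back from a write callback. The sketch below isolates that mapping. It is an illustration, not the Foundation code: Action is a hypothetical mirror of _EasyHandle._Action, and writeFuncPause is assumed to match libcurl's CURL_WRITEFUNC_PAUSE value, which CFURLSessionWriteFuncPause presumably wraps.

```swift
// Hypothetical mirror of _EasyHandle._Action.
enum Action {
    case abort
    case proceed
    case pause
}

// libcurl's CURL_WRITEFUNC_PAUSE magic value (assumed to be what
// CFURLSessionWriteFuncPause exposes).
let writeFuncPause = 0x1000_0001

/// Translates the delegate's decision into the value a libcurl write callback returns.
/// A value that differs from the number of bytes passed in signals an error to libcurl.
func writeCallbackResult(for action: Action?, size: Int, nmemb: Int) -> Int {
    switch action {
    case .proceed?: return size * nmemb // all bytes were consumed
    case .abort?: return 0              // fewer bytes than offered: transfer fails
    case .pause?: return writeFuncPause // ask libcurl to pause this transfer
    case nil: return 0                  // the delegate disappeared
    }
}
```

For example, writeCallbackResult(for: .proceed, size: 1, nmemb: 512) reports all 512 bytes as handled, while a missing delegate is treated the same way as an explicit abort.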

How _NativeProtocol implements _EasyHandleDelegate

class _NativeProtocol: _EasyHandleDelegate {

    // This struct shows the key data structures used during an HTTP request:
    // 1. the request URL
    // 2. the request body (may live in memory or in a file)
    // 3. _ParsedResponseHeader, which accumulates the parsed response header
    // 4. the actual response
    internal struct _TransferState {
        /// The URL that's being requested
        let url: URL
        /// Raw headers received.
        let parsedResponseHeader: _ParsedResponseHeader
        /// Once the headers is complete, this will contain the response
        var response: URLResponse?
        /// The body data to be sent in the request
        let requestBodySource: _BodySource? // the actual data source for the request body
        /// Body data received
        let bodyDataDrain: _DataDrain // where received data is stored: file or memory
        /// Describes what to do with received body data for this transfer:
    }

    ...
    
    /// Called when the easy handle needs HTTP request body data to send.
    /// Each chunk sent here eventually triggers the delegate method:
    // func urlSession(_ session: URLSession, task: URLSessionTask, didSendBodyData bytesSent: Int64, totalBytesSent: Int64, totalBytesExpectedToSend: Int64)
    /// This callback function gets called by libcurl when it wants to send data it to the network.
    func fill(writeBuffer buffer: UnsafeMutableBufferPointer<Int8>) -> _EasyHandle._WriteBufferResult {
        guard case .transferInProgress(let ts) = internalState else {
            fatalError("Requested to fill write buffer, but transfer isn't in progress.")
        }
        guard let source = ts.requestBodySource else {
            fatalError("Requested to fill write buffer, but transfer state has no body source.")
        }
        switch source.getNextChunk(withLength: buffer.count) {
        case .data(let data):
            // upload one chunk of the body per call
            copyDispatchData(data, infoBuffer: buffer)
            let count = data.count
            assert(count > 0)
            notifyDelegate(aboutUploadedData: Int64(count))
            return .bytes(count)
        case .done:
            return .bytes(0)
        case .retryLater:
            // At this point we'll try to pause the easy handle. The body source
            // is responsible for un-pausing the handle once data becomes
            // available.
            return .pause
        case .error:
            return .abort
        }
    }

    /// This callback function gets called by libcurl when it receives header
    /// Called when the easy handle receives an HTTP response header
    func didReceive(headerData data: Data, contentLength: Int64) -> _EasyHandle._Action {
        NSRequiresConcreteImplementation()
    }

    /// Called when the easy handle receives HTTP response body data
    func didReceive(data: Data) -> _EasyHandle._Action {
        // 1. internalState must be .transferInProgress; ts: _TransferState carries the key state of the current transfer
        guard case .transferInProgress(var ts) = internalState else {
            fatalError("Received body data, but no transfer in progress.")
        }

        // 2. Check whether the response header has been fully parsed.
        // If so, store the resulting response in ts.
        if let response = validateHeaderComplete(transferState:ts) {
            ts.response = response
        }

        // 3. If the response has been parsed, check the HTTP status code; for a redirect, buffer the body and tell curl to proceed
        // Note this excludes code 300 which should return the response of the redirect and not follow it.
        // For other redirect codes dont notify the delegate of the data received in the redirect response.
        if let httpResponse = ts.response as? HTTPURLResponse, 301...308 ~= httpResponse.statusCode {
            if let _http = self as? _HTTPURLProtocol {
                // Save the response body in case the delegate does not perform a redirect and the 3xx response
                // including its body needs to be returned to the client.
                var redirectBody = _http.lastRedirectBody ?? Data()
                redirectBody.append(data)
                _http.lastRedirectBody = redirectBody
            }
            return .proceed
        }

        // 4. Notify the task delegate of the received data, using the callback that matches the task type
        notifyDelegate(aboutReceivedData: data)

        // 5. Append the received data to the transfer state held in internalState
        internalState = .transferInProgress(ts.byAppending(bodyData: data))
        return .proceed
    }

    /*
    Which callback is invoked depends on how the task reports results: whether it has a
    delegate, and which kind of task it is (DataTask, DownloadTask).

    The best-known callbacks reached from here are:

        func urlSession(_ session: URLSession, dataTask: URLSessionDataTask, didReceive data: Data)
    or
        func urlSession(_ session: URLSession, downloadTask: URLSessionDownloadTask, didWriteData bytesWritten: Int64, totalBytesWritten: Int64, totalBytesExpectedToWrite: Int64)
    */
    fileprivate func notifyDelegate(aboutReceivedData data: Data) {
        guard let t = self.task else {
            fatalError("Cannot notify")
        }
        if case .taskDelegate(let delegate) = t.session.behaviour(for: self.task!),
            let dataDelegate = delegate as? URLSessionDataDelegate,
            let task = self.task as? URLSessionDataTask {
            // Forward to the delegate:
            guard let s = self.task?.session as? URLSession else {
                fatalError()
            }
            s.delegateQueue.addOperation {
                dataDelegate.urlSession(s, dataTask: task, didReceive: data)
            }
        } else if case .taskDelegate(let delegate) = t.session.behaviour(for: self.task!),
            let downloadDelegate = delegate as? URLSessionDownloadDelegate,
            let task = self.task as? URLSessionDownloadTask {
            guard let s = self.task?.session as? URLSession else {
                fatalError()
            }
            let fileHandle = try! FileHandle(forWritingTo: self.tempFileURL)
            _ = fileHandle.seekToEndOfFile()
            fileHandle.write(data)
            task.countOfBytesReceived  += Int64(data.count)
            s.delegateQueue.addOperation {
                downloadDelegate.urlSession(s, downloadTask: task, didWriteData: Int64(data.count), totalBytesWritten: task.countOfBytesReceived, totalBytesExpectedToWrite: task.countOfBytesExpectedToReceive)
            }
        }
    }
}

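The code above splits into two parts, following the HTTP request/response sequence:

  1. Sending the HTTP request: essentially the fill(writeBuffer ... method, which is called when the HTTP request body is being sent.
  2. Receiving the HTTP response:
    1. response header received - didReceive(headerData ...
    2. response body received - didReceive(data ... Once response data arrives, the core URLSession delegate methods are called.

To sum up, _NativeProtocol encapsulates the data transfer part of an HTTP request/response.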
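
The sending half can be sketched as a loop in which the transfer engine repeatedly asks for the next chunk of the request body until the source reports that it is done. The code below is a simplified illustration under that reading: ToyBodySource and Chunk are hypothetical counterparts of the body source used by fill(writeBuffer:), and a plain byte array stands in for the unsafe buffer that libcurl provides. The buffer is assumed to be non-empty.

```swift
// Hypothetical counterpart of the chunk type returned by the body source.
enum Chunk {
    case data([UInt8])
    case done
}

/// A toy in-memory body source that hands out at most `length` bytes per call.
struct ToyBodySource {
    private var remaining: ArraySlice<UInt8>

    init(_ body: [UInt8]) {
        remaining = body[...]
    }

    mutating func getNextChunk(withLength length: Int) -> Chunk {
        guard !remaining.isEmpty else { return .done }
        let chunk = remaining.prefix(length)
        remaining = remaining.dropFirst(chunk.count)
        return .data(Array(chunk))
    }
}

/// Copies the next chunk into `buffer`, the way a read callback would,
/// and returns the number of bytes written.
func fill(_ buffer: inout [UInt8], from source: inout ToyBodySource) -> Int {
    switch source.getNextChunk(withLength: buffer.count) {
    case .data(let data):
        buffer.replaceSubrange(0..<data.count, with: data)
        return data.count
    case .done:
        return 0 // zero bytes signals that the body is complete
    }
}
```

With an 11-byte body and a 4-byte buffer, the function is called four times: three calls deliver 4, 4 and 3 bytes, and the final call returns 0 to mark the end of the body.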

_HTTPURLProtocol, the subclass of _NativeProtocol

As we have seen, _NativeProtocol implements most of the transfer-related parts of HTTP, but the URLProtocol actually used is _HTTPURLProtocol.

internal class _HTTPURLProtocol: _NativeProtocol {

    ...

    override class func canInit(with request: URLRequest) -> Bool {
        guard request.url?.scheme == "http" || request.url?.scheme == "https" else { return false }
        return true
    }

    override func didReceive(headerData data: Data, contentLength: Int64) -> _EasyHandle._Action {
        ...

        didReceiveResponse()
    }

    func didReceiveResponse() {
        guard let _ = task as? URLSessionDataTask else { return }
        guard case .transferInProgress(let ts) = self.internalState else { fatalError("Transfer not in progress.") }
        guard let response = ts.response as? HTTPURLResponse else { fatalError("Header complete, but not URL response.") }
        guard let session = task?.session as? URLSession else { fatalError() }
        switch session.behaviour(for: self.task!) {
        case .noDelegate:
            break
        case .taskDelegate:
            //TODO: There's a problem with libcurl / with how we're using it.
            // We're currently unable to pause the transfer / the easy handle:
            // https://curl.haxx.se/mail/lib-2016-03/0222.html
            //
            // For now, we'll notify the delegate, but won't pause the transfer,
            // and we'll disregard the completion handler:
            switch response.statusCode {
            case 301, 302, 303, 305...308:
                break
            default:
                // any other status code ->
                // call the client's urlProtocol(...) method on the work queue
                self.client?.urlProtocol(self, didReceive: response, cacheStoragePolicy: .notAllowed)
            }
        case .dataCompletionHandler:
            break
        case .downloadCompletionHandler:
            break
        }
    }

}

The key points about _HTTPURLProtocol are:

  1. canInit(with ... shows that it only handles URLs whose scheme is http or https.
  2. It implements the core _EasyHandleDelegate callback didReceive(headerData ..., which means curl calls back into _HTTPURLProtocol when it receives a response header. This callback does the following:
    1. Parses the HTTP response header.
    2. When the conditions are met, calls self.client?.urlProtocol(self, didReceive: response, cacheStoragePolicy: .notAllowed). This eventually surfaces to the upper layer as the func urlSession(_ session: URLSession, dataTask: URLSessionDataTask, didReceive response: URLResponse, completionHandler: @escaping (URLSession.ResponseDisposition) -> Void) callback.
  3. Cache-Control related logic.
  4. Part of the request redirect logic.
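
The two status-code checks in the listings are easy to confuse, so the sketch below restates them side by side as standalone predicates. The function names are hypothetical; only the ranges are taken from the listings (301...308 in didReceive(data:), and 301, 302, 303, 305...308 in didReceiveResponse()).

```swift
/// Mirrors the range check in didReceive(data:): body data of these responses
/// is buffered as a possible redirect body instead of being delivered.
func buffersBodyAsRedirect(_ statusCode: Int) -> Bool {
    return 301...308 ~= statusCode
}

/// Mirrors the switch in didReceiveResponse(): responses with these codes
/// are not forwarded to the client as a received response.
func forwardsResponseToClient(_ statusCode: Int) -> Bool {
    switch statusCode {
    case 301, 302, 303, 305...308:
        return false
    default:
        return true
    }
}
```

As written in the listings, 300 is excluded from both checks, while 304 falls inside the first range but outside the second.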