kafka server - LogManager

LogManager

Kafka's LogManager is the broker's log management subsystem. It is responsible for creating, retrieving and cleaning up logs, and every log read and write request goes through it. Let's first look at the properties a LogManager holds.

Properties

  • logDirs: Seq[File]
  • initialOfflineDirs: Seq[File]
  • topicConfigs: Map[String, LogConfig]; per-topic log configuration
  • initialDefaultConfig: LogConfig; the default log configuration
  • cleanerConfig: CleanerConfig; configuration for log cleaning
  • recoveryThreadsPerDataDir: Int; number of threads used to recover and load the logs in each log directory
  • flushCheckMs: Long
  • flushRecoveryOffsetCheckpointMs: Long
  • flushStartOffsetCheckpointMs: Long
  • retentionCheckMs: Long
  • maxPidExpirationMs: Int
  • scheduler: Scheduler; Kafka's internal task-scheduling utility
  • brokerState: BrokerState; the broker's state machine
  • brokerTopicStats: BrokerTopicStats
  • logDirFailureChannel: LogDirFailureChannel
  • time: Time; Kafka's internal time utility

Methods

Constructor

The LogManager constructor is fairly simple. Its main steps are:

  • Create and validate the directories in logDirs: make sure there are no duplicates, and create any directory that does not exist yet
  • Lock each directory in logDirs //todo (implementation)
  • Create a recovery-point-offset-checkpoint file in each log directory, used to persist the topic/partition => offset mapping (a sketch of this file format follows this list) //todo (purpose)
  • Create a log-start-offset-checkpoint file in each log directory, also used to persist a topic/partition => offset mapping //todo (purpose)
  • loadLogs(): load the logs found in the directories; described in detail below
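
As an illustration of the checkpoint files mentioned above, the sketch below writes and reads the plain-text layout these files use: a version line, a line with the number of entries, then one "topic partition offset" line per entry. The object and helper names are invented for the sketch; in Kafka the real work is done by its own checkpoint file class.

import java.io.{File, PrintWriter}
import scala.io.Source

// Minimal sketch of the checkpoint file layout (invented object, not Kafka's class):
// line 1 = format version, line 2 = number of entries, then "topic partition offset" lines.
object CheckpointFileSketch {

  def write(file: File, offsets: Map[(String, Int), Long]): Unit = {
    val writer = new PrintWriter(file)
    try {
      writer.println(0)                // format version
      writer.println(offsets.size)     // number of entries
      offsets.foreach { case ((topic, partition), offset) =>
        writer.println(s"$topic $partition $offset")
      }
    } finally writer.close()
  }

  def read(file: File): Map[(String, Int), Long] = {
    val source = Source.fromFile(file)
    try {
      // skip the version and entry-count lines, then parse "topic partition offset"
      source.getLines().drop(2).map { line =>
        val Array(topic, partition, offset) = line.split(" ")
        (topic, partition.toInt) -> offset.toLong
      }.toMap
    } finally source.close()
  }

  def main(args: Array[String]): Unit = {
    val f = File.createTempFile("recovery-point-offset-checkpoint", "")
    write(f, Map(("my-topic", 0) -> 100L, ("my-topic", 1) -> 42L))
    println(read(f)) // Map((my-topic,0) -> 100, (my-topic,1) -> 42)
  }
}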

The loadLogs() method

private def loadLogs(): Unit = {
    info("Loading logs.")
    val startMs = time.milliseconds
    val threadPools = ArrayBuffer.empty[ExecutorService]
    val offlineDirs = mutable.Set.empty[(String, IOException)]
    val jobs = mutable.Map.empty[File, Seq[Future[_]]]

    for (dir <- liveLogDirs) {
      try {
        // a dedicated thread pool per data directory performs the loading
        val pool = Executors.newFixedThreadPool(numRecoveryThreadsPerDataDir)
        threadPools.append(pool)

        val cleanShutdownFile = new File(dir, Log.CleanShutdownFile)

        if (cleanShutdownFile.exists) {
          // clean shutdown, no recovery work needed
          debug(s"Found clean shutdown file. Skipping recovery for all logs in data directory: ${dir.getAbsolutePath}")
        } else {
          // log recovery itself is being performed by `Log` class during initialization
          brokerState.newState(RecoveringFromUncleanShutdown)
        }

        var recoveryPoints = Map[TopicPartition, Long]()
        try {
          recoveryPoints = this.recoveryPointCheckpoints(dir).read
        } catch {
          case e: Exception =>
            warn("Error occurred while reading recovery-point-offset-checkpoint file of directory " + dir, e)
            warn("Resetting the recovery checkpoint to 0")
        }

        var logStartOffsets = Map[TopicPartition, Long]()
        try {
          // read the start offset of each log
          logStartOffsets = this.logStartOffsetCheckpoints(dir).read
        } catch {
          case e: Exception =>
            warn("Error occurred while reading log-start-offset-checkpoint file of directory " + dir, e)
        }

        val jobsForDir = for {
          dirContent <- Option(dir.listFiles).toList
          logDir <- dirContent if logDir.isDirectory
        } yield {
          CoreUtils.runnable {
            try {
              // runnable task that loads a single log
              loadLog(logDir, recoveryPoints, logStartOffsets)
            } catch {
              case e: IOException =>
                offlineDirs.add((dir.getAbsolutePath, e))
                error("Error while loading log dir " + dir.getAbsolutePath, e)
            }
          }
        }
        jobs(cleanShutdownFile) = jobsForDir.map(pool.submit)
      } catch {
        case e: IOException =>
          offlineDirs.add((dir.getAbsolutePath, e))
          error("Error while loading log dir " + dir.getAbsolutePath, e)
      }
    }

    try {
      for ((cleanShutdownFile, dirJobs) <- jobs) {
        dirJobs.foreach(_.get)
        try {
          cleanShutdownFile.delete()
        } catch {
          case e: IOException =>
            offlineDirs.add((cleanShutdownFile.getParent, e))
            error(s"Error while deleting the clean shutdown file $cleanShutdownFile", e)
        }
      }

      offlineDirs.foreach { case (dir, e) =>
        logDirFailureChannel.maybeAddOfflineLogDir(dir, s"Error while deleting the clean shutdown file in dir $dir", e)
      }
    } catch {
      case e: ExecutionException =>
        error("There was an error in one of the threads during logs loading: " + e.getCause)
        throw e.getCause
    } finally {
      threadPools.foreach(_.shutdown())
    }

    info(s"Logs loading complete in ${time.milliseconds - startMs} ms.")
  }
  • For each log directory:
    • Create a thread pool: val pool = Executors.newFixedThreadPool(numRecoveryThreadsPerDataDir)
    • If a .kafka_cleanshutdown file exists, skip recovery for the logs in this directory; if it does not exist, set the broker state to RecoveringFromUncleanShutdown
    • Read the directory's recovery-point-offset-checkpoint file; if reading fails, reset the recovery checkpoint to 0
    • Read the directory's log-start-offset-checkpoint file
    • List the subdirectories of the directory, and for each of them
      • run loadLog(logDir, recoveryPoints, logStartOffsets)

The loadLog method

private def loadLog(logDir: File, recoveryPoints: Map[TopicPartition, Long], logStartOffsets: Map[TopicPartition, Long]): Unit = {
    debug("Loading log '" + logDir.getName + "'")
    val topicPartition = Log.parseTopicPartitionName(logDir)
    val config = topicConfigs.getOrElse(topicPartition.topic, currentDefaultConfig)
    val logRecoveryPoint = recoveryPoints.getOrElse(topicPartition, 0L)
    val logStartOffset = logStartOffsets.getOrElse(topicPartition, 0L)

    val log = Log(
      dir = logDir,
      config = config,
      logStartOffset = logStartOffset,
      recoveryPoint = logRecoveryPoint,
      maxProducerIdExpirationMs = maxPidExpirationMs,
      producerIdExpirationCheckIntervalMs = LogManager.ProducerIdExpirationCheckIntervalMs,
      scheduler = scheduler,
      time = time,
      brokerTopicStats = brokerTopicStats,
      logDirFailureChannel = logDirFailureChannel)

    if (logDir.getName.endsWith(Log.DeleteDirSuffix)) {
      addLogToBeDeleted(log)
    } else {
      val previous = {
        if (log.isFuture)
          this.futureLogs.put(topicPartition, log)
        else
          this.currentLogs.put(topicPartition, log)
      }
      if (previous != null) {
        if (log.isFuture)
          throw new IllegalStateException("Duplicate log directories found: %s, %s!".format(log.dir.getAbsolutePath, previous.dir.getAbsolutePath))
        else
          throw new IllegalStateException(s"Duplicate log directories for $topicPartition are found in both ${log.dir.getAbsolutePath} " +
            s"and ${previous.dir.getAbsolutePath}. It is likely because log directory failure happened while broker was " +
            s"replacing current replica with future replica. Recover broker from this failure by manually deleting one of the two directories " +
            s"for this partition. It is recommended to delete the partition in the log directory that is known to have failed recently.")
      }
    }
  }

The point of loadLog is to create a Log object that represents the set of files belonging to one TopicPartition. It therefore:

  • First parses the topic and partition from the name of the subdirectory (a rough parsing sketch follows this list)
  • Looks up the information for that TopicPartition: its config, its recovery checkpoint and its log start offset from the checkpoint files
  • Constructs the Log object
  • If the subdirectory name ends with -delete, adds the Log to the list of logs to be deleted
  • If the subdirectory name ends with -future, adds the Log to the future-log map
  • Otherwise adds the Log to the current-log map. The future map exists because, when a replica is moved to another log directory on the same broker, a directory ending in -future is created first; once it has caught up with the current log, the future log replaces the original log files.
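
To make the first step concrete, here is a rough sketch of how a log directory name such as my-topic-0 can be split into topic and partition at the last dash (topic names may themselves contain dashes). It only approximates Log.parseTopicPartitionName, which additionally deals with the -delete and -future suffixes; the function name here is invented.

// Rough sketch (invented name): split a log directory name at the last dash.
// Kafka's Log.parseTopicPartitionName also strips the -delete / -future suffixes first.
def parseTopicPartition(dirName: String): (String, Int) = {
  val lastDash = dirName.lastIndexOf('-')
  require(lastDash > 0 && lastDash < dirName.length - 1, s"not a valid log directory name: $dirName")
  (dirName.substring(0, lastDash), dirName.substring(lastDash + 1).toInt)
}

// parseTopicPartition("my-topic-0")            => ("my-topic", 0)
// parseTopicPartition("__consumer_offsets-21") => ("__consumer_offsets", 21)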

Log

A Log is a sequence of LogSegments, each of which has a base offset. Depending on the configuration, a new segment is rolled when the current segment reaches its size or age limit. A Log has the following properties.

Properties

  • dir: File; the directory in which new LogSegment files are created
  • config: LogConfig
  • logStartOffset: Long; the earliest offset exposed to Kafka clients. It is updated when a client deletes records, when the broker deletes old segments through retention, and when the broker truncates the log. logStartOffset is used for
    • log deletion: a segment whose nextOffset is at or below logStartOffset can be deleted (see the sketch after this list)
    • responses returned to clients
  • recoveryPoint: Long; the offset from which recovery has to start (everything before it has already been flushed to disk, everything after it may not have been)
  • scheduler: Scheduler
  • brokerTopicStats: BrokerTopicStats
  • time: Time
  • maxProducerIdExpirationMs: Int
  • producerIdExpirationCheckIntervalMs: Int
  • topicPartition: TopicPartition
  • producerStateManager: ProducerStateManager
  • logDirFailureChannel: LogDirFailureChannel
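
Below is a minimal sketch of the deletion rule from the logStartOffset bullet: a segment whose next offset is at or below logStartOffset no longer holds any offset visible to clients and can therefore be removed. SegmentInfo and deletableSegments are invented names for the sketch, not Kafka's API.

// Sketch (invented names): a segment covers [baseOffset, nextOffset); once nextOffset
// is at or below logStartOffset, none of its offsets are visible to clients anymore.
case class SegmentInfo(baseOffset: Long, nextOffset: Long)

def deletableSegments(segments: Seq[SegmentInfo], logStartOffset: Long): Seq[SegmentInfo] =
  segments.takeWhile(_.nextOffset <= logStartOffset)

// With logStartOffset = 150 and segments [0,100), [100,200), [200,300),
// only [0,100) is deletable: [100,200) still contains offsets >= 150.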

When a Log object is created, it loads the segments that belong to it. This is done by loadSegments.

Methods

loadSegments()

private def loadSegments(): Long = {
    // first do a pass through the files in the log directory and remove any temporary files
    // and find any interrupted swap operations
    val swapFiles = removeTempFilesAndCollectSwapFiles()

    // Now do a second pass and load all the log and index files.
    // We might encounter legacy log segments with offset overflow (KAFKA-6264). We need to split such segments. When
    // this happens, restart loading segment files from scratch.
    retryOnOffsetOverflow {
      // In case we encounter a segment with offset overflow, the retry logic will split it after which we need to retry
      // loading of segments. In that case, we also need to close all segments that could have been left open in previous
      // call to loadSegmentFiles().
      logSegments.foreach(_.close())
      segments.clear()
      loadSegmentFiles()
    }

    // Finally, complete any interrupted swap operations. To be crash-safe,
    // log files that are replaced by the swap segment should be renamed to .deleted
    // before the swap file is restored as the new segment file.
    // after collecting the swap files, complete the interrupted swap operations
    completeSwapOperations(swapFiles)

    if (logSegments.isEmpty) {
      // no existing segments, create a new mutable segment beginning at offset 0
      addSegment(LogSegment.open(dir = dir,
        baseOffset = 0,
        config,
        time = time,
        fileAlreadyExists = false,
        initFileSize = this.initFileSize,
        preallocate = config.preallocate))
      0
    } else if (!dir.getAbsolutePath.endsWith(Log.DeleteDirSuffix)) {
      val nextOffset = retryOnOffsetOverflow {
        recoverLog()
      }

      // reset the index size of the currently active log segment to allow more entries
      activeSegment.resizeIndexes(config.maxIndexSize)
      nextOffset
    } else 0
  }

loadSegments loads all segments of this Log from its directory and returns the next offset to be written. The steps are:

  • removeTempFilesAndCollectSwapFiles(): a first pass over all files that cleans up temporary files and collects any swap operations that were interrupted (for example by a crash while the log cleaner was swapping in a compacted segment); analyzed in detail below
  • A second pass loads all log and index files. If a segment with offset overflow is encountered, that segment is split (by splitOverflowedSegment) and loading restarts from scratch, after closing the segments that were already loaded (a sketch of the overflow condition follows this list)
  • The interrupted swap operations collected earlier are completed: the swap file replaces the original files, which are first renamed to .deleted so the operation is crash-safe
  • If no segments exist at all, a new mutable LogSegment is created with baseOffset 0
  • Otherwise, if the directory does not end with -delete, the log is recovered via recoverLog(), the active segment's indexes are resized, and the resulting next offset is returned

The following sections look at the other methods used by loadSegments.
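
For context on the offset-overflow case: the offset index stores each offset relative to the segment's base offset as a 4-byte integer, so a segment is only valid if every offset it contains stays within Int range of its base offset (KAFKA-6264). A minimal sketch of that condition, with an invented helper name:

// Sketch (invented helper): offsets inside a segment are stored in the index
// relative to baseOffset as 4-byte ints, so they must fit into Int range.
def fitsInSegment(offset: Long, baseOffset: Long): Boolean = {
  val relative = offset - baseOffset
  relative >= 0 && relative <= Int.MaxValue
}

// fitsInSegment(368769L, 0L)                   => true
// fitsInSegment(Int.MaxValue.toLong + 1L, 0L)  => false -> the segment has to be split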

removeTempFilesAndCollectSwapFiles

  • Iterate over all files in the log directory
  • If a file name ends with .deleted, delete the file
  • If a file name ends with .cleaned, delete it as well; .cleaned files are intermediate files produced during log compaction
  • If a file name ends with .swap, strip the .swap suffix to find the original file name. If the original is an index file, delete the swap file (it is not needed, since indexes can be rebuilt); otherwise keep it so the swap can be completed later (a classification sketch follows this list)
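
A sketch of the suffix-based classification described above, assuming constants that mirror Kafka's .deleted, .cleaned and .swap conventions; the classify function itself is invented for illustration.

import java.io.File

// Sketch of the first pass over a log directory (invented function; suffix constants
// mirror Kafka's conventions). Returns the swap files that still need to be completed.
val DeletedFileSuffix = ".deleted"
val CleanedFileSuffix = ".cleaned"
val SwapFileSuffix    = ".swap"

def removeTempFilesSketch(dir: File): Seq[File] = {
  val swapFiles = Seq.newBuilder[File]
  for (file <- Option(dir.listFiles).getOrElse(Array.empty[File]) if file.isFile) {
    if (file.getName.endsWith(DeletedFileSuffix) || file.getName.endsWith(CleanedFileSuffix)) {
      file.delete()                                   // leftovers from deletion / compaction
    } else if (file.getName.endsWith(SwapFileSuffix)) {
      val original = new File(file.getPath.stripSuffix(SwapFileSuffix))
      if (original.getName.endsWith(".index"))
        file.delete()                                 // index swaps can simply be rebuilt
      else
        swapFiles += file                             // log swaps are completed later
    }
  }
  swapFiles.result()
}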

completeSwapOperations

LogSegment

A LogSegment represents one physical log file and its index file on disk. The log file contains the message data; the index file maps offsets to physical positions within the log file. Each LogSegment has a base offset, which is greater than the offsets in the previous segment and no greater than the offset of any message the segment contains. The two files of a segment are named [base_offset].log and [base_offset].index. A LogSegment has the following properties.
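
For example, a segment with base offset 368769 is stored as 00000000000000368769.log with a matching .index file: the base offset is zero-padded to 20 digits so that sorting file names lexicographically also sorts the segments by offset. A small sketch of that naming scheme (illustrative helpers, not Kafka's own):

// Sketch: segment file names are the base offset zero-padded to 20 digits, so sorting
// the file names lexicographically is the same as sorting the segments by offset.
def filenamePrefixFromOffset(offset: Long): String = f"$offset%020d"

def logFileName(baseOffset: Long): String   = filenamePrefixFromOffset(baseOffset) + ".log"
def indexFileName(baseOffset: Long): String = filenamePrefixFromOffset(baseOffset) + ".index"

// logFileName(0L)      => "00000000000000000000.log"
// logFileName(368769L) => "00000000000000368769.log"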

Properties

  • log: FileRecords; holds all the log records of this segment
  • offsetIndex: OffsetIndex; maps offsets to physical positions in the log file (see the lookup sketch after this list)
  • timeIndex: TimeIndex
  • txnIndex: TransactionIndex
  • baseOffset: Long
  • indexIntervalBytes: Int
  • rollJitterMs: Long
  • time: Time
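
To show how the offset index works together with indexIntervalBytes: the index is sparse (roughly one entry per indexIntervalBytes of appended data), so a read looks up the largest indexed offset that is not larger than the target and then scans the log forward from that file position. The sketch below uses an invented in-memory structure, not Kafka's OffsetIndex.

// Sketch (invented structure): a sparse index of (offset -> file position) entries.
// lookup returns the entry for the largest indexed offset <= targetOffset; the caller
// then scans the .log file forward from that position to find the exact record.
case class IndexEntry(offset: Long, position: Int)

def lookup(entries: IndexedSeq[IndexEntry], targetOffset: Long): Option[IndexEntry] =
  entries.takeWhile(_.offset <= targetOffset).lastOption  // real code uses binary search

// With one entry roughly every indexIntervalBytes of data:
// lookup(Vector(IndexEntry(0, 0), IndexEntry(57, 4096), IndexEntry(113, 8192)), 100)
//   => Some(IndexEntry(57, 4096)): start scanning the log file at byte position 4096.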