LogManager
Kafka's LogManager is the broker's log management subsystem. It is responsible for creating, retrieving, and cleaning up logs, and every log read and write request goes through it. Its attributes are listed below.
Attributes
- logDirs: Seq[File]
- initialOfflineDirs: Seq[File]
- topicConfigs: Map[String,LogConfig]; per-topic log configuration
- initialDefaultConfig: LogConfig; the default log configuration
- cleanerConfig: CleanerConfig; configuration for log cleaning
- recoveryThreadsPerDataDir: Int; number of threads used to recover and load each log directory
- flushCheckMs: Long
- flushRecoveryOffsetCheckpointMs: Long
- flushStartOffsetCheckpointMs: Long
- retentionCheckMs: Long
- maxPidExpirationMs: Int
- scheduler: Scheduler; Kafka's own utility class for scheduling background tasks
- brokerState: BrokerState; the state machine of this broker
- brokerTopicStats: BrokerTopicStats
- logDirFailureChannel: LogDirFailureChannel
- time: Time; Kafka's own time utility class
Methods
Constructor
LogManager's constructor is fairly simple; its main steps are:
- Validate that the directories provided in logDirs contain no duplicates, and create any directory that does not exist
- Lock each directory in logDirs //todo(how the lock is implemented)
- Create a recovery-point-offset-checkpoint file under each directory in logDirs, used to persist a topic/partition => offset mapping (the file format is sketched after this list). //todo(purpose)
- Create a log-start-offset-checkpoint file under each directory in logDirs, also used to persist a topic/partition => offset mapping. //todo(purpose)
- loadLogs(): load the logs found in the directories, described in detail below
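Both checkpoint files use a plain-text layout: a version line, an entry-count line, and then one "topic partition offset" line per entry. As an illustration only (this is not Kafka's OffsetCheckpointFile class, and the TopicPartition case class below is a stand-in), reading such a file could look like this:
import scala.io.Source

// Hypothetical stand-in for Kafka's TopicPartition class
case class TopicPartition(topic: String, partition: Int)

object CheckpointSketch {
  // Assumed version-0 layout: first line is the version, second line the entry
  // count, then one "topic partition offset" line per entry.
  def read(path: String): Map[TopicPartition, Long] = {
    val source = Source.fromFile(path)
    try {
      source.getLines().toList match {
        case version :: count :: entries =>
          require(version.trim == "0", s"unexpected checkpoint version: $version")
          entries.take(count.trim.toInt).map { line =>
            val Array(topic, partition, offset) = line.trim.split("\\s+")
            TopicPartition(topic, partition.toInt) -> offset.toLong
          }.toMap
        case _ => Map.empty
      }
    } finally source.close()
  }
}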
The loadLogs() method
private def loadLogs(): Unit = {
info("Loading logs.")
val startMs = time.milliseconds
val threadPools = ArrayBuffer.empty[ExecutorService]
val offlineDirs = mutable.Set.empty[(String, IOException)]
val jobs = mutable.Map.empty[File, Seq[Future[_]]]
for (dir <- liveLogDirs) {
try {
// each log directory gets its own thread pool to perform the loading work
val pool = Executors.newFixedThreadPool(numRecoveryThreadsPerDataDir)
threadPools.append(pool)
val cleanShutdownFile = new File(dir, Log.CleanShutdownFile)
if (cleanShutdownFile.exists) {
// clean shutdown, no recovery is needed for the logs in this directory
debug(s"Found clean shutdown file. Skipping recovery for all logs in data directory: ${dir.getAbsolutePath}")
} else {
// log recovery itself is being performed by `Log` class during initialization
brokerState.newState(RecoveringFromUncleanShutdown)
}
var recoveryPoints = Map[TopicPartition, Long]()
try {
recoveryPoints = this.recoveryPointCheckpoints(dir).read
} catch {
case e: Exception =>
warn("Error occurred while reading recovery-point-offset-checkpoint file of directory " + dir, e)
warn("Resetting the recovery checkpoint to 0")
}
var logStartOffsets = Map[TopicPartition, Long]()
try {
// read the start offset of each log
logStartOffsets = this.logStartOffsetCheckpoints(dir).read
} catch {
case e: Exception =>
warn("Error occurred while reading log-start-offset-checkpoint file of directory " + dir, e)
}
val jobsForDir = for {
dirContent <- Option(dir.listFiles).toList
logDir <- dirContent if logDir.isDirectory
} yield {
CoreUtils.runnable {
try {
// runnable task that loads a single log
loadLog(logDir, recoveryPoints, logStartOffsets)
} catch {
case e: IOException =>
offlineDirs.add((dir.getAbsolutePath, e))
error("Error while loading log dir " + dir.getAbsolutePath, e)
}
}
}
jobs(cleanShutdownFile) = jobsForDir.map(pool.submit)
} catch {
case e: IOException =>
offlineDirs.add((dir.getAbsolutePath, e))
error("Error while loading log dir " + dir.getAbsolutePath, e)
}
}
try {
for ((cleanShutdownFile, dirJobs) <- jobs) {
dirJobs.foreach(_.get)
try {
cleanShutdownFile.delete()
} catch {
case e: IOException =>
offlineDirs.add((cleanShutdownFile.getParent, e))
error(s"Error while deleting the clean shutdown file $cleanShutdownFile", e)
}
}
offlineDirs.foreach { case (dir, e) =>
logDirFailureChannel.maybeAddOfflineLogDir(dir, s"Error while deleting the clean shutdown file in dir $dir", e)
}
} catch {
case e: ExecutionException =>
error("There was an error in one of the threads during logs loading: " + e.getCause)
throw e.getCause
} finally {
threadPools.foreach(_.shutdown())
}
info(s"Logs loading complete in ${time.milliseconds - startMs} ms.")
}
- For each live log directory:
  - Create a thread pool: val pool = Executors.newFixedThreadPool(numRecoveryThreadsPerDataDir) (the fan-out/fan-in pattern is sketched after this list)
  - If a .kafka_cleanshutdown file exists, skip recovery for the logs in this directory; otherwise set the broker state to RecoveringFromUncleanShutdown
  - Read the directory's recovery-point-offset-checkpoint file; if reading fails, reset the recovery checkpoint to 0
  - Read the directory's log-start-offset-checkpoint file
  - List the subdirectories of dir
  - Submit one job per subdirectory that runs loadLog(logDir, recoveryPoints, logStartOffsets)
- After all jobs finish, delete the clean-shutdown files and report any directory that hit an IOException to logDirFailureChannel
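Stripped of the Kafka specifics, loadLogs is a plain fan-out/fan-in pattern: one fixed thread pool per data directory, one Runnable per log subdirectory, then block on the futures. The sketch below reproduces only that skeleton; doLoad stands in for loadLog and error handling is omitted:
import java.io.File
import java.util.concurrent.{ExecutorService, Executors, Future => JFuture}
import scala.collection.mutable

object ParallelLoadSketch {
  // Illustrative per-log work; in LogManager this is loadLog(...)
  def doLoad(logDir: File): Unit = println(s"loading ${logDir.getName}")

  def loadAll(dataDirs: Seq[File], threadsPerDir: Int): Unit = {
    val pools = mutable.ArrayBuffer.empty[ExecutorService]
    val jobs  = mutable.ArrayBuffer.empty[JFuture[_]]
    try {
      for (dir <- dataDirs) {
        // one fixed-size pool per data directory, as in loadLogs()
        val pool = Executors.newFixedThreadPool(threadsPerDir)
        pools += pool
        val subDirs = Option(dir.listFiles).getOrElse(Array.empty[File]).filter(_.isDirectory)
        // fan out: one job per log directory
        subDirs.foreach(d => jobs += pool.submit(new Runnable { def run(): Unit = doLoad(d) }))
      }
      // fan in: wait for every job; failures surface here as ExecutionException
      jobs.foreach(_.get())
    } finally pools.foreach(_.shutdown())
  }
}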
The loadLog method
private def loadLog(logDir: File, recoveryPoints: Map[TopicPartition, Long], logStartOffsets: Map[TopicPartition, Long]): Unit = {
debug("Loading log '" + logDir.getName + "'")
val topicPartition = Log.parseTopicPartitionName(logDir)
val config = topicConfigs.getOrElse(topicPartition.topic, currentDefaultConfig)
val logRecoveryPoint = recoveryPoints.getOrElse(topicPartition, 0L)
val logStartOffset = logStartOffsets.getOrElse(topicPartition, 0L)
val log = Log(
dir = logDir,
config = config,
logStartOffset = logStartOffset,
recoveryPoint = logRecoveryPoint,
maxProducerIdExpirationMs = maxPidExpirationMs,
producerIdExpirationCheckIntervalMs = LogManager.ProducerIdExpirationCheckIntervalMs,
scheduler = scheduler,
time = time,
brokerTopicStats = brokerTopicStats,
logDirFailureChannel = logDirFailureChannel)
if (logDir.getName.endsWith(Log.DeleteDirSuffix)) {
addLogToBeDeleted(log)
} else {
val previous = {
if (log.isFuture)
this.futureLogs.put(topicPartition, log)
else
this.currentLogs.put(topicPartition, log)
}
if (previous != null) {
if (log.isFuture)
throw new IllegalStateException("Duplicate log directories found: %s, %s!".format(log.dir.getAbsolutePath, previous.dir.getAbsolutePath))
else
throw new IllegalStateException(s"Duplicate log directories for $topicPartition are found in both ${log.dir.getAbsolutePath} " +
s"and ${previous.dir.getAbsolutePath}. It is likely because log directory failure happened while broker was " +
s"replacing current replica with future replica. Recover broker from this failure by manually deleting one of the two directories " +
s"for this partition. It is recommended to delete the partition in the log directory that is known to have failed recently.")
}
}
}
The point of loadLog is to create a Log object that represents the set of files belonging to one TopicPartition. It proceeds as follows:
- Parse the topic and partition from the subdirectory name (a simplified version of this parsing is sketched after this list)
- Look up the information for this topicPartition: its config, its recovery checkpoint, and its log start offset
- Construct the Log object
- If the subdirectory name ends with -delete, add the Log to the list of logs to be deleted
- If the subdirectory is a future replica directory (-future suffix), put its Log into the futureLogs map
- Otherwise put it into the currentLogs map. The futureLogs map exists because, when a replica is being moved to another log directory on the same broker, a directory with the -future suffix is created first; once the future log catches up with the current log, it replaces the original one.
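The directory name of a log is built from the topic and the partition, so Log.parseTopicPartitionName can recover both by splitting at the last '-'. A simplified, hand-rolled version of that parsing (not the actual Kafka implementation, which also recognizes the -delete and -future variants):
import java.io.File

object ParseTopicPartitionSketch {
  // Simplified parsing of a log directory named "<topic>-<partition>".
  def parse(dir: File): (String, Int) = {
    val name = dir.getName
    val idx  = name.lastIndexOf('-') // topic names may themselves contain '-'
    require(idx > 0 && idx < name.length - 1, s"not a log directory name: $name")
    (name.substring(0, idx), name.substring(idx + 1).toInt)
  }

  def main(args: Array[String]): Unit =
    println(parse(new File("/data/kafka-logs/my-topic-3"))) // (my-topic,3)
}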
Log
A Log is a sequence of LogSegments, each of which has a base offset. Depending on the configuration, a new segment is rolled once the current segment reaches its size or time limit. A Log has the following attributes:
Attributes
- dir: File; the directory in which new LogSegment files are created
- config: LogConfig
- logStartOffset: Long; the earliest offset exposed to Kafka clients. It is updated when the user deletes records, when the broker applies log retention (?), and when the broker rolls the log. logStartOffset is used to:
  - drive log deletion: a segment can be deleted once its nextOffset is no greater than logStartOffset (see the sketch after this attribute list)
  - be returned to clients in responses
- recoveryPoint: Long; the offset at which recovery has to start (everything before this offset has already been flushed to disk, everything after it has not)
- scheduler: Scheduler
- brokerTopicStats: BrokerTopicStats
- time: Time
- maxProducerIdExpirationMs: Int
- producerIdExpirationCheckIntervalMs: Int
- topicPartition: TopicPartition
- producerStateManager: ProducerStateManager
- logDirFailureChannel: LogDirFailureChannel
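To make the logStartOffset deletion rule above concrete, here is a toy illustration with made-up segment boundaries; the Segment case class is a stand-in, not Kafka's LogSegment:
object LogStartOffsetSketch {
  // Minimal stand-in for a segment: it covers the offset range [baseOffset, nextOffset).
  final case class Segment(baseOffset: Long, nextOffset: Long)

  // A segment becomes deletable once its nextOffset is no greater than the log
  // start offset, i.e. none of its offsets is still visible to clients.
  def deletable(segments: Seq[Segment], logStartOffset: Long): Seq[Segment] =
    segments.filter(_.nextOffset <= logStartOffset)

  def main(args: Array[String]): Unit = {
    val segs = Seq(Segment(0, 100), Segment(100, 200), Segment(200, 300))
    println(deletable(segs, 150)) // only Segment(0,100) lies entirely below the start offset
  }
}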
When a Log is created it loads the segments that belong to it; this is done by loadSegments.
Methods
loadSegments()
private def loadSegments(): Long = {
// first do a pass through the files in the log directory and remove any temporary files
// and find any interrupted swap operations
val swapFiles = removeTempFilesAndCollectSwapFiles()
// Now do a second pass and load all the log and index files.
// We might encounter legacy log segments with offset overflow (KAFKA-6264). We need to split such segments. When
// this happens, restart loading segment files from scratch.
retryOnOffsetOverflow {
// In case we encounter a segment with offset overflow, the retry logic will split it after which we need to retry
// loading of segments. In that case, we also need to close all segments that could have been left open in previous
// call to loadSegmentFiles().
logSegments.foreach(_.close())
segments.clear()
loadSegmentFiles()
}
// Finally, complete any interrupted swap operations. To be crash-safe,
// log files that are replaced by the swap segment should be renamed to .deleted
// before the swap file is restored as the new segment file.
// complete the swap operations using the swap files collected above
completeSwapOperations(swapFiles)
if (logSegments.isEmpty) {
// no existing segments, create a new mutable segment beginning at offset 0
addSegment(LogSegment.open(dir = dir,
baseOffset = 0,
config,
time = time,
fileAlreadyExists = false,
initFileSize = this.initFileSize,
preallocate = config.preallocate))
0
} else if (!dir.getAbsolutePath.endsWith(Log.DeleteDirSuffix)) {
val nextOffset = retryOnOffsetOverflow {
recoverLog()
}
// reset the index size of the currently active log segment to allow more entries
activeSegment.resizeIndexes(config.maxIndexSize)
nextOffset
} else 0
}
loadSegments loads all of this Log's segments from the directory and returns the next offset. The steps are:
- removeTempFilesAndCollectSwapFiles(): do a first pass over all files, delete temporary files and collect any interrupted swap operations (?); analyzed in detail below
- Do a second pass and load all the log and index files. If a segment with offset overflow is encountered, split that segment and restart the segment loading (after closing the segments that were already loaded). Splitting an overflowed segment is done by splitOverflowedSegment
- Complete the swap operations collected in the first pass: the swap file replaces the original segment, and the replaced log files are renamed with the .deleted suffix
- If no segments exist, create a new mutable LogSegment with baseOffset 0
- Otherwise, if the directory does not end with the -delete suffix, recover the log by running recoverLog() and resize the active segment's indexes
The other methods encountered in loadSegments are analyzed below; first, a sketch of how segment files are named by their base offset.
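Segment files are named after their base offset, zero-padded to 20 digits (for example 00000000000000368769.log), which is what allows loadSegmentFiles to pair .log and .index files and to process segments in offset order. A small sketch of that naming convention (the helper names below are illustrative, not Kafka's own):
import java.io.File

object SegmentNamingSketch {
  // Segment files are named after their base offset, zero-padded to 20 digits.
  def logFileName(baseOffset: Long): String = f"$baseOffset%020d.log"

  // Recover the base offset from a ".log" or ".index" file name.
  def baseOffsetOf(file: File): Long = file.getName.takeWhile(_ != '.').toLong

  def main(args: Array[String]): Unit = {
    println(logFileName(368769L)) // 00000000000000368769.log
    val files = Seq(new File("00000000000000368769.log"), new File("00000000000000000000.log"))
    println(files.sortBy(baseOffsetOf).map(_.getName)) // processed in base-offset order
  }
}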
removeTempFilesAndCollectSwapFiles
- Iterate over all files in the directory
- If a file ends with .deleted, delete it
- If a file ends with .cleaned, delete it as well; .cleaned files are intermediate files produced by log compaction
- If a file ends with .swap, strip the .swap suffix first. If the underlying file is an index file, delete it since it is not needed (indexes can be rebuilt); otherwise keep it as a pending swap (the decision logic is sketched below)
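Reduced to its decision logic (the real method also renames files, checks readability and records cleaner offsets), the first pass over the temporary files could be sketched as follows; sweep is an illustrative name, not the actual Kafka method:
import java.io.File

object TempFileSweepSketch {
  val DeletedSuffix = ".deleted" // segments already scheduled for deletion
  val CleanedSuffix = ".cleaned" // intermediate files left behind by log compaction
  val SwapSuffix    = ".swap"    // a half-finished replacement of a segment

  // Returns the swap files that still need to be completed and deletes the other
  // temporary files. Swapped index files are simply dropped because indexes can
  // be rebuilt from the log file.
  def sweep(dir: File): Seq[File] = {
    val files = Option(dir.listFiles).getOrElse(Array.empty[File]).toSeq.filter(_.isFile)
    files.flatMap { f =>
      val name = f.getName
      if (name.endsWith(DeletedSuffix) || name.endsWith(CleanedSuffix)) {
        f.delete(); None
      } else if (name.endsWith(SwapSuffix)) {
        val original = name.dropRight(SwapSuffix.length)
        if (original.endsWith(".index")) { f.delete(); None } // rebuildable, discard
        else Some(f)                                          // keep: swap to be completed
      } else None
    }
  }
}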
completeSwapOperations
LogSegment
A LogSegment corresponds to one physical log file and one index file on disk. The log file contains the message data; the index file maps offsets to physical positions within the log file (a toy version of that lookup is sketched after the attribute list). Every LogSegment has a base offset, which is greater than the offsets in the previous segment and no greater than the offset of any message it contains. The two files are named [base_offset].log and [base_offset].index. A LogSegment has the following attributes:
Attributes
- log: FileRecords; contains all the log records
- offsetIndex: OffsetIndex
- timeIndex: TimeIndex
- txnIndex: TransactionIndex
- baseOffset: Long
- indexIntervalBytes: Int
- rollJitterMs: Long
- time: Time
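The offsetIndex attribute above is what keeps reads cheap: it stores sparse (offset, position) entries, and a lookup returns the largest indexed offset not greater than the target, from which the log file is scanned forward. A toy in-memory version of that lookup (the real OffsetIndex is a memory-mapped file storing offsets relative to the base offset):
object OffsetIndexSketch {
  // Sparse index entries: message offset -> byte position in the .log file.
  // Entries are kept sorted by offset, as they are in the on-disk index.
  final case class IndexEntry(offset: Long, position: Int)

  // Find the last entry whose offset is <= targetOffset; reading the log file
  // then starts from that position and scans forward. This mirrors the contract
  // of OffsetIndex.lookup, not its memory-mapped binary search.
  def lookup(entries: IndexedSeq[IndexEntry], targetOffset: Long): Option[IndexEntry] =
    entries.takeWhile(_.offset <= targetOffset).lastOption

  def main(args: Array[String]): Unit = {
    val idx = IndexedSeq(IndexEntry(0, 0), IndexEntry(50, 4096), IndexEntry(120, 9215))
    println(lookup(idx, 100)) // Some(IndexEntry(50,4096)): start scanning at byte 4096
  }
}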