前段时间新部署了一套双主双从的 RocketMQ 集群,前几天收到同事反馈,最近生产某应用偶尔发消息失败,有以下错误日志:
这一看就是生产消息量大,导致 broker 处理不过来。但是同事说,这是新应用,没多少消息量。
登录服务器查看 stats.log,确实没什么消息量.
查看了 broker 的 gc 情况,没发生几次 gc,所以不是因为 gc 导致卡顿。
broker 其它日志也没打印异常信息,按道理这点消息量对 RocketMQ 来说应该是游刃有余,但是却翻车了。没什么头绪,只能按照网上找的方案把 waitTimeMillsInSendQueue 参数从 200 改到 1000,应该可以缓解问题。
修改参数后的第二天,又收到了反馈,只是这次异常有点不同:
改的参数应该是生效了,但是由于 broker 存在问题,所以出现了新的异常,必须把问题解决才行。
从源码看,是 page cache 导致的超时,也就是数据刷盘的时候超时。
RocketMQ 是通过 MappedByteBuffer 方式读写文件,写入的数据并不是直接写入硬盘,而是先写入内存映射,操作系统会在合适的时机自动把内容刷到硬盘上,broker 也会定时进行刷盘。
在 store.log 中找到刷盘的日志,刷盘时间超过 500 ms 就会打印该日志。
可以看到这台机器上刷盘耗时很长,这个时间超过 1000 ms 就可能引发 [PCBUSY_CLEAN_QUEUE]broker busy。
我们有两台主 broker 而另外一台情况就好很多
所以我怀疑是机器的磁盘有问题,导致刷盘经常超时。于是编写了一段代码来测试刷盘速度:
public class Benchmark {
/** 1G */
private static long FILE_SIZE = 1024 * 1024 * 1024;
public static void main(String[] args) throws Exception {
for (int i = 0; i < 5; i++) {
MappedByteBuffer mb = getMappedByteBuffer(FILE_SIZE);
byte[] arr = new byte[4096];
Arrays.fill(arr, (byte) 2);
int length = 0;
long s = System.currentTimeMillis();
while (length < mb.capacity()) {
length += arr.length;
mb.put(arr);
}
long e = System.currentTimeMillis();
System.out.print("MappedByteBuffer cost: " + (e - s) + "ms");
mb.force();
System.out.println(", force time: " + (System.currentTimeMillis() - e) + "ms");
}
}
private static MappedByteBuffer getMappedByteBuffer(long fileSize) throws IOException {
File file = getRandomFile();
return new RandomAccessFile(file, "rw").getChannel().
map(FileChannel.MapMode.READ_WRITE, 0, fileSize);
}
private static File getRandomFile() {
String fileName = UUID.randomUUID().toString();
File file = new File(fileName);
file.deleteOnExit();
return file;
}
}
在怀疑故障机器和其它机器都运行一遍,平均刷盘时间高了一倍。
怀疑故障机器:
MappedByteBuffer cost: 1622ms, force time: 243ms
MappedByteBuffer cost: 1441ms, force time: 292ms
MappedByteBuffer cost: 1376ms, force time: 560ms
MappedByteBuffer cost: 2232ms, force time: 333ms
MappedByteBuffer cost: 1747ms, force time: 370ms
MappedByteBuffer cost: 2073ms, force time: 412ms
MappedByteBuffer cost: 1912ms, force time: 524ms
MappedByteBuffer cost: 1968ms, force time: 503ms
MappedByteBuffer cost: 1838ms, force time: 434ms
MappedByteBuffer cost: 2135ms, force time: 231ms
MappedByteBuffer cost: 1899ms, force time: 209ms
MappedByteBuffer cost: 2103ms, force time: 567ms
MappedByteBuffer cost: 1716ms, force time: 411ms
MappedByteBuffer cost: 1953ms, force time: 806ms
MappedByteBuffer cost: 2156ms, force time: 299ms
MappedByteBuffer cost: 1925ms, force time: 526ms
MappedByteBuffer cost: 1809ms, force time: 414ms
MappedByteBuffer cost: 2655ms, force time: 61ms
MappedByteBuffer cost: 1888ms, force time: 1477ms
MappedByteBuffer cost: 1791ms, force time: 489ms
其它机器:
MappedByteBuffer cost: 2101ms, force time: 487ms
MappedByteBuffer cost: 2317ms, force time: 171ms
MappedByteBuffer cost: 2037ms, force time: 578ms
MappedByteBuffer cost: 2502ms, force time: 137ms
MappedByteBuffer cost: 2144ms, force time: 210ms
MappedByteBuffer cost: 2253ms, force time: 51ms
MappedByteBuffer cost: 2404ms, force time: 295ms
MappedByteBuffer cost: 2186ms, force time: 335ms
MappedByteBuffer cost: 1932ms, force time: 632ms
MappedByteBuffer cost: 2492ms, force time: 79ms
MappedByteBuffer cost: 2004ms, force time: 566ms
MappedByteBuffer cost: 2252ms, force time: 414ms
MappedByteBuffer cost: 2163ms, force time: 399ms
MappedByteBuffer cost: 2214ms, force time: 131ms
MappedByteBuffer cost: 2343ms, force time: 44ms
MappedByteBuffer cost: 2270ms, force time: 149ms
MappedByteBuffer cost: 2316ms, force time: 170ms
MappedByteBuffer cost: 2312ms, force time: 191ms
MappedByteBuffer cost: 2225ms, force time: 89ms
MappedByteBuffer cost: 2359ms, force time: 106ms