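Architecture

Advantages

RocketMQ keeps its performance stable even with a very large number of topics, whereas Kafka's performance drops to roughly 1/10 of its original level once there are 100+ topics.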

Workflow

Startup order

The nameserver starts and listens for connections.
The broker starts and connects to every nameserver.
The producer starts, fetches from the nameserver the IP addresses of the brokers serving a specific topic, and connects to those brokers.
The consumer starts, fetches from the nameserver the IP addresses of the brokers serving a specific topic, and connects to those brokers.
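
The startup order above can be sketched as a toy in-memory registry. This is only an illustration of who talks to whom: the `NameServer` class, its `register` and `lookup` methods, the broker address, and the topic name are all hypothetical and are not RocketMQ's API.

```python
from collections import defaultdict


class NameServer:
    """Toy in-memory registry: topic -> set of broker addresses."""

    def __init__(self):
        self.routes = defaultdict(set)

    def register(self, broker_addr, topics):
        # Called by a broker when it starts.
        for topic in topics:
            self.routes[topic].add(broker_addr)

    def lookup(self, topic):
        # Called by producers and consumers to find the brokers for a topic.
        return sorted(self.routes.get(topic, ()))


# 1. Nameservers start first and wait for connections.
nameservers = [NameServer(), NameServer()]

# 2. The broker starts and registers with every nameserver.
for ns in nameservers:
    ns.register("10.0.0.1:10911", ["orders"])

# 3./4. A producer or consumer asks any nameserver for the topic's brokers,
# then connects to those brokers directly.
brokers_for_orders = nameservers[1].lookup("orders")
print(brokers_for_orders)
```

Because the broker registers with every nameserver, a client can query whichever nameserver it reaches and still get the same route.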

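Storage flow

Reading a message is O(1) for a consumer. The consumer reads a queue by offset, and because every index entry in the queue has a fixed length of 20B, the entry can be located directly in O(1). The commitlog offset stored in that entry is then used to read the message from the commitlog, which is also O(1).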
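
A minimal sketch of that two-step read, using in-memory buffers in place of real files. The 20-byte layout (8B offset, 4B size, 8B hash) follows these notes; the big-endian packing, the buffer names, and `read_message` are assumptions made for the example.

```python
import io
import struct

# commitlog offset (8B), size (4B), tag hash (8B) -> 20 bytes per entry
ENTRY = struct.Struct(">qiq")

# Build a tiny commitlog and the matching queue index in memory.
commitlog = io.BytesIO()
queue_index = io.BytesIO()
for body in [b"first", b"second message", b"third"]:
    position = commitlog.tell()
    commitlog.write(body)
    queue_index.write(ENTRY.pack(position, len(body), 0))


def read_message(queue_offset):
    # Step 1: fixed-size entries let us jump straight to the index entry.
    queue_index.seek(queue_offset * ENTRY.size)
    commitlog_offset, size, _tag_hash = ENTRY.unpack(queue_index.read(ENTRY.size))
    # Step 2: jump straight to the message bytes in the commitlog.
    commitlog.seek(commitlog_offset)
    return commitlog.read(size)


print(read_message(1))
```

Neither step scans anything: both are a multiplication or a stored offset followed by a seek, which is where the O(1) comes from.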

broker

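Each broker hosts different topics, and each topic is divided into several message queues (the consumelog structure described below). The message queues of the same topic on different brokers are all distinct from one another, so when a producer sends a message, it lands in exactly one message queue out of all the message queues spread across the brokers.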
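
One way to picture "one message goes to exactly one queue" is a round-robin selector over every queue of the topic. Round-robin is an assumption here (the notes only say a single queue is chosen), and the broker names and queue counts are made up.

```python
from itertools import count

# Hypothetical layout: topic "orders" has 2 queues on each of 2 brokers.
queues = [
    (broker, queue_id)
    for broker in ("broker-a", "broker-b")
    for queue_id in range(2)
]

_counter = count()


def select_queue():
    # Round-robin over every queue of the topic, across all brokers.
    return queues[next(_counter) % len(queues)]


picked = [select_queue() for _ in range(5)]
print(picked)
```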

Likewise, a message queue is consumed by only one consumer within the same group.
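
A sketch of how queues might be divided so that each queue has a single owner within a group. The contiguous, near-equal split below is an assumed strategy chosen for illustration, and `allocate` is a hypothetical helper.

```python
def allocate(queues, consumers):
    """Split queues into contiguous, near-equal chunks, one chunk per consumer."""
    queues, consumers = sorted(queues), sorted(consumers)
    base, extra = divmod(len(queues), len(consumers))
    assignment, start = {}, 0
    for i, consumer in enumerate(consumers):
        # The first `extra` consumers take one additional queue.
        size = base + (1 if i < extra else 0)
        assignment[consumer] = queues[start:start + size]
        start += size
    return assignment


assignment = allocate(queues=[0, 1, 2, 3, 4], consumers=["c1", "c2"])
print(assignment)
```

With more consumers than queues, the surplus consumers get an empty list and sit idle, which follows directly from the one-owner-per-queue rule.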

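Unlike Kafka, where replication is organized per partition, RocketMQ replicates whole brokers: brokers are divided into master brokers and slave brokers, only the master accepts writes, and slaves are read-only.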

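Broadcast mode and cluster mode:

Cluster mode is the consumer-group form, where each message is consumed by one consumer in the group. In broadcast mode, every message in every queue is consumed by every consumer in the group.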
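
The difference between the two modes can be shown with a small simulation. `deliver` is a hypothetical helper, and spreading messages per message (rather than per queue) in cluster mode is a simplification.

```python
def deliver(messages, consumers, mode):
    """Return {consumer: [messages]} for 'cluster' or 'broadcast' mode."""
    received = {c: [] for c in consumers}
    for i, message in enumerate(messages):
        if mode == "broadcast":
            # Every consumer in the group sees every message.
            for c in consumers:
                received[c].append(message)
        else:
            # Cluster mode: each message goes to exactly one consumer.
            # (Simplified: real assignment is per queue, not per message.)
            received[consumers[i % len(consumers)]].append(message)
    return received


messages = ["m0", "m1", "m2"]
print(deliver(messages, ["c1", "c2"], "cluster"))
print(deliver(messages, ["c1", "c2"], "broadcast"))
```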

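Storage

There are three main files: commitLog, consumeLog (called ConsumeQueue in RocketMQ itself), and indexLog (IndexFile in RocketMQ itself). Every broker has all three.

In Kafka, a topic can be split into multiple partitions for horizontal scaling; similarly, in RocketMQ a topic is split into multiple message queues.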

CommitLog

The commitlog consists of many files, each 1 GB in size. Messages are stored sequentially in the commitlog regardless of topic. Each file is named after the offset of its first message, for example 01010010011000.

- commitlog
-- 0000000000000
-- 0010101010001
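
Since each file is named after its starting offset and has a fixed size, finding the file for a given offset is plain arithmetic. A sketch, assuming file names are zero-padded to 20 digits (the sample names in the listing above are shorter and only illustrative); `commitlog_file_for` is a hypothetical helper.

```python
FILE_SIZE = 1024 ** 3  # 1 GB per commitlog file, as stated above


def commitlog_file_for(offset, name_width=20):
    # A file is named after the offset of its first byte, zero-padded.
    # name_width=20 is an assumption about the padding length.
    start = offset // FILE_SIZE * FILE_SIZE
    # Return the file name and the position of the offset inside that file.
    return str(start).zfill(name_width), offset - start


print(commitlog_file_for(1_500_000_000))
```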

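A broker has only one commitLog, and messages are appended to it sequentially, so performance does not degrade as the number of topics grows (an advantage over Kafka).

For example, if brokerA holds one message queue each for Topic1, Topic2, and Topic3, those three queues share a single commit log.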
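
The write side can be sketched as a single append-only buffer plus one small index list per (topic, queue). The names `commitlog`, `consume_queues`, and `append` are assumptions for the example, and the in-memory structures stand in for files.

```python
from collections import defaultdict

commitlog = bytearray()             # one log shared by all topics on the broker
consume_queues = defaultdict(list)  # (topic, queue_id) -> [(offset, size), ...]


def append(topic, queue_id, body):
    offset = len(commitlog)
    commitlog.extend(body)          # sequential write, regardless of topic
    consume_queues[(topic, queue_id)].append((offset, len(body)))
    return offset


append("Topic1", 0, b"a1")
append("Topic2", 0, b"bb2")
append("Topic1", 0, b"a3")

print(bytes(commitlog))
print(dict(consume_queues))
```

Messages from different topics end up interleaved in the one log, and only the per-queue index knows which bytes belong to which topic.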

ConsumeLog

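The physical log behind a message queue is the consume log. For each message in that queue it stores the offset in the commit log (8B), the size (4B), and the hash of the message tag (8B), for 20B per entry. In essence, the consumelog is an index file over the commitlog.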
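
Because the tag hash sits in the index entry, messages can be filtered by tag without touching the commit log. A sketch of the 20-byte entry and such a filter; the big-endian layout is an assumption, and `zlib.crc32` is only a stand-in, not the hash RocketMQ actually uses.

```python
import struct
import zlib

ENTRY = struct.Struct(">qiq")  # offset (8B) + size (4B) + tag hash (8B) = 20B


def tag_hash(tag):
    # Stand-in hash for illustration; not RocketMQ's actual hash function.
    return zlib.crc32(tag.encode())


entries = b"".join([
    ENTRY.pack(0, 120, tag_hash("paid")),
    ENTRY.pack(120, 80, tag_hash("refund")),
    ENTRY.pack(200, 95, tag_hash("paid")),
])


def offsets_with_tag(raw, tag):
    # Filter on the index alone; the commit log is not read.
    wanted = tag_hash(tag)
    return [offset for offset, _size, h in ENTRY.iter_unpack(raw) if h == wanted]


print(offsets_with_tag(entries, "paid"))
```

Two different tags can share a hash, so a filter like this can only rule messages out; a match would still need to be confirmed against the real tag after reading the message.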

- consumequeue
-- topic1
--- 0
---- 0000000000000
---- 0010101000101
--- 1
--- 2
-- topic2

As shown, each topic folder is further divided into folders 0, 1, 2, and so on. These exist for load balancing and correspond to Kafka's partitions.

IndexLog

Its structure is a hashmap, which makes it possible to look up messages in the commitlog quickly.
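
Conceptually this is a map from a message key to commit log offsets. The sketch below uses a Python dict to show the idea only; the on-disk format is not modeled, and `index_message` and `find_offsets` are hypothetical names.

```python
from collections import defaultdict

# message key -> commit log offsets of the messages carrying that key
index = defaultdict(list)


def index_message(key, commitlog_offset):
    index[key].append(commitlog_offset)


def find_offsets(key):
    # .get avoids creating an empty entry for unknown keys.
    return index.get(key, [])


index_message("order-1001", 0)
index_message("order-1002", 120)
index_message("order-1001", 200)

print(find_offsets("order-1001"))
```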

Consume

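RocketMQ supports two consumption modes, PUSH and PULL:

PULL requires the consumer's business code to poll on its own; the benefit is that it can fetch messages according to its own consumption rate.
PUSH offers better real-time delivery, but the broker does not know the consumer's consumption rate, which can cause messages to pile up.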
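
The PULL side of that trade-off can be sketched as a loop in which the consumer chooses its own batch size. The in-memory `broker_queue` and the `pull` function are stand-ins for a real broker and client, not RocketMQ's API.

```python
from collections import deque

broker_queue = deque(f"m{i}" for i in range(7))


def pull(max_messages):
    # The consumer decides how many messages it can handle right now.
    batch = []
    while broker_queue and len(batch) < max_messages:
        batch.append(broker_queue.popleft())
    return batch


batches = []
while True:
    batch = pull(max_messages=3)
    if not batch:
        break  # a real consumer would sleep and poll again
    batches.append(batch)

print(batches)
```

A slow consumer would simply lower `max_messages` or poll less often, which is the rate control that PUSH lacks.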

REF

Meeting problem heapdump.cn/article/380…
Horizontal scaling jaskey.github.io/blog/2016/1…

Rocketmq - Must Know
