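Sentinel mode is one way to improve the availability of Redis. This is the second article in my Sentinel series; it covers how Sentinels discover nodes and how they decide that a node is subjectively down. I originally wanted to walk straight through the source code, but the more I wrote, the more the article sank into scattered details, so I gave up on that approach. As a result, this article does not stick to the order in which the source implements things.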
Node discovery
In the configuration file we only configure the master. So how does a Sentinel learn about the replicas and about the other Sentinels? Let's analyze it.
Data nodes
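Under normal conditions a Sentinel sends an INFO command to the master every 10 seconds, and the master replies with information that includes its replicas. Here is the code: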
void sentinelSendPeriodicCommands(sentinelRedisInstance *ri) {
...
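// For a replica whose master is objectively down or being failed over, or whose
// link to its master is down, send INFO every second; otherwise every 10 seconds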
if ((ri->flags & SRI_SLAVE) &&
((ri->master->flags & (SRI_O_DOWN|SRI_FAILOVER_IN_PROGRESS)) ||
(ri->master_link_down_time != 0)))
{
info_period = 1000;
} else {
info_period = SENTINEL_INFO_PERIOD;
}
...
/* Send INFO to masters and slaves, not sentinels. */
// Send the INFO command to the master or replica
if ((ri->flags & SRI_SENTINEL) == 0 &&
(ri->info_refresh == 0 ||
(now - ri->info_refresh) > info_period))
{
retval = redisAsyncCommand(ri->link->cc,
sentinelInfoReplyCallback, ri, "%s",
sentinelInstanceMapCommand(ri,"INFO"));
if (retval == C_OK) ri->link->pending_commands++;
}
...
}
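sentinelSendPeriodicCommands is ultimately driven by sentinelTimer, which is called from serverCron. serverCron is a periodic function, and the interval between INFO commands is determined by info_period.
The callback that the Sentinel registers when it sends INFO is sentinelInfoReplyCallback. On the other side, when a master or replica receives INFO, it runs infoCommand.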
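- When a master receives INFO, it mainly returns information about all of its replicas. The Sentinel reads the replica list and starts monitoring any replica it has not recorded yet.
- When a replica receives INFO, it mainly returns information about its master and about itself. From the reply the Sentinel reads the IP and port of the replica's current master and updates its record if they differ from what it had; it also reads and updates the replica's priority, replication offset, and master link status.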
{"info",infoCommand,-1,"ltR",0,NULL,0,0,0,0,0},
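To make the master-side case concrete, here is a minimal sketch of pulling a replica's address out of one line of the INFO replication section. It assumes the usual `slaveN:ip=...,port=...,state=...` line layout; the `replica_addr` struct and `parse_replica_line` function are hypothetical names for illustration and are not how sentinel.c parses the reply.

```c
#include <stdio.h>

/* Hypothetical holder for the address found on one "slaveN:" line. */
typedef struct {
    char ip[64];
    int port;
} replica_addr;

/* Parse a line such as
 *   slave0:ip=127.0.0.1,port=6380,state=online,offset=1234,lag=0
 * Returns 1 and fills *out on success, 0 if the line is not a replica entry. */
int parse_replica_line(const char *line, replica_addr *out) {
    int idx; /* replica index after "slave"; parsed but not used here */

    /* %63[^,] reads the ip up to the next comma, bounded by the buffer size. */
    if (sscanf(line, "slave%d:ip=%63[^,],port=%d",
               &idx, out->ip, &out->port) == 3)
        return 1;
    return 0;
}
```

A caller would run this over each line of the reply and add any address it has not seen before to its list of monitored replicas.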
Sentinel nodes
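From the analysis above we know that a Sentinel uses INFO to learn about masters and replicas. How do Sentinels discover each other?
Sentinel discovery is built on publish/subscribe (Pub/Sub). Sentinels communicate by sending hello messages on a channel named "__sentinel__:hello".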
#define SENTINEL_PUBLISH_PERIOD 2000
void sentinelSendPeriodicCommands(sentinelRedisInstance *ri) {
...
/* PUBLISH hello messages to all the three kinds of instances. */
if ((now - ri->last_pub_time) > SENTINEL_PUBLISH_PERIOD) {
sentinelSendHello(ri);
}
...
}
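As with INFO above, sentinelSendHello is called from sentinelSendPeriodicCommands. We can see that a Sentinel sends a hello message every two seconds to the masters and replicas it monitors (the code comment notes that this applies to all three kinds of instances).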
Subscribing to the channel
Let's look at how a Sentinel subscribes to messages:
void sentinelReconnectInstance(sentinelRedisInstance *ri) {
...
if (link->pc->err) {
sentinelEvent(LL_DEBUG,"-pubsub-link-reconnection",ri,"%@ #%s",
link->pc->errstr);
instanceLinkCloseConnection(link,link->pc);
} else {
...
/* Now we subscribe to the Sentinels "Hello" channel. */
retval = redisAsyncCommand(link->pc,
sentinelReceiveHelloMessages, ri, "%s %s",
sentinelInstanceMapCommand(ri,"SUBSCRIBE"),
SENTINEL_HELLO_CHANNEL);
}
...
}
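We can see that when a Sentinel establishes its connection to a master or replica, it subscribes to the __sentinel__:hello channel and registers sentinelReceiveHelloMessages as the callback. sentinelReceiveHelloMessages parses the messages received on __sentinel__:hello.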
Publishing messages
Let's look at how a Sentinel sends a hello message:
int sentinelSendHello(sentinelRedisInstance *ri) {
...
// Send information about this Sentinel and about the master
/* Format and send the Hello message. */
snprintf(payload,sizeof(payload),
"%s,%d,%s,%llu," /* Info about this sentinel. */
"%s,%s,%d,%llu", /* Info about current master. */
announce_ip, announce_port, sentinel.myid,
(unsigned long long) sentinel.current_epoch,
/* --- */
master->name,master_addr->ip,master_addr->port,
(unsigned long long) master->config_epoch);
retval = redisAsyncCommand(ri->link->cc,
sentinelPublishReplyCallback, ri, "%s %s %s",
sentinelInstanceMapCommand(ri,"PUBLISH"),
SENTINEL_HELLO_CHANNEL,payload);
if (retval != C_OK) return C_ERR;
ri->link->pending_commands++;
return C_OK;
}
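We can see that the Sentinel publishes its own information, together with information about the master it monitors, to SENTINEL_HELLO_CHANNEL, that is, the __sentinel__:hello channel.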
Receiving messages
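Next, how does a Sentinel handle a hello message? The sentinelReceiveHelloMessages callback registered at subscription time receives each message published on __sentinel__:hello and parses it.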
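As an illustration of what that parsing involves, the sketch below splits a payload in the eight-field format that sentinelSendHello builds with snprintf. The `hello_msg` struct and `parse_hello` function are hypothetical, and using sscanf is my own simplification rather than the approach taken in sentinel.c.

```c
#include <stdio.h>

/* Hypothetical mirror of the eight comma-separated hello fields. */
typedef struct {
    char ip[64];                      /* announcing Sentinel's ip */
    int port;                         /* announcing Sentinel's port */
    char runid[41];                   /* announcing Sentinel's run id */
    unsigned long long current_epoch;
    char master_name[64];
    char master_ip[64];
    int master_port;
    unsigned long long master_config_epoch;
} hello_msg;

/* Parse "ip,port,runid,current_epoch,name,master_ip,master_port,config_epoch".
 * Returns 1 only if all eight fields were read. */
int parse_hello(const char *payload, hello_msg *m) {
    int n = sscanf(payload,
                   "%63[^,],%d,%40[^,],%llu,%63[^,],%63[^,],%d,%llu",
                   m->ip, &m->port, m->runid, &m->current_epoch,
                   m->master_name, m->master_ip, &m->master_port,
                   &m->master_config_epoch);
    return n == 8;
}
```

Once the fields are available, a receiver can compare the run id against the Sentinels it already knows and register any new one, which is how Sentinels that monitor the same master find each other.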
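Heartbeat detection
How does a Sentinel know whether masters and replicas are healthy? The answer is a heartbeat mechanism: the Sentinel periodically sends PING and expects PONG in return. Let's look at how Redis implements it.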
#define SENTINEL_PING_PERIOD 1000
void sentinelSendPeriodicCommands(sentinelRedisInstance *ri) {
...
// Use down_after_period as the ping period, capped at SENTINEL_PING_PERIOD
ping_period = ri->down_after_period;
if (ping_period > SENTINEL_PING_PERIOD) ping_period = SENTINEL_PING_PERIOD;
...
/* Send PING to all the three kinds of instances. */
if ((now - ri->link->last_pong_time) > ping_period &&
(now - ri->link->last_ping_time) > ping_period/2) {
sentinelSendPing(ri);
}
...
}
We can see that the heartbeat period is derived from down_after_period, whose value comes from the down-after-milliseconds option in sentinel.conf. If down-after-milliseconds is larger than SENTINEL_PING_PERIOD (1 second), ping_period is capped at 1 second. In other words, the heartbeat period is never longer than 1 second.
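The capping rule and the send condition can be isolated into two small functions. This is a sketch that mirrors the excerpt above; `effective_ping_period` and `should_send_ping` are hypothetical helper names, and all times are in milliseconds.

```c
#define SENTINEL_PING_PERIOD 1000 /* ms, same value as in the excerpt */

/* The ping period is down_after_period, capped at SENTINEL_PING_PERIOD. */
long long effective_ping_period(long long down_after_period) {
    return down_after_period > SENTINEL_PING_PERIOD
               ? SENTINEL_PING_PERIOD
               : down_after_period;
}

/* A PING goes out only when the last PONG is older than one period
 * and the last PING is older than half a period. */
int should_send_ping(long long now, long long last_pong_time,
                     long long last_ping_time, long long ping_period) {
    return (now - last_pong_time) > ping_period &&
           (now - last_ping_time) > ping_period / 2;
}
```

For example, with down-after-milliseconds set to 30000 the period is 1000 ms, while a setting of 500 yields a 500 ms period.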
int sentinelSendPing(sentinelRedisInstance *ri) {
// Send PING
int retval = redisAsyncCommand(ri->link->cc,
sentinelPingReplyCallback, ri, "%s",
sentinelInstanceMapCommand(ri,"PING"));
if (retval == C_OK) {
ri->link->pending_commands++;
ri->link->last_ping_time = mstime();
/* We update the active ping time only if we received the pong for
* the previous ping, otherwise we are technically waiting since the
* first ping that did not receive a reply. */
if (ri->link->act_ping_time == 0)
ri->link->act_ping_time = ri->link->last_ping_time;
return 1;
} else {
return 0;
}
}
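sentinelSendPing mainly sends the PING command and registers sentinelPingReplyCallback as the callback. If the command is sent successfully, it updates pending_commands and last_ping_time, and sets act_ping_time if no earlier PING is still waiting for a reply.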
void pingCommand(client *c) {
...
if (c->argc == 1)
addReply(c,shared.pong);
else
addReplyBulk(c,c->argv[1]);
...
}
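When a node receives PING it runs pingCommand. Called without an argument, it simply replies with PONG; otherwise it echoes the argument back. After receiving PONG, the Sentinel updates last_pong_time to the current time.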
Subjective down
The function that checks for the subjectively down state is sentinelCheckSubjectivelyDown.
void sentinelCheckSubjectivelyDown(sentinelRedisInstance *ri) {
mstime_t elapsed = 0;
if (ri->link->act_ping_time)
elapsed = mstime() - ri->link->act_ping_time;
else if (ri->link->disconnected)
elapsed = mstime() - ri->link->last_avail_time;
/* Check if we are in need for a reconnection of one of the
* links, because we are detecting low activity.
*
* 1) Check if the command link seems connected, was connected not less
* than SENTINEL_MIN_LINK_RECONNECT_PERIOD, but still we have a
* pending ping for more than half the timeout. */
// Check whether the command link is healthy
if (ri->link->cc &&
(mstime() - ri->link->cc_conn_time) >
SENTINEL_MIN_LINK_RECONNECT_PERIOD &&
ri->link->act_ping_time != 0 && /* There is a pending ping... */
/* The pending ping is delayed, and we did not receive
* error replies as well. */
(mstime() - ri->link->act_ping_time) > (ri->down_after_period/2) &&
(mstime() - ri->link->last_pong_time) > (ri->down_after_period/2))
{
instanceLinkCloseConnection(ri->link,ri->link->cc);
}
/* 2) Check if the pubsub link seems connected, was connected not less
* than SENTINEL_MIN_LINK_RECONNECT_PERIOD, but still we have no
* activity in the Pub/Sub channel for more than
* SENTINEL_PUBLISH_PERIOD * 3.
*/
// Check whether the Pub/Sub link is healthy
if (ri->link->pc &&
(mstime() - ri->link->pc_conn_time) >
SENTINEL_MIN_LINK_RECONNECT_PERIOD &&
(mstime() - ri->link->pc_last_activity) > (SENTINEL_PUBLISH_PERIOD*3))
{
instanceLinkCloseConnection(ri->link,ri->link->pc);
}
/* Update the SDOWN flag. We believe the instance is SDOWN if:
*
* 1) It is not replying.
* 2) We believe it is a master, it reports to be a slave for enough time
* to meet the down_after_period, plus enough time to get two times
* INFO report from the instance. */
// Set or clear the subjectively down (SDOWN) flag
if (elapsed > ri->down_after_period ||
(ri->flags & SRI_MASTER &&
ri->role_reported == SRI_SLAVE &&
mstime() - ri->role_reported_time >
(ri->down_after_period+SENTINEL_INFO_PERIOD*2)))
{
/* Is subjectively down */
if ((ri->flags & SRI_S_DOWN) == 0) {
sentinelEvent(LL_WARNING,"+sdown",ri,"%@");
ri->s_down_since_time = mstime();
ri->flags |= SRI_S_DOWN;
}
} else {
/* Is subjectively up */
if (ri->flags & SRI_S_DOWN) {
sentinelEvent(LL_WARNING,"-sdown",ri,"%@");
ri->flags &= ~(SRI_S_DOWN|SRI_SCRIPT_KILL_SENT);
}
}
}
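Stripped of the reconnection logic, the timeout part of this check comes down to one comparison. The sketch below covers only the "not replying" condition and deliberately leaves out the second condition, where a master keeps reporting itself as a replica. The `link_state` struct and `is_subjectively_down` function are hypothetical simplifications, with all times in milliseconds.

```c
/* Hypothetical subset of the fields the check reads. */
typedef struct {
    long long act_ping_time;     /* time of the oldest unanswered PING, 0 if none */
    long long last_avail_time;   /* last time the link was known to be available */
    int disconnected;            /* non-zero if the link is currently down */
    long long down_after_period; /* down-after-milliseconds */
} link_state;

/* Returns 1 if the instance has been unresponsive for longer than
 * down_after_period, following the first SDOWN condition only. */
int is_subjectively_down(const link_state *s, long long now) {
    long long elapsed = 0;

    if (s->act_ping_time)
        elapsed = now - s->act_ping_time;   /* waiting for a PONG */
    else if (s->disconnected)
        elapsed = now - s->last_avail_time; /* link is down */

    return elapsed > s->down_after_period;
}
```

Note that the comparison is strict: with a 30000 ms timeout and a PING pending since t=1000, the instance is still considered up at t=31000 and is flagged at t=31001.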