[Redis] Deploying a Sentinel Cluster with docker compose

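Sentinel Basics

A few facts about Redis Sentinel:

  • A sentinel is a redis-sentinel service that listens on port 26379 by default. It runs separately from the Redis service: think of each sentinel as an extra server whose only job is to monitor whether the Redis service is running properly.

  • Multiple sentinels means multiple redis-sentinel services. Each can be deployed on a different machine, or they can all run on the same machine listening on different ports, although that defeats the purpose of having several sentinels, since a single machine failure takes them all down.

  • All sentinels monitor the same Redis service, the Master, so the number of sentinels is independent of the number of Redis services.

  • Sentinels only need to be pointed at the Redis Master. The Slaves register with the Master, so every sentinel can learn about all Slaves through the Master.

  • Each sentinel only needs to know which instance is the current Redis Master. It does not need to be told about the other sentinels, because sentinels discover one another by exchanging messages through the Master.

For a deeper look at Sentinel, see: 【Redis】哨兵(Sentinel)介绍及其工作原理 -- 辐射工兵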
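
To make the point about Slaves being registered on the Master more concrete, here is a minimal Python sketch that extracts replica addresses from the text of an `INFO replication` reply. The sample reply is illustrative (its offsets and lag values are made up), and `parse_replicas` is a hypothetical helper name, not part of Redis or of this setup.

```python
import re

# Illustrative sample of an `INFO replication` reply from the Master.
# Values such as offset and lag are made up for the example.
SAMPLE_INFO = """\
# Replication
role:master
connected_slaves:2
slave0:ip=172.25.0.102,port=6379,state=online,offset=1234,lag=0
slave1:ip=172.25.0.103,port=6379,state=online,offset=1234,lag=1
"""


def parse_replicas(info_text):
    """Return (ip, port) for every slaveN line in an INFO replication reply."""
    replicas = []
    for line in info_text.splitlines():
        if not re.match(r"slave\d+:", line):
            continue  # skip the header, role, counters, etc.
        _, _, fields = line.partition(":")
        # Each field looks like key=value, separated by commas.
        kv = dict(item.split("=", 1) for item in fields.split(","))
        replicas.append((kv["ip"], int(kv["port"])))
    return replicas


print(parse_replicas(SAMPLE_INFO))
# [('172.25.0.102', 6379), ('172.25.0.103', 6379)]
```

This is only a sketch of the idea: a monitor that knows the Master's address can list its replicas from that one reply, without being configured with each replica.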

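Sentinel Cluster Architecture

The Sentinel cluster built in this post has the following architecture:

[Figure: architecture diagram (image.png)]

Directory Structure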

sentinel/
├── docker-compose.yml
├── master
│   ├── data/
│   └── redis.conf
├── slave1
│   ├── data/
│   └── redis.conf
├── slave2
│   ├── data/
│   └── redis.conf
├── sentinel1
│   └── sentinel.conf
├── sentinel2
│   └── sentinel.conf
└── sentinel3
    └── sentinel.conf

Compose File

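Notes:

  • Defining a dedicated internal network makes the services easier to manage and keeps the setup tidy.
  • Sentinel rewrites sentinel.conf after it starts. If the file is shared with the host as a mounted volume, the sentinel inside the container cannot modify it, so the mounted file is copied inside the container and the sentinel runs against the copy.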
version: "3"

networks:
  redis-replication:
    driver: bridge
    ipam:
      config:
        - subnet: 172.25.0.0/24

services:
  master:
    image: redis
    container_name: redis-master
    ports:
      - "6380:6379"
    volumes:
      - "./master/redis.conf:/etc/redis.conf"
      - "./master/data:/data"
    command: ["redis-server", "/etc/redis.conf"]
    restart: always
    networks:
      redis-replication:
        ipv4_address: 172.25.0.101

  slave1:
    image: redis
    container_name: redis-slave-1
    ports:
      - "6381:6379"
    volumes:
      - "./slave1/redis.conf:/etc/redis.conf"
      - "./slave1/data:/data"
    command: ["redis-server", "/etc/redis.conf"]
    restart: always
    networks:
      redis-replication:
        ipv4_address: 172.25.0.102

  slave2:
    image: redis
    container_name: redis-slave-2
    ports:
      - "6382:6379"
    volumes:
      - "./slave2/redis.conf:/etc/redis.conf"
      - "./slave2/data:/data"
    command: ["redis-server", "/etc/redis.conf"]
    restart: always
    networks:
      redis-replication:
        ipv4_address: 172.25.0.103
  
  sentinel1:
    image: redis
    container_name: redis-sentinel-1
    ports:
      - "26380:26379"
    volumes: 
      - "./sentinel1/sentinel.conf:/etc/sentinel.conf"
    command: ["/bin/bash", "-c", "cp /etc/sentinel.conf /sentinel.conf && redis-sentinel /sentinel.conf"]
    restart: always
    networks:
      redis-replication:
        ipv4_address: 172.25.0.201

  sentinel2:
    image: redis
    container_name: redis-sentinel-2
    ports:
      - "26381:26379"
    volumes: 
      - "./sentinel2/sentinel.conf:/etc/sentinel.conf"
    command: ["/bin/bash", "-c", "cp /etc/sentinel.conf /sentinel.conf && redis-sentinel /sentinel.conf"]
    restart: always
    networks:
      redis-replication:
        ipv4_address: 172.25.0.202

  sentinel3:
    image: redis
    container_name: redis-sentinel-3
    ports:
      - "26382:26379"
    volumes: 
      - "./sentinel3/sentinel.conf:/etc/sentinel.conf"
    command: ["/bin/bash", "-c", "cp /etc/sentinel.conf /sentinel.conf && redis-sentinel /sentinel.conf"]
    restart: always
    networks:
      redis-replication:
        ipv4_address: 172.25.0.203
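
Because every service is pinned to a static address, a typo in one `ipv4_address` is easy to miss. The following is a small Python sketch, using only the standard `ipaddress` module, that checks the addresses above against the subnet and against each other. The dictionary is copied by hand from the compose file, and `check_addresses` is a hypothetical helper name.

```python
import ipaddress

SUBNET = ipaddress.ip_network("172.25.0.0/24")

# Static addresses copied by hand from the compose file above.
SERVICE_IPS = {
    "master": "172.25.0.101",
    "slave1": "172.25.0.102",
    "slave2": "172.25.0.103",
    "sentinel1": "172.25.0.201",
    "sentinel2": "172.25.0.202",
    "sentinel3": "172.25.0.203",
}


def check_addresses(subnet, service_ips):
    """Return a list of problems: addresses outside the subnet or used twice."""
    problems = []
    seen = {}
    for name, ip in service_ips.items():
        addr = ipaddress.ip_address(ip)
        if addr not in subnet:
            problems.append(f"{name}: {ip} is outside {subnet}")
        if addr in seen:
            problems.append(f"{name}: {ip} already used by {seen[addr]}")
        seen.setdefault(addr, name)
    return problems


print(check_addresses(SUBNET, SERVICE_IPS) or "all addresses OK")
# all addresses OK
```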

Master/Slave Configuration

Configure the redis.conf files.

  • Master

    port 6379
    pidfile /var/run/redis_6379.pid
    protected-mode no
    timeout 0
    tcp-keepalive 300
    loglevel notice
    
    ################################# REPLICATION #################################
    slave-serve-stale-data yes
    slave-read-only yes
    repl-diskless-sync no
    repl-diskless-sync-delay 5
    repl-disable-tcp-nodelay no
    
    ##################################### RDB #####################################
    dbfilename dump.rdb
    save 900 1
    save 300 10
    save 60 10000
    stop-writes-on-bgsave-error yes
    rdbcompression yes
    rdbchecksum yes
    dir ./
    
    ##################################### AOF #####################################
    appendonly yes
    appendfilename "appendonly.aof"
    appendfsync everysec
    no-appendfsync-on-rewrite no
    aof-load-truncated yes
    aof-use-rdb-preamble no
    
  • Slave1

    Just add a slaveof setting to the REPLICATION section:

    ...
    ################################# REPLICATION #################################
    slaveof 172.25.0.101 6379
    ...
    
  • Slave2

    Just add a slaveof setting to the REPLICATION section:

    ...
    ################################# REPLICATION #################################
    slaveof 172.25.0.101 6379
    ...
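
Since the two Slave files differ from the Master file by a single line, they can be derived from it rather than edited by hand. Below is a minimal Python sketch of that idea. `make_replica_conf` is a hypothetical helper name, and the short config string stands in for the full Master file. Note that newer Redis versions also accept `replicaof` as the name of the same directive; the sketch keeps `slaveof` to match the files above.

```python
def make_replica_conf(master_conf, master_host, master_port=6379):
    """Return master_conf with a `slaveof` line added right after the REPLICATION banner."""
    directive = f"slaveof {master_host} {master_port}"
    out = []
    inserted = False
    for line in master_conf.splitlines():
        out.append(line)
        # The banner is a comment line containing the word REPLICATION.
        if not inserted and line.lstrip().startswith("#") and " REPLICATION " in line:
            out.append(directive)
            inserted = True
    if not inserted:
        # No banner found: append the directive at the end instead.
        out.append(directive)
    return "\n".join(out) + "\n"


# Shortened stand-in for master/redis.conf.
master_conf = """\
port 6379
################################# REPLICATION #################################
slave-serve-stale-data yes
"""

print(make_replica_conf(master_conf, "172.25.0.101"))
```

The result could then be written to slave1/redis.conf and slave2/redis.conf.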
    

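Sentinel Configuration

Configure sentinel.conf and put a copy in each of the three sentinel folders.

# Every sentinel listens on the same port, since each one runs in its own container on the Docker bridge network
port 26379

# Sentinel settings, identical for every sentinel; all of them point at the Master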
sentinel monitor mymaster 172.25.0.101 6379 2
sentinel parallel-syncs mymaster 1
sentinel down-after-milliseconds mymaster 30000
sentinel failover-timeout mymaster 180000

bind 0.0.0.0
protected-mode no
daemonize no
pidfile /var/run/redis-sentinel.pid
logfile ""
dir /tmp
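
The trailing `2` in `sentinel monitor` is the quorum: the number of sentinels that must agree the Master is unreachable before it is flagged as objectively down. Carrying out the failover additionally requires one sentinel to be elected leader by a majority of all sentinels. The Python sketch below works through that arithmetic for this three-sentinel setup. The function names are hypothetical, and the model ignores timing and network partitions.

```python
def can_mark_odown(agreeing, quorum):
    """The Master is flagged objectively down once `quorum` sentinels agree."""
    return agreeing >= quorum


def votes_needed_for_failover(total_sentinels, quorum):
    """A failover leader needs a majority of all sentinels, and at least `quorum` votes."""
    majority = total_sentinels // 2 + 1
    return max(majority, quorum)


def failover_possible(total_sentinels, reachable, quorum):
    """True if the reachable sentinels can both flag ODOWN and elect a leader."""
    return (can_mark_odown(reachable, quorum)
            and reachable >= votes_needed_for_failover(total_sentinels, quorum))


# Three sentinels with quorum 2, as configured above.
for alive in (3, 2, 1):
    print(alive, failover_possible(3, alive, 2))
# 3 True
# 2 True
# 1 False
```

With three sentinels and a quorum of 2, the setup tolerates the loss of one sentinel; a single surviving sentinel cannot start a failover.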

Start and Verify

  1. Start the services

    docker-compose up -d
    Creating network "sentinel_redis-replication" with driver "bridge"
    Creating redis-sentinel-3 ... done
    Creating redis-sentinel-2 ... done
    Creating redis-slave-2    ... done
    Creating redis-master     ... done
    Creating redis-slave-1    ... done
    Creating redis-sentinel-1 ... done
    
  2. Check that master/slave replication is working

    Output from the Master:

    docker logs --tail 100 redis-master
    1:C 17 Aug 2021 17:18:43.255 # oO0OoO0OoO0Oo Redis is starting oO0OoO0OoO0Oo
    ...
    1:M 17 Aug 2021 17:18:43.311 * Ready to accept connections
    1:M 17 Aug 2021 17:18:43.748 * Replica 172.25.0.102:6379 asks for synchronization
    1:M 17 Aug 2021 17:18:43.748 * Full resync requested by replica 172.25.0.102:6379
    1:M 17 Aug 2021 17:18:43.748 * Replication backlog created, my new replication IDs are '08bf50bad7b8c67a5a8c7402b5dda40e1fc1647a' and '0000000000000000000000000000000000000000'
    1:M 17 Aug 2021 17:18:43.748 * Starting BGSAVE for SYNC with target: disk
    1:M 17 Aug 2021 17:18:43.748 * Background saving started by pid 19
    19:C 17 Aug 2021 17:18:43.753 * DB saved on disk
    19:C 17 Aug 2021 17:18:43.754 * RDB: 0 MB of memory used by copy-on-write
    1:M 17 Aug 2021 17:18:43.814 * Background saving terminated with success
    1:M 17 Aug 2021 17:18:43.814 * Synchronization with replica 172.25.0.102:6379 succeeded
    1:M 17 Aug 2021 17:18:44.464 * Replica 172.25.0.103:6379 asks for synchronization
    1:M 17 Aug 2021 17:18:44.464 * Full resync requested by replica 172.25.0.103:6379
    1:M 17 Aug 2021 17:18:44.464 * Starting BGSAVE for SYNC with target: disk
    1:M 17 Aug 2021 17:18:44.464 * Background saving started by pid 20
    20:C 17 Aug 2021 17:18:44.470 * DB saved on disk
    20:C 17 Aug 2021 17:18:44.470 * RDB: 0 MB of memory used by copy-on-write
    1:M 17 Aug 2021 17:18:44.518 * Background saving terminated with success
    1:M 17 Aug 2021 17:18:44.518 * Synchronization with replica 172.25.0.103:6379 succeeded
    

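    Both Slaves have connected and completed a full resync. To save space, the Slave logs and replication tests are omitted here.

  3. Check that each sentinel is healthy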

    • Sentinel1

      docker logs --tail 100 redis-sentinel-1
      1:X 17 Aug 2021 17:18:42.325 # oO0OoO0OoO0Oo Redis is starting oO0OoO0OoO0Oo
      1:X 17 Aug 2021 17:18:42.326 # Redis version=6.2.5, bits=64, commit=00000000, modified=0, pid=1, just started
      1:X 17 Aug 2021 17:18:42.326 # Configuration loaded
      1:X 17 Aug 2021 17:18:42.326 * monotonic clock: POSIX clock_gettime
      1:X 17 Aug 2021 17:18:42.327 * Running mode=sentinel, port=26379.
      1:X 17 Aug 2021 17:18:42.335 # Sentinel ID is dc93b1609d8610a42c209d41a3157ee052d935fb
      1:X 17 Aug 2021 17:18:42.335 # +monitor master mymaster 172.25.0.101 6379 quorum 2
      1:X 17 Aug 2021 17:18:45.265 * +sentinel sentinel f14ad168ba17bbeed8f467d269d914666db6cfa6 172.25.0.202 26379 @ mymaster 172.25.0.101 6379
      1:X 17 Aug 2021 17:18:45.855 * +sentinel sentinel 774bc91d18ed7eeac779ac9bd325c720e4e4d9cd 172.25.0.203 26379 @ mymaster 172.25.0.101 6379
      1:X 17 Aug 2021 17:18:53.406 * +slave slave 172.25.0.102:6379 172.25.0.102 6379 @ mymaster 172.25.0.101 6379
      1:X 17 Aug 2021 17:18:53.414 * +slave slave 172.25.0.103:6379 172.25.0.103 6379 @ mymaster 172.25.0.101 6379
      

      The log shows the +monitor entry, two +sentinel records, and two +slave records, each with the correct ip and port, so this sentinel is running normally.

    • Sentinel2

      docker logs --tail 100 redis-sentinel-2
      1:X 17 Aug 2021 17:18:43.253 # oO0OoO0OoO0Oo Redis is starting oO0OoO0OoO0Oo
      1:X 17 Aug 2021 17:18:43.253 # Redis version=6.2.5, bits=64, commit=00000000, modified=0, pid=1, just started
      1:X 17 Aug 2021 17:18:43.253 # Configuration loaded
      1:X 17 Aug 2021 17:18:43.254 * monotonic clock: POSIX clock_gettime
      1:X 17 Aug 2021 17:18:43.254 * Running mode=sentinel, port=26379.
      1:X 17 Aug 2021 17:18:43.262 # Sentinel ID is f14ad168ba17bbeed8f467d269d914666db6cfa6
      1:X 17 Aug 2021 17:18:43.262 # +monitor master mymaster 172.25.0.101 6379 quorum 2
      1:X 17 Aug 2021 17:18:44.363 * +sentinel sentinel dc93b1609d8610a42c209d41a3157ee052d935fb 172.25.0.201 26379 @ mymaster 172.25.0.101 6379
      1:X 17 Aug 2021 17:18:45.855 * +sentinel sentinel 774bc91d18ed7eeac779ac9bd325c720e4e4d9cd 172.25.0.203 26379 @ mymaster 172.25.0.101 6379
      1:X 17 Aug 2021 17:18:53.325 * +slave slave 172.25.0.102:6379 172.25.0.102 6379 @ mymaster 172.25.0.101 6379
      1:X 17 Aug 2021 17:18:53.342 * +slave slave 172.25.0.103:6379 172.25.0.103 6379 @ mymaster 172.25.0.101 6379
      

      Running normally.

    • Sentinel3

      docker logs --tail 100 redis-sentinel-3
      1:X 17 Aug 2021 17:18:43.842 # oO0OoO0OoO0Oo Redis is starting oO0OoO0OoO0Oo
      1:X 17 Aug 2021 17:18:43.842 # Redis version=6.2.5, bits=64, commit=00000000, modified=0, pid=1, just started
      1:X 17 Aug 2021 17:18:43.842 # Configuration loaded
      1:X 17 Aug 2021 17:18:43.843 * monotonic clock: POSIX clock_gettime
      1:X 17 Aug 2021 17:18:43.843 * Running mode=sentinel, port=26379.
      1:X 17 Aug 2021 17:18:43.850 # Sentinel ID is 774bc91d18ed7eeac779ac9bd325c720e4e4d9cd
      1:X 17 Aug 2021 17:18:43.850 # +monitor master mymaster 172.25.0.101 6379 quorum 2
      1:X 17 Aug 2021 17:18:43.851 * +slave slave 172.25.0.102:6379 172.25.0.102 6379 @ mymaster 172.25.0.101 6379
      1:X 17 Aug 2021 17:18:44.363 * +sentinel sentinel dc93b1609d8610a42c209d41a3157ee052d935fb 172.25.0.201 26379 @ mymaster 172.25.0.101 6379
      1:X 17 Aug 2021 17:18:45.265 * +sentinel sentinel f14ad168ba17bbeed8f467d269d914666db6cfa6 172.25.0.202 26379 @ mymaster 172.25.0.101 6379
      1:X 17 Aug 2021 17:18:53.869 * +slave slave 172.25.0.103:6379 172.25.0.103 6379 @ mymaster 172.25.0.101 6379
      

      Running normally.

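  4. Master failure

    Stop the Master container manually, wait for the configured down-after-milliseconds (30000 ms here) to elapse, and check whether the sentinels detect the failure and elect a new Master: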

    docker stop redis-master
    redis-master
    

    Meanwhile, on another node:

    docker logs --tail 30 -f redis-sentinel-3
    1:X 17 Aug 2021 17:37:12.290 # +sdown master mymaster 172.25.0.101 6379
    1:X 17 Aug 2021 17:37:12.357 # +odown master mymaster 172.25.0.101 6379 #quorum 3/2
    1:X 17 Aug 2021 17:37:12.357 # +new-epoch 1
    1:X 17 Aug 2021 17:37:12.357 # +try-failover master mymaster 172.25.0.101 6379
    1:X 17 Aug 2021 17:37:12.376 # +vote-for-leader 774bc91d18ed7eeac779ac9bd325c720e4e4d9cd 1
    1:X 17 Aug 2021 17:37:12.376 # dc93b1609d8610a42c209d41a3157ee052d935fb voted for dc93b1609d8610a42c209d41a3157ee052d935fb 1
    1:X 17 Aug 2021 17:37:12.395 # f14ad168ba17bbeed8f467d269d914666db6cfa6 voted for 774bc91d18ed7eeac779ac9bd325c720e4e4d9cd 1
    1:X 17 Aug 2021 17:37:12.467 # +elected-leader master mymaster 172.25.0.101 6379
    1:X 17 Aug 2021 17:37:12.467 # +failover-state-select-slave master mymaster 172.25.0.101 6379
    1:X 17 Aug 2021 17:37:12.523 # +selected-slave slave 172.25.0.102:6379 172.25.0.102 6379 @ mymaster 172.25.0.101 6379
    1:X 17 Aug 2021 17:37:12.523 * +failover-state-send-slaveof-noone slave 172.25.0.102:6379 172.25.0.102 6379 @ mymaster 172.25.0.101 6379
    1:X 17 Aug 2021 17:37:12.579 * +failover-state-wait-promotion slave 172.25.0.102:6379 172.25.0.102 6379 @ mymaster 172.25.0.101 6379
    1:X 17 Aug 2021 17:37:12.952 # +promoted-slave slave 172.25.0.102:6379 172.25.0.102 6379 @ mymaster 172.25.0.101 6379
    1:X 17 Aug 2021 17:37:12.952 # +failover-state-reconf-slaves master mymaster 172.25.0.101 6379
    1:X 17 Aug 2021 17:37:12.992 * +slave-reconf-sent slave 172.25.0.103:6379 172.25.0.103 6379 @ mymaster 172.25.0.101 6379
    1:X 17 Aug 2021 17:37:13.518 # -odown master mymaster 172.25.0.101 6379
    1:X 17 Aug 2021 17:37:13.954 * +slave-reconf-inprog slave 172.25.0.103:6379 172.25.0.103 6379 @ mymaster 172.25.0.101 6379
    1:X 17 Aug 2021 17:37:13.954 * +slave-reconf-done slave 172.25.0.103:6379 172.25.0.103 6379 @ mymaster 172.25.0.101 6379
    1:X 17 Aug 2021 17:37:14.010 # +failover-end master mymaster 172.25.0.101 6379
    1:X 17 Aug 2021 17:37:14.010 # +switch-master mymaster 172.25.0.101 6379 172.25.0.102 6379
    1:X 17 Aug 2021 17:37:14.010 * +slave slave 172.25.0.103:6379 172.25.0.103 6379 @ mymaster 172.25.0.102 6379
    1:X 17 Aug 2021 17:37:14.010 * +slave slave 172.25.0.101:6379 172.25.0.101 6379 @ mymaster 172.25.0.102 6379
    1:X 17 Aug 2021 17:37:44.058 # +sdown slave 172.25.0.101:6379 172.25.0.101 6379 @ mymaster 172.25.0.102 6379
    

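    The sentinel first flags the Master as subjectively down (+sdown). Once enough sentinels agree (#quorum 3/2 in the log: three agree against a quorum of 2), the Master is flagged as objectively down (+odown). +new-epoch 1 then starts the first election round.

    Next, +selected-slave slave 172.25.0.102:6379 shows that 102, i.e. Slave1, is selected as the new Master, and +promoted-slave confirms its promotion.

    Finally, the sentinel keeps checking whether 101 has come back: +sdown slave 172.25.0.101:6379 172.25.0.101 6379 @ mymaster 172.25.0.102 6379 shows that 101, now tracked as a Slave of 102, is still offline.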

  5. Test the new Master

    Log in to 102 (mapped to host port 6381) and test whether writes work. If they do, it has become the Master:

    127.0.0.1:6381> ping
    PONG
    127.0.0.1:6381> set hello world
    OK
    

    Log in to 103 (mapped to host port 6382) and check that the new key can be read:

    127.0.0.1:6382> ping
    PONG
    127.0.0.1:6382> get hello
    "world"
    

    Success.

  6. The original Master comes back online

    docker restart redis-master
    redis-master
    

    Check how the sentinel reacts:

    docker logs --tail 5 -f redis-sentinel-3
    1:X 17 Aug 2021 17:37:14.010 # +switch-master mymaster 172.25.0.101 6379 172.25.0.102 6379
    1:X 17 Aug 2021 17:37:14.010 * +slave slave 172.25.0.103:6379 172.25.0.103 6379 @ mymaster 172.25.0.102 6379
    1:X 17 Aug 2021 17:37:14.010 * +slave slave 172.25.0.101:6379 172.25.0.101 6379 @ mymaster 172.25.0.102 6379
    1:X 17 Aug 2021 17:37:44.058 # +sdown slave 172.25.0.101:6379 172.25.0.101 6379 @ mymaster 172.25.0.102 6379
    1:X 17 Aug 2021 17:47:37.128 # -sdown slave 172.25.0.101:6379 172.25.0.101 6379 @ mymaster 172.25.0.102 6379
    

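    The -sdown slave 172.25.0.101:6379 entry shows the down flag being cleared for Slave 101 (the original Master).

    Test whether it is working, whether it can read the new key, and whether it accepts writes: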

    127.0.0.1:6380> ping
    PONG
    127.0.0.1:6380> get hello
    "world"
    127.0.0.1:6380> set hello JAVA
    (error) READONLY You can't write against a read only replica.
    

    It can be read but not written to, which confirms that it is now a Slave.
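
Reading the failover out of the sentinel log by eye works for a one-off test, but the same check can be scripted. The Python sketch below parses sentinel log lines in the format shown in the steps above and reports the last +switch-master event. The regular expression is an assumption based only on the log lines in this post, and `parse_events` and `find_switch_master` are hypothetical helper names.

```python
import re

# Sentinel log line as seen above: "<pid>:X <date> <time> <#|*> <event> <details>"
LINE_RE = re.compile(
    r"^\d+:X (?P<ts>\d+ \w+ \d+ [\d:.]+) [#*] (?P<event>[+-][\w-]+) (?P<rest>.*)$"
)

# A few lines copied from the sentinel 3 log in step 4.
SAMPLE_LOG = """\
1:X 17 Aug 2021 17:37:12.290 # +sdown master mymaster 172.25.0.101 6379
1:X 17 Aug 2021 17:37:12.357 # +odown master mymaster 172.25.0.101 6379 #quorum 3/2
1:X 17 Aug 2021 17:37:12.376 # dc93b1609d8610a42c209d41a3157ee052d935fb voted for dc93b1609d8610a42c209d41a3157ee052d935fb 1
1:X 17 Aug 2021 17:37:14.010 # +switch-master mymaster 172.25.0.101 6379 172.25.0.102 6379
1:X 17 Aug 2021 17:37:14.010 * +slave slave 172.25.0.103:6379 172.25.0.103 6379 @ mymaster 172.25.0.102 6379
"""


def parse_events(log_text):
    """Return (timestamp, event, details) for every +event / -event line."""
    events = []
    for line in log_text.splitlines():
        m = LINE_RE.match(line.strip())
        if m:  # lines without a +/- event name (e.g. votes) are skipped
            events.append((m["ts"], m["event"], m["rest"]))
    return events


def find_switch_master(log_text):
    """Return (name, old_ip, old_port, new_ip, new_port) of the last +switch-master, or None."""
    result = None
    for _, event, rest in parse_events(log_text):
        if event == "+switch-master":
            name, old_ip, old_port, new_ip, new_port = rest.split()
            result = (name, old_ip, int(old_port), new_ip, int(new_port))
    return result


print([event for _, event, _ in parse_events(SAMPLE_LOG)])
# ['+sdown', '+odown', '+switch-master', '+slave']
print(find_switch_master(SAMPLE_LOG))
# ('mymaster', '172.25.0.101', 6379, '172.25.0.102', 6379)
```

In practice the input could be the captured output of `docker logs redis-sentinel-3`, under the assumption that the log format matches the lines shown in this post.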