RabbitMQ Cluster Persistence Options

1. Comparison of Persistence Options

A detailed comparison of the three persistence options for RabbitMQ clusters in a Kubernetes environment:

| Dimension | emptyDir | hostPath + nodeName | PersistentVolumeClaim (PVC) |
| --- | --- | --- | --- |
| Data durability | Not durable; a node failure loses the data | Durable at the node level; nodeName pins the Pod to its node, so data survives Pod restarts | Fully durable; decoupled from the Pod lifecycle |
| High availability | None | Pod and node are hard-bound; a node failure takes the Pod down | The PVC can be mounted from any node, enabling high availability |
| Performance | Good; data is stored on the node's local disk | Excellent; uses node-local storage directly | Good; depends on the storage backend |
| Configuration complexity | Simple; no extra configuration needed | Relatively simple | Relatively complex; requires storage infrastructure |
| Typical scenarios | Dev/test environments, scratch data, caches | Single-node production, high-performance workloads | Multi-node production clusters, enterprise applications |
| Storage backends | Node-local storage only | Node-local storage only | Many backends (NFS, Ceph, cloud storage, etc.) |

With the hostPath approach, add a nodeName field to the RabbitMQ Deployment spec to pin the Pod to a specific node. Every restart then reads the same fixed host path, so the data is not lost. When your Kubernetes cluster has only three nodes, hostPath + nodeName is the recommended approach.
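
For reference, here is a minimal sketch of that pattern; the node name k8s-node1 and host path /data/rabbitmq are placeholders, not values taken from this article's cluster:

kind: Deployment
apiVersion: apps/v1
metadata:
  name: rabbitmq-node1
spec:
  replicas: 1
  selector:
    matchLabels:
      app: rabbitmq
  template:
    metadata:
      labels:
        app: rabbitmq
    spec:
      # Pin the Pod to a fixed node so it always finds its data
      nodeName: k8s-node1
      containers:
        - name: rabbitmq
          image: rabbitmq:3.11.16-management-alpine
          volumeMounts:
            - name: data
              mountPath: /var/lib/rabbitmq
      volumes:
        # Node-local directory: survives Pod restarts, but not node loss
        - name: data
          hostPath:
            path: /data/rabbitmq
            type: DirectoryOrCreate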

With a larger cluster, however, PVCs are the better choice: store the data on a persistent backend such as NFS or cloud storage to get both high availability and durability.

The rest of this article walks through an implementation that uses NFS servers as the storage backend.

2. NFS PersistentVolumeClaim Implementation

[Figure: PVC persistence architecture based on multiple NFS servers]

The figure above shows the PVC persistence architecture built on multiple NFS servers:

  • On the left are 3 independent NFS servers (192.168.10.5/6/7), each exporting a /kubernetes/nfs directory to provide shared storage across nodes.
  • In the middle, 3 nfs-client-provisioner instances (backing storage classes nfs-1/2/3) dynamically create a separate PVC for each RabbitMQ Pod, yielding a "one Pod, one PVC, one NFS path" mapping.
  • On the right, each node of the RabbitMQ StatefulSet (rabbitmq-node1/2/3) binds to a different storage class through volumeClaimTemplates, which ensures:
    – data is fully decoupled from the Pod lifecycle, so after a node failure the volume can quickly be remounted elsewhere;
    – each node's data lives on a different NFS server, avoiding a single-server IO bottleneck and improving throughput;
    – scaling out only requires adding another storage class and NFS server to grow the storage pool.

Overall, the design balances high availability, performance, and operational simplicity, making it a good fit for production-grade RabbitMQ clusters.

2.1 Create 3 Different StorageClasses

  • deployments/nfs-client-provisioner-1.yaml
  • deployments/nfs-client-provisioner-2.yaml
  • deployments/nfs-client-provisioner-3.yaml
kind: Deployment
apiVersion: apps/v1
metadata:
  # Deployment name; change to nfs-client-provisioner-2 / nfs-client-provisioner-3 in the other two files (update the matching labels below as well)
  name: nfs-client-provisioner-1  
  namespace: kube-system
spec:
  replicas: 1
  selector:
    matchLabels:
      app: nfs-client-provisioner-1
  template:
    metadata:
      labels:
        app: nfs-client-provisioner-1
    spec:
      volumes:
        - name: nfs-client-root
          nfs:
            # Change to the matching NFS server IP (192.168.10.6 and 192.168.10.7 for the other two files)
            server: 192.168.10.5
            path: /kubernetes/nfs
      containers:
        - name: nfs-client-provisioner
          # Upstream image lives under registry.k8s.io/sig-storage; adjust if you mirror it privately
          image: registry.k8s.io/sig-storage/nfs-subdir-external-provisioner:v4.0.2
          env:
            # Provisioner name; change to managed-nfs-storage-2 / managed-nfs-storage-3 accordingly
            - name: PROVISIONER_NAME
              value: managed-nfs-storage-1
            # Change to the matching NFS server IP (192.168.10.6, 192.168.10.7)
            - name: NFS_SERVER
              value: 192.168.10.5
            # Change to the NFS server's exported path
            - name: NFS_PATH
              value: /kubernetes/nfs
          resources: {}
          volumeMounts:
            - name: nfs-client-root
              mountPath: /persistentvolumes
          imagePullPolicy: IfNotPresent
      restartPolicy: Always
      serviceAccountName: nfs-client-provisioner
      serviceAccount: nfs-client-provisioner

  # Recreate prevents two provisioner Pods from running concurrently during a rollout
  strategy:
    type: Recreate
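
The Deployment above references a ServiceAccount named nfs-client-provisioner that is not shown here. Below is a condensed sketch of the required RBAC, based on the upstream nfs-subdir-external-provisioner project's deploy/rbac.yaml (newer releases also include a namespaced Role/RoleBinding for leader election; verify against the version you deploy):

apiVersion: v1
kind: ServiceAccount
metadata:
  name: nfs-client-provisioner
  namespace: kube-system
---
kind: ClusterRole
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: nfs-client-provisioner-runner
rules:
  - apiGroups: [""]
    resources: ["persistentvolumes"]
    verbs: ["get", "list", "watch", "create", "delete"]
  - apiGroups: [""]
    resources: ["persistentvolumeclaims"]
    verbs: ["get", "list", "watch", "update"]
  - apiGroups: ["storage.k8s.io"]
    resources: ["storageclasses"]
    verbs: ["get", "list", "watch"]
  - apiGroups: [""]
    resources: ["events"]
    verbs: ["create", "update", "patch"]
---
kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: run-nfs-client-provisioner
subjects:
  - kind: ServiceAccount
    name: nfs-client-provisioner
    namespace: kube-system
roleRef:
  kind: ClusterRole
  name: nfs-client-provisioner-runner
  apiGroup: rbac.authorization.k8s.io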

Next, create the corresponding StorageClasses:

  • storageclasses/nfs-1.yaml
  • storageclasses/nfs-2.yaml
  • storageclasses/nfs-3.yaml
kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  # StorageClass name; change to nfs-2 / nfs-3 in the other two files
  name: nfs-1
# Change to managed-nfs-storage-2 / managed-nfs-storage-3 accordingly
provisioner: managed-nfs-storage-1
reclaimPolicy: Retain
volumeBindingMode: Immediate
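
Before deploying RabbitMQ, it is worth confirming that dynamic provisioning actually works. A throwaway claim like the following (the name test-claim is arbitrary) should reach the Bound state, and a matching directory should appear under /kubernetes/nfs on 192.168.10.5:

kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: test-claim
spec:
  # Any of the three classes can be tested the same way
  storageClassName: nfs-1
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 1Mi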

2.2 Create the RabbitMQ Cluster

To deploy the RabbitMQ cluster with PVCs, add a volumeClaimTemplates section to each StatefulSet (this field exists only on StatefulSets, not Deployments) and point it at the matching storage class nfs-N:

  • rabbitmq-node1.yaml
  • rabbitmq-node2.yaml
  • rabbitmq-node3.yaml
kind: StatefulSet
apiVersion: apps/v1
metadata:
  name: rabbitmq-node1
  labels:
    app: rabbitmq
spec:
  replicas: 1
  selector:
    matchLabels:
      app: rabbitmq
      instance: node1
  template:
    metadata:
      labels:
        app: rabbitmq
        instance: node1
    spec:
      volumes:
        - name: config
          configMap:
            name: rabbitmq-config
            defaultMode: 420
        - name: time
          hostPath:
            path: /etc/localtime
            type: ''
      containers:
        - name: rabbitmq
          image: rabbitmq:3.11.16-management-alpine
          command:
            - sh
            - '-c'
          args:
            - >
              # Enable the management and Kubernetes peer-discovery plugins

              rabbitmq-plugins enable --offline rabbitmq_management
              rabbitmq_peer_discovery_k8s 

              # Start RabbitMQ via the official entrypoint

              exec docker-entrypoint.sh rabbitmq-server
          ports:
            - name: amqp
              containerPort: 5672
              protocol: TCP
            - name: http
              containerPort: 15672
              protocol: TCP
            - name: clustering
              containerPort: 25672
              protocol: TCP
            - name: epmd
              containerPort: 4369
              protocol: TCP
          env:
            - name: RABBITMQ_DEFAULT_USER
              value: admin
            - name: RABBITMQ_DEFAULT_PASS
              value: Admin@123
            - name: RABBITMQ_ERLANG_COOKIE
              valueFrom:
                secretKeyRef:
                  name: rabbitmq-erlang-cookie
                  key: .erlang.cookie
            - name: RABBITMQ_USE_LONGNAME
              value: 'true'
            - name: POD_NAME
              valueFrom:
                fieldRef:
                  apiVersion: v1
                  fieldPath: metadata.name
            - name: POD_NAMESPACE
              valueFrom:
                fieldRef:
                  apiVersion: v1
                  fieldPath: metadata.namespace
            - name: RABBITMQ_MNESIA_BASE
              value: /var/lib/rabbitmq/mnesia/$(POD_NAME)
            - name: K8S_SERVICE_NAME
              value: rabbitmq-headless
            - name: K8S_HOSTNAME_SUFFIX
              value: .$(K8S_SERVICE_NAME).$(POD_NAMESPACE).svc.cluster.local
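            # The variables above combine into the fully-qualified node name below,
            # e.g. rabbit@rabbitmq-node1-0.rabbitmq-headless.<namespace>.svc.cluster.local,
            # which is resolvable through the headless Service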
            - name: RABBITMQ_NODENAME
              value: rabbit@$(POD_NAME)$(K8S_HOSTNAME_SUFFIX)
          resources: {}
          volumeMounts:
            - name: config
              mountPath: /etc/rabbitmq/rabbitmq.conf
              subPath: rabbitmq.conf
            - name: config
              mountPath: /etc/rabbitmq/cluster-health.sh
              subPath: cluster-health.sh
            - name: config
              mountPath: /etc/rabbitmq/z-migrates.json
              subPath: z-migrates.json
            - name: data
              mountPath: /var/lib/rabbitmq
            - name: time
              readOnly: true
              mountPath: /etc/localtime
          livenessProbe:
            exec:
              command:
                - rabbitmq-diagnostics
                - ping
            initialDelaySeconds: 60
            timeoutSeconds: 10
            periodSeconds: 30
            successThreshold: 1
            failureThreshold: 3
          readinessProbe:
            exec:
              command:
                - rabbitmq-diagnostics
                - status
            initialDelaySeconds: 20
            timeoutSeconds: 5
            periodSeconds: 10
            successThreshold: 1
            failureThreshold: 3
          terminationMessagePath: /dev/termination-log
          terminationMessagePolicy: File
          imagePullPolicy: IfNotPresent
      restartPolicy: Always
      terminationGracePeriodSeconds: 10
      dnsPolicy: ClusterFirst
      serviceAccountName: rabbitmq-sa
      serviceAccount: rabbitmq-sa
      securityContext: {}
      schedulerName: default-scheduler
  # volumeClaimTemplates defines a PersistentVolumeClaim template
  # from which a PVC is created dynamically for each Pod
  volumeClaimTemplates:
    - kind: PersistentVolumeClaim
      apiVersion: v1
      metadata:
        name: data
      spec:
        accessModes:
          - ReadWriteOnce
        resources:
          requests:
            storage: 10Gi
        # In StatefulSet rabbitmq-node2 change this to nfs-2 (and nfs-3 in node3)
        storageClassName: nfs-1
        volumeMode: Filesystem
  serviceName: rabbitmq-headless
  podManagementPolicy: OrderedReady
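
The StatefulSet references two resources that are not shown above: the rabbitmq-headless Service used for node discovery and DNS, and the rabbitmq-erlang-cookie Secret (all nodes must share the same cookie to form a cluster). A minimal sketch of both, with a placeholder cookie value:

kind: Service
apiVersion: v1
metadata:
  name: rabbitmq-headless
  labels:
    app: rabbitmq
spec:
  # Headless: gives every Pod a stable DNS record instead of a virtual IP
  clusterIP: None
  # Commonly set so peers can discover each other before they are Ready
  publishNotReadyAddresses: true
  selector:
    app: rabbitmq
  ports:
    - name: amqp
      port: 5672
    - name: http
      port: 15672
    - name: clustering
      port: 25672
    - name: epmd
      port: 4369
---
kind: Secret
apiVersion: v1
metadata:
  name: rabbitmq-erlang-cookie
type: Opaque
stringData:
  # Placeholder; replace with your own long random string
  .erlang.cookie: CHANGE-ME-SECRET-COOKIE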