rocketmq-operator: NameService


1. Understanding NameService

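NameServer is a simple Topic routing registry that supports dynamic registration and discovery of Topics and Brokers.

It has two main functions:

  • Broker management: NameServer accepts registration information from the Broker cluster and stores it as the base data for routing. It also provides a heartbeat mechanism to check whether each Broker is still alive.
  • Routing information management: each NameServer holds the complete routing information of the Broker cluster, plus the queue information that clients query. Through NameServer, Producers and Consumers learn the routing information of the whole Broker cluster and can then deliver and consume messages.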
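The heartbeat bookkeeping described in the first bullet can be pictured with a minimal sketch. Everything in it is an illustrative assumption rather than RocketMQ's actual implementation: the `brokerTable` type, the timestamps expressed in plain seconds, and the 120-second timeout used in `main`.

```go
package main

import "fmt"

// brokerTable is a hypothetical stand-in for the broker liveness data
// that a NameServer keeps.
type brokerTable struct {
	lastSeen map[string]int64 // broker address -> time of last heartbeat, in seconds
}

func newBrokerTable() *brokerTable {
	return &brokerTable{lastSeen: map[string]int64{}}
}

// Heartbeat records that a broker registered or sent a heartbeat at atSec.
func (t *brokerTable) Heartbeat(addr string, atSec int64) {
	t.lastSeen[addr] = atSec
}

// Expire drops every broker whose last heartbeat is older than timeoutSec
// and returns the number of brokers removed.
func (t *brokerTable) Expire(nowSec, timeoutSec int64) int {
	removed := 0
	for addr, seen := range t.lastSeen {
		if nowSec-seen > timeoutSec {
			delete(t.lastSeen, addr)
			removed++
		}
	}
	return removed
}

// Alive reports whether the broker is still present in the table.
func (t *brokerTable) Alive(addr string) bool {
	_, ok := t.lastSeen[addr]
	return ok
}

func main() {
	t := newBrokerTable()
	t.Heartbeat("broker-a", 0)
	t.Heartbeat("broker-b", 100)
	removed := t.Expire(130, 120) // broker-a has been silent for 130 seconds
	fmt.Println(removed, t.Alive("broker-a"), t.Alive("broker-b"))
}
```

A broker that keeps sending heartbeats stays in the table; one that goes silent is removed on the next scan, so clients stop being routed to it.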

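NameServer is usually deployed as multiple instances that do not communicate with one another. Each Broker registers its routing information with every NameServer, so every NameServer instance holds a complete copy of the routing information. When a NameServer goes offline for any reason, clients can still fetch routing information from the remaining NameServers.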

P.S. For now, you can simply think of it as a stateless registry.
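To make the "every NameServer holds a full copy" idea concrete, here is a minimal in-memory sketch. The `NameServer` type, the `RegisterBroker` and `Lookup` helpers, the topic name, and the broker address are all hypothetical; real NameServers speak a network protocol, which this sketch leaves out.

```go
package main

import (
	"errors"
	"fmt"
)

// NameServer is a hypothetical in-memory stand-in for one NameServer instance.
// It keeps its own routing table and never talks to its peers.
type NameServer struct {
	up     bool
	routes map[string][]string // topic -> broker addresses
}

func NewNameServer() *NameServer {
	return &NameServer{up: true, routes: map[string][]string{}}
}

// RegisterBroker mimics a broker registering with every reachable NameServer.
func RegisterBroker(servers []*NameServer, topic, brokerAddr string) {
	for _, ns := range servers {
		if ns.up {
			ns.routes[topic] = append(ns.routes[topic], brokerAddr)
		}
	}
}

// Lookup asks each NameServer in order and returns the first answer it gets.
func Lookup(servers []*NameServer, topic string) ([]string, error) {
	for _, ns := range servers {
		if !ns.up {
			continue // this instance is offline, try the next one
		}
		if brokers, ok := ns.routes[topic]; ok {
			return brokers, nil
		}
	}
	return nil, errors.New("no NameServer could resolve topic " + topic)
}

func main() {
	servers := []*NameServer{NewNameServer(), NewNameServer()}
	RegisterBroker(servers, "orders", "10.0.0.5:10911")

	servers[0].up = false // simulate one NameServer going offline
	brokers, err := Lookup(servers, "orders")
	fmt.Println(brokers, err)
}
```

Because the broker registered with both instances, the lookup still succeeds after the first instance goes down, which is why the instances never need to synchronize with each other.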

2. NameService implementation

I copied the official project by hand; below I briefly walk through the key parts.

2.1.nameservice_types

import (
	corev1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

// EDIT THIS FILE!  THIS IS SCAFFOLDING FOR YOU TO OWN!
// NOTE: json tags are required.  Any new fields you add must have json tags for the fields to be serialized.

// NameServiceSpec defines the desired state of NameService
type NameServiceSpec struct {
	// INSERT ADDITIONAL SPEC FIELDS - desired state of cluster
	// Important: Run "make" to regenerate code after modifying this file

	// Size is the number of the name service Pod
	Size int32 `json:"size"`
	//NameServiceImage is the name service image
	NameServiceImage string `json:"nameServiceImage"`
	// ImagePullPolicy defines how the image is pulled.
	ImagePullPolicy corev1.PullPolicy `json:"imagePullPolicy"`
	// HostNetwork can be true or false
	HostNetwork bool `json:"hostNetwork"`
	// dnsPolicy defines how a pod's DNS will be configured
	DNSPolicy corev1.DNSPolicy `json:"dnsPolicy"`
	// Resources describes the compute resource requirements
	Resources corev1.ResourceRequirements `json:"resources"`
	// StorageMode can be EmptyDir, HostPath, StorageClass
	StorageMode string `json:"storageMode"`
	// HostPath is the local path to store data
	HostPath string `json:"hostPath"`
	// Env defines custom env, e.g. JAVA_OPT_EXT
	Env []corev1.EnvVar `json:"env,omitempty"`
	// VolumeClaimTemplates defines the StorageClass
	VolumeClaimTemplates []corev1.PersistentVolumeClaim `json:"volumeClaimTemplates"`
	// Pod Security Context
	PodSecurityContext *corev1.PodSecurityContext `json:"securityContext,omitempty"`
	// Container Security Context
	ContainerSecurityContext *corev1.SecurityContext `json:"containerSecurityContext,omitempty"`
	// The secrets used to pull image from private registry
	ImagePullSecrets []corev1.LocalObjectReference `json:"imagePullSecrets,omitempty"`
	// Affinity the pod's scheduling constraints
	Affinity *corev1.Affinity `json:"affinity,omitempty"`
	// Tolerations the pod's tolerations.
	Tolerations []corev1.Toleration `json:"tolerations,omitempty"`
	// NodeSelector is a selector which must be true for the pod to fit on a node
	NodeSelector map[string]string `json:"nodeSelector,omitempty"`
	// PriorityClassName indicates the pod's priority
	PriorityClassName string `json:"priorityClassName,omitempty"`
	// ServiceAccountName
	ServiceAccountName string `json:"serviceAccountName,omitempty"`
}

// NameServiceStatus defines the observed state of NameService
type NameServiceStatus struct {
	// INSERT ADDITIONAL STATUS FIELD - define observed state of cluster
	// Important: Run "make" to regenerate code after modifying this file
	// NameServers is the name service ip list
	NameServices []string `json:"nameServices"`
}

// +kubebuilder:object:root=true
// +kubebuilder:subresource:status

// NameService is the Schema for the nameservices API
type NameService struct {
	metav1.TypeMeta   `json:",inline"`
	metav1.ObjectMeta `json:"metadata,omitempty"`

	Spec   NameServiceSpec   `json:"spec,omitempty"`
	Status NameServiceStatus `json:"status,omitempty"`
}

// +kubebuilder:object:root=true

// NameServiceList contains a list of NameService
type NameServiceList struct {
	metav1.TypeMeta `json:",inline"`
	metav1.ListMeta `json:"metadata,omitempty"`
	Items           []NameService `json:"items"`
}

func init() {
	SchemeBuilder.Register(&NameService{}, &NameServiceList{})
}

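There is not much to say about this file: it defines the desired state of a NameService (the Spec) and its observed state in the cluster (the Status).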
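The effect of the json tags can be seen with a small standalone sketch. `miniSpec` is a hypothetical, trimmed-down copy of NameServiceSpec that uses only standard-library types, and the image name in `main` is made up.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// miniSpec is a hypothetical, trimmed-down copy of NameServiceSpec.
// The field names and tags mirror the real struct; the Kubernetes types are left out.
type miniSpec struct {
	Size             int32             `json:"size"`
	NameServiceImage string            `json:"nameServiceImage"`
	StorageMode      string            `json:"storageMode"`
	HostPath         string            `json:"hostPath"`
	NodeSelector     map[string]string `json:"nodeSelector,omitempty"`
}

// encodeSpec serializes the spec the same way encoding/json would for a custom resource.
func encodeSpec(s miniSpec) string {
	b, err := json.Marshal(s)
	if err != nil {
		return ""
	}
	return string(b)
}

func main() {
	spec := miniSpec{
		Size:             2,
		NameServiceImage: "example/rocketmq-nameserver:latest", // made-up image name
		StorageMode:      "EmptyDir",
	}
	fmt.Println(encodeSpec(spec))
}
```

Fields tagged with `omitempty` (here `nodeSelector`) disappear from the output when unset, while fields without it (here `hostPath`) are always written, even when empty.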

2.2.nameservice_controller

2.2.1.Reconcile method

This is the core of the controller and needs little introduction.

func (r *NameServiceReconciler) Reconcile(ctx context.Context, req ctrl.Request) (ctrl.Result, error) {
    reqLogger := log.WithValues("Request.Namespace", req.Namespace, "Request.Name", req.Name)
    reqLogger.Info("Reconciling NameService")

    // Fetch the NameService instance
    instance := &appsv1s.NameService{}
    err := r.Client.Get(context.TODO(), req.NamespacedName, instance)
    if err != nil {
        if errors.IsNotFound(err) {
            return reconcile.Result{}, nil
        }
        return reconcile.Result{}, err
    }

    // Check if the statefulSet already exists, if not create a new one
    found := &appsv1.StatefulSet{}

    // Build the desired StatefulSet; note that nothing is created in the cluster yet
    dep := r.statefulSetForNameService(instance)

    err = r.Client.Get(context.TODO(), types.NamespacedName{Name: dep.Name, Namespace: dep.Namespace}, found)
    if err != nil && errors.IsNotFound(err) {
        // Not found, so create it
        err = r.Client.Create(context.TODO(), dep)
        if err != nil {
            reqLogger.Error(err, "Failed to create new StatefulSet of NameService", "StatefulSet.Namespace", dep.Namespace, "StatefulSet.Name", dep.Name)
            return reconcile.Result{}, err
        }
        // StatefulSet created successfully - return and requeue
        return reconcile.Result{Requeue: true}, nil
    } else if err != nil {
        // Return here: continuing would dereference found.Spec.Replicas, which is nil on an empty object
        reqLogger.Error(err, "Failed to get NameService StatefulSet.")
        return reconcile.Result{}, err
    }

    // Ensure the statefulSet size is the same as the spec
    size := instance.Spec.Size
    if *found.Spec.Replicas != size {
        found.Spec.Replicas = &size
        // The replica count differs from spec.size, so scale the StatefulSet
        err = r.Client.Update(context.TODO(), found)
        if err != nil {
            reqLogger.Error(err, "Failed to update StatefulSet.", "StatefulSet.Namespace", found.Namespace, "StatefulSet.Name", found.Name)
            return reconcile.Result{}, err
        }
        reqLogger.Info("NameService Updated")
    }

    return r.updateNameServiceStatus(instance, req, true)
}

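The logic is as follows: fetch the NameService instance (if it no longer exists, stop); build the desired StatefulSet; if no StatefulSet exists in the cluster yet, create it and requeue; otherwise, if its replica count differs from spec.size, update it; finally, refresh the NameService status.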
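The branch taken on each reconcile pass can be sketched as a pure function. This is a simplified model, not the operator's code: the `action` type and the `decide` function are hypothetical names, and the real method also talks to the API server and updates the status afterwards.

```go
package main

import "fmt"

// action names the step a reconcile pass would take (hypothetical type).
type action string

const (
	actionCreate action = "create StatefulSet and requeue"
	actionScale  action = "update replicas to match spec.size"
	actionNone   action = "StatefulSet already matches the spec"
)

// decide models the branching in Reconcile.
// exists reports whether the StatefulSet was found in the cluster,
// currentReplicas is its replica count, desiredSize is spec.size.
func decide(exists bool, currentReplicas, desiredSize int32) action {
	if !exists {
		return actionCreate
	}
	if currentReplicas != desiredSize {
		return actionScale
	}
	return actionNone
}

func main() {
	fmt.Println(decide(false, 0, 2)) // first pass: nothing exists yet
	fmt.Println(decide(true, 1, 2))  // spec.size was raised from 1 to 2
	fmt.Println(decide(true, 2, 2))  // steady state
}
```

Because the function depends only on observed state and desired state, running it repeatedly converges on the steady state, which is the property a reconcile loop relies on.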

2.2.2.statefulSetForNameService method

This method builds the StatefulSet object, which Reconcile above submits to the cluster in its "not found, so create it" branch.

func (r *NameServiceReconciler) statefulSetForNameService(nameService *appsv1s.NameService) *appsv1.StatefulSet {
    //Convert to this structure
    //{"app": "name_service", "name_service_cr": nameService.Name}
    ls := labelsForNameService(nameService.Name)

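    // If the VolumeClaimTemplate (VCT) has no name, generate a random one.
    // Only one VCT is needed, to mount the log directory, hence index [0].
    // Note: this assumes spec.volumeClaimTemplates has at least one entry; an empty slice would panic here.
    if strings.EqualFold(nameService.Spec.VolumeClaimTemplates[0].Name, "") {
        nameService.Spec.VolumeClaimTemplates[0].Name = uuid.New().String()
    }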

    dep := &appsv1.StatefulSet{
        ObjectMeta: ctrl.ObjectMeta{
            Name:      nameService.Name,
            Namespace: nameService.Namespace,
        },
        Spec: appsv1.StatefulSetSpec{
            Replicas: &nameService.Spec.Size,
            Selector: &metav1.LabelSelector{
                MatchLabels: ls,
            },
            Template: corev1.PodTemplateSpec{
                ObjectMeta: metav1.ObjectMeta{
                    Labels: ls,
                },
                Spec: corev1.PodSpec{
                    ServiceAccountName: nameService.Spec.ServiceAccountName,
                    Affinity:           nameService.Spec.Affinity,
                    Tolerations:        nameService.Spec.Tolerations,
                    NodeSelector:       nameService.Spec.NodeSelector,
                    PriorityClassName:  nameService.Spec.PriorityClassName,
                    HostNetwork:        nameService.Spec.HostNetwork,
                    DNSPolicy:          nameService.Spec.DNSPolicy,
                    ImagePullSecrets:   nameService.Spec.ImagePullSecrets,
                    Containers: []corev1.Container{{
                        Resources:       nameService.Spec.Resources,
                        Image:           nameService.Spec.NameServiceImage,
                        Name:            "name-service",
                        ImagePullPolicy: nameService.Spec.ImagePullPolicy,
                        Env:             nameService.Spec.Env,
                        Ports: []corev1.ContainerPort{{
                            ContainerPort: constants.NameServiceMainContainerPort,
                            Name:          constants.NameServiceMainContainerPortName,
                        }},
                        VolumeMounts: []corev1.VolumeMount{{
                            MountPath: constants.LogMountPath,
                            Name:      nameService.Spec.VolumeClaimTemplates[0].Name,
                            SubPath:   constants.LogSubPathName,
                        }},
                        // Security context for the container
                        SecurityContext: getContainerSecurityContext(nameService),
                    }},
                    Volumes:         getVolumes(nameService),
                    SecurityContext: getPodSecurityContext(nameService),
                },
            },
            VolumeClaimTemplates: getVolumeClaimTemplates(nameService),
        },
    }

    // Set the NameService instance as the owner and controller.
    // This ensures that when the NameService resource is deleted, Kubernetes automatically garbage-collects the StatefulSet (dep) it controls.
    controllerutil.SetControllerReference(nameService, dep, r.Scheme)

    return dep
}

The key part here is storage, namely the getVolumes and getVolumeClaimTemplates functions.

  1. getVolumes
// getVolumes creates the pod volumes that match the storage mode
func getVolumes(nameService *appsv1s.NameService) []corev1.Volume {
    switch nameService.Spec.StorageMode {
    case constants.StorageModeStorageClass:
        // The volume comes from the VolumeClaimTemplates, so no pod volume is needed
        return nil
    case constants.StorageModeEmptyDir:
        // Create a temporary (emptyDir) volume
        volumes := []corev1.Volume{{
            Name: nameService.Spec.VolumeClaimTemplates[0].Name,
            VolumeSource: corev1.VolumeSource{
                EmptyDir: &corev1.EmptyDirVolumeSource{},
            },
        }}
        return volumes
    case constants.StorageModeHostPath:
        fallthrough
    default:
        // Create a local (hostPath) volume
        volumes := []corev1.Volume{{
            Name: nameService.Spec.VolumeClaimTemplates[0].Name,
            VolumeSource: corev1.VolumeSource{
                HostPath: &corev1.HostPathVolumeSource{
                    Path: nameService.Spec.HostPath,
                },
            },
        }}
        return volumes
    }
}

  2. getVolumeClaimTemplates
// getVolumeClaimTemplates returns the PVC templates when the storage mode is StorageClass
func getVolumeClaimTemplates(nameService *appsv1s.NameService) []corev1.PersistentVolumeClaim {
    switch nameService.Spec.StorageMode {
    case constants.StorageModeStorageClass:
        return nameService.Spec.VolumeClaimTemplates
    case constants.StorageModeEmptyDir, constants.StorageModeHostPath:
        fallthrough
    default:
        return nil
    }
}

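To sum up: for the local (HostPath) or temporary (EmptyDir) modes, a volume of the corresponding type is created directly in the pod spec; for the StorageClass mode, the volume is provisioned through the StorageClass specified in the PVC template.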
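The way the two functions complement each other can be sketched as a single lookup. The mode strings are taken from the comment on the StorageMode field; the `storagePlan` type and the `planStorage` function are hypothetical names used only for illustration.

```go
package main

import "fmt"

// Mode values as listed in the StorageMode field comment.
const (
	storageModeStorageClass = "StorageClass"
	storageModeEmptyDir     = "EmptyDir"
	storageModeHostPath     = "HostPath"
)

// storagePlan is a hypothetical summary of what the two functions return together.
type storagePlan struct {
	PodVolume string // kind of volume placed in the pod spec, "" if none
	UsesPVC   bool   // whether the VolumeClaimTemplates are passed to the StatefulSet
}

// planStorage mirrors the switch statements of getVolumes and getVolumeClaimTemplates.
func planStorage(mode string) storagePlan {
	switch mode {
	case storageModeStorageClass:
		return storagePlan{PodVolume: "", UsesPVC: true}
	case storageModeEmptyDir:
		return storagePlan{PodVolume: "emptyDir"}
	case storageModeHostPath:
		fallthrough
	default:
		// Any unrecognized mode is treated like HostPath, as in the original code.
		return storagePlan{PodVolume: "hostPath"}
	}
}

func main() {
	for _, mode := range []string{"StorageClass", "EmptyDir", "HostPath", "typo"} {
		fmt.Printf("%-12s -> %+v\n", mode, planStorage(mode))
	}
}
```

One consequence worth noticing is the default branch: a misspelled storage mode silently falls back to hostPath rather than being rejected.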

2.2.3.updateNameServiceStatus

func (r *NameServiceReconciler) updateNameServiceStatus(instance *appsv1s.NameService, request reconcile.Request, requeue bool) (reconcile.Result, error) {
    reqLogger := log.WithValues("Request.Namespace", request.Namespace, "Request.Name", request.Name)
    reqLogger.Info("Check the NameServers status")
    // List the pods for this nameService's statefulSet
    podList := &corev1.PodList{}
    labelSelector := labels.SelectorFromSet(labelsForNameService(instance.Name))
    listOps := &client.ListOptions{
        Namespace:     instance.Namespace,
        LabelSelector: labelSelector,
    }

    err := r.Client.List(context.TODO(), podList, listOps)
    if err != nil {
        reqLogger.Error(err, "Failed to list pods.", "NameService.Namespace", instance.Namespace, "NameService.Name", instance.Name)
        return reconcile.Result{Requeue: true}, err
    }
    // Collect the IPs of all pods
    hostIps := getNameServers(podList.Items)

    // Sort both lists so they can be compared
    sort.Strings(hostIps)
    sort.Strings(instance.Status.NameServices)

    // Build a string such as
    // 192.168.1.1:9876;192.168.1.2:9876;192.168.1.3:9876;
    nameServerListStr := ""
    for _, value := range hostIps {
        nameServerListStr = nameServerListStr + value + ":9876;"
    }

    // Update status.NameServers if needed
    // The two slices differ, so the status must be updated
    if !reflect.DeepEqual(hostIps, instance.Status.NameServices) {
        // Build the previous NameServer list string from the current status
        oldNameServerListStr := ""
        for _, value := range instance.Status.NameServices {
            oldNameServerListStr = oldNameServerListStr + value + ":9876;"
        }

        // Copy the new list without its trailing ";" (TrimSuffix also handles an empty list, where slicing would panic)
        share.NameServersStr = strings.TrimSuffix(nameServerListStr, ";")
        reqLogger.Info("share.NameServersStr:" + share.NameServersStr)

        // An old list no longer than MinIpListLength (8 per the original note) cannot hold a valid IP, so treat it as unset
        if len(oldNameServerListStr) <= constants.MinIpListLength {
            oldNameServerListStr = share.NameServersStr
        } else if len(share.NameServersStr) > constants.MinIpListLength {
            // Normal case: strip the trailing ";" and mark the list as updated
            oldNameServerListStr = oldNameServerListStr[:len(oldNameServerListStr)-1]
            share.IsNameServersStrUpdated = true
        }
        reqLogger.Info("oldNameServerListStr:" + oldNameServerListStr)
        // Write the new IP list to the status
        instance.Status.NameServices = hostIps
        err := r.Client.Status().Update(context.TODO(), instance)
        // Update the NameServers status with the host ips
        reqLogger.Info("Updated the NameServers status with the host IP")
        if err != nil {
            reqLogger.Error(err, "Failed to update NameServers status of NameService.")
            return reconcile.Result{Requeue: true}, err
        }

        // Use the admin tool to update the broker config,
        // i.e. rewrite the NameServer address in the brokers' configuration through mqadmin
        if share.IsNameServersStrUpdated && (len(oldNameServerListStr) > constants.MinIpListLength) &&
        (len(share.NameServersStr) > constants.MinIpListLength) {

            mqAdmin := constants.AdminToolDir
            subCmd := constants.UpdateBrokerConfig
            key := constants.ParamNameServiceAddress

            reqLogger.Info("share.GroupNum=broker.Spec.Size=" + strconv.Itoa(share.GroupNum))

            clusterName := share.BrokerClusterName
            reqLogger.Info("Updating config " + key + " of cluster" + clusterName)
            command := mqAdmin + " " + subCmd + " -c " + clusterName + " -k " + key + " -n " + oldNameServerListStr + " -v " + share.NameServersStr
            cmd := osexec.Command("sh", mqAdmin, subCmd, "-c", clusterName, "-k", key, "-n", oldNameServerListStr, "-v", share.NameServersStr)
            output, err := cmd.Output()
            if err != nil {
                reqLogger.Error(err, "Update Broker config "+key+" failed of cluster "+clusterName+", command: "+command)
                return reconcile.Result{Requeue: true}, err
            }
            reqLogger.Info("Successfully updated Broker config " + key + " of cluster " + clusterName + ", command: " + command + ", with output: " + string(output))
        }
    }

    // Print NameServers IP
    for i, value := range instance.Status.NameServices {
        reqLogger.Info("NameService IP[" + strconv.Itoa(i) + "]: " + value)
    }

    runningNameServerNum := getRunningNameServersNum(podList.Items)
    if runningNameServerNum == instance.Spec.Size {
        share.IsNameServersStrInitialized = true
        share.NameServersStr = nameServerListStr //reassign if operator restarts
    }

    reqLogger.Info("Share variables", "GroupNum", share.GroupNum,
                   "NameServersStr", share.NameServersStr, "IsNameServersStrUpdated", share.IsNameServersStrUpdated,
                   "IsNameServersStrInitialized", share.IsNameServersStrInitialized, "BrokerClusterName", share.BrokerClusterName)

    if requeue {
        return reconcile.Result{Requeue: true, RequeueAfter: time.Duration(constants.RequeueIntervalInSecond) * time.Second}, nil
    }

    return reconcile.Result{}, nil
}

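The logic here keeps the NameService IP list, that is, the list of all pod IP addresses, up to date; when the list changes, the NameServer address configured on the brokers is updated as well. The broker pods still have to be restarted, though, which is covered in the Broker article.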
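The list comparison and string building can be isolated in a small sketch. Note that it swaps in `strings.Join` for the original append-then-trim approach, so it never produces a trailing ";". The function names `joinNameServers` and `changed` are hypothetical, and the IPs are the sample values from the code comment.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// nameServerPort is the port hard-coded in the original status update logic.
const nameServerPort = "9876"

// joinNameServers turns pod IPs into the "ip:9876;ip:9876" form.
// It sorts a copy of the input, so the caller's slice is left untouched.
func joinNameServers(ips []string) string {
	sorted := append([]string(nil), ips...)
	sort.Strings(sorted)
	parts := make([]string, 0, len(sorted))
	for _, ip := range sorted {
		parts = append(parts, ip+":"+nameServerPort)
	}
	return strings.Join(parts, ";")
}

// changed reports whether the two IP sets differ, ignoring order.
func changed(oldIPs, newIPs []string) bool {
	return joinNameServers(oldIPs) != joinNameServers(newIPs)
}

func main() {
	old := []string{"192.168.1.2", "192.168.1.1"}
	cur := []string{"192.168.1.1", "192.168.1.2", "192.168.1.3"}
	if changed(old, cur) {
		// In the operator, these two strings become the -n and -v arguments of the admin command.
		fmt.Println("old:", joinNameServers(old))
		fmt.Println("new:", joinNameServers(cur))
	}
}
```

Sorting before comparing matters because pods are listed in no guaranteed order; without it, the same set of IPs could look like a change on every pass.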

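Summary

I think it all boils down to one sentence: newly scaled-out NameServers are automatically discovered by all Brokers. (Only half of that is implemented here, namely updating the NameServer address on the brokers; the other half, restarting the brokers, is implemented in the broker controller.)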

References

GitHub - apache/rocketmq-operator: Apache RocketMQ Operator

Getting to Know RocketMQ | RocketMQ

RocketMQ Operator: An Automated Deployment Tool for the Kubernetes Platform (Alibaba Cloud Developer Community)