K8S Study Notes - Pod Priority Scheduling


So-called "priority" scheduling means deploying the more important Pods first; when resources are tight, it can even evict less important Pods to free up resources for the important ones.

Priority scheduling is enabled by declaring a PriorityClass and then referencing its name through the priorityClassName field in the Pod spec.
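As a quick sketch of the two pieces (the names here are placeholders; a complete walkthrough follows in the steps below):

apiVersion: scheduling.k8s.io/v1
kind: PriorityClass
metadata:
  name: important            # placeholder name
value: 1000                  # larger value = higher priority
---
apiVersion: v1
kind: Pod
metadata:
  name: my-app               # placeholder name
spec:
  priorityClassName: important   # references the PriorityClass above
  containers:
  - name: app
    image: nginx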

So how do we declare that one Pod is "more important" than the others?

"Importance" can be defined along several dimensions:

  1. Priority
  2. QoS (Quality of Service) class
  3. Other system-defined metrics

Priority-based preemptive scheduling takes two forms: eviction and preemption. The two apply in different scenarios but have the same effect.

Eviction is a behavior of the kubelet process. When a node runs short of resources, the kubelet on that node weighs the Pods' priorities, resource requests, and actual usage to decide which Pods should be evicted. When several eviction candidates share the same priority, the Pod whose actual usage exceeds its request by the largest factor is evicted first.
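Because this ranking compares actual usage against what a Pod requested, the resources.requests section of the Pod spec directly influences which Pod goes first. A minimal sketch (the name and request values are arbitrary):

apiVersion: v1
kind: Pod
metadata:
  name: memory-hungry        # hypothetical Pod
spec:
  containers:
  - name: app
    image: nginx
    resources:
      requests:
        memory: "64Mi"       # if actual usage far exceeds this request,
        cpu: "100m"          # the Pod ranks near the top of the eviction list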

Preemption is a behavior of the Scheduler. When a new Pod cannot be scheduled because resources are insufficient, the scheduler may choose to evict some lower-priority Pods so that this Pod can be placed.
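During preemption the scheduler records the node it has chosen in the pending Pod's status.nominatedNodeName field, which is what the NOMINATED NODE column of kubectl get pods -o wide displays. One way to inspect it directly (assuming the jsonpath output format is available):

$ kubectl get pod <pod-name> -o jsonpath='{.status.nominatedNodeName}'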

PriorityClass

type PriorityClass struct {
    metav1.TypeMeta `json:",inline"`
    metav1.ObjectMeta `json:"metadata,omitempty"`

    // The priority value.
    Value int32 `json:"value"`

    // globalDefault indicates whether this PriorityClass should be used as
    // the default priority for Pods that do not set one.
    // Only one PriorityClass may set this to true;
    // if several do, the one with the smallest value is used.
    GlobalDefault bool `json:"globalDefault,omitempty"`

    // A free-form description for recording useful information.
    Description string `json:"description,omitempty"`

    // Whether Pods of this class preempt lower-priority Pods.
    // Defaults to PreemptLowerPriority.
    PreemptionPolicy *apiv1.PreemptionPolicy `json:"preemptionPolicy,omitempty"`
}

type PreemptionPolicy string
const (
    // Preempt other lower-priority Pods.
    PreemptLowerPriority PreemptionPolicy = "PreemptLowerPriority"

    // Never preempt other lower-priority Pods.
    PreemptNever PreemptionPolicy = "Never"
)
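As an example of the preemptionPolicy field, a PriorityClass that lets its Pods jump ahead in the scheduling queue but never evicts running Pods might look like the following sketch (the name is arbitrary):

apiVersion: scheduling.k8s.io/v1
kind: PriorityClass
metadata:
  name: high-priority-nonpreempting   # hypothetical name
value: 100
preemptionPolicy: Never               # schedule ahead of lower-priority Pods, but never evict them
description: "High priority without preemption."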

Configuration Example

  1. Create two PriorityClasses, low-priority and high-priority
  2. Create a Pod that uses low-priority and carries the label priority=low
  3. Create a Pod that uses high-priority and must not share a node with the priority=low Pod
  4. Check whether the high-priority Pod "squeezes out" the low-priority Pod

Step - 1 Create two PriorityClasses, low-priority and high-priority

Configuration file priority.yaml

apiVersion: scheduling.k8s.io/v1
kind: PriorityClass
metadata:
  name: low-priority
value: 10
globalDefault: false
description: "Low-priority Pods"
---
apiVersion: scheduling.k8s.io/v1
kind: PriorityClass
metadata:
  name: high-priority
value: 100
globalDefault: false
description: "High-priority Pods"

$ kubectl create -f priority.yaml
priorityclass.scheduling.k8s.io/low-priority created
priorityclass.scheduling.k8s.io/high-priority created
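To confirm that both objects were created, you can list them (the exact output columns vary with the kubectl version):

$ kubectl get priorityclass low-priority high-priority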

Step - 2 Create a Pod that uses low-priority

Configuration file nginx-low.yaml

apiVersion: v1
kind: Pod
metadata:
  name: nginx-low
  labels:
    priority: low
spec:
  priorityClassName: low-priority
  containers:
  - name: nginx
    image: nginx
  nodeSelector:
    kubernetes.io/hostname: tx

$ kubectl create -f nginx-low.yaml
pod/nginx-low created

$ kubectl get pods nginx-low -o wide
NAME        READY   STATUS    RESTARTS   AGE   IP          NODE   NOMINATED NODE   READINESS GATES
nginx-low   1/1     Running   0          21s   10.32.0.1   tx     <none>           <none>
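When the Pod is admitted, the priority admission logic resolves priorityClassName into a numeric spec.priority on the Pod object; for nginx-low this should be 10. One quick way to check it (assuming the jsonpath output format is available):

$ kubectl get pod nginx-low -o jsonpath='{.spec.priority}'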

Step - 3 Create a Pod that uses high-priority

Configuration file nginx-high.yaml

apiVersion: v1
kind: Pod
metadata:
  name: nginx-high
spec:
  affinity:
    podAntiAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
      - labelSelector:
          matchExpressions:
          - key: priority
            operator: In
            values:
            - low
        topologyKey: kubernetes.io/hostname
  priorityClassName: high-priority
  containers:
  - name: nginx
    image: nginx
  nodeSelector:
    kubernetes.io/hostname: tx

Create the Pod

$ kubectl create -f nginx-high.yaml
pod/nginx-high created

Step - 4 Check whether the high-priority Pod "squeezes out" the low-priority Pod

$ kubectl get pods -o wide
NAME         READY   STATUS    RESTARTS   AGE   IP       NODE   NOMINATED NODE   READINESS GATES
nginx-high   1/1     Running   0          15s   <none>   tx     <none>           <none>

The low-priority Pod created earlier is gone!

The Pod's event list shows that the high-priority Pod could not initially be placed on the desired node tx because of its anti-affinity rule.

The scheduler also could not move the low-priority Pod elsewhere, since no other node satisfied its constraints.

So the low-priority Pod nginx-low was preempted and removed.

$ kubectl describe pod nginx-high
Events:
  Type     Reason            Age        From               Message
  ----     ------            ----       ----               -------
  Warning  FailedScheduling  <unknown>  default-scheduler  0/4 nodes are available: 1 node(s) didn't match pod affinity/anti-affinity, 1 node(s) didn't match pod anti-affinity rules, 3 node(s) didn't match node selector.
  Warning  FailedScheduling  <unknown>  default-scheduler  0/4 nodes are available: 1 node(s) didn't match pod affinity/anti-affinity, 1 node(s) didn't match pod anti-affinity rules, 3 node(s) didn't match node selector.
  Normal   Scheduled         <unknown>  default-scheduler  Successfully assigned default/nginx-high to tx
  Normal   Pulling           16s        kubelet, tx        Pulling image "nginx"
  Normal   Pulled            12s        kubelet, tx        Successfully pulled image "nginx"
  Normal   Created           12s        kubelet, tx        Created container nginx
  Normal   Started           11s        kubelet, tx        Started container nginx
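
To see the preemption from the victim's side, recent scheduler versions also record a Preempted event on the evicted Pod; one way to look for it (event reasons can vary across Kubernetes versions):

$ kubectl get events --field-selector reason=Preempted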