Kubernetes Healthcheck
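The health of a Pod is checked with two kinds of probes: LivenessProbe and ReadinessProbe.
- LivenessProbe: determines whether the container is alive. If the probe finds the container unhealthy, kubelet kills the container and then acts according to the container's restart policy. If a container defines no LivenessProbe, kubelet treats the probe result as always being Success.
- ReadinessProbe: determines whether the container has finished starting (is in the ready state) and can accept requests. If the probe fails, the Pod's status is changed: the Pod is marked not Ready and is removed from the endpoints of the Services that select it.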
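To make the difference between the two probes concrete, here is a minimal sketch of a container that declares both. The Pod name and the image are assumptions for illustration and do not come from the original post; the httpGet handler it uses is one of the handler types covered in the rest of this post.

```yaml
# Sketch only: the Pod name and image are illustrative assumptions.
apiVersion: v1
kind: Pod
metadata:
  name: probe-demo              # hypothetical name
spec:
  containers:
  - name: web
    image: nginx:1.25           # assumed image that serves / on port 80
    ports:
    - containerPort: 80
    livenessProbe:              # on failure, kubelet restarts the container
      httpGet:
        path: /
        port: 80
    readinessProbe:             # on failure, the Pod is marked not Ready
      httpGet:                  # and stops receiving Service traffic
        path: /
        port: 80
```

Both probes hit the same endpoint here only to keep the sketch short; in practice they often check different things, since a failed liveness check restarts the container while a failed readiness check only takes it out of rotation.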
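A probe can be implemented in one of three ways:
1) ExecAction: runs a command inside the container; an exit code of 0 means the container is healthy. In the example below, the first probe runs 15 seconds after the container starts and each probe times out after 1 second. A probe that times out counts as a failure, and once the failures reach failureThreshold, kubelet restarts the container.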

    kind: ...
    spec:
      containers:
      - name: ..
        livenessProbe:
          exec:
            command:
            - cat
            - /tmp/health
          initialDelaySeconds: 15
          timeoutSeconds: 1

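One way to watch an exec probe fail is the sketch below. The Pod name, the busybox image, and the shell command are assumptions added for illustration; only the probe settings mirror the example above. The container creates /tmp/health, removes it after 30 seconds, and keeps running, so `cat /tmp/health` starts returning a non-zero exit code.

```yaml
# Sketch only: name, image, and args are illustrative assumptions.
apiVersion: v1
kind: Pod
metadata:
  name: liveness-exec-demo      # hypothetical name
spec:
  containers:
  - name: demo
    image: busybox:1.36         # assumed image providing sh, touch, cat
    args:
    - /bin/sh
    - -c
    # Healthy for the first 30 s, then the file disappears.
    - touch /tmp/health; sleep 30; rm -f /tmp/health; sleep 600
    livenessProbe:
      exec:
        command:
        - cat
        - /tmp/health           # exit code 0 only while the file exists
      initialDelaySeconds: 15
      timeoutSeconds: 1
```

After applying the manifest, `kubectl describe pod liveness-exec-demo` should, after a while, list failed liveness probes in the events, followed by a container restart.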
2) TCPSocketAction: performs a TCP check against the container's IP address and port; if a TCP connection can be established, the container is considered healthy.

    ...
    spec:
      containers:
      - name:
        ...
        livenessProbe:
          tcpSocket:
            port: 80
          initialDelaySeconds: 30
          timeoutSeconds: 1

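3) HTTPGetAction: issues an HTTP GET request against the container's IP address, port, and path; a response status code of at least 200 and below 400 means the container is healthy. In the example below, the request goes to port 80 of the Pod with the path /_status/healthz.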

    ...
    spec:
      containers:
      - ports:
        ...
        livenessProbe:
          httpGet:
            path: /_status/healthz
            port: 80
          initialDelaySeconds: 30
          timeoutSeconds: 1

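The probe parameters have the following meanings:
- initialDelaySeconds: how long to wait after the container has started before the first health check runs, in seconds.
- periodSeconds: how often the check runs; defaults to 10 seconds, minimum 1 second.
- timeoutSeconds: how long to wait for a response after the check is sent; defaults to 1 second, minimum 1 second. A check that times out counts as a failure.
- successThreshold: the number of consecutive successes required, after a failure, for the check to be considered successful again; defaults to 1 and must be 1 for liveness probes.
- failureThreshold: the number of consecutive failures required, after a success, for the check to be considered failed; defaults to 3, minimum 1. When a liveness probe reaches this threshold, kubelet concludes that the container can no longer serve requests and restarts it.

Attributes of httpGet:
- host: host name or IP to connect to; defaults to the Pod IP
- scheme: connection type, HTTP or HTTPS; defaults to HTTP
- path: request path
- httpHeaders: custom request headers
- port: request port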
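The parameters and httpGet attributes above can be combined as in the following sketch. The Pod name, the image, and the header name and value are hypothetical and only serve the illustration; the timing values are the defaults listed above except for initialDelaySeconds.

```yaml
# Sketch only: names, image, and header are illustrative assumptions.
apiVersion: v1
kind: Pod
metadata:
  name: http-probe-demo         # hypothetical name
spec:
  containers:
  - name: web
    image: nginx:1.25           # assumed image that serves / on port 80
    livenessProbe:
      httpGet:
        scheme: HTTP            # the default, written out for clarity
        path: /
        port: 80
        httpHeaders:
        - name: X-Probe-Source  # hypothetical custom header
          value: kubelet
        # host is omitted, so the request goes to the Pod IP
      initialDelaySeconds: 5    # wait 5 s after container start
      periodSeconds: 10         # then probe every 10 s
      timeoutSeconds: 1         # no answer within 1 s counts as a failure
      successThreshold: 1       # must be 1 for a liveness probe
      failureThreshold: 3       # 3 consecutive failures trigger a restart
```

With these values, a container that stops responding is restarted after roughly failureThreshold × periodSeconds, that is, on the order of 30 seconds of consecutive failures.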
The field definitions are as follows:

    $ kubectl explain pod.spec.containers.livenessProbe
    KIND:     Pod
    VERSION:  v1

    RESOURCE: livenessProbe <Object>

    DESCRIPTION:
         Periodic probe of container liveness. Container will be restarted if the
         probe fails. Cannot be updated. More info:
         https://kubernetes.io/docs/concepts/workloads/pods/pod-lifecycle#container-probes

         Probe describes a health check to be performed against a container to
         determine whether it is alive or ready to receive traffic.

    FIELDS:
       exec <Object>
         One and only one of the following should be specified. Exec specifies the
         action to take.

       failureThreshold <integer>
         Minimum consecutive failures for the probe to be considered failed after
         having succeeded. Defaults to 3. Minimum value is 1.

       httpGet <Object>
         HTTPGet specifies the http request to perform.

       initialDelaySeconds <integer>
         Number of seconds after the container has started before liveness probes
         are initiated. More info:
         https://kubernetes.io/docs/concepts/workloads/pods/pod-lifecycle#container-probes

       periodSeconds <integer>
         How often (in seconds) to perform the probe. Default to 10 seconds. Minimum
         value is 1.

       successThreshold <integer>
         Minimum consecutive successes for the probe to be considered successful
         after having failed. Defaults to 1. Must be 1 for liveness. Minimum value
         is 1.

       tcpSocket <Object>
         TCPSocket specifies an action involving a TCP port. TCP hooks not yet
         supported

       timeoutSeconds <integer>
         Number of seconds after which the probe times out. Defaults to 1 second.
         Minimum value is 1. More info:
         https://kubernetes.io/docs/concepts/workloads/pods/pod-lifecycle#container-probes

startupProbe
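startupProbe is a newer addition. It mainly addresses containers being restarted by livenessProbe while they are still starting, either because the service cannot come up normally or because it simply takes a long time to start. While a startupProbe is defined and has not yet succeeded, the liveness and readiness probes are not run; if the startupProbe itself fails failureThreshold times in a row, kubelet kills the container and the restart policy applies.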

    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: busybox-lifecycles-nginx-sleep
    spec:
      replicas: 2
      selector:
        matchLabels:
          app: busybox-lifecycles-nginx-sleep
          env-o: nginx-sleep
      template:
        metadata:
          labels:
            app: busybox-lifecycles-nginx-sleep
            env-o: nginx-sleep
          annotations:
            consul.hashicorp.com/connect-inject: "true"
        spec:
          terminationGracePeriodSeconds: 120
          containers:
          - image: slzcc/terminal-ctl:ubuntu-20.04
            imagePullPolicy: Always
            command:
            - nginx
            - -g
            - daemon off;
            name: busybox
            livenessProbe:
              exec:
                command:
                - /bin/bash
                - -c
                - nc -z -v -n 127.0.0.1 80
              failureThreshold: 10
              initialDelaySeconds: 5
              periodSeconds: 10
              successThreshold: 1
              timeoutSeconds: 5
            readinessProbe:
              exec:
                command:
                - /bin/bash
                - -c
                - nc -z -v -n 127.0.0.1 80
              initialDelaySeconds: 5
              periodSeconds: 10
            startupProbe:
              httpGet:
                path: /
                # 801 as in the source; the other probes and the Service use port 80
                port: 801
              failureThreshold: 3
              periodSeconds: 10
          restartPolicy: Always
    ---
    apiVersion: v1
    kind: Service
    metadata:
      name: busybox-lifecycles-nginx-sleep
      labels:
        app: busybox-lifecycles-nginx-sleep
    spec:
      ports:
      - name: http
        port: 80
        targetPort: 80
        protocol: TCP
      selector:
        app: busybox-lifecycles-nginx-sleep

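The time a container is given to start is roughly failureThreshold × periodSeconds of its startupProbe. The sketch below uses assumed values (30 × 10 s = 300 s) to give a slow service up to five minutes before kubelet gives up; the Pod name, image, and port are hypothetical and not taken from the manifest above.

```yaml
# Sketch only: name, image, and values are illustrative assumptions.
apiVersion: v1
kind: Pod
metadata:
  name: slow-start-demo         # hypothetical name
spec:
  containers:
  - name: web
    image: nginx:1.25           # assumed image that serves / on port 80
    startupProbe:
      httpGet:
        path: /
        port: 80
      failureThreshold: 30      # up to 30 failed attempts ...
      periodSeconds: 10         # ... 10 s apart: about 300 s to start
    livenessProbe:              # not run until startupProbe has succeeded
      httpGet:
        path: /
        port: 80
      periodSeconds: 10
      failureThreshold: 3       # after startup, react within about 30 s
```

Splitting the budget this way keeps the liveness probe tight for steady-state operation, without having to raise initialDelaySeconds or failureThreshold on the liveness probe just to survive a slow start.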