Installing Metrics Server on Kubernetes


1. Background

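On a self-built k8s cluster, the kubectl top command does not work: neither kubectl top node nor kubectl top pod can show node or pod resource usage, so there is no precise way to see how many resources a pod consumes. Running kubectl top pod fails with:

Error from server (NotFound): the server could not find the requested resource (get services http:heapster:)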

Cause: Metrics Server is not installed. Metrics Server collects the node and pod resource usage data, so kubectl top node and kubectl top pod only work once it is installed.
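One hedged way to confirm this diagnosis is to check whether the Metrics API is registered with the API server. The sketch below assumes the APIService name v1beta1.metrics.k8s.io, which Metrics Server registers; the function name metrics_api_status is hypothetical and not part of any tool.

```shell
# Hypothetical helper: report whether the Metrics API is registered
# with the API server.
metrics_api_status() {
  # The APIService object v1beta1.metrics.k8s.io only exists once a
  # metrics provider such as Metrics Server has been installed.
  if kubectl get apiservice v1beta1.metrics.k8s.io >/dev/null 2>&1; then
    echo "registered"
  else
    echo "missing"
  fi
}
```

Calling metrics_api_status on a cluster without Metrics Server should print "missing". Note that "registered" only means the object exists; the AVAILABLE column of kubectl get apiservice shows whether the backend is actually healthy.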

2. Solution

1. Metrics Server installation references

kubernetes-sigs.github.io/metrics-ser…
github.com/kubernetes-… metrics-server helm chart configuration

When installing Metrics Server, check its compatibility with your k8s version.

helm repo add metrics-server https://kubernetes-sigs.github.io/metrics-server/
helm search repo metrics-server 
NAME                            CHART VERSION   APP VERSION     DESCRIPTION                                       
metrics-server/metrics-server   3.11.0          0.6.4           Metrics Server is a scalable, efficient source ...
[root@aivpp et-industry-prometheus]# helm search repo metrics-server --versions
NAME                            CHART VERSION   APP VERSION     DESCRIPTION                                       
metrics-server/metrics-server   3.11.0          0.6.4           Metrics Server is a scalable, efficient source ...
metrics-server/metrics-server   3.10.0          0.6.3           Metrics Server is a scalable, efficient source ...
metrics-server/metrics-server   3.9.0           0.6.3           Metrics Server is a scalable, efficient source ...
metrics-server/metrics-server   3.8.4           0.6.2           Metrics Server is a scalable, efficient source ...
metrics-server/metrics-server   3.8.3           0.6.2           Metrics Server is a scalable, efficient source ...
metrics-server/metrics-server   3.8.2           0.6.1           Metrics Server is a scalable, efficient source ...
metrics-server/metrics-server   3.8.1           0.6.1           Metrics Server is a scalable, efficient source ...
metrics-server/metrics-server   3.8.0           0.6.0           Metrics Server is a scalable, efficient source ...
metrics-server/metrics-server   3.7.0           0.5.2           Metrics Server is a scalable, efficient source ...
metrics-server/metrics-server   3.6.0           0.5.1           Metrics Server is a scalable, efficient source ...
metrics-server/metrics-server   3.5.0           0.5.0           Metrics Server is a scalable, efficient source ...

2. Installing the Metrics Server component

Note: the local k8s cluster runs version 1.18, which needs Metrics Server 0.5.x for compatibility. Version used: chart metrics-server/metrics-server 3.7.0 (app version 0.5.2)
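The version choice above can be sketched as a small lookup. This is only an illustration built from the two data points in this article (k8s 1.18 with chart 3.7.0, and chart 3.11.0 requiring k8s 1.19+); it is not the official compatibility matrix, and the function name pick_chart_version is hypothetical.

```shell
# Hypothetical helper: map a Kubernetes version such as "1.18" or "1.18.20"
# to a metrics-server chart version, using only the data points in this article.
pick_chart_version() {
  minor="${1#*.}"        # "1.18.20" -> "18.20"
  minor="${minor%%.*}"   # "18.20"   -> "18"
  if [ "$minor" -ge 19 ]; then
    echo "3.11.0"        # app version 0.6.4, needs k8s 1.19+
  else
    echo "3.7.0"         # app version 0.5.2, used here for k8s 1.18
  fi
}
```

For example, pick_chart_version 1.18 selects the 3.7.0 chart used in the rest of this article. Check the upstream compatibility notes before relying on it for other versions.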

1. Download the helm chart: metrics-server-3.7.0.tgz

helm fetch kubernetes-sigs.github.io/metrics-ser…metrics-server/metrics-server --version=3.7.0
tar -zxvf metrics-server-3.7.0.tgz
helm upgrade --install metrics-server metrics-server/metrics-server

2. Modify the helm chart configuration: add the namespace

metrics-server-3.7.0

vi templates/serviceaccount.yaml

{{- if .Values.serviceAccount.create -}}
apiVersion: v1
kind: ServiceAccount
metadata:
  name: {{ template "metrics-server.serviceAccountName" . }}
  namespace: {{ .Release.Namespace }}
  {{- with .Values.serviceAccount.annotations }}
  annotations:
    {{- toYaml . | nindent 4 }}
  {{- end }}
  labels:
    {{- include "metrics-server.labels" . | nindent 4 }}
{{- end -}}

vi templates/service.yaml

apiVersion: v1
kind: Service
metadata:
  name: {{ include "metrics-server.fullname" . }}
  namespace: {{ .Release.Namespace }}
  {{- with .Values.service.annotations }}
  annotations:
    {{- toYaml . | nindent 4 }}
  {{- end }}
  labels:
    {{- include "metrics-server.labels" . | nindent 4 }}
  {{- with .Values.service.labels -}}
    {{- toYaml . | nindent 4 }}
  {{- end }}
spec:
  type: {{ .Values.service.type }}
  ports:
    - name: https
      port: {{ .Values.service.port }}
      protocol: TCP
      targetPort: https
  selector:
    {{- include "metrics-server.selectorLabels" . | nindent 4 }}

vi templates/pdb.yaml

{{- if .Values.podDisruptionBudget.enabled -}}
apiVersion: {{ include "metrics-server.pdb.apiVersion" . }}
kind: PodDisruptionBudget
metadata:
  name: {{ include "metrics-server.fullname" . }}
  namespace: {{ .Release.Namespace }}
  labels:
    {{- include "metrics-server.labels" . | nindent 4 }}
spec:
  {{- if .Values.podDisruptionBudget.minAvailable }}
  minAvailable: {{ .Values.podDisruptionBudget.minAvailable }}
  {{- end  }}
  {{- if .Values.podDisruptionBudget.maxUnavailable }}
  maxUnavailable: {{ .Values.podDisruptionBudget.maxUnavailable }}
  {{- end  }}
  selector:
    matchLabels:
      {{- include "metrics-server.selectorLabels" . | nindent 6 }}
{{- end -}}

vi templates/deployment.yaml

apiVersion: apps/v1
kind: Deployment
metadata:
  name: {{ include "metrics-server.fullname" . }}
  namespace: {{ .Release.Namespace }}
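After editing the templates above, a quick scan can show which template files still lack the namespace line. The following is a minimal sketch; the function name templates_missing_namespace is hypothetical, and it assumes a grep that supports -L (GNU and BSD grep both do).

```shell
# Hypothetical helper: list template files that do not set
# metadata.namespace from the release namespace.
templates_missing_namespace() {
  dir="$1"
  # grep -L prints the names of files that contain no matching line;
  # -F treats the pattern as a fixed string, so the braces are literal.
  grep -L -F 'namespace: {{ .Release.Namespace }}' "$dir"/*.yaml
}
```

A call such as templates_missing_namespace templates lists the candidates. Cluster-scoped resources (for example ClusterRole templates) will legitimately appear in the output, since they take no namespace.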

vi values.yaml

image:
  # repository: k8s.gcr.io/metrics-server/metrics-server
  repository: registry.cn-hangzhou.aliyuncs.com/google_containers/metrics-server
  # Overrides the image tag whose default is v{{ .Chart.AppVersion }}
  tag: "v0.5.2"
  pullPolicy: IfNotPresent
  
args:
  # default arguments
  - --secure-port=4443
  - --cert-dir=/tmp
  - --kubelet-preferred-address-types=InternalIP,ExternalIP,Hostname
  - --kubelet-use-node-status-port
  - --metric-resolution=15s
  # skip kubelet TLS certificate verification => fix: kubelet HTTPS TLS certificate verification failure
  - --kubelet-insecure-tls
  
resources:
  requests:
    cpu: 100m
    memory: 200Mi
  limits:
    cpu: 200m
    memory: 200Mi

3. Install metrics-server from the helm chart

tar -xvf metrics-server-3.7.0.tar

Install metrics-server:

helm install metrics-server . -n monitor

Problem: the Pod does not come up properly. The metrics-server pod stays in the 0/1 Running state. Error log:

I1011 12:07:51.215241       1 server.go:188] "Failed probe" probe="metric-storage-ready" err="not metrics to serve"
E1011 12:07:57.718982       1 scraper.go:139] "Failed to scrape node" err="Get \"https://192.168.0.22:10250/stats/summary?only_cpu_and_memory=true\": x509: cannot validate certificate for 192.168.0.22 because it doesn't contain any IP SANs" node="aivpp"
I1011 12:08:01.216519       1 server.go:188] "Failed probe" probe="metric-storage-ready" err="not metrics to serve"
I1011 12:08:11.215193       1 server.go:188] "Failed probe" probe="metric-storage-ready" err="not metrics to serve"
E1011 12:08:12.715475       1 scraper.go:139] "Failed to scrape node" err="Get \"https://192.168.0.22:10250/stats/summary?only_cpu_and_memory=true\": x509: cannot validate certificate for 192.168.0.22 because it doesn't contain any IP SANs" node="aivpp"

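Cause: kubelet port 10250 uses HTTPS, so the connection requires TLS certificate verification. As the log shows, the kubelet certificate contains no IP SANs, so verification against the node IP fails.

blog.csdn.net/avatar_2009…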

ssoor.github.io/2020/03/25/…
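To tell this certificate problem apart from other scrape failures, the log can be filtered for the x509 message. A minimal sketch follows; the function name count_ip_san_errors is hypothetical, and the match string is copied from the log output shown in this article.

```shell
# Hypothetical helper: count scrape failures caused by a kubelet certificate
# that lacks IP SANs. Reads metrics-server log lines on stdin.
count_ip_san_errors() {
  # grep -c prints the number of matching lines (0 if none);
  # "|| true" keeps the exit status zero when nothing matches.
  grep -c -F "doesn't contain any IP SANs" || true
}
```

It could be fed with something like kubectl logs -n monitor deployment/metrics-server | count_ip_san_errors, assuming the Deployment is named metrics-server after the release. A non-zero count points at the certificate issue described above.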

Fix: skip kubelet TLS certificate verification.

vi values.yaml

args: 
  # default arguments
  - --secure-port=4443
  - --cert-dir=/tmp
  - --kubelet-preferred-address-types=InternalIP,ExternalIP,Hostname
  - --kubelet-use-node-status-port
  - --metric-resolution=15s
  # skip kubelet TLS certificate verification
  - --kubelet-insecure-tls

Reference: github.com/kubernetes-…

Verify the fix:

Upgrade metrics-server

helm upgrade metrics-server . -n monitor

metrics-server now starts normally, and the kubelet certificate verification failure is fixed.

The kubectl top command now works:

kubectl top pod --sort-by=memory

kubectl top pod --sort-by=cpu

kubectl top node
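Once kubectl top works, its output can be post-processed, for instance to list pods above a memory threshold. The sketch below assumes the usual three-column output (NAME, CPU(cores), MEMORY(bytes)) with memory reported in Mi; the function name pods_over_memory is hypothetical.

```shell
# Hypothetical helper: print the names of pods whose memory usage exceeds
# a threshold given in Mi. Reads `kubectl top pod` output on stdin.
pods_over_memory() {
  limit_mi="$1"
  # Skip the header row; column 3 is MEMORY(bytes), e.g. "45Mi".
  # Only values with an "Mi" suffix are considered in this sketch.
  awk -v limit="$limit_mi" 'NR > 1 && $3 ~ /Mi$/ {
    mem = $3; sub(/Mi$/, "", mem)
    if (mem + 0 > limit + 0) print $1
  }'
}
```

For example, kubectl top pod -n monitor | pods_over_memory 200 would print the pods using more than 200Mi. Do not combine it with --no-headers, because the function always skips the first row.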

Uninstall metrics-server

helm uninstall metrics-server -n monitor

FAQ:

1. metrics-server version compatibility: chart version 3.11.0 is not compatible with k8s 1.18; it requires 1.19+

helm fetch kubernetes-sigs.github.io/metrics-ser… metrics-server/metrics-server --version=3.11.0

tar -zxvf metrics-server-3.11.0.tgz

Install

helm install metrics-server . -n monitor

[root@xxx metrics-server]# helm install metrics-server . -n monitor

The k8s version does not match:

Error: unable to build kubernetes objects from release manifest: error validating "": error validating data: ValidationError(Deployment.spec.template.spec.containers[0].securityContext): unknown field "seccompProfile" in io.k8s.api.core.v1.SecurityContext