k8s Production Cluster Deployment


k8s cluster deployment

The cgroup driver

On Linux, control groups (cgroups) are used to constrain the resources allocated to processes.

Both the kubelet and the underlying container runtime need to interface with cgroups to enforce resource management for Pods and containers and to set requests and limits for resources such as CPU and memory. The kubelet and the container runtime must use the same cgroup driver with the same configuration.

There are two cgroup drivers available:

  • cgroupfs: The cgroupfs driver is the default cgroup driver in the kubelet. With the cgroupfs driver, the kubelet and the container runtime interface directly with the cgroup filesystem to configure cgroups. The cgroupfs driver is not recommended when systemd is the init system, because systemd expects a single cgroup manager on the system. In addition, if you use cgroup v2, use the systemd cgroup driver instead of cgroupfs.
  • systemd: When systemd is chosen as the init system for a Linux distribution, the init process generates and uses a root control group (cgroup) and acts as the cgroup manager. Using the cgroupfs driver alongside it gives the system two different views of the available and in-use resources, and the node can become unstable under resource pressure. If you use cgroup v2, the systemd cgroup driver is recommended, and the recommended way to use cgroup v2 is to run a Linux distribution that enables it by default (a quick check is shown below).
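To tell which cgroup version a node is actually running, a quick check (cgroup2fs indicates cgroup v2, tmpfs indicates cgroup v1):

# print the filesystem type mounted at /sys/fs/cgroup
stat -fc %T /sys/fs/cgroup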

Setting systemd for the kubelet

For the kubelet, edit the cgroupDriver option in the KubeletConfiguration and set it to systemd:

Since version 1.22, if the user does not set the cgroupDriver field in KubeletConfiguration, kubeadm defaults it to systemd.

apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
...
cgroupDriver: systemd

Check cgroupDriver:

$ kubectl get cm kubelet-config -n kube-system -o yaml|grep cgroupDriver
    cgroupDriver: systemd

Setting systemd for containerd

To use the systemd cgroup driver for containerd with runc, set the following in /etc/containerd/config.toml:

[plugins."io.containerd.grpc.v1.cri".containerd.runtimes.runc]
  ...
  [plugins."io.containerd.grpc.v1.cri".containerd.runtimes.runc.options]
    SystemdCgroup = true

Restart containerd:

systemctl restart containerd

Forward IPv4 and let iptables see bridged traffic

cat <<EOF | tee /etc/modules-load.d/k8s.conf
overlay
br_netfilter
EOF

modprobe overlay
modprobe br_netfilter

# sysctl params required by setup; these persist across reboots
cat <<EOF | tee /etc/sysctl.d/k8s.conf
net.bridge.bridge-nf-call-iptables  = 1
net.bridge.bridge-nf-call-ip6tables = 1
net.ipv4.ip_forward                 = 1
EOF

# apply sysctl params without rebooting
sysctl --system

# Fixes the kubeadm preflight warning:
# [WARNING SystemVerification]: missing optional cgroups: hugetlb
# on cgroup v1 hosts, mount the hugetlb controller under its own directory
mkdir -p /sys/fs/cgroup/hugetlb
mount -t cgroup -o hugetlb none /sys/fs/cgroup/hugetlb
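To verify that the br_netfilter and overlay modules are loaded and that the sysctl settings took effect, a quick check:

# both modules should be listed
lsmod | grep -e br_netfilter -e overlay
# all three values should be 1
sysctl net.bridge.bridge-nf-call-iptables net.bridge.bridge-nf-call-ip6tables net.ipv4.ip_forward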

Container runtimes

As of version 1.24, dockershim has been removed from the Kubernetes project.

Below is the setup for a few common container runtimes used with Kubernetes.

Installing containerd

github.com/containerd/…

Install containerd

wget https://github.com/containerd/containerd/releases/download/v1.6.17/containerd-1.6.17-linux-amd64.tar.gz
tar Cxzvf /usr/local containerd-1.6.17-linux-amd64.tar.gz
mkdir -p /usr/local/lib/systemd/system/
curl https://raw.githubusercontent.com/containerd/containerd/main/containerd.service > /usr/local/lib/systemd/system/containerd.service
systemctl daemon-reload
systemctl enable --now containerd

Install runc

wget https://github.com/opencontainers/runc/releases/download/v1.1.4/runc.amd64
install -m 755 runc.amd64 /usr/local/sbin/runc

Install the CNI plugins

wget https://github.com/containernetworking/plugins/releases/download/v1.2.0/cni-plugins-linux-amd64-v1.2.0.tgz
mkdir -p /opt/cni/bin
tar Cxzvf /opt/cni/bin cni-plugins-linux-amd64-v1.2.0.tgz

Configure the cgroup driver

$ containerd config default > /etc/containerd/config.toml
# edit /etc/containerd/config.toml and set SystemdCgroup = true
# under [plugins."io.containerd.grpc.v1.cri".containerd.runtimes.runc.options]
$ systemctl restart containerd
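A minimal sketch of making the same edit non-interactively and confirming the effective setting, assuming the freshly generated default config (which ships with SystemdCgroup = false):

# flip the runc option in place
sed -i 's/SystemdCgroup = false/SystemdCgroup = true/' /etc/containerd/config.toml
systemctl restart containerd
# confirm the running configuration
containerd config dump | grep SystemdCgroup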

Check the version:

$ ctr  -v
ctr github.com/containerd/containerd v1.6.17

Install crictl

VERSION="v1.26.0" # check latest version in /releases page
wget https://github.com/kubernetes-sigs/cri-tools/releases/download/$VERSION/crictl-$VERSION-linux-amd64.tar.gz
sudo tar zxvf crictl-$VERSION-linux-amd64.tar.gz -C /usr/local/bin
rm -f crictl-$VERSION-linux-amd64.tar.gz

If your container runtime has been switched to containerd, point crictl at the containerd endpoint (a quick sanity check is shown after the command). The common endpoints are:

  • unix:///var/run/dockershim.sock
  • unix:///run/containerd/containerd.sock
  • unix:///run/crio/crio.sock
  • unix:///var/run/cri-dockerd.sock # since Kubernetes 1.24+, dockershim has been replaced by cri-dockerd
crictl config runtime-endpoint unix:///var/run/containerd/containerd.sock
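With the endpoint configured, a quick sanity check that crictl can reach the runtime:

# runtime name and version reported over CRI
crictl version
# running containers (empty before the cluster is bootstrapped)
crictl ps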

Continuing to use Docker as the CRI

Kubernetes 1.24+ formally removed dockershim, fully embracing a pure CRI. If we want to keep using Docker with Kubernetes, we must use cri-dockerd as an adapter; it lets us drive Docker Engine through the CRI.


docs.docker.com/engine/inst…

# install docker
apt-get update
mkdir -p /etc/apt/keyrings
apt-get install -y ca-certificates curl gnupg lsb-release
curl -fsSL https://download.docker.com/linux/debian/gpg | gpg --dearmor -o /etc/apt/keyrings/docker.gpg
echo \
  "deb [arch=$(dpkg --print-architecture) signed-by=/etc/apt/keyrings/docker.gpg] https://download.docker.com/linux/debian \
  $(lsb_release -cs) stable" | tee /etc/apt/sources.list.d/docker.list > /dev/null
apt-get update
apt-get install -y docker-ce docker-ce-cli containerd.io docker-buildx-plugin docker-compose-plugin
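Since the kubelet and the runtime must agree on the cgroup driver, it is worth confirming what Docker reports (recent docker-ce packages on systemd hosts usually default to systemd already):

# show Docker's active cgroup driver
docker info 2>/dev/null | grep -i 'cgroup driver'

If it reports cgroupfs, it can be switched by setting "exec-opts": ["native.cgroupdriver=systemd"] in /etc/docker/daemon.json and restarting Docker.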

github.com/Mirantis/cr…

# install the deb package directly
$ wget https://github.com/Mirantis/cri-dockerd/releases/download/v0.3.9/cri-dockerd_0.3.9.3-0.debian-bullseye_amd64.deb
$ sudo dpkg -i cri-dockerd_0.3.9.3-0.debian-bullseye_amd64.deb
# installing cri-dockerd creates symlinks for cri-docker.service and cri-docker.socket
Created symlink /etc/systemd/system/multi-user.target.wants/cri-docker.service → /lib/systemd/system/cri-docker.service.
Created symlink /etc/systemd/system/sockets.target.wants/cri-docker.socket → /lib/systemd/system/cri-docker.socket.
$ crictl config runtime-endpoint unix:///var/run/cri-dockerd.sock

Preparation for creating the cluster

10.x.159.238  kube-api-vip
10.x.159.58   kube-master58
10.x.159.65   kube-master65
10.x.159.66   kube-master66

Configure the docker and kubelet directories

The data directory

The data disk is mounted at /data.

  1. Mount the disk at /data and persist it in /etc/fstab:
UUID=30d7dad7-35c6-4e83-aa80-d89dc155c1f1 /data ext4 relatime,prjquota 0 0
  2. Create a systemd service file (on Debian these usually live under /lib/systemd/system/, on Red Hat under /usr/lib/systemd/system/):

data-make-private.service

[Unit]
Description=Make /data mount propagation be private
DefaultDependencies=false
After=data.mount
Before=local-fs.target

[Service]
Type=oneshot
ExecStart=/bin/mount --make-private /data

[Install]
WantedBy=local-fs.target
  3. Run systemctl daemon-reload && systemctl enable --now data-make-private.service; reboot the machine for it to take effect.

The docker directory

The docker root directory is /var/lib/docker; the data disk is mounted at /data.

  1. mkdir -p /var/lib/docker /data/docker
  2. Create the systemd unit files (on Debian usually under /lib/systemd/system/, on Red Hat under /usr/lib/systemd/system/):

var-lib-docker.mount

[Unit]
Description=Bind mount /data/docker to /var/lib/docker
Requires=var-lib-docker-make-slave.service
After=data-make-private.service
Before=local-fs.target

[Mount]
What=/data/docker
Where=/var/lib/docker
Type=none
Options=defaults,rbind

[Install]
WantedBy=local-fs.target

var-lib-docker-make-slave.service

[Unit]
Description=Make /var/lib/docker mount propagation be slave
DefaultDependencies=false
After=var-lib-docker.mount
Before=local-fs.target

[Service]
Type=oneshot
ExecStart=/bin/mount --make-rslave /var/lib/docker
  3. Run systemctl daemon-reload && systemctl enable var-lib-docker.mount; reboot the machine for it to take effect.

The kubelet directory

The kubelet root directory is /var/lib/kubelet; the data disk is mounted at /data.

  1. mkdir -p /var/lib/kubelet /data/kubelet
  2. Create the systemd unit file (on Debian usually under /lib/systemd/system/, on Red Hat under /usr/lib/systemd/system/):

var-lib-kubelet.mount

[Unit]
Description=Bind mount /data/kubelet to /var/lib/kubelet
After=data-make-private.service
Before=local-fs.target

[Mount]
What=/data/kubelet
Where=/var/lib/kubelet
Type=none
Options=defaults,rbind

[Install]
WantedBy=local-fs.target
  3. Run systemctl daemon-reload && systemctl enable var-lib-kubelet.mount; reboot the machine for it to take effect. A quick way to verify the resulting mounts is shown below.
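After the reboot, the mounts and their propagation flags can be verified with findmnt (a sketch; /data should show private propagation and the two bind targets should be present):

# show target, source and propagation mode for each mount point
for d in /data /var/lib/docker /var/lib/kubelet; do
  findmnt -o TARGET,SOURCE,PROPAGATION "$d"
done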

Install kubelet, kubeadm and kubectl

apt-get install -y apt-transport-https ca-certificates curl
mkdir -p /etc/apt/keyrings/
curl -fsSLo /etc/apt/keyrings/kubernetes-archive-keyring.gpg https://packages.cloud.google.com/apt/doc/apt-key.gpg
echo "deb [signed-by=/etc/apt/keyrings/kubernetes-archive-keyring.gpg] https://apt.kubernetes.io/ kubernetes-xenial main" | tee /etc/apt/sources.list.d/kubernetes.list
apt-get update
apt-get install -y kubelet kubeadm kubectl
apt-mark hold kubelet kubeadm kubectl
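A quick check that the tools are installed and the packages are held at their current versions:

# confirm installed versions
kubeadm version -o short
kubelet --version
kubectl version --client
# kubelet, kubeadm and kubectl should be listed as held
apt-mark showhold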

[Optional] Configure the kubelet cgroup driver via a kubeadm config file and pass it at init time with kubeadm init --config kubeadm-config.yaml:

kind: ClusterConfiguration
apiVersion: kubeadm.k8s.io/v1beta3
kubernetesVersion: v1.21.0 # the Kubernetes version can be customized
---
kind: KubeletConfiguration
apiVersion: kubelet.config.k8s.io/v1beta1
cgroupDriver: systemd # this is the field to set

Create a load balancer for kube-apiserver

Create a load balancer for kube-apiserver with a name that resolves in DNS, or deploy a load balancer yourself: github.com/kubernetes/…

Here we deploy keepalived and haproxy as static Pods.

keepalived

/etc/keepalived/keepalived.conf

global_defs {
    router_id LVS_DEVEL
}
vrrp_script check_apiserver {
  script "/etc/keepalived/check_apiserver.sh"
  interval 3
  weight -2
  fall 10
  rise 2
}

vrrp_instance VI_1 {
    state MASTER # MASTER on exactly one node, BACKUP (with a lower priority) on the others
    interface eth0
    virtual_router_id 238
    priority 101 
    authentication {
        auth_type PASS
        auth_pass passkube
    }
    virtual_ipaddress {
        10.x.159.238
    }
    track_script {
        check_apiserver
    }
}

/etc/keepalived/check_apiserver.sh

#!/bin/sh

errorExit() {
    echo "*** $*" 1>&2
    exit 1
}

curl --silent --max-time 2 --insecure https://localhost:6443/ -o /dev/null || errorExit "Error GET https://localhost:6443/"
if ip addr | grep -q 10.x.159.238; then
    curl --silent --max-time 2 --insecure https://10.x.159.238:6443/ -o /dev/null || errorExit "Error GET https://10.x.159.238:6443/"
fi
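The health-check script must be executable so that keepalived's vrrp_script can run it:

chmod +x /etc/keepalived/check_apiserver.sh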

/etc/kubernetes/manifests/keepalived.yaml

apiVersion: v1
kind: Pod
metadata:
  creationTimestamp: null
  name: keepalived
  namespace: kube-system
spec:
  containers:
  - image: osixia/keepalived:2.0.17
    name: keepalived
    resources: {}
    securityContext:
      capabilities:
        add:
        - NET_ADMIN
        - NET_BROADCAST
        - NET_RAW
    volumeMounts:
    - mountPath: /usr/local/etc/keepalived/keepalived.conf
      name: config
    - mountPath: /etc/keepalived/check_apiserver.sh
      name: check
  hostNetwork: true
  volumes:
  - hostPath:
      path: /etc/keepalived/keepalived.conf
    name: config
  - hostPath:
      path: /etc/keepalived/check_apiserver.sh
    name: check
status: {}

haproxy

/etc/haproxy/haproxy.cfg

#---------------------------------------------------------------------
# Global settings
#---------------------------------------------------------------------
global
    log /dev/log local0
    log /dev/log local1 notice
    daemon

#---------------------------------------------------------------------
# common defaults that all the 'listen' and 'backend' sections will
# use if not designated in their block
#---------------------------------------------------------------------
defaults
    mode                    http
    log                     global
    option                  httplog
    option                  dontlognull
    option http-server-close
    option forwardfor       except 127.0.0.0/8
    option                  redispatch
    retries                 1
    timeout http-request    10s
    timeout queue           20s
    timeout connect         5s
    timeout client          20s
    timeout server          20s
    timeout http-keep-alive 10s
    timeout check           10s

#---------------------------------------------------------------------
# apiserver frontend which proxys to the control plane nodes
#---------------------------------------------------------------------
frontend apiserver
    bind *:16443
    mode tcp
    option tcplog
    default_backend apiserver

#---------------------------------------------------------------------
# round robin balancing for apiserver
#---------------------------------------------------------------------
backend apiserver
    option httpchk GET /healthz
    http-check expect status 200
    mode tcp
    option ssl-hello-chk
    balance     roundrobin
        server kube-master58 10.x.159.58:6443 check
        server kube-master65 10.x.159.65:6443 check
        server kube-master66 10.x.159.66:6443 check

/etc/kubernetes/manifests/haproxy.yaml

apiVersion: v1
kind: Pod
metadata:
  name: haproxy
  namespace: kube-system
spec:
  containers:
  - image: haproxy:2.1.4
    name: haproxy
    livenessProbe:
      failureThreshold: 8
      httpGet:
        host: localhost
        path: /healthz
        port: 16443
        scheme: HTTPS
    volumeMounts:
    - mountPath: /usr/local/etc/haproxy/haproxy.cfg
      name: haproxyconf
      readOnly: true
  hostNetwork: true
  volumes:
  - hostPath:
      path: /etc/haproxy/haproxy.cfg
      type: FileOrCreate
    name: haproxyconf
status: {}
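Once the static Pods are up on the control-plane nodes and the API server is running, the VIP and the haproxy frontend can be probed from any node (a sketch; 10.x.159.238 and port 16443 come from the configs above, and -k skips certificate verification):

# the VIP should be bound on exactly one master
ip addr | grep 10.x.159.238
# the apiserver health endpoint via the load balancer (expect "ok", or a 403 depending on RBAC)
curl -k https://10.x.159.238:16443/healthz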

Initialize the master nodes

Initialize the control plane

# --pod-network-cidr=192.168.0.0/16 matches the Calico defaults
kubeadm init --control-plane-endpoint "10.x.159.238:16443" --upload-certs \
--image-repository registry.aliyuncs.com/google_containers \
--service-cidr=10.140.0.0/16 --pod-network-cidr=192.168.0.0/16 \
--kubernetes-version v1.19.0 # each kubeadm release only supports specific Kubernetes versions
# Found multiple CRI endpoints on the host. Please define which one do you wish to use by setting the 'criSocket' field in the kubeadm configuration file: unix:///var/run/containerd/containerd.sock, unix:///var/run/cri-dockerd.sock
# if you hit the error above, add:
--cri-socket unix:///var/run/cri-dockerd.sock
# and point crictl at the same socket:
crictl config runtime-endpoint unix:///var/run/cri-dockerd.sock

Pulling the required Kubernetes images from inside mainland China

$ kubeadm config images list
registry.k8s.io/kube-apiserver:v1.26.1
registry.k8s.io/kube-controller-manager:v1.26.1
registry.k8s.io/kube-scheduler:v1.26.1
registry.k8s.io/kube-proxy:v1.26.1
registry.k8s.io/pause:3.9
registry.k8s.io/etcd:3.5.6-0
registry.k8s.io/coredns/coredns:v1.9.3
# generate a default init configuration
$ kubeadm config print init-defaults > kubeadm.conf
# point imageRepository at the Aliyun mirror
$ sed -i "s/imageRepository: .*/imageRepository: registry.aliyuncs.com\/google_containers/g" kubeadm.conf
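With the mirror configured, the control-plane images can be pre-pulled in one step (this assumes the CRI socket is already reachable on the node):

# pre-pull all images listed by `kubeadm config images list` through the mirror
kubeadm config images pull --config kubeadm.conf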

A script that pulls the images directly

#!/bin/bash
KUBE_VERSION=v1.18.0
KUBE_PAUSE_VERSION=3.2
ETCD_VERSION=3.4.3-0
DNS_VERSION=1.6.7
username=registry.cn-hangzhou.aliyuncs.com/google_containers

images=(kube-proxy:${KUBE_VERSION}
kube-scheduler:${KUBE_VERSION}
kube-controller-manager:${KUBE_VERSION}
kube-apiserver:${KUBE_VERSION}
pause:${KUBE_PAUSE_VERSION}
etcd:${ETCD_VERSION}
coredns:${DNS_VERSION}
    )

# pull each image from the mirror, re-tag it as k8s.gcr.io, then drop the mirror tag
for image in ${images[@]}
do
    docker pull ${username}/${image}
    docker tag ${username}/${image} k8s.gcr.io/${image}
    docker rmi ${username}/${image}
done

A script to re-tag the images

#!/bin/bash

# collect the images pulled from the mirror registry from the `docker images` output
OLD_IMAGES=$(docker images | grep 'registry.aliyuncs.com' | awk '{print $1 ":" $2}')

# the target registry prefix
NEW_REGISTRY="registry.k8s.io"

# re-tag every mirror image under the target registry
for image in $OLD_IMAGES; do
  # keep the image name and tag, drop the mirror registry/namespace prefix
  name_tag=${image##*/}
  new_image=$NEW_REGISTRY/$name_tag
  # note: coredns is published as coredns/coredns on registry.k8s.io
  docker tag "$image" "$new_image"
done

Give kubectl access to the cluster; without this you will get the following error:

The connection to the server localhost:8080 was refused - did you specify the right host or port?

mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config

# enable kubectl command auto-completion
echo "source <(kubectl completion bash)" >> ~/.bashrc

# join an additional control-plane (master) node
kubeadm join 10.x.159.238:16443 --token 4vg9bh.65dk2vfy20r2i238 \
        --discovery-token-ca-cert-hash sha256:3e59e39ff671bdd7349c73a939f80396e2b7c2e5a872f8d23389820f5235fe87 \
        --control-plane --certificate-key 3a47ffe9bc24db9b97ebfb3efc791600801f17370db02bfa12566f2faa454ddd
# join a worker node
kubeadm join 10.x.159.238:16443 --token 4vg9bh.65dk2vfy20r2i238 \
        --discovery-token-ca-cert-hash sha256:3e59e39ff671bdd7349c73a939f80396e2b7c2e5a872f8d23389820f5235fe87

If files such as /etc/kubernetes/pki/ca.key and /etc/kubernetes/pki/ca.crt cannot be found on a joining node, copy them over from the first node. This happens when --upload-certs was not passed to kubeadm init.

View tokens

$ kubeadm token list
TOKEN                     TTL         EXPIRES                USAGES                   DESCRIPTION                                                EXTRA GROUPS
4vg9bh.65dk2vfy20r2i238   22h         2023-02-16T03:42:04Z   authentication,signing   The default bootstrap token generated by 'kubeadm init'.   system:bootstrappers:kubeadm:default-node-token
dc6fvn.9wax0v3wjmhil6f8   59m         2023-02-15T05:42:04Z   <none>                   Proxy for managing TTL for the kubeadm-certs secret        <none>
# create a token that never expires
$ kubeadm token create --ttl 0

Join the remaining two masters to the cluster; etcd now has three members.

$ kubectl get cs
Warning: v1 ComponentStatus is deprecated in v1.19+
NAME                 STATUS    MESSAGE                         ERROR
controller-manager   Healthy   ok
scheduler            Healthy   ok
etcd-0               Healthy   {"health":"true","reason":""}
$ kubectl get pod -n kube-system -o wide
NAME                                    READY   STATUS    RESTARTS      AGE    IP              NODE            NOMINATED NODE   READINESS GATES
coredns-5bbd96d687-bd4fx                0/1     Pending   0             136m   <none>          <none>          <none>           <none>
coredns-5bbd96d687-nrssm                0/1     Pending   0             136m   <none>          <none>          <none>           <none>
etcd-kube-master58                      1/1     Running   0             136m   10.x.159.58   kube-master58   <none>           <none>
etcd-kube-master65                      1/1     Running   0             71m    10.x.159.65   kube-master65   <none>           <none>
etcd-kube-master66                      1/1     Running   0             71m    10.x.159.66   kube-master66   <none>           <none>
haproxy-kube-master58                   1/1     Running   0             136m   10.x.159.58   kube-master58   <none>           <none>
haproxy-kube-master65                   1/1     Running   0             70m    10.x.159.65   kube-master65   <none>           <none>
haproxy-kube-master66                   1/1     Running   0             71m    10.x.159.66   kube-master66   <none>           <none>
keepalived-kube-master58                1/1     Running   0             136m   10.x.159.58   kube-master58   <none>           <none>
keepalived-kube-master65                1/1     Running   0             71m    10.x.159.65   kube-master65   <none>           <none>
keepalived-kube-master66                1/1     Running   0             70m    10.x.159.66   kube-master66   <none>           <none>
kube-apiserver-kube-master58            1/1     Running   0             136m   10.x.159.58   kube-master58   <none>           <none>
kube-apiserver-kube-master65            1/1     Running   0             71m    10.x.159.65   kube-master65   <none>           <none>
kube-apiserver-kube-master66            1/1     Running   0             71m    10.x.159.66   kube-master66   <none>           <none>
kube-controller-manager-kube-master58   1/1     Running   0             136m   10.x.159.58   kube-master58   <none>           <none>
kube-controller-manager-kube-master65   1/1     Running   0             71m    10.x.159.65   kube-master65   <none>           <none>
kube-controller-manager-kube-master66   1/1     Running   0             71m    10.x.159.66   kube-master66   <none>           <none>
kube-proxy-5cmxw                        1/1     Running   0             71m    10.x.159.66   kube-master66   <none>           <none>
kube-proxy-9bmth                        1/1     Running   0             71m    10.x.159.65   kube-master65   <none>           <none>
kube-proxy-plx45                        1/1     Running   0             136m   10.x.159.58   kube-master58   <none>           <none>
kube-scheduler-kube-master58            1/1     Running   0             136m   10.x.159.58   kube-master58   <none>           <none>
kube-scheduler-kube-master65            1/1     Running   0             71m    10.x.159.65   kube-master65   <none>           <none>
kube-scheduler-kube-master66            1/1     Running   0             71m    10.x.159.66   kube-master66   <none>           <none>

Download etcdctl and check the etcd cluster status

ETCD_VER=v3.5.7
# choose either URL
GOOGLE_URL=https://storage.googleapis.com/etcd
GITHUB_URL=https://github.com/etcd-io/etcd/releases/download
DOWNLOAD_URL=${GOOGLE_URL}

rm -f /tmp/etcd-${ETCD_VER}-linux-amd64.tar.gz
rm -rf /tmp/etcd-download-test && mkdir -p /tmp/etcd-download-test

curl -L ${DOWNLOAD_URL}/${ETCD_VER}/etcd-${ETCD_VER}-linux-amd64.tar.gz -o /tmp/etcd-${ETCD_VER}-linux-amd64.tar.gz
tar xzvf /tmp/etcd-${ETCD_VER}-linux-amd64.tar.gz -C /tmp/etcd-download-test --strip-components=1
rm -f /tmp/etcd-${ETCD_VER}-linux-amd64.tar.gz
mv /tmp/etcd-download-test/etcdctl /usr/bin/

# HOST0 is a placeholder for the IP of one of the etcd members (e.g. one of the masters)
ETCDCTL_API=3 etcdctl \
--cert /etc/kubernetes/pki/etcd/peer.crt \
--key /etc/kubernetes/pki/etcd/peer.key \
--cacert /etc/kubernetes/pki/etcd/ca.crt \
--endpoints https://${HOST0}:2379 member list
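The same certificate flags work for the other etcdctl health queries (run on a control-plane node where the pki files exist):

# per-endpoint health
ETCDCTL_API=3 etcdctl \
--cert /etc/kubernetes/pki/etcd/peer.crt \
--key /etc/kubernetes/pki/etcd/peer.key \
--cacert /etc/kubernetes/pki/etcd/ca.crt \
--endpoints https://${HOST0}:2379 endpoint health

# member status as a table (leader, DB size, raft term)
ETCDCTL_API=3 etcdctl \
--cert /etc/kubernetes/pki/etcd/peer.crt \
--key /etc/kubernetes/pki/etcd/peer.key \
--cacert /etc/kubernetes/pki/etcd/ca.crt \
--endpoints https://${HOST0}:2379 endpoint status -w table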

Without a CNI plugin deployed, the node status is abnormal (NotReady)

$ kubectl get no
NAME            STATUS     ROLES           AGE   VERSION
kube-master58   NotReady   control-plane   62m   v1.26.1
$ journalctl -f -u kubelet
Feb 15 04:44:31 kube-master58 kubelet[531]: E0215 04:44:31.568226     531 kubelet.go:2475] "Container runtime network not ready" networkReady="NetworkReady=false reason:NetworkPluginNotReady message:Network plugin returns error: cni plugin not initialized"

[Optional] Remove the control-plane taint

kubectl taint no --all node-role.kubernetes.io/control-plane-

Install the network add-on

Calico

docs.tigera.io/calico/late…

With the Kubernetes API datastore and more than 50 nodes, the Typha daemon is used to provide scaling. Typha is not included for the etcd datastore, because etcd already handles many clients, so using Typha there would be redundant and is not recommended.

Fewer than 50 nodes

curl https://raw.githubusercontent.com/projectcalico/calico/v3.25.0/manifests/calico.yaml -O
# if you use a custom Pod CIDR, update it in the manifest:
            - name: CALICO_IPV4POOL_CIDR
              value: "192.168.0.0/16"
kubectl apply -f calico.yaml
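The rollout can be followed until all calico-node Pods are running (calico-node is the DaemonSet name in the standard manifest):

# wait for the calico-node DaemonSet to finish rolling out
kubectl rollout status daemonset/calico-node -n kube-system
# confirm the calico Pods are Running
kubectl get pods -n kube-system -l k8s-app=calico-node -o wide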

The nodes are now Ready and coredns is Running:

$ kubectl get no
NAME            STATUS   ROLES           AGE    VERSION
kube-master58   Ready    control-plane   3h2m   v1.26.1
kube-master65   Ready    control-plane   116m   v1.26.1
kube-master66   Ready    control-plane   116m   v1.26.1

Deploy metrics-server

This provides resource monitoring for nodes and Pods.

kubectl apply -f https://github.com/kubernetes-sigs/metrics-server/releases/latest/download/high-availability-1.21+.yaml
# the liveness probe fails because of kubelet certificate issues
# https://github.com/kubernetes-sigs/metrics-server/issues/767
kubectl edit deployment.apps/metrics-server -n kube-system
# add --kubelet-insecure-tls to the container args
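Once the metrics-server Pods are ready, the metrics API can be verified:

# node-level CPU/memory usage
kubectl top nodes
# pod-level usage across all namespaces
kubectl top pods -A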

Add nodes

With the container runtime, kubelet and so on installed on the new machine:

# add a worker
# on an existing master, print the worker join command
$ kubeadm token create --print-join-command
 kubeadm join 192.168.1.10:6443 --token 42ojpt.z2h5ii9n898tzo36 --discovery-token-ca-cert-hash sha256:7cf14e8cb965d5eb9d66f3707ba20deeadc90bd36b730ce4c0e5d9db80d3625b

# on an existing master, upload the certificates needed for a new master to join
$ kubeadm init phase upload-certs --upload-certs
 [upload-certs] Storing the certificates in Secret "kubeadm-certs" in the "kube-system" Namespace
 [upload-certs] Using certificate key:
 e799a655f667fc327ab8c91f4f2541b57b96d2693ab5af96314ebddea7a68526
# join as a master
 kubeadm join 192.168.1.10:6443 --token 42ojpt.z2h5ii9n898tzo36 --discovery-token-ca-cert-hash sha256:7cf14e8cb965d5eb9d66f3707ba20deeadc90bd36b730ce4c0e5d9db80d3625b \
 --control-plane --certificate-key e799a655f667fc327ab8c91f4f2541b57b96d2693ab5af96314ebddea7a68526