Setting up k8s


Official reference documentation. For binary/source installation see: Binary source installation. For kubeadm installation see: kubeadm.

Upgrade and install containerd and runc

#!/usr/bin/env bash
set -euxo pipefail
wget https://github.com/opencontainers/runc/releases/download/v1.1.12/runc.amd64 -O runc  
wget https://github.com/containerd/containerd/releases/download/v1.7.13/containerd-1.7.13-linux-amd64.tar.gz  
chmod +x runc && mv runc /usr/bin  
tar xfz containerd-1.7.13-linux-amd64.tar.gz -C /usr  
systemctl restart containerd && sleep 3  
systemctl is-active containerd
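
A quick sanity check that the upgraded binaries are the ones on PATH (the versions are simply the ones the script above downloads):

containerd --version   # expect v1.7.13
runc --version         # expect 1.1.12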

Add the Kubernetes package repository

Create the kubernetes repo file:

$ sudo tee /etc/yum.repos.d/kubernetes.repo <<-'EOF'
[kubernetes]
name=Kubernetes
baseurl=https://packages.cloud.google.com/yum/repos/kubernetes-el7-x86_64
enabled=1
gpgcheck=1
repo_gpgcheck=1
gpgkey=https://packages.cloud.google.com/yum/doc/yum-key.gpg https://packages.cloud.google.com/yum/doc/rpm-package-key.gpg
EOF

If the Google address is blocked, use the Aliyun or USTC mirror instead:

$ sudo tee /etc/yum.repos.d/kubernetes.repo <<-'EOF'
[kubernetes]
name=Kubernetes
baseurl=https://mirrors.aliyun.com/kubernetes/yum/repos/kubernetes-el7-x86_64
enabled=1
gpgcheck=0
EOF
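
Before installing, it can be worth confirming that the repo resolves and seeing which kubeadm versions it offers (a quick check; output will vary with the mirror):

yum makecache
yum list kubeadm --showduplicates | tail -n 5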

Install kubeadm

(base) [root@node1 kubeadm]# ansible -i /root/playbooks/hosts k8s-workers -m shell -a "yum -y install kubeadm docker.io"

Deploy the master

Create a kubeadm.yaml configuration file:

apiVersion: kubeadm.k8s.io/v1beta2
kind: InitConfiguration
nodeRegistration:
  kubeletExtraArgs:
    cgroup-driver: "systemd"
---
apiVersion: kubeadm.k8s.io/v1beta2
kind: ClusterConfiguration
kubernetesVersion: v1.23.1 # change this to match the output of kubeadm version
clusterName: "example-cluster"
controllerManager:
  extraArgs:
    horizontal-pod-autoscaler-sync-period: "10s"
apiServer:
  extraArgs:
    runtime-config: "api/all=true"

Initialize kubeadm:

kubeadm init --config kubeadm.yaml

Image pulls fail with errors like the following:

(base) [root@node1 kubeadm]# kubeadm init --config kubeadm.yaml
[init] Using Kubernetes version: v1.23.1
[preflight] Running pre-flight checks
        [WARNING Service-Kubelet]: kubelet service is not enabled, please run 'systemctl enable kubelet.service'
[preflight] Pulling images required for setting up a Kubernetes cluster
[preflight] This might take a minute or two, depending on the speed of your internet connection
[preflight] You can also perform this action in beforehand using 'kubeadm config images pull'
error execution phase preflight: [preflight] Some fatal errors occurred:
        [ERROR ImagePull]: failed to pull image k8s.gcr.io/kube-apiserver:v1.23.1: output: Error response from daemon: Get "https://k8s.gcr.io/v2/": net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)
, error: exit status 1
        [ERROR ImagePull]: failed to pull image k8s.gcr.io/kube-controller-manager:v1.23.1: output: Error response from daemon: Get "https://k8s.gcr.io/v2/": net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)
, error: exit status 1
        [ERROR ImagePull]: failed to pull image k8s.gcr.io/kube-scheduler:v1.23.1: output: Error response from daemon: Get "https://k8s.gcr.io/v2/": context deadline exceeded
, error: exit status 1
        [ERROR ImagePull]: failed to pull image k8s.gcr.io/kube-proxy:v1.23.1: output: Error response from daemon: Get "https://k8s.gcr.io/v2/": net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)
, error: exit status 1
        [ERROR ImagePull]: failed to pull image k8s.gcr.io/pause:3.6: output: Error response from daemon: Get "https://k8s.gcr.io/v2/": dial tcp 142.250.157.82:443: i/o timeout
, error: exit status 1
        [ERROR ImagePull]: failed to pull image k8s.gcr.io/etcd:3.5.1-0: output: Error response from daemon: Get "https://k8s.gcr.io/v2/": net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)
, error: exit status 1
        [ERROR ImagePull]: failed to pull image k8s.gcr.io/coredns/coredns:v1.8.6: output: Error response from daemon: Get "https://k8s.gcr.io/v2/": dial tcp 142.250.157.82:443: i/o timeout
, error: exit status 1
[preflight] If you know what you are doing, you can make a check non-fatal with `--ignore-preflight-errors=...`
To see the stack trace of this error execute with --v=5 or higher

To work around this, pull the images from another repository on Docker Hub. First, get the list of required image names with the command below:

(base) [root@node1 kubeadm]# kubeadm config images list
k8s.gcr.io/kube-apiserver:v1.23.3
k8s.gcr.io/kube-controller-manager:v1.23.3
k8s.gcr.io/kube-scheduler:v1.23.3
k8s.gcr.io/kube-proxy:v1.23.3
k8s.gcr.io/pause:3.6
k8s.gcr.io/etcd:3.5.1-0
k8s.gcr.io/coredns/coredns:v1.8.6
(base) [root@node1 kubeadm]#

Note: newer releases renamed coredns to coredns/coredns, so remember to change it in the image list. Next, figure out where to pull from: search Docker Hub for kube-proxy and the other components (hub.docker.com/search?q=ku…), sort by most recently updated, and you will find a repository with 10k+ downloads that is updated frequently. Write the script:

vim pull_k8s_images.sh

Script contents:

#!/bin/bash
set -o errexit
set -o nounset
set -o pipefail

## Versions are defined here; adjust them to match the list obtained above

KUBE_VERSION=v1.23.3
KUBE_PAUSE_VERSION=3.6
ETCD_VERSION=3.5.1-0
DNS_VERSION=v1.8.6

## The original registry name; images are re-tagged back to it at the end
GCR_URL=k8s.gcr.io

## The mirror repository to pull from
#DOCKERHUB_URL=rancher
DOCKERHUB_URL=k8simage

## Image list; for newer versions coredns must be written as coredns/coredns
images=(
kube-proxy:${KUBE_VERSION}
kube-scheduler:${KUBE_VERSION}
kube-controller-manager:${KUBE_VERSION}
kube-apiserver:${KUBE_VERSION}
pause:${KUBE_PAUSE_VERSION}
etcd:${ETCD_VERSION}
coredns/coredns:${DNS_VERSION}
)

## Pull each image, re-tag it to the k8s.gcr.io name, and drop the mirror tag
for imageName in "${images[@]}" ; do
  docker pull $DOCKERHUB_URL/$imageName
  docker tag $DOCKERHUB_URL/$imageName $GCR_URL/$imageName
  docker rmi $DOCKERHUB_URL/$imageName
done
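
The script needs the executable bit before it can be run (assuming the filename used above):

chmod +x pull_k8s_images.sh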

Execution result:

(base) [root@node1 kubeadm]# ./pull_k8s_images.sh
v1.23.3: Pulling from k8simage/kube-proxy
b2481554545f: Pull complete
d9c824a47c4e: Pull complete
fd68cbcdf87c: Pull complete
Digest: sha256:def87f007b49d50693aed83d4703d0e56c69ae286154b1c7a20cd1b3a320cf7c
Status: Downloaded newer image for k8simage/kube-proxy:v1.23.3
docker.io/k8simage/kube-proxy:v1.23.3
Untagged: k8simage/kube-proxy:v1.23.3
Untagged: k8simage/kube-proxy@sha256:def87f007b49d50693aed83d4703d0e56c69ae286154b1c7a20cd1b3a320cf7c
v1.23.3: Pulling from k8simage/kube-scheduler
2df365faf0e3: Pull complete
d3ec803c6980: Pull complete
6c24609dde40: Pull complete
Digest: sha256:32308abe86f7415611ca86ee79dd0a73e74ebecb2f9e3eb85fc3a8e62f03d0e7
Status: Downloaded newer image for k8simage/kube-scheduler:v1.23.3
docker.io/k8simage/kube-scheduler:v1.23.3
Untagged: k8simage/kube-scheduler:v1.23.3
Untagged: k8simage/kube-scheduler@sha256:32308abe86f7415611ca86ee79dd0a73e74ebecb2f9e3eb85fc3a8e62f03d0e7
v1.23.3: Pulling from k8simage/kube-controller-manager
2df365faf0e3: Already exists
d3ec803c6980: Already exists
87cf352ab523: Pull complete
Digest: sha256:b721871d9a9c55836cbcbb2bf375e02696260628f73620b267be9a9a50c97f5a
Status: Downloaded newer image for k8simage/kube-controller-manager:v1.23.3
docker.io/k8simage/kube-controller-manager:v1.23.3
Untagged: k8simage/kube-controller-manager:v1.23.3
Untagged: k8simage/kube-controller-manager@sha256:b721871d9a9c55836cbcbb2bf375e02696260628f73620b267be9a9a50c97f5a
v1.23.3: Pulling from k8simage/kube-apiserver
2df365faf0e3: Already exists
d3ec803c6980: Already exists
a5221b90f9cc: Pull complete
Digest: sha256:b8eba88862bab7d3d7cdddad669ff1ece006baa10d3a3df119683434497a0949
Status: Downloaded newer image for k8simage/kube-apiserver:v1.23.3
docker.io/k8simage/kube-apiserver:v1.23.3
Untagged: k8simage/kube-apiserver:v1.23.3
Untagged: k8simage/kube-apiserver@sha256:b8eba88862bab7d3d7cdddad669ff1ece006baa10d3a3df119683434497a0949
3.6: Pulling from k8simage/pause
fbe1a72f5dcd: Pull complete
Digest: sha256:3d380ca8864549e74af4b29c10f9cb0956236dfb01c40ca076fb6c37253234db
Status: Downloaded newer image for k8simage/pause:3.6
docker.io/k8simage/pause:3.6
Untagged: k8simage/pause:3.6
Untagged: k8simage/pause@sha256:3d380ca8864549e74af4b29c10f9cb0956236dfb01c40ca076fb6c37253234db
3.5.1-0: Pulling from k8simage/etcd
e8614d09b7be: Pull complete
45b6afb4a92f: Pull complete
f951ee5fe858: Pull complete
0c6b9ab3ebf9: Pull complete
7314eabc351c: Pull complete
Digest: sha256:64b9ea357325d5db9f8a723dcf503b5a449177b17ac87d69481e126bb724c263
Status: Downloaded newer image for k8simage/etcd:3.5.1-0
docker.io/k8simage/etcd:3.5.1-0
Untagged: k8simage/etcd:3.5.1-0
Untagged: k8simage/etcd@sha256:64b9ea357325d5db9f8a723dcf503b5a449177b17ac87d69481e126bb724c263
Error response from daemon: manifest for k8simage/coredns:v1.8.6 not found: manifest unknown: manifest unknown

The mirror repository does not have the corresponding DNS image. Searching Docker Hub for coredns:v1.8.6 turns up an alternative; pull and re-tag it as follows:

(base) [root@node1 kubeadm]# docker pull xyz349925756/coredns:v1.8.6
v1.8.6: Pulling from xyz349925756/coredns
Digest: sha256:8916c89e1538ea3941b58847e448a2c6d940c01b8e716b20423d2d8b189d3972
Status: Downloaded newer image for xyz349925756/coredns:v1.8.6
docker.io/xyz349925756/coredns:v1.8.6
(base) [root@node1 kubeadm]# docker tag  xyz349925756/coredns:v1.8.6  k8s.gcr.io/coredns/coredns:v1.8.6
(base) [root@node1 kubeadm]# docker rmi xyz349925756/coredns:v1.8.6
Untagged: xyz349925756/coredns:v1.8.6
Untagged: xyz349925756/coredns@sha256:8916c89e1538ea3941b58847e448a2c6d940c01b8e716b20423d2d8b189d3972
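
Before re-running kubeadm init, it is worth confirming that every image from the kubeadm config images list output is now present locally under its k8s.gcr.io name:

docker images | grep 'k8s.gcr.io'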

Run kubeadm init again:

(base) [root@node1 kubeadm]# kubeadm init --config kubeadm.yaml
[init] Using Kubernetes version: v1.23.1
[preflight] Running pre-flight checks
        [WARNING Service-Kubelet]: kubelet service is not enabled, please run 'systemctl enable kubelet.service'
[preflight] Pulling images required for setting up a Kubernetes cluster
[preflight] This might take a minute or two, depending on the speed of your internet connection
[preflight] You can also perform this action in beforehand using 'kubeadm config images pull'
[certs] Using certificateDir folder "/etc/kubernetes/pki"
[certs] Generating "ca" certificate and key
[certs] Generating "apiserver" certificate and key
[certs] apiserver serving cert is signed for DNS names [kubernetes kubernetes.default kubernetes.default.svc kubernetes.default.svc.cluster.local node1] and IPs [10.96.0.1 192.168.111.49]
[certs] Generating "apiserver-kubelet-client" certificate and key
[certs] Generating "front-proxy-ca" certificate and key
[certs] Generating "front-proxy-client" certificate and key
[certs] Generating "etcd/ca" certificate and key
[certs] Generating "etcd/server" certificate and key
[certs] etcd/server serving cert is signed for DNS names [localhost node1] and IPs [192.168.111.49 127.0.0.1 ::1]
[certs] Generating "etcd/peer" certificate and key
[certs] etcd/peer serving cert is signed for DNS names [localhost node1] and IPs [192.168.111.49 127.0.0.1 ::1]
[certs] Generating "etcd/healthcheck-client" certificate and key
[certs] Generating "apiserver-etcd-client" certificate and key
[certs] Generating "sa" key and public key
[kubeconfig] Using kubeconfig folder "/etc/kubernetes"
[kubeconfig] Writing "admin.conf" kubeconfig file
[kubeconfig] Writing "kubelet.conf" kubeconfig file
[kubeconfig] Writing "controller-manager.conf" kubeconfig file
[kubeconfig] Writing "scheduler.conf" kubeconfig file
[kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[kubelet-start] Starting the kubelet
[control-plane] Using manifest folder "/etc/kubernetes/manifests"
[control-plane] Creating static Pod manifest for "kube-apiserver"
[control-plane] Creating static Pod manifest for "kube-controller-manager"
[control-plane] Creating static Pod manifest for "kube-scheduler"
[etcd] Creating static Pod manifest for local etcd in "/etc/kubernetes/manifests"
[wait-control-plane] Waiting for the kubelet to boot up the control plane as static Pods from directory "/etc/kubernetes/manifests". This can take up to 4m0s
[kubelet-check] Initial timeout of 40s passed.
[kubelet-check] It seems like the kubelet isn't running or healthy.
[kubelet-check] The HTTP call equal to 'curl -sSL http://localhost:10248/healthz' failed with error: Get "http://localhost:10248/healthz": dial tcp [::1]:10248: connect: connection refused.
[kubelet-check] It seems like the kubelet isn't running or healthy.
[kubelet-check] The HTTP call equal to 'curl -sSL http://localhost:10248/healthz' failed with error: Get "http://localhost:10248/healthz": dial tcp [::1]:10248: connect: connection refused.
[kubelet-check] It seems like the kubelet isn't running or healthy.
[kubelet-check] The HTTP call equal to 'curl -sSL http://localhost:10248/healthz' failed with error: Get "http://localhost:10248/healthz": dial tcp [::1]:10248: connect: connection refused.
[kubelet-check] It seems like the kubelet isn't running or healthy.
[kubelet-check] The HTTP call equal to 'curl -sSL http://localhost:10248/healthz' failed with error: Get "http://localhost:10248/healthz": dial tcp [::1]:10248: connect: connection refused.
[kubelet-check] It seems like the kubelet isn't running or healthy.
[kubelet-check] The HTTP call equal to 'curl -sSL http://localhost:10248/healthz' failed with error: Get "http://localhost:10248/healthz": dial tcp [::1]:10248: connect: connection refused.

        Unfortunately, an error has occurred:
                timed out waiting for the condition

        This error is likely caused by:
                - The kubelet is not running
                - The kubelet is unhealthy due to a misconfiguration of the node in some way (required cgroups disabled)

        If you are on a systemd-powered system, you can try to troubleshoot the error with the following commands:
                - 'systemctl status kubelet'
                - 'journalctl -xeu kubelet'

        Additionally, a control plane component may have crashed or exited when started by the container runtime.
        To troubleshoot, list all containers using your preferred container runtimes CLI.

        Here is one example how you may list all Kubernetes containers running in docker:
                - 'docker ps -a | grep kube | grep -v pause'
                Once you have found the failing container, you can inspect its logs with:
                - 'docker logs CONTAINERID'

error execution phase wait-control-plane: couldn't initialize a Kubernetes cluster
To see the stack trace of this error execute with --v=5 or higher

The failure above suggests the kubelet did not start. Run tail /var/log/messages to find the underlying cause:

Feb 16 18:03:11 node1 kubelet: I0216 18:03:11.958020    6692 docker_service.go:264] "Docker Info" dockerInfo=&{ID:ALTJ:7HTW:ENJE:LI3Z:V5AN:FV2E:4LDK:HJYJ:NQGO:MJAR:BIGS:O67L Containers:2 ContainersRunning:2 ContainersPaused:0 ContainersStopped:0 Images:20 Driver:overlay2 DriverStatus:[[Backing Filesystem xfs] [Supports d_type true] [Native Overlay Diff true] [userxattr false]] SystemStatus:[] Plugins:{Volume:[local] Network:[bridge host ipvlan macvlan null overlay] Authorization:[] Log:[awslogs fluentd gcplogs gelf journald json-file local logentries splunk syslog]} MemoryLimit:true SwapLimit:true KernelMemory:true KernelMemoryTCP:true CPUCfsPeriod:true CPUCfsQuota:true CPUShares:true CPUSet:true PidsLimit:true IPv4Forwarding:true BridgeNfIptables:true BridgeNfIP6tables:true Debug:false NFd:41 OomKillDisable:true NGoroutines:45 SystemTime:2022-02-16T18:03:11.944902238+08:00 LoggingDriver:json-file CgroupDriver:cgroupfs CgroupVersion:1 NEventsListener:0 KernelVersion:3.10.0-1160.36.2.el7.x86_64 OperatingSystem:CentOS Linux 7 (Core) OSVersion:7 OSType:linux Architecture:x86_64 IndexServerAddress:https://index.docker.io/v1/ RegistryConfig:0xc000be43f0 NCPU:8 MemTotal:33566306304 GenericResources:[] DockerRootDir:/var/lib/docker HTTPProxy: HTTPSProxy: NoProxy: Name:node1 Labels:[] ExperimentalBuild:false ServerVersion:20.10.12 ClusterStore: ClusterAdvertise: Runtimes:map[io.containerd.runc.v2:{Path:runc Args:[] Shim:<nil>} io.containerd.runtime.v1.linux:{Path:runc Args:[] Shim:<nil>} runc:{Path:runc Args:[] Shim:<nil>}] DefaultRuntime:runc Swarm:{NodeID: NodeAddr: LocalNodeState:inactive ControlAvailable:false Error: RemoteManagers:[] Nodes:0 Managers:0 Cluster:<nil> Warnings:[]} LiveRestoreEnabled:false Isolation: InitBinary:docker-init ContainerdCommit:{ID:7b11cfaabd73bb80907dd23182b9347b4245eb5d Expected:7b11cfaabd73bb80907dd23182b9347b4245eb5d} RuncCommit:{ID:v1.0.2-0-g52b36a2 Expected:v1.0.2-0-g52b36a2} InitCommit:{ID:de40ad0 Expected:de40ad0} SecurityOptions:[name=seccomp,profile=default] ProductLicense: DefaultAddressPools:[] Warnings:[]}
Feb 16 18:03:11 node1 kubelet: E0216 18:03:11.958130    6692 server.go:302] "Failed to run kubelet" err="failed to run Kubelet: misconfiguration: kubelet cgroup driver: \"systemd\" is different from docker cgroup driver: \"cgroupfs\""
Feb 16 18:03:11 node1 systemd: kubelet.service: main process exited, code=exited, status=1/FAILURE
Feb 16 18:03:11 node1 systemd: Unit kubelet.service entered failed state.
Feb 16 18:03:11 node1 systemd: kubelet.service failed.

Reference: www.cnblogs.com/hellxz/p/ku… The log above indicates that:

  1. docker's cgroup driver should be changed to systemd
  2. the kubelet's (k8s) cgroup driver should be changed to systemd
cat > /etc/docker/daemon.json <<EOF
{"exec-opts": ["native.cgroupdriver=systemd"]}
EOF
vim /var/lib/kubelet/config.yaml
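
The edit in this file is to set the cgroupDriver field to systemd; a quick check that it took effect (a minimal sketch, assuming the default kubeadm-generated config path):

grep cgroupDriver /var/lib/kubelet/config.yaml
# expected after the change:
# cgroupDriver: systemd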

Restart docker and the kubelet:

systemctl daemon-reload && systemctl restart docker
systemctl daemon-reload && systemctl restart kubelet ## the kubelet only starts cleanly after kubeadm init/join
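
Once docker is back up, both sides should report systemd (the exact docker info wording can vary slightly between versions):

docker info 2>/dev/null | grep -i 'cgroup driver'
# Cgroup Driver: systemd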

After the changes, run it again:

(base) [root@node1 kubeadm]# kubeadm init --config kubeadm.yaml
[init] Using Kubernetes version: v1.23.3
[preflight] Running pre-flight checks
        [WARNING Service-Kubelet]: kubelet service is not enabled, please run 'systemctl enable kubelet.service'
[preflight] Pulling images required for setting up a Kubernetes cluster
[preflight] This might take a minute or two, depending on the speed of your internet connection
[preflight] You can also perform this action in beforehand using 'kubeadm config images pull'
[certs] Using certificateDir folder "/etc/kubernetes/pki"
[certs] Using existing ca certificate authority
[certs] Using existing apiserver certificate and key on disk
[certs] Using existing apiserver-kubelet-client certificate and key on disk
[certs] Using existing front-proxy-ca certificate authority
[certs] Using existing front-proxy-client certificate and key on disk
[certs] Using existing etcd/ca certificate authority
[certs] Using existing etcd/server certificate and key on disk
[certs] Using existing etcd/peer certificate and key on disk
[certs] Using existing etcd/healthcheck-client certificate and key on disk
[certs] Using existing apiserver-etcd-client certificate and key on disk
[certs] Using the existing "sa" key
[kubeconfig] Using kubeconfig folder "/etc/kubernetes"
[kubeconfig] Using existing kubeconfig file: "/etc/kubernetes/admin.conf"
[kubeconfig] Using existing kubeconfig file: "/etc/kubernetes/kubelet.conf"
[kubeconfig] Using existing kubeconfig file: "/etc/kubernetes/controller-manager.conf"
[kubeconfig] Using existing kubeconfig file: "/etc/kubernetes/scheduler.conf"
[kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[kubelet-start] Starting the kubelet
[control-plane] Using manifest folder "/etc/kubernetes/manifests"
[control-plane] Creating static Pod manifest for "kube-apiserver"
[control-plane] Creating static Pod manifest for "kube-controller-manager"
[control-plane] Creating static Pod manifest for "kube-scheduler"
[etcd] Creating static Pod manifest for local etcd in "/etc/kubernetes/manifests"
[wait-control-plane] Waiting for the kubelet to boot up the control plane as static Pods from directory "/etc/kubernetes/manifests". This can take up to 4m0s
[apiclient] All control plane components are healthy after 8.504891 seconds
[upload-config] Storing the configuration used in ConfigMap "kubeadm-config" in the "kube-system" Namespace
[kubelet] Creating a ConfigMap "kubelet-config-1.23" in namespace kube-system with the configuration for the kubelets in the cluster
NOTE: The "kubelet-config-1.23" naming of the kubelet ConfigMap is deprecated. Once the UnversionedKubeletConfigMap feature gate graduates to Beta the default name will become just "kubelet-config". Kubeadm upgrade will handle this transition transparently.
[upload-certs] Skipping phase. Please see --upload-certs
[mark-control-plane] Marking the node node1 as control-plane by adding the labels: [node-role.kubernetes.io/master(deprecated) node-role.kubernetes.io/control-plane node.kubernetes.io/exclude-from-external-load-balancers]
[mark-control-plane] Marking the node node1 as control-plane by adding the taints [node-role.kubernetes.io/master:NoSchedule]
[bootstrap-token] Using token: 6s4uqt.t61rqtwrdgtzlwyl
[bootstrap-token] Configuring bootstrap tokens, cluster-info ConfigMap, RBAC Roles
[bootstrap-token] configured RBAC rules to allow Node Bootstrap tokens to get nodes
[bootstrap-token] configured RBAC rules to allow Node Bootstrap tokens to post CSRs in order for nodes to get long term certificate credentials
[bootstrap-token] configured RBAC rules to allow the csrapprover controller automatically approve CSRs from a Node Bootstrap Token
[bootstrap-token] configured RBAC rules to allow certificate rotation for all node client certificates in the cluster
[bootstrap-token] Creating the "cluster-info" ConfigMap in the "kube-public" namespace
[kubelet-finalize] Updating "/etc/kubernetes/kubelet.conf" to point to a rotatable kubelet client certificate and key
[addons] Applied essential addon: CoreDNS
[addons] Applied essential addon: kube-proxy

Your Kubernetes control-plane has initialized successfully!

To start using your cluster, you need to run the following as a regular user:

  mkdir -p $HOME/.kube
  sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
  sudo chown $(id -u):$(id -g) $HOME/.kube/config

Alternatively, if you are the root user, you can run:

  export KUBECONFIG=/etc/kubernetes/admin.conf

You should now deploy a pod network to the cluster.
Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
  https://kubernetes.io/docs/concepts/cluster-administration/addons/

Then you can join any number of worker nodes by running the following on each as root:

kubeadm join 192.168.111.49:6443 --token 6s4uqt.t61rqtwrdgtzlwyl \
        --discovery-token-ca-cert-hash sha256:9586fa1b73054fed81174323ef7a130ba4bf5b0a8233526a4c0d0ae2eb7c4534
(base) [root@node1 kubeadm]#

Check the pod status:

(base) [root@node1 kubeadm]# kubectl get pods -n kube-system
NAME                            READY   STATUS     RESTARTS   AGE
coredns-64897985d-ljf28         0/1     Pending    0          16h
coredns-64897985d-z5gtp         0/1     Pending    0          16h
etcd-node1                      1/1     Running    0          16h
kube-apiserver-node1            1/1     Running    0          16h
kube-controller-manager-node1   1/1     Running    0          16h
kube-proxy-62hbt                1/1     Running    0          16h
kube-scheduler-node1            1/1     Running    0          16h

The coredns pods are stuck in Pending.

Install a network plugin

(base) [root@node1 kubeadm]# kubectl apply -f "https://cloud.weave.works/k8s/net?k8s-version=$(kubectl version |base64|tr -d '\n')"
serviceaccount/weave-net created
clusterrole.rbac.authorization.k8s.io/weave-net created
clusterrolebinding.rbac.authorization.k8s.io/weave-net created
role.rbac.authorization.k8s.io/weave-net created
rolebinding.rbac.authorization.k8s.io/weave-net created
daemonset.apps/weave-net created

Check the pod status again:

(base) [root@node1 kubeadm]# kubectl get pods -n kube-system
NAME                            READY   STATUS     RESTARTS   AGE
coredns-64897985d-ljf28         0/1     Pending    0          16h
coredns-64897985d-z5gtp         0/1     Pending    0          16h
etcd-node1                      1/1     Running    0          16h
kube-apiserver-node1            1/1     Running    0          16h
kube-controller-manager-node1   1/1     Running    0          16h
kube-proxy-62hbt                1/1     Running    0          16h
kube-scheduler-node1            1/1     Running    0          16h
weave-net-98lzc                 0/2     Init:0/1   0          4m36s

Describe the weave-net pod:

(base) [root@node1 kubeadm]# kubectl describe pod weave-net-98lzc  -n kube-system
Name:                 weave-net-98lzc
Namespace:            kube-system
Priority:             2000001000
Priority Class Name:  system-node-critical
Node:                 node1/192.168.111.49
Start Time:           Thu, 17 Feb 2022 11:05:07 +0800
Labels:               controller-revision-hash=59d968cb54
                      name=weave-net
                      pod-template-generation=1
Annotations:          <none>
Status:               Pending
IP:                   192.168.111.49
IPs:
  IP:           192.168.111.49
Controlled By:  DaemonSet/weave-net
Init Containers:
  weave-init:
    Container ID:
    Image:         ghcr.io/weaveworks/launcher/weave-kube:2.8.1
    Image ID:
    Port:          <none>
    Host Port:     <none>
    Command:
      /home/weave/init.sh
    State:          Waiting
      Reason:       PodInitializing
    Ready:          False
    Restart Count:  0
    Environment:    <none>
    Mounts:
      /host/etc from cni-conf (rw)
      /host/home from cni-bin2 (rw)
      /host/opt from cni-bin (rw)
      /lib/modules from lib-modules (rw)
      /run/xtables.lock from xtables-lock (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-zdqwc (ro)
Containers:
  weave:
    Container ID:
    Image:         ghcr.io/weaveworks/launcher/weave-kube:2.8.1
    Image ID:
    Port:          <none>
    Host Port:     <none>
    Command:
      /home/weave/launch.sh
    State:          Waiting
      Reason:       PodInitializing
    Ready:          False
    Restart Count:  0
    Requests:
      cpu:      50m
      memory:   100Mi
    Readiness:  http-get http://127.0.0.1:6784/status delay=0s timeout=1s period=10s #success=1 #failure=3
    Environment:
      HOSTNAME:         (v1:spec.nodeName)
      INIT_CONTAINER:  true
    Mounts:
      /host/etc/machine-id from machine-id (ro)
      /host/var/lib/dbus from dbus (rw)
      /run/xtables.lock from xtables-lock (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-zdqwc (ro)
      /weavedb from weavedb (rw)
  weave-npc:
    Container ID:
    Image:          ghcr.io/weaveworks/launcher/weave-npc:2.8.1
    Image ID:
    Port:           <none>
    Host Port:      <none>
    State:          Waiting
      Reason:       PodInitializing
    Ready:          False
    Restart Count:  0
    Requests:
      cpu:     50m
      memory:  100Mi
    Environment:
      HOSTNAME:   (v1:spec.nodeName)
    Mounts:
      /run/xtables.lock from xtables-lock (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-zdqwc (ro)
Conditions:
  Type              Status
  Initialized       False
  Ready             False
  ContainersReady   False
  PodScheduled      True
Volumes:
  weavedb:
    Type:          HostPath (bare host directory volume)
    Path:          /var/lib/weave
    HostPathType:
  cni-bin:
    Type:          HostPath (bare host directory volume)
    Path:          /opt
    HostPathType:
  cni-bin2:
    Type:          HostPath (bare host directory volume)
    Path:          /home
    HostPathType:
  cni-conf:
    Type:          HostPath (bare host directory volume)
    Path:          /etc
    HostPathType:
  dbus:
    Type:          HostPath (bare host directory volume)
    Path:          /var/lib/dbus
    HostPathType:
  lib-modules:
    Type:          HostPath (bare host directory volume)
    Path:          /lib/modules
    HostPathType:
  machine-id:
    Type:          HostPath (bare host directory volume)
    Path:          /etc/machine-id
    HostPathType:  FileOrCreate
  xtables-lock:
    Type:          HostPath (bare host directory volume)
    Path:          /run/xtables.lock
    HostPathType:  FileOrCreate
  kube-api-access-zdqwc:
    Type:                    Projected (a volume that contains injected data from multiple sources)
    TokenExpirationSeconds:  3607
    ConfigMapName:           kube-root-ca.crt
    ConfigMapOptional:       <nil>
    DownwardAPI:             true
QoS Class:                   Burstable
Node-Selectors:              <none>
Tolerations:                 :NoSchedule op=Exists
                             :NoExecute op=Exists
                             node.kubernetes.io/disk-pressure:NoSchedule op=Exists
                             node.kubernetes.io/memory-pressure:NoSchedule op=Exists
                             node.kubernetes.io/network-unavailable:NoSchedule op=Exists
                             node.kubernetes.io/not-ready:NoExecute op=Exists
                             node.kubernetes.io/pid-pressure:NoSchedule op=Exists
                             node.kubernetes.io/unreachable:NoExecute op=Exists
                             node.kubernetes.io/unschedulable:NoSchedule op=Exists
Events:
  Type    Reason     Age    From               Message
  ----    ------     ----   ----               -------
  Normal  Scheduled  6m31s  default-scheduler  Successfully assigned kube-system/weave-net-98lzc to node1
  Warning  Failed     104s               kubelet            Failed to pull image "ghcr.io/weaveworks/launcher/weave-kube:2.8.1": rpc error: code = Unknown desc = context canceled
  Warning  Failed     104s               kubelet            Error: ErrImagePull
  Normal   BackOff    104s               kubelet            Back-off pulling image "ghcr.io/weaveworks/launcher/weave-kube:2.8.1"
  Warning  Failed     104s               kubelet            Error: ImagePullBackOff
  Normal   Pulling    88s (x2 over 12m)  kubelet            Pulling image "ghcr.io/weaveworks/launcher/weave-kube:2.8.1"


The image pull is failing. Fix by pulling from Docker Hub and re-tagging:

docker pull weaveworks/weave-kube:2.8.1
docker tag weaveworks/weave-kube:2.8.1 ghcr.io/weaveworks/launcher/weave-kube:2.8.1
docker rmi weaveworks/weave-kube:2.8.1
docker pull weaveworks/weave-npc:2.8.1
docker tag weaveworks/weave-npc:2.8.1 ghcr.io/weaveworks/launcher/weave-npc:2.8.1
docker rmi weaveworks/weave-npc:2.8.1

Check the pod description again:

(base) [root@node1 kubeadm]# kubectl describe pod weave-net-98lzc  -n kube-system
Events:
  Type     Reason     Age                  From               Message
  ----     ------     ----                 ----               -------
  Normal   Scheduled  29m                  default-scheduler  Successfully assigned kube-system/weave-net-98lzc to node1
  Warning  Failed     18m                  kubelet            Failed to pull image "ghcr.io/weaveworks/launcher/weave-kube:2.8.1": rpc error: code = Unknown desc = context canceled
  Warning  Failed     18m                  kubelet            Error: ErrImagePull
  Normal   BackOff    18m                  kubelet            Back-off pulling image "ghcr.io/weaveworks/launcher/weave-kube:2.8.1"
  Warning  Failed     18m                  kubelet            Error: ImagePullBackOff
  Normal   Pulling    18m (x2 over 29m)    kubelet            Pulling image "ghcr.io/weaveworks/launcher/weave-kube:2.8.1"
  Normal   Pulled     10m                  kubelet            Successfully pulled image "ghcr.io/weaveworks/launcher/weave-kube:2.8.1" in 7m33.767806399s
  Normal   Created    10m                  kubelet            Created container weave-init
  Normal   Started    10m                  kubelet            Started container weave-init
  Normal   Pulling    10m                  kubelet            Pulling image "ghcr.io/weaveworks/launcher/weave-npc:2.8.1"
  Normal   Pulled     4m46s                kubelet            Successfully pulled image "ghcr.io/weaveworks/launcher/weave-npc:2.8.1" in 5m39.573715137s
  Normal   Pulled     4m45s (x2 over 10m)  kubelet            Container image "ghcr.io/weaveworks/launcher/weave-kube:2.8.1" already present on machine
  Normal   Created    4m45s (x2 over 10m)  kubelet            Created container weave
  Normal   Started    4m45s (x2 over 10m)  kubelet            Started container weave
  Normal   Created    4m45s                kubelet            Created container weave-npc
  Normal   Started    4m45s                kubelet            Started container weave-npc
  Warning  Unhealthy  4m44s                kubelet            Readiness probe failed: Get "http://127.0.0.1:6784/status": dial tcp 127.0.0.1:6784: connect: connection refused

Pod status:

(base) [root@node1 kubeadm]# kubectl get pods -n kube-system
NAME                            READY   STATUS    RESTARTS      AGE
coredns-64897985d-ljf28         1/1     Running   0             17h
coredns-64897985d-z5gtp         1/1     Running   0             17h
etcd-node1                      1/1     Running   0             17h
kube-apiserver-node1            1/1     Running   0             17h
kube-controller-manager-node1   1/1     Running   0             17h
kube-proxy-62hbt                1/1     Running   0             17h
kube-scheduler-node1            1/1     Running   0             17h
weave-net-98lzc                 2/2     Running   1 (10m ago)   29m

Other machines

## Copy the repo file to the other machines in the cluster
(base) [root@node1 kubeadm]# ansible -i /root/playbooks/hosts k8s-workers -m copy -a "src=/etc/yum.repos.d/kubernetes.repo dest=/etc/yum.repos.d/kubernetes.repo"
## Install kubeadm on the other machines
(base) [root@node1 kubeadm]# ansible -i /root/playbooks/hosts k8s-workers -m shell -a "yum -y install kubeadm docker-ce"
## Change the docker configuration; run this on every node
cat > /etc/docker/daemon.json <<EOF
{"exec-opts": ["native.cgroupdriver=systemd"]}
EOF

Example error:

[ERROR FileAvailable--etc-kubernetes-pki-ca.crt]: /etc/kubernetes/pki/ca.crt already exists

Reference: blog.csdn.net/yuxuan89814… When a worker node tries to join, it fails with:

(base) [root@node2 kubeadm]# kubeadm join 192.168.111.49:6443 --token cj3dco.j70mpt18ruhwhp57 \
>         --discovery-token-ca-cert-hash sha256:cc0e757c274d620172f07338d09b94909c5cbadb5541053caf002a86c85509e9
[preflight] Running pre-flight checks
error execution phase preflight: [preflight] Some fatal errors occurred:
        [ERROR FileAvailable--etc-kubernetes-kubelet.conf]: /etc/kubernetes/kubelet.conf already exists
        [ERROR Port-10250]: Port 10250 is in use
        [ERROR FileAvailable--etc-kubernetes-pki-ca.crt]: /etc/kubernetes/pki/ca.crt already exists
[preflight] If you know what you are doing, you can make a check non-fatal with `--ignore-preflight-errors=...`
To see the stack trace of this error execute with --v=5 or higher

Reset kubeadm on that node first:

(base) [root@node2 kubeadm]# kubeadm reset
[reset] WARNING: Changes made to this host by 'kubeadm init' or 'kubeadm join' will be reverted.
[reset] Are you sure you want to proceed? [y/N]: y
[preflight] Running pre-flight checks
W0223 16:28:05.496433   20399 removeetcdmember.go:80] [reset] No kubeadm config, using etcd pod spec to get data directory
[reset] No etcd config found. Assuming external etcd
[reset] Please, manually reset etcd to prevent further issues
[reset] Stopping the kubelet service
[reset] Unmounting mounted directories in "/var/lib/kubelet"
[reset] Deleting contents of config directories: [/etc/kubernetes/manifests /etc/kubernetes/pki]
[reset] Deleting files: [/etc/kubernetes/admin.conf /etc/kubernetes/kubelet.conf /etc/kubernetes/bootstrap-kubelet.conf /etc/kubernetes/controller-manager.conf /etc/kubernetes/scheduler.conf]
[reset] Deleting contents of stateful directories: [/var/lib/kubelet /var/lib/dockershim /var/run/kubernetes /var/lib/cni]

The reset process does not clean CNI configuration. To do so, you must remove /etc/cni/net.d

The reset process does not reset or clean up iptables rules or IPVS tables.
If you wish to reset iptables, you must do so manually by using the "iptables" command.

If your cluster was setup to utilize IPVS, run ipvsadm --clear (or similar)
to reset your system's IPVS tables.

The reset process does not clean your kubeconfig files and you must remove them manually.
Please, check the contents of the $HOME/.kube/config file.

Problem 2

www.cnblogs.com/hgdf/p/1386…

Problem 3

Reference: segmentfault.com/a/119000002… When a worker node joins:

couldn't validate the identity of the API Server: could not find a JWS signature in the cluster-info ConfigMap for token ID "6s4uqt"

Cause: the token is probably wrong or has expired. Fix:

(base) [root@node1 playbooks]# kubeadm token list
# No output here, which means there are no live tokens

Generate a token:

$ kubeadm token create --ttl 0
# === output below ===
W0706 19:02:57.015210   11101 configset.go:202] WARNING: kubeadm cannot validate component configs for API groups [kubelet.config.k8s.io kubeproxy.config.k8s.io]
cj3dco.j70mpt18ruhwhp57
##=== list valid tokens ===
(base) [root@node1 playbooks]# kubeadm token list
TOKEN                     TTL         EXPIRES                USAGES                   DESCRIPTION                                                EXTRA GROUPS
cj3dco.j70mpt18ruhwhp57   22h         2022-02-24T08:27:11Z   authentication,signing   <none>                                                     system:bootstrappers:kubeadm:default-node-token
(base) [root@node1 playbooks]#
##=== generate the CA certificate hash ===
(base) [root@node1 playbooks]# openssl x509 -pubkey -in /etc/kubernetes/pki/ca.crt | openssl rsa -pubin -outform der 2>/dev/null | openssl dgst -sha256 -hex | sed 's/^.* //'
cc0e757c274d620172f07338d09b94909c5cbadb5541053caf002a86c85509e9
##=== join using the token and hash generated above ===
(base) [root@node2 kubeadm]# kubeadm join 192.168.111.49:6443 --token cj3dco.j70mpt18ruhwhp57         --discovery-token-ca-cert-hash sha256:cc0e757c274d620172f07338d09b94909c5cbadb5541053caf002a86c85509e9
[preflight] Running pre-flight checks
[preflight] Reading configuration from the cluster...
[preflight] FYI: You can look at this config file with 'kubectl -n kube-system get cm kubeadm-config -o yaml'
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
[kubelet-start] Starting the kubelet
[kubelet-start] Waiting for the kubelet to perform the TLS Bootstrap...

This node has joined the cluster:
* Certificate signing request was sent to apiserver and a response was received.
* The Kubelet was informed of the new secure connection details.

Run 'kubectl get nodes' on the control-plane to see this node join the cluster.

Instead of generating the token and hash separately, add --print-join-command when creating the token to print the full join command directly; after all, the whole point of creating a token is to add nodes.

(base) [root@node1 playbooks]# kubeadm token create --print-join-command --ttl=0
kubeadm join 192.168.111.49:6443 --token 5e17mw.om70vbfaapn6bmdo --discovery-token-ca-cert-hash sha256:cc0e757c274d620172f07338d09b94909c5cbadb5541053caf002a86c85509e9
(base) [root@node1 playbooks]# kubeadm token list
TOKEN                     TTL         EXPIRES   USAGES                   DESCRIPTION                                                EXTRA GROUPS
5e17mw.om70vbfaapn6bmdo   <forever>   <never>   authentication,signing   <none>                                                     system:bootstrappers:kubeadm:default-node-token
cj3dco.j70mpt18ruhwhp57   22h         2022-02-24T08:27:11Z   authentication,signing   <none>                                                     system:bootstrappers:kubeadm:default-node-token

You can view the contents of the cluster-info ConfigMap with kubectl; the currently valid tokens show up under data:

(base) [root@node1 playbooks]# kubectl get configmap cluster-info --namespace=kube-public -o yaml
apiVersion: v1
data:
  jws-kubeconfig-5e17mw: eyJhbGciOiJIUzI1NiIsImtpZCI6IjVlMTdtdyJ9..qOzSmW61QyKu6oi4urPa8g6koiYgcYxs9Ih9dGj629c
  jws-kubeconfig-cj3dco: eyJhbGciOiJIUzI1NiIsImtpZCI6ImNqM2RjbyJ9..jjwgIh0oiVcOFp6Lszo7Ri7IE5Ziza-m9Xqn06hzNFw
  kubeconfig: |
    apiVersion: v1
    clusters:
    - cluster:
        certificate-authority-data: LS0tLS1CRUdJTiBDRVJUSUZJQ0FURS0tLS0tCk1JSUMvakNDQWVhZ0F3SUJBZ0lCQURBTkJna3Foa2lHOXcwQkFRc0ZBREFWTVJNd0VRWURWUVFERXdwcmRXSmwKY201bGRHVnpNQjRYRFRJeU1ESXlNekE0TWpZMU5Gb1hEVE15TURJeU1UQTRNalkxTkZvd0ZURVRNQkVHQTFVRQpBeE1LYTNWaVpYSnVaWFJsY3pDQ0FTSXdEUVlKS29aSWh2Y05BUUVCQlFBRGdnRVBBRENDQVFvQ2dnRUJBTTRZCkowZnRabUthSWtzUHpYZWkrSGN6RmNUSVVSVVFxZVhZNWoxaVRMVVFLbk1nREFiT2hKLzArRkFQV0R1TUREU08KSElLWWVYNUJNTnBNby9ndEU5ZXRZMVVqV0NjRE81OGJZVG4yeUhVVmpNSDgrL1dIVkZtZmYzYlpsSWh3a3l0OQpSOW5ISjFZeGlqdDMweTRoZklsTkZwQkNNQjA0YnNrODBMNlVNQXFCVVNwSFgyVW9sTldYbHlHYkppelFJK0Z5CnZucVZJalptQXFoNHAwSC9OdlB5dTg5dkdRVHFTOVhBdEVxV2cwYzJ2RkhkRGpUNzJTdTVFREI5M1ZZZFdEVE4KWXEyamJqOGJyd0tRejNCeWYzV2FPdWNWQ05SYzRkSzk0TTliN2swK2c2cGJSSVV6SG1GQndYVUlBTVB5WFhDVApCVENIRzZ2LzVVdXZFamJFRmo4Q0F3RUFBYU5aTUZjd0RnWURWUjBQQVFIL0JBUURBZ0trTUE4R0ExVWRFd0VCCi93UUZNQU1CQWY4d0hRWURWUjBPQkJZRUZHY09jOTk3L3JwZ3VPTkgyekwzNHdFMTZqSEVNQlVHQTFVZEVRUU8KTUF5Q0NtdDFZbVZ5Ym1WMFpYTXdEUVlKS29aSWh2Y05BUUVMQlFBRGdnRUJBRi9rcFJlTDdrQ3AzS0FXSlR4SgpiYkhNSG4zd2xzQVFIOTlDTXBSRTFYSENUdlBOSERjWDJVbkY0dUM2dzV3d3BKNHF5a2I2Um1yUVdMWnF4V1NJCkFtOGlEQnUvVStXOFVnVG9FckpET1kzQWFSM0xUZzZZcXBLb1lxUVB1TGI3ZGZrV3NnOXZYUllMdnpxZ1NnYmkKUGR6WEJnMko2OWlxUFZEZDVvdUF5QVBFTUxBazdzTWV1dHFOd25Ma0FHR3J2WWlIZGdCbEZ2L3padTBsV2Z1Rwp1QWNzTThmdXp6dVJpbXhnRThaMWFvRWdSVXhJMzM1ekRkMUVPeWRwTDNoSnJBUS9VWTFRWmczemdDNmM2NEVFCkYyczZsemdGQllDSXhnamkvR0tIazJwbjRMKzBLVDZiUEhscE9NT0JnU0hsQW1HeHdrbk1YKzBRTlNZQnljeEwKUnkwPQotLS0tLUVORCBDRVJUSUZJQ0FURS0tLS0tCg==
        server: https://192.168.111.49:6443
      name: ""
    contexts: null
    current-context: ""
    kind: Config
    preferences: {}
    users: null
kind: ConfigMap
metadata:
  creationTimestamp: "2022-02-23T08:27:11Z"
  name: cluster-info
  namespace: kube-public
  resourceVersion: "7286"
  uid: 3ba07830-eff2-4578-b77f-fb5798341a61

Problem 4

The following problem can occur both during kubeadm init on the master and during kubeadm join on a worker:

The HTTP call equal to 'curl -sSL http://localhost:10248/healthz' failed with error: Get "http://localhost:10248/healthz": dial tcp [::1]:10248: connect: connection refused.

This is usually because the CNI plugin pods never started, typically because their images could not be pulled; the symptom is that the worker nodes stay NotReady:

(base) [root@node1 kubeadm]# kubectl get nodes
NAME    STATUS     ROLES                  AGE     VERSION
node1   Ready      control-plane,master   6d21h   v1.23.3
node2   NotReady   <none>                 45m     v1.23.4
node4   NotReady   <none>                 40m     v1.23.4

## The weave-net pods are also unhealthy; describe shows the image pulls failed
(base) [root@node1 kubeadm]# kubectl get pods -n kube-system
NAME                            READY   STATUS                  RESTARTS       AGE
coredns-64897985d-ljf28         1/1     Running                 0              6d21h
coredns-64897985d-z5gtp         1/1     Running                 0              6d21h
etcd-node1                      1/1     Running                 0              6d21h
kube-apiserver-node1            1/1     Running                 0              6d21h
kube-controller-manager-node1   1/1     Running                 0              6d21h
kube-proxy-62hbt                1/1     Running                 0              6d21h
kube-proxy-6slqw                0/1     ImagePullBackOff        0              29m
kube-proxy-klp9x                0/1     ImagePullBackOff        0              24m
kube-scheduler-node1            1/1     Running                 0              6d21h
weave-net-469x6                 0/2     Init:ImagePullBackOff   0              29m
weave-net-98lzc                 2/2     Running                 1 (6d3h ago)   6d4h
weave-net-glnsr                 0/2     Init:ImagePullBackOff   0              24m

The fix is the same as above: pull the weave images on every node.

docker pull weaveworks/weave-kube:2.8.1
docker tag weaveworks/weave-kube:2.8.1 ghcr.io/weaveworks/launcher/weave-kube:2.8.1
docker rmi weaveworks/weave-kube:2.8.1
docker pull weaveworks/weave-npc:2.8.1
docker tag weaveworks/weave-npc:2.8.1 ghcr.io/weaveworks/launcher/weave-npc:2.8.1
docker rmi weaveworks/weave-npc:2.8.1
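
Since the same pulls are needed on every node, they can also be pushed out with ansible like the earlier steps (a sketch assuming the commands above are saved locally as pull_weave.sh — a hypothetical filename — and the same inventory file is used):

# copies the local script to each worker and runs it there (pull_weave.sh is a hypothetical name)
ansible -i /root/playbooks/hosts k8s-workers -m script -a "./pull_weave.sh"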

Problem 5:

About the docker configuration file: use systemctl status docker to see which unit file and configuration are in effect. Reference: www.cnblogs.com/ooops/p/128…
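
Two related commands for locating docker's effective configuration (standard systemd/docker tooling):

systemctl cat docker          # unit file plus any drop-in overrides
cat /etc/docker/daemon.json   # daemon options such as exec-opts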

Final approach for downloading container images:

Reference: use Aliyun Container Registry to build images hosted abroad (阿里云构建国外镜像); see the Aliyun registry address (阿里云镜像地址).

View the kubelet startup logs: journalctl -xefu kubelet

View logs of a crashed pod's previous container: kubectl logs --previous podname -n namespace

Other useful kubeadm commands: kubeadm config print init-defaults; generate a config file with kubeadm config print init-defaults --component-configs KubeProxyConfiguration,KubeletConfiguration > kubeadm-config.yaml; list the required images with kubeadm config images list.

  • The etcd data directory is mounted as a hostPath volume: path /var/lib/etcd, type DirectoryOrCreate (see the sketch below).
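
For reference, this is where that hostPath appears in the kubeadm-generated etcd static pod manifest (a sketch of the default layout; exact field order may differ):

grep -B1 -A3 'path: /var/lib/etcd' /etc/kubernetes/manifests/etcd.yaml
# roughly:
#   - hostPath:
#       path: /var/lib/etcd
#       type: DirectoryOrCreate
#     name: etcd-data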

HA (high availability)

[root@sc-master-2 ~]# cat /etc/nginx/nginx.conf
# For more information on configuration, see:
#   * Official English Documentation: http://nginx.org/en/docs/
#   * Official Russian Documentation: http://nginx.org/ru/docs/

user nginx;
worker_processes auto;
error_log /var/log/nginx/error.log;
pid /run/nginx.pid;

# Load dynamic modules. See /usr/share/doc/nginx/README.dynamic.
include /usr/share/nginx/modules/*.conf;

events {
    worker_connections 1024;
}
# Layer-4 load balancing for the kube-apiserver on the three master nodes
stream {
    log_format  main  '$remote_addr $upstream_addr - [$time_local] $status $upstream_bytes_sent';
    access_log  /var/log/nginx/k8s-access.log  main;
    upstream k8s-apiserver {
       server 172.70.21.11:6443; 
       server 172.70.21.12:6443; 
       server 172.70.21.13:6443;  
    }
    
    server {
       listen 16443; # nginx runs on the same hosts as the masters, so the listen port cannot be 6443 or it would conflict with the apiserver
       proxy_pass k8s-apiserver;
    }
}
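
After writing the config, validate it and confirm nginx is listening on the new port (standard nginx/iproute2 commands):

nginx -t
systemctl restart nginx
ss -lntp | grep 16443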

[root@sc-master-2 ~]# cd /etc/keepalived/
check_nginx.sh   keepalived.conf  
[root@sc-master-2 ~]# cd /etc/keepalived/
[root@sc-master-2 keepalived]# ls
check_nginx.sh  keepalived.conf
[root@sc-master-2 keepalived]# cat check_nginx.sh 
#!/bin/bash
# If no nginx process is running, kill keepalived so the VIP fails over to another node.

A=`ps -C nginx --no-header |wc -l`

if [ $A -eq 0 ];then
        /usr/bin/kill -15 `cat /var/run/keepalived.pid`
fi

[root@sc-master-2 keepalived]# cat keepalived.conf 
! Configuration File for keepalived

vrrp_script chk_nginx {
  script "/etc/keepalived/check_nginx.sh"
  interval 2
  weight 4
}

vrrp_instance VI_1 {
    state BACKUP
    interface eno1
    virtual_router_id 51
    priority 60
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass 1111
    }
    track_script {
      chk_nginx # run the health-check script
    }
    virtual_ipaddress {
        172.70.21.9
    }
}
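
With nginx and keepalived running on each master, the VIP should land on exactly one node at a time. A quick check, using the VIP, interface, and port from the configs above:

systemctl enable --now nginx keepalived
ip addr show eno1 | grep 172.70.21.9   # the VIP should appear on the current MASTER node only
# kubeadm can then use 172.70.21.9:16443 as the controlPlaneEndpoint / join address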