Deploying a Kubernetes Test Cluster with kubeadm on Ubuntu 22.04 (arm64)


This document is my study notes for learning Kubernetes and building a test environment. On a Mac I used the UTM virtualization tool to create three Ubuntu 22.04 arm64 virtual machines, each with 2 CPU cores and 2 GB of RAM, in a 1-master / 2-worker layout.

Created: 2024-09-21

1. Deployment Architecture Planning

Role      Hostname   Components                                                                      IP            Spec
master01  server01   etcd, apiserver, controller-manager, scheduler, kubelet, proxy, flannel, runc   192.168.64.7  2 cores / 2 GB
worker01  server02   pods, kubelet, proxy, flannel, runc                                             192.168.64.8  2 cores / 2 GB
worker02  server03   pods, kubelet, proxy, flannel, runc                                             192.168.64.9  2 cores / 2 GB

2. Software Versions

  • docker server: 27.2.1
  • containerd: 1.7.22
  • kubeadm: v1.28.2
  • kubelet: v1.28.2
  • kubectl: v1.28.2

3. Cluster Server Initialization (run every step in section 3 on all three machines)

3.1 Add hostnames

cat >> /etc/hosts <<EOF
192.168.64.7 m1 server01
192.168.64.8 w1 server02
192.168.64.9 w2 server03
EOF
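To confirm the entries landed correctly, a small shell check can grep each IP/alias pair out of a hosts file. The `check_hosts` helper below is illustrative, not a standard tool; on the real nodes you would point it at /etc/hosts, while the demo uses a temporary copy so it is self-contained:

```shell
# check_hosts FILE "IP NAME"... - report whether each IP/name pair is present.
check_hosts() {
  file=$1; shift
  for pair in "$@"; do
    ip=${pair% *}; name=${pair#* }
    if grep "^$ip" "$file" | grep -qw "$name"; then
      echo "$name ok"
    else
      echo "$name MISSING"
    fi
  done
}

# Demo against a temporary file; use /etc/hosts on the real machines.
tmp=$(mktemp)
printf '192.168.64.7 m1 server01\n192.168.64.8 w1 server02\n192.168.64.9 w2 server03\n' > "$tmp"
check_hosts "$tmp" "192.168.64.7 m1" "192.168.64.8 w1" "192.168.64.9 w2"
```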

3.2 Keep host clocks consistent with timedatectl

timedatectl set-timezone Asia/Shanghai  # set the timezone (persists across reboots)
timedatectl status                      # show the current timezone and time
## Alternatively, point /etc/localtime at the zone file directly:
cp /usr/share/zoneinfo/Asia/Shanghai /etc/localtime

3.3 Disable Ubuntu's default ufw firewall

systemctl status ufw
systemctl disable ufw && systemctl stop ufw
## "Active: inactive (dead)" means it is stopped

3.4 Disable the swap partition

swapoff -a  # disable swap immediately (lasts until reboot)
sed -ri 's/.*swap.*/#&/' /etc/fstab  # comment out the swap entry so it stays off after reboot
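Because a careless sed expression against /etc/fstab can break the next boot, it is worth rehearsing the substitution on a throwaway copy first. The file contents below are made up for the demo:

```shell
# Rehearse the fstab edit on a temporary file instead of the real /etc/fstab.
f=$(mktemp)
cat > "$f" <<'EOF'
UUID=1234-abcd /     ext4 defaults 0 1
/swap.img      none  swap sw       0 0
EOF

sed -ri 's/.*swap.*/#&/' "$f"   # prefix every line mentioning swap with '#'

cat "$f"   # only the swap line is now commented out
```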

3.5 Enable kernel IP forwarding and bridge netfilter

## These settings ensure correct packet forwarding, make bridged traffic
## visible to iptables, and tune memory management for Kubernetes.
  ## Add the kernel-module config file
cat << EOF | tee /etc/modules-load.d/k8s.conf
overlay
br_netfilter
EOF
  ## Load the modules now
modprobe overlay
modprobe br_netfilter
  ## Add the bridge-filter and IP-forwarding sysctl config file
cat  <<EOF | tee /etc/sysctl.d/k8s.conf
net.bridge.bridge-nf-call-iptables = 1
net.bridge.bridge-nf-call-ip6tables = 1
net.ipv4.ip_forward = 1
vm.swappiness = 0
EOF
  ## Apply the settings
sysctl --system
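A quick way to confirm the settings took effect is to read them back from /proc/sys. `check_sysctl` below is a hypothetical helper name, and a MISSING line usually just means br_netfilter has not been loaded yet:

```shell
# Print the live value of each required key, or MISSING if its /proc/sys
# entry does not exist (e.g. br_netfilter not loaded).
check_sysctl() {
  for key in net.bridge.bridge-nf-call-iptables \
             net.bridge.bridge-nf-call-ip6tables \
             net.ipv4.ip_forward; do
    path=/proc/sys/$(echo "$key" | tr . /)
    if [ -r "$path" ]; then
      echo "$key = $(cat "$path")"
    else
      echo "$key MISSING"
    fi
  done
}

check_sysctl
```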

3.6 Install a container runtime

The container runtime is what lets Kubernetes actually run containers: it manages the execution and lifecycle of containers in the cluster. Since Kubernetes 1.24, dockershim (the shim that let the kubelet talk to Docker directly) is no longer supported, and a CRI-compliant container runtime is required as the bridge between the kubelet and containers. There are two ways to install one:
(1) Install Docker, which bundles containerd.io as the container runtime.
(2) Install containerd + runc + the CNI plugins separately.
For convenience, install Docker (option 1); to understand each component's role more deeply, choose option 2.

3.6.1 Install Docker

See the official documentation: docs.docker.com/engine/inst…

  1. Remove conflicting older packages
 for pkg in docker.io docker-doc docker-compose docker-compose-v2 podman-docker containerd runc; do sudo apt-get remove $pkg; done
  2. Add Docker's official apt repository key and source
## Add Docker's official GPG key:
sudo apt-get update
sudo apt-get install ca-certificates curl
sudo install -m 0755 -d /etc/apt/keyrings
sudo curl -fsSL https://download.docker.com/linux/ubuntu/gpg -o /etc/apt/keyrings/docker.asc
sudo chmod a+r /etc/apt/keyrings/docker.asc

## Add the repository to Apt sources:
echo \
  "deb [arch=$(dpkg --print-architecture) signed-by=/etc/apt/keyrings/docker.asc] https://download.docker.com/linux/ubuntu \
  $(. /etc/os-release && echo "$VERSION_CODENAME") stable" | \
  sudo tee /etc/apt/sources.list.d/docker.list > /dev/null
sudo apt-get update

If that network is unreachable, add the domestic Aliyun mirror key and source instead:

sudo curl -fsSL https://mirrors.aliyun.com/docker-ce/linux/ubuntu/gpg -o /etc/apt/keyrings/docker.asc
sudo chmod a+r /etc/apt/keyrings/docker.asc

echo \
  "deb [arch=$(dpkg --print-architecture) signed-by=/etc/apt/keyrings/docker.asc] https://mirrors.aliyun.com/docker-ce/linux/ubuntu \
  $(. /etc/os-release && echo "$VERSION_CODENAME") stable" | \
  sudo tee /etc/apt/sources.list.d/docker.list > /dev/null
  3. Install the latest Docker packages
 sudo apt-get install docker-ce docker-ce-cli containerd.io docker-buildx-plugin docker-compose-plugin

4. Verify the Docker version

docker version
## output looks like this
Client: Docker Engine - Community
 Version:           27.2.1
 API version:       1.47
 Go version:        go1.22.7
 Git commit:        9e34c9b
 Built:             Fri Sep  6 12:09:00 2024
 OS/Arch:           linux/arm64
 Context:           default

Server: Docker Engine - Community
 Engine:
  Version:          27.2.1
  API version:      1.47 (minimum version 1.24)
  Go version:       go1.22.7
  Git commit:       8b539b8
  Built:            Fri Sep  6 12:09:00 2024
  OS/Arch:          linux/arm64
  Experimental:     false
  
 containerd:  ## the container runtime behind Docker
  Version:          1.7.22
  GitCommit:        7f7fdf5fed64eb6a7caf99b3e12efcf9d60e311c
  
 runc:    ## the low-level runtime that talks to the Linux kernel
  Version:          1.1.14
  GitCommit:        v1.1.14-0-g2c9f560

 docker-init:
  Version:          0.19.0
  GitCommit:        de40ad0

5. Check service state; start Docker and enable it at boot

## Start Docker and enable it at boot
systemctl start docker
systemctl enable docker
## All three units should be in the running state
systemctl status containerd.service
systemctl status docker.service
systemctl status docker.socket

3.6.2 Manually install the container runtime components

1. Download the containerd binaries and extract them to /usr/local

wget https://github.com/containerd/containerd/releases/download/v1.7.11/containerd-1.7.11-linux-arm64.tar.gz
sudo tar Cxzvf /usr/local containerd-1.7.11-linux-arm64.tar.gz

2. Generate the containerd config file

sudo mkdir /etc/containerd
containerd config default > config.toml
sudo cp config.toml /etc/containerd

3. Install a systemd service file for containerd

wget https://raw.githubusercontent.com/containerd/containerd/main/containerd.service
sudo cp containerd.service /etc/systemd/system/
sudo systemctl daemon-reload
sudo systemctl enable --now containerd
sudo systemctl restart containerd
## `systemctl status containerd` should now show "Active: active (running)"

4. Install runc (the low-level runtime that drives the Linux kernel on containerd's behalf)
wget https://github.com/opencontainers/runc/releases/download/v1.1.10/runc.arm64
sudo install -m 755 runc.arm64 /usr/local/sbin/runc

5. Install the CNI plugins that containerd uses for networking

wget https://github.com/containernetworking/plugins/releases/download/v1.4.0/cni-plugins-linux-arm64-v1.4.0.tgz
sudo mkdir -p /opt/cni/bin
sudo tar Cxzvf /opt/cni/bin cni-plugins-linux-arm64-v1.4.0.tgz

3.7 Configure the systemd cgroup driver for the container runtime

## Generate the default config file (skip this if you already created it in 3.6.2)
containerd config default > /etc/containerd/config.toml

## Edit the config file
vim /etc/containerd/config.toml
In the [plugins."io.containerd.grpc.v1.cri".containerd.runtimes.runc.options] section
default:
SystemdCgroup = false
change to:
SystemdCgroup = true

## If the VM has unrestricted internet access, the default image below also works
In the [plugins."io.containerd.grpc.v1.cri"] section
default:
sandbox_image = "registry.k8s.io/pause:3.6"
change to:
sandbox_image = "registry.aliyuncs.com/google_containers/pause:3.9"

## Restart containerd
sudo systemctl daemon-reload
sudo systemctl restart containerd
sudo systemctl status containerd
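The two vim edits above can also be scripted with sed. The sketch below rehearses both substitutions on a minimal sample fragment; on the real nodes you would aim the same sed commands at /etc/containerd/config.toml:

```shell
# Rehearse the config.toml edits on a sample fragment (not the real file).
cfg=$(mktemp)
cat > "$cfg" <<'EOF'
    sandbox_image = "registry.k8s.io/pause:3.6"
            SystemdCgroup = false
EOF

# Switch the runc cgroup driver to systemd.
sed -i 's/SystemdCgroup = false/SystemdCgroup = true/' "$cfg"
# Point the pause (sandbox) image at the Aliyun mirror.
sed -i 's#sandbox_image = ".*"#sandbox_image = "registry.aliyuncs.com/google_containers/pause:3.9"#' "$cfg"

cat "$cfg"
```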

3.8 Add the official Kubernetes apt key and repository

curl -fsSL https://pkgs.k8s.io/core:/stable:/v1.28/deb/Release.key | sudo gpg --dearmor -o /etc/apt/keyrings/kubernetes-apt-keyring.gpg
echo 'deb [signed-by=/etc/apt/keyrings/kubernetes-apt-keyring.gpg] https://pkgs.k8s.io/core:/stable:/v1.28/deb/ /' | sudo tee /etc/apt/sources.list.d/kubernetes.list
sudo apt update

If that network is unreachable, add a domestic mirror key and source instead:

sudo curl -fsSL https://mirrors.aliyun.com/kubernetes/apt/doc/apt-key.gpg | sudo apt-key add - 
sudo add-apt-repository "deb https://mirrors.aliyun.com/kubernetes/apt/ kubernetes-xenial main"
sudo apt update

3.9 Install the Kubernetes components

## First install the helper packages
sudo apt-get install -y apt-transport-https ca-certificates curl gpg
## Install the pinned 1.28 versions (matching the v1.28.2 cluster initialized below)
sudo apt-get install -y kubelet=1.28.2-1.1 kubeadm=1.28.2-1.1 kubectl=1.28.2-1.1
## Hold the packages so automatic upgrades cannot break the cluster
sudo apt-mark hold kubelet kubeadm kubectl

## Check the service state (kubelet is not started yet)
systemctl status kubelet
## A failed kubelet here is normal: its prerequisite config under /var/lib/kubelet does not exist yet. It will start successfully once `kubeadm init` has run on the master.

4. Kubernetes Initialization (run only on the master node)

4.1 Pre-pull the images

## The command below lists the images kubeadm uses by default; they are hosted on registries unreachable from China
kubeadm config images list --kubernetes-version=v1.28.2
## output:
registry.k8s.io/kube-apiserver:v1.28.2
registry.k8s.io/kube-controller-manager:v1.28.2
registry.k8s.io/kube-scheduler:v1.28.2
registry.k8s.io/kube-proxy:v1.28.2
registry.k8s.io/pause:3.9
registry.k8s.io/etcd:3.5.9-0
registry.k8s.io/coredns/coredns:v1.10.1

## List the equivalent images available from the Aliyun mirror
kubeadm config images list --image-repository registry.aliyuncs.com/google_containers --kubernetes-version=v1.28.2
## output:
registry.aliyuncs.com/google_containers/kube-apiserver:v1.28.2
registry.aliyuncs.com/google_containers/kube-controller-manager:v1.28.2
registry.aliyuncs.com/google_containers/kube-scheduler:v1.28.2
registry.aliyuncs.com/google_containers/kube-proxy:v1.28.2
registry.aliyuncs.com/google_containers/pause:3.9
registry.aliyuncs.com/google_containers/etcd:3.5.9-0
registry.aliyuncs.com/google_containers/coredns:v1.10.1

## Pull from the mirror or the official registry, depending on your network
kubeadm config images pull --kubernetes-version v1.28.2 --image-repository registry.aliyuncs.com/google_containers

4.2 Initialize the master

There are two equivalent ways to initialize; pick one: (1) pass the init parameters by hand; (2) generate a config file, edit it, and pass it to init.

Method 1: initialize with explicit init flags

kubeadm init \
  --control-plane-endpoint="192.168.64.7" \
  --image-repository registry.aliyuncs.com/google_containers \
  --kubernetes-version v1.28.2 \
  --service-cidr=10.96.0.0/12 \
  --pod-network-cidr=10.244.0.0/16 \
  --ignore-preflight-errors=all
### --apiserver-advertise-address  # the API address announced to the cluster; with a single master this is m1's IP
### --control-plane-endpoint="kubeapi.magedu.com"  # alternative to the flag above: use a planned virtual hostname for the control plane
### --image-repository  # the default registry k8s.gcr.io is unreachable from China, so point at the Aliyun mirror
### --kubernetes-version  # the Kubernetes version; must match the packages installed above
### --service-cidr  # the cluster-internal Service network (unified entry point to Pods); the value above can be used as-is
### --pod-network-cidr  # the Pod network; must match the CNI manifest deployed below; the value above can be used as-is

Method 2: initialize from a generated kubeadm config file

kubeadm config print init-defaults > kubeadm.yaml

Edit the defaults with vim kubeadm.yaml (four changes in total):

1. Set localAPIEndpoint.advertiseAddress to the master's IP.

2. Set nodeRegistration.name to the current node's name.

3. Set imageRepository to the domestic mirror: registry.aliyuncs.com/google_containers.

4. Add networking.podSubnet; its range must overlap neither networking.serviceSubnet nor the node network 192.168.64.0/24, so 10.244.0.0/16 is used here.

The file after editing:

apiVersion: kubeadm.k8s.io/v1beta3
bootstrapTokens:
- groups:
  - system:bootstrappers:kubeadm:default-node-token
  token: abcdef.0123456789abcdef
  ttl: 24h0m0s
  usages:
  - signing
  - authentication
kind: InitConfiguration
localAPIEndpoint:
  advertiseAddress: 192.168.64.7
  bindPort: 6443
nodeRegistration:
  criSocket: unix:///var/run/containerd/containerd.sock
  imagePullPolicy: IfNotPresent
  name: m1
  taints: null
---
apiServer:
  timeoutForControlPlane: 4m0s
apiVersion: kubeadm.k8s.io/v1beta3
certificatesDir: /etc/kubernetes/pki
clusterName: kubernetes
controllerManager: {}
dns: {}
etcd:
  local:
    dataDir: /var/lib/etcd
imageRepository: registry.aliyuncs.com/google_containers
kind: ClusterConfiguration
kubernetesVersion: 1.28.2
networking:
  dnsDomain: cluster.local
  podSubnet: 10.244.0.0/16 # Pod subnet added here
  serviceSubnet: 10.96.0.0/12
scheduler: {}
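Before running init it is worth double-checking that podSubnet overlaps neither serviceSubnet nor the node network, as required in step 4 above. The `overlap` helper below is an illustrative sketch (it compares two IPv4 CIDRs under the shorter of the two prefixes), not part of kubeadm:

```shell
# ip2int converts a dotted-quad IPv4 address into a 32-bit integer.
ip2int() {
  oldifs=$IFS; IFS=.
  set -- $1
  IFS=$oldifs
  echo $(( ($1 << 24) + ($2 << 16) + ($3 << 8) + $4 ))
}

# overlap A/B C/D succeeds (exit 0) when the two CIDR ranges intersect.
overlap() {
  p1=${1#*/}; p2=${2#*/}
  i1=$(ip2int "${1%/*}"); i2=$(ip2int "${2%/*}")
  p=$p1; [ "$p2" -lt "$p" ] && p=$p2          # compare under the shorter prefix
  mask=$(( (0xFFFFFFFF << (32 - p)) & 0xFFFFFFFF ))
  [ $(( i1 & mask )) -eq $(( i2 & mask )) ]
}

for pair in "10.244.0.0/16 10.96.0.0/12" "10.244.0.0/16 192.168.64.0/24"; do
  set -- $pair
  if overlap "$1" "$2"; then echo "$1 vs $2: OVERLAP"; else echo "$1 vs $2: ok"; fi
done
```

With the subnets chosen in this document, both checks print "ok".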

Run the initialization:

 sudo kubeadm init --config kubeadm.yaml

On success the output looks like this; run the commands it suggests:

Your Kubernetes control-plane has initialized successfully!
To start using your cluster, you need to run the following as a regular user: 
    mkdir -p $HOME/.kube 
    sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config 
    sudo chown $(id -u):$(id -g) $HOME/.kube/config 
Alternatively, if you are the root user, you can run: 
    export KUBECONFIG=/etc/kubernetes/admin.conf 
You should now deploy a pod network to the cluster. 
Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at: 
    https://kubernetes.io/docs/concepts/cluster-administration/addons/ 
You can now join any number of control-plane nodes by copying certificate authorities and service account keys on each node and then running the following as root: 
    kubeadm join kubeapi.beety.com:6443 --token avjbin.trcl3dwub19jjcau \
        --discovery-token-ca-cert-hash sha256:80943437321a578c95a90bc2dae6267f4955b1d09fdf0b8c4b76e938993a780e \
        --control-plane
Then you can join any number of worker nodes by running the following on each as root: 
    kubeadm join kubeapi.beety.com:6443 --token avjbin.trcl3dwub19jjcau \
        --discovery-token-ca-cert-hash sha256:80943437321a578c95a90bc2dae6267f4955b1d09fdf0b8c4b76e938993a780e

At this point the node is still NotReady, because no network plugin is installed yet (you can see coredns has not started).

4.3 Install the cluster network

1. Download the kube-flannel.yml manifest

wget https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml

2. Pull the images the manifest references

grep image: kube-flannel.yml

docker pull docker.io/flannel/flannel-cni-plugin:v1.5.1-flannel2
docker pull docker.io/flannel/flannel:v0.25.6
## (grep lists flannel:v0.25.6 twice because it appears in both an initContainer and the main container; one pull is enough)

## If the pulls fail, try quay.io/coreos/flannel:$tag and update the image references in the manifest to match

3. Create the flannel network

kubectl create -f kube-flannel.yml

## output of the create command
podsecuritypolicy.policy/psp.flannel.unprivileged created
clusterrole.rbac.authorization.k8s.io/flannel created
clusterrolebinding.rbac.authorization.k8s.io/flannel created
serviceaccount/flannel created
configmap/kube-flannel-cfg created
daemonset.apps/kube-flannel-ds-arm64 created
daemonset.apps/kube-flannel-ds-arm created
daemonset.apps/kube-flannel-ds-ppc64le created
daemonset.apps/kube-flannel-ds-s390x created

## Check pod status: kubectl get pod -A

NAMESPACE      NAME                         READY   STATUS    RESTARTS       AGE
kube-flannel   kube-flannel-ds-49klq        1/1     Running   0              14h
kube-flannel   kube-flannel-ds-kx58p        1/1     Running   12 (13h ago)   14h
kube-flannel   kube-flannel-ds-tbhz4        1/1     Running   12 (13h ago)   14h
kube-system    coredns-66f779496c-95v9f     1/1     Running   0              3d3h
kube-system    coredns-66f779496c-fw8c7     1/1     Running   0              3d3h
kube-system    etcd-m1                      1/1     Running   2 (14h ago)    3d3h
kube-system    kube-apiserver-m1            1/1     Running   2 (14h ago)    3d3h
kube-system    kube-controller-manager-m1   1/1     Running   2 (14h ago)    3d3h
kube-system    kube-proxy-7dd5l             1/1     Running   13 (13h ago)   14h
kube-system    kube-proxy-cjcgl             1/1     Running   2 (14h ago)    3d3h
kube-system    kube-proxy-mxjr5             1/1     Running   12 (13h ago)   14h
kube-system    kube-scheduler-m1            1/1     Running   2 (14h ago)    3d3h

## Once all pods are Running you can go on to join the worker nodes

4.4 Join the worker nodes (run on each worker)

Run the following on each worker node to join the cluster:

kubeadm join kubeapi.beety.com:6443 --token avjbin.trcl3dwub19jjcau \
    --discovery-token-ca-cert-hash sha256:80943437321a578c95a90bc2dae6267f4955b1d09fdf0b8c4b76e938993a780e

Check the node list: kubectl get nodes

NAME       STATUS   ROLES           AGE    VERSION
m1         Ready    control-plane   3d3h   v1.28.2
server02   Ready    <none>          14h    v1.28.2
server03   Ready    <none>          14h    v1.28.2

At this point a complete Kubernetes cluster suitable for a test environment is up, and you can move on to practicing with Deployments, Services, and the other resource types.

5. Problems Encountered and Fixes

Network issues

5.1 kube-proxy or flannel pods stuck in CrashLoopBackOff

One worker node had been missed when setting SystemdCgroup = true in the containerd config; fixing that setting (and restarting containerd) resolved the problem.

5.2 flannel fails to deploy on a worker node, pod stuck in an init error state

If the network plugin's images have to be pulled manually, remember that every host in the cluster needs the flannel images, not just the master.