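This article summarizes the problems you may run into when installing k8s with kubeadm. The k8s version is 1.25, and the hosts are CentOS 7 virtual machines.
It does not introduce k8s itself; it only covers installation. It installs first and explains afterwards: section 3 is the complete set of installation commands, which you can copy and run after changing a few key values.
I hope it helps with your k8s installation.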
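1. Requirements -- excerpted from the official documentation, a quick skim is enough
- A compatible Linux host. The Kubernetes project provides generic instructions for Linux distributions based on Debian and Red Hat, and for distributions without a package manager.
- 2 GB or more of RAM per machine (any less leaves little memory for your applications).
- 2 or more CPU cores.
- Full network connectivity between all machines in the cluster (a public or a private network is fine).
- A unique hostname, MAC address, and product_uuid for every node. See the official documentation for details.
- Certain ports open on your machines. See the official documentation for details.
- Swap disabled. You must disable swap for the kubelet to work properly.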
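The requirements above can be checked before installing anything. The following is a minimal sketch, not part of kubeadm: the function names are hypothetical, and the 1700 MB memory threshold is an assumption chosen because a VM configured with 2 GB reports slightly less than that in MemTotal.

```shell
#!/usr/bin/env bash
# Hypothetical pre-flight helpers (not part of kubeadm).

# cpu_ok <count>: succeeds when there are at least 2 CPU cores.
cpu_ok() { [ "$1" -ge 2 ]; }

# mem_ok <kB>: succeeds when memory is at least 1700 MB.
# Assumption: a "2 GB" VM reports a little less than 2 GB in MemTotal.
mem_ok() { [ "$1" -ge $((1700 * 1024)) ]; }

# swap_disabled <file>: succeeds when a /proc/swaps-style file lists
# no swap devices (only the header line).
swap_disabled() { awk 'NR > 1 { n++ } END { exit (n > 0) }' "$1"; }

# preflight: runs all checks against the current machine.
preflight() {
  cpu_ok "$(nproc)" || echo "fewer than 2 CPU cores"
  mem_ok "$(awk '/^MemTotal:/ { print $2 }' /proc/meminfo)" || echo "less than about 2 GB of RAM"
  swap_disabled /proc/swaps || echo "swap is still enabled"
  # These two values must differ on every node.
  hostname
  cat /sys/class/dmi/id/product_uuid
}
```

Run preflight as root on each machine and compare the last two lines of output between machines; cloned virtual machines often share a product_uuid.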
2. Machines and notes
| OS | IP | Role | CPU (cores) | Memory (GB) | Hostname |
|---|---|---|---|---|---|
| centos7 | 192.168.117.120 | master | ≥2 | ≥2 | master-120 |
| centos7 | 192.168.117.130 | worker / node | ≥2 | ≥2 | node-130 |
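The table above lists the machines. The installation has two steps: 1. install kubelet and the other components on the master; 2. join the node to the cluster.
Before installing, you must set the hostname on the node and on the master separately.
3. Installation commands -- copy and paste the commands below into Xshell to run them
The commands below install the master node. Do not run them yet: paste them into a text editor first, because some key values must be changed:
- the hosts entries and the hostname
- the kubeadm initialization settings
The places that need changes are marked in the code below. After editing, consider deleting the comments to reduce the chance of problems. Then copy all of the code at once and paste it into Xshell to install k8s.
Master node installation
# Change this, then delete the comment
# Set it to the hostname you chose for this machine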
hostnamectl set-hostname master-120
sed -ri 's#(SELINUX=).*#\1disabled#' /etc/selinux/config
setenforce 0
systemctl disable firewalld
systemctl stop firewalld
swapoff -a
# Change this, then delete the comment
# Set these to your own machines' IPs and hostnames
cat >>/etc/hosts<<EOF
192.168.117.120 master-120
192.168.117.130 node-130
EOF
systemctl stop firewalld
systemctl disable firewalld
touch /etc/modules-load.d/k8s.conf
cat > /etc/modules-load.d/k8s.conf << EOF
br_netfilter
EOF
cat > /etc/sysctl.d/k8s.conf << EOF
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1
net.bridge.bridge-nf-call-iptables=1
net.bridge.bridge-nf-call-ip6tables=1
net.ipv4.ip_forward=1
net.ipv4.tcp_tw_recycle=0
vm.swappiness=0
vm.overcommit_memory=1
vm.panic_on_oom=0
fs.inotify.max_user_instances=8192
fs.inotify.max_user_watches=1048576
fs.file-max=52706963
fs.nr_open=52706963
net.ipv6.conf.all.disable_ipv6=1
net.netfilter.nf_conntrack_max=2310720
EOF
sudo sysctl --system
swapoff -a
sed -ri 's/.*swap.*/#&/' /etc/fstab
setenforce 0
sed -i 's/enforcing/disabled/' /etc/selinux/config
systemctl disable firewalld
yum install -y yum-utils device-mapper-persistent-data lvm2
yum-config-manager --add-repo https://download.docker.com/linux/centos/docker-ce.repo
yum -y install docker-ce
mkdir /etc/docker/
touch /etc/docker/daemon.json
cat > /etc/docker/daemon.json << EOF
{
"exec-opts": ["native.cgroupdriver=systemd"],
"registry-mirrors": ["https://hub-mirror.c.163.com"]
}
EOF
systemctl enable docker
touch /etc/yum.repos.d/kubernetes.repo
cat > /etc/yum.repos.d/kubernetes.repo << EOF
[kubernetes]
name=Kubernetes
baseurl=https://mirrors.aliyun.com/kubernetes/yum/repos/kubernetes-el7-x86_64/
enabled=1
gpgcheck=0
repo_gpgcheck=0
gpgkey=https://mirrors.aliyun.com/kubernetes/yum/doc/yum-key.gpg https://mirrors.aliyun.com/kubernetes/yum/doc/rpm-package-key.gpg
EOF
yum install -y kubelet kubeadm kubectl --disableexcludes=kubernetes
systemctl enable kubelet.service
cat > /etc/sysctl.conf << EOF
net.bridge.bridge-nf-call-ip6tables=1
net.bridge.bridge-nf-call-iptables=1
EOF
modprobe br_netfilter
sysctl -p
yum -y install containerd
rm -rf /etc/containerd/config.toml
systemctl restart containerd
mkdir /etc/containerd -p
containerd config default > /etc/containerd/config.toml
sed -i 's/k8s.gcr.io/registry.cn-beijing.aliyuncs.com\/abcdocker/' /etc/containerd/config.toml
systemctl restart containerd
iptables -F && iptables -t nat -F && iptables -t mangle -F && iptables -X
touch kubeadm.yaml
cat > kubeadm.yaml << EOF
apiVersion: kubeadm.k8s.io/v1beta3
bootstrapTokens:
- groups:
  - system:bootstrappers:kubeadm:default-node-token
  token: abcdef.0123456789abcdef
  ttl: 24h0m0s
  usages:
  - signing
  - authentication
kind: InitConfiguration
localAPIEndpoint:
  # Change this, then delete the comment
  # Set it to your own machine's IP
  advertiseAddress: 192.168.117.120
  bindPort: 6443
nodeRegistration:
  criSocket: unix:///var/run/containerd/containerd.sock
  imagePullPolicy: IfNotPresent
  # Change this, then delete the comment
  # Set it to the hostname configured on this machine
  name: master-120
  taints:
  - effect: NoSchedule
    key: node-role.kubernetes.io/master
---
apiServer:
  timeoutForControlPlane: 4m0s
apiVersion: kubeadm.k8s.io/v1beta3
certificatesDir: /etc/kubernetes/pki
clusterName: kubernetes
controllerManager: {}
dns: {}
etcd:
  local:
    dataDir: /var/lib/etcd
imageRepository: registry.aliyuncs.com/google_containers
kind: ClusterConfiguration
kubernetesVersion: 1.25.0
networking:
  dnsDomain: cluster.local
  serviceSubnet: 10.96.0.0/12
  podSubnet: 10.244.0.0/16
scheduler: {}
EOF
kubeadm config images pull --config kubeadm.yaml
kubeadm config images list
kubeadm init --config=./kubeadm.yaml --upload-certs | tee kubeadm-init.log
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
kubectl get cs
kubectl get node
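The node installation differs little from the master's; the main difference is that kubeadm init is replaced with kubeadm join. Edit the values the same way as for the master.
Note: paste the code into a text editor first and do not run it yet. The kubeadm join line in the commands below comes from the output of a successful kubeadm init.
# Change this, then delete the comment
# Set it to the hostname you chose for this machine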
hostnamectl set-hostname node-130
sed -ri 's#(SELINUX=).*#\1disabled#' /etc/selinux/config
setenforce 0
systemctl disable firewalld
systemctl stop firewalld
swapoff -a
# Change this, then delete the comment
# Set these to your own machines' IPs and hostnames
cat >>/etc/hosts<<EOF
192.168.117.120 master-120
192.168.117.130 node-130
EOF
systemctl stop firewalld
systemctl disable firewalld
touch /etc/modules-load.d/k8s.conf
cat > /etc/modules-load.d/k8s.conf << EOF
br_netfilter
EOF
cat > /etc/sysctl.d/k8s.conf << EOF
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1
net.bridge.bridge-nf-call-iptables=1
net.bridge.bridge-nf-call-ip6tables=1
net.ipv4.ip_forward=1
net.ipv4.tcp_tw_recycle=0
vm.swappiness=0
vm.overcommit_memory=1
vm.panic_on_oom=0
fs.inotify.max_user_instances=8192
fs.inotify.max_user_watches=1048576
fs.file-max=52706963
fs.nr_open=52706963
net.ipv6.conf.all.disable_ipv6=1
net.netfilter.nf_conntrack_max=2310720
EOF
sudo sysctl --system
swapoff -a
sed -ri 's/.*swap.*/#&/' /etc/fstab
setenforce 0
sed -i 's/enforcing/disabled/' /etc/selinux/config
systemctl disable firewalld
yum install -y yum-utils device-mapper-persistent-data lvm2
yum-config-manager --add-repo https://download.docker.com/linux/centos/docker-ce.repo
yum -y install docker-ce
mkdir /etc/docker/
touch /etc/docker/daemon.json
cat > /etc/docker/daemon.json << EOF
{
"exec-opts": ["native.cgroupdriver=systemd"],
"registry-mirrors": ["https://hub-mirror.c.163.com"]
}
EOF
systemctl enable docker
touch /etc/yum.repos.d/kubernetes.repo
cat > /etc/yum.repos.d/kubernetes.repo << EOF
[kubernetes]
name=Kubernetes
baseurl=https://mirrors.aliyun.com/kubernetes/yum/repos/kubernetes-el7-x86_64/
enabled=1
gpgcheck=0
repo_gpgcheck=0
gpgkey=https://mirrors.aliyun.com/kubernetes/yum/doc/yum-key.gpg https://mirrors.aliyun.com/kubernetes/yum/doc/rpm-package-key.gpg
EOF
yum install -y kubelet kubeadm kubectl --disableexcludes=kubernetes
systemctl enable kubelet.service
cat > /etc/sysctl.conf << EOF
net.bridge.bridge-nf-call-ip6tables=1
net.bridge.bridge-nf-call-iptables=1
EOF
modprobe br_netfilter
sysctl -p
yum -y install containerd
rm -rf /etc/containerd/config.toml
systemctl restart containerd
mkdir /etc/containerd -p
containerd config default > /etc/containerd/config.toml
sed -i 's/k8s.gcr.io/registry.cn-beijing.aliyuncs.com\/abcdocker/' /etc/containerd/config.toml
systemctl restart containerd
iptables -F && iptables -t nat -F && iptables -t mangle -F && iptables -X
# Change this, then delete the comment
# Replace with the kubeadm join command printed after kubeadm init on the master
kubeadm join 192.168.117.120:6443 --token abcdef.0123456789abcdef \
--discovery-token-ca-cert-hash sha256:66dbacf906dbfbb793cce0a53dccf6679a51172bea8c04e71fa54bc150a2e05c
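4. Installation walkthrough -- read this if anything above is unclear
The k8s installation breaks down roughly into these steps:
- Turn off the firewall. If you are installing k8s to learn, this is recommended because it avoids problems.
- Set up hosts. This step matters: machines in the cluster find each other by hostname.
- Configure the system environment (iptables, swap, SELinux), because k8s has requirements on it.
- Install Docker.
- Configure the k8s package repository, then install kubeadm and the other tools.
- Initialize k8s with kubeadm. Done.
- Join the node.
4.1 Firewall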
$ systemctl stop firewalld
$ systemctl disable firewalld
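4.2 Set up hosts
# Set the hostname on the master and on each node
$ hostnamectl set-hostname master-120
# Configure hosts (append, so the existing localhost entries are kept)
$ cat >> /etc/hosts << EOF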
192.168.117.120 master-120
192.168.117.130 node-130
EOF
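Appending to /etc/hosts adds duplicate lines if the commands are run more than once. As a hedged alternative, the sketch below appends an entry only when the hostname is not mapped yet. The function name add_host_entry and its optional third argument (the file to edit) are hypothetical and exist only for this illustration.

```shell
# add_host_entry <ip> <hostname> [hosts-file]
# Appends "ip hostname" only when that hostname is not already mapped,
# so running it again does not create duplicate lines.
add_host_entry() {
  local ip="$1" name="$2" file="${3:-/etc/hosts}"
  if ! grep -qE "[[:space:]]${name}([[:space:]]|\$)" "$file" 2>/dev/null; then
    printf '%s %s\n' "$ip" "$name" >> "$file"
  fi
}

# Example (run as root):
# add_host_entry 192.168.117.120 master-120
# add_host_entry 192.168.117.130 node-130
```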
4.3 Configure the system environment: iptables/swap/SELinux
$ cat <<EOF | sudo tee /etc/modules-load.d/k8s.conf
br_netfilter
EOF
$ cat > /etc/sysctl.d/k8s.conf << EOF
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1
net.bridge.bridge-nf-call-iptables=1
net.bridge.bridge-nf-call-ip6tables=1
net.ipv4.ip_forward=1
net.ipv4.tcp_tw_recycle=0
vm.swappiness=0
vm.overcommit_memory=1
vm.panic_on_oom=0
fs.inotify.max_user_instances=8192
fs.inotify.max_user_watches=1048576
fs.file-max=52706963
fs.nr_open=52706963
net.ipv6.conf.all.disable_ipv6=1
net.netfilter.nf_conntrack_max=2310720
EOF
$ sudo sysctl --system
$ swapoff -a
$ sed -ri 's/.*swap.*/#&/' /etc/fstab
$ setenforce 0
$ sed -i 's/enforcing/disabled/' /etc/selinux/config
$ cat > /etc/sysctl.conf << EOF
net.bridge.bridge-nf-call-ip6tables=1
net.bridge.bridge-nf-call-iptables=1
EOF
$ modprobe br_netfilter
$ sysctl -p
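swapoff -a only lasts until the next reboot; commenting out the swap line in /etc/fstab is what keeps swap off permanently. The sketch below shows that edit as a function that takes the file path as an argument, so it can be tried on a copy first. The function name comment_out_swap is hypothetical, and skipping lines that are already commented is an addition of this sketch.

```shell
# comment_out_swap <fstab-file>
# Prefixes every line that mentions swap with '#'.
# Lines that are already comments are left alone, so the function
# can be run repeatedly without stacking up '#' characters.
comment_out_swap() {
  sed -i '/^[[:space:]]*#/!s/.*swap.*/#&/' "$1"
}

# Example: try it on a copy before touching the real file.
# cp /etc/fstab /tmp/fstab.test && comment_out_swap /tmp/fstab.test
# Then, as root: comment_out_swap /etc/fstab
```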
4.4 Install Docker
# Install dependencies
$ yum install -y yum-utils device-mapper-persistent-data lvm2
# Add the package repository
$ yum-config-manager --add-repo https://download.docker.com/linux/centos/docker-ce.repo
$ yum -y install docker-ce
# Configure the Docker registry mirror and cgroup driver
$ mkdir /etc/docker/
$ touch /etc/docker/daemon.json
$ cat > /etc/docker/daemon.json << EOF
{
"exec-opts": ["native.cgroupdriver=systemd"],
"registry-mirrors": ["https://hub-mirror.c.163.com"]
}
EOF
$ systemctl enable docker --now
4.5 Configure the k8s package repository, then install kubeadm and the other tools
$ touch /etc/yum.repos.d/kubernetes.repo
$ cat > /etc/yum.repos.d/kubernetes.repo << EOF
[kubernetes]
name=Kubernetes
baseurl=https://mirrors.aliyun.com/kubernetes/yum/repos/kubernetes-el7-x86_64/
enabled=1
gpgcheck=0
repo_gpgcheck=0
gpgkey=https://mirrors.aliyun.com/kubernetes/yum/doc/yum-key.gpg https://mirrors.aliyun.com/kubernetes/yum/doc/rpm-package-key.g
EOF
$ yum install -y kubelet kubeadm kubectl --disableexcludes=kubernetes
$ systemctl enable kubelet.service
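The yum command above installs whatever version is newest in the repository, which is why the verification output in 4.8 reports v1.25.2 while kubeadm.yaml asks for 1.25.0. If you would rather have the packages match kubernetesVersion, one option is to pin them. This is a sketch under that assumption; the helper name k8s_packages is hypothetical, and it relies on yum accepting name-version package specs.

```shell
# k8s_packages <version>
# Prints the three package specs for one Kubernetes version,
# for example "kubelet-1.25.0 kubeadm-1.25.0 kubectl-1.25.0".
k8s_packages() {
  local v="$1" p out=""
  for p in kubelet kubeadm kubectl; do
    out="$out $p-$v"
  done
  echo "${out# }"
}

# Example (run as root):
# yum install -y $(k8s_packages 1.25.0) --disableexcludes=kubernetes
```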
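4.6 Initialize k8s with kubeadm -- done
The usual advice is to run kubeadm init directly, but I suggest printing the default configuration to a file and editing it, since most problems occur at this step.
Change localAPIEndpoint.advertiseAddress, nodeRegistration.name, and nodeRegistration.taints.
Change imageRepository to the Alibaba Cloud mirror registry.aliyuncs.com/google_containers.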
$ kubeadm config print init-defaults > kubeadm.yaml
$ cat kubeadm.yaml
apiVersion: kubeadm.k8s.io/v1beta3
bootstrapTokens:
- groups:
  - system:bootstrappers:kubeadm:default-node-token
  token: abcdef.0123456789abcdef
  ttl: 24h0m0s
  usages:
  - signing
  - authentication
kind: InitConfiguration
localAPIEndpoint:
  advertiseAddress: 192.168.117.120
  bindPort: 6443
nodeRegistration:
  criSocket: unix:///var/run/containerd/containerd.sock
  imagePullPolicy: IfNotPresent
  name: master-120
  taints:
  - effect: NoSchedule
    key: node-role.kubernetes.io/master
---
apiServer:
  timeoutForControlPlane: 4m0s
apiVersion: kubeadm.k8s.io/v1beta3
certificatesDir: /etc/kubernetes/pki
clusterName: kubernetes
controllerManager: {}
dns: {}
etcd:
  local:
    dataDir: /var/lib/etcd
imageRepository: registry.aliyuncs.com/google_containers
kind: ClusterConfiguration
kubernetesVersion: 1.25.0
networking:
  dnsDomain: cluster.local
  serviceSubnet: 10.96.0.0/12
scheduler: {}
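Editing the generated file by hand works, but the same changes can be scripted. The sketch below patches advertiseAddress, the node name, and imageRepository with sed. The function name patch_kubeadm_config is hypothetical, and the sketch assumes the file has the layout printed by kubeadm config print init-defaults, where name: appears only under nodeRegistration. It does not touch taints.

```shell
# patch_kubeadm_config <file> <advertise-ip> <node-name>
# Rewrites the values this article changes by hand. Indentation is
# captured and reused, so the YAML structure is not altered.
patch_kubeadm_config() {
  local file="$1" ip="$2" name="$3"
  sed -i \
    -e "s#^\( *advertiseAddress:\).*#\1 ${ip}#" \
    -e "s#^\( *name:\).*#\1 ${name}#" \
    -e "s#^\(imageRepository:\).*#\1 registry.aliyuncs.com/google_containers#" \
    "$file"
}

# Example:
# kubeadm config print init-defaults > kubeadm.yaml
# patch_kubeadm_config kubeadm.yaml 192.168.117.120 master-120
```

Check the result with cat kubeadm.yaml before running kubeadm init.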
Then pull the images:
$ kubeadm config images pull --config kubeadm.yaml
Next, run the commands below to reduce the chance of problems. Some of them repeat earlier commands; running them again is harmless.
$ cat > /etc/sysctl.conf << EOF
net.bridge.bridge-nf-call-ip6tables=1
net.bridge.bridge-nf-call-iptables=1
EOF
$ modprobe br_netfilter
$ sysctl -p
$ mkdir /etc/containerd -p
$ containerd config default > /etc/containerd/config.toml
$ sed -i 's/k8s.gcr.io/registry.aliyuncs.com\/google_containers/' /etc/containerd/config.toml
$ systemctl restart containerd
$ iptables -F && iptables -t nat -F && iptables -t mangle -F && iptables -X
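The sed command above only helps if the default containerd configuration really references k8s.gcr.io; depending on the containerd version, the sandbox image may point at a different registry host. The sketch below replaces whatever host precedes the first slash of sandbox_image. It also sets SystemdCgroup = true, which is not in the original steps: it is included on the assumption that the kubelet uses the systemd cgroup driver (the kubeadm default) while the default containerd configuration ships with SystemdCgroup = false. The function name tune_containerd_config is hypothetical.

```shell
# tune_containerd_config <config.toml> <mirror>
# 1. Points sandbox_image at the given mirror, whatever registry host
#    the default configuration used.
# 2. Switches runc to the systemd cgroup driver (assumption: the
#    kubelet is using the systemd driver too).
tune_containerd_config() {
  local file="$1" mirror="$2"
  sed -i \
    -e "s#\(sandbox_image = \"\)[^/\"]*#\1${mirror}#" \
    -e 's#SystemdCgroup = false#SystemdCgroup = true#' \
    "$file"
}

# Example (run as root):
# containerd config default > /etc/containerd/config.toml
# tune_containerd_config /etc/containerd/config.toml registry.aliyuncs.com/google_containers
# systemctl restart containerd
```

Afterwards, grep sandbox_image /etc/containerd/config.toml shows which image containerd will pull for the pause container.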
Finally, run kubeadm init:
$ kubeadm init --config=./kubeadm.yaml --upload-certs | tee kubeadm-init.log
On success, the output looks like this:
Your Kubernetes control-plane has initialized successfully!
To start using your cluster, you need to run the following as a regular user:
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
Alternatively, if you are the root user, you can run:
export KUBECONFIG=/etc/kubernetes/admin.conf
You should now deploy a pod network to the cluster.
Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
https://kubernetes.io/docs/concepts/cluster-administration/addons/
Then you can join any number of worker nodes by running the following on each as root:
kubeadm join 192.168.117.120:6443 --token abcdef.0123456789abcdef \
--discovery-token-ca-cert-hash sha256:d6d7267fb7687fe772e12e6eb240de1d1a2651bb8a4512af17f4946a24dfbc1b
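Then run the commands below. After kubeadm init succeeds, kubectl by default looks for a config file in the .kube directory under the home directory of the user running it. These commands copy admin.conf, generated during the [kubeconfig] phase of initialization, to .kube/config. That file records the API Server address, so subsequent kubectl commands can connect to the API Server directly.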
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
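4.7 Node
The node follows the same steps as the master, up to but not including "Initialize k8s with kubeadm".
On the node, run kubeadm join instead of kubeadm init; the join command is printed in the output of kubeadm init.
Note: the hostname must be set beforehand, for example: hostnamectl set-hostname node-130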
kubeadm join 192.168.117.120:6443 --token abcdef.0123456789abcdef \
--discovery-token-ca-cert-hash sha256:d6d7267fb7687fe772e12e6eb240de1d1a2651bb8a4512af17f4946a24dfbc1b
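The hash in the join command is specific to each cluster, so it has to be copied from your own kubeadm init output rather than from this article. Since that output was saved with tee kubeadm-init.log, the command can be pulled back out of the log. This is a minimal sketch; the function name extract_join_command is hypothetical, and it assumes the log has the format shown in 4.6, with continuation lines ending in a backslash.

```shell
# extract_join_command <log-file>
# Prints the first "kubeadm join ..." command found in the log as a
# single line, joining continuation lines that end with a backslash.
extract_join_command() {
  awk '
    /^[ \t]*kubeadm join/ { found = 1 }
    found {
      more = ($0 ~ /\\$/)              # does this line continue?
      gsub(/^[ \t]+|[ \t]*\\$/, "")    # strip indentation and the backslash
      cmd = cmd (cmd ? " " : "") $0
      if (!more) { print cmd; exit }
    }
  ' "$1"
}

# Example (on the master):
# extract_join_command kubeadm-init.log
# If the token has expired (ttl is 24h in kubeadm.yaml), a new command
# can be generated on the master with:
# kubeadm token create --print-join-command
```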
4.8 Verification
Run the following commands on the master:
[root@localhost ~]# kubectl get cs
Warning: v1 ComponentStatus is deprecated in v1.19+
NAME STATUS MESSAGE ERROR
scheduler Healthy ok
controller-manager Healthy ok
etcd-0 Healthy {"health":"true","reason":""}
[root@localhost ~]# kubectl get node
NAME STATUS ROLES AGE VERSION
node-130 NotReady control-plane 10m v1.25.2
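The NotReady status above is expected at this point: as the kubeadm init output says, a pod network still has to be deployed to the cluster. To spot nodes that are not ready without reading the table by eye, the output can be filtered. This is a small sketch with a hypothetical function name; it treats any STATUS other than exactly Ready (including Ready,SchedulingDisabled) as not ready.

```shell
# not_ready_nodes
# Reads `kubectl get node` output on stdin and prints the names of
# nodes whose STATUS column is not exactly "Ready".
not_ready_nodes() {
  awk 'NR > 1 && $2 != "Ready" { print $1 }'
}

# Example (on the master):
# kubectl get node | not_ready_nodes
```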
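5. Installation problems
Most k8s installation problems occur during kubeadm init. They are listed below.
Problem 1. [ERROR FileContent--proc-sys-net-ipv4-ip_forward]: /proc/sys/net/ipv4/ip_forward contents are not set to 1
This error appears because the kernel parameters for k8s have not been applied. Go back to the step that lets iptables see bridged traffic, configure /etc/sysctl.d/k8s.conf, and run sysctl --system.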
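To confirm the setting before retrying kubeadm init, the value can be read straight from /proc. A minimal sketch with a hypothetical function name; the optional file argument exists only so the check can be tried against a test file.

```shell
# ip_forward_enabled [file]
# Succeeds when the file (default: /proc/sys/net/ipv4/ip_forward)
# contains exactly "1".
ip_forward_enabled() {
  [ "$(cat "${1:-/proc/sys/net/ipv4/ip_forward}" 2>/dev/null)" = "1" ]
}

# Example (run as root). sysctl -w only lasts until the next reboot;
# the entry in /etc/sysctl.d/k8s.conf is what makes it permanent.
# ip_forward_enabled || sysctl -w net.ipv4.ip_forward=1
```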
Problem 2. [ERROR FileContent--proc-sys-net-bridge-bridge-nf-call-iptables]: /proc/sys/net/bridge/bridge-nf-call-iptables does not exist
$ cat > /etc/sysctl.conf << EOF
net.bridge.bridge-nf-call-ip6tables=1
net.bridge.bridge-nf-call-iptables=1
EOF
$ modprobe br_netfilter
$ sysctl -p
Problem 3. [ERROR CRI]: container runtime is not running: output: time="2022-10-07T00:10:24+08:00" level=fatal msg="unable to determine runtime API version: rpc error: code = Unavailable desc = connection error: desc = "transport: Error while dialing dial unix /var/run/containerd/containerd.sock: connect: no such file or directory""
# yum -y install containerd
# rm -rf /etc/containerd/config.toml
# systemctl restart containerd
Problem 4. Initialization hangs as shown below
# Initialization hangs at
[wait-control-plane] Waiting for the kubelet to boot up the control plane as static Pods from directory "/etc/kubernetes/manifests". This can take up to 4m0s
[kubelet-check] Initial timeout of 40s passed.
# The kubelet log shows
10月 08 00:31:15 master-120 kubelet[21962]: E1008 00:31:15.145710 21962 kubelet.go:2448] "Error getting node" err="node \"master-120\" not found"
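The kubelet repeats messages like the one above many times, which makes the log hard to read. As a rough aid, the sketch below counts the distinct err="..." messages so the most frequent ones stand out. The function name is hypothetical, and the example assumes the kubelet runs as the systemd unit kubelet, as it does after the yum install in this article.

```shell
# summarize_kubelet_errors
# Reads kubelet log lines on stdin and prints each distinct
# err="..." message with the number of times it occurs,
# most frequent first.
summarize_kubelet_errors() {
  grep -o 'err=".*"' | sort | uniq -c | sort -rn
}

# Example (on the master):
# journalctl -u kubelet --no-pager | summarize_kubelet_errors
```

The "node not found" message is often a symptom rather than the cause, so look at the other messages in the summary as well.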
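There are several possible causes; try the fixes below one at a time:
- Image pulls fail. Google's registry cannot be reached from mainland China, so switch to the Alibaba Cloud mirror by changing this setting in the kubeadm configuration file (kubeadm.yaml in this article):
imageRepository: registry.aliyuncs.com/google_containers
- Check that the container runtime starts normally and that its version is correct: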
yum install docker-ce
vim /etc/docker/daemon.json
{
"registry-mirrors": ["https://82m9ar63.mirror.aliyuncs.com"],
"exec-opts": ["native.cgroupdriver=systemd"]
}
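- The configured VIP cannot be reached, so the API server cannot be contacted. Check the firewall configuration. This was the cause of my own initialization failure.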
kubeadm reset
systemctl daemon-reload
modprobe -- ip_vs
modprobe -- ip_vs_rr
modprobe -- ip_vs_wrr
modprobe -- ip_vs_sh
modprobe -- nf_conntrack_ipv4
sysctl -p
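- If initialization fails, clear the initialization state: run kubeadm reset, stop Docker, and restart the firewall. If etcd is external, you will still see the previous cluster's state and need to delete the etcd data, for example with etcdctl del "" --prefix
- containerd image problems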
kubeadm reset
systemctl daemon-reload
mkdir /etc/containerd -p
containerd config default > /etc/containerd/config.toml
sed -i 's/k8s.gcr.io/registry.aliyuncs.com\/google_containers/' /etc/containerd/config.toml
systemctl restart containerd
(You can also change k8s.gcr.io by hand, to registry.cn-hangzhou.aliyuncs.com/google_containers.)
- iptables configuration problems
swapoff -a
kubeadm reset
systemctl daemon-reload
systemctl restart kubelet
iptables -F && iptables -t nat -F && iptables -t mangle -F && iptables -X