1 Installation Notes
Deployment overview
System and component versions used in this deployment:
| Item | Version |
|---|---|
| OS | AlmaLinux release 10.0 (day-to-day administration is much like CentOS 7) |
| 内核版本 | 6.12.0 |
| Kubernetes | v1.33.5 |
| containerd | 2.1.4 |
| CNI 插件 | v1.8.0 |
| crictl | 1.34.0 |
| etcd | 3.5.21-0 |
Offline installation bundle:
Link: pan.baidu.com/s/19CjX1Imi…
Extraction code: 8888
2 Before You Begin
- Linux hosts compatible with the Debian / RedHat families, or other distributions without a package manager.
- If you are not on AlmaLinux or a similar system, make sure the kernel version is >= v5.13 (see the official documentation).
- At least 2 GB of RAM per machine; at least 2 CPUs recommended for control plane nodes.
- Full network connectivity between all machines in the cluster.
- Unique hostname, MAC address, and product_uuid on every node.
- Swap disabled (see the quick check below).
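A minimal pre-flight sketch for these prerequisites (run on every node; the expected values are annotated):
# Pre-flight sanity checks
uname -r                                    # kernel should be >= 5.13
free -h | awk '/Swap/ {print "swap:", $2}'  # should be 0B once swap is disabled
cat /sys/class/dmi/id/product_uuid          # must be unique across nodes
ip link | awk '/ether/ {print $2}'          # MAC addresses must be unique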
3 Cluster Installation
3.1 Basic network / hostname / static IP layout
| IP address | Hostname | Role | Spec |
|---|---|---|---|
| 192.168.1.11 | master01 | Control plane 1 | 4C / 4G / 40G |
| 192.168.1.12 | master02 | Control plane 2 | same |
| 192.168.1.13 | master03 | Control plane 3 | same |
| 192.168.1.100 | master-lb | VIP (virtual IP) | must not conflict with the office LAN |
Kubernetes Service CIDR: 10.96.0.0/12
Pod CIDR: 10.244.0.0/16
3.2 System environment & base configuration (all nodes)
3.2.1 Confirm the OS version
cat /etc/redhat-release
# expect: AlmaLinux release 10.0 (Purple Lion)
3.2.2 Edit /etc/hosts
On all nodes:
echo '192.168.1.11 master01
192.168.1.12 master02
192.168.1.13 master03
192.168.1.100 master-lb' >> /etc/hosts
3.2.3 Disable the firewall and SELinux
systemctl disable --now firewalld
setenforce 0
sed -i 's#SELINUX=enforcing#SELINUX=disabled#g' /etc/sysconfig/selinux
sed -i 's#SELINUX=enforcing#SELINUX=disabled#g' /etc/selinux/config
3.2.4 Disable swap
swapoff -a
sed -i.bak '/swap/s/^/#/' /etc/fstab
3.2.5 Time synchronization
Use chronyd (the default on AlmaLinux; the legacy ntp package is no longer shipped in EL10):
dnf install -y chrony
systemctl enable --now chronyd
chronyc sources   # verify the time sources
3.2.6 System limits
echo "* soft nofile 65536" >> /etc/security/limits.conf
echo "* hard nofile 65536" >> /etc/security/limits.conf
echo "* soft nproc 65536" >> /etc/security/limits.conf
echo "* hard nproc 65536" >> /etc/security/limits.conf
echo "* soft memlock unlimited" >> /etc/security/limits.conf
echo "* hard memlock unlimited" >> /etc/security/limits.conf
3.2.7 Passwordless SSH login (master01 -> all nodes)
On master01:
ssh-keygen -t rsa # accept all defaults
for i in master01 master02 master03; do
ssh-copy-id -i ~/.ssh/id_rsa.pub $i
done
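A quick loop to confirm key-based login works everywhere:
for i in master01 master02 master03; do
  ssh $i hostname   # should print each hostname with no password prompt
done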
3.3 Kernel and ipvs configuration
3.3.1 Install ipvsadm and load the modules
dnf install -y ipvsadm ipset sysstat conntrack libseccomp
modprobe ip_vs
modprobe ip_vs_rr
modprobe ip_vs_wrr
modprobe ip_vs_sh
modprobe nf_conntrack
3.3.2 Load the ipvs modules at boot
cat > /etc/modules-load.d/ipvs.conf <<EOF
ip_vs
ip_vs_lc
ip_vs_wlc
ip_vs_rr
ip_vs_wrr
ip_vs_lblc
ip_vs_lblcr
ip_vs_dh
ip_vs_sh
ip_vs_fo
ip_vs_nq
ip_vs_sed
ip_vs_ftp
nf_conntrack
ip_tables
ip_set
xt_set
ipt_set
ipt_rpfilter
ipt_REJECT
ipip
EOF
systemctl enable --now systemd-modules-load.service
Check that the modules loaded:
lsmod | grep -e ip_vs -e nf_conntrack
3.3.3 Configure kernel parameters
On all nodes, create /etc/sysctl.d/k8s.conf:
cat <<EOF > /etc/sysctl.d/k8s.conf
net.ipv4.ip_forward = 1
net.bridge.bridge-nf-call-iptables = 1
net.bridge.bridge-nf-call-ip6tables = 1
fs.may_detach_mounts = 1
vm.overcommit_memory = 1
net.ipv4.conf.all.route_localnet = 1
vm.panic_on_oom = 0
fs.inotify.max_user_watches = 89100
fs.file-max = 52706963
fs.nr_open = 52706963
net.netfilter.nf_conntrack_max = 2310720
net.ipv4.tcp_keepalive_time = 600
net.ipv4.tcp_keepalive_probes = 3
net.ipv4.tcp_keepalive_intvl = 15
net.ipv4.tcp_max_tw_buckets = 36000
net.ipv4.tcp_tw_reuse = 1
net.ipv4.tcp_max_orphans = 327680
net.ipv4.tcp_orphan_retries = 3
net.ipv4.tcp_syncookies = 1
net.ipv4.tcp_max_syn_backlog = 16384
net.ipv4.tcp_timestamps = 0
net.core.somaxconn = 16384
EOF
sysctl --system
Reboot so all the changes take effect cleanly:
reboot
After the reboot, confirm the modules are still loaded:
lsmod | grep --color=auto -e ip_vs -e nf_conntrack
3.4 Install containerd + CRI tools
Enable IPv4 forwarding and let iptables see bridged traffic:
cat <<EOF | sudo tee /etc/modules-load.d/k8s.conf
overlay
br_netfilter
EOF
sudo modprobe overlay
sudo modprobe br_netfilter
# apply the sysctl settings without rebooting
sudo sysctl --system
Confirm that the br_netfilter and overlay modules are loaded:
lsmod | grep br_netfilter
lsmod | grep overlay
Check that these kernel parameters are all set to 1:
sysctl net.bridge.bridge-nf-call-iptables net.bridge.bridge-nf-call-ip6tables net.ipv4.ip_forward
3.4.1 Download and install containerd
wget https://github.com/containerd/containerd/releases/download/v2.1.4/containerd-2.1.4-linux-amd64.tar.gz
tar xvf containerd-2.1.4-linux-amd64.tar.gz
mv bin/* /usr/local/bin/
mkdir -p /etc/containerd
containerd config default > /etc/containerd/config.toml
3.4.2 containerd systemd unit
cat > /usr/lib/systemd/system/containerd.service <<EOF
[Unit]
Description=containerd container runtime
Documentation=https://containerd.io
After=network.target local-fs.target
[Service]
ExecStartPre=-/sbin/modprobe overlay
ExecStart=/usr/local/bin/containerd
Type=notify
Delegate=yes
KillMode=process
Restart=always
RestartSec=5
LimitNPROC=infinity
LimitCORE=infinity
LimitNOFILE=infinity
TasksMax=infinity
OOMScoreAdjust=-999
[Install]
WantedBy=multi-user.target
EOF
systemctl daemon-reload
systemctl enable --now containerd
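A quick sanity check that the daemon is running and reports the expected version:
systemctl status containerd --no-pager
ctr version   # client and server should both report v2.1.4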
3.4.3 Install runc
install -m 755 runc.amd64 /usr/local/sbin/runc
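The runc.amd64 binary here is assumed to come from the offline bundle (or from the opencontainers/runc releases page on GitHub). Verify the install:
runc --version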
3.4.4 Install the CNI plugins
mkdir -p /opt/cni/bin
tar Cxzvf /opt/cni/bin cni-plugins-linux-amd64-v1.8.0.tgz
3.4.5 Install crictl
tar -xf crictl-v1.34.0-linux-amd64.tar.gz -C /usr/local/bin
cat > /etc/crictl.yaml <<EOF
runtime-endpoint: unix:///var/run/containerd/containerd.sock
image-endpoint: unix:///var/run/containerd/containerd.sock
timeout: 30
debug: false
pull-image-on-create: false
EOF
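With the endpoint configured, crictl should be able to talk to containerd:
crictl version      # prints both the crictl and runtime versions
crictl info | head  # runtime status and configuration dump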
3.4.6 Enable the systemd cgroup driver
See the official documentation for a detailed introduction to cgroups.
Edit the runc options in /etc/containerd/config.toml (in containerd 2.x these live under [plugins.'io.containerd.cri.v1.runtime'.containerd.runtimes.runc.options]) and set:
ShimCgroup = ''        # already present
SystemdCgroup = true   # add this line, or flip it to true if it already exists
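If your generated config already contains the line as SystemdCgroup = false (worth checking first; this is an assumption about the default output), a one-liner makes the change:
sed -i 's/SystemdCgroup = false/SystemdCgroup = true/' /etc/containerd/config.toml
grep SystemdCgroup /etc/containerd/config.toml   # confirm it now reads true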
Restart containerd:
systemctl restart containerd
3.5 High availability: HAProxy + Keepalived
3.5.1 Install
On all master nodes:
dnf install -y haproxy keepalived
3.5.2 Configure HAProxy
All master nodes share the same /etc/haproxy/haproxy.cfg:
cat > /etc/haproxy/haproxy.cfg << EOF
global
  maxconn 2000
  ulimit-n 16384
  log 127.0.0.1 local0 err
  stats timeout 30s

defaults
  log global
  mode http
  option httplog
  timeout connect 5000
  timeout client 50000
  timeout server 50000
  timeout http-request 15s
  timeout http-keep-alive 15s

frontend k8s-master
  bind 0.0.0.0:8443
  mode tcp
  option tcplog
  tcp-request inspect-delay 5s
  default_backend k8s-master

backend k8s-master
  mode tcp
  balance roundrobin
  option tcp-check
  default-server inter 10s downinter 5s rise 2 fall 2 slowstart 60s maxconn 250 maxqueue 256 weight 100
  server master01 192.168.1.11:6443 check
  server master02 192.168.1.12:6443 check
  server master03 192.168.1.13:6443 check
EOF
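Validate the file before starting the service:
haproxy -c -f /etc/haproxy/haproxy.cfg   # should report the configuration as valid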
3.5.3 Keepalived configuration (differs slightly per node)
master01:
cat > /etc/keepalived/keepalived.conf << EOF
! Configuration File for keepalived
global_defs {
    router_id LVS_DEVEL
}
vrrp_script chk_apiserver {
    script "/etc/keepalived/check_apiserver.sh"
    interval 5
    weight -5
    fall 2
    rise 1
}
vrrp_instance VI_1 {
    state MASTER
    interface ens33
    mcast_src_ip 192.168.1.11
    virtual_router_id 51
    priority 100
    nopreempt
    advert_int 2
    authentication {
        auth_type PASS
        auth_pass K8SHA_KA_AUTH
    }
    virtual_ipaddress {
        192.168.1.100
    }
    track_script {
        chk_apiserver
    }
}
EOF
master02:
cat > /etc/keepalived/keepalived.conf << EOF
! Configuration File for keepalived
global_defs {
    router_id LVS_DEVEL
}
vrrp_script chk_apiserver {
    script "/etc/keepalived/check_apiserver.sh"
    interval 5
    weight -5
    fall 2
    rise 1
}
vrrp_instance VI_1 {
    state BACKUP
    interface ens33
    mcast_src_ip 192.168.1.12
    virtual_router_id 51
    priority 99
    nopreempt
    advert_int 2
    authentication {
        auth_type PASS
        auth_pass K8SHA_KA_AUTH
    }
    virtual_ipaddress {
        192.168.1.100
    }
    track_script {
        chk_apiserver
    }
}
EOF
master03:
cat > /etc/keepalived/keepalived.conf << EOF
! Configuration File for keepalived
global_defs {
    router_id LVS_DEVEL
}
vrrp_script chk_apiserver {
    script "/etc/keepalived/check_apiserver.sh"
    interval 5
    weight -5
    fall 2
    rise 1
}
vrrp_instance VI_1 {
    state BACKUP
    interface ens33
    mcast_src_ip 192.168.1.13
    virtual_router_id 51
    priority 98
    nopreempt
    advert_int 2
    authentication {
        auth_type PASS
        auth_pass K8SHA_KA_AUTH
    }
    virtual_ipaddress {
        192.168.1.100
    }
    track_script {
        chk_apiserver
    }
}
EOF
Health check script /etc/keepalived/check_apiserver.sh (note the quoted 'EOF', which stops the shell from expanding $(...) and $err while writing the file):
cat > /etc/keepalived/check_apiserver.sh << 'EOF'
#!/bin/bash
# Stop keepalived (releasing the VIP) if haproxy is down for 3 consecutive checks
err=0
for k in $(seq 1 3); do
    check_code=$(pgrep haproxy)
    if [[ $check_code == "" ]]; then
        err=$(expr $err + 1)
        sleep 1
        continue
    else
        err=0
        break
    fi
done

if [[ $err != "0" ]]; then
    echo "systemctl stop keepalived"
    /usr/bin/systemctl stop keepalived
    exit 1
else
    exit 0
fi
EOF
chmod +x /etc/keepalived/check_apiserver.sh
Start the services:
systemctl daemon-reload
systemctl enable --now haproxy
systemctl enable --now keepalived
Check that the VIP answers:
ping 192.168.1.100
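An optional failover test (a sketch; assumes the ens33 interface and that this node currently holds the VIP):
systemctl stop haproxy                     # the check script will stop keepalived too
ip addr show ens33 | grep 192.168.1.100    # the VIP should disappear here...
# ...and show up on another master within a few advert intervals
systemctl start haproxy keepalived         # rejoin the VRRP group afterwards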
3.6 Install the Kubernetes core components: kubeadm, kubelet, kubectl
3.6.1 Configure the yum repository
cat <<EOF | sudo tee /etc/yum.repos.d/kubernetes.repo
[kubernetes]
name=Kubernetes
baseurl=https://pkgs.k8s.io/core:/stable:/v1.33/rpm/
enabled=1
gpgcheck=1
gpgkey=https://pkgs.k8s.io/core:/stable:/v1.33/rpm/repodata/repomd.xml.key
exclude=kubelet kubeadm kubectl cri-tools kubernetes-cni
EOF
3.6.2 Install and enable the services
dnf install -y kubelet kubeadm kubectl --disableexcludes=kubernetes
systemctl enable --now kubelet
3.7 Initialize master01 (first control plane node)
3.7.1 List the required images and pre-pull them
kubeadm config images list
Required images (for v1.33.5):
registry.k8s.io/kube-apiserver:v1.33.5
registry.k8s.io/kube-controller-manager:v1.33.5
registry.k8s.io/kube-scheduler:v1.33.5
registry.k8s.io/kube-proxy:v1.33.5
registry.k8s.io/coredns/coredns:v1.12.0
registry.k8s.io/pause:3.10
registry.k8s.io/etcd:3.5.21-0
Importing an image archive (alternatively, push the images to your own registry and pull from there):
ctr -n k8s.io image import <image-archive.tar>
# After importing, list the images with crictl (ctr works too, but is less readable)
crictl images
# Note that ctr is namespace-aware: Kubernetes images live in the k8s.io namespace.
# If you prefer, a Docker-style client can also be installed to manage containerd.
ctr -n k8s.io images ls
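If the offline bundle ships each image as a separate .tar archive, a loop keeps the import short (the ./images path is illustrative):
for img in ./images/*.tar; do
  ctr -n k8s.io image import "$img"
done
crictl images   # confirm everything landed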
3.7.2 Generate and edit the init configuration
kubeadm config print init-defaults > kubeadm-init.yaml
Edit the generated kubeadm-init.yaml; for example:
cat > ./kubeadm-init.yaml << EOF
apiVersion: kubeadm.k8s.io/v1beta4
# Bootstrap token (the default is fine)
bootstrapTokens:
- groups:
  - system:bootstrappers:kubeadm:default-node-token
  token: abcdef.0123456789abcdef
  ttl: 24h0m0s
  usages:
  - signing
  - authentication
kind: InitConfiguration
# Local API endpoint
localAPIEndpoint:
  advertiseAddress: 192.168.1.11
  bindPort: 6443
nodeRegistration:
  criSocket: unix:///var/run/containerd/containerd.sock
  imagePullPolicy: IfNotPresent
  imagePullSerial: true
  name: master01
  taints: null
# Timeouts (the defaults are fine)
timeouts:
  controlPlaneComponentHealthCheck: 4m0s
  discovery: 5m0s
  etcdAPICall: 2m0s
  kubeletHealthCheck: 4m0s
  kubernetesAPICall: 1m0s
  tlsBootstrap: 5m0s
  upgradeManifests: 5m0s
---
apiServer: {}
apiVersion: kubeadm.k8s.io/v1beta4
caCertificateValidityPeriod: 87600h0m0s
certificateValidityPeriod: 8760h0m0s
certificatesDir: /etc/kubernetes/pki
clusterName: kubernetes
controllerManager: {}
dns: {}
encryptionAlgorithm: RSA-2048
etcd:
  local:
    dataDir: /var/lib/etcd
imageRepository: registry.k8s.io
kind: ClusterConfiguration
kubernetesVersion: 1.33.5
networking:
  dnsDomain: cluster.local
  serviceSubnet: 10.96.0.0/12
  podSubnet: 10.244.0.0/16
# For a non-HA cluster, delete this line
controlPlaneEndpoint: "192.168.1.100:8443"
proxy: {}
scheduler: {}
EOF
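Optionally, dry-run the configuration first; kubeadm prints what it would do without changing the node:
kubeadm init --config kubeadm-init.yaml --dry-run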
3.7.3 Run the initialization
- Initialization writes the certificates and config files under /etc/kubernetes; the other master nodes then simply join master01.
- Append --v=5 to see verbose logs during initialization.
kubeadm init --config kubeadm-init.yaml --upload-certs
If initialization fails, reset and try again:
kubeadm reset -f ; ipvsadm --clear ; rm -rf ~/.kube
After a successful init, set up kubeconfig:
mkdir -p $HOME/.kube
cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
chown $(id -u):$(id -g) $HOME/.kube/config
# or, as the root user:
export KUBECONFIG=/etc/kubernetes/admin.conf
- On success, kubeadm prints output like the following:
To start using your cluster, you need to run the following as a regular user:
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
Alternatively, if you are the root user, you can run:
export KUBECONFIG=/etc/kubernetes/admin.conf
You should now deploy a pod network to the cluster. Choose one of the options listed in the documentation and run kubectl apply -f [podnetwork].yaml.
You can now join any number of control-plane nodes by running the following command on each as root:
kubeadm join 192.168.1.100:8443 --token abcdef.0123456789abcdef \
--discovery-token-ca-cert-hash sha256:3fcd0d0ac88c9a4f1321f6d15cb484b8f67b1492c10282f5faa3070b5741635f \
--control-plane --certificate-key bf521ccd59a5d33a2d8370e0ae9f10b7f00db3412f1c066aafd0e516c80664ae
Note that the certificate-key gives access to cluster-sensitive data, so keep it secret! As a safeguard, the uploaded certificates are deleted after two hours; if needed, you can run "kubeadm init phase upload-certs --upload-certs" to reload them later.
Then you can join any number of worker nodes by running the following on each as root:
kubeadm join 192.168.1.100:8443 --token abcdef.0123456789abcdef \
--discovery-token-ca-cert-hash sha256:3fcd0d0ac88c9a4f1321f6d15cb484b8f67b1492c10282f5faa3070b5741635f
3.8 Deploy the network plugin (Calico)
Download address: github.com/projectcali… After downloading, edit the manifest (calico-etcd.yaml):
# Point Calico at the etcd endpoints
sed -i 's#etcd_endpoints: "http://<ETCD_IP>:<ETCD_PORT>"#etcd_endpoints: "https://192.168.1.11:2379,https://192.168.1.12:2379,https://192.168.1.13:2379"#g' calico-etcd.yaml
# Embed the etcd certificates
ETCD_CA=`cat /etc/kubernetes/pki/etcd/ca.crt | base64 | tr -d '\n'`
ETCD_CERT=`cat /etc/kubernetes/pki/etcd/server.crt | base64 | tr -d '\n'`
ETCD_KEY=`cat /etc/kubernetes/pki/etcd/server.key | base64 | tr -d '\n'`
sed -i "s@# etcd-key: null@etcd-key: ${ETCD_KEY}@g; s@# etcd-cert: null@etcd-cert: ${ETCD_CERT}@g; s@# etcd-ca: null@etcd-ca: ${ETCD_CA}@g" calico-etcd.yaml
# Set the certificate paths
sed -i 's#etcd_ca: ""#etcd_ca: "/calico-secrets/etcd-ca"#g; s#etcd_cert: ""#etcd_cert: "/calico-secrets/etcd-cert"#g; s#etcd_key: "" #etcd_key: "/calico-secrets/etcd-key" #g' calico-etcd.yaml
# Set the pod CIDR
POD_SUBNET="10.244.0.0/16"
sed -i 's@# - name: CALICO_IPV4POOL_CIDR@- name: CALICO_IPV4POOL_CIDR@g; s@# value: "192.168.0.0/16"@ value: '"${POD_SUBNET}"'@g' calico-etcd.yaml
Double-check all the edits, then deploy:
kubectl create -f calico-etcd.yaml
Once the deployment succeeds, the cluster status looks healthy:
[root@master01 ~]# kubectl get node
NAME STATUS ROLES AGE VERSION
master01 Ready control-plane 1h45m v1.33.5
master02 Ready control-plane 1h24m v1.33.5
master03 Ready control-plane 1h23m v1.33.5
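To watch the Calico pods come up (assuming the manifest's standard k8s-app=calico-node label):
kubectl get pods -n kube-system -l k8s-app=calico-node -o wide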
3.9 Deploy the Metrics Server
Before installing, remove the control-plane taint so the pods can schedule:
kubectl taint node --all node-role.kubernetes.io/control-plane:NoSchedule-
The official manifest (github.com/kubernetes-…) will complain about a missing certificate if used as-is.
The version below adds the certificate path and the corresponding volume mount; the certificate is /etc/kubernetes/pki/front-proxy-ca.crt, generated automatically when the cluster was bootstrapped.
cat > ./components.yaml << EOF
apiVersion: v1
kind: ServiceAccount
metadata:
  labels:
    k8s-app: metrics-server
  name: metrics-server
  namespace: kube-system
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  labels:
    k8s-app: metrics-server
    rbac.authorization.k8s.io/aggregate-to-admin: "true"
    rbac.authorization.k8s.io/aggregate-to-edit: "true"
    rbac.authorization.k8s.io/aggregate-to-view: "true"
  name: system:aggregated-metrics-reader
rules:
- apiGroups:
  - metrics.k8s.io
  resources:
  - pods
  - nodes
  verbs:
  - get
  - list
  - watch
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  labels:
    k8s-app: metrics-server
  name: system:metrics-server
rules:
- apiGroups:
  - ""
  resources:
  - nodes/metrics
  verbs:
  - get
- apiGroups:
  - ""
  resources:
  - pods
  - nodes
  verbs:
  - get
  - list
  - watch
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  labels:
    k8s-app: metrics-server
  name: metrics-server-auth-reader
  namespace: kube-system
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: extension-apiserver-authentication-reader
subjects:
- kind: ServiceAccount
  name: metrics-server
  namespace: kube-system
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  labels:
    k8s-app: metrics-server
  name: metrics-server:system:auth-delegator
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: system:auth-delegator
subjects:
- kind: ServiceAccount
  name: metrics-server
  namespace: kube-system
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  labels:
    k8s-app: metrics-server
  name: system:metrics-server
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: system:metrics-server
subjects:
- kind: ServiceAccount
  name: metrics-server
  namespace: kube-system
---
apiVersion: v1
kind: Service
metadata:
  labels:
    k8s-app: metrics-server
  name: metrics-server
  namespace: kube-system
spec:
  ports:
  - appProtocol: https
    name: https
    port: 443
    protocol: TCP
    targetPort: https
  selector:
    k8s-app: metrics-server
---
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    k8s-app: metrics-server
  name: metrics-server
  namespace: kube-system
spec:
  selector:
    matchLabels:
      k8s-app: metrics-server
  strategy:
    rollingUpdate:
      maxUnavailable: 0
  template:
    metadata:
      labels:
        k8s-app: metrics-server
    spec:
      containers:
      - args:
        - --cert-dir=/tmp
        - --secure-port=10250
        - --kubelet-preferred-address-types=InternalIP,ExternalIP,Hostname
        - --kubelet-use-node-status-port
        - --metric-resolution=15s
        - --kubelet-insecure-tls
        - --requestheader-client-ca-file=/etc/kubernetes/pki/front-proxy-ca.crt
        - --requestheader-username-headers=X-Remote-User
        - --requestheader-group-headers=X-Remote-Group
        - --requestheader-extra-headers-prefix=X-Remote-Extra-
        image: registry.k8s.io/metrics-server/metrics-server:v0.8.0
        imagePullPolicy: IfNotPresent
        livenessProbe:
          failureThreshold: 3
          httpGet:
            path: /livez
            port: https
            scheme: HTTPS
          periodSeconds: 10
        name: metrics-server
        ports:
        - containerPort: 10250
          name: https
          protocol: TCP
        readinessProbe:
          failureThreshold: 3
          httpGet:
            path: /readyz
            port: https
            scheme: HTTPS
          initialDelaySeconds: 20
          periodSeconds: 10
        resources:
          requests:
            cpu: 100m
            memory: 200Mi
        securityContext:
          allowPrivilegeEscalation: false
          capabilities:
            drop:
            - ALL
          readOnlyRootFilesystem: true
          runAsNonRoot: true
          runAsUser: 1000
          seccompProfile:
            type: RuntimeDefault
        volumeMounts:
        - mountPath: /tmp
          name: tmp-dir
        - mountPath: /etc/kubernetes/pki
          name: k8s-certs
      nodeSelector:
        kubernetes.io/os: linux
      priorityClassName: system-cluster-critical
      serviceAccountName: metrics-server
      volumes:
      - emptyDir: {}
        name: tmp-dir
      - hostPath:
          path: /etc/kubernetes/pki
        name: k8s-certs
---
apiVersion: apiregistration.k8s.io/v1
kind: APIService
metadata:
  labels:
    k8s-app: metrics-server
  name: v1beta1.metrics.k8s.io
spec:
  group: metrics.k8s.io
  groupPriorityMinimum: 100
  insecureSkipTLSVerify: true
  service:
    name: metrics-server
    namespace: kube-system
  version: v1beta1
  versionPriority: 100
EOF
kubectl create -f components.yaml
3.10 Switch kube-proxy to ipvs mode
kubectl edit cm kube-proxy -n kube-system
# set mode to "ipvs"
Roll the kube-proxy pods so they pick up the change:
kubectl patch daemonset kube-proxy -n kube-system -p "{\"spec\":{\"template\":{\"metadata\":{\"annotations\":{\"date\":\"$(date +'%s')\"}}}}}"
Verify the mode (run on a node):
curl 127.0.0.1:10249/proxyMode
# should print ipvs
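The ipvs virtual-server table should now be populated:
ipvsadm -Ln | head   # expect virtual servers for the service network, e.g. 10.96.0.1:443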
3.11 其它工具、Ingress、Storage 等(可选)
注意:一下的相关组件都是一年以前的老版本 如果需要新版本直接在官网下载最新版本安装即可 安装方法可以参考我的教程
安装 Helm 点击跳转
安装 ingress 控制器 点击跳转
安装rook存储 点击跳转
4 Notes
- Certificates issued by kubeadm are valid for one year by default; for production, consider extending the validity or automating renewal (see the sketch after this list).
- The control plane components (kube-apiserver, controller-manager, scheduler, etcd) run as static pods whose manifests live in /etc/kubernetes/manifests; after a change, kubelet automatically restarts the affected pod.
- kubelet configuration lives in /etc/sysconfig/kubelet and /var/lib/kubelet/config.yaml.
- By default, control-plane/master nodes carry a taint and do not schedule ordinary pods; to run pods there, remove the taint:
## show taints
kubectl describe node | grep Taint
## remove the taint
kubectl taint node --all node-role.kubernetes.io/control-plane:NoSchedule-
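kubeadm has built-in commands for inspecting and renewing its certificates:
kubeadm certs check-expiration   # expiry dates for every kubeadm-managed certificate
kubeadm certs renew all          # renew them all; restart the control plane pods afterwards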
5 Install Kuboard (optional management platform)
- Official site: Kuboard (supports online and offline installation)
- Install command:
kubectl apply -f https://addons.kuboard.cn/kuboard/kuboard-v3.yaml
6 Verify Cluster Status
6.1 Check node status
kubectl get nodes
All nodes should be Ready, with roles assigned as expected (control-plane, worker, etc.).
6.2 Check system component pods
kubectl get pods -n kube-system
Core services (CoreDNS, kube-proxy, calico-node, etc.) should be Running.
6.3 Check control plane component health
kubectl get componentstatuses
(Note: kubectl get cs/componentstatuses is deprecated in recent Kubernetes releases but still works for basic diagnostics.)
6.4 View cluster info
kubectl cluster-info
This prints the API server, DNS, and other service addresses; make sure they are all reachable.
6.5 API server health check
Query the readiness endpoint directly:
kubectl get --raw='/readyz?verbose'
A final "ok" means the API server is ready to handle requests.
6.6 CNI network check
Verify that cluster DNS resolves: launch a throwaway pod (e.g. dnsutils or busybox) and run nslookup:
kubectl run dnsutils --image=tutum/dnsutils --command -- sleep inf
kubectl exec -ti dnsutils -- nslookup kubernetes.default
A successful lookup means the pod network and DNS are working.
6.7 Application test
Deploy a test Deployment:
kubectl apply -f https://k8s.io/examples/application/deployment.yaml
kubectl get pods
Check that the pods are created and running, then expose the service:
kubectl expose deployment nginx-deployment --port=80 --type=NodePort
Hit a node IP on the assigned NodePort and confirm the service responds.
6.8 Resource metrics test (optional)
If the Metrics Server is installed:
kubectl top nodes
kubectl top pods -n kube-system
These should return CPU/memory usage for nodes and pods, confirming the Metrics API works.
6.9 Events and debugging
List recent events to catch scheduling or startup failures early:
kubectl get events --sort-by='.metadata.creationTimestamp'
You can also dump the full cluster state for diagnosis:
kubectl cluster-info dump