Host preparation
This deployment uses Rocky Linux 9.2 on 7 nodes
[root@node1 ~]# uname -a
Linux node1 5.14.0-284.11.1.el9_2.x86_64 #1 SMP PREEMPT_DYNAMIC Tue May 9 17:09:15 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux
The nodes are assigned as follows
| No. | Hostname | IP address | Description |
|---|---|---|---|
| 1 | node1 | 192.168.202.129 | HAProxy node 1 |
| 2 | node2 | 192.168.202.130 | HAProxy node 2 |
| 3 | node3 | 192.168.202.131 | K8s master1 |
| 4 | node4 | 192.168.202.132 | K8s master2 |
| 5 | node5 | 192.168.202.134 | K8s master3 |
| 6 | node6 | 192.168.202.136 | Worker node1 |
| 7 | node7 | 192.168.202.137 | Worker node2 |
| 8 (IP only) | lb | 192.168.202.140 | VIP (floating IP) |
Hostname and IP address resolution
Configure this on all servers: edit the /etc/hosts file and add the host entries below
[root@node1 ~]# vim /etc/hosts
[root@node1 ~]# cat /etc/hosts
127.0.0.1 localhost localhost.localdomain localhost4 localhost4.localdomain4
::1 localhost localhost.localdomain localhost6 localhost6.localdomain6
192.168.202.129 node1
192.168.202.130 node2
192.168.202.131 node3
192.168.202.132 node4
192.168.202.134 node5
192.168.202.136 node6
192.168.202.137 node7
Disable the host firewall
Configure on all hosts
systemctl disable firewalld --now
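An optional quick check that firewalld is stopped and will not start again at boot:
# should report "inactive" and "disabled"
systemctl is-active firewalld
systemctl is-enabled firewalld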
Disable SELinux
Configure on all hosts
Use sestatus to check whether SELinux is enabled
# If it is disabled, the status shows disabled
[root@node7 ~]# sestatus
SELinux status: disabled
# If it is enabled, the status shows enabled
[root@node7 ~]# sestatus
SELinux status: enabled
SELinuxfs mount: /sys/fs/selinux
SELinux root directory: /etc/selinux
Loaded policy name: targeted
Current mode: permissive
Mode from config file: disabled
Policy MLS status: enabled
Policy deny_unknown status: allowed
Memory protection checking: actual (secure)
Max kernel policy version: 33
Disable SELinux
setenforce 0
sed -i 's#SELINUX=enforcing#SELINUX=disabled#g' /etc/selinux/config
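A quick check that the change took effect; getenforce reports Permissive right after setenforce 0 and Disabled after a reboot with the new config file:
getenforce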
Configure time synchronization
Do this on all nodes so that their clocks stay consistent
# Check the current time zone; set the time zone to Asia/Shanghai
[root@node1 ~]# timedatectl
Local time: Mon 2023-09-25 23:22:02 CST
Universal time: Mon 2023-09-25 15:22:02 UTC
RTC time: Mon 2023-09-25 15:22:02
Time zone: Asia/Shanghai (CST, +0800)
System clock synchronized: yes
NTP service: active
RTC in local TZ: no
# If the time zone is not Asia/Shanghai, set every node to this time zone
[root@node1 ~]# timedatectl set-timezone Asia/Shanghai
# Enable and start chronyd as the time synchronization service
[root@node1 ~]# systemctl enable chronyd --now
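An optional check that chronyd is actually synchronizing (both commands ship with the chrony package):
# show the current synchronization status
chronyc tracking
# list the configured time sources
chronyc sources -v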
Install the IPVS management tools and load the kernel modules
Install on the K8s master and worker nodes
[root@node1 ~]# yum install ipvsadm ipset sysstat conntrack libseccomp -y
Configure the IPVS-related modules
# Add a module configuration file; files in /etc/modules-load.d are used to load kernel modules at boot
cat >> /etc/modules-load.d/ipvs.conf <<EOF
ip_vs
ip_vs_rr
ip_vs_wrr
ip_vs_sh
nf_conntrack
EOF
# Restart the service that loads kernel modules
[root@node1 ~]# systemctl restart systemd-modules-load.service
# Check whether the kernel modules are loaded
[root@node1 ~]# lsmod | grep -e ip_vs -e nf_conntrack
ip_vs_sh 16384 0
ip_vs_wrr 16384 0
ip_vs_rr 16384 0
ip_vs 204800 6 ip_vs_rr,ip_vs_sh,ip_vs_wrr
nf_conntrack_netlink 57344 0
nf_conntrack 188416 5 xt_conntrack,nf_nat,nf_conntrack_netlink,xt_MASQUERADE,ip_vs
nf_defrag_ipv6 24576 2 nf_conntrack,ip_vs
nf_defrag_ipv4 16384 1 nf_conntrack
nfnetlink 20480 4 nft_compat,nf_conntrack_netlink,nf_tables
libcrc32c 16384 5 nf_conntrack,nf_nat,nf_tables,xfs,ip_vs
Enable kernel IP forwarding and bridge filtering on the hosts
Required on all hosts
Load the br_netfilter module and let bridged IPv4 and IPv6 traffic pass through iptables so that the containers in the cluster can communicate properly.
Add the bridge-filtering and kernel-forwarding configuration file
# vm.swappiness = 0 minimizes use of the swap partition
cat > /etc/sysctl.d/k8s.conf <<EOF
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1
net.ipv4.ip_forward = 1
vm.swappiness = 0
EOF
Add the br_netfilter module to the modules loaded at boot
[root@node1 ~]# cat > /etc/modules-load.d/containerd.conf <<EOF
br_netfilter
EOF
# Restart the service that loads kernel modules
[root@node1 ~]# systemctl restart systemd-modules-load.service
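The parameters written to /etc/sysctl.d/k8s.conf are not active until they are loaded. A minimal sketch of the remaining step: load br_netfilter first (the bridge-nf parameters only exist once the module is loaded), then apply the sysctl files:
# load the module immediately, without waiting for a reboot
modprobe br_netfilter
# apply all files under /etc/sysctl.d, including k8s.conf
sysctl --system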
Disable the host swap partition
Do this on the K8s master and worker nodes
sed -i 's/^\(.*swap.*\)$/#\1/g' /etc/fstab
swapoff -a
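To verify that swap is off, free should report 0 on the swap line:
free -h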
Deploy HAProxy and Keepalived
Install HAProxy and Keepalived on node1 and node2
yum install haproxy keepalived -y
Prepare the HAProxy configuration file
The configuration file is identical on both nodes; it is shown below
# Global settings; process-level configuration, usually tied to the operating system
global
# Maximum number of concurrent connections each HAProxy process accepts
maxconn 2000
# Number of file descriptors the process may open; by default it is calculated automatically, so changing this option is not recommended
ulimit-n 16384
# Global logging configuration: send logs to the rsyslog service on 127.0.0.1 using the local0 facility; only err-level messages are recorded here (the available levels are err, warning, info and debug)
log 127.0.0.1 local0 err
# stats enables HAProxy's statistics module; timeout 30s means a statistics operation is aborted if it does not finish within 30 seconds, so the statistics feature cannot consume too many resources
stats timeout 30s
# Default settings, applied to the frontend, backend and listen sections below
defaults
# Use the logging configuration defined in the global section
log global
# Default mode, mode { tcp|http|health }: tcp is layer 4, http is layer 7, health only returns OK
mode http
# Log HTTP requests, session state and timers
option httplog
# Connect timeout
timeout connect 5000
# Client timeout
timeout client 50000
# Server timeout
timeout server 50000
# Timeout for processing an HTTP request: if HAProxy receives a request but no complete request arrives within 15 seconds, the request is aborted
timeout http-request 15s
# Timeout for HTTP keep-alive connections
timeout http-keep-alive 15s
# Frontend section; monitor-in is just a name and can be chosen freely.
# With this configuration the HAProxy statistics and status can be viewed at http://ip:33305/monitor.
frontend monitor-in
# Address and port this frontend listens on
bind *:33305
# Default run mode for this HAProxy instance
mode http
option httplog
# Access path of HAProxy's built-in monitoring page, so operators can check HAProxy's state through a web page
monitor-uri /monitor
# Frontend for the Kubernetes API servers
frontend k8s-master
bind 0.0.0.0:6443
bind 127.0.0.1:6443
mode tcp
option tcplog
tcp-request inspect-delay 5s
default_backend k8s-master
backend k8s-master
mode tcp
option tcplog
option tcp-check
# Round-robin load balancing
balance roundrobin
# default-server: default parameters for the backend servers
# inter 10s: health check interval of 10 seconds
# downinter 5s: check interval of 5 seconds while a server is down
# rise 2: a server must pass 2 consecutive checks to be marked healthy
# fall 2: a server must fail 2 consecutive checks to be marked unhealthy
# slowstart 60s: ramp traffic back up over 60 seconds after a server recovers
# maxconn 250: at most 250 connections per server
# maxqueue 256: request queue length of 256
# weight 100: default server weight of 100
# In short: check every 10 seconds (5 seconds while down), two consecutive successes mark a server healthy and two failures mark it unhealthy, recovered servers are ramped up over 60 seconds (slow start), each server accepts up to 250 connections with a queue of 256, and the default weight is 100.
default-server inter 10s downinter 5s rise 2 fall 2 slowstart 60s maxconn 250 maxqueue 256 weight 100
server node3 192.168.202.131:6443 check
server node4 192.168.202.132:6443 check
server node5 192.168.202.134:6443 check
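Before starting HAProxy, the configuration can be checked for syntax errors (assuming it was saved to the default path /etc/haproxy/haproxy.cfg):
haproxy -c -f /etc/haproxy/haproxy.cfg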
Configure Keepalived
Keepalived is deployed on node1 and node2, but the configuration differs between the two nodes.
Configure node1 as the Keepalived master node; the configuration file is as follows
[root@node1 ~]# cat /etc/keepalived/keepalived.conf
# Global definitions
global_defs {
# Unique router ID identifying this Keepalived node
router_id LVS_DEVEL
# Run check scripts as the root user
script_user root
# Enable script security so that only safely owned scripts are executed
enable_script_security
}
# Define the health check script
vrrp_script chk_apiserver {
# Path of the script used for the health check
script "/etc/keepalived/check_apiserver.sh"
# Check interval of 5 seconds
interval 5
# Reduce the node's priority by 5 when the check fails
weight -5
# Two consecutive failures mark the check as failed
fall 2
# One successful check is enough to recover
rise 1
}
# Define the VRRP instance
vrrp_instance VI_1 {
# This node is the MASTER
state MASTER
# Network interface that provides the service
interface ens160
# Multicast source IP of this node
mcast_src_ip 192.168.202.129
# VRRP group ID 51
virtual_router_id 51
# Node priority 100
priority 100
advert_int 2
# Authentication used when nodes exchange multicast messages
authentication {
auth_type PASS
auth_pass K8SHA_KA_AUTH
}
# Floating IP (VIP)
virtual_ipaddress {
192.168.202.140
}
# Attach the chk_apiserver check script defined above
track_script {
chk_apiserver
}
}
Configure node2 as the Keepalived backup node; the configuration file is as follows
[root@node2 ~]# cat /etc/keepalived/keepalived.conf
global_defs {
router_id LVS_DEVEL
script_user root
enable_script_security
}
vrrp_script chk_apiserver {
script "/etc/keepalived/check_apiserver.sh"
interval 5
weight -5
fall 2
rise 1
}
vrrp_instance VI_1 {
state BACKUP # this node must be set as the backup
interface ens160
mcast_src_ip 192.168.202.130
virtual_router_id 51
priority 99 # lower priority than the master node
advert_int 2
authentication {
auth_type PASS
auth_pass K8SHA_KA_AUTH
}
virtual_ipaddress {
192.168.202.140
}
track_script {
chk_apiserver
}
}
The health check script is as follows
[root@node1 ~]# cat /etc/keepalived/check_apiserver.sh
#!/bin/bash
err=0
for k in $(seq 1 3)
do
check_code=$(pgrep haproxy)
if [[ $check_code == "" ]];then
err=$(expr $err + 1)
sleep 1
continue
else
err=0
break
fi
done
if [[ $err != "0" ]];then
echo "systemctl stop keepalived"
/usr/bin/systemctl stop keepalived
exit 1
else
exit 0
fi
The script must be made executable
[root@node1 ~]# chmod +x /etc/keepalived/check_apiserver.sh
Start HAProxy and Keepalived
# Start HAProxy first
[root@node1 ~]# systemctl enable haproxy --now
Created symlink /etc/systemd/system/multi-user.target.wants/haproxy.service → /usr/lib/systemd/system/haproxy.service.
# Then start Keepalived, because Keepalived's health check expects HAProxy to already be running.
[root@node1 ~]# systemctl enable keepalived --now
Created symlink /etc/systemd/system/multi-user.target.wants/keepalived.service → /usr/lib/systemd/system/keepalived.service.
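A quick way to confirm that Keepalived is holding the VIP: the floating IP should appear on the master node's ens160 interface, and move to node2 when HAProxy on node1 stops:
# run on node1; the VIP should be listed
ip addr show ens160 | grep 192.168.202.140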
Create the etcd cluster
Deploy the etcd cluster on the three nodes node3, node4 and node5.
First generate the certificates used by the etcd cluster.
Generate the certificates on node3, then copy them to node4 and node5.
Create a working directory
mkdir -p /data/k8s-work
Download the cfssl tools
cd /data/k8s-work
wget https://pkg.cfssl.org/R1.2/cfssl_linux-amd64
wget https://pkg.cfssl.org/R1.2/cfssljson_linux-amd64
wget https://pkg.cfssl.org/R1.2/cfssl-certinfo_linux-amd64
Notes:
cfssl is a PKI/TLS toolkit written in Go and open-sourced by CloudFlare. Its main programs are:
- cfssl, the CFSSL command-line tool
- cfssljson, which takes the JSON output of cfssl and writes the certificate, key, CSR and bundle to files.
Make the certificate tools executable and move them to /usr/local/bin
# Add execute permission
[root@node3 k8s-work]# chmod +x cfssl*
# Move to the target directory and rename
[root@node3 k8s-work]# mv cfssl_linux-amd64 /usr/local/bin/cfssl
[root@node3 k8s-work]# mv cfssljson_linux-amd64 /usr/local/bin/cfssljson
[root@node3 k8s-work]# mv cfssl-certinfo_linux-amd64 /usr/local/bin/cfssl-certinfo
Create the CA certificate signing request file
cat > ca-csr.json <<"EOF"
{
"CN": "kubernetes",
"key": {
"algo": "rsa",
"size": 2048
},
"names": [
{
"C": "CN",
"ST": "Beijing",
"L": "Beijing",
"O": "k8s",
"OU": "CN"
}
],
"ca": {
"expiry": "87600h"
}
}
EOF
Create the CA certificate
[root@node3 k8s-work]# cfssl gencert -initca ca-csr.json | cfssljson -bare ca
Configure the CA signing policy
cat > ca-config.json <<"EOF"
{
"signing": {
"default": {
"expiry": "87600h"
},
"profiles": {
"kubernetes": {
"usages": [
"signing",
"key encipherment",
"server auth",
"client auth"
],
"expiry": "87600h"
}
}
}
}
EOF
# Notes
# server auth means the client can use this CA to verify the certificate presented by the server
# client auth means the server can use this CA to verify the certificate presented by the client
Create the etcd certificate request file
cat > etcd-csr.json <<"EOF"
{
"CN": "etcd",
"hosts": [
"127.0.0.1",
"192.168.202.131",
"192.168.202.132",
"192.168.202.134"
],
"key": {
"algo": "rsa",
"size": 2048
},
"names": [{
"C": "CN",
"ST": "Beijing",
"L": "Beijing",
"O": "k8s",
"OU": "CN"
}]
}
EOF
Generate the etcd certificates
cfssl gencert -ca=ca.pem -ca-key=ca-key.pem -config=ca-config.json -profile=kubernetes etcd-csr.json | cfssljson -bare etcd
# The working directory now contains the following files
[root@node3 k8s-work]# ls
ca-config.json ca.csr ca-csr.json ca-key.pem ca.pem etcd.csr etcd-csr.json etcd-key.pem etcd.pem
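Optionally, a generated certificate can be inspected with cfssl-certinfo to confirm its SAN list and expiry, for example:
cfssl-certinfo -cert etcd.pem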
Download etcd
wget https://github.com/etcd-io/etcd/releases/download/v3.5.9/etcd-v3.5.9-linux-amd64.tar.gz
Extract the etcd tarball and copy the binaries to /usr/local/bin
# Extract the tarball
[root@node3 ~]# tar -xf etcd-v3.5.9-linux-amd64.tar.gz
[root@node3 ~]# ls etcd-v3.5.9-linux-amd64
Documentation etcd etcdctl etcdutl README-etcdctl.md README-etcdutl.md README.md READMEv2-etcdctl.md
# Copy the etcd binaries to /usr/local/bin
# -p preserves file permissions, ownership and timestamps
[root@node3 ~]# cp -p etcd-v3.5.9-linux-amd64/etcd* /usr/local/bin
Create the etcd configuration file
etcd configuration file for node3
# Create the directory that holds the etcd configuration
[root@node3 ~]# mkdir /etc/etcd/
# Create the configuration file
[root@node3 ~]# cat > /etc/etcd/etcd.conf <<"EOF"
#[Member]
ETCD_NAME="etcd1"
ETCD_DATA_DIR="/var/lib/etcd/default.etcd"
ETCD_LISTEN_PEER_URLS="https://192.168.202.131:2380"
ETCD_LISTEN_CLIENT_URLS="https://192.168.202.131:2379,http://127.0.0.1:2379"
#[Clustering]
ETCD_INITIAL_ADVERTISE_PEER_URLS="https://192.168.202.131:2380"
ETCD_ADVERTISE_CLIENT_URLS="https://192.168.202.131:2379"
ETCD_INITIAL_CLUSTER="etcd1=https://192.168.202.131:2380,etcd2=https://192.168.202.132:2380,etcd3=https://192.168.202.134:2380"
ETCD_INITIAL_CLUSTER_TOKEN="etcd-cluster"
ETCD_INITIAL_CLUSTER_STATE="new"
EOF
--------------------------------------------------------------------------------
Explanation:
ETCD_NAME: node name, unique within the cluster
ETCD_DATA_DIR: data directory
ETCD_LISTEN_PEER_URLS: listen address for peer (cluster) communication
ETCD_LISTEN_CLIENT_URLS: listen address for client access
ETCD_INITIAL_ADVERTISE_PEER_URLS: peer address advertised to the cluster
ETCD_ADVERTISE_CLIENT_URLS: client address advertised to clients
ETCD_INITIAL_CLUSTER: addresses of all cluster members
ETCD_INITIAL_CLUSTER_TOKEN: cluster token
ETCD_INITIAL_CLUSTER_STATE: state when joining the cluster; new for a new cluster, existing to join an existing one
Create the directories for the certificates and the data
mkdir -p /etc/etcd/ssl
mkdir -p /var/lib/etcd/default.etcd
Copy the generated certificates to /etc/etcd/ssl
[root@node3 ~]# cp /data/k8s-work/ca*.pem /etc/etcd/ssl
[root@node3 ~]# cp /data/k8s-work/etcd*.pem /etc/etcd/ssl/
Create the etcd systemd unit file
cat > /etc/systemd/system/etcd.service <<"EOF"
[Unit]
Description=Etcd Server
After=network.target
After=network-online.target
Wants=network-online.target
[Service]
Type=notify
EnvironmentFile=-/etc/etcd/etcd.conf
WorkingDirectory=/var/lib/etcd/
ExecStart=/usr/local/bin/etcd \
--cert-file=/etc/etcd/ssl/etcd.pem \
--key-file=/etc/etcd/ssl/etcd-key.pem \
--trusted-ca-file=/etc/etcd/ssl/ca.pem \
--peer-cert-file=/etc/etcd/ssl/etcd.pem \
--peer-key-file=/etc/etcd/ssl/etcd-key.pem \
--peer-trusted-ca-file=/etc/etcd/ssl/ca.pem \
--peer-client-cert-auth \
--client-cert-auth
Restart=on-failure
RestartSec=5
LimitNOFILE=65536
[Install]
WantedBy=multi-user.target
EOF
etcd configuration file for node4
# Create the directory that holds the etcd configuration
[root@node4 ~]# mkdir /etc/etcd/
# Create the configuration file
[root@node4 ~]# cat > /etc/etcd/etcd.conf <<"EOF"
#[Member]
ETCD_NAME="etcd2"
ETCD_DATA_DIR="/var/lib/etcd/default.etcd"
ETCD_LISTEN_PEER_URLS="https://192.168.202.132:2380"
ETCD_LISTEN_CLIENT_URLS="https://192.168.202.132:2379,http://127.0.0.1:2379"
#[Clustering]
ETCD_INITIAL_ADVERTISE_PEER_URLS="https://192.168.202.132:2380"
ETCD_ADVERTISE_CLIENT_URLS="https://192.168.202.132:2379"
ETCD_INITIAL_CLUSTER="etcd1=https://192.168.202.131:2380,etcd2=https://192.168.202.132:2380,etcd3=https://192.168.202.134:2380"
ETCD_INITIAL_CLUSTER_TOKEN="etcd-cluster"
ETCD_INITIAL_CLUSTER_STATE="new"
EOF
Create the directories for the certificates and the data
mkdir -p /etc/etcd/ssl
mkdir -p /var/lib/etcd/default.etcd
Copy the certificates generated on node3 to /etc/etcd/ssl on node4
[root@node3 ssl]# pwd
/etc/etcd/ssl
[root@node3 ssl]# ls
ca-key.pem ca.pem etcd-key.pem etcd.pem
[root@node3 ssl]# scp ./* node4:/etc/etcd/ssl
root@node4's password:
ca-key.pem 100% 1679 2.5MB/s 00:00
ca.pem 100% 1346 2.3MB/s 00:00
etcd-key.pem 100% 1679 3.2MB/s 00:00
etcd.pem
Create the etcd systemd unit file
cat > /etc/systemd/system/etcd.service <<"EOF"
[Unit]
Description=Etcd Server
After=network.target
After=network-online.target
Wants=network-online.target
[Service]
Type=notify
EnvironmentFile=-/etc/etcd/etcd.conf
WorkingDirectory=/var/lib/etcd/
ExecStart=/usr/local/bin/etcd \
--cert-file=/etc/etcd/ssl/etcd.pem \
--key-file=/etc/etcd/ssl/etcd-key.pem \
--trusted-ca-file=/etc/etcd/ssl/ca.pem \
--peer-cert-file=/etc/etcd/ssl/etcd.pem \
--peer-key-file=/etc/etcd/ssl/etcd-key.pem \
--peer-trusted-ca-file=/etc/etcd/ssl/ca.pem \
--peer-client-cert-auth \
--client-cert-auth
Restart=on-failure
RestartSec=5
LimitNOFILE=65536
[Install]
WantedBy=multi-user.target
EOF
etcd configuration file for node5
# Create the directory that holds the etcd configuration
[root@node5 ~]# mkdir /etc/etcd/
# Create the configuration file
[root@node5 ~]# cat > /etc/etcd/etcd.conf <<"EOF"
#[Member]
ETCD_NAME="etcd3"
ETCD_DATA_DIR="/var/lib/etcd/default.etcd"
ETCD_LISTEN_PEER_URLS="https://192.168.202.134:2380"
ETCD_LISTEN_CLIENT_URLS="https://192.168.202.134:2379,http://127.0.0.1:2379"
#[Clustering]
ETCD_INITIAL_ADVERTISE_PEER_URLS="https://192.168.202.134:2380"
ETCD_ADVERTISE_CLIENT_URLS="https://192.168.202.134:2379"
ETCD_INITIAL_CLUSTER="etcd1=https://192.168.202.131:2380,etcd2=https://192.168.202.132:2380,etcd3=https://192.168.202.134:2380"
ETCD_INITIAL_CLUSTER_TOKEN="etcd-cluster"
ETCD_INITIAL_CLUSTER_STATE="new"
EOF
Create the directories for the certificates and the data
mkdir -p /etc/etcd/ssl
mkdir -p /var/lib/etcd/default.etcd
Copy the certificates generated on node3 to /etc/etcd/ssl on node5
[root@node3 ssl]# pwd
/etc/etcd/ssl
[root@node3 ssl]# ls
ca-key.pem ca.pem etcd-key.pem etcd.pem
[root@node3 ssl]# scp ./* node5:/etc/etcd/ssl
root@node5's password:
ca-key.pem 100% 1679 2.5MB/s 00:00
ca.pem 100% 1346 2.3MB/s 00:00
etcd-key.pem 100% 1679 3.2MB/s 00:00
etcd.pem
Create the etcd systemd unit file
cat > /etc/systemd/system/etcd.service <<"EOF"
[Unit]
Description=Etcd Server
After=network.target
After=network-online.target
Wants=network-online.target
[Service]
Type=notify
EnvironmentFile=-/etc/etcd/etcd.conf
WorkingDirectory=/var/lib/etcd/
ExecStart=/usr/local/bin/etcd \
--cert-file=/etc/etcd/ssl/etcd.pem \
--key-file=/etc/etcd/ssl/etcd-key.pem \
--trusted-ca-file=/etc/etcd/ssl/ca.pem \
--peer-cert-file=/etc/etcd/ssl/etcd.pem \
--peer-key-file=/etc/etcd/ssl/etcd-key.pem \
--peer-trusted-ca-file=/etc/etcd/ssl/ca.pem \
--peer-client-cert-auth \
--client-cert-auth
Restart=on-failure
RestartSec=5
LimitNOFILE=65536
[Install]
WantedBy=multi-user.target
EOF
Start the etcd cluster
Run on node3, node4 and node5
systemctl daemon-reload
systemctl enable --now etcd.service
Verify the etcd cluster health
[root@node3 ssl]# ETCDCTL_API=3 /usr/local/bin/etcdctl --write-out=table --cacert=/etc/etcd/ssl/ca.pem --cert=/etc/etcd/ssl/etcd.pem --key=/etc/etcd/ssl/etcd-key.pem --endpoints=https://192.168.202.131:2379,https://192.168.202.132:2379,https://192.168.202.134:2379 endpoint health
+------------------------------+--------+------------+-------+
| ENDPOINT | HEALTH | TOOK | ERROR |
+------------------------------+--------+------------+-------+
| https://192.168.202.131:2379 | true | 8.678307ms | |
| https://192.168.202.132:2379 | true | 9.428444ms | |
| https://192.168.202.134:2379 | true | 9.347641ms | |
+------------------------------+--------+------------+-------+
Check database performance
[root@node3 ssl]# ETCDCTL_API=3 /usr/local/bin/etcdctl --write-out=table --cacert=/etc/etcd/ssl/ca.pem --cert=/etc/etcd/ssl/etcd.pem --key=/etc/etcd/ssl/etcd-key.pem --endpoints=https://192.168.202.131:2379,https://192.168.202.132:2379,https://192.168.202.134:2379 check perf
59 / 60 Booooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooom ! 98.33%
PASS: Throughput is 151 writes/s
PASS: Slowest request took 0.023224s
PASS: Stddev is 0.000461s
PASS
List the cluster members
[root@node3 ssl]# ETCDCTL_API=3 /usr/local/bin/etcdctl --write-out=table --cacert=/etc/etcd/ssl/ca.pem --cert=/etc/etcd/ssl/etcd.pem --key=/etc/etcd/ssl/etcd-key.pem --endpoints=https://192.168.202.131:2379,https://192.168.202.132:2379,https://192.168.202.134:2379 member list
+------------------+---------+-------+------------------------------+------------------------------+------------+
| ID | STATUS | NAME | PEER ADDRS | CLIENT ADDRS | IS LEARNER |
+------------------+---------+-------+------------------------------+------------------------------+------------+
| 80348da092b8e96 | started | etcd1 | https://192.168.202.131:2380 | https://192.168.202.131:2379 | false |
| 70222b4638121a8b | started | etcd2 | https://192.168.202.132:2380 | https://192.168.202.132:2379 | false |
| a9089efe301a5c67 | started | etcd3 | https://192.168.202.134:2380 | https://192.168.202.134:2379 | false |
+------------------+---------+-------+------------------------------+------------------------------+------------+
Check the status of the etcd cluster endpoints
[root@node3 ssl]# ETCDCTL_API=3 /usr/local/bin/etcdctl --write-out=table --cacert=/etc/etcd/ssl/ca.pem --cert=/etc/etcd/ssl/etcd.pem --key=/etc/etcd/ssl/etcd-key.pem --endpoints=https://192.168.202.131:2379,https://192.168.202.132:2379,https://192.168.202.134:2379 endpoint status
+------------------------------+------------------+---------+---------+-----------+------------+-----------+------------+--------------------+--------+
| ENDPOINT | ID | VERSION | DB SIZE | IS LEADER | IS LEARNER | RAFT TERM | RAFT INDEX | RAFT APPLIED INDEX | ERRORS |
+------------------------------+------------------+---------+---------+-----------+------------+-----------+------------+--------------------+--------+
| https://192.168.202.131:2379 | 80348da092b8e96 | 3.5.9 | 22 MB | true | false | 2 | 8990 | 8990 | |
| https://192.168.202.132:2379 | 70222b4638121a8b | 3.5.9 | 22 MB | false | false | 2 | 8990 | 8990 | |
| https://192.168.202.134:2379 | a9089efe301a5c67 | 3.5.9 | 22 MB | false | false | 2 | 8990 | 8990 | |
+------------------------------+------------------+---------+---------+-----------+------------+-----------+------------+--------------------+--------+
Deploy the components required on the k8s master nodes
The master nodes (node3, node4, node5) need the following components:
- kube-apiserver
- kubectl
- kube-controller-manager
- kube-scheduler
Download the binaries
# Download the package on node3; it will be copied to the other nodes later
[root@node3 ~]# wget https://dl.k8s.io/v1.28.2/kubernetes-server-linux-amd64.tar.gz
Extract the downloaded package
# Extract the tarball
[root@node3 ~]# tar -xf kubernetes-server-linux-amd64.tar.gz
[root@node3 ~]# cd kubernetes
[root@node3 kubernetes]# ls
addons kubernetes-src.tar.gz LICENSES server
[root@node3 kubernetes]# cd server/
# Enter the bin directory, which contains the relevant binaries
[root@node3 server]# ls bin/
apiextensions-apiserver kube-apiserver.docker_tag kube-controller-manager.tar kubectl.tar kube-proxy.docker_tag kube-scheduler.tar
kubeadm kube-apiserver.tar kubectl kubelet kube-proxy.tar mounter
kube-aggregator kube-controller-manager kubectl-convert kube-log-runner kube-scheduler
kube-apiserver kube-controller-manager.docker_tag kubectl.docker_tag kube-proxy kube-scheduler.docker_tag
On node3, copy the required files to the target directory
[root@node3 bin]# cp kube-apiserver kube-controller-manager kube-scheduler kubectl /usr/local/bin/
Copy the required binaries from node3 to node4
[root@node3 bin]# scp kube-apiserver kube-controller-manager kube-scheduler kubectl node4:/usr/local/bin/
root@node4's password:
kube-apiserver 100% 116MB 197.7MB/s 00:00
kube-controller-manager 100% 112MB 195.2MB/s 00:00
kube-scheduler 100% 53MB 198.9MB/s 00:00
kubectl 100% 48MB 202.5MB/s 00:00
[root@node3 bin]#
Copy the required binaries from node3 to node5
[root@node3 bin]# scp kube-apiserver kube-controller-manager kube-scheduler kubectl node5:/usr/local/bin/
root@node5's password:
kube-apiserver 100% 116MB 171.9MB/s 00:00
kube-controller-manager 100% 112MB 180.5MB/s 00:00
kube-scheduler 100% 53MB 200.3MB/s 00:00
kubectl 100% 48MB 221.1MB/s 00:00
[root@node3 bin]#
Deploy the kube-apiserver component
Create the directories; run on all k8s master nodes
[root@node3 bin]# mkdir -p /etc/kubernetes
[root@node3 bin]# mkdir -p /etc/kubernetes/ssl
[root@node3 bin]# mkdir -p /var/log/kubernetes
Create the apiserver certificate request file
cat > kube-apiserver-csr.json << "EOF"
{
"CN": "kubernetes",
"hosts": [
"127.0.0.1",
"192.168.202.100",
"192.168.202.129",
"192.168.202.130",
"192.168.202.131",
"192.168.202.132",
"192.168.202.134",
"192.168.202.136",
"192.168.202.137",
"192.168.202.138",
"192.168.202.139",
"192.168.202.140",
"10.96.0.1",
"kubernetes",
"kubernetes.default",
"kubernetes.default.svc",
"kubernetes.default.svc.cluster",
"kubernetes.default.svc.cluster.local"
],
"key": {
"algo": "rsa",
"size": 2048
},
"names": [
{
"C": "CN",
"ST": "Beijing",
"L": "Beijing",
"O": "k8s",
"OU": "CN"
}
]
}
EOF
Notes:
If the hosts field is not empty, it must list every IP (including the VIP) or domain name that is authorized to use this certificate. Because the certificate is used throughout the cluster, include the IPs of all nodes; to make future expansion easier, a few spare IPs can be added as well.
Also include the first IP of the service network (normally the first IP of the service-cluster-ip-range configured on kube-apiserver, e.g. 10.96.0.1).
Generate the apiserver certificate and the token file
[root@node3 k8s-work]# cfssl gencert -ca=ca.pem -ca-key=ca-key.pem -config=ca-config.json -profile=kubernetes kube-apiserver-csr.json | cfssljson -bare kube-apiserver
Copy the generated certificates to the target directory
[root@node3 k8s-work]# cp kube-apiserver.pem kube-apiserver-key.pem ca.pem ca-key.pem /etc/kubernetes/ssl/
Create the token required by TLS bootstrapping
TLS bootstrapping: once the apiserver enables TLS authentication, the kubelet and kube-proxy on the worker nodes must use valid CA-signed certificates to communicate with kube-apiserver. With many nodes, issuing these client certificates by hand is a lot of work and makes scaling the cluster harder. To simplify this, Kubernetes introduces TLS bootstrapping to issue client certificates automatically: the kubelet requests a certificate from the apiserver as a low-privilege user, and the kubelet certificate is signed dynamically by the apiserver. This approach is strongly recommended for worker nodes; it is currently used mainly for the kubelet, while kube-proxy still uses a certificate that we issue ourselves.
cat > /etc/kubernetes/token.csv << EOF
$(head -c 16 /dev/urandom | od -An -t x | tr -d ' '),kubelet-bootstrap,10001,"system:kubelet-bootstrap"
EOF
Create the apiserver service configuration file on node3
cat > /etc/kubernetes/kube-apiserver.conf << "EOF"
KUBE_APISERVER_OPTS="--enable-admission-plugins=NamespaceLifecycle,NodeRestriction,LimitRanger,ServiceAccount,DefaultStorageClass,ResourceQuota \
--anonymous-auth=false \
--bind-address=192.168.202.131 \
--secure-port=6443 \
--advertise-address=192.168.202.131 \
--authorization-mode=Node,RBAC \
--runtime-config=api/all=true \
--enable-bootstrap-token-auth \
--service-cluster-ip-range=10.96.0.0/16 \
--token-auth-file=/etc/kubernetes/token.csv \
--service-node-port-range=30000-32767 \
--tls-cert-file=/etc/kubernetes/ssl/kube-apiserver.pem \
--tls-private-key-file=/etc/kubernetes/ssl/kube-apiserver-key.pem \
--client-ca-file=/etc/kubernetes/ssl/ca.pem \
--kubelet-client-certificate=/etc/kubernetes/ssl/kube-apiserver.pem \
--kubelet-client-key=/etc/kubernetes/ssl/kube-apiserver-key.pem \
--service-account-key-file=/etc/kubernetes/ssl/ca-key.pem \
--service-account-signing-key-file=/etc/kubernetes/ssl/ca-key.pem \
--service-account-issuer=api \
--etcd-cafile=/etc/etcd/ssl/ca.pem \
--etcd-certfile=/etc/etcd/ssl/etcd.pem \
--etcd-keyfile=/etc/etcd/ssl/etcd-key.pem \
--etcd-servers=https://192.168.202.131:2379,https://192.168.202.132:2379,https://192.168.202.134:2379 \
--allow-privileged=true \
--apiserver-count=3 \
--audit-log-maxage=30 \
--audit-log-maxbackup=3 \
--audit-log-maxsize=100 \
--audit-log-path=/var/log/kube-apiserver-audit.log \
--event-ttl=1h \
--v=4"
EOF
Create the apiserver systemd unit file
cat > /etc/systemd/system/kube-apiserver.service << "EOF"
[Unit]
Description=Kubernetes API Server
Documentation=https://github.com/kubernetes/kubernetes
After=etcd.service
Wants=etcd.service
[Service]
EnvironmentFile=-/etc/kubernetes/kube-apiserver.conf
ExecStart=/usr/local/bin/kube-apiserver $KUBE_APISERVER_OPTS
Restart=on-failure
RestartSec=5
Type=notify
LimitNOFILE=65536
[Install]
WantedBy=multi-user.target
EOF
Configure the node4 node
Copy the certificate files from node3 to node4
[root@node3 k8s-work]# scp ca*.pem root@node4:/etc/kubernetes/ssl
root@node4's password:
ca-key.pem 100% 1679 3.1MB/s 00:00
ca.pem 100% 1346 2.1MB/s 00:00
[root@node3 k8s-work]# scp kube-apiserver*.pem root@node4:/etc/kubernetes/ssl/
root@node4's password:
kube-apiserver-key.pem 100% 1679 3.5MB/s 00:00
kube-apiserver.pem
Copy token.csv to node4 as well
scp /etc/kubernetes/token.csv node4:/etc/kubernetes/
Create the apiserver service configuration file on node4
cat > /etc/kubernetes/kube-apiserver.conf << "EOF"
KUBE_APISERVER_OPTS="--enable-admission-plugins=NamespaceLifecycle,NodeRestriction,LimitRanger,ServiceAccount,DefaultStorageClass,ResourceQuota \
--anonymous-auth=false \
--bind-address=192.168.202.132 \
--secure-port=6443 \
--advertise-address=192.168.202.132 \
--authorization-mode=Node,RBAC \
--runtime-config=api/all=true \
--enable-bootstrap-token-auth \
--service-cluster-ip-range=10.96.0.0/16 \
--token-auth-file=/etc/kubernetes/token.csv \
--service-node-port-range=30000-32767 \
--tls-cert-file=/etc/kubernetes/ssl/kube-apiserver.pem \
--tls-private-key-file=/etc/kubernetes/ssl/kube-apiserver-key.pem \
--client-ca-file=/etc/kubernetes/ssl/ca.pem \
--kubelet-client-certificate=/etc/kubernetes/ssl/kube-apiserver.pem \
--kubelet-client-key=/etc/kubernetes/ssl/kube-apiserver-key.pem \
--service-account-key-file=/etc/kubernetes/ssl/ca-key.pem \
--service-account-signing-key-file=/etc/kubernetes/ssl/ca-key.pem \
--service-account-issuer=api \
--etcd-cafile=/etc/etcd/ssl/ca.pem \
--etcd-certfile=/etc/etcd/ssl/etcd.pem \
--etcd-keyfile=/etc/etcd/ssl/etcd-key.pem \
--etcd-servers=https://192.168.202.131:2379,https://192.168.202.132:2379,https://192.168.202.134:2379 \
--allow-privileged=true \
--apiserver-count=3 \
--audit-log-maxage=30 \
--audit-log-maxbackup=3 \
--audit-log-maxsize=100 \
--audit-log-path=/var/log/kube-apiserver-audit.log \
--event-ttl=1h \
--v=4"
EOF
Create the apiserver systemd unit file on node4
cat > /etc/systemd/system/kube-apiserver.service << "EOF"
[Unit]
Description=Kubernetes API Server
Documentation=https://github.com/kubernetes/kubernetes
After=etcd.service
Wants=etcd.service
[Service]
EnvironmentFile=-/etc/kubernetes/kube-apiserver.conf
ExecStart=/usr/local/bin/kube-apiserver $KUBE_APISERVER_OPTS
Restart=on-failure
RestartSec=5
Type=notify
LimitNOFILE=65536
[Install]
WantedBy=multi-user.target
EOF
Configure the node5 node
Copy the certificate files from node3 to node5
[root@node3 k8s-work]# scp ca*.pem root@node5:/etc/kubernetes/ssl
root@node5's password:
ca-key.pem 100% 1679 3.4MB/s 00:00
ca.pem 100% 1346 4.5MB/s 00:00
[root@node3 k8s-work]# scp kube-apiserver*.pem root@node5:/etc/kubernetes/ssl/
root@node5's password:
kube-apiserver-key.pem 100% 1679 3.6MB/s 00:00
kube-apiserver.pem
Copy token.csv to node5 as well
scp /etc/kubernetes/token.csv node5:/etc/kubernetes/
Create the apiserver service configuration file on node5
cat > /etc/kubernetes/kube-apiserver.conf << "EOF"
KUBE_APISERVER_OPTS="--enable-admission-plugins=NamespaceLifecycle,NodeRestriction,LimitRanger,ServiceAccount,DefaultStorageClass,ResourceQuota \
--anonymous-auth=false \
--bind-address=192.168.202.134 \
--secure-port=6443 \
--advertise-address=192.168.202.134 \
--authorization-mode=Node,RBAC \
--runtime-config=api/all=true \
--enable-bootstrap-token-auth \
--service-cluster-ip-range=10.96.0.0/16 \
--token-auth-file=/etc/kubernetes/token.csv \
--service-node-port-range=30000-32767 \
--tls-cert-file=/etc/kubernetes/ssl/kube-apiserver.pem \
--tls-private-key-file=/etc/kubernetes/ssl/kube-apiserver-key.pem \
--client-ca-file=/etc/kubernetes/ssl/ca.pem \
--kubelet-client-certificate=/etc/kubernetes/ssl/kube-apiserver.pem \
--kubelet-client-key=/etc/kubernetes/ssl/kube-apiserver-key.pem \
--service-account-key-file=/etc/kubernetes/ssl/ca-key.pem \
--service-account-signing-key-file=/etc/kubernetes/ssl/ca-key.pem \
--service-account-issuer=api \
--etcd-cafile=/etc/etcd/ssl/ca.pem \
--etcd-certfile=/etc/etcd/ssl/etcd.pem \
--etcd-keyfile=/etc/etcd/ssl/etcd-key.pem \
--etcd-servers=https://192.168.202.131:2379,https://192.168.202.132:2379,https://192.168.202.134:2379 \
--allow-privileged=true \
--apiserver-count=3 \
--audit-log-maxage=30 \
--audit-log-maxbackup=3 \
--audit-log-maxsize=100 \
--audit-log-path=/var/log/kube-apiserver-audit.log \
--event-ttl=1h \
--v=4"
EOF
Create the apiserver systemd unit file on node5
cat > /etc/systemd/system/kube-apiserver.service << "EOF"
[Unit]
Description=Kubernetes API Server
Documentation=https://github.com/kubernetes/kubernetes
After=etcd.service
Wants=etcd.service
[Service]
EnvironmentFile=-/etc/kubernetes/kube-apiserver.conf
ExecStart=/usr/local/bin/kube-apiserver $KUBE_APISERVER_OPTS
Restart=on-failure
RestartSec=5
Type=notify
LimitNOFILE=65536
[Install]
WantedBy=multi-user.target
EOF
Start the service
Start the apiserver service on node3, node4 and node5
systemctl daemon-reload
systemctl enable kube-apiserver --now
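A simple check that each apiserver came up and is listening on port 6443. With anonymous auth disabled, unauthenticated requests are rejected, but an open port already confirms the process is serving TLS:
systemctl status kube-apiserver --no-pager
ss -lntp | grep 6443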
Deploy kubectl
Create the kubectl certificate request file
cat > admin-csr.json << "EOF"
{
"CN": "admin",
"hosts": [],
"key": {
"algo": "rsa",
"size": 2048
},
"names": [
{
"C": "CN",
"ST": "Beijing",
"L": "Beijing",
"O": "system:masters",
"OU": "system"
}
]
}
EOF
Notes:
kube-apiserver uses RBAC to authorize requests from clients such as kubelet, kube-proxy and Pods;
kube-apiserver predefines some RoleBindings used by RBAC, for example cluster-admin, which binds the Group system:masters to the Role cluster-admin and grants permission to call all kube-apiserver APIs;
O sets the certificate's Group to system:masters. When this certificate is used to access kube-apiserver, authentication succeeds because the certificate is signed by the CA, and because the certificate's group is the pre-authorized system:masters it is granted access to all APIs;
Note:
This admin certificate is later used to generate the administrator's kubeconfig file. RBAC is the recommended way to control roles and permissions in Kubernetes; Kubernetes uses the certificate's CN field as the User and the O field as the Group;
"O": "system:masters" must be exactly system:masters, otherwise the kubectl create clusterrolebinding step later will fail.
Generate the certificate
cfssl gencert -ca=ca.pem -ca-key=ca-key.pem -config=ca-config.json -profile=kubernetes admin-csr.json | cfssljson -bare admin
Copy the certificate files to the target directory
cp admin*.pem /etc/kubernetes/ssl/
Generate the kubeconfig file
kube.config is kubectl's configuration file; it contains everything needed to access the apiserver, such as the apiserver address, the CA certificate and the client certificate
kubectl config set-cluster kubernetes --certificate-authority=ca.pem --embed-certs=true --server=https://192.168.202.140:6443 --kubeconfig=kube.config
kubectl config set-credentials admin --client-certificate=admin.pem --client-key=admin-key.pem --embed-certs=true --kubeconfig=kube.config
kubectl config set-context kubernetes --cluster=kubernetes --user=admin --kubeconfig=kube.config
kubectl config use-context kubernetes --kubeconfig=kube.config
Install the kubectl configuration file and create the role binding
mkdir ~/.kube
cp kube.config ~/.kube/config
kubectl create clusterrolebinding kube-apiserver:kubelet-apis --clusterrole=system:kubelet-api-admin --user kubernetes --kubeconfig=/root/.kube/config
Check the status
Check the cluster status
[root@node3 k8s-work]# kubectl cluster-info
Kubernetes control plane is running at https://192.168.202.140:6443
To further debug and diagnose cluster problems, use 'kubectl cluster-info dump'.
Check the component status
[root@node3 k8s-work]# kubectl get componentstatuses
Warning: v1 ComponentStatus is deprecated in v1.19+
NAME STATUS MESSAGE ERROR
scheduler Unhealthy Get "https://127.0.0.1:10259/healthz": dial tcp 127.0.0.1:10259: connect: connection refused
controller-manager Unhealthy Get "https://127.0.0.1:10257/healthz": dial tcp 127.0.0.1:10257: connect: connection refused
etcd-0 Healthy ok
List the resource objects in all namespaces
[root@node3 k8s-work]# kubectl get all --all-namespaces
NAMESPACE NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
default service/kubernetes ClusterIP 10.96.0.1 <none> 443/TCP 37m
Sync the kubectl configuration file to the other master nodes
# node4
[root@node4 ~]# mkdir /root/.kube
# node5
[root@node5 ~]# mkdir /root/.kube
[root@node3 k8s-work]# scp /root/.kube/config node4:/root/.kube/config
root@node4's password:
config 100% 6239 10.5MB/s 00:00
[root@node3 k8s-work]# scp /root/.kube/config node5:/root/.kube/config
root@node5's password:
config
Deploy kube-controller-manager
Create the kube-controller-manager certificate request file
cat > kube-controller-manager-csr.json << "EOF"
{
"CN": "system:kube-controller-manager",
"key": {
"algo": "rsa",
"size": 2048
},
"hosts": [
"127.0.0.1",
"192.168.202.131",
"192.168.202.132",
"192.168.202.134"
],
"names": [
{
"C": "CN",
"ST": "Beijing",
"L": "Beijing",
"O": "system:kube-controller-manager",
"OU": "system"
}
]
}
EOF
Notes:
The hosts list contains the IPs of all kube-controller-manager nodes;
CN is system:kube-controller-manager;
O is system:kube-controller-manager; the built-in ClusterRoleBinding system:kube-controller-manager grants kube-controller-manager the permissions it needs to work.
Generate the kube-controller-manager certificate
cfssl gencert -ca=ca.pem -ca-key=ca-key.pem -config=ca-config.json -profile=kubernetes kube-controller-manager-csr.json | cfssljson -bare kube-controller-manager
Create kube-controller-manager.kubeconfig for kube-controller-manager
kubectl config set-cluster kubernetes --certificate-authority=ca.pem --embed-certs=true --server=https://192.168.202.140:6443 --kubeconfig=kube-controller-manager.kubeconfig
kubectl config set-credentials system:kube-controller-manager --client-certificate=kube-controller-manager.pem --client-key=kube-controller-manager-key.pem --embed-certs=true --kubeconfig=kube-controller-manager.kubeconfig
kubectl config set-context system:kube-controller-manager --cluster=kubernetes --user=system:kube-controller-manager --kubeconfig=kube-controller-manager.kubeconfig
kubectl config use-context system:kube-controller-manager --kubeconfig=kube-controller-manager.kubeconfig
Create the kube-controller-manager configuration file
cat > kube-controller-manager.conf << "EOF"
KUBE_CONTROLLER_MANAGER_OPTS="
--secure-port=10257 \
--bind-address=127.0.0.1 \
--kubeconfig=/etc/kubernetes/kube-controller-manager.kubeconfig \
--service-cluster-ip-range=10.96.0.0/16 \
--cluster-name=kubernetes \
--cluster-signing-cert-file=/etc/kubernetes/ssl/ca.pem \
--cluster-signing-key-file=/etc/kubernetes/ssl/ca-key.pem \
--allocate-node-cidrs=true \
--cluster-cidr=10.244.0.0/16 \
--root-ca-file=/etc/kubernetes/ssl/ca.pem \
--service-account-private-key-file=/etc/kubernetes/ssl/ca-key.pem \
--leader-elect=true \
--feature-gates=RotateKubeletServerCertificate=true \
--controllers=*,bootstrapsigner,tokencleaner \
--horizontal-pod-autoscaler-sync-period=10s \
--tls-cert-file=/etc/kubernetes/ssl/kube-controller-manager.pem \
--tls-private-key-file=/etc/kubernetes/ssl/kube-controller-manager-key.pem \
--use-service-account-credentials=true \
--v=2"
EOF
Create the service unit file
cat > kube-controller-manager.service << "EOF"
[Unit]
Description=Kubernetes Controller Manager
Documentation=https://github.com/kubernetes/kubernetes
[Service]
EnvironmentFile=-/etc/kubernetes/kube-controller-manager.conf
ExecStart=/usr/local/bin/kube-controller-manager $KUBE_CONTROLLER_MANAGER_OPTS
Restart=on-failure
RestartSec=5
[Install]
WantedBy=multi-user.target
EOF
Copy the following files to the corresponding directories on the master nodes
cp kube-controller-manager*.pem /etc/kubernetes/ssl/
cp kube-controller-manager.kubeconfig /etc/kubernetes/
cp kube-controller-manager.conf /etc/kubernetes/
cp kube-controller-manager.service /usr/lib/systemd/system/
scp kube-controller-manager*.pem node4:/etc/kubernetes/ssl/
scp kube-controller-manager*.pem node5:/etc/kubernetes/ssl/
scp kube-controller-manager.kubeconfig kube-controller-manager.conf node4:/etc/kubernetes/
scp kube-controller-manager.kubeconfig kube-controller-manager.conf node5:/etc/kubernetes/
scp kube-controller-manager.service node4:/usr/lib/systemd/system/
scp kube-controller-manager.service node5:/usr/lib/systemd/system/
Start the service
systemctl daemon-reload
systemctl enable --now kube-controller-manager
Check the component status
[root@node5 ~]# kubectl get componentstatuses
Warning: v1 ComponentStatus is deprecated in v1.19+
NAME STATUS MESSAGE ERROR
scheduler Unhealthy Get "https://127.0.0.1:10259/healthz": dial tcp 127.0.0.1:10259: connect: connection refused
controller-manager Healthy ok
etcd-0 Healthy ok
Deploy kube-scheduler
Create the kube-scheduler certificate request file
cat > kube-scheduler-csr.json << "EOF"
{
"CN": "system:kube-scheduler",
"hosts": [
"127.0.0.1",
"192.168.202.131",
"192.168.202.132",
"192.168.202.134"
],
"key": {
"algo": "rsa",
"size": 2048
},
"names": [
{
"C": "CN",
"ST": "Beijing",
"L": "Beijing",
"O": "system:kube-scheduler",
"OU": "system"
}
]
}
EOF
Generate the kube-scheduler certificate
cfssl gencert -ca=ca.pem -ca-key=ca-key.pem -config=ca-config.json -profile=kubernetes kube-scheduler-csr.json | cfssljson -bare kube-scheduler
Create the kube-scheduler kubeconfig
kubectl config set-cluster kubernetes --certificate-authority=ca.pem --embed-certs=true --server=https://192.168.202.140:6443 --kubeconfig=kube-scheduler.kubeconfig
kubectl config set-credentials system:kube-scheduler --client-certificate=kube-scheduler.pem --client-key=kube-scheduler-key.pem --embed-certs=true --kubeconfig=kube-scheduler.kubeconfig
kubectl config set-context system:kube-scheduler --cluster=kubernetes --user=system:kube-scheduler --kubeconfig=kube-scheduler.kubeconfig
kubectl config use-context system:kube-scheduler --kubeconfig=kube-scheduler.kubeconfig
Create the service configuration file
cat > kube-scheduler.conf << "EOF"
KUBE_SCHEDULER_OPTS="
--bind-address=127.0.0.1 \
--kubeconfig=/etc/kubernetes/kube-scheduler.kubeconfig \
--leader-elect=true \
--v=2"
EOF
Create the service unit file
cat > kube-scheduler.service << "EOF"
[Unit]
Description=Kubernetes Scheduler
Documentation=https://github.com/kubernetes/kubernetes
[Service]
EnvironmentFile=-/etc/kubernetes/kube-scheduler.conf
ExecStart=/usr/local/bin/kube-scheduler $KUBE_SCHEDULER_OPTS
Restart=on-failure
RestartSec=5
[Install]
WantedBy=multi-user.target
EOF
Distribute the files to the master nodes
cp kube-scheduler*.pem /etc/kubernetes/ssl/
cp kube-scheduler.kubeconfig /etc/kubernetes/
cp kube-scheduler.conf /etc/kubernetes/
cp kube-scheduler.service /usr/lib/systemd/system/
scp kube-scheduler*.pem node4:/etc/kubernetes/ssl/
scp kube-scheduler*.pem node5:/etc/kubernetes/ssl/
scp kube-scheduler.kubeconfig kube-scheduler.conf node4:/etc/kubernetes/
scp kube-scheduler.kubeconfig kube-scheduler.conf node5:/etc/kubernetes/
scp kube-scheduler.service node4:/usr/lib/systemd/system/
scp kube-scheduler.service node5:/usr/lib/systemd/system/
Start the service
systemctl daemon-reload
systemctl enable --now kube-scheduler
systemctl status kube-scheduler
Check the components
[root@node5 ~]# kubectl get componentstatuses
Warning: v1 ComponentStatus is deprecated in v1.19+
NAME STATUS MESSAGE ERROR
controller-manager Healthy ok
scheduler Healthy ok
etcd-0 Healthy ok
Deploy the components required on the k8s worker nodes
The worker nodes (node6, node7) need the following components:
- kubelet
- kube-proxy
- the container runtime, containerd
Copy the required binaries from node3 to node6 and node7
[root@node3 bin]# scp kubelet kube-proxy node6:/usr/local/bin/
[root@node3 bin]# scp kubelet kube-proxy node7:/usr/local/bin/
Install containerd
wget https://github.com/containerd/containerd/releases/download/v1.6.24/cri-containerd-cni-1.6.24-linux-amd64.tar.gz
tar -xf cri-containerd-cni-1.6.24-linux-amd64.tar.gz -C /
By default the archive contains the following directories:
etc
opt
usr
These are extracted into the matching directories under /, which saves the step of copying files manually.
Create the configuration file
containerd config default >/etc/containerd/config.toml
sed -i 's#SystemdCgroup = false#SystemdCgroup = true#g' /etc/containerd/config.toml
sed -i 's@registry.k8s.io/pause:3.6@registry.aliyuncs.com/google_containers/pause:3.9@' /etc/containerd/config.toml
Run containerd
systemctl enable containerd --now
Install runc
wget https://github.com/opencontainers/runc/releases/download/v1.1.9/runc.amd64
[root@node6 ~]# chmod +x runc.amd64
[root@node6 ~]# mv runc.amd64 /usr/local/sbin/runc
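A quick sanity check that the runtime pieces installed above are on the PATH:
containerd --version
runc --version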
Deploy kubelet
Generate the configuration on node3, then copy it to the worker nodes
Create kubelet-bootstrap.kubeconfig
BOOTSTRAP_TOKEN=$(awk -F "," '{print $1}' /etc/kubernetes/token.csv)
kubectl config set-cluster kubernetes --certificate-authority=ca.pem --embed-certs=true --server=https://192.168.202.140:6443 --kubeconfig=kubelet-bootstrap.kubeconfig
kubectl config set-credentials kubelet-bootstrap --token=${BOOTSTRAP_TOKEN} --kubeconfig=kubelet-bootstrap.kubeconfig
kubectl config set-context default --cluster=kubernetes --user=kubelet-bootstrap --kubeconfig=kubelet-bootstrap.kubeconfig
kubectl config use-context default --kubeconfig=kubelet-bootstrap.kubeconfig
kubectl create clusterrolebinding cluster-system-anonymous --clusterrole=cluster-admin --user=kubelet-bootstrap
kubectl create clusterrolebinding kubelet-bootstrap --clusterrole=system:node-bootstrapper --user=kubelet-bootstrap --kubeconfig=kubelet-bootstrap.kubeconfig
kubectl describe clusterrolebinding cluster-system-anonymous
kubectl describe clusterrolebinding kubelet-bootstrap
Create the directories on the worker nodes node6 and node7
mkdir -p /etc/kubernetes/ssl
mkdir -p /var/lib/kubelet
mkdir -p /var/log/kubernetes
Copy the files to the worker nodes
scp kubelet-bootstrap.kubeconfig node6:/etc/kubernetes/
scp kubelet-bootstrap.kubeconfig node7:/etc/kubernetes/
scp ca.pem node6:/etc/kubernetes/ssl/
scp ca.pem node7:/etc/kubernetes/ssl/
Create the kubelet configuration file on node6
cat > /etc/kubernetes/kubelet.json << "EOF"
{
"kind": "KubeletConfiguration",
"apiVersion": "kubelet.config.k8s.io/v1beta1",
"authentication": {
"x509": {
"clientCAFile": "/etc/kubernetes/ssl/ca.pem"
},
"webhook": {
"enabled": true,
"cacheTTL": "2m0s"
},
"anonymous": {
"enabled": false
}
},
"authorization": {
"mode": "Webhook",
"webhook": {
"cacheAuthorizedTTL": "5m0s",
"cacheUnauthorizedTTL": "30s"
}
},
"address": "192.168.202.136",
"port": 10250,
"readOnlyPort": 10255,
"cgroupDriver": "systemd",
"hairpinMode": "promiscuous-bridge",
"serializeImagePulls": false,
"clusterDomain": "cluster.local.",
"clusterDNS": ["10.96.0.2"]
}
EOF
The configuration file on node7 is the same as node6's, except that address must be set to node7's IP (192.168.202.137).
Create the kubelet service unit file
cat > /usr/lib/systemd/system/kubelet.service << "EOF"
[Unit]
Description=Kubernetes Kubelet
Documentation=https://github.com/kubernetes/kubernetes
After=containerd.service
Requires=containerd.service
[Service]
WorkingDirectory=/var/lib/kubelet
ExecStart=/usr/local/bin/kubelet \
--bootstrap-kubeconfig=/etc/kubernetes/kubelet-bootstrap.kubeconfig \
--cert-dir=/etc/kubernetes/ssl \
--kubeconfig=/etc/kubernetes/kubelet.kubeconfig \
--config=/etc/kubernetes/kubelet.json \
--container-runtime-endpoint=unix:///run/containerd/containerd.sock \
--rotate-certificates \
--pod-infra-container-image=registry.aliyuncs.com/google_containers/pause:3.9 \
--v=2
Restart=on-failure
RestartSec=5
[Install]
WantedBy=multi-user.target
EOF
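The unit file only defines the service; kubelet still has to be started on node6 and node7. A minimal sketch of the remaining steps (if a node does not appear afterwards, check for pending bootstrap CSRs on a master and approve them):
systemctl daemon-reload
systemctl enable --now kubelet
# on a master node: list pending certificate signing requests and approve them if needed
kubectl get csr
kubectl certificate approve <csr-name>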
Check the nodes
[root@node5 ~]# kubectl get nodes
NAME STATUS ROLES AGE VERSION
node6 Ready <none> 38m v1.28.2
node7 Ready <none> 3m59s v1.28.2
Deploy kube-proxy
Create the kube-proxy certificate request file
cat > kube-proxy-csr.json << "EOF"
{
"CN": "system:kube-proxy",
"key": {
"algo": "rsa",
"size": 2048
},
"names": [
{
"C": "CN",
"ST": "Beijing",
"L": "Beijing",
"O": "k8s",
"OU": "CN"
}
]
}
EOF
Generate the certificate
cfssl gencert -ca=ca.pem -ca-key=ca-key.pem -config=ca-config.json -profile=kubernetes kube-proxy-csr.json | cfssljson -bare kube-proxy
Create the kubeconfig file
kubectl config set-cluster kubernetes --certificate-authority=ca.pem --embed-certs=true --server=https://192.168.202.140:6443 --kubeconfig=kube-proxy.kubeconfig
kubectl config set-credentials kube-proxy --client-certificate=kube-proxy.pem --client-key=kube-proxy-key.pem --embed-certs=true --kubeconfig=kube-proxy.kubeconfig
kubectl config set-context default --cluster=kubernetes --user=kube-proxy --kubeconfig=kube-proxy.kubeconfig
kubectl config use-context default --kubeconfig=kube-proxy.kubeconfig
Copy the files to the worker nodes
scp kube-proxy*.pem node6:/etc/kubernetes/ssl/
scp kube-proxy*.pem node7:/etc/kubernetes/ssl/
scp kube-proxy.kubeconfig node6:/etc/kubernetes/
scp kube-proxy.kubeconfig node7:/etc/kubernetes/
Create the service configuration file on node6
cat > /etc/kubernetes/kube-proxy.yaml << "EOF"
apiVersion: kubeproxy.config.k8s.io/v1alpha1
bindAddress: 192.168.202.136
clientConnection:
kubeconfig: /etc/kubernetes/kube-proxy.kubeconfig
clusterCIDR: 10.244.0.0/16
healthzBindAddress: 192.168.202.136:10256
kind: KubeProxyConfiguration
metricsBindAddress: 192.168.202.136:10249
mode: "ipvs"
EOF
Create the service configuration file on node7
cat > /etc/kubernetes/kube-proxy.yaml << "EOF"
apiVersion: kubeproxy.config.k8s.io/v1alpha1
bindAddress: 192.168.202.137
clientConnection:
kubeconfig: /etc/kubernetes/kube-proxy.kubeconfig
clusterCIDR: 10.244.0.0/16
healthzBindAddress: 192.168.202.137:10256
kind: KubeProxyConfiguration
metricsBindAddress: 192.168.202.137:10249
mode: "ipvs"
EOF
Create the service unit file
mkdir -p /var/lib/kube-proxy
cat > /usr/lib/systemd/system/kube-proxy.service << "EOF"
[Unit]
Description=Kubernetes Kube-Proxy Server
Documentation=https://github.com/kubernetes/kubernetes
After=network.target
[Service]
WorkingDirectory=/var/lib/kube-proxy
ExecStart=/usr/local/bin/kube-proxy \
--config=/etc/kubernetes/kube-proxy.yaml \
--v=2
Restart=on-failure
RestartSec=5
LimitNOFILE=65536
[Install]
WantedBy=multi-user.target
EOF
Start the service
systemctl daemon-reload
systemctl enable --now kube-proxy
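Because mode is set to ipvs, a quick check on the worker nodes is that kube-proxy has created IPVS virtual servers (ipvsadm was installed earlier):
ipvsadm -Ln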
Deploy the Flannel network component
Download the required YAML file
wget https://github.com/flannel-io/flannel/releases/latest/download/kube-flannel.yml
Apply the YAML file
[root@node3 ~]# kubectl apply -f kube-flannel.yml
namespace/kube-flannel created
serviceaccount/flannel created
clusterrole.rbac.authorization.k8s.io/flannel created
clusterrolebinding.rbac.authorization.k8s.io/flannel created
configmap/kube-flannel-cfg created
daemonset.apps/kube-flannel-ds created
Check that Flannel is running
[root@node3 ~]# kubectl get pods -A -owide
NAMESPACE NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
kube-flannel kube-flannel-ds-k2cr8 1/1 Running 0 5m37s 192.168.202.137 node7 <none> <none>
kube-flannel kube-flannel-ds-nbqr8 1/1 Running 0 5m37s 192.168.202.136 node6 <none> <none>
Deploy CoreDNS
cat > coredns.yaml << "EOF"
apiVersion: v1
kind: ServiceAccount
metadata:
name: coredns
namespace: kube-system
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
labels:
kubernetes.io/bootstrapping: rbac-defaults
name: system:coredns
rules:
- apiGroups:
- ""
resources:
- endpoints
- services
- pods
- namespaces
verbs:
- list
- watch
- apiGroups:
- discovery.k8s.io
resources:
- endpointslices
verbs:
- list
- watch
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
annotations:
rbac.authorization.kubernetes.io/autoupdate: "true"
labels:
kubernetes.io/bootstrapping: rbac-defaults
name: system:coredns
roleRef:
apiGroup: rbac.authorization.k8s.io
kind: ClusterRole
name: system:coredns
subjects:
- kind: ServiceAccount
name: coredns
namespace: kube-system
---
apiVersion: v1
kind: ConfigMap
metadata:
name: coredns
namespace: kube-system
data:
Corefile: |
.:53 {
errors
health {
lameduck 5s
}
ready
kubernetes cluster.local in-addr.arpa ip6.arpa {
fallthrough in-addr.arpa ip6.arpa
}
prometheus :9153
forward . /etc/resolv.conf {
max_concurrent 1000
}
cache 30
loop
reload
loadbalance
}
---
apiVersion: apps/v1
kind: Deployment
metadata:
name: coredns
namespace: kube-system
labels:
k8s-app: kube-dns
kubernetes.io/name: "CoreDNS"
spec:
# replicas: not specified here:
# 1. Default is 1.
# 2. Will be tuned in real time if DNS horizontal auto-scaling is turned on.
strategy:
type: RollingUpdate
rollingUpdate:
maxUnavailable: 1
selector:
matchLabels:
k8s-app: kube-dns
template:
metadata:
labels:
k8s-app: kube-dns
spec:
priorityClassName: system-cluster-critical
serviceAccountName: coredns
tolerations:
- key: "CriticalAddonsOnly"
operator: "Exists"
nodeSelector:
kubernetes.io/os: linux
affinity:
podAntiAffinity:
preferredDuringSchedulingIgnoredDuringExecution:
- weight: 100
podAffinityTerm:
labelSelector:
matchExpressions:
- key: k8s-app
operator: In
values: ["kube-dns"]
topologyKey: kubernetes.io/hostname
containers:
- name: coredns
image: coredns/coredns:1.8.4
imagePullPolicy: IfNotPresent
resources:
limits:
memory: 170Mi
requests:
cpu: 100m
memory: 70Mi
args: [ "-conf", "/etc/coredns/Corefile" ]
volumeMounts:
- name: config-volume
mountPath: /etc/coredns
readOnly: true
ports:
- containerPort: 53
name: dns
protocol: UDP
- containerPort: 53
name: dns-tcp
protocol: TCP
- containerPort: 9153
name: metrics
protocol: TCP
securityContext:
allowPrivilegeEscalation: false
capabilities:
add:
- NET_BIND_SERVICE
drop:
- all
readOnlyRootFilesystem: true
livenessProbe:
httpGet:
path: /health
port: 8080
scheme: HTTP
initialDelaySeconds: 60
timeoutSeconds: 5
successThreshold: 1
failureThreshold: 5
readinessProbe:
httpGet:
path: /ready
port: 8181
scheme: HTTP
dnsPolicy: Default
volumes:
- name: config-volume
configMap:
name: coredns
items:
- key: Corefile
path: Corefile
---
apiVersion: v1
kind: Service
metadata:
name: kube-dns
namespace: kube-system
annotations:
prometheus.io/port: "9153"
prometheus.io/scrape: "true"
labels:
k8s-app: kube-dns
kubernetes.io/cluster-service: "true"
kubernetes.io/name: "CoreDNS"
spec:
selector:
k8s-app: kube-dns
clusterIP: 10.96.0.2
ports:
- name: dns
port: 53
protocol: UDP
- name: dns-tcp
port: 53
protocol: TCP
- name: metrics
port: 9153
protocol: TCP
EOF
Apply the YAML file
[root@node3 ~]# kubectl apply -f coredns.yaml
serviceaccount/coredns created
clusterrole.rbac.authorization.k8s.io/system:coredns created
clusterrolebinding.rbac.authorization.k8s.io/system:coredns created
configmap/coredns created
deployment.apps/coredns created
service/kube-dns created
Check the result
[root@node3 ~]# kubectl get pods -A
NAMESPACE NAME READY STATUS RESTARTS AGE
kube-flannel kube-flannel-ds-k2cr8 1/1 Running 0 7m42s
kube-flannel kube-flannel-ds-nbqr8 1/1 Running 0 7m42s
kube-system coredns-6dd8bc9d-5c6bj 1/1 Running 0 50s
Deploy a workload
[root@node3 ~]# cat k8sutil.yaml
apiVersion: apps/v1
kind: DaemonSet
metadata:
name: k8sutil-deployment
spec:
selector:
matchLabels:
app: k8sutil
template:
metadata:
labels:
app: k8sutil
spec:
containers:
- name: k8sutil
image: registry.cn-beijing.aliyuncs.com/postkarte/k8sutils:v1
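Apply the manifest the same way as the earlier ones before checking the Pods:
kubectl apply -f k8sutil.yaml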
View the Pods
[root@node3 ~]# kubectl get pods -A -owide
NAMESPACE NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
default k8sutil-deployment-7pffl 1/1 Running 0 11h 10.244.1.2 node7 <none> <none>
default k8sutil-deployment-nnmxc 1/1 Running 0 3m24s 10.244.0.12 node6 <none> <none>
kube-flannel kube-flannel-ds-nbqr8 1/1 Running 3 (62s ago) 11h 192.168.202.136 node6 <none> <none>
kube-flannel kube-flannel-ds-zqstr 1/1 Running 1 (15m ago) 11h 192.168.202.137 node7 <none> <none>
kube-system coredns-6dd8bc9d-5mkhq 1/1 Running 0 35s 10.244.1.3 node7 <none> <none>
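As a final check, in-cluster DNS resolution can be tested from one of the k8sutil Pods listed above (assuming the k8sutils image ships nslookup; any Pod with DNS tools works):
# the service name should resolve to the kubernetes ClusterIP 10.96.0.1
kubectl exec -it k8sutil-deployment-nnmxc -- nslookup kubernetes.default.svc.cluster.local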
Errors encountered
Error 1
[root@node3 k8s-work]# kubectl create clusterrolebinding kube-apiserver:kubelet-apis --clusterrole=system:kubelet-api-admin --user kubernetes --kubeconfig=/root/.kube/config
error: failed to create clusterrolebinding: Post "https://192.168.202.100:6443/apis/rbac.authorization.k8s.io/v1/clusterrolebindings?fieldManager=kubectl-create&fieldValidation=Strict": tls: failed to verify certificate: x509: certificate is valid for 127.0.0.1, 192.168.202.129, 192.168.202.130, 192.168.202.131, 192.168.202.132, 192.168.202.134, 192.168.202.136, 192.168.202.137, 192.168.202.138, 192.168.202.139, 192.168.202.140, 10.96.0.1, not 192.168.202.100
The cause: the IP 192.168.202.100 was missing from the hosts list in the kube-apiserver-csr.json certificate request file.
Solution: add the missing IP to the hosts list, regenerate the kube-apiserver certificate, redistribute it and restart kube-apiserver.