Deploying a v1.28.2 K8s Cluster from Binaries (Containerd)


Host Preparation

This deployment uses Rocky Linux 9.2 on 7 nodes.

[root@node1 ~]# uname -a
Linux node1 5.14.0-284.11.1.el9_2.x86_64 #1 SMP PREEMPT_DYNAMIC Tue May 9 17:09:15 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux

The node assignments are as follows:

#  Hostname  IP Address       Description
1  node1     192.168.202.129  Node 1 running haproxy
2  node2     192.168.202.130  Node 2 running haproxy
3  node3     192.168.202.131  K8s master1
4  node4     192.168.202.132  K8s master2
5  node5     192.168.202.134  K8s master3
6  node6     192.168.202.136  Worker node1
7  node7     192.168.202.137  Worker node2
8  lb        192.168.202.140  VIP (floating IP only, not a real host)

Hostname and IP Address Resolution

Required on all servers. Edit /etc/hosts and add the following entries:

[root@node1 ~]# vim /etc/hosts
[root@node1 ~]# cat /etc/hosts
127.0.0.1   localhost localhost.localdomain localhost4 localhost4.localdomain4
::1         localhost localhost.localdomain localhost6 localhost6.localdomain6
192.168.202.129  node1
192.168.202.130  node2
192.168.202.131  node3
192.168.202.132  node4
192.168.202.134  node5
192.168.202.136  node6
192.168.202.137  node7

Disable the Firewall

Required on all hosts.

systemctl disable firewalld --now

Disable SELinux

Required on all hosts.

Use sestatus to check whether SELinux is currently enabled:

# If disabled, the output shows disabled
[root@node7 ~]# sestatus 
SELinux status:                 disabled

# If enabled, the output looks like this
[root@node7 ~]# sestatus 
SELinux status:                 enabled
SELinuxfs mount:                /sys/fs/selinux
SELinux root directory:         /etc/selinux
Loaded policy name:             targeted
Current mode:                   permissive
Mode from config file:          disabled
Policy MLS status:              enabled
Policy deny_unknown status:     allowed
Memory protection checking:     actual (secure)
Max kernel policy version:      33

Disable SELinux:

setenforce 0
sed -i 's/^SELINUX=.*/SELINUX=disabled/' /etc/selinux/config

Configure Time Synchronization

Required on all nodes, to keep their clocks consistent.

# Check the current time zone; it should be Asia/Shanghai
[root@node1 ~]# timedatectl
               Local time: Mon 2023-09-25 23:22:02 CST
           Universal time: Mon 2023-09-25 15:22:02 UTC
                 RTC time: Mon 2023-09-25 15:22:02
                Time zone: Asia/Shanghai (CST, +0800)
System clock synchronized: yes
              NTP service: active
          RTC in local TZ: no

# If the time zone is not Asia/Shanghai, set it on every node
[root@node1 ~]# timedatectl set-timezone Asia/Shanghai

# Enable and start chronyd as the time synchronization service
[root@node1 ~]# systemctl enable chronyd --now
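
To confirm synchronization is actually working, chrony can be queried directly:

# List the configured time sources and show the current sync status
chronyc sources
chronyc tracking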

Install the ipvs Tools and Load the Kernel Modules

These only need to be installed on the K8s master and worker nodes:

[root@node1 ~]# yum install ipvsadm ipset sysstat conntrack libseccomp -y

Configure the ipvs-related modules:

# Add a module configuration file; files under /etc/modules-load.d are loaded at boot
cat > /etc/modules-load.d/ipvs.conf <<EOF 
ip_vs
ip_vs_rr
ip_vs_wrr
ip_vs_sh
nf_conntrack
EOF

# Restart the service that loads kernel modules
[root@node1 ~]# systemctl restart systemd-modules-load.service

# Check that the modules are loaded
[root@node1 ~]# lsmod | grep -e ip_vs -e nf_conntrack
ip_vs_sh               16384  0
ip_vs_wrr              16384  0
ip_vs_rr               16384  0
ip_vs                 204800  6 ip_vs_rr,ip_vs_sh,ip_vs_wrr
nf_conntrack_netlink    57344  0
nf_conntrack          188416  5 xt_conntrack,nf_nat,nf_conntrack_netlink,xt_MASQUERADE,ip_vs
nf_defrag_ipv6         24576  2 nf_conntrack,ip_vs
nf_defrag_ipv4         16384  1 nf_conntrack
nfnetlink              20480  4 nft_compat,nf_conntrack_netlink,nf_tables
libcrc32c              16384  5 nf_conntrack,nf_nat,nf_tables,xfs,ip_vs

Enable Kernel IP Forwarding and Bridge Filtering

Required on all hosts.

Load br_netfilter so iptables can see bridged IPv4 and IPv6 traffic, ensuring containers in the cluster can communicate properly.

Add the bridge-filtering and forwarding configuration file:

# vm.swappiness = 0 tells the kernel to avoid swapping as much as possible

cat > /etc/sysctl.d/k8s.conf <<EOF
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1
net.ipv4.ip_forward = 1
vm.swappiness = 0 
EOF

Configure br_netfilter to load at boot:

[root@node1 ~]# cat  > /etc/modules-load.d/containerd.conf <<EOF
br_netfilter 
EOF

# Restart the service that loads kernel modules
[root@node1 ~]# systemctl restart systemd-modules-load.service
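
Note: writing a file under /etc/sysctl.d does not change the running kernel by itself. Once br_netfilter is loaded, apply the settings (they are also applied automatically at boot):

# Apply all sysctl configuration files immediately
sysctl --system

# Optionally verify the settings took effect
sysctl net.bridge.bridge-nf-call-iptables net.ipv4.ip_forward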

Disable Swap

Required on the K8s master and worker nodes.

sed -i 's/^\(.*swap.*\)$/#\1/g' /etc/fstab 
swapoff -a
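
Verify that no swap remains active; swapon prints nothing when swap is fully disabled:

swapon --show
free -h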

Deploying HAProxy and Keepalived

Install HAProxy and Keepalived on node1 and node2:

yum install haproxy keepalived -y

Prepare the HAProxy Configuration File

The configuration file is identical on both HAProxy nodes:

# Global settings; process-level configuration, usually tied to the operating system
global
  # Maximum number of concurrent connections each HAProxy process accepts
  maxconn 2000 
  # Number of file descriptors the process may open; computed automatically by default, so changing it is not recommended
  ulimit-n 16384
  # Global logging: send logs to the rsyslog service on 127.0.0.1, device local0, at level err (available levels: err, warning, info, debug)
  log 127.0.0.1 local0 err
  # Enable HAProxy statistics with a 30s timeout: stats operations that take longer are aborted so the stats facility cannot hog resources
  stats  timeout  30s

# Defaults, applied to the frontend, backend and listen sections below
defaults 
  # Use the logging settings from the global section
  log global
  # Default mode; mode { tcp|http|health }: tcp is layer 4, http is layer 7, health only returns OK
  mode  http
  # Log HTTP requests, session state and timers
  option httplog
  # Connection timeout
  timeout connect 5000
  # Client timeout
  timeout client 50000
  # Server timeout
  timeout server 50000
  # Abort an HTTP request if no complete request arrives within 15 seconds
  timeout http-request 15s
  # Timeout for HTTP keep-alive connections
  timeout http-keep-alive 15s

# Frontend definition; monitor-in is an arbitrary name.
# With this configuration, HAProxy statistics and status are available at http://ip:33305/monitor.
frontend monitor-in
  # Address and port this frontend listens on
  bind *:33305  
  # Default mode for this frontend
  mode http
  option httplog
  # Path of HAProxy's built-in monitoring page, so operators can check HAProxy's state from a browser
  monitor-uri /monitor

# Frontend that accepts K8s apiserver traffic on port 6443
frontend k8s-master
 bind 0.0.0.0:6443
 bind 127.0.0.1:6443
 mode tcp
 option tcplog
 tcp-request inspect-delay  5s
 default_backend  k8s-master

backend k8s-master
 mode tcp
 option tcplog
 option tcp-check
 # round-robin load balancing
 balance roundrobin
 # default-server sets the default health-check parameters for the servers below:
 # inter 10s: check healthy servers every 10 seconds
 # downinter 5s: check servers that are down every 5 seconds
 # rise 2: 2 consecutive successful checks mark a server healthy
 # fall 2: 2 consecutive failed checks mark a server unhealthy
 # slowstart 60s: ramp traffic up to a recovering server over 60 seconds
 # maxconn 250: at most 250 connections per server
 # maxqueue 256: request queue holds at most 256 requests
 # weight 100: default weight for traffic distribution
 default-server inter 10s downinter 5s rise 2 fall 2 slowstart 60s maxconn 250 maxqueue 256 weight 100
 server node3 192.168.202.131:6443 check
 server node4 192.168.202.132:6443 check
 server node5 192.168.202.134:6443 check
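
Before starting the service, the configuration can be validated; this assumes it was saved to the default path /etc/haproxy/haproxy.cfg:

# -c checks the configuration syntax without starting HAProxy
haproxy -c -f /etc/haproxy/haproxy.cfg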

Configure Keepalived

Keepalived runs on node1 and node2, but the two nodes use different configurations.

Configure node1 as the Keepalived MASTER node:

[root@node1 ~]# cat /etc/keepalived/keepalived.conf
# Global parameter definitions
global_defs {
  # Unique router ID, used to distinguish VRRP groups
  router_id LVS_DEVEL
  # Run the check scripts as the root user
  script_user root
  # Enable script security so only root may execute the scripts
  enable_script_security
}

# Health-check script definition
vrrp_script chk_apiserver {
  # Path of the health-check script
  script "/etc/keepalived/check_apiserver.sh"
  # Run the check every 5 seconds
  interval  5
  # Reduce this node's priority by 5 while the check is failing
  weight   -5
  # 2 consecutive failures mark the check as failed
  fall 2
  # 1 successful check marks it as recovered
  rise 1
}
# VRRP instance definition
vrrp_instance VI_1 {
  # This node is the MASTER
  state MASTER
  # Network interface that serves traffic
  interface ens160
  # Source IP used for VRRP multicast from this node
  mcast_src_ip 192.168.202.129
  # VRRP group ID 51
  virtual_router_id 51
  # Node priority 100
  priority  100
  # VRRP advertisement interval, 2 seconds
  advert_int 2
  # Authentication shared by members of the VRRP group
  authentication {
    auth_type PASS
    auth_pass K8SHA_KA_AUTH
  }
  
  # The floating IP (VIP); must be identical on both Keepalived nodes (192.168.202.140 per the table above)
  virtual_ipaddress {
    192.168.202.140
  }
  
  # Track the chk_apiserver health-check script defined above
  track_script {
    chk_apiserver
  }

}

Configure node2 as the Keepalived BACKUP node:

[root@node2 ~]# cat /etc/keepalived/keepalived.conf 
global_defs {
  router_id LVS_DEVEL
  script_user root
  enable_script_security
}

vrrp_script chk_apiserver {
  script "/etc/keepalived/check_apiserver.sh"
  interval  5
  weight   -5
  fall 2
  rise 1
}

vrrp_instance VI_1 {
  state BACKUP  # this node must be the BACKUP
  interface ens160
  mcast_src_ip 192.168.202.130
  virtual_router_id 51
  priority  99  # lower priority than the MASTER node
  advert_int 2
  authentication {
    auth_type PASS
    auth_pass K8SHA_KA_AUTH
  }
  
  virtual_ipaddress {
    192.168.202.140
  }
  
  track_script {
    chk_apiserver
  }

}

The health-check script is as follows (note the shebang, since the script uses bash syntax):

[root@node1 ~]# cat /etc/keepalived/check_apiserver.sh 
#!/bin/bash
err=0
for k in $(seq 1 3)
do
   check_code=$(pgrep haproxy)
   if [[ $check_code == "" ]];then
      err=$(expr $err + 1)
      sleep 1
      continue
   else
      err=0
      break
   fi
done


if [[ $err != "0" ]];then
   echo "systemctl stop keepalived"
   /usr/bin/systemctl stop keepalived
   exit 1
else
   exit 0
fi

Make the script executable:

[root@node1 ~]# chmod +x /etc/keepalived/check_apiserver.sh

Start HAProxy and Keepalived

# Start HAProxy first
[root@node1 ~]# systemctl enable haproxy --now
Created symlink /etc/systemd/system/multi-user.target.wants/haproxy.service → /usr/lib/systemd/system/haproxy.service.

# Then start Keepalived, because Keepalived health-checks the haproxy process
[root@node1 ~]# systemctl enable keepalived --now
Created symlink /etc/systemd/system/multi-user.target.wants/keepalived.service → /usr/lib/systemd/system/keepalived.service.
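
A quick sanity check, assuming the interface and VIP configured above: the VIP should be bound on the MASTER node while haproxy is healthy, and the monitor page should respond. Stopping haproxy on node1 should move the VIP to node2 within a few seconds.

# On node1: the VIP should appear on ens160
ip addr show ens160 | grep 192.168.202.140

# From any host: the HAProxy monitor page should answer
curl -I http://192.168.202.129:33305/monitor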

Creating the etcd Cluster

Deploy the etcd cluster on node3, node4 and node5.

First, generate the certificates the etcd cluster will use.

Generate the certificates on node3, then copy them to node4 and node5.

Create a working directory:

mkdir -p /data/k8s-work

Download the cfssl tools:

cd /data/k8s-work
wget https://pkg.cfssl.org/R1.2/cfssl_linux-amd64
wget https://pkg.cfssl.org/R1.2/cfssljson_linux-amd64
wget https://pkg.cfssl.org/R1.2/cfssl-certinfo_linux-amd64
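
Note: pkg.cfssl.org has been retired and these links may no longer resolve. If the downloads fail, the same binaries are published on the cfssl GitHub releases page; for example (the version below is an assumption, pick a current release):

wget https://github.com/cloudflare/cfssl/releases/download/v1.6.4/cfssl_1.6.4_linux_amd64 -O cfssl_linux-amd64
wget https://github.com/cloudflare/cfssl/releases/download/v1.6.4/cfssljson_1.6.4_linux_amd64 -O cfssljson_linux-amd64
wget https://github.com/cloudflare/cfssl/releases/download/v1.6.4/cfssl-certinfo_1.6.4_linux_amd64 -O cfssl-certinfo_linux-amd64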

Notes:
cfssl is a PKI/TLS toolkit written in Go and open-sourced by Cloudflare. The main programs are:

- cfssl, the CFSSL command-line tool
- cfssljson, which takes the JSON output from cfssl and writes the certificates, keys, CSRs and bundles to files
- cfssl-certinfo, which displays information about certificates

Make the tools executable and move them to /usr/local/bin:

# Add execute permission
[root@node3 k8s-work]# chmod +x cfssl*

# Move them to the target directory and rename
[root@node3 k8s-work]# mv cfssl_linux-amd64 /usr/local/bin/cfssl
[root@node3 k8s-work]# mv cfssljson_linux-amd64 /usr/local/bin/cfssljson
[root@node3 k8s-work]# mv cfssl-certinfo_linux-amd64 /usr/local/bin/cfssl-certinfo

Create the CA certificate signing request (CSR) file:

cat > ca-csr.json <<"EOF"
{
  "CN": "kubernetes",
  "key": {
      "algo": "rsa",
      "size": 2048
  },
  "names": [
    {
      "C": "CN",
      "ST": "Beijing",
      "L": "Beijing",
      "O": "k8s",
      "OU": "CN"
    }
  ],
  "ca": {
          "expiry": "87600h"
  }
}
EOF

Create the CA certificate:

[root@node3 k8s-work]# cfssl gencert -initca ca-csr.json | cfssljson -bare ca

Configure the CA signing policy:

cat > ca-config.json <<"EOF"
{
  "signing": {
      "default": {
          "expiry": "87600h"
        },
      "profiles": {
          "kubernetes": {
              "usages": [
                  "signing",
                  "key encipherment",
                  "server auth",
                  "client auth"
              ],
              "expiry": "87600h"
          }
      }
  }
}
EOF

# Notes:
# server auth means a client can use this CA to verify certificates presented by servers
# client auth means a server can use this CA to verify certificates presented by clients

Create the etcd CSR file:

cat > etcd-csr.json <<"EOF"
{
  "CN": "etcd",
  "hosts": [
    "127.0.0.1",
    "192.168.202.131",
    "192.168.202.132",
    "192.168.202.134"
  ],
  "key": {
    "algo": "rsa",
    "size": 2048
  },
  "names": [{
    "C": "CN",
    "ST": "Beijing",
    "L": "Beijing",
    "O": "k8s",
    "OU": "CN"
  }]
}
EOF

Generate the etcd certificate:

cfssl gencert -ca=ca.pem -ca-key=ca-key.pem -config=ca-config.json -profile=kubernetes etcd-csr.json | cfssljson  -bare etcd

# The working directory now contains the following files
[root@node3 k8s-work]# ls
ca-config.json  ca.csr  ca-csr.json  ca-key.pem  ca.pem  etcd.csr  etcd-csr.json  etcd-key.pem  etcd.pem
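
cfssl-certinfo can be used to confirm the SAN list and validity period of the new certificate:

# Print the certificate details, including hosts (SANs) and expiry
cfssl-certinfo -cert etcd.pem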

Download etcd:

wget https://github.com/etcd-io/etcd/releases/download/v3.5.9/etcd-v3.5.9-linux-amd64.tar.gz

Extract the archive and copy the binaries to /usr/local/bin:

# Extract the archive
[root@node3 ~]# tar -xf etcd-v3.5.9-linux-amd64.tar.gz 
[root@node3 ~]# ls etcd-v3.5.9-linux-amd64
Documentation  etcd  etcdctl  etcdutl  README-etcdctl.md  README-etcdutl.md  README.md  READMEv2-etcdctl.md

# Copy the etcd binaries to /usr/local/bin
# -p preserves permissions, ownership and timestamps
[root@node3 ~]# cp -p etcd-v3.5.9-linux-amd64/etcd* /usr/local/bin

Create the etcd Configuration Files

node3 etcd configuration:

# Create the directory for the etcd configuration
[root@node3 ~]# mkdir /etc/etcd/

#  Create the configuration file
[root@node3 ~]# cat >  /etc/etcd/etcd.conf <<"EOF"
#[Member]
ETCD_NAME="etcd1"
ETCD_DATA_DIR="/var/lib/etcd/default.etcd"
ETCD_LISTEN_PEER_URLS="https://192.168.202.131:2380"
ETCD_LISTEN_CLIENT_URLS="https://192.168.202.131:2379,http://127.0.0.1:2379"

#[Clustering]
ETCD_INITIAL_ADVERTISE_PEER_URLS="https://192.168.202.131:2380"
ETCD_ADVERTISE_CLIENT_URLS="https://192.168.202.131:2379"
ETCD_INITIAL_CLUSTER="etcd1=https://192.168.202.131:2380,etcd2=https://192.168.202.132:2380,etcd3=https://192.168.202.134:2380"
ETCD_INITIAL_CLUSTER_TOKEN="etcd-cluster"
ETCD_INITIAL_CLUSTER_STATE="new"
EOF

Notes:
ETCD_NAME: node name, unique within the cluster
ETCD_DATA_DIR: data directory
ETCD_LISTEN_PEER_URLS: listen address for cluster (peer) traffic
ETCD_LISTEN_CLIENT_URLS: listen address for client traffic
ETCD_INITIAL_ADVERTISE_PEER_URLS: peer address advertised to the cluster
ETCD_ADVERTISE_CLIENT_URLS: client address advertised to clients
ETCD_INITIAL_CLUSTER: addresses of all cluster members
ETCD_INITIAL_CLUSTER_TOKEN: cluster token
ETCD_INITIAL_CLUSTER_STATE: state when joining; new for a new cluster, existing to join an existing one

Create the directories for the certificates and the data:

mkdir -p /etc/etcd/ssl
mkdir -p /var/lib/etcd/default.etcd

Copy the generated certificate files to /etc/etcd/ssl:

[root@node3 ~]# cp /data/k8s-work/ca*.pem /etc/etcd/ssl
[root@node3 ~]# cp /data/k8s-work/etcd*.pem /etc/etcd/ssl/

Create the etcd systemd unit file:
cat > /etc/systemd/system/etcd.service <<"EOF"
[Unit]
Description=Etcd Server
After=network.target
After=network-online.target
Wants=network-online.target

[Service]
Type=notify
EnvironmentFile=-/etc/etcd/etcd.conf
WorkingDirectory=/var/lib/etcd/
ExecStart=/usr/local/bin/etcd \
  --cert-file=/etc/etcd/ssl/etcd.pem \
  --key-file=/etc/etcd/ssl/etcd-key.pem \
  --trusted-ca-file=/etc/etcd/ssl/ca.pem \
  --peer-cert-file=/etc/etcd/ssl/etcd.pem \
  --peer-key-file=/etc/etcd/ssl/etcd-key.pem \
  --peer-trusted-ca-file=/etc/etcd/ssl/ca.pem \
  --peer-client-cert-auth \
  --client-cert-auth
Restart=on-failure
RestartSec=5
LimitNOFILE=65536

[Install]
WantedBy=multi-user.target
EOF

node4 etcd configuration:

# Create the directory for the etcd configuration
[root@node4 ~]# mkdir /etc/etcd/

#  Create the configuration file
[root@node4 ~]# cat >  /etc/etcd/etcd.conf <<"EOF"
#[Member]
ETCD_NAME="etcd2"
ETCD_DATA_DIR="/var/lib/etcd/default.etcd"
ETCD_LISTEN_PEER_URLS="https://192.168.202.132:2380"
ETCD_LISTEN_CLIENT_URLS="https://192.168.202.132:2379,http://127.0.0.1:2379"

#[Clustering]
ETCD_INITIAL_ADVERTISE_PEER_URLS="https://192.168.202.132:2380"
ETCD_ADVERTISE_CLIENT_URLS="https://192.168.202.132:2379"
ETCD_INITIAL_CLUSTER="etcd1=https://192.168.202.131:2380,etcd2=https://192.168.202.132:2380,etcd3=https://192.168.202.134:2380"
ETCD_INITIAL_CLUSTER_TOKEN="etcd-cluster"
ETCD_INITIAL_CLUSTER_STATE="new"
EOF

Create the directories for the certificates and the data:

mkdir -p /etc/etcd/ssl
mkdir -p /var/lib/etcd/default.etcd

Copy the certificates generated on node3 to node4's /etc/etcd/ssl directory:

[root@node3 ssl]# pwd
/etc/etcd/ssl
[root@node3 ssl]# ls
ca-key.pem  ca.pem  etcd-key.pem  etcd.pem
[root@node3 ssl]# scp ./* node4:/etc/etcd/ssl
root@node4's password: 
ca-key.pem                                                                                                                 100% 1679     2.5MB/s   00:00    
ca.pem                                                                                                                     100% 1346     2.3MB/s   00:00    
etcd-key.pem                                                                                                               100% 1679     3.2MB/s   00:00    
etcd.pem 

Create the etcd systemd unit file:

cat > /etc/systemd/system/etcd.service <<"EOF"
[Unit]
Description=Etcd Server
After=network.target
After=network-online.target
Wants=network-online.target

[Service]
Type=notify
EnvironmentFile=-/etc/etcd/etcd.conf
WorkingDirectory=/var/lib/etcd/
ExecStart=/usr/local/bin/etcd \
  --cert-file=/etc/etcd/ssl/etcd.pem \
  --key-file=/etc/etcd/ssl/etcd-key.pem \
  --trusted-ca-file=/etc/etcd/ssl/ca.pem \
  --peer-cert-file=/etc/etcd/ssl/etcd.pem \
  --peer-key-file=/etc/etcd/ssl/etcd-key.pem \
  --peer-trusted-ca-file=/etc/etcd/ssl/ca.pem \
  --peer-client-cert-auth \
  --client-cert-auth
Restart=on-failure
RestartSec=5
LimitNOFILE=65536

[Install]
WantedBy=multi-user.target
EOF

node5 etcd configuration:

# Create the directory for the etcd configuration
[root@node5 ~]# mkdir /etc/etcd/

#  Create the configuration file
[root@node5 ~]# cat >  /etc/etcd/etcd.conf <<"EOF"
#[Member]
ETCD_NAME="etcd3"
ETCD_DATA_DIR="/var/lib/etcd/default.etcd"
ETCD_LISTEN_PEER_URLS="https://192.168.202.134:2380"
ETCD_LISTEN_CLIENT_URLS="https://192.168.202.134:2379,http://127.0.0.1:2379"

#[Clustering]
ETCD_INITIAL_ADVERTISE_PEER_URLS="https://192.168.202.134:2380"
ETCD_ADVERTISE_CLIENT_URLS="https://192.168.202.134:2379"
ETCD_INITIAL_CLUSTER="etcd1=https://192.168.202.131:2380,etcd2=https://192.168.202.132:2380,etcd3=https://192.168.202.134:2380"
ETCD_INITIAL_CLUSTER_TOKEN="etcd-cluster"
ETCD_INITIAL_CLUSTER_STATE="new"
EOF

Create the directories for the certificates and the data:

mkdir -p /etc/etcd/ssl
mkdir -p /var/lib/etcd/default.etcd

Copy the certificates generated on node3 to node5's /etc/etcd/ssl directory:

[root@node3 ssl]# pwd
/etc/etcd/ssl
[root@node3 ssl]# ls
ca-key.pem  ca.pem  etcd-key.pem  etcd.pem
[root@node3 ssl]# scp ./* node5:/etc/etcd/ssl
root@node5's password: 
ca-key.pem                                                                                                                 100% 1679     2.5MB/s   00:00    
ca.pem                                                                                                                     100% 1346     2.3MB/s   00:00    
etcd-key.pem                                                                                                               100% 1679     3.2MB/s   00:00    
etcd.pem 

Create the etcd systemd unit file:

cat > /etc/systemd/system/etcd.service <<"EOF"
[Unit]
Description=Etcd Server
After=network.target
After=network-online.target
Wants=network-online.target

[Service]
Type=notify
EnvironmentFile=-/etc/etcd/etcd.conf
WorkingDirectory=/var/lib/etcd/
ExecStart=/usr/local/bin/etcd \
  --cert-file=/etc/etcd/ssl/etcd.pem \
  --key-file=/etc/etcd/ssl/etcd-key.pem \
  --trusted-ca-file=/etc/etcd/ssl/ca.pem \
  --peer-cert-file=/etc/etcd/ssl/etcd.pem \
  --peer-key-file=/etc/etcd/ssl/etcd-key.pem \
  --peer-trusted-ca-file=/etc/etcd/ssl/ca.pem \
  --peer-client-cert-auth \
  --client-cert-auth
Restart=on-failure
RestartSec=5
LimitNOFILE=65536

[Install]
WantedBy=multi-user.target
EOF

Start the etcd Cluster

Run on node3, node4 and node5:

systemctl daemon-reload
systemctl enable --now etcd.service

Verify the etcd cluster health:

[root@node3 ssl]# ETCDCTL_API=3 /usr/local/bin/etcdctl --write-out=table --cacert=/etc/etcd/ssl/ca.pem --cert=/etc/etcd/ssl/etcd.pem --key=/etc/etcd/ssl/etcd-key.pem --endpoints=https://192.168.202.131:2379,https://192.168.202.132:2379,https://192.168.202.134:2379 endpoint health
+------------------------------+--------+------------+-------+
|           ENDPOINT           | HEALTH |    TOOK    | ERROR |
+------------------------------+--------+------------+-------+
| https://192.168.202.131:2379 |   true | 8.678307ms |       |
| https://192.168.202.132:2379 |   true | 9.428444ms |       |
| https://192.168.202.134:2379 |   true | 9.347641ms |       |
+------------------------------+--------+------------+-------+

Check database performance:

[root@node3 ssl]# ETCDCTL_API=3 /usr/local/bin/etcdctl --write-out=table --cacert=/etc/etcd/ssl/ca.pem --cert=/etc/etcd/ssl/etcd.pem --key=/etc/etcd/ssl/etcd-key.pem --endpoints=https://192.168.202.131:2379,https://192.168.202.132:2379,https://192.168.202.134:2379 check perf
 59 / 60 Booooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooom  !  98.33%
PASS: Throughput is 151 writes/s
PASS: Slowest request took 0.023224s
PASS: Stddev is 0.000461s
PASS

Check the member list:

[root@node3 ssl]# ETCDCTL_API=3 /usr/local/bin/etcdctl --write-out=table --cacert=/etc/etcd/ssl/ca.pem --cert=/etc/etcd/ssl/etcd.pem --key=/etc/etcd/ssl/etcd-key.pem --endpoints=https://192.168.202.131:2379,https://192.168.202.132:2379,https://192.168.202.134:2379 member list
+------------------+---------+-------+------------------------------+------------------------------+------------+
|        ID        | STATUS  | NAME  |          PEER ADDRS          |         CLIENT ADDRS         | IS LEARNER |
+------------------+---------+-------+------------------------------+------------------------------+------------+
|  80348da092b8e96 | started | etcd1 | https://192.168.202.131:2380 | https://192.168.202.131:2379 |      false |
| 70222b4638121a8b | started | etcd2 | https://192.168.202.132:2380 | https://192.168.202.132:2379 |      false |
| a9089efe301a5c67 | started | etcd3 | https://192.168.202.134:2380 | https://192.168.202.134:2379 |      false |
+------------------+---------+-------+------------------------------+------------------------------+------------+

Check the etcd endpoint status:

[root@node3 ssl]# ETCDCTL_API=3 /usr/local/bin/etcdctl --write-out=table --cacert=/etc/etcd/ssl/ca.pem --cert=/etc/etcd/ssl/etcd.pem --key=/etc/etcd/ssl/etcd-key.pem --endpoints=https://192.168.202.131:2379,https://192.168.202.132:2379,https://192.168.202.134:2379 endpoint status
+------------------------------+------------------+---------+---------+-----------+------------+-----------+------------+--------------------+--------+
|           ENDPOINT           |        ID        | VERSION | DB SIZE | IS LEADER | IS LEARNER | RAFT TERM | RAFT INDEX | RAFT APPLIED INDEX | ERRORS |
+------------------------------+------------------+---------+---------+-----------+------------+-----------+------------+--------------------+--------+
| https://192.168.202.131:2379 |  80348da092b8e96 |   3.5.9 |   22 MB |      true |      false |         2 |       8990 |               8990 |        |
| https://192.168.202.132:2379 | 70222b4638121a8b |   3.5.9 |   22 MB |     false |      false |         2 |       8990 |               8990 |        |
| https://192.168.202.134:2379 | a9089efe301a5c67 |   3.5.9 |   22 MB |     false |      false |         2 |       8990 |               8990 |        |
+------------------------------+------------------+---------+---------+-----------+------------+-----------+------------+--------------------+--------+

Deploying the K8s Master Components

The master nodes (node3, node4, node5) need the following components:

  • kube-apiserver
  • kubectl
  • kube-controller-manager
  • kube-scheduler

Download the Binaries

# Downloaded on node3; the binaries are copied to the other nodes later
[root@node3 ~]#  wget https://dl.k8s.io/v1.28.2/kubernetes-server-linux-amd64.tar.gz

Extract the downloaded package:

# Extract the archive
[root@node3 ~]# tar -xf kubernetes-server-linux-amd64.tar.gz 
[root@node3 ~]# cd kubernetes
[root@node3 kubernetes]# ls
addons  kubernetes-src.tar.gz  LICENSES  server
[root@node3 kubernetes]# cd server/

# The bin directory contains the relevant binaries
[root@node3 server]# ls bin/
apiextensions-apiserver  kube-apiserver.docker_tag           kube-controller-manager.tar  kubectl.tar      kube-proxy.docker_tag      kube-scheduler.tar
kubeadm                  kube-apiserver.tar                  kubectl                      kubelet          kube-proxy.tar             mounter
kube-aggregator          kube-controller-manager             kubectl-convert              kube-log-runner  kube-scheduler
kube-apiserver           kube-controller-manager.docker_tag  kubectl.docker_tag           kube-proxy       kube-scheduler.docker_tag

Copy the required files into place on node3:

[root@node3 bin]# cp kube-apiserver kube-controller-manager kube-scheduler kubectl /usr/local/bin/

Copy the required binaries from node3 to node4:

[root@node3 bin]# scp kube-apiserver kube-controller-manager kube-scheduler kubectl node4:/usr/local/bin/
root@node4's password: 
kube-apiserver                                                                                                             100%  116MB 197.7MB/s   00:00    
kube-controller-manager                                                                                                    100%  112MB 195.2MB/s   00:00    
kube-scheduler                                                                                                             100%   53MB 198.9MB/s   00:00    
kubectl                                                                                                                    100%   48MB 202.5MB/s   00:00    
[root@node3 bin]# 

Copy the required binaries from node3 to node5:

[root@node3 bin]# scp kube-apiserver kube-controller-manager kube-scheduler kubectl node5:/usr/local/bin/
root@node5's password: 
kube-apiserver                                                                                                             100%  116MB 171.9MB/s   00:00    
kube-controller-manager                                                                                                    100%  112MB 180.5MB/s   00:00    
kube-scheduler                                                                                                             100%   53MB 200.3MB/s   00:00    
kubectl                                                                                                                    100%   48MB 221.1MB/s   00:00    
[root@node3 bin]# 

Deploy the kube-apiserver Component

Create the directories; run on all K8s master nodes:

[root@node3 bin]# mkdir -p /etc/kubernetes
[root@node3 bin]# mkdir -p /etc/kubernetes/ssl
[root@node3 bin]# mkdir -p /var/log/kubernetes

Create the apiserver CSR file:

cat > kube-apiserver-csr.json << "EOF"
{
"CN": "kubernetes",
  "hosts": [
    "127.0.0.1",
    "192.168.202.100",
    "192.168.202.129",
    "192.168.202.130",
    "192.168.202.131",
    "192.168.202.132",
    "192.168.202.134",
    "192.168.202.136",
    "192.168.202.137",
    "192.168.202.138",
    "192.168.202.139",
    "192.168.202.140",
    "10.96.0.1",
    "kubernetes",
    "kubernetes.default",
    "kubernetes.default.svc",
    "kubernetes.default.svc.cluster",
    "kubernetes.default.svc.cluster.local"
  ],
  "key": {
    "algo": "rsa",
    "size": 2048
  },
  "names": [
    {
      "C": "CN",
      "ST": "Beijing",
      "L": "Beijing",
      "O": "k8s",
      "OU": "CN"
    }
  ]
}
EOF
Notes:
If the hosts field is non-empty, it must list every IP (including the VIP) and domain name authorized to use this certificate. Since the certificate is used cluster-wide, include the IPs of all nodes; adding a few spare IPs now makes future scale-out easier.
Also include the first IP of the service network (the first address of the service-cluster-ip-range passed to kube-apiserver, here 10.96.0.1).

Generate the apiserver certificate:

[root@node3 k8s-work]# cfssl gencert -ca=ca.pem -ca-key=ca-key.pem -config=ca-config.json -profile=kubernetes kube-apiserver-csr.json | cfssljson -bare kube-apiserver

Copy the generated certificates to the target directory:

[root@node3 k8s-work]# cp kube-apiserver.pem kube-apiserver-key.pem ca.pem ca-key.pem /etc/kubernetes/ssl/

Create the Token for TLS Bootstrapping

TLS bootstrapping: once the apiserver enables TLS authentication, the kubelet and kube-proxy on the worker nodes must present valid CA-signed certificates to talk to kube-apiserver. With many nodes, issuing these client certificates by hand is a lot of work and complicates scaling the cluster. To simplify this, Kubernetes provides TLS bootstrapping to issue client certificates automatically: kubelet requests a certificate from the apiserver as a low-privilege user, and the apiserver signs it dynamically. This approach is strongly recommended on the nodes; it currently applies mainly to kubelet, while kube-proxy still uses a certificate we issue ourselves.

cat > /etc/kubernetes/token.csv << EOF
$(head -c 16 /dev/urandom | od -An -t x | tr -d ' '),kubelet-bootstrap,10001,"system:kubelet-bootstrap"
EOF
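
The resulting file contains a single line in the format token,user,uid,"group"; for example (the token value here is illustrative only):

[root@node3 ~]# cat /etc/kubernetes/token.csv
d5c1af41b6774df6370519a24ba7ea58,kubelet-bootstrap,10001,"system:kubelet-bootstrap"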

Create the apiserver service configuration file on node3:

cat > /etc/kubernetes/kube-apiserver.conf << "EOF"
KUBE_APISERVER_OPTS="--enable-admission-plugins=NamespaceLifecycle,NodeRestriction,LimitRanger,ServiceAccount,DefaultStorageClass,ResourceQuota \
  --anonymous-auth=false \
  --bind-address=192.168.202.131 \
  --secure-port=6443 \
  --advertise-address=192.168.202.131 \
  --authorization-mode=Node,RBAC \
  --runtime-config=api/all=true \
  --enable-bootstrap-token-auth \
  --service-cluster-ip-range=10.96.0.0/16 \
  --token-auth-file=/etc/kubernetes/token.csv \
  --service-node-port-range=30000-32767 \
  --tls-cert-file=/etc/kubernetes/ssl/kube-apiserver.pem  \
  --tls-private-key-file=/etc/kubernetes/ssl/kube-apiserver-key.pem \
  --client-ca-file=/etc/kubernetes/ssl/ca.pem \
  --kubelet-client-certificate=/etc/kubernetes/ssl/kube-apiserver.pem \
  --kubelet-client-key=/etc/kubernetes/ssl/kube-apiserver-key.pem \
  --service-account-key-file=/etc/kubernetes/ssl/ca-key.pem \
  --service-account-signing-key-file=/etc/kubernetes/ssl/ca-key.pem  \
  --service-account-issuer=api \
  --etcd-cafile=/etc/etcd/ssl/ca.pem \
  --etcd-certfile=/etc/etcd/ssl/etcd.pem \
  --etcd-keyfile=/etc/etcd/ssl/etcd-key.pem \
  --etcd-servers=https://192.168.202.131:2379,https://192.168.202.132:2379,https://192.168.202.134:2379 \
  --allow-privileged=true \
  --apiserver-count=3 \
  --audit-log-maxage=30 \
  --audit-log-maxbackup=3 \
  --audit-log-maxsize=100 \
  --audit-log-path=/var/log/kube-apiserver-audit.log \
  --event-ttl=1h \
  --v=4"
EOF

Create the apiserver systemd unit file:

cat > /etc/systemd/system/kube-apiserver.service << "EOF"
[Unit]
Description=Kubernetes API Server
Documentation=https://github.com/kubernetes/kubernetes
After=etcd.service
Wants=etcd.service

[Service]
EnvironmentFile=-/etc/kubernetes/kube-apiserver.conf
ExecStart=/usr/local/bin/kube-apiserver $KUBE_APISERVER_OPTS
Restart=on-failure
RestartSec=5
Type=notify
LimitNOFILE=65536

[Install]
WantedBy=multi-user.target
EOF

Configure node4

Copy the certificate files from node3 to node4:

[root@node3 k8s-work]# scp ca*.pem root@node4:/etc/kubernetes/ssl
root@node4's password: 
ca-key.pem                                                                                                                 100% 1679     3.1MB/s   00:00    
ca.pem                                                                                                                     100% 1346     2.1MB/s   00:00    
[root@node3 k8s-work]# scp kube-apiserver*.pem root@node4:/etc/kubernetes/ssl/
root@node4's password: 
kube-apiserver-key.pem                                                                                                     100% 1679     3.5MB/s   00:00    
kube-apiserver.pem 

Copy token.csv to node4 as well:

scp /etc/kubernetes/token.csv node4:/etc/kubernetes/

Create the apiserver service configuration file on node4:

cat > /etc/kubernetes/kube-apiserver.conf << "EOF"
KUBE_APISERVER_OPTS="--enable-admission-plugins=NamespaceLifecycle,NodeRestriction,LimitRanger,ServiceAccount,DefaultStorageClass,ResourceQuota \
  --anonymous-auth=false \
  --bind-address=192.168.202.132 \
  --secure-port=6443 \
  --advertise-address=192.168.202.132 \
  --authorization-mode=Node,RBAC \
  --runtime-config=api/all=true \
  --enable-bootstrap-token-auth \
  --service-cluster-ip-range=10.96.0.0/16 \
  --token-auth-file=/etc/kubernetes/token.csv \
  --service-node-port-range=30000-32767 \
  --tls-cert-file=/etc/kubernetes/ssl/kube-apiserver.pem  \
  --tls-private-key-file=/etc/kubernetes/ssl/kube-apiserver-key.pem \
  --client-ca-file=/etc/kubernetes/ssl/ca.pem \
  --kubelet-client-certificate=/etc/kubernetes/ssl/kube-apiserver.pem \
  --kubelet-client-key=/etc/kubernetes/ssl/kube-apiserver-key.pem \
  --service-account-key-file=/etc/kubernetes/ssl/ca-key.pem \
  --service-account-signing-key-file=/etc/kubernetes/ssl/ca-key.pem  \
  --service-account-issuer=api \
  --etcd-cafile=/etc/etcd/ssl/ca.pem \
  --etcd-certfile=/etc/etcd/ssl/etcd.pem \
  --etcd-keyfile=/etc/etcd/ssl/etcd-key.pem \
  --etcd-servers=https://192.168.202.131:2379,https://192.168.202.132:2379,https://192.168.202.134:2379 \
  --allow-privileged=true \
  --apiserver-count=3 \
  --audit-log-maxage=30 \
  --audit-log-maxbackup=3 \
  --audit-log-maxsize=100 \
  --audit-log-path=/var/log/kube-apiserver-audit.log \
  --event-ttl=1h \
  --v=4"
EOF

Create the apiserver systemd unit file on node4:

cat > /etc/systemd/system/kube-apiserver.service << "EOF"
[Unit]
Description=Kubernetes API Server
Documentation=https://github.com/kubernetes/kubernetes
After=etcd.service
Wants=etcd.service

[Service]
EnvironmentFile=-/etc/kubernetes/kube-apiserver.conf
ExecStart=/usr/local/bin/kube-apiserver $KUBE_APISERVER_OPTS
Restart=on-failure
RestartSec=5
Type=notify
LimitNOFILE=65536

[Install]
WantedBy=multi-user.target
EOF

Configure node5

Copy the certificate files from node3 to node5:

[root@node3 k8s-work]# scp ca*.pem root@node5:/etc/kubernetes/ssl
root@node5's password: 
ca-key.pem                                                                                                                 100% 1679     3.4MB/s   00:00    
ca.pem                                                                                                                     100% 1346     4.5MB/s   00:00    
[root@node3 k8s-work]# scp kube-apiserver*.pem root@node5:/etc/kubernetes/ssl/
root@node5's password: 
kube-apiserver-key.pem                                                                                                     100% 1679     3.6MB/s   00:00    
kube-apiserver.pem 

Copy token.csv to node5 as well:

scp /etc/kubernetes/token.csv node5:/etc/kubernetes/

Create the apiserver service configuration file on node5:

cat > /etc/kubernetes/kube-apiserver.conf << "EOF"
KUBE_APISERVER_OPTS="--enable-admission-plugins=NamespaceLifecycle,NodeRestriction,LimitRanger,ServiceAccount,DefaultStorageClass,ResourceQuota \
  --anonymous-auth=false \
  --bind-address=192.168.202.134 \
  --secure-port=6443 \
  --advertise-address=192.168.202.134 \
  --authorization-mode=Node,RBAC \
  --runtime-config=api/all=true \
  --enable-bootstrap-token-auth \
  --service-cluster-ip-range=10.96.0.0/16 \
  --token-auth-file=/etc/kubernetes/token.csv \
  --service-node-port-range=30000-32767 \
  --tls-cert-file=/etc/kubernetes/ssl/kube-apiserver.pem  \
  --tls-private-key-file=/etc/kubernetes/ssl/kube-apiserver-key.pem \
  --client-ca-file=/etc/kubernetes/ssl/ca.pem \
  --kubelet-client-certificate=/etc/kubernetes/ssl/kube-apiserver.pem \
  --kubelet-client-key=/etc/kubernetes/ssl/kube-apiserver-key.pem \
  --service-account-key-file=/etc/kubernetes/ssl/ca-key.pem \
  --service-account-signing-key-file=/etc/kubernetes/ssl/ca-key.pem  \
  --service-account-issuer=api \
  --etcd-cafile=/etc/etcd/ssl/ca.pem \
  --etcd-certfile=/etc/etcd/ssl/etcd.pem \
  --etcd-keyfile=/etc/etcd/ssl/etcd-key.pem \
  --etcd-servers=https://192.168.202.131:2379,https://192.168.202.132:2379,https://192.168.202.134:2379 \
  --allow-privileged=true \
  --apiserver-count=3 \
  --audit-log-maxage=30 \
  --audit-log-maxbackup=3 \
  --audit-log-maxsize=100 \
  --audit-log-path=/var/log/kube-apiserver-audit.log \
  --event-ttl=1h \
  --v=4"
EOF

Create the apiserver systemd unit file on node5:

cat > /etc/systemd/system/kube-apiserver.service << "EOF"
[Unit]
Description=Kubernetes API Server
Documentation=https://github.com/kubernetes/kubernetes
After=etcd.service
Wants=etcd.service

[Service]
EnvironmentFile=-/etc/kubernetes/kube-apiserver.conf
ExecStart=/usr/local/bin/kube-apiserver $KUBE_APISERVER_OPTS
Restart=on-failure
RestartSec=5
Type=notify
LimitNOFILE=65536

[Install]
WantedBy=multi-user.target
EOF

Start the Service

Start the apiserver on node3, node4 and node5:

systemctl daemon-reload
systemctl enable kube-apiserver --now
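
A quick way to check that each apiserver instance is healthy; since anonymous access is disabled, authenticate with the bootstrap token (a sketch, using the token.csv created above):

# Confirm the apiserver is listening on 6443
ss -tlnp | grep 6443

# Authenticated health check against the local apiserver
TOKEN=$(awk -F, '{print $1}' /etc/kubernetes/token.csv)
curl --cacert /etc/kubernetes/ssl/ca.pem -H "Authorization: Bearer $TOKEN" https://192.168.202.131:6443/healthz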

Deploying kubectl

Create the kubectl (admin) CSR file:

cat > admin-csr.json << "EOF"
{
  "CN": "admin",
  "hosts": [],
  "key": {
    "algo": "rsa",
    "size": 2048
  },
  "names": [
    {
      "C": "CN",
      "ST": "Beijing",
      "L": "Beijing",
      "O": "system:masters",             
      "OU": "system"
    }
  ]
}
EOF
Notes:

kube-apiserver later uses RBAC to authorize requests from clients such as kubelet, kube-proxy and Pods.
kube-apiserver predefines some RBAC RoleBindings; for example, cluster-admin binds the group system:masters to the role cluster-admin, which grants permission to call every kube-apiserver API.
O sets the certificate's group to system:masters. When this certificate is used to access kube-apiserver, authentication succeeds because the certificate is CA-signed, and since the group system:masters is pre-authorized, the caller is granted access to all APIs.
Note:
This admin certificate is later used to generate the administrator's kubeconfig. RBAC is the recommended way to control roles and permissions in Kubernetes; Kubernetes treats the certificate's CN field as the User and the O field as the Group.
"O": "system:masters" must be exactly system:masters, otherwise the later kubectl create clusterrolebinding will fail.

Generate the certificate:

cfssl gencert -ca=ca.pem -ca-key=ca-key.pem -config=ca-config.json -profile=kubernetes admin-csr.json | cfssljson -bare admin

Copy the certificate files to the target directory:

cp admin*.pem /etc/kubernetes/ssl/

Generate the kubeconfig File

kube.config is kubectl's configuration file; it contains everything needed to reach the apiserver: the apiserver address, the CA certificate and kubectl's own client certificate.

kubectl config set-cluster kubernetes --certificate-authority=ca.pem --embed-certs=true --server=https://192.168.202.140:6443 --kubeconfig=kube.config

kubectl config set-credentials admin --client-certificate=admin.pem --client-key=admin-key.pem --embed-certs=true --kubeconfig=kube.config

kubectl config set-context kubernetes --cluster=kubernetes --user=admin --kubeconfig=kube.config

kubectl config use-context kubernetes --kubeconfig=kube.config

Install the kubectl configuration and create the role binding:

mkdir ~/.kube
cp kube.config ~/.kube/config
kubectl create clusterrolebinding kube-apiserver:kubelet-apis --clusterrole=system:kubelet-api-admin --user kubernetes --kubeconfig=/root/.kube/config

Check the Status

Check the cluster status:

[root@node3 k8s-work]# kubectl cluster-info
Kubernetes control plane is running at https://192.168.202.140:6443

To further debug and diagnose cluster problems, use 'kubectl cluster-info dump'.

Check the component status:

[root@node3 k8s-work]# kubectl get componentstatuses
Warning: v1 ComponentStatus is deprecated in v1.19+
NAME                 STATUS      MESSAGE                                                                                        ERROR
scheduler            Unhealthy   Get "https://127.0.0.1:10259/healthz": dial tcp 127.0.0.1:10259: connect: connection refused   
controller-manager   Unhealthy   Get "https://127.0.0.1:10257/healthz": dial tcp 127.0.0.1:10257: connect: connection refused   
etcd-0               Healthy     ok

List the resource objects across namespaces:

[root@node3 k8s-work]# kubectl get all --all-namespaces
NAMESPACE   NAME                 TYPE        CLUSTER-IP   EXTERNAL-IP   PORT(S)   AGE
default     service/kubernetes   ClusterIP   10.96.0.1    <none>        443/TCP   37m

Sync the kubectl Configuration to the Other Master Nodes

# node4
[root@node4 ~]# mkdir /root/.kube

# node5
[root@node5 ~]# mkdir /root/.kube

[root@node3 k8s-work]# scp /root/.kube/config node4:/root/.kube/config 
root@node4's password: 
config                                                                                                                     100% 6239    10.5MB/s   00:00    
[root@node3 k8s-work]# scp /root/.kube/config node5:/root/.kube/config 
root@node5's password: 
config

Deploying kube-controller-manager

Create the kube-controller-manager CSR file:

cat > kube-controller-manager-csr.json << "EOF"
{
    "CN": "system:kube-controller-manager",
    "key": {
        "algo": "rsa",
        "size": 2048
    },
    "hosts": [
      "127.0.0.1",
      "192.168.202.131",
      "192.168.202.132",
      "192.168.202.134"
    ],
    "names": [
      {
        "C": "CN",
        "ST": "Beijing",
        "L": "Beijing",
        "O": "system:kube-controller-manager",
        "OU": "system"
      }
    ]
}
EOF
Notes:

The hosts list contains the IPs of all kube-controller-manager nodes.
CN is system:kube-controller-manager.
O is system:kube-controller-manager; the built-in ClusterRoleBinding system:kube-controller-manager grants the component the permissions it needs.

Generate the kube-controller-manager certificate:

cfssl gencert -ca=ca.pem -ca-key=ca-key.pem -config=ca-config.json -profile=kubernetes kube-controller-manager-csr.json | cfssljson -bare kube-controller-manager

Create the kube-controller-manager.kubeconfig:

kubectl config set-cluster kubernetes --certificate-authority=ca.pem --embed-certs=true --server=https://192.168.202.140:6443 --kubeconfig=kube-controller-manager.kubeconfig

kubectl config set-credentials system:kube-controller-manager --client-certificate=kube-controller-manager.pem --client-key=kube-controller-manager-key.pem --embed-certs=true --kubeconfig=kube-controller-manager.kubeconfig

kubectl config set-context system:kube-controller-manager --cluster=kubernetes --user=system:kube-controller-manager --kubeconfig=kube-controller-manager.kubeconfig

kubectl config use-context system:kube-controller-manager --kubeconfig=kube-controller-manager.kubeconfig

Create the kube-controller-manager configuration file (the first option must stay on the same line as the opening quote; systemd's EnvironmentFile only joins lines that end with a backslash):

cat > kube-controller-manager.conf << "EOF"
KUBE_CONTROLLER_MANAGER_OPTS="--secure-port=10257 \
  --bind-address=127.0.0.1 \
  --bind-address=127.0.0.1 \
  --kubeconfig=/etc/kubernetes/kube-controller-manager.kubeconfig \
  --service-cluster-ip-range=10.96.0.0/16 \
  --cluster-name=kubernetes \
  --cluster-signing-cert-file=/etc/kubernetes/ssl/ca.pem \
  --cluster-signing-key-file=/etc/kubernetes/ssl/ca-key.pem \
  --allocate-node-cidrs=true \
  --cluster-cidr=10.244.0.0/16 \
  --root-ca-file=/etc/kubernetes/ssl/ca.pem \
  --service-account-private-key-file=/etc/kubernetes/ssl/ca-key.pem \
  --leader-elect=true \
  --feature-gates=RotateKubeletServerCertificate=true \
  --controllers=*,bootstrapsigner,tokencleaner \
  --horizontal-pod-autoscaler-sync-period=10s \
  --tls-cert-file=/etc/kubernetes/ssl/kube-controller-manager.pem \
  --tls-private-key-file=/etc/kubernetes/ssl/kube-controller-manager-key.pem \
  --use-service-account-credentials=true \
  --v=2"
EOF

Create the service unit file:

cat > kube-controller-manager.service << "EOF"
[Unit]
Description=Kubernetes Controller Manager
Documentation=https://github.com/kubernetes/kubernetes

[Service]
EnvironmentFile=-/etc/kubernetes/kube-controller-manager.conf
ExecStart=/usr/local/bin/kube-controller-manager $KUBE_CONTROLLER_MANAGER_OPTS
Restart=on-failure
RestartSec=5

[Install]
WantedBy=multi-user.target
EOF

Copy the files below to the target directories on the master nodes:

cp kube-controller-manager*.pem /etc/kubernetes/ssl/
cp kube-controller-manager.kubeconfig /etc/kubernetes/
cp kube-controller-manager.conf /etc/kubernetes/
cp kube-controller-manager.service /usr/lib/systemd/system/
scp  kube-controller-manager*.pem node4:/etc/kubernetes/ssl/
scp  kube-controller-manager*.pem node5:/etc/kubernetes/ssl/

scp  kube-controller-manager.kubeconfig kube-controller-manager.conf node4:/etc/kubernetes/
scp  kube-controller-manager.kubeconfig kube-controller-manager.conf node5:/etc/kubernetes/

scp  kube-controller-manager.service node4:/usr/lib/systemd/system/
scp  kube-controller-manager.service node5:/usr/lib/systemd/system/

Start the Service

systemctl daemon-reload 
systemctl enable --now kube-controller-manager

Check the component status:

[root@node5 ~]# kubectl get componentstatuses
Warning: v1 ComponentStatus is deprecated in v1.19+
NAME                 STATUS      MESSAGE                                                                                        ERROR
scheduler            Unhealthy   Get "https://127.0.0.1:10259/healthz": dial tcp 127.0.0.1:10259: connect: connection refused   
controller-manager   Healthy     ok                                                                                             
etcd-0               Healthy     ok   

Deploying kube-scheduler

Create the kube-scheduler CSR file:

cat > kube-scheduler-csr.json << "EOF"
{
    "CN": "system:kube-scheduler",
    "hosts": [
      "127.0.0.1",
      "192.168.202.131",
      "192.168.202.132",
      "192.168.202.134"
    ],
    "key": {
        "algo": "rsa",
        "size": 2048
    },
    "names": [
      {
        "C": "CN",
        "ST": "Beijing",
        "L": "Beijing",
        "O": "system:kube-scheduler",
        "OU": "system"
      }
    ]
}
EOF

Generate the kube-scheduler certificate:

cfssl gencert -ca=ca.pem -ca-key=ca-key.pem -config=ca-config.json -profile=kubernetes kube-scheduler-csr.json | cfssljson -bare kube-scheduler

Create the kube-scheduler kubeconfig:

kubectl config set-cluster kubernetes --certificate-authority=ca.pem --embed-certs=true --server=https://192.168.202.140:6443 --kubeconfig=kube-scheduler.kubeconfig

kubectl config set-credentials system:kube-scheduler --client-certificate=kube-scheduler.pem --client-key=kube-scheduler-key.pem --embed-certs=true --kubeconfig=kube-scheduler.kubeconfig

kubectl config set-context system:kube-scheduler --cluster=kubernetes --user=system:kube-scheduler --kubeconfig=kube-scheduler.kubeconfig

kubectl config use-context system:kube-scheduler --kubeconfig=kube-scheduler.kubeconfig

Create the service configuration file (as with the controller manager, the first option must share the line with the opening quote):

cat > kube-scheduler.conf << "EOF"
KUBE_SCHEDULER_OPTS="--bind-address=127.0.0.1 \
--kubeconfig=/etc/kubernetes/kube-scheduler.kubeconfig \
--leader-elect=true \
--v=2"
EOF

Create the service unit file:

cat > kube-scheduler.service << "EOF"
[Unit]
Description=Kubernetes Scheduler
Documentation=https://github.com/kubernetes/kubernetes

[Service]
EnvironmentFile=-/etc/kubernetes/kube-scheduler.conf
ExecStart=/usr/local/bin/kube-scheduler $KUBE_SCHEDULER_OPTS
Restart=on-failure
RestartSec=5

[Install]
WantedBy=multi-user.target
EOF

Sync the files to the master nodes:

cp kube-scheduler*.pem /etc/kubernetes/ssl/
cp kube-scheduler.kubeconfig /etc/kubernetes/
cp kube-scheduler.conf /etc/kubernetes/
cp kube-scheduler.service /usr/lib/systemd/system/
scp  kube-scheduler*.pem node4:/etc/kubernetes/ssl/
scp  kube-scheduler*.pem node5:/etc/kubernetes/ssl/

scp  kube-scheduler.kubeconfig kube-scheduler.conf node4:/etc/kubernetes/
scp  kube-scheduler.kubeconfig kube-scheduler.conf node5:/etc/kubernetes/

scp  kube-scheduler.service node4:/usr/lib/systemd/system/
scp  kube-scheduler.service node5:/usr/lib/systemd/system/

Start the Service

systemctl daemon-reload
systemctl enable --now kube-scheduler
systemctl status kube-scheduler

Check the components:

[root@node5 ~]# kubectl get componentstatuses
Warning: v1 ComponentStatus is deprecated in v1.19+
NAME                 STATUS    MESSAGE   ERROR
controller-manager   Healthy   ok        
scheduler            Healthy   ok        
etcd-0               Healthy   ok 

Deploying the K8s Worker Components

The worker nodes (node6, node7) need the following components:

  • kubelet
  • kube-proxy
  • the container runtime, containerd

Copy the required binaries from node3 to node6 and node7:

[root@node3 bin]# scp kubelet kube-proxy node6:/usr/local/bin/
[root@node3 bin]# scp kubelet kube-proxy node7:/usr/local/bin/

Install containerd

wget https://github.com/containerd/containerd/releases/download/v1.6.24/cri-containerd-cni-1.6.24-linux-amd64.tar.gz
tar -xf cri-containerd-cni-1.6.24-linux-amd64.tar.gz -C /

Extracting the archive produces the following top-level directories:
etc
opt
usr
These are unpacked into the corresponding directories under /, which saves the manual copy step.

Create the configuration file (create /etc/containerd first if the directory does not exist):

mkdir -p /etc/containerd
containerd config default >/etc/containerd/config.toml

sed -i 's#SystemdCgroup = false#SystemdCgroup = true#g' /etc/containerd/config.toml
sed -i 's@registry.k8s.io/pause:3.6@registry.aliyuncs.com/google_containers/pause:3.9@' /etc/containerd/config.toml

Run containerd:

systemctl enable containerd --now
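
The cri-containerd tarball also ships crictl, which can be used to confirm that the CRI endpoint answers (the socket path matches the kubelet flag used below):

# Query the runtime version over the CRI socket
crictl --runtime-endpoint unix:///run/containerd/containerd.sock version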

Install runc

wget https://github.com/opencontainers/runc/releases/download/v1.1.9/runc.amd64

[root@node6 ~]# chmod +x runc.amd64
[root@node6 ~]# mv runc.amd64 /usr/local/sbin/runc

Deploying kubelet

Generate the configuration files on node3, then copy them to the worker nodes.

Create kubelet-bootstrap.kubeconfig:

BOOTSTRAP_TOKEN=$(awk -F "," '{print $1}' /etc/kubernetes/token.csv)

kubectl config set-cluster kubernetes --certificate-authority=ca.pem --embed-certs=true --server=https://192.168.202.140:6443 --kubeconfig=kubelet-bootstrap.kubeconfig

kubectl config set-credentials kubelet-bootstrap --token=${BOOTSTRAP_TOKEN} --kubeconfig=kubelet-bootstrap.kubeconfig

kubectl config set-context default --cluster=kubernetes --user=kubelet-bootstrap --kubeconfig=kubelet-bootstrap.kubeconfig

kubectl config use-context default --kubeconfig=kubelet-bootstrap.kubeconfig
kubectl create clusterrolebinding cluster-system-anonymous --clusterrole=cluster-admin --user=kubelet-bootstrap

kubectl create clusterrolebinding kubelet-bootstrap --clusterrole=system:node-bootstrapper --user=kubelet-bootstrap --kubeconfig=kubelet-bootstrap.kubeconfig
kubectl describe clusterrolebinding cluster-system-anonymous

kubectl describe clusterrolebinding kubelet-bootstrap

Create the directories on the two worker nodes, node6 and node7:

mkdir -p /etc/kubernetes/ssl
mkdir -p /var/lib/kubelet
mkdir -p /var/log/kubernetes

Copy the files to the worker nodes:

scp kubelet-bootstrap.kubeconfig node6:/etc/kubernetes/
scp kubelet-bootstrap.kubeconfig node7:/etc/kubernetes/

scp ca.pem node6:/etc/kubernetes/ssl/
scp ca.pem node7:/etc/kubernetes/ssl/

Create the kubelet configuration file on node6:

cat > /etc/kubernetes/kubelet.json << "EOF"
{
  "kind": "KubeletConfiguration",
  "apiVersion": "kubelet.config.k8s.io/v1beta1",
  "authentication": {
    "x509": {
      "clientCAFile": "/etc/kubernetes/ssl/ca.pem"
    },
    "webhook": {
      "enabled": true,
      "cacheTTL": "2m0s"
    },
    "anonymous": {
      "enabled": false
    }
  },
  "authorization": {
    "mode": "Webhook",
    "webhook": {
      "cacheAuthorizedTTL": "5m0s",
      "cacheUnauthorizedTTL": "30s"
    }
  },
  "address": "192.168.202.136",
  "port": 10250,
  "readOnlyPort": 10255,
  "cgroupDriver": "systemd",                    
  "hairpinMode": "promiscuous-bridge",
  "serializeImagePulls": false,
  "clusterDomain": "cluster.local.",
  "clusterDNS": ["10.96.0.2"]
}
EOF

node7's configuration is identical to node6's, except that address must be set to node7's own IP (192.168.202.137).

Create the kubelet service unit file:

cat > /usr/lib/systemd/system/kubelet.service << "EOF"
[Unit]
Description=Kubernetes Kubelet
Documentation=https://github.com/kubernetes/kubernetes
After=containerd.service
Requires=containerd.service

[Service]
WorkingDirectory=/var/lib/kubelet
ExecStart=/usr/local/bin/kubelet \
  --bootstrap-kubeconfig=/etc/kubernetes/kubelet-bootstrap.kubeconfig \
  --cert-dir=/etc/kubernetes/ssl \
  --kubeconfig=/etc/kubernetes/kubelet.kubeconfig \
  --config=/etc/kubernetes/kubelet.json \
  --container-runtime-endpoint=unix:///run/containerd/containerd.sock \
  --rotate-certificates \
  --pod-infra-container-image=registry.aliyuncs.com/google_containers/pause:3.9 \
  --v=2
Restart=on-failure
RestartSec=5

[Install]
WantedBy=multi-user.target
EOF
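
The steps above do not show it explicitly, but kubelet must be started on node6 and node7 before the nodes can register:

systemctl daemon-reload
systemctl enable kubelet --now

If the bootstrap CSRs are not approved automatically, approve them from a master node (the CSR name below is a placeholder):

kubectl get csr
kubectl certificate approve <csr-name>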

Check the nodes:

[root@node5 ~]# kubectl get nodes
NAME    STATUS   ROLES    AGE     VERSION
node6   Ready    <none>   38m     v1.28.2
node7   Ready    <none>   3m59s   v1.28.2

Deploying kube-proxy

Create the kube-proxy CSR file:

cat > kube-proxy-csr.json << "EOF"
{
  "CN": "system:kube-proxy",
  "key": {
    "algo": "rsa",
    "size": 2048
  },
  "names": [
    {
      "C": "CN",
      "ST": "Beijing",
      "L": "Beijing",
      "O": "k8s",
      "OU": "CN"
    }
  ]
}
EOF

Generate the certificate:

cfssl gencert -ca=ca.pem -ca-key=ca-key.pem -config=ca-config.json -profile=kubernetes kube-proxy-csr.json | cfssljson -bare kube-proxy

Create the kubeconfig file:

kubectl config set-cluster kubernetes --certificate-authority=ca.pem --embed-certs=true --server=https://192.168.202.140:6443 --kubeconfig=kube-proxy.kubeconfig

kubectl config set-credentials kube-proxy --client-certificate=kube-proxy.pem --client-key=kube-proxy-key.pem --embed-certs=true --kubeconfig=kube-proxy.kubeconfig

kubectl config set-context default --cluster=kubernetes --user=kube-proxy --kubeconfig=kube-proxy.kubeconfig

kubectl config use-context default --kubeconfig=kube-proxy.kubeconfig

Copy the files to the worker nodes:

scp kube-proxy*.pem node6:/etc/kubernetes/ssl/
scp kube-proxy*.pem node7:/etc/kubernetes/ssl/

scp kube-proxy.kubeconfig node6:/etc/kubernetes/
scp kube-proxy.kubeconfig node7:/etc/kubernetes/

Create the service configuration file on node6:

cat > /etc/kubernetes/kube-proxy.yaml << "EOF"
apiVersion: kubeproxy.config.k8s.io/v1alpha1
bindAddress: 192.168.202.136
clientConnection:
  kubeconfig: /etc/kubernetes/kube-proxy.kubeconfig
clusterCIDR: 10.244.0.0/16
healthzBindAddress: 192.168.202.136:10256
kind: KubeProxyConfiguration
metricsBindAddress: 192.168.202.136:10249
mode: "ipvs"
EOF

Create the service configuration file on node7:

cat > /etc/kubernetes/kube-proxy.yaml << "EOF"
apiVersion: kubeproxy.config.k8s.io/v1alpha1
bindAddress: 192.168.202.137
clientConnection:
  kubeconfig: /etc/kubernetes/kube-proxy.kubeconfig
clusterCIDR: 10.244.0.0/16
healthzBindAddress: 192.168.202.137:10256
kind: KubeProxyConfiguration
metricsBindAddress: 192.168.202.137:10249
mode: "ipvs"
EOF

Create the service unit file:

mkdir -p /var/lib/kube-proxy
cat >  /usr/lib/systemd/system/kube-proxy.service << "EOF"
[Unit]
Description=Kubernetes Kube-Proxy Server
Documentation=https://github.com/kubernetes/kubernetes
After=network.target

[Service]
WorkingDirectory=/var/lib/kube-proxy
ExecStart=/usr/local/bin/kube-proxy \
  --config=/etc/kubernetes/kube-proxy.yaml \
  --v=2
Restart=on-failure
RestartSec=5
LimitNOFILE=65536

[Install]
WantedBy=multi-user.target
EOF

Start the Service

systemctl daemon-reload
systemctl enable --now kube-proxy
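
Because mode is set to ipvs, the resulting virtual servers can be inspected with the ipvsadm tool installed earlier; the kubernetes service IP 10.96.0.1:443 should map to the three apiserver endpoints:

ipvsadm -Ln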

Deploying the Flannel Network Plugin

Download the required yaml file:

wget https://github.com/flannel-io/flannel/releases/latest/download/kube-flannel.yml

Apply the yaml file (note the downloaded file is named kube-flannel.yml):

[root@node3 ~]# kubectl apply -f kube-flannel.yml 
namespace/kube-flannel created
serviceaccount/flannel created
clusterrole.rbac.authorization.k8s.io/flannel created
clusterrolebinding.rbac.authorization.k8s.io/flannel created
configmap/kube-flannel-cfg created
daemonset.apps/kube-flannel-ds created

Check that flannel is running:

[root@node3 ~]# kubectl get pods -A -owide
NAMESPACE      NAME                                       READY   STATUS        RESTARTS   AGE     IP                NODE    NOMINATED NODE   READINESS GATES
kube-flannel   kube-flannel-ds-k2cr8                      1/1     Running       0          5m37s   192.168.202.137   node7   <none>           <none>
kube-flannel   kube-flannel-ds-nbqr8                      1/1     Running       0          5m37s   192.168.202.136   node6   <none>           <none>

Deploying CoreDNS

cat >  coredns.yaml << "EOF"
apiVersion: v1
kind: ServiceAccount
metadata:
  name: coredns
  namespace: kube-system
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  labels:
    kubernetes.io/bootstrapping: rbac-defaults
  name: system:coredns
rules:
  - apiGroups:
    - ""
    resources:
    - endpoints
    - services
    - pods
    - namespaces
    verbs:
    - list
    - watch
  - apiGroups:
    - discovery.k8s.io
    resources:
    - endpointslices
    verbs:
    - list
    - watch
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  annotations:
    rbac.authorization.kubernetes.io/autoupdate: "true"
  labels:
    kubernetes.io/bootstrapping: rbac-defaults
  name: system:coredns
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: system:coredns
subjects:
- kind: ServiceAccount
  name: coredns
  namespace: kube-system
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: coredns
  namespace: kube-system
data:
  Corefile: |
    .:53 {
        errors
        health {
          lameduck 5s
        }
        ready
        kubernetes cluster.local  in-addr.arpa ip6.arpa {
          fallthrough in-addr.arpa ip6.arpa
        }
        prometheus :9153
        forward . /etc/resolv.conf {
          max_concurrent 1000
        }
        cache 30
        loop
        reload
        loadbalance
    }
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: coredns
  namespace: kube-system
  labels:
    k8s-app: kube-dns
    kubernetes.io/name: "CoreDNS"
spec:
  # replicas: not specified here:
  # 1. Default is 1.
  # 2. Will be tuned in real time if DNS horizontal auto-scaling is turned on.
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxUnavailable: 1
  selector:
    matchLabels:
      k8s-app: kube-dns
  template:
    metadata:
      labels:
        k8s-app: kube-dns
    spec:
      priorityClassName: system-cluster-critical
      serviceAccountName: coredns
      tolerations:
        - key: "CriticalAddonsOnly"
          operator: "Exists"
      nodeSelector:
        kubernetes.io/os: linux
      affinity:
         podAntiAffinity:
           preferredDuringSchedulingIgnoredDuringExecution:
           - weight: 100
             podAffinityTerm:
               labelSelector:
                 matchExpressions:
                   - key: k8s-app
                     operator: In
                     values: ["kube-dns"]
               topologyKey: kubernetes.io/hostname
      containers:
      - name: coredns
        image: coredns/coredns:1.8.4
        imagePullPolicy: IfNotPresent
        resources:
          limits:
            memory: 170Mi
          requests:
            cpu: 100m
            memory: 70Mi
        args: [ "-conf", "/etc/coredns/Corefile" ]
        volumeMounts:
        - name: config-volume
          mountPath: /etc/coredns
          readOnly: true
        ports:
        - containerPort: 53
          name: dns
          protocol: UDP
        - containerPort: 53
          name: dns-tcp
          protocol: TCP
        - containerPort: 9153
          name: metrics
          protocol: TCP
        securityContext:
          allowPrivilegeEscalation: false
          capabilities:
            add:
            - NET_BIND_SERVICE
            drop:
            - all
          readOnlyRootFilesystem: true
        livenessProbe:
          httpGet:
            path: /health
            port: 8080
            scheme: HTTP
          initialDelaySeconds: 60
          timeoutSeconds: 5
          successThreshold: 1
          failureThreshold: 5
        readinessProbe:
          httpGet:
            path: /ready
            port: 8181
            scheme: HTTP
      dnsPolicy: Default
      volumes:
        - name: config-volume
          configMap:
            name: coredns
            items:
            - key: Corefile
              path: Corefile
---
apiVersion: v1
kind: Service
metadata:
  name: kube-dns
  namespace: kube-system
  annotations:
    prometheus.io/port: "9153"
    prometheus.io/scrape: "true"
  labels:
    k8s-app: kube-dns
    kubernetes.io/cluster-service: "true"
    kubernetes.io/name: "CoreDNS"
spec:
  selector:
    k8s-app: kube-dns
  clusterIP: 10.96.0.2
  ports:
  - name: dns
    port: 53
    protocol: UDP
  - name: dns-tcp
    port: 53
    protocol: TCP
  - name: metrics
    port: 9153
    protocol: TCP
 
EOF

Apply the yaml file:

[root@node3 ~]# kubectl apply -f coredns.yaml 
serviceaccount/coredns created
clusterrole.rbac.authorization.k8s.io/system:coredns created
clusterrolebinding.rbac.authorization.k8s.io/system:coredns created
configmap/coredns created
deployment.apps/coredns created
service/kube-dns created

Check the result:

[root@node3 ~]# kubectl get pods -A
NAMESPACE      NAME                                       READY   STATUS        RESTARTS   AGE
kube-flannel   kube-flannel-ds-k2cr8                      1/1     Running       0          7m42s
kube-flannel   kube-flannel-ds-nbqr8                      1/1     Running       0          7m42s
kube-system    coredns-6dd8bc9d-5c6bj                     1/1     Running       0          50s
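
A quick DNS check using a throwaway pod (the busybox image is an assumption; any image with nslookup works):

kubectl run dns-test --image=busybox:1.36 --rm -it --restart=Never -- nslookup kubernetes.default.svc.cluster.local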

Deploy a Test Service

[root@node3 ~]# cat k8sutil.yaml 
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: k8sutil-deployment
spec:
  selector:
    matchLabels:
      app: k8sutil
  template:
    metadata:
      labels:
        app: k8sutil
    spec:
      containers:
      - name: k8sutil
        image: registry.cn-beijing.aliyuncs.com/postkarte/k8sutils:v1

Check the Pods:

[root@node3 ~]# kubectl get pods -A  -owide
NAMESPACE      NAME                       READY   STATUS    RESTARTS      AGE     IP                NODE    NOMINATED NODE   READINESS GATES
default        k8sutil-deployment-7pffl   1/1     Running   0             11h     10.244.1.2        node7   <none>           <none>
default        k8sutil-deployment-nnmxc   1/1     Running   0             3m24s   10.244.0.12       node6   <none>           <none>
kube-flannel   kube-flannel-ds-nbqr8      1/1     Running   3 (62s ago)   11h     192.168.202.136   node6   <none>           <none>
kube-flannel   kube-flannel-ds-zqstr      1/1     Running   1 (15m ago)   11h     192.168.202.137   node7   <none>           <none>
kube-system    coredns-6dd8bc9d-5mkhq     1/1     Running   0             35s     10.244.1.3        node7   <none>           <none>

Errors Encountered

Error 1

[root@node3 k8s-work]# kubectl create clusterrolebinding kube-apiserver:kubelet-apis --clusterrole=system:kubelet-api-admin --user kubernetes --kubeconfig=/root/.kube/config
error: failed to create clusterrolebinding: Post "https://192.168.202.100:6443/apis/rbac.authorization.k8s.io/v1/clusterrolebindings?fieldManager=kubectl-create&fieldValidation=Strict": tls: failed to verify certificate: x509: certificate is valid for 127.0.0.1, 192.168.202.129, 192.168.202.130, 192.168.202.131, 192.168.202.132, 192.168.202.134, 192.168.202.136, 192.168.202.137, 192.168.202.138, 192.168.202.139, 192.168.202.140, 10.96.0.1, not 192.168.202.100

The cause: the hosts list in kube-apiserver-csr.json did not include the IP 192.168.202.100.

Fix: add the missing IP to the hosts list and regenerate the certificate.
