Building an Enterprise Production-Grade Kubernetes Cluster with RKE


Introduction to RKE

  • RKE is a CNCF-certified open-source Kubernetes distribution that runs entirely within Docker containers.
  • It addresses the most common Kubernetes installation pain points by removing most host dependencies and providing a stable path for deployment, upgrades, and rollbacks.
  • With RKE, Kubernetes becomes fully independent of the underlying operating system and platform, making automated Kubernetes operations straightforward.
  • As long as a supported version of Docker is running, RKE can deploy and run Kubernetes. A cluster can be built with a single command in just a few minutes, and the declarative configuration makes Kubernetes upgrades atomic and safe (see the sketch below).
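
As a taste of that declarative style, here is a minimal, illustrative cluster.yml sketch; the address, SSH user, and version string are placeholders, and the file generated later in this walkthrough is far more complete:

# cluster.yml - minimal sketch (illustrative only)
nodes:
  - address: 192.168.91.190            # placeholder node address
    user: rancher                      # SSH user with docker access
    role: [controlplane, etcd, worker]
kubernetes_version: v1.21.9-rancher1-1 # assumption: a version shipped with the installed rke release

Running rke up against such a file builds the cluster; running it again reconciles the running cluster to match the file.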

Preparing the Deployment Environment

All nodes run CentOS 7.9 (7u9). Prepare five nodes with the following specifications:

IP              CPU  Memory  Disk  Role                        Hostname
192.168.91.190  2C   2G      40GB  controlplane, rancher, rke  master01
192.168.91.191  2C   2G      40GB  controlplane                master02
192.168.91.192  2C   2G      40GB  worker                      worker01
192.168.91.193  2C   2G      40GB  worker                      worker02
192.168.91.194  2C   2G      40GB  etcd                        etcd01

Perform the following steps on all Kubernetes nodes:

# Basic configuration: hosts entries, time sync, firewall, SELinux
cat >> /etc/hosts << EOF
192.168.91.190  master01
192.168.91.191  master02
192.168.91.192  worker01
192.168.91.193  worker02
192.168.91.194  etcd01
EOF
yum -y install ntpdate
echo "0 */1 * * * ntpdate time1.aliyun.com" >> /var/spool/cron/root
systemctl disable firewalld && systemctl stop firewalld
sed -ri 's/SELINUX=enforcing/SELINUX=disabled/' /etc/selinux/config

# Configure kernel IP forwarding and bridge netfilter
cat > /etc/sysctl.d/k8s.conf << EOF
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1
net.ipv4.ip_forward = 1
vm.swappiness = 0
EOF
modprobe br_netfilter
sysctl -p /etc/sysctl.d/k8s.conf
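
# br_netfilter is not loaded automatically at boot; persist it so the sysctl
# settings above survive a reboot (standard systemd modules-load mechanism)
cat > /etc/modules-load.d/br_netfilter.conf << EOF
br_netfilter
EOF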

# Disable the swap partition
sed -i 's&/dev/mapper/centos-swap&#/dev/mapper/centos-swap&' /etc/fstab
swapoff -a
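# Verify that swap is now off (the Swap line should show all zeros)
free -m | grep -i Swap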

# Install Docker
wget -O /etc/yum.repos.d/docker-ce.repo https://mirrors.aliyun.com/docker-ce/linux/centos/docker-ce.repo
yum -y install --setopt=obsoletes=0 docker-ce-20.10.9-3.el7
mkdir /etc/docker
cat << EOF > /etc/docker/daemon.json
{
  "registry-mirrors": ["https://zwyx2n3v.mirror.aliyuncs.com"]
}
EOF
systemctl enable docker && systemctl start docker
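# Verify Docker is running and the registry mirror is active
docker info | grep -A1 "Registry Mirrors"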
# Reboot to apply the SELinux and swap changes
reboot

# On CentOS, the root account cannot be used as the SSH user for RKE, so add a dedicated account for Docker-related operations
useradd rancher
usermod -aG docker rancher
echo 123 | passwd --stdin rancher

# Generate an SSH key pair for deploying the cluster. The rke binary runs on master01, so only passwordless login from master01 to the other nodes needs to be configured

# master01
ssh-keygen

# Answer yes and enter the rancher password at each prompt
for i in 0 1 2 3 4; do ssh-copy-id rancher@192.168.91.19$i; done
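
# RKE requires the SSH user to be able to run docker on every node; a quick
# check before running rke (prints each node's Docker server version)
for i in 0 1 2 3 4; do ssh rancher@192.168.91.19$i docker version --format '{{.Server.Version}}'; done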

Deploying the Cluster with RKE

# master01

# Download the rke binary
wget https://github.com/rancher/rke/releases/download/v1.3.7/rke_linux-amd64
mv rke_linux-amd64 /usr/local/bin/rke
chmod +x /usr/local/bin/rke
rke --version
rke version v1.3.7

# Initialize the RKE configuration file
mkdir -p /app/rancher
cd /app/rancher
# Answer the interactive prompts
rke config --name cluster.yml
[+] Cluster Level SSH Private Key Path [~/.ssh/id_rsa]: ~/.ssh/id_rsa
[+] Number of Hosts [1]: 3
[+] SSH Address of host (1) [none]: 192.168.91.190
[+] SSH Port of host (1) [22]: 22
[+] SSH Private Key Path of host (192.168.91.190) [none]: ~/.ssh/id_rsa
[+] SSH User of host (192.168.91.190) [ubuntu]: rancher
[+] Is host (192.168.91.190) a Control Plane host (y/n)? [y]: y
[+] Is host (192.168.91.190) a Worker host (y/n)? [n]: n
[+] Is host (192.168.91.190) an etcd host (y/n)? [n]: n
[+] Override Hostname of host (192.168.91.190) [none]:
[+] Internal IP of host (192.168.91.190) [none]:
[+] Docker socket path on host (192.168.91.190) [/var/run/docker.sock]:
[+] SSH Address of host (2) [none]: 192.168.91.192
[+] SSH Port of host (2) [22]: 22
[+] SSH Private Key Path of host (192.168.91.192) [none]: ~/.ssh/id_rsa
[+] SSH User of host (192.168.91.192) [ubuntu]: rancher
[+] Is host (192.168.91.192) a Control Plane host (y/n)? [y]: n
[+] Is host (192.168.91.192) a Worker host (y/n)? [n]: y
[+] Is host (192.168.91.192) an etcd host (y/n)? [n]: n
[+] Override Hostname of host (192.168.91.192) [none]:
[+] Internal IP of host (192.168.91.192) [none]:
[+] Docker socket path on host (192.168.91.192) [/var/run/docker.sock]:
[+] SSH Address of host (3) [none]: 192.168.91.194
[+] SSH Port of host (3) [22]: 22
[+] SSH Private Key Path of host (192.168.91.194) [none]: ~/.ssh/id_rsa
[+] SSH User of host (192.168.91.194) [ubuntu]: rancher
[+] Is host (192.168.91.194) a Control Plane host (y/n)? [y]: n
[+] Is host (192.168.91.194) a Worker host (y/n)? [n]: n
[+] Is host (192.168.91.194) an etcd host (y/n)? [n]: y
[+] Override Hostname of host (192.168.91.194) [none]:
[+] Internal IP of host (192.168.91.194) [none]:
[+] Docker socket path on host (192.168.91.194) [/var/run/docker.sock]:
[+] Network Plugin Type (flannel, calico, weave, canal, aci) [canal]:
[+] Authentication Strategy [x509]:
[+] Authorization Mode (rbac, none) [rbac]:
[+] Kubernetes Docker image [rancher/hyperkube:v1.22.6-rancher1]: rancher/hyperkube:v1.21.9-rancher1
[+] Cluster domain [cluster.local]:
[+] Service Cluster IP Range [10.43.0.0/16]:
[+] Enable PodSecurityPolicy [n]:
[+] Cluster Network CIDR [10.42.0.0/16]:
[+] Cluster DNS Service IP [10.43.0.10]:
[+] Add addon manifest URLs or YAML files [no]:

If Kubeflow or Istio will be deployed later, the following parameters must be configured:

# cluster.yml
services:
  kube-controller:
    extra_args:
      # Required if Kubeflow or Istio will be deployed later (CSR signing)
      cluster-signing-cert-file: "/etc/kubernetes/ssl/kube-ca.pem"
      cluster-signing-key-file: "/etc/kubernetes/ssl/kube-ca-key.pem"

# master01
cd /app/rancher
# Deploy the cluster. The log output is long; with a healthy network it generally succeeds. On success, RKE writes the kubeconfig kube_config_cluster.yml (used below) and the state file cluster.rkestate next to cluster.yml
rke up
...

INFO[0458] [addons] Executing deploy job rke-ingress-controller
INFO[0463] [ingress] removing default backend service and deployment if they exist
INFO[0463] [ingress] ingress controller nginx deployed successfully
INFO[0463] [addons] Setting up user addons
INFO[0463] [addons] no user addons defined
INFO[0463] Finished building Kubernetes cluster successfully

Installing the kubectl Client

# master01

# Install the kubectl client
wget https://storage.googleapis.com/kubernetes-release/release/v1.21.9/bin/linux/amd64/kubectl
chmod +x kubectl 
mv kubectl /usr/local/bin/kubectl
kubectl version --client
Client Version: version.Info{Major:"1", Minor:"21", GitVersion:"v1.21.9", GitCommit:"b631974d68ac5045e076c86a5c66fba6f128dc72", GitTreeState:"clean", BuildDate:"2022-01-19T17:51:12Z", GoVersion:"go1.16.12", Compiler:"gc", Platform:"linux/amd64"}

# Point kubectl at the new cluster and verify access
mkdir ~/.kube
cp /app/rancher/kube_config_cluster.yml ~/.kube/config
kubectl get nodes
NAME             STATUS   ROLES          AGE     VERSION
192.168.91.190   Ready    controlplane   2m49s   v1.21.9
192.168.91.192   Ready    worker         2m48s   v1.21.9
192.168.91.194   Ready    etcd           2m49s   v1.21.9
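
# Optional sanity check: confirm the API server and cluster services respond
kubectl cluster-info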

kubectl get pods -n kube-system
NAME                                       READY   STATUS      RESTARTS   AGE
calico-kube-controllers-5685fbd9f7-g467d   1/1     Running     0          5m4s
canal-dk62m                                2/2     Running     0          5m4s
canal-pk2b2                                2/2     Running     0          5m4s
canal-zcbfz                                2/2     Running     0          5m4s
coredns-8578b6dbdd-5qltl                   1/1     Running     0          4m58s
coredns-autoscaler-f7b68ccb7-m8gxq         1/1     Running     0          4m58s
metrics-server-6bc7854fb5-b4tzq            1/1     Running     0          4m53s
rke-coredns-addon-deploy-job-dmfmm         0/1     Completed   0          4m59s
rke-ingress-controller-deploy-job-q8djc    0/1     Completed   0          4m49s
rke-metrics-addon-deploy-job-xbmln         0/1     Completed   0          4m54s
rke-network-plugin-deploy-job-tfc7c        0/1     Completed   0          5m14s

Web-Based Cluster Management with Rancher

The Rancher dashboard is mainly a convenient way to manage Kubernetes clusters: viewing cluster status, editing clusters, and so on.

The Rancher/Kubernetes version compatibility matrix is available at www.suse.com/suse-ranche…

# master01

# Start a Rancher server with docker run
docker run -d --restart=unless-stopped --privileged --name rancher -p 80:80 -p 443:443 rancher/rancher:v2.5.9

ss -anput | grep ":80"
tcp    LISTEN     0      128       *:80                    *:*                   users:(("docker-proxy",pid=46168,fd=4))
tcp    LISTEN     0      128    [::]:80                 [::]:*                   users:(("docker-proxy",pid=46174,fd=4))
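
The command above keeps all of Rancher's state inside the container. For anything beyond a demo, it is worth bind-mounting Rancher's data directory to the host so the server survives container re-creation; a hedged variant for a fresh install (the host path /opt/rancher is an arbitrary choice):

docker run -d --restart=unless-stopped --privileged --name rancher \
  -p 80:80 -p 443:443 \
  -v /opt/rancher:/var/lib/rancher \
  rancher/rancher:v2.5.9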

(Screenshots: Rancher web UI — setting the initial admin password, then the cluster-import workflow, which displays the two import commands used below.)

# Using the first import command fails
kubectl apply -f https://192.168.91.190/v3/import/x8dzj9kp6qgvd6p9dl79bxrws7zkkwxvpc44xvfk2rkwfdps46bzjc_c-lvnx7.yaml
Unable to connect to the server: x509: certificate is valid for 127.0.0.1, 172.17.0.2, not 192.168.91.190

# Use the second command
# The first attempt fails (curl -sf produced no output, so nothing was passed to kubectl)
curl --insecure -sfL https://192.168.91.190/v3/import/x8dzj9kp6qgvd6p9dl79bxrws7zkkwxvpc44xvfk2rkwfdps46bzjc_c-lvnx7.yaml | kubectl apply -f -
error: no objects passed to apply

# The second attempt succeeds
curl --insecure -sfL https://192.168.91.190/v3/import/x8dzj9kp6qgvd6p9dl79bxrws7zkkwxvpc44xvfk2rkwfdps46bzjc_c-lvnx7.yaml | kubectl apply -f -
Warning: resource clusterroles/proxy-clusterrole-kubeapiserver is missing the kubectl.kubernetes.io/last-applied-configuration annotation which is required by kubectl apply. kubectl apply should only be used on resources created declaratively by either kubectl create --save-config or kubectl apply. The missing annotation will be patched automatically.
clusterrole.rbac.authorization.k8s.io/proxy-clusterrole-kubeapiserver configured
Warning: resource clusterrolebindings/proxy-role-binding-kubernetes-master is missing the kubectl.kubernetes.io/last-applied-configuration annotation which is required by kubectl apply. kubectl apply should only be used on resources created declaratively by either kubectl create --save-config or kubectl apply. The missing annotation will be patched automatically.
clusterrolebinding.rbac.authorization.k8s.io/proxy-role-binding-kubernetes-master configured
namespace/cattle-system created
serviceaccount/cattle created
clusterrolebinding.rbac.authorization.k8s.io/cattle-admin-binding created
secret/cattle-credentials-8a71274 created
clusterrole.rbac.authorization.k8s.io/cattle-admin created
deployment.apps/cattle-cluster-agent created

# Watch until every pod is Running or Completed
watch kubectl get pods -A
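
# Once the import completes, the cluster agent should be Running in the
# cattle-system namespace created by the import manifest
kubectl get pods -n cattle-system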

(Screenshots: the imported cluster visible in the Rancher UI.)

Updating Cluster Nodes

# master01
cd /app/rancher
# Add a worker node
# Pods on a removed worker node are not terminated; if the node is reused in a new Kubernetes cluster, the old pods are removed automatically
# Edit cluster.yml and add the worker02 entry under nodes
- address: 192.168.91.193
  port: "22"
  internal_address: ""
  role:
  - worker
  hostname_override: ""
  user: rancher
  docker_socket: /var/run/docker.sock
  ssh_key: ""
  ssh_key_path: ~/.ssh/id_rsa
  ssh_cert: ""
  ssh_cert_path: ""
  labels: {}
  taints: []

rke up --update-only
...
INFO[0246] [addons] Setting up user addons
INFO[0246] [addons] no user addons defined
INFO[0246] Finished building Kubernetes cluster successfully

kubectl get nodes
NAME             STATUS   ROLES          AGE   VERSION
192.168.91.190   Ready    controlplane   73m   v1.21.9
192.168.91.192   Ready    worker         73m   v1.21.9
192.168.91.193   Ready    worker         71s   v1.21.9
192.168.91.194   Ready    etcd           73m   v1.21.9

# Remove a worker node
# Edit cluster.yml and delete the worker02 entry under nodes
rke up --update-only
...
INFO[0010] [addons] Setting up user addons
INFO[0010] [addons] no user addons defined
INFO[0010] Finished building Kubernetes cluster successfully

kubectl get nodes
NAME             STATUS   ROLES          AGE   VERSION
192.168.91.190   Ready    controlplane   80m   v1.21.9
192.168.91.192   Ready    worker         80m   v1.21.9
192.168.91.194   Ready    etcd           80m   v1.21.9
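
# Note: for a worker that is already running workloads, it is safer to drain
# it before removing it from cluster.yml so pods are rescheduled gracefully
# (node name as shown by kubectl get nodes; flags per kubectl v1.21)
kubectl drain 192.168.91.193 --ignore-daemonsets --delete-emptydir-data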


# Add an etcd node
# Edit cluster.yml and add the etcd entry under nodes
- address: 192.168.91.193
  port: "22"
  internal_address: ""
  role:
  - etcd
  hostname_override: ""
  user: rancher
  docker_socket: /var/run/docker.sock
  ssh_key: ""
  ssh_key_path: ~/.ssh/id_rsa
  ssh_cert: ""
  ssh_cert_path: ""
  labels: {}
  taints: []

rke up --update-only
...
INFO[0129] [addons] Setting up user addons
INFO[0129] [addons] no user addons defined
INFO[0129] Finished building Kubernetes cluster successfully

kubectl get nodes
NAME             STATUS   ROLES          AGE     VERSION
192.168.91.190   Ready    controlplane   91m     v1.21.9
192.168.91.192   Ready    worker         91m     v1.21.9
192.168.91.193   Ready    etcd           2m51s   v1.21.9
192.168.91.194   Ready    etcd           91m     v1.21.9
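
# Optional: check etcd membership from an etcd node. RKE runs etcd in a
# container named "etcd" with etcdctl preconfigured (per Rancher's
# documentation for RKE clusters), so the member list can be printed with:
docker exec etcd etcdctl member list
# Note that etcd prefers an odd number of members; two should only be transient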

# Remove an etcd node
# Edit cluster.yml and delete the etcd entry added above
rke up --update-only
...
INFO[0022] [addons] Setting up user addons
INFO[0022] [addons] no user addons defined
INFO[0022] Finished building Kubernetes cluster successfully

kubectl get nodes
NAME             STATUS   ROLES          AGE   VERSION
192.168.91.190   Ready    controlplane   94m   v1.21.9
192.168.91.192   Ready    worker         94m   v1.21.9
192.168.91.194   Ready    etcd           94m   v1.21.9

# Add a master node. For unknown reasons a node cannot be added directly as a master; add it as a worker first, then change its role to controlplane once that succeeds
# Edit cluster.yml and add the new node under nodes (as a worker for now)
- address: 192.168.91.191
  port: "22"
  internal_address: ""
  role:
  - worker
  hostname_override: ""
  user: rancher
  docker_socket: /var/run/docker.sock
  ssh_key: ""
  ssh_key_path: ~/.ssh/id_rsa
  ssh_cert: ""
  ssh_cert_path: ""
  labels: {}
  taints: []

rke up --update-only

kubectl get nodes
NAME             STATUS   ROLES          AGE     VERSION
192.168.91.190   Ready    controlplane   5h13m   v1.21.9
192.168.91.191   Ready    worker         7m18s   v1.21.9
192.168.91.192   Ready    worker         5h13m   v1.21.9
192.168.91.194   Ready    etcd           5h13m   v1.21.9

# Edit cluster.yml and change the role of 192.168.91.191 under nodes to controlplane
- address: 192.168.91.191
  port: "22"
  internal_address: ""
  role:
  - controlplane
  
rke up --update-only

kubectl get nodes
NAME             STATUS   ROLES          AGE     VERSION
192.168.91.190   Ready    controlplane   5h15m   v1.21.9
192.168.91.191   Ready    controlplane   8m57s   v1.21.9
192.168.91.192   Ready    worker         5h15m   v1.21.9
192.168.91.194   Ready    etcd           5h15m   v1.21.9

# Remove a master node
# Edit cluster.yml and delete the master entry added above
rke up --update-only

kubectl get nodes
NAME             STATUS   ROLES          AGE     VERSION
192.168.91.190   Ready    controlplane   5h17m   v1.21.9
192.168.91.192   Ready    worker         5h17m   v1.21.9
192.168.91.194   Ready    etcd           5h17m   v1.21.9

Deploying an Application

# master01

cat > nginx.yaml << "EOF"
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-test
spec:
  selector:
    matchLabels:
      app: nginx
      env: test
      owner: rancher
  replicas: 2 # tells deployment to run 2 pods matching the template
  template:
    metadata:
      labels:
        app: nginx
        env: test
        owner: rancher
    spec:
      containers:
        - name: nginx-test
          image: nginx:1.19.9
          ports:
            - containerPort: 80
EOF

kubectl apply -f nginx.yaml
deployment.apps/nginx-test created

cat > nginx-service.yaml << "EOF"
---
apiVersion: v1
kind: Service
metadata:
  name: nginx-test
  labels:
    run: nginx
spec:
  type: NodePort
  ports:
  - port: 80
    protocol: TCP
  selector:
    owner: rancher
EOF

kubectl apply -f nginx-service.yaml
service/nginx-test created

# Verify
kubectl get pods -o wide
NAME                          READY   STATUS    RESTARTS   AGE    IP           NODE             NOMINATED NODE   READINESS GATES
nginx-test-7d95fb4447-2t7nv   1/1     Running   0          104s   10.42.2.19   192.168.91.192   <none>           <none>
nginx-test-7d95fb4447-sxsfg   1/1     Running   0          104s   10.42.2.18   192.168.91.192   <none>           <none>

kubectl get svc -o wide
NAME         TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)        AGE     SELECTOR
kubernetes   ClusterIP   10.43.0.1       <none>        443/TCP        5h26m   <none>
nginx-test   NodePort    10.43.109.255   <none>        80:31213/TCP   2m4s    owner=rancher

curl 192.168.91.192:31213
<!DOCTYPE html>
<html>
<head>
<title>Welcome to nginx!</title>
...
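
To remove the demo application afterwards:

kubectl delete -f nginx-service.yaml -f nginx.yaml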