Building an Enterprise Production-Grade Kubernetes Cluster with RKE
Introduction to RKE
- RKE (Rancher Kubernetes Engine) is a CNCF-certified open-source Kubernetes distribution that runs entirely within Docker containers.
- It solves the most common Kubernetes installation headaches by removing most host dependencies and providing a stable path for deployment, upgrades, and rollbacks.
- With RKE, Kubernetes is fully independent of the underlying operating system and platform, which makes automated Kubernetes operations straightforward.
- As long as a supported Docker version is running, Kubernetes can be deployed and run through RKE. In just a few minutes, RKE can build a cluster from a single command, and its declarative configuration makes Kubernetes upgrades atomic and safe (see the minimal sketch below).
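To illustrate that declarative workflow, here is a hypothetical minimal cluster.yml (the IP address and user below are placeholders, not part of this deployment); the whole cluster is then built, and later upgraded, with a single rke up:
# cluster.yml -- minimal, hypothetical single-node example
nodes:
  - address: 192.168.0.10              # placeholder node IP
    user: rancher                      # non-root user in the docker group
    role: [controlplane, worker, etcd] # all roles on one host
# build (or later upgrade) the cluster:
rke up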
Preparing the Deployment Environment
Use the CentOS 7.9 operating system and prepare five nodes with the following specifications:
| IP | CPU | Memory | Disk | Roles | Hostname |
|---|---|---|---|---|---|
| 192.168.91.190 | 2C | 2G | 40GB | controlplane, rancher, rke | master01 |
| 192.168.91.191 | 2C | 2G | 40GB | controlplane | master02 |
| 192.168.91.192 | 2C | 2G | 40GB | worker | worker01 |
| 192.168.91.193 | 2C | 2G | 40GB | worker | worker02 |
| 192.168.91.194 | 2C | 2G | 40GB | etcd | etcd01 |
Perform the following steps on every Kubernetes node.
# Basic configuration
cat >> /etc/hosts << EOF
192.168.91.190 master01
192.168.91.191 master02
192.168.91.192 worker01
192.168.91.193 worker02
192.168.91.194 etcd01
EOF
yum -y install ntpdate
echo "0 */1 * * * ntpdate time1.aliyun.com" >> /var/spool/cron/root
systemctl disable firewalld && systemctl stop firewalld
sed -ri 's/SELINUX=enforcing/SELINUX=disabled/' /etc/selinux/config
# Enable kernel packet forwarding and bridge netfilter
cat > /etc/sysctl.d/k8s.conf << EOF
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1
net.ipv4.ip_forward = 1
vm.swappiness = 0
EOF
modprobe br_netfilter
sysctl -p /etc/sysctl.d/k8s.conf
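# Optional check (assumes the settings above): the module should be loaded and both values should read 1
lsmod | grep br_netfilter
sysctl net.bridge.bridge-nf-call-iptables net.ipv4.ip_forward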
# Disable the swap partition
sed -i 's&/dev/mapper/centos-swap&#/dev/mapper/centos-swap&' /etc/fstab
swapoff -a
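# Optional check: kubelet refuses to start with swap enabled by default, so the Swap line should report 0
free -m | grep -i swap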
# Install Docker
wget -O /etc/yum.repos.d/docker-ce.repo https://mirrors.aliyun.com/docker-ce/linux/centos/docker-ce.repo
yum -y install --setopt=obsoletes=0 docker-ce-20.10.9-3.el7
mkdir /etc/docker
cat << EOF > /etc/docker/daemon.json
{
"registry-mirrors": ["https://zwyx2n3v.mirror.aliyuncs.com"]
}
EOF
systemctl enable docker && systemctl start docker
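# Optional check: confirm the installed engine version and that the registry mirror is active
docker version --format '{{.Server.Version}}'
docker info | grep -A1 "Registry Mirrors"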
# Reboot so the SELinux, swap, and kernel changes fully take effect
reboot
# On CentOS the root account cannot be used for RKE, so add a dedicated account for Docker-related operations
useradd rancher
usermod -aG docker rancher
echo 123 | passwd --stdin rancher
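# Optional check: group membership applies to new logins, and this is exactly what RKE does over SSH, so docker ps must work without sudo
su - rancher -c 'docker ps'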
# Generate an SSH key for deploying the cluster. The rke binary will run on master01, so only passwordless login from master01 to the other nodes is needed
# master01
ssh-keygen
# Answer yes and enter the rancher password when prompted for each host
for i in 0 1 2 3 4; do ssh-copy-id rancher@192.168.91.19$i; done
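# Optional check: each node should print its hostname without asking for a password
for i in 0 1 2 3 4; do ssh rancher@192.168.91.19$i hostname; done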
Deploying the Cluster with RKE
# master01
# Download the rke binary
wget https://github.com/rancher/rke/releases/download/v1.3.7/rke_linux-amd64
mv rke_linux-amd64 /usr/local/bin/rke
chmod +x /usr/local/bin/rke
rke --version
rke version v1.3.7
# Initialize the rke configuration file
mkdir -p /app/rancher
cd /app/rancher
# Answer the prompts as follows
rke config --name cluster.yml
[+] Cluster Level SSH Private Key Path [~/.ssh/id_rsa]: ~/.ssh/id_rsa
[+] Number of Hosts [1]: 3
[+] SSH Address of host (1) [none]: 192.168.91.190
[+] SSH Port of host (1) [22]: 22
[+] SSH Private Key Path of host (192.168.91.190) [none]: ~/.ssh/id_rsa
[+] SSH User of host (192.168.91.190) [ubuntu]: rancher
[+] Is host (192.168.91.190) a Control Plane host (y/n)? [y]: y
[+] Is host (192.168.91.190) a Worker host (y/n)? [n]: n
[+] Is host (192.168.91.190) an etcd host (y/n)? [n]: n
[+] Override Hostname of host (192.168.91.190) [none]:
[+] Internal IP of host (192.168.91.190) [none]:
[+] Docker socket path on host (192.168.91.190) [/var/run/docker.sock]:
[+] SSH Address of host (2) [none]: 192.168.91.192
[+] SSH Port of host (2) [22]: 22
[+] SSH Private Key Path of host (192.168.91.192) [none]: ~/.ssh/id_rsa
[+] SSH User of host (192.168.91.192) [ubuntu]: rancher
[+] Is host (192.168.91.192) a Control Plane host (y/n)? [y]: n
[+] Is host (192.168.91.192) a Worker host (y/n)? [n]: y
[+] Is host (192.168.91.192) an etcd host (y/n)? [n]: n
[+] Override Hostname of host (192.168.91.192) [none]:
[+] Internal IP of host (192.168.91.192) [none]:
[+] Docker socket path on host (192.168.91.192) [/var/run/docker.sock]:
[+] SSH Address of host (3) [none]: 192.168.91.194
[+] SSH Port of host (3) [22]: 22
[+] SSH Private Key Path of host (192.168.91.194) [none]: ~/.ssh/id_rsa
[+] SSH User of host (192.168.91.194) [ubuntu]: rancher
[+] Is host (192.168.91.194) a Control Plane host (y/n)? [y]: n
[+] Is host (192.168.91.194) a Worker host (y/n)? [n]: n
[+] Is host (192.168.91.194) an etcd host (y/n)? [n]: y
[+] Override Hostname of host (192.168.91.194) [none]:
[+] Internal IP of host (192.168.91.194) [none]:
[+] Docker socket path on host (192.168.91.194) [/var/run/docker.sock]:
[+] Network Plugin Type (flannel, calico, weave, canal, aci) [canal]:
[+] Authentication Strategy [x509]:
[+] Authorization Mode (rbac, none) [rbac]:
[+] Kubernetes Docker image [rancher/hyperkube:v1.22.6-rancher1]: rancher/hyperkube:v1.21.9-rancher1
[+] Cluster domain [cluster.local]:
[+] Service Cluster IP Range [10.43.0.0/16]:
[+] Enable PodSecurityPolicy [n]:
[+] Cluster Network CIDR [10.42.0.0/16]:
[+] Cluster DNS Service IP [10.43.0.10]:
[+] Add addon manifest URLs or YAML files [no]:
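# rke config writes the answers above to cluster.yml in the current directory; review the generated nodes and services sections before bringing the cluster up
grep -B1 -A2 "role:" cluster.yml
less cluster.yml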
If you plan to deploy Kubeflow or Istio later, the following parameters must be configured:
# cluster.yml
kube-controller:
  ...
  # required later for Kubeflow or Istio (certificate signing)
  cluster-signing-cert-file: "/etc/kubernetes/ssl/kube-ca.pem"
  cluster-signing-key-file: "/etc/kubernetes/ssl/kube-ca-key.pem"
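These are kube-controller-manager flags; in an RKE cluster.yml, extra flags for this component are normally passed through the kube-controller service's extra_args map under services. A hedged sketch of that placement (not the full file):
services:
  kube-controller:
    extra_args:
      cluster-signing-cert-file: "/etc/kubernetes/ssl/kube-ca.pem"
      cluster-signing-key-file: "/etc/kubernetes/ssl/kube-ca-key.pem"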
# master01
cd /app/rancher
# Bring up the cluster. The log output is long; as long as the network is healthy it normally succeeds
rke up
...
INFO[0458] [addons] Executing deploy job rke-ingress-controller
INFO[0463] [ingress] removing default backend service and deployment if they exist
INFO[0463] [ingress] ingress controller nginx deployed successfully
INFO[0463] [addons] Setting up user addons
INFO[0463] [addons] no user addons defined
INFO[0463] Finished building Kubernetes cluster successfully
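Besides the cluster itself, rke up leaves two important artifacts in the working directory: the kubeconfig kube_config_cluster.yml and the cluster state file cluster.rkestate. Keep both safe; they are needed for kubectl access and for future rke up runs.
ls /app/rancher
cluster.rkestate  cluster.yml  kube_config_cluster.yml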
Installing the kubectl Client
# master01
# Install the kubectl client
wget https://storage.googleapis.com/kubernetes-release/release/v1.21.9/bin/linux/amd64/kubectl
chmod +x kubectl
mv kubectl /usr/local/bin/kubectl
kubectl version --client
Client Version: version.Info{Major:"1", Minor:"21", GitVersion:"v1.21.9", GitCommit:"b631974d68ac5045e076c86a5c66fba6f128dc72", GitTreeState:"clean", BuildDate:"2022-01-19T17:51:12Z", GoVersion:"go1.16.12", Compiler:"gc", Platform:"linux/amd64"}
# Point kubectl at the cluster kubeconfig and verify access
mkdir ~/.kube
cp /app/rancher/kube_config_cluster.yml ~/.kube/config
kubectl get nodes
NAME STATUS ROLES AGE VERSION
192.168.91.190 Ready controlplane 2m49s v1.21.9
192.168.91.192 Ready worker 2m48s v1.21.9
192.168.91.194 Ready etcd 2m49s v1.21.9
kubectl get pods -n kube-system
NAME READY STATUS RESTARTS AGE
calico-kube-controllers-5685fbd9f7-g467d 1/1 Running 0 5m4s
canal-dk62m 2/2 Running 0 5m4s
canal-pk2b2 2/2 Running 0 5m4s
canal-zcbfz 2/2 Running 0 5m4s
coredns-8578b6dbdd-5qltl 1/1 Running 0 4m58s
coredns-autoscaler-f7b68ccb7-m8gxq 1/1 Running 0 4m58s
metrics-server-6bc7854fb5-b4tzq 1/1 Running 0 4m53s
rke-coredns-addon-deploy-job-dmfmm 0/1 Completed 0 4m59s
rke-ingress-controller-deploy-job-q8djc 0/1 Completed 0 4m49s
rke-metrics-addon-deploy-job-xbmln 0/1 Completed 0 4m54s
rke-network-plugin-deploy-job-tfc7c 0/1 Completed 0 5m14s
Web-Based Cluster Management with Rancher
The Rancher dashboard is mainly a convenient way to control Kubernetes clusters: viewing cluster status, editing clusters, and so on.
Rancher/Kubernetes version compatibility can be checked at www.suse.com/suse-ranche…
# master01
# Start a Rancher instance with docker run
docker run -d --restart=unless-stopped --privileged --name rancher -p 80:80 -p 443:443 rancher/rancher:v2.5.9
ss -anput | grep ":80"
tcp LISTEN 0 128 *:80 *:* users:(("docker-proxy",pid=46168,fd=4))
tcp LISTEN 0 128 [::]:80 [::]:* users:(("docker-proxy",pid=46174,fd=4))
# After logging in to the Rancher UI and importing the existing cluster, Rancher shows two registration commands. The first one fails because of the self-signed certificate
kubectl apply -f https://192.168.91.190/v3/import/x8dzj9kp6qgvd6p9dl79bxrws7zkkwxvpc44xvfk2rkwfdps46bzjc_c-lvnx7.yaml
Unable to connect to the server: x509: certificate is valid for 127.0.0.1, 172.17.0.2, not 192.168.91.190
# Use the second command instead
# The first attempt fails (curl returned nothing to apply, likely because Rancher was not fully ready yet)
curl --insecure -sfL https://192.168.91.190/v3/import/x8dzj9kp6qgvd6p9dl79bxrws7zkkwxvpc44xvfk2rkwfdps46bzjc_c-lvnx7.yaml | kubectl apply -f -
error: no objects passed to apply
# The second attempt succeeds
curl --insecure -sfL https://192.168.91.190/v3/import/x8dzj9kp6qgvd6p9dl79bxrws7zkkwxvpc44xvfk2rkwfdps46bzjc_c-lvnx7.yaml | kubectl apply -f -
Warning: resource clusterroles/proxy-clusterrole-kubeapiserver is missing the kubectl.kubernetes.io/last-applied-configuration annotation which is required by kubectl apply. kubectl apply should only be used on resources created declaratively by either kubectl create --save-config or kubectl apply. The missing annotation will be patched automatically.
clusterrole.rbac.authorization.k8s.io/proxy-clusterrole-kubeapiserver configured
Warning: resource clusterrolebindings/proxy-role-binding-kubernetes-master is missing the kubectl.kubernetes.io/last-applied-configuration annotation which is required by kubectl apply. kubectl apply should only be used on resources created declaratively by either kubectl create --save-config or kubectl apply. The missing annotation will be patched automatically.
clusterrolebinding.rbac.authorization.k8s.io/proxy-role-binding-kubernetes-master configured
namespace/cattle-system created
serviceaccount/cattle created
clusterrolebinding.rbac.authorization.k8s.io/cattle-admin-binding created
secret/cattle-credentials-8a71274 created
clusterrole.rbac.authorization.k8s.io/cattle-admin created
deployment.apps/cattle-cluster-agent created
# Watch until every pod is Running or Completed
watch kubectl get pods -A
Updating Cluster Nodes
# master01
cd /app/rancher
# Add a worker node
# Pods on a worker node are not terminated when it is removed. If the node is reused, the pods are deleted automatically when the new Kubernetes cluster is created
# Edit cluster.yml and add the worker02 node under nodes
- address: 192.168.91.193
  port: "22"
  internal_address: ""
  role:
  - worker
  hostname_override: ""
  user: rancher
  docker_socket: /var/run/docker.sock
  ssh_key: ""
  ssh_key_path: ~/.ssh/id_rsa
  ssh_cert: ""
  ssh_cert_path: ""
  labels: {}
  taints: []
rke up --update-only
...
INFO[0246] [addons] Setting up user addons
INFO[0246] [addons] no user addons defined
INFO[0246] Finished building Kubernetes cluster successfully
kubectl get nodes
NAME STATUS ROLES AGE VERSION
192.168.91.190 Ready controlplane 73m v1.21.9
192.168.91.192 Ready worker 73m v1.21.9
192.168.91.193 Ready worker 71s v1.21.9
192.168.91.194 Ready etcd 73m v1.21.9
# Remove the worker node
# Edit cluster.yml and delete the worker02 entry from nodes
rke up --update-only
...
INFO[0010] [addons] Setting up user addons
INFO[0010] [addons] no user addons defined
INFO[0010] Finished building Kubernetes cluster successfully
kubectl get nodes
NAME STATUS ROLES AGE VERSION
192.168.91.190 Ready controlplane 80m v1.21.9
192.168.91.192 Ready worker 80m v1.21.9
192.168.91.194 Ready etcd 80m v1.21.9
# Add an etcd node
# Edit cluster.yml and add the etcd node under nodes
- address: 192.168.91.193
  port: "22"
  internal_address: ""
  role:
  - etcd
  hostname_override: ""
  user: rancher
  docker_socket: /var/run/docker.sock
  ssh_key: ""
  ssh_key_path: ~/.ssh/id_rsa
  ssh_cert: ""
  ssh_cert_path: ""
  labels: {}
  taints: []
rke up --update-only
...
INFO[0129] [addons] Setting up user addons
INFO[0129] [addons] no user addons defined
INFO[0129] Finished building Kubernetes cluster successfully
kubectl get nodes
NAME STATUS ROLES AGE VERSION
192.168.91.190 Ready controlplane 91m v1.21.9
192.168.91.192 Ready worker 91m v1.21.9
192.168.91.193 Ready etcd 2m51s v1.21.9
192.168.91.194 Ready etcd 91m v1.21.9
# Remove the etcd node
# Edit cluster.yml and delete the etcd node entry added above
rke up --update-only
...
INFO[0022] [addons] Setting up user addons
INFO[0022] [addons] no user addons defined
INFO[0022] Finished building Kubernetes cluster successfully
kubectl get nodes
NAME STATUS ROLES AGE VERSION
192.168.91.190 Ready controlplane 94m v1.21.9
192.168.91.192 Ready worker 94m v1.21.9
192.168.91.194 Ready etcd 94m v1.21.9
# Add a master node. For whatever reason a controlplane node cannot be added directly; add it as a worker node first, and once that succeeds change its role to controlplane
# Edit cluster.yml and add the new node under nodes (as a worker for now)
- address: 192.168.91.191
  port: "22"
  internal_address: ""
  role:
  - worker
  hostname_override: ""
  user: rancher
  docker_socket: /var/run/docker.sock
  ssh_key: ""
  ssh_key_path: ~/.ssh/id_rsa
  ssh_cert: ""
  ssh_cert_path: ""
  labels: {}
  taints: []
rke up --update-only
kubectl get nodes
NAME STATUS ROLES AGE VERSION
192.168.91.190 Ready controlplane 5h13m v1.21.9
192.168.91.191 Ready worker 7m18s v1.21.9
192.168.91.192 Ready worker 5h13m v1.21.9
192.168.91.194 Ready etcd 5h13m v1.21.9
# Edit cluster.yml and change the role of 192.168.91.191 under nodes to controlplane
- address: 192.168.91.191
  port: "22"
  internal_address: ""
  role:
  - controlplane
rke up --update-only
kubectl get nodes
NAME STATUS ROLES AGE VERSION
192.168.91.190 Ready controlplane 5h15m v1.21.9
192.168.91.191 Ready controlplane 8m57s v1.21.9
192.168.91.192 Ready worker 5h15m v1.21.9
192.168.91.194 Ready etcd 5h15m v1.21.9
# Remove the master node
# Edit cluster.yml and delete the master node entry added above
rke up --update-only
kubectl get nodes
NAME STATUS ROLES AGE VERSION
192.168.91.190 Ready controlplane 5h17m v1.21.9
192.168.91.192 Ready worker 5h17m v1.21.9
192.168.91.194 Ready etcd 5h17m v1.21.9
Deploying an Application
# master01
cat > nginx.yaml << "EOF"
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-test
spec:
  selector:
    matchLabels:
      app: nginx
      env: test
      owner: rancher
  replicas: 2 # tells deployment to run 2 pods matching the template
  template:
    metadata:
      labels:
        app: nginx
        env: test
        owner: rancher
    spec:
      containers:
      - name: nginx-test
        image: nginx:1.19.9
        ports:
        - containerPort: 80
EOF
kubectl apply -f nginx.yaml
deployment.apps/nginx-test created
cat > nginx-service.yaml << "EOF"
---
apiVersion: v1
kind: Service
metadata:
  name: nginx-test
  labels:
    run: nginx
spec:
  type: NodePort
  ports:
  - port: 80
    protocol: TCP
  selector:
    owner: rancher
EOF
kubectl apply -f nginx-service.yaml
service/nginx-test created
# Verify
kubectl get pods -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
nginx-test-7d95fb4447-2t7nv 1/1 Running 0 104s 10.42.2.19 192.168.91.192 <none> <none>
nginx-test-7d95fb4447-sxsfg 1/1 Running 0 104s 10.42.2.18 192.168.91.192 <none> <none>
kubectl get svc -o wide
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE SELECTOR
kubernetes ClusterIP 10.43.0.1 <none> 443/TCP 5h26m <none>
nginx-test NodePort 10.43.109.255 <none> 80:31213/TCP 2m4s owner=rancher
curl 192.168.91.192:31213
<!DOCTYPE html>
<html>
<head>
<title>Welcome to nginx!</title>
...