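This document covers migrating services to the production K8S environment. It is number 4 in the full series and covers four components: etcd, api-server, controller-manager, and scheduler.
Deployment architecture
| host | ip | Components | Notes |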
|---|---|---|---|
| ed1.bj.ylls.com | 172.27.0.20 | etcd | |
| ed2.bj.ylls.com | 172.27.0.21 | etcd | |
| ed3.bj.ylls.com | 172.27.0.22 | etcd | |
| km1.bj.ylls.com | 172.27.0.10 | api-server controller-manager scheduler | |
| km2.bj.ylls.com | 172.27.0.11 | api-server controller-manager scheduler | |
| k8s-master.ylls.com | 172.27.0.19 | | VIP for the k8s-master service |
Component information
| Component | Version | Download URL |
|---|---|---|
| etcd | v3.1.20 | github.com/etcd-io/etc… |
| api-server | v1.15.5 | dl.k8s.io/v1.15.5/kub… |
| controller-manager | v1.15.5 | dl.k8s.io/v1.15.5/kub… |
| scheduler | v1.15.5 | dl.k8s.io/v1.15.5/kub… |
etcd deployment
Certificate preparation
Certificate tools
Installing the tools
wget https://github.com/cloudflare/cfssl/releases/download/1.2.0/cfssl_linux-amd64 -O /usr/bin/cfssl
wget https://github.com/cloudflare/cfssl/releases/download/1.2.0/cfssljson_linux-amd64 -O /usr/bin/cfssl-json
wget https://github.com/cloudflare/cfssl/releases/download/1.2.0/cfssl-certinfo_linux-amd64 -O /usr/bin/cfssl-certinfo
chmod +x /usr/bin/cfssl*
Generating certificates
Generate a configuration file template
cfssl print-defaults csr > ca-csr.json
Edit the configuration file ca-csr.json
{
"CN": "ylls",
"hosts": [
"ylls.com"
],
"key": {
"algo": "rsa",
"size": 2048
},
"names": [
{
"C": "CN",
"ST": "Beijing",
"L": "Beijing",
"O": "ylls",
"OU": "ops"
}
],
"ca": {
"expiry": "87600h"
}
}
The expiry field sets the certificate lifetime. Here it is 87600 hours, i.e. 10 years.
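The hours-to-years conversion above can be checked with a line of shell arithmetic. This is only a sanity check and assumes 365-day years; the variable names are illustrative.

```shell
#!/bin/sh
# Convert a cfssl-style expiry given in hours into whole 365-day years.
expiry_hours=87600
expiry_years=$((expiry_hours / 24 / 365))
echo "$expiry_hours hours = $expiry_years years"
```

The same arithmetic helps when picking a shorter lifetime, for example 8760 hours for one year.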
Generate the CA certificate
cfssl gencert -initca ca-csr.json | cfssl-json -bare ca
This produces three files: ca.pem, ca-key.pem, and ca.csr.
Edit the configuration file ca-config.json
{
"signing": {
"default": {
"expiry": "87600h"
},
"profiles": {
"server": {
"expiry": "87600h",
"usages": [
"signing",
"key encipherment",
"server auth"
]
},
"client": {
"expiry": "87600h",
"usages": [
"signing",
"key encipherment",
"client auth"
]
},
"peer": {
"expiry": "87600h",
"usages": [
"signing",
"key encipherment",
"server auth",
"client auth"
]
}
}
}
}
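Three certificate profiles are configured here: server, client, and peer. They correspond to server authentication, client authentication, and mutual (two-way) authentication respectively.
Edit the configuration file etcd-peer-csr.json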
{
"CN": "k8s-etcd",
"hosts": [
"172.27.0.20",
"172.27.0.21",
"172.27.0.22",
"172.27.0.23",
"172.27.0.24",
"172.27.0.25",
"172.27.0.26",
"172.27.0.27",
"172.27.0.28",
"172.27.0.29"
],
"key": {
"algo": "rsa",
"size": 2048
},
"names": [
{
"C": "CN",
"ST": "Beijing",
"L": "Beijing",
"O": "ylls",
"OU": "ops"
}
]
}
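This setup deploys only three etcd nodes, with IPs 172.27.0.20-22. To allow for adding or replacing machines later, every planned IP from .20 to .29 is listed in the configuration file. Note that each IP must be written out individually; an IP range is not accepted.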
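Because every address has to be listed individually, the hosts entries can be generated rather than typed by hand. The following is a minimal sketch; the function name hosts_entries is hypothetical, and the output is meant to be pasted into the "hosts" array of etcd-peer-csr.json.

```shell
#!/bin/sh
# Print one quoted address per line for <prefix>.<first> .. <prefix>.<last>,
# with a trailing comma on every line except the last.
hosts_entries() {
    prefix=$1
    i=$2
    last=$3
    while [ "$i" -le "$last" ]; do
        if [ "$i" -lt "$last" ]; then sep=","; else sep=""; fi
        printf '"%s.%s"%s\n' "$prefix" "$i" "$sep"
        i=$((i + 1))
    done
}

hosts_entries 172.27.0 20 29
```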
Generate the etcd-peer certificate
cfssl gencert -ca=ca.pem -ca-key=ca-key.pem -config=ca-config.json -profile=peer etcd-peer-csr.json|cfssl-json -bare etcd-peer
This produces three files: etcd-peer.pem, etcd-peer-key.pem, and etcd-peer.csr.
Installing etcd
- Preparation
useradd -s /sbin/nologin -M etcd
wget https://github.com/etcd-io/etcd/releases/download/v3.1.20/etcd-v3.1.20-linux-amd64.tar.gz
tar zxvf etcd-v3.1.20-linux-amd64.tar.gz
mv etcd-v3.1.20-linux-amd64 /server/src/etcd-3.1.20
ln -s /server/src/etcd-3.1.20 /server/etcd
mkdir -p /server/etcd/certs /server/etcd/data /server/logs/etcd
chown -R etcd:etcd /server/etcd
chown -R etcd:etcd /server/logs/etcd
cp etcd-peer.pem etcd-peer-key.pem ca.pem /server/etcd/certs/
chown etcd:etcd /server/etcd/certs/etcd-peer-key.pem
- Startup script etcd-startup.sh
#!/bin/sh
# --name must match this member's entry in --initial-cluster
/server/etcd/etcd --name etcd-server-20 \
--data-dir /server/etcd/data \
--listen-peer-urls https://172.27.0.20:2380 \
--listen-client-urls https://172.27.0.20:2379,http://127.0.0.1:2379 \
--quota-backend-bytes 8000000000 \
--initial-advertise-peer-urls https://172.27.0.20:2380 \
--advertise-client-urls https://172.27.0.20:2379,http://127.0.0.1:2379 \
--initial-cluster etcd-server-20=https://172.27.0.20:2380,etcd-server-21=https://172.27.0.21:2380,etcd-server-22=https://172.27.0.22:2380 \
--ca-file /server/etcd/certs/ca.pem \
--cert-file /server/etcd/certs/etcd-peer.pem \
--key-file /server/etcd/certs/etcd-peer-key.pem \
--client-cert-auth \
--trusted-ca-file /server/etcd/certs/ca.pem \
--peer-ca-file /server/etcd/certs/ca.pem \
--peer-cert-file /server/etcd/certs/etcd-peer.pem \
--peer-key-file /server/etcd/certs/etcd-peer-key.pem \
--peer-client-cert-auth \
--peer-trusted-ca-file /server/etcd/certs/ca.pem
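The --initial-cluster value is long and easy to mistype, and it must be identical on all three nodes. As a sketch, it can be assembled from the last octet of each member IP, following the etcd-server-<octet> naming used in the --initial-cluster line. The function name initial_cluster is hypothetical.

```shell
#!/bin/sh
# Build the --initial-cluster value: one name=peer-url pair per member,
# joined with commas. Arguments: network prefix, then one last octet per member.
initial_cluster() {
    prefix=$1
    shift
    out=""
    for octet in "$@"; do
        entry="etcd-server-${octet}=https://${prefix}.${octet}:2380"
        if [ -z "$out" ]; then out=$entry; else out="${out},${entry}"; fi
    done
    printf '%s\n' "$out"
}

initial_cluster 172.27.0 20 21 22
```

Each node's --name has to equal its own key in this list, otherwise the member cannot find itself in the cluster definition.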
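- Prepare the process manager supervisor
yum -y install supervisor
systemctl start supervisord
systemctl enable supervisord
- Prepare the supervisor ini file for etcd, /etc/supervisord.d/etcd-server.ini
The directories referenced in the ini file must be created beforehand, otherwise supervisor cannot start the program.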
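[program:etcd-server-0.20] ; program name, used in supervisorctl to operate on this program
autorestart=True ; restart automatically after an abnormal exit
autostart=True ; start automatically when supervisord starts
redirect_stderr=True ; redirect stderr to stdout, default false
#environment=PATH="/home/app_env/bin" ; environment variables can be added through environment
command=/server/etcd/etcd-startup.sh ; start command, the same as starting manually on the command line
user=etcd ; user to run the program as
#directory=/server/etcd ; working directory of the program
stdout_logfile=/server/logs/etcd.log
stdout_logfile_maxbytes = 200MB ; stdout log file size, default 50MB
stdout_logfile_backups = 20 ; number of stdout log file backups
; stdout log file: the program cannot start if the target directory does not exist, so create the directory manually (supervisord creates the log file itself)
numprocs=1 ; start one process
startsecs=30 ; the program must stay up for 30 seconds to count as started
startretries=3 ; retry starting 3 times
exitcodes=0,2 ; expected exit codes
stopsignal=QUIT ; signal used to stop the program
stopwaitsecs=10 ; seconds to wait for the program to stop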
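Since supervisor refuses to start a program whose log directory is missing, the directory can be created straight from the ini file before running supervisorctl update. This is a minimal sketch; the function name ensure_log_dirs is hypothetical, and it assumes the ini file uses the stdout_logfile key with an optional trailing "; comment".

```shell
#!/bin/sh
# Create the parent directory of every stdout_logfile found in a supervisor ini file.
# Keys such as stdout_logfile_maxbytes are not matched by the pattern.
ensure_log_dirs() {
    ini=$1
    sed -n 's/^stdout_logfile[[:space:]]*=[[:space:]]*\([^;[:space:]]*\).*/\1/p' "$ini" |
    while read -r logfile; do
        mkdir -p "$(dirname "$logfile")"
    done
}

# Usage on the etcd host:
#   ensure_log_dirs /etc/supervisord.d/etcd-server.ini
```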
- Start etcd through supervisor
supervisorctl update
supervisorctl status
- After the other two nodes are deployed, test the cluster state
/server/etcd/etcdctl cluster-health
/server/etcd/etcdctl member list
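The two commands above go through the plain-HTTP listener on 127.0.0.1. To run the same checks against the TLS client endpoints, for instance from another machine, etcdctl needs the CA and a client certificate. The wrapper below is a sketch under assumptions not stated in the original: etcdctl is on PATH unless the ETCDCTL variable is set, it speaks the v2 API (the etcdctl default in etcd 3.1), and the peer certificate is reused as the client certificate, which the peer profile's "client auth" usage permits. The name etcdctl_tls is hypothetical.

```shell
#!/bin/sh
# Assumed locations; adjust to where the certificates were copied.
ETCDCTL=${ETCDCTL:-etcdctl}
CERT_DIR=/server/etcd/certs
ENDPOINTS=https://172.27.0.20:2379,https://172.27.0.21:2379,https://172.27.0.22:2379

# Run an etcdctl (v2 API) subcommand against the TLS client endpoints.
etcdctl_tls() {
    "$ETCDCTL" --endpoints "$ENDPOINTS" \
        --ca-file "$CERT_DIR/ca.pem" \
        --cert-file "$CERT_DIR/etcd-peer.pem" \
        --key-file "$CERT_DIR/etcd-peer-key.pem" \
        "$@"
}

# On a host that can reach the cluster:
#   etcdctl_tls cluster-health
#   etcdctl_tls member list
```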
Deploying api-server
Preparation
wget https://dl.k8s.io/v1.15.5/kubernetes-server-linux-amd64.tar.gz
tar zxvf kubernetes-server-linux-amd64.tar.gz
mv kubernetes /server/src/k8s-server-1.15.5 # the archive unpacks into a directory named kubernetes
ln -s /server/src/k8s-server-1.15.5 /server/k8s-server
rm -rf /server/k8s-server/kubernetes-src.tar.gz # delete the Go source archive
rm -rf /server/k8s-server/server/bin/*.tar # delete the image archives
rm -rf /server/k8s-server/server/bin/*_tag # delete the tag text files
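Issuing certificates
- client certificate
Issue the client certificate. It is used for communication between api-server and etcd, and also serves as the common certificate for all clients later on.
- Edit the configuration file client-csr.json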
{
"CN": "k8s-node",
"hosts": [
],
"key": {
"algo": "rsa",
"size": 2048
},
"names": [
{
"C": "CN",
"ST": "Beijing",
"L": "Beijing",
"O": "ylls",
"OU": "ops"
}
]
}
- Generate the certificate
cfssl gencert -ca=ca.pem -ca-key=ca-key.pem -config=ca-config.json -profile=client client-csr.json|cfssl-json -bare client
This produces three files: client.pem, client-key.pem, and client.csr.
- Issue the server certificate for api-server
- Edit the configuration file apiserver-csr.json
{
"CN": "k8s-apiserver",
"hosts": [
"172.27.0.10",
"192.168.254.1",
"127.0.0.1",
"kubernetes.default",
"kubernetes.default.svc",
"kubernetes.default.svc.cluster",
"kubernetes.default.svc.cluster.local",
"172.27.0.11",
"172.27.0.12",
"172.27.0.13",
"172.27.0.14",
"172.27.0.15",
"172.27.0.16",
"172.27.0.17",
"172.27.0.18",
"172.27.0.19"
],
"key": {
"algo": "rsa",
"size": 2048
},
"names": [
{
"C": "CN",
"ST": "Beijing",
"L": "Beijing",
"O": "ylls",
"OU": "ops"
}
]
}
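192.168.254.1 in hosts is the clusterIP of the apiserver inside the k8s cluster, i.e. the first address of the service range 192.168.254.0/24. It must be included.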
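The clusterIP that has to appear in hosts follows from the service range passed to --service-cluster-ip-range: it is the network address plus one. A small sketch of that calculation, assuming the CIDR is written with its network address (as 192.168.254.0/24 is); the function name first_service_ip is hypothetical.

```shell
#!/bin/sh
# Print the first address after the network address of an IPv4 CIDR.
first_service_ip() {
    net=${1%/*}                 # drop the /prefix-length part
    old_ifs=$IFS
    IFS=.
    set -- $net                 # split the address into four octets
    IFS=$old_ifs
    n=$(( ($1 << 24) + ($2 << 16) + ($3 << 8) + $4 + 1 ))
    echo "$(( (n >> 24) & 255 )).$(( (n >> 16) & 255 )).$(( (n >> 8) & 255 )).$(( n & 255 ))"
}

first_service_ip 192.168.254.0/24
```

If the service range is ever changed, the certificate has to be reissued with the new first address.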
- Generate the certificate
cfssl gencert -ca=ca.pem -ca-key=ca-key.pem -config=ca-config.json -profile=server apiserver-csr.json|cfssl-json -bare apiserver
This produces three files: apiserver.pem, apiserver-key.pem, and apiserver.csr.
Preparation for apiserver
mkdir -p /server/logs/k8s/apiserver
mkdir -p /server/k8s-server/certs /server/k8s-server/conf /server/k8s-server/bin
cp ca.pem ca-key.pem client-key.pem client.pem apiserver.pem apiserver-key.pem /server/k8s-server/certs/
Edit the audit policy file /server/k8s-server/conf/audit.yaml
apiVersion: audit.k8s.io/v1beta1 # This is required.
kind: Policy
# Don't generate audit events for all requests in RequestReceived stage.
omitStages:
  - "RequestReceived"
rules:
  # Log pod changes at RequestResponse level
  - level: RequestResponse
    resources:
    - group: ""
      # Resource "pods" doesn't match requests to any subresource of pods,
      # which is consistent with the RBAC policy.
      resources: ["pods"]
  # Log "pods/log", "pods/status" at Metadata level
  - level: Metadata
    resources:
    - group: ""
      resources: ["pods/log", "pods/status"]
  # Don't log requests to a configmap called "controller-leader"
  - level: None
    resources:
    - group: ""
      resources: ["configmaps"]
      resourceNames: ["controller-leader"]
  # Don't log watch requests by the "system:kube-proxy" on endpoints or services
  - level: None
    users: ["system:kube-proxy"]
    verbs: ["watch"]
    resources:
    - group: "" # core API group
      resources: ["endpoints", "services"]
  # Don't log authenticated requests to certain non-resource URL paths.
  - level: None
    userGroups: ["system:authenticated"]
    nonResourceURLs:
    - "/api*" # Wildcard matching.
    - "/version"
  # Log the request body of configmap changes in kube-system.
  - level: Request
    resources:
    - group: "" # core API group
      resources: ["configmaps"]
    # This rule only applies to resources in the "kube-system" namespace.
    # The empty string "" can be used to select non-namespaced resources.
    namespaces: ["kube-system"]
  # Log configmap and secret changes in all other namespaces at the Metadata level.
  - level: Metadata
    resources:
    - group: "" # core API group
      resources: ["secrets", "configmaps"]
  # Log all other resources in core and extensions at the Request level.
  - level: Request
    resources:
    - group: "" # core API group
    - group: "extensions" # Version of group should NOT be included.
  # A catch-all rule to log all other requests at the Metadata level.
  - level: Metadata
    # Long-running requests like watches that fall under this rule will not
    # generate an audit event in RequestReceived.
    omitStages:
      - "RequestReceived"
Deploying api-server
- Startup script /server/k8s-server/bin/startup-apiserver.sh
#!/bin/sh
# Absolute path to the binary, so the script does not depend on the working directory
/server/k8s-server/server/bin/kube-apiserver \
--apiserver-count 2 \
--audit-log-path /server/logs/k8s/apiserver/audit-log \
--audit-policy-file /server/k8s-server/conf/audit.yaml \
--authorization-mode RBAC \
--client-ca-file /server/k8s-server/certs/ca.pem \
--requestheader-client-ca-file /server/k8s-server/certs/ca.pem \
--enable-admission-plugins NamespaceLifecycle,LimitRanger,ServiceAccount,DefaultStorageClass,DefaultTolerationSeconds,MutatingAdmissionWebhook,ValidatingAdmissionWebhook,ResourceQuota \
--etcd-cafile /server/k8s-server/certs/ca.pem \
--etcd-certfile /server/k8s-server/certs/client.pem \
--etcd-keyfile /server/k8s-server/certs/client-key.pem \
--etcd-servers https://172.27.0.20:2379,https://172.27.0.21:2379,https://172.27.0.22:2379 \
--service-account-key-file /server/k8s-server/certs/ca-key.pem \
--service-cluster-ip-range 192.168.254.0/24 \
--service-node-port-range 30000-39999 \
--target-ram-mb=1024 \
--kubelet-client-certificate /server/k8s-server/certs/client.pem \
--kubelet-client-key /server/k8s-server/certs/client-key.pem \
--log-dir /server/logs/k8s/apiserver \
--tls-cert-file /server/k8s-server/certs/apiserver.pem \
--tls-private-key-file /server/k8s-server/certs/apiserver-key.pem \
--v 2
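- Process management ini file /etc/supervisord.d/apiserver.ini
[program:api-server-0.10] ; program name, used in supervisorctl to operate on this program
autorestart=True ; restart automatically after an abnormal exit
autostart=True ; start automatically when supervisord starts
redirect_stderr=True ; redirect stderr to stdout, default false
#environment=PATH="/home/app_env/bin" ; environment variables can be added through environment
command=/server/k8s-server/bin/startup-apiserver.sh ; start command, the same as starting manually on the command line
user=root ; user to run the program as
#directory=/server/k8s-server/bin ; working directory of the program
stdout_logfile=/server/logs/k8s/apiserver/apiserver-run.log
stdout_logfile_maxbytes = 200MB ; stdout log file size, default 50MB
stdout_logfile_backups = 3 ; number of stdout log file backups
; stdout log file: the program cannot start if the target directory does not exist, so create the directory manually (supervisord creates the log file itself)
numprocs=1 ; start one process
startsecs=30 ; the program must stay up for 30 seconds to count as started
startretries=3 ; retry starting 3 times
exitcodes=0,2 ; expected exit codes
stopsignal=QUIT ; signal used to stop the program
stopwaitsecs=10 ; seconds to wait for the program to stop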
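- Start through supervisor
supervisorctl update
supervisorctl status
- Deploy the other servers
This part is omitted from the document.
High availability configuration
- Layer 4 proxy configuration
The layer 4 proxy here is nginx, which must be version 1.9 or later with the stream module installed. A dedicated layer 4 proxy such as haproxy would also work, but there is no need to add another system: nginx is already present in the environment and its performance is on par with haproxy.
Note that the stream configuration cannot be placed inside the http block of nginx.conf, because the traffic is not HTTP.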
stream {
upstream apiserver {
server 172.27.0.10:6443 max_fails=3 fail_timeout=30s;
server 172.27.0.11:6443 max_fails=3 fail_timeout=30s;
}
server {
listen 7443;
proxy_connect_timeout 2s;
proxy_timeout 900s;
proxy_pass apiserver;
}
}
- keepalived check script /etc/keepalived/check_port.sh
#!/bin/sh
# Exit non-zero when nothing is listening on the port given as $1
CHK_PORT=$1
if [ -n "$CHK_PORT" ]; then
    PORT_PROCESS=`ss -lnt|grep $CHK_PORT|wc -l`
    if [ "$PORT_PROCESS" -eq 0 ]; then
        echo "Port $CHK_PORT is not used"
        exit 1
    fi
else
    echo "Check port can't be empty!"
    exit 1
fi
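One caveat with the script above: grep matches substrings, so checking 7443 would also succeed if only 17443 were listening. A stricter variant can compare the port field exactly. The sketch below reads `ss -lnt` output from stdin, which keeps it testable without a live socket; the function name port_is_listening is hypothetical, and it assumes the usual `ss` layout where the fourth column is the local address and port.

```shell
#!/bin/sh
# Succeed only if some local address in `ss -lnt` output (read from stdin)
# ends in exactly :PORT. The first line of ss output is a header and is skipped.
port_is_listening() {
    awk -v port="$1" '
        NR > 1 { n = split($4, parts, ":"); if (parts[n] == port) found = 1 }
        END { exit found ? 0 : 1 }
    '
}

# Usage inside the check script:
#   ss -lnt | port_is_listening "$CHK_PORT" || exit 1
```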
- keepalived master configuration
! Configuration File for keepalived
global_defs {
router_id 172.27.0.10
}
vrrp_script check_nginx {
script "/etc/keepalived/check_port.sh 7443"
interval 2
weight -20
}
vrrp_instance VI_1 {
state MASTER
interface eth0
virtual_router_id 251
priority 100
advert_int 1
mcast_src_ip 172.27.0.10
nopreempt # only takes effect when the initial state is BACKUP
authentication {
auth_type PASS
auth_pass 123456789
}
track_script {
check_nginx
}
virtual_ipaddress {
172.27.0.19
}
}
- keepalived backup configuration
! Configuration File for keepalived
global_defs {
router_id 172.27.0.11
}
vrrp_script check_nginx {
script "/etc/keepalived/check_port.sh 7443"
interval 2
weight -20
}
vrrp_instance VI_1 {
state BACKUP
interface eth0
virtual_router_id 251
priority 90
advert_int 1
mcast_src_ip 172.27.0.11
authentication {
auth_type PASS
auth_pass 123456789
}
track_script {
check_nginx
}
virtual_ipaddress {
172.27.0.19
}
}
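controller-manager deployment
Note: the following two services are configured without certificates because apiserver, controller-manager, and scheduler are deployed on the same server, so --master in both startup scripts can point at the HTTP service on 127.0.0.1. If the three components are not on the same server, certificates must be configured.
Preparation
mkdir /server/logs/controller-manager
Startup script /server/k8s-server/bin/startup-controller-manager.sh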
#!/bin/sh
/server/k8s-server/server/bin/kube-controller-manager \
--cluster-cidr 10.8.0.0/16 \
--leader-elect=true \
--log-dir /server/logs/controller-manager \
--master http://127.0.0.1:8080 \
--service-account-private-key-file /server/k8s-server/certs/ca-key.pem \
--service-cluster-ip-range 192.168.254.0/24 \
--root-ca-file /server/k8s-server/certs/ca.pem \
--v 2
Process management file /etc/supervisord.d/controller-manager.ini
[program:controller-manager-0.10]
command=/server/k8s-server/bin/startup-controller-manager.sh
numprocs=1
directory=/server/k8s-server/bin
autostart=true
autorestart=true
startsecs=30
startretries=3
exitcodes=0,2
stopsignal=QUIT
stopwaitsecs=10
user=root
redirect_stderr=true
stdout_logfile=/server/logs/controller-manager/controller-manager-run.log
stdout_logfile_maxbytes=200MB
stdout_logfile_backups=3
stdout_capture_maxbytes=1MB
stdout_events_enabled=false
scheduler deployment
scheduler startup script /server/k8s-server/bin/startup-scheduler.sh
#!/bin/bash
/server/k8s-server/server/bin/kube-scheduler \
--leader-elect \
--log-dir /server/logs/scheduler \
--master http://127.0.0.1:8080 \
--v 2
Preparation
mkdir /server/logs/scheduler
Process management file /etc/supervisord.d/scheduler.ini
[program:scheduler-0.10]
command=/server/k8s-server/bin/startup-scheduler.sh
numprocs=1
directory=/server/k8s-server/bin
autostart=true
autorestart=true
startsecs=30
startretries=3
exitcodes=0,2
stopsignal=QUIT
stopwaitsecs=10
user=root
redirect_stderr=true
stdout_logfile=/server/logs/scheduler/scheduler-run.log
stdout_logfile_maxbytes=200MB
stdout_logfile_backups=3
stdout_capture_maxbytes=1MB
stdout_events_enabled=false
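Deploy the other servers and start them
This part is omitted from the document. With the steps above, the k8s-master components are fully deployed.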