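Setting up k8s
Preface
I have a project to deploy, and JD Cloud happened to be running deep discounts for the 618 sale, so I bought a few servers. Together with a few cloud servers I had bought earlier, I planned to build a small k8s cluster. Because the machines belong to different cloud providers, their private networks cannot reach each other. It took a lot of fiddling, but it finally worked, so I am writing it down.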
| Cloud server | Vendor | Linux version |
|---|---|---|
| master | JD Cloud | Ubuntu 22.04 |
| jd-2h4g | JD Cloud | Ubuntu 22.04 |
| jd-4h16g | JD Cloud | Ubuntu 22.04 |
| baidu-4h8g | Baidu Cloud | Ubuntu 22.04 |
| aliyun-2h4g | Alibaba Cloud | Ubuntu 22.04 |
| txyun-2h4g | Tencent Cloud | Ubuntu 22.04 |
The goal is to run longhorn, pulsar, mongodb, mysql, redis, etcd, traefik and other services on the k8s cluster.
| k8s component | Value |
|---|---|
| k8s version | 1.30.2 |
| Bootstrap method | kubeadm 1.30.2 |
| CRI container runtime | containerd 1.7.12 |
| CNI network plugin | cilium 1.16.0, fully replacing kube-proxy, with VXLAN |
| Package manager | helm 3.15.2 |
| Storage | longhorn distributed block storage |
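So far longhorn and traefik are installed on the cluster; the other components will be added over time. (When I have time I will also try switching cilium to native routing.)
Configure passwordless SSH login
This makes it easy to copy files from my local machine to the servers. I am on a Mac and use Termius as the SSH client.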
ssh-copy-id -i .ssh/id_rsa.pub root@xxx.xxx.xxx.xxx
ssh-copy-id -i .ssh/id_rsa.pub root@xxx.xxx.xxx.xxx
ssh-copy-id -i .ssh/id_rsa.pub root@xxx.xxx.xxx.xxx
Configure kernel settings for k8s
This is the only setting needed.
# Set the required sysctl parameter; it persists across reboots
cat <<EOF | sudo tee /etc/sysctl.d/k8s.conf
net.ipv4.ip_forward = 1
EOF
# Apply the sysctl parameters without rebooting
sudo sysctl --system
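To confirm the setting took effect, the kernel's current value can be read back from procfs. The helper below is a minimal sketch; the function name `ip_forward_enabled` and its optional path argument are my own, not part of any tool.

```shell
#!/usr/bin/env bash
# Succeeds (exit 0) when IPv4 forwarding is enabled.
# The optional first argument overrides the procfs path; this is a
# hypothetical convenience so the function can be tried on a sample file.
ip_forward_enabled() {
  local path="${1:-/proc/sys/net/ipv4/ip_forward}"
  [ "$(cat "$path" 2>/dev/null)" = "1" ]
}
```

On a node it would be used as `ip_forward_enabled && echo "forwarding is on"`; `sysctl net.ipv4.ip_forward` prints the same value.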
Install containerd
It is lighter than Docker and is the runtime recommended for k8s.
sudo apt-get update && sudo apt-get install -y containerd
# Edit the containerd configuration
sudo mkdir -p /etc/containerd
sudo containerd config default | sudo tee /etc/containerd/config.toml
cd /etc/containerd
# Edit the config; see the k8s documentation for details
# SystemdCgroup = true
# Change the sandbox image to registry.aliyuncs.com/google_containers/pause:3.9
# Restart containerd
sudo systemctl restart containerd
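The two config changes described in the comments above can also be made non-interactively. This is a sketch that assumes the file was generated by `containerd config default` and therefore contains a `SystemdCgroup = false` line and a `sandbox_image = "..."` line; the function name `patch_containerd_config` is hypothetical, and it relies on GNU sed's `-i`, as shipped with Ubuntu.

```shell
#!/usr/bin/env bash
# Patches a containerd config.toml in place:
#   1. switches the runc cgroup driver to systemd
#   2. points sandbox_image at the given pause image
# Both arguments are optional; the defaults follow the values used in this post.
patch_containerd_config() {
  local cfg="${1:-/etc/containerd/config.toml}"
  local pause_image="${2:-registry.aliyuncs.com/google_containers/pause:3.9}"
  sed -i \
    -e 's/SystemdCgroup = false/SystemdCgroup = true/' \
    -e "s#sandbox_image = \".*\"#sandbox_image = \"${pause_image}\"#" \
    "$cfg"
}
```

Run it as root (for example from a root shell: `patch_containerd_config`), check the result with `grep -E 'SystemdCgroup|sandbox_image' /etc/containerd/config.toml`, and then restart containerd as above.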


Configure crictl
cat > /etc/crictl.yaml << EOF
runtime-endpoint: unix:///run/containerd/containerd.sock
image-endpoint: unix:///run/containerd/containerd.sock
EOF
Install the k8s components
sudo apt-get update
# apt-transport-https may be a dummy package; if so, you can skip installing it
apt-get update && apt-get install -y apt-transport-https
curl -fsSL https://mirrors.aliyun.com/kubernetes-new/core/stable/v1.30/deb/Release.key |
gpg --dearmor -o /etc/apt/keyrings/kubernetes-apt-keyring.gpg
echo "deb [signed-by=/etc/apt/keyrings/kubernetes-apt-keyring.gpg] https://mirrors.aliyun.com/kubernetes-new/core/stable/v1.30/deb/ /" |
tee /etc/apt/sources.list.d/kubernetes.list
sudo apt-get update
sudo apt-get install -y kubelet kubeadm kubectl
sudo apt-mark hold kubelet kubeadm kubectl
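Configure cgroup
Ubuntu needs this setting. I don't fully understand why, but it appears to be related to cgroups.
vim /etc/default/grub
# Find the GRUB_CMDLINE_LINUX line
# and append cgroup_enable=cpu to its value
# Then regenerate the GRUB config and reboot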
cat /etc/default/grub
sudo update-grub2
sudo reboot
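For several machines, editing the file by hand in vim gets tedious. The sketch below performs the same edit with sed; it assumes the file has a `GRUB_CMDLINE_LINUX="..."` line in the usual double-quoted form, and the function name `add_grub_cgroup_flag` is my own.

```shell
#!/usr/bin/env bash
# Appends cgroup_enable=cpu to the GRUB_CMDLINE_LINUX value in a grub
# defaults file. GRUB_CMDLINE_LINUX_DEFAULT is left untouched because the
# pattern requires '=' directly after GRUB_CMDLINE_LINUX.
add_grub_cgroup_flag() {
  local grub_file="${1:-/etc/default/grub}"
  # Skip if the flag is already there, so re-running does not duplicate it.
  if grep -q '^GRUB_CMDLINE_LINUX=.*cgroup_enable=cpu' "$grub_file"; then
    return 0
  fi
  sed -i 's/^GRUB_CMDLINE_LINUX="\(.*\)"/GRUB_CMDLINE_LINUX="\1 cgroup_enable=cpu"/' "$grub_file"
}
```

It needs root to modify `/etc/default/grub`, and `update-grub2` plus a reboot are still required afterwards, exactly as in the manual steps.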
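Set up a sub-interface
This step is needed because k8s, during initialization, looks for the network interface that holds the public IP. So create a sub-interface carrying the public IP, and also change the hostname so it is easy to tell which host a container runs on.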
sudo hostnamectl set-hostname master
sudo hostnamectl set-hostname jd-4h16g
sudo hostnamectl set-hostname baidu-4h8g
sudo ip addr add xxx.xxx.xxx.xxx/24 dev eth0 label eth0:0
sudo ip link set dev eth0:0 up
sudo ip addr add xxx.xxx.xxx.xxx/24 dev eth0 label eth0:0
sudo ip link set dev eth0:0 up
sudo ip addr add xxx.xxx.xxx.xxx/24 dev eth0 label eth0:0
sudo ip link set dev eth0:0 up
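Before running kubeadm it is worth confirming that the public IP really is bound on the node. The helper below is a small sketch that scans the output of `ip -o -4 addr show`; the function name `has_ipv4_addr` is hypothetical.

```shell
#!/usr/bin/env bash
# Reads `ip -o -4 addr show` output on stdin and succeeds if the given
# IPv4 address is assigned to some interface.
has_ipv4_addr() {
  local ip="$1"
  # -F matches the dots literally; the leading space and trailing slash
  # keep 1.2.3.4 from matching inside 11.2.3.45.
  grep -qF " ${ip}/"
}
```

Usage on a node: `ip -o -4 addr show | has_ipv4_addr xxx.xxx.xxx.xxx && echo bound`. Keep in mind that addresses added with `ip addr add` do not survive a reboot, so the check is also useful after restarting a machine.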
Install k8s
- Generate the init configuration file (its purpose is to pass the public IP to the kubelet and to adjust a few settings)
sudo kubeadm config print init-defaults > kubeadm-init.yml
- Edit the init configuration file
apiVersion: kubeadm.k8s.io/v1beta3
bootstrapTokens:
- groups:
  - system:bootstrappers:kubeadm:default-node-token
  token: abcdef.0123456789abcdef
  ttl: 24h0m0s
  usages:
  - signing
  - authentication
kind: InitConfiguration
localAPIEndpoint:
  advertiseAddress: xxx.xxx.xxx.xxx # master's public IP
  bindPort: 6443
nodeRegistration:
  kubeletExtraArgs:
    node-ip: xxx.xxx.xxx.xxx # master's public IP
  criSocket: unix:///var/run/containerd/containerd.sock
  imagePullPolicy: IfNotPresent
  name: master # master's node name
  taints: null
---
apiServer:
  extraArgs:
    service-node-port-range: "1-65535"
  timeoutForControlPlane: 4m0s
apiVersion: kubeadm.k8s.io/v1beta3
certificatesDir: /etc/kubernetes/pki
clusterName: kubernetes
controllerManager: {}
dns: {}
etcd:
  local:
    dataDir: /var/lib/etcd
imageRepository: registry.aliyuncs.com/google_containers # mirror registry reachable from mainland China
kind: ClusterConfiguration
kubernetesVersion: 1.30.0
networking:
  dnsDomain: cluster.local
  serviceSubnet: 10.96.0.0/12
  podSubnet: 10.244.0.0/16 # pod subnet
scheduler: {}
- Initialize the cluster
kubeadm init --config kubeadm-init.yml --skip-phases=addon/kube-proxy
- Generate the join configuration file (it passes the public IP to the kubelet on each worker node)
sudo kubeadm config print join-defaults > kubeadm-join.yml
- Edit the join configuration file
apiVersion: kubeadm.k8s.io/v1beta3
caCertPath: /etc/kubernetes/pki/ca.crt
discovery:
  bootstrapToken:
    apiServerEndpoint: x.x.x.x:6443
    token: abcdef.0123456789abcdef
    unsafeSkipCAVerification: true
  timeout: 5m0s
  tlsBootstrapToken: abcdef.0123456789abcdef
kind: JoinConfiguration
nodeRegistration:
  kubeletExtraArgs:
    node-ip: x.x.x.x
  criSocket: unix:///var/run/containerd/containerd.sock
  imagePullPolicy: IfNotPresent
  name: aliyun
  taints: null
- Join the cluster
kubeadm join --config kubeadm-join.yml
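The token in the join file has to match a token that is still valid on the control plane, and the init config above gives it a 24-hour TTL, so a join attempted later needs a fresh one (`kubeadm token create` on the master issues it). A quick format check can catch copy-paste mistakes before running the join; the sketch below assumes the standard bootstrap token shape of six characters, a dot, and sixteen characters, all lowercase letters or digits. The function name `valid_bootstrap_token` is my own.

```shell
#!/usr/bin/env bash
# Succeeds if the argument looks like a kubeadm bootstrap token,
# i.e. matches [a-z0-9]{6}.[a-z0-9]{16}. This only checks the format,
# not whether the token exists or has expired.
valid_bootstrap_token() {
  local LC_ALL=C  # make the [a-z] range locale-independent
  [[ "$1" =~ ^[a-z0-9]{6}\.[a-z0-9]{16}$ ]]
}
```

For example, `valid_bootstrap_token abcdef.0123456789abcdef` succeeds, while a token with a missing dot or a truncated second half fails.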
- Configure crictl (a more convenient command-line tool for containerd)
cat > /etc/crictl.yaml <<EOF
runtime-endpoint: unix:///run/containerd/containerd.sock
image-endpoint: unix:///run/containerd/containerd.sock
timeout: 10
debug: false
EOF
Install helm
curl https://baltocdn.com/helm/signing.asc | gpg --dearmor | sudo tee /usr/share/keyrings/helm.gpg > /dev/null
sudo apt-get install apt-transport-https --yes
echo "deb [arch=$(dpkg --print-architecture) signed-by=/usr/share/keyrings/helm.gpg] https://baltocdn.com/helm/stable/debian/ all main" | sudo tee /etc/apt/sources.list.d/helm-stable-debian.list
sudo apt-get update
sudo apt-get install helm
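Install cilium (CNI plugin)
The CNI plugin is always a critical piece, and when its installation fails, cleaning up the leftovers is painful; often the simplest fix is to reinstall the OS. I spent a long time on calico and flannel and kept running into all sorts of problems. I am just a backend developer and don't want to sink much time into this. Cilium is low-maintenance, needs no iptables or IPVS setup, and its IP allocation and monitoring are very capable. So I strongly recommend cilium; it really is hassle-free!
- Add the helm repo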
helm repo add cilium https://helm.cilium.io/
- Write the custom cilium values
cluster:
  name: kubernetes
enableIPv4Masquerade: true
enableIPv6Masquerade: true
hubble:
  relay:
    enabled: true
  ui:
    enabled: true
ipam:
  mode: kubernetes
  operator:
    clusterPoolIPv4PodCIDRList: 10.244.0.0/16 # k8s pod CIDR
ipv4NativeRoutingCIDR: 10.244.0.0/16
k8sServiceHost: xxx.xxx.xxx.xxx # master's public IP
k8sServicePort: 6443
kubeProxyReplacement: strict # fully replaces kube-proxy
operator:
  replicas: 1
serviceAccounts:
  cilium:
    name: cilium
  operator:
    name: cilium-operator
tunnel: vxlan
- Install cilium
helm install cilium cilium/cilium --namespace kube-system --values cilium-values.yml
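The pod CIDR appears both in `kubeadm-init.yml` (`podSubnet`) and in `cilium-values.yml`, and the two are easy to let drift apart when editing by hand. The sketch below compares them with plain sed; it is not a YAML parser and assumes each key sits on a single `key: value` line as in the files shown in this post. The function names `yaml_scalar` and `pod_cidr_matches` are hypothetical.

```shell
#!/usr/bin/env bash
# Prints the value of the first "key: value" line for the given key,
# dropping any trailing "# comment" and surrounding whitespace.
yaml_scalar() {
  local key="$1" file="$2"
  sed -n "s/^[[:space:]]*${key}:[[:space:]]*\([^#[:space:]]*\).*/\1/p" "$file" | head -n 1
}

# Succeeds when podSubnet in the kubeadm config equals
# clusterPoolIPv4PodCIDRList in the cilium values file.
pod_cidr_matches() {
  local kubeadm_cfg="$1" cilium_values="$2"
  local a b
  a="$(yaml_scalar podSubnet "$kubeadm_cfg")"
  b="$(yaml_scalar clusterPoolIPv4PodCIDRList "$cilium_values")"
  [ -n "$a" ] && [ "$a" = "$b" ]
}
```

Usage: `pod_cidr_matches kubeadm-init.yml cilium-values.yml || echo "pod CIDR mismatch"`. After the install itself, `kubectl -n kube-system get pods` shows whether the cilium pods have come up.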
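If cilium installs successfully but ClusterIPs are still unreachable from inside the cluster, the cause is most likely blocked ports on the servers. I recommend opening the UDP ports as well; for cilium's port requirements, see its prerequisites.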