Installing k8s on Public Cloud Servers

Setting up k8s

Preface

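I had a project to deploy, and JD Cloud happened to slash prices for the 618 sale, so I bought a few servers. Together with some cloud servers I had bought earlier, I planned to build a small k8s cluster. The machines belong to different cloud providers and their private networks cannot reach each other, so it took a lot of tinkering before it finally worked. This post records the process.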

Server         Provider         Linux version
master         JD Cloud         Ubuntu 22.04
jd-2h4g        JD Cloud         Ubuntu 22.04
jd-4h16g       JD Cloud         Ubuntu 22.04
baidu-4h8g     Baidu Cloud      Ubuntu 22.04
aliyun-2h4g    Alibaba Cloud    Ubuntu 22.04
txyun-2h4g     Tencent Cloud    Ubuntu 22.04

The goal is to run services such as longhorn, pulsar, mongodb, mysql, redis, etcd, and traefik on the k8s cluster.

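k8s components
k8s version                1.30.2
Bootstrap tool             kubeadm 1.30.2
CRI (container runtime)    containerd 1.7.12
CNI (network plugin)       cilium 1.16.0, fully replacing kube-proxy, with VXLAN
Package manager            helm 3.15.2
Storage                    distributed block storage with longhorn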

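So far longhorn and traefik are installed on the cluster; the other components will be added over time. (When I have time, I will also try switching cilium to native routing.)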

Configure passwordless SSH login

This makes it easy to copy files from my local machine to the servers. I use a Mac, with Termius as the SSH client.

ssh-copy-id -i .ssh/id_rsa.pub root@xxx.xxx.xxx.xxx
ssh-copy-id -i .ssh/id_rsa.pub root@xxx.xxx.xxx.xxx
ssh-copy-id -i .ssh/id_rsa.pub root@xxx.xxx.xxx.xxx

Configure kernel settings for k8s

Only this one setting is needed.

# Set the required sysctl parameter; it persists across reboots
cat <<EOF | sudo tee /etc/sysctl.d/k8s.conf
net.ipv4.ip_forward = 1
EOF

# Apply the sysctl parameters without rebooting
sudo sysctl --system
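
To confirm the setting took effect, the kernel's current value can be read back from /proc. A minimal sketch; the function name check_ip_forward and its optional path argument are made up for illustration and are not part of any tool:

```shell
# Hypothetical helper: succeeds when IPv4 forwarding is enabled.
# The optional argument only exists so the check can be pointed at another file.
check_ip_forward() {
  file="${1:-/proc/sys/net/ipv4/ip_forward}"
  [ "$(cat "$file")" = "1" ]
}

# Usage: check_ip_forward && echo "ip_forward is on"
```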

Install containerd

It is lighter than docker, and it is the runtime k8s recommends.

sudo apt-get update && sudo apt-get install -y containerd


# Generate the default containerd config
sudo mkdir -p /etc/containerd
sudo containerd config default | sudo tee /etc/containerd/config.toml
cd /etc/containerd

# Edit config.toml (see the k8s container runtime docs for details):
#   set SystemdCgroup = true
#   set the sandbox image to registry.aliyuncs.com/google_containers/pause:3.9

# Restart containerd
sudo systemctl restart containerd
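
The two manual edits above can also be scripted. This is a rough sketch that assumes the default config generated by containerd 1.7, where the keys appear as `SystemdCgroup = false` and `sandbox_image = "..."`; the function name patch_containerd_config is hypothetical:

```shell
# Hypothetical helper: applies both config changes to a containerd config file.
# Assumes the stock key spellings from `containerd config default` (containerd 1.7).
patch_containerd_config() {
  file="${1:-/etc/containerd/config.toml}"
  pause_image="registry.aliyuncs.com/google_containers/pause:3.9"
  # Write to a temp file first so a failed sed never truncates the real config.
  sed -e 's/SystemdCgroup = false/SystemdCgroup = true/' \
      -e "s#sandbox_image = \".*\"#sandbox_image = \"$pause_image\"#" \
      "$file" > "$file.tmp" && mv "$file.tmp" "$file"
}

# Usage (as root): patch_containerd_config && systemctl restart containerd
```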


Configure crictl

cat > /etc/crictl.yaml << EOF
runtime-endpoint: unix:///run/containerd/containerd.sock 
image-endpoint: unix:///run/containerd/containerd.sock
EOF

Install the k8s components

# apt-transport-https may be a dummy package; if so, you can skip installing it
sudo apt-get update && sudo apt-get install -y apt-transport-https
curl -fsSL https://mirrors.aliyun.com/kubernetes-new/core/stable/v1.30/deb/Release.key |
    gpg --dearmor -o /etc/apt/keyrings/kubernetes-apt-keyring.gpg
echo "deb [signed-by=/etc/apt/keyrings/kubernetes-apt-keyring.gpg] https://mirrors.aliyun.com/kubernetes-new/core/stable/v1.30/deb/ /" |
    tee /etc/apt/sources.list.d/kubernetes.list

sudo apt-get update
sudo apt-get install -y kubelet kubeadm kubectl
sudo apt-mark hold kubelet kubeadm kubectl

Configure cgroup

Ubuntu needs this setting. I do not fully understand why, but it appears to be related to cgroups.

vim /etc/default/grub

# Find the GRUB_CMDLINE_LINUX line
# and append cgroup_enable=cpu to its value

# Check the result, regenerate the GRUB config, then reboot
cat /etc/default/grub
sudo update-grub2
sudo reboot
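
The manual vim edit can also be done non-interactively. A small sketch under the assumption that /etc/default/grub contains a line of the form `GRUB_CMDLINE_LINUX="..."`; the function name add_grub_param is invented for this example:

```shell
# Hypothetical helper: appends a kernel parameter to GRUB_CMDLINE_LINUX.
# Does nothing if the parameter is already present, so it is safe to re-run.
add_grub_param() {
  param="$1"
  file="${2:-/etc/default/grub}"
  if grep -q "^GRUB_CMDLINE_LINUX=.*$param" "$file"; then
    return 0
  fi
  # Only the exact GRUB_CMDLINE_LINUX line matches; GRUB_CMDLINE_LINUX_DEFAULT is left alone.
  sed "s/^GRUB_CMDLINE_LINUX=\"\(.*\)\"/GRUB_CMDLINE_LINUX=\"\1 $param\"/" \
    "$file" > "$file.tmp" && mv "$file.tmp" "$file"
}

# Usage (as root): add_grub_param cgroup_enable=cpu && update-grub2 && reboot
```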

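Set up a sub-interface

On these cloud servers the public IP is typically not bound to any local network interface, but k8s checks at initialization that the public IP belongs to one. So we create a sub-interface that carries the public IP. We also change each hostname, which makes it easier to tell which host a container is running on.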

# Run the matching command on each host
sudo hostnamectl set-hostname master
sudo hostnamectl set-hostname jd-4h16g
sudo hostnamectl set-hostname baidu-4h8g

# On each host, bind that host's own public IP to a sub-interface
sudo ip addr add xxx.xxx.xxx.xxx/24 dev eth0 label eth0:0
sudo ip link set dev eth0:0 up

sudo ip addr add xxx.xxx.xxx.xxx/24 dev eth0 label eth0:0
sudo ip link set dev eth0:0 up

sudo ip addr add xxx.xxx.xxx.xxx/24 dev eth0 label eth0:0
sudo ip link set dev eth0:0 up
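
Addresses added with `ip addr add` do not survive a reboot, so it is worth checking that the public IP is still bound before running kubeadm. A minimal sketch; has_ipv4_addr is a hypothetical helper, and 203.0.113.10 is a placeholder from the documentation address range, not a real server:

```shell
# Hypothetical helper: reads `ip -o -4 addr show` output on stdin and
# succeeds if the given IPv4 address is assigned to some interface.
has_ipv4_addr() {
  # In the one-line format, field 4 is "address/prefix".
  awk -v ip="$1" '{ split($4, a, "/"); if (a[1] == ip) found = 1 } END { exit !found }'
}

# Usage:
#   ip -o -4 addr show | has_ipv4_addr 203.0.113.10 && echo "public IP is bound"
```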

Install k8s

  1. Generate the init config file (the goal is to pass the public IP to the kubelet and adjust a few settings)
sudo kubeadm config print init-defaults > kubeadm-init.yml
  2. Edit the init config file
apiVersion: kubeadm.k8s.io/v1beta3
bootstrapTokens:
- groups:
  - system:bootstrappers:kubeadm:default-node-token
  token: abcdef.0123456789abcdef
  ttl: 24h0m0s
  usages:
  - signing
  - authentication
kind: InitConfiguration
localAPIEndpoint:
  advertiseAddress: xxx.xxx.xxx.xxx # set to the master's public IP
  bindPort: 6443
nodeRegistration:
  kubeletExtraArgs:
    node-ip: xxx.xxx.xxx.xxx  # set to the master's public IP
  criSocket: unix:///var/run/containerd/containerd.sock
  imagePullPolicy: IfNotPresent
  name: master # node name of the master
  taints: null
---
apiServer:
  extraArgs: 
    service-node-port-range: "1-65535"
  timeoutForControlPlane: 4m0s
apiVersion: kubeadm.k8s.io/v1beta3
certificatesDir: /etc/kubernetes/pki
clusterName: kubernetes
controllerManager: {}
dns: {}
etcd:
  local:
    dataDir: /var/lib/etcd
imageRepository: registry.aliyuncs.com/google_containers # use a mirror hosted in mainland China
kind: ClusterConfiguration
kubernetesVersion: 1.30.0
networking:
  dnsDomain: cluster.local
  serviceSubnet: 10.96.0.0/12
  podSubnet: 10.244.0.0/16 # pod subnet
scheduler: {}
  3. Initialize the cluster
kubeadm init --config kubeadm-init.yml --skip-phases=addon/kube-proxy
  4. Generate the join config file (to pass the public IP to the kubelet on each worker node)
sudo kubeadm config print join-defaults > kubeadm-join.yml
  5. Edit the join config file
apiVersion: kubeadm.k8s.io/v1beta3
caCertPath: /etc/kubernetes/pki/ca.crt
discovery:
  bootstrapToken:
    apiServerEndpoint: x.x.x.x:6443
    token: abcdef.0123456789abcdef
    unsafeSkipCAVerification: true
  timeout: 5m0s
  tlsBootstrapToken: abcdef.0123456789abcdef
kind: JoinConfiguration
nodeRegistration:
  kubeletExtraArgs:
    node-ip: x.x.x.x
  criSocket: unix:///var/run/containerd/containerd.sock
  imagePullPolicy: IfNotPresent
  name: aliyun
  taints: null
  6. Join the cluster
kubeadm join --config kubeadm-join.yml
  7. Configure crictl (a handier command-line tool for working with containerd)
cat > /etc/crictl.yaml <<EOF
runtime-endpoint: unix:///run/containerd/containerd.sock
image-endpoint: unix:///run/containerd/containerd.sock
timeout: 10
debug: false
EOF
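
Once `kubeadm init` has succeeded on the master, kubectl needs a kubeconfig before it can talk to the cluster. kubeadm's own output suggests copying /etc/kubernetes/admin.conf to $HOME/.kube/config; the sketch below wraps that in a function. The name install_kubeconfig and its two optional arguments are illustrative only:

```shell
# Hypothetical helper: copies the admin kubeconfig written by kubeadm init
# to the location kubectl reads by default.
install_kubeconfig() {
  src="${1:-/etc/kubernetes/admin.conf}"
  dest="${2:-$HOME/.kube/config}"
  mkdir -p "$(dirname "$dest")" && cp "$src" "$dest" && chmod 600 "$dest"
}

# Usage on the master, as root:
#   install_kubeconfig && kubectl get nodes -o wide
```

Nodes are expected to show NotReady at this point, because no CNI plugin is installed yet.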

Install helm

curl https://baltocdn.com/helm/signing.asc | gpg --dearmor | sudo tee /usr/share/keyrings/helm.gpg > /dev/null
sudo apt-get install apt-transport-https --yes
echo "deb [arch=$(dpkg --print-architecture) signed-by=/usr/share/keyrings/helm.gpg] https://baltocdn.com/helm/stable/debian/ all main" | sudo tee /etc/apt/sources.list.d/helm-stable-debian.list
sudo apt-get update
sudo apt-get install helm

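Install cilium (CNI plugin)

The CNI plugin is always a critical piece, and when its installation fails, cleaning up afterwards is painful; often the simplest fix is to reinstall the OS. I spent a long time fighting calico and flannel and kept running into one problem after another. I am just a backend developer and do not want to sink much time into this. cilium, by contrast, is low-maintenance: there is no iptables or ipvs to configure, and its IP allocation and monitoring are very capable. So I strongly recommend cilium; it really is hassle-free!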

  1. Add the helm repo
helm repo add cilium https://helm.cilium.io/
  2. Write the custom cilium values (saved as cilium-values.yml)
cluster:
  name: kubernetes
enableIPv4Masquerade: true
enableIPv6Masquerade: true
hubble:
  relay:
    enabled: true
  ui:
    enabled: true
ipam:
  mode: kubernetes
  operator:
    clusterPoolIPv4PodCIDRList: 10.244.0.0/16 # k8s pod CIDR
ipv4NativeRoutingCIDR: 10.244.0.0/16
k8sServiceHost: xxx.xxx.xxx.xxx # public IP of the master
k8sServicePort: 6443
kubeProxyReplacement: true # fully replace kube-proxy; newer cilium releases take true/false instead of "strict"
operator:
  replicas: 1
serviceAccounts:
  cilium:
    name: cilium
  operator:
    name: cilium-operator
routingMode: tunnel # newer cilium charts split the old "tunnel: vxlan" key into these two
tunnelProtocol: vxlan
  3. Install cilium
helm install cilium cilium/cilium --namespace kube-system --values cilium-values.yml

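If cilium installs successfully but ClusterIPs still cannot be reached from inside the cluster, the servers' port rules are a likely cause. I recommend opening the UDP ports as well; for the ports cilium needs, see the prerequisites in the cilium documentation.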
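
As a starting point for the port rules, the sketch below prints ufw commands for the ports this setup most plausibly needs between nodes: 6443/tcp (kube-apiserver), 10250/tcp (kubelet), 4240/tcp (cilium health checks), 4244/tcp (Hubble), and 8472/udp (VXLAN). The list is my reading of the Kubernetes and cilium port requirements and should be checked against the docs for your versions; the function name is hypothetical, and the same ports also have to be opened in each cloud provider's security group.

```shell
# Hypothetical helper: prints (does not run) ufw commands for the assumed port list.
cilium_firewall_rules() {
  for rule in 6443/tcp 10250/tcp 4240/tcp 4244/tcp 8472/udp; do
    echo "ufw allow $rule"
  done
}

# Usage: review the output first, then apply it as root:
#   cilium_firewall_rules | sh
```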