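Ubuntu Server 24.04 Field Notes: All-in-One Installation of Kubernetes + KubeSphere
Environment Overview
- Host: self-built local desktop PC
- OS: Ubuntu Server 24.04 (no GUI)
- Hardware: RTX 3060 12GB + DDR4 3200 16GB
- Network: no VPN (overseas sites such as GitHub are unreachable)
📝 Preparation
- Source package: download the latest KubeKey source code archive from GitHub (kubekey-4.0.4.tar.gz)
- Reference: the installation guide on the official KubeSphere website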
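1️⃣ Build the KubeKey (kk) Executable
The download command in the official docs needs access to overseas sites, but my machine is a headless Linux box that cannot reach them. I therefore downloaded the source package on another machine that can reach GitHub and built the binary from it.
Steps:
- Upload the source package
  # Upload the source package to the target machine
  scp kubekey-4.0.4.tar.gz liusonglin@192.168.31.31:/home/liusonglin
- Extract and build
  Note: the build requires the Go toolchain; install it yourself beforehand. If make kk shows no progress for a long time, configure a Go module proxy mirror and retry.
  tar -zxvf kubekey-4.0.4.tar.gz
  cd kubekey-4.0.4
  # Depending on your network, decide whether to set a Go module proxy mirror
  # export GOPROXY=https://mirrors.aliyun.com/goproxy/,direct
  make kk
- Install the build artifact
  # Inspect the artifact, copy it to a system path, and make it executable
  ls -lh _output/bin/kk
  sudo cp _output/bin/kk /usr/local/bin/
  sudo chmod +x /usr/local/bin/kk
  kk version
- Check the help
  # Use the help output to learn how kk is used
  kk --help
  kk create --help
  kk create config --help
  kk create cluster --help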
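The build step in this section assumes that the required tools are already on the machine. A minimal sketch of a pre-build check follows; `missing_cmds` is a hypothetical helper name introduced here for illustration, not part of KubeKey.

```shell
# Print the name of every command in the argument list that is not on PATH.
# Prints nothing when all of them are available.
missing_cmds() {
  local cmd
  for cmd in "$@"; do
    command -v "$cmd" >/dev/null 2>&1 || printf '%s\n' "$cmd"
  done
}

# Example (not run here): check the tools used above before running `make kk`.
#   missing_cmds tar make go
```

If the example prints `go`, install the Go toolchain first; an empty result means all three tools were found.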
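2️⃣ Create the Installation Config File
Install dependencies:
# Install the external dependencies
sudo apt install socat conntrack ebtables ipset -y
Generate the config:
# Create the config file needed for the installation in the current directory
kk create config -c config-sample.yaml --with-kubernetes v1.34.3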
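3️⃣ Install the Cluster (Kubernetes v1.34.3 + KubeSphere v4.2.0)
Goal: All-in-One installation, with etcd using local storage.
1. Edit the config file (config-sample.yaml)
Change only the parts shown here and leave everything else as it is.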
apiVersion: kubekey.kubesphere.io/v1
kind: Cluster
metadata: # Added: the metadata node
  name: myCluster
spec:
  # Added: host entries. On an ARM machine an extra ARM-related parameter is also required (see the official docs). For All-in-One, list only the local machine.
  hosts:
  - {name: ubuntu-server, address: 192.168.31.31, internalAddress: "192.168.31.31", port: 22, user: yyyyyyy, password: "xxxxxxx"}
  # Added: role groups, which assign each component to its nodes
  roleGroups:
    etcd:
    - ubuntu-server
    control-plane:
    - ubuntu-server
    worker:
    - ubuntu-server
  # If set to "cn", online downloads will prioritize domestic sources when available.
  zone: "cn"
2. Run the installation
# Create the cluster from the config file
sudo kk create cluster -c config-sample.yaml
⚠️ Problems Encountered and Fixes
Problem 1: etcd fails to start
- Symptom: the log shows "Job for etcd.service failed because the control process exited with error code."
  04:19:02 UTC [roles/etcd/install] Install | Start and enable etcd systemd service
  ⠸ [localhost] failed [0s]
  04:19:03 UTC [Playbook default/create-cluster-mp22n] finish. total: 141,success: 136,ignored: 4,failed: 1
  Error: task [Install | Start and enable etcd systemd service](default/create-cluster-mp22n-vhw8c) run failed: [localhost][executor]: module run failed
  [localhost][item=<nil>][0]: stdout:
  stderr: Job for etcd.service failed because the control process exited with error code.
  See "systemctl status etcd.service" and "journalctl -xeu etcd.service" for details.
  error: exit status 1
  task [Install | Start and enable etcd systemd service](default/create-cluster-mp22n-vhw8c) run failed: [localhost][executor]: module run failed
  [localhost][item=<nil>][0]: stdout:
  stderr: Job for etcd.service failed because the control process exited with error code.
  See "systemctl status etcd.service" and "journalctl -xeu etcd.service" for details.
- Cause: inspecting /etc/etcd.env showed that it contained IPv6 (fe80::...) entries, which made etcd crash on startup in the single-node All-in-One environment.
- Fix: delete all IPv6 entries from the file, then start the etcd service again.
  sudo systemctl start etcd
  liusonglin@ubuntu-server:~$ sudo systemctl status etcd
  ● etcd.service - etcd
       Loaded: loaded (/etc/systemd/system/etcd.service; enabled; preset: enabled)
       Active: active (running) since Mon 2026-04-06 13:26:42 CST; 1h 47min ago
     Main PID: 595857 (etcd)
        Tasks: 26 (limit: 18901)
       Memory: 117.2M (peak: 137.4M)
          CPU: 2min 17.545s
       CGroup: /system.slice/etcd.service
               └─595857 /usr/local/bin/etcd
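Before editing /etc/etcd.env by hand, it can help to list exactly which lines carry link-local IPv6 addresses. The sketch below assumes the addresses are written with the `fe80:` prefix, as noted in the cause above; `find_link_local_ipv6` is a hypothetical helper name, and it only reports lines without changing the file.

```shell
# Read a file on stdin and print every line that mentions an address
# starting with fe80: (case-insensitive), prefixed with its line number.
# Exits non-zero when no such line is found, like grep itself.
find_link_local_ipv6() {
  grep -n -i 'fe80:'
}

# Example (not run here): back up the file, then list the offending lines.
#   sudo cp /etc/etcd.env /etc/etcd.env.bak
#   sudo cat /etc/etcd.env | find_link_local_ipv6
```

Reading from stdin keeps the helper usable on files that only root can read, since `sudo` cannot run a shell function directly.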
Problem 2: kubeadm init fails (malformed IPv6 address)
- Symptom: the error reads Get "https://https/%5Bfe80::...%5D:2379/version": dial tcp: lookup https...
  13:28:11 CST [Playbook default/create-cluster-fjzn5] finish. total: 214,success: 207,ignored: 6,failed: 1
  Error: task [Init | Run kubeadm init](default/create-cluster-fjzn5-cfhrd) run failed: [localhost][executor]: module run failed
  [localhost][item=<nil>][0]: stdout: [init] Using Kubernetes version: v1.34.3
  [preflight] Running pre-flight checks
  stderr: W0406 13:27:55.909841 596990 utils.go:69] The recommended value for "clusterDNS" in "KubeletConfiguration" is: [10.233.0.10]; the provided value is: [169.254.25.10]
  [preflight] Some fatal errors occurred:
  [ERROR ExternalEtcdVersion]: Get "https://https/%5Bfe80::8e32:23ff:fe6c:8518%5D:2379/version": dial tcp: lookup https on 127.0.0.53:53: server misbehaving
  [preflight] If you know what you are doing, you can make a check non-fatal with `--ignore-preflight-errors=...`
  error: error execution phase preflight: preflight checks failed
  To see the stack trace of this error execute with --v=5 or higher
  error: exit status 1
  task [Init | Run kubeadm init](default/create-cluster-fjzn5-cfhrd) run failed: [localhost][executor]: module run failed
  [localhost][item=<nil>][0]: stdout: [init] Using Kubernetes version: v1.34.3
  [preflight] Running pre-flight checks
  stderr: W0406 13:27:55.909841 596990 utils.go:69] The recommended value for "clusterDNS" in "KubeletConfiguration" is: [10.233.0.10]; the provided value is: [169.254.25.10]
  [preflight] Some fatal errors occurred:
  [ERROR ExternalEtcdVersion]: Get "https://https/%5Bfe80::8e32:23ff:fe6c:8518%5D:2379/version": dial tcp: lookup https on 127.0.0.53:53: server misbehaving
  [preflight] If you know what you are doing, you can make a check non-fatal with `--ignore-preflight-errors=...`
  error: error execution phase preflight: preflight checks failed
  To see the stack trace of this error execute with --v=5 or higher
  error: exit status 1
- Cause: /etc/hosts contained an IPv6 address mapping for the host, and KubeKey preferred the IPv6 address when generating its configuration.
  # What the checks showed
  getent hosts ubuntu-server   # returned an IPv6 address
  cat /etc/hosts               # confirmed the IPv6 mapping
- What did not work: manually deleting the line from /etc/hosts (kk runs under sudo and writes it back).
- What worked: disabling IPv6 completely.
Steps to disable IPv6 completely:
- Edit /etc/sysctl.conf.
- Set every IPv6-related setting to 1 (disabled).
- Force-save and quit (:wq!).
- Apply the network configuration:
  sudo netplan try
  sudo netplan apply
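The steps above do not name the settings that were changed. The kernel's `disable_ipv6` switches are the usual candidates, so the sketch below prints them for the `all`, `default`, and `lo` scopes. Treat the key list and the drop-in file name in the comments as assumptions, not as the exact edit made on this machine; `ipv6_disable_settings` is a hypothetical helper name.

```shell
# Print sysctl settings that turn IPv6 off for all interfaces,
# for interfaces created later (default), and for loopback (lo).
ipv6_disable_settings() {
  local scope
  for scope in all default lo; do
    printf 'net.ipv6.conf.%s.disable_ipv6 = 1\n' "$scope"
  done
}

# Example (not run here; the file name 99-disable-ipv6.conf is an assumption):
#   ipv6_disable_settings | sudo tee /etc/sysctl.d/99-disable-ipv6.conf
#   sudo sysctl --system                             # reload all sysctl config files
#   cat /proc/sys/net/ipv6/conf/all/disable_ipv6     # 1 means IPv6 is disabled
```

`sysctl --system` also re-reads /etc/sysctl.conf, so it applies settings edited there as well.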
3. Reinstall
# Remove the previous installation, then install again
sudo kk delete cluster -c config-sample.yaml
sudo kk create cluster -c config-sample.yaml
4. Verify the installation
The installation summary shows failed: 0. Check the node status:
sudo chmod 644 /etc/kubernetes/admin.conf
kubectl get nodes
Output:
NAME            STATUS   ROLES                  AGE   VERSION
ubuntu-server   Ready    control-plane,worker   93m   v1.34.3
This confirms that the Kubernetes cluster was installed successfully.
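The same check can be scripted so that it flags any node that is not ready instead of relying on reading the table. This is a minimal sketch that parses the default `kubectl get nodes` table shown above; `not_ready_nodes` is a hypothetical helper name.

```shell
# Read `kubectl get nodes` output on stdin, skip the header row, and print
# the name of every node whose STATUS column is not exactly "Ready".
not_ready_nodes() {
  awk 'NR > 1 && $2 != "Ready" { print $1 }'
}

# Example (not run here): empty output means every node reports Ready.
#   kubectl get nodes | not_ready_nodes
# kubectl can also block until the nodes are ready:
#   kubectl wait --for=condition=Ready node --all --timeout=300s
```

A status such as `Ready,SchedulingDisabled` is reported as not ready by this sketch, which is a deliberate simplification for a single-node setup.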
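4️⃣ Install KubeSphere v4.2.0
Run:
Copy the three command lines below and run them together rather than one at a time, because the helm command relies on the chart and version variables set in the same shell session.
# Official recommendation: if access to Docker Hub is restricted, append the following setting to the command to change the registry that extension component images are pulled from
chart=oci://hub.kubesphere.com.cn/kse/ks-core
version=1.2.3-20251118
helm upgrade --install -n kubesphere-system --create-namespace ks-core $chart --debug --wait --version $version --reset-values --set extension.imageRegistry=swr.cn-north-9.myhuaweicloud.com/ks
- Duration: depending on hardware and network conditions, usually 10 to 50 minutes.
- Tip: it is fine if the console prints nothing for a while; wait until the success message appears.
Success message: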
NOTES:
Thank you for choosing KubeSphere Helm Chart. Please be patient and wait for several seconds for the KubeSphere deployment to complete.
1. Wait for Deployment Completion
Confirm that all KubeSphere components are running by executing the following command:
kubectl get pods -n kubesphere-system
2. Access the KubeSphere Console
Once the deployment is complete, you can access the KubeSphere console using the following URL:
http://192.168.31.31:30880
3. Login to KubeSphere Console
Use the following credentials to log in:
Account: admin
Password: P@88w0rd
NOTE: It is highly recommended to change the default password immediately after the first login.
For additional information and details, please visit https://kubesphere.io.
✅ Final Result
Opening http://192.168.31.31:30880 from another machine on the LAN shows the KubeSphere login page.
This completes the All-in-One installation of KubeSphere v4.2.0 + Kubernetes v1.34.3 on the local machine.
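The final reachability check can also be scripted from the second machine. The sketch below is a generic retry loop; `retry` is a hypothetical helper name, the attempt count and delay in the usage comment are arbitrary, and the example assumes curl is installed on that machine.

```shell
# Run a command up to $1 times, sleeping $2 seconds between attempts.
# Returns 0 on the first success and 1 if every attempt fails.
retry() {
  local attempts=$1 delay=$2 i=1
  shift 2
  while [ "$i" -le "$attempts" ]; do
    "$@" && return 0
    # Sleep only when another attempt will follow.
    [ "$i" -lt "$attempts" ] && sleep "$delay"
    i=$((i + 1))
  done
  return 1
}

# Example (not run here): poll the console for up to about 10 minutes.
#   retry 60 10 curl -fsS -o /dev/null http://192.168.31.31:30880 && echo "console is up"
```

Passing the command as arguments keeps the loop reusable for other checks, such as waiting for a kubectl command to succeed.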