Installation and Deployment
Deployment steps reference: blog.csdn.net/networken/a…
Rook website: rook.io
GitHub repository: github.com/rook/rook
Deployment environment preparation — prerequisites:
- At least 3 nodes, all of them schedulable for pods, to satisfy Ceph's replica/high-availability requirements.
- Each node has raw disks or raw partitions available, with no filesystem formatted on them.
- If LVM is needed, the lvm2 package must be installed on the nodes. Ceph requires a Linux kernel built with the RBD module; you can test each Kubernetes node by running modprobe rbd.
- If you want to create volumes from the Ceph shared filesystem (CephFS), the recommended minimum kernel version is 4.17. (A quick per-node check for these prerequisites is sketched after this list.)
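A minimal per-node sanity check for the prerequisites above, assuming the data disks are /dev/vdb and /dev/vdc as used later in this article (adjust the device names to your environment):
# verify the kernel rbd module can be loaded (needed by the CSI RBD driver)
modprobe rbd && lsmod | grep -w rbd
# check the kernel version (>= 4.17 recommended for CephFS volumes)
uname -r
# the candidate disks must show an empty FSTYPE (no filesystem or partition signatures)
lsblk -f /dev/vdb /dev/vdc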
Start the installation:
- Download the source package:
[root@node1 ~]# git clone --single-branch --branch v1.10.5 https://github.com/rook/rook.git
Cloning into 'rook'...
remote: Enumerating objects: 83792, done.
remote: Counting objects: 100% (188/188), done.
remote: Compressing objects: 100% (104/104), done.
Receiving objects: 23% (19273/83792), 11.71 MiB | 1.15 MiB/s
[root@node3 examples]# cd /root/rook-1.10.5/deploy/examples
- Deploy the Rook operator
[root@node3 examples]# kubectl create -f crds.yaml -f common.yaml -f operator.yaml
customresourcedefinition.apiextensions.k8s.io/cephblockpoolradosnamespaces.ceph.rook.io created
customresourcedefinition.apiextensions.k8s.io/cephblockpools.ceph.rook.io created
customresourcedefinition.apiextensions.k8s.io/cephbucketnotifications.ceph.rook.io created
customresourcedefinition.apiextensions.k8s.io/cephbuckettopics.ceph.rook.io created
customresourcedefinition.apiextensions.k8s.io/cephclients.ceph.rook.io created
customresourcedefinition.apiextensions.k8s.io/cephclusters.ceph.rook.io created
customresourcedefinition.apiextensions.k8s.io/cephfilesystemmirrors.ceph.rook.io created
customresourcedefinition.apiextensions.k8s.io/cephfilesystems.ceph.rook.io created
customresourcedefinition.apiextensions.k8s.io/cephfilesystemsubvolumegroups.ceph.rook.io created
customresourcedefinition.apiextensions.k8s.io/cephnfses.ceph.rook.io created
customresourcedefinition.apiextensions.k8s.io/cephobjectrealms.ceph.rook.io created
customresourcedefinition.apiextensions.k8s.io/cephobjectstores.ceph.rook.io created
customresourcedefinition.apiextensions.k8s.io/cephobjectstoreusers.ceph.rook.io created
customresourcedefinition.apiextensions.k8s.io/cephobjectzonegroups.ceph.rook.io created
customresourcedefinition.apiextensions.k8s.io/cephobjectzones.ceph.rook.io created
customresourcedefinition.apiextensions.k8s.io/cephrbdmirrors.ceph.rook.io created
customresourcedefinition.apiextensions.k8s.io/objectbucketclaims.objectbucket.io created
customresourcedefinition.apiextensions.k8s.io/objectbuckets.objectbucket.io created
namespace/rook-ceph created
clusterrole.rbac.authorization.k8s.io/cephfs-csi-nodeplugin created
clusterrole.rbac.authorization.k8s.io/cephfs-external-provisioner-runner created
clusterrole.rbac.authorization.k8s.io/rbd-csi-nodeplugin created
clusterrole.rbac.authorization.k8s.io/rbd-external-provisioner-runner created
clusterrole.rbac.authorization.k8s.io/rook-ceph-cluster-mgmt created
clusterrole.rbac.authorization.k8s.io/rook-ceph-global created
clusterrole.rbac.authorization.k8s.io/rook-ceph-mgr-cluster created
clusterrole.rbac.authorization.k8s.io/rook-ceph-mgr-system created
clusterrole.rbac.authorization.k8s.io/rook-ceph-object-bucket created
clusterrole.rbac.authorization.k8s.io/rook-ceph-osd created
clusterrole.rbac.authorization.k8s.io/rook-ceph-system created
clusterrolebinding.rbac.authorization.k8s.io/cephfs-csi-provisioner-role created
clusterrolebinding.rbac.authorization.k8s.io/rbd-csi-nodeplugin created
clusterrolebinding.rbac.authorization.k8s.io/rbd-csi-provisioner-role created
clusterrolebinding.rbac.authorization.k8s.io/rook-ceph-global created
clusterrolebinding.rbac.authorization.k8s.io/rook-ceph-mgr-cluster created
clusterrolebinding.rbac.authorization.k8s.io/rook-ceph-object-bucket created
clusterrolebinding.rbac.authorization.k8s.io/rook-ceph-osd created
clusterrolebinding.rbac.authorization.k8s.io/rook-ceph-system created
role.rbac.authorization.k8s.io/cephfs-external-provisioner-cfg created
role.rbac.authorization.k8s.io/rbd-csi-nodeplugin created
role.rbac.authorization.k8s.io/rbd-external-provisioner-cfg created
role.rbac.authorization.k8s.io/rook-ceph-cmd-reporter created
role.rbac.authorization.k8s.io/rook-ceph-mgr created
role.rbac.authorization.k8s.io/rook-ceph-osd created
role.rbac.authorization.k8s.io/rook-ceph-purge-osd created
role.rbac.authorization.k8s.io/rook-ceph-rgw created
role.rbac.authorization.k8s.io/rook-ceph-system created
rolebinding.rbac.authorization.k8s.io/cephfs-csi-provisioner-role-cfg created
rolebinding.rbac.authorization.k8s.io/rbd-csi-nodeplugin-role-cfg created
rolebinding.rbac.authorization.k8s.io/rbd-csi-provisioner-role-cfg created
rolebinding.rbac.authorization.k8s.io/rook-ceph-cluster-mgmt created
rolebinding.rbac.authorization.k8s.io/rook-ceph-cmd-reporter created
rolebinding.rbac.authorization.k8s.io/rook-ceph-mgr created
rolebinding.rbac.authorization.k8s.io/rook-ceph-mgr-system created
rolebinding.rbac.authorization.k8s.io/rook-ceph-osd created
rolebinding.rbac.authorization.k8s.io/rook-ceph-purge-osd created
rolebinding.rbac.authorization.k8s.io/rook-ceph-rgw created
rolebinding.rbac.authorization.k8s.io/rook-ceph-system created
serviceaccount/rook-ceph-cmd-reporter created
serviceaccount/rook-ceph-mgr created
serviceaccount/rook-ceph-osd created
serviceaccount/rook-ceph-purge-osd created
serviceaccount/rook-ceph-rgw created
serviceaccount/rook-ceph-system created
serviceaccount/rook-csi-cephfs-plugin-sa created
serviceaccount/rook-csi-cephfs-provisioner-sa created
serviceaccount/rook-csi-rbd-plugin-sa created
serviceaccount/rook-csi-rbd-provisioner-sa created
configmap/rook-ceph-operator-config created
deployment.apps/rook-ceph-operator created
## Check the operator pod that was created
[root@node3 examples]# kubectl -n rook-ceph get pods
NAME READY STATUS RESTARTS AGE
rook-ceph-operator-7b4f6fd594-jdglb 1/1 Running 0 165m
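Optionally, before creating the cluster, you can block until the operator reports Ready; the app=rook-ceph-operator label is taken from the stock operator.yaml, so verify it matches your manifest:
kubectl -n rook-ceph wait --for=condition=Ready pod -l app=rook-ceph-operator --timeout=300s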
- Next, create the Ceph cluster
# add the following disks in cluster.yaml
[root@node3 examples]# vim cluster.yaml
nodes:
  - name: "node4"
    devices: # specific devices to use for storage can be specified for each node
      - name: "/dev/vdb"
      - name: "/dev/vdc"
  - name: "node5"
    devices: # specific devices to use for storage can be specified for each node
      - name: "/dev/vdb"
      - name: "/dev/vdc"
  - name: "node6"
    devices: # specific devices to use for storage can be specified for each node
      - name: "/dev/vdb"
      - name: "/dev/vdc"
[root@node3 examples]# kubectl create -f cluster.yaml
cephcluster.ceph.rook.io/rook-ceph created
[root@node3 examples]# kubectl get pods -n rook-ceph
NAME READY STATUS RESTARTS AGE
csi-cephfsplugin-2xvrs 2/2 Running 0 3h2m
csi-cephfsplugin-5bx4l 2/2 Running 0 3h2m
csi-cephfsplugin-hwrfn 2/2 Running 0 3h2m
csi-cephfsplugin-provisioner-75875b5887-cch5l 5/5 Running 0 3h2m
csi-cephfsplugin-provisioner-75875b5887-zbq2j 5/5 Running 0 3h2m
csi-rbdplugin-dbskr 2/2 Running 0 3h2m
csi-rbdplugin-fbkv7 2/2 Running 0 3h2m
csi-rbdplugin-provisioner-56d69f5d8-2vngj 5/5 Running 0 3h2m
csi-rbdplugin-provisioner-56d69f5d8-b7hd8 5/5 Running 0 3h2m
csi-rbdplugin-wwzhn 2/2 Running 0 3h2m
rook-ceph-crashcollector-node4-788d65dcc4-87z5d 1/1 Running 0 3h1m
rook-ceph-crashcollector-node5-5b87ff6fc5-bbwsm 1/1 Running 0 3h
rook-ceph-crashcollector-node6-76679cb79b-mlwnm 1/1 Running 0 3h
rook-ceph-mgr-a-59bcc59d7c-8jrcp 3/3 Running 0 3h1m
rook-ceph-mgr-b-5dbd588748-mxg65 3/3 Running 0 3h1m
rook-ceph-mon-a-544b58cd97-8td94 2/2 Running 0 3h2m
rook-ceph-mon-b-587674cd95-8gh89 2/2 Running 0 3h1m
rook-ceph-mon-c-5b5c696bfd-ghm6k 2/2 Running 0 3h1m
rook-ceph-operator-64fb475fcb-zvpfp 1/1 Running 0 3h3m
rook-ceph-osd-0-596ffcbb78-66dbm 2/2 Running 0 3h
rook-ceph-osd-1-7f5fdcff4b-vqch8 2/2 Running 0 3h
rook-ceph-osd-2-5896657d68-b7r64 2/2 Running 0 3h
rook-ceph-osd-3-784f998f77-4pfl5 2/2 Running 0 3h
rook-ceph-osd-4-d75df46df-9fxlt 2/2 Running 0 3h
rook-ceph-osd-5-6c84469594-pkfhb 2/2 Running 0 3h
rook-ceph-osd-prepare-node4-779cg 0/1 Completed 0 3h
rook-ceph-osd-prepare-node5-mh8ds 0/1 Completed 0 3h
rook-ceph-osd-prepare-node6-mcfww 0/1 Completed 0 3h
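To confirm that the CephCluster resource itself has finished reconciling, watching the CR is usually enough (the exact printed columns vary slightly between Rook versions):
kubectl -n rook-ceph get cephcluster rook-ceph -w
# expect PHASE to reach Ready and HEALTH to report HEALTH_OK once all OSDs are up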
- Configure the Ceph dashboard
[root@node3 examples]# kubectl create -f dashboard-external-http.yaml
service/rook-ceph-mgr-dashboard-external-http created
[root@node3 examples]# kubectl -n rook-ceph get service
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
rook-ceph-mgr ClusterIP 10.98.189.149 <none> 9283/TCP 33m
rook-ceph-mgr-dashboard ClusterIP 10.107.23.234 <none> 7000/TCP 33m
rook-ceph-mgr-dashboard-external-http NodePort 10.103.119.23 <none> 7000:32331/TCP 6s
rook-ceph-mon-a ClusterIP 10.101.25.118 <none> 6789/TCP,3300/TCP 34m
rook-ceph-mon-b ClusterIP 10.101.22.52 <none> 6789/TCP,3300/TCP 34m
rook-ceph-mon-c ClusterIP 10.110.72.46 <none> 6789/TCP,3300/TCP 33m
Then came the frustrating part: when accessing the dashboard, the request got a 303 redirect, and the redirect target was exactly the pod behind the service.
Testing with curl:
root@yong:/home/cyxinda/download/rook-1.10.5/deploy/examples# curl -vvv http://172.70.10.185:32331/
* Trying 172.70.10.185:32331...
* TCP_NODELAY set
* Connected to 172.70.10.185 (172.70.10.185) port 32331 (#0)
> GET / HTTP/1.1
> Host: 172.70.10.185:32331
> User-Agent: curl/7.68.0
> Accept: */*
>
* Mark bundle as not supporting multiuse
< HTTP/1.1 303 See Other
< Content-Type: text/html;charset=utf-8
< Server: Ceph-Dashboard
< Date: Wed, 16 Nov 2022 07:22:47 GMT
< Content-Security-Policy: frame-ancestors 'self';
< X-Content-Type-Options: nosniff
< Strict-Transport-Security: max-age=63072000; includeSubDomains; preload
< Location: http://10.244.139.6:7000/
< Vary: Accept-Encoding
< Content-Length: 96
<
* Connection #0 to host 172.70.10.185 left intact
This resource can be found at <a href="http://10.244.139.6:7000/">http://10.244.139.6:7000/</a>.
A Kubernetes Service is supposed to behave like a reverse proxy, so why a 303 redirect?
After some Googling, the explanation turned out to be a few sentences in the Ceph documentation:
in short, when a request reaches an mgr daemon that is in standby state, it is automatically redirected to the active mgr, which is exactly the 303 seen above.
Checking with the toolbox (see the ceph -s output later) shows that mgr.a is active and mgr.b is standby.
When a request goes through the Service and lands on mgr.b, mgr.b, being standby, redirects it to mgr.a's pod address. That pod IP ends up in the browser, which of course cannot reach a pod address inside the Kubernetes cluster, so no data comes back.
Adding a label selector in dashboard-external-http.yaml that matches only the active mgr (mgr.a) solves the problem,
and after that the Ceph dashboard is reachable from the browser.
Alternatively, the 303 redirect on standby mgrs can be disabled; since the cause was already clear, I did not experiment with that much further. A sketch of both options follows.
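A minimal sketch of both workarounds. The mgr_role=active label is an assumption based on recent Rook releases labeling the active mgr pod — check the actual pod labels first; the standby_behaviour setting is standard Ceph dashboard configuration:
# inspect the mgr pod labels and confirm which label marks the active daemon
kubectl -n rook-ceph get pods -l app=rook-ceph-mgr --show-labels
# option 1: make the external dashboard Service select only the active mgr
kubectl -n rook-ceph patch service rook-ceph-mgr-dashboard-external-http \
  --type merge -p '{"spec":{"selector":{"app":"rook-ceph-mgr","mgr_role":"active"}}}'
# option 2: have standby mgrs return an error page instead of the 303 redirect
kubectl -n rook-ceph exec -it deploy/rook-ceph-tools -- ceph config set mgr mgr/dashboard/standby_behaviour error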
The dashboard password can be retrieved with the following command:
[root@node3 examples]# kubectl -n rook-ceph get secret rook-ceph-dashboard-password -o jsonpath="{['data']['password']}" | base64 --decode && echo
(KW=6{zQg%{n1.dB)U:2
- Create the toolbox
[root@node3 examples]# kubectl create -f toolbox.yaml
deployment.apps/rook-ceph-tools created
[root@node3 examples]# kubectl get pods -n rook-ceph
NAME READY STATUS RESTARTS AGE
rook-ceph-tools-5679b7d8f-jzbkr 1/1 Running 0 179m
[root@node1 ~]# kubectl -n rook-ceph exec -it deploy/rook-ceph-tools -- bash
bash-4.4$
bash-4.4$ ceph osd status
ID HOST USED AVAIL WR OPS WR DATA RD OPS RD DATA STATE
0 node4 19.5M 499G 0 0 0 0 exists,up
1 node5 20.8M 499G 0 0 0 0 exists,up
2 node6 19.5M 499G 0 0 0 0 exists,up
3 node4 21.1M 499G 0 0 0 0 exists,up
4 node5 19.5M 499G 0 0 0 0 exists,up
5 node6 20.8M 499G 0 0 0 0 exists,up
bash-4.4$ ceph df
--- RAW STORAGE ---
CLASS SIZE AVAIL USED RAW USED %RAW USED
hdd 2.9 TiB 2.9 TiB 121 MiB 121 MiB 0
TOTAL 2.9 TiB 2.9 TiB 121 MiB 121 MiB 0
--- POOLS ---
POOL ID PGS STORED OBJECTS USED %USED MAX AVAIL
.mgr 1 1 449 KiB 2 449 KiB 0 950 GiB
bash-4.4$ ceph -s
cluster:
id: 0055fdfc-9741-40e5-b108-87e02574e98b
health: HEALTH_WARN
clock skew detected on mon.b
services:
mon: 3 daemons, quorum a,b,c (age 2h)
mgr: a(active, since 2h), standbys: b
osd: 6 osds: 6 up (since 2h), 6 in (since 2h)
data:
pools: 1 pools, 1 pgs
objects: 2 objects, 449 KiB
usage: 121 MiB used, 2.9 TiB / 2.9 TiB avail
pgs: 1 active+clean
bash-4.4$ ceph osd status
ID HOST USED AVAIL WR OPS WR DATA RD OPS RD DATA STATE
0 node4 19.5M 499G 0 0 0 0 exists,up
1 node5 20.8M 499G 0 0 0 0 exists,up
2 node6 19.5M 499G 0 0 0 0 exists,up
3 node4 21.1M 499G 0 0 0 0 exists,up
4 node5 19.5M 499G 0 0 0 0 exists,up
5 node6 20.8M 499G 0 0 0 0 exists,up
bash-4.4$ ceph health detail
HEALTH_WARN clock skew detected on mon.b
[WRN] MON_CLOCK_SKEW: clock skew detected on mon.b
mon.b clock skew 0.158903s > max 0.05s (latency 0.0186243s)
So mon.b's clock is skewed, which is what makes the Ceph cluster report unhealthy.
Looking at node6, where mon.b runs, it turned out that
the chrony client there was not syncing at all; after checking the chrony documentation,
the fix was to edit the chrony server on node1 and add the clients' IP range to the allow list,
then restart chronyd (a quick verification is sketched after the restart output):
[root@node1 rook]# vim /etc/chrony.conf
#allow 192.168.0.0/16
allow 172.0.0.0/8
[root@node1 rook]# systemctl restart chronyd
[root@node1 rook]# systemctl status chronyd
● chronyd.service - NTP client/server
Loaded: loaded (/usr/lib/systemd/system/chronyd.service; enabled; vendor preset: enabled)
Active: active (running) since Wed 2022-11-16 18:17:39 CST; 5s ago
Docs: man:chronyd(8)
man:chrony.conf(5)
Process: 22003 ExecStartPost=/usr/libexec/chrony-helper update-daemon (code=exited, status=0/SUCCESS)
Process: 21999 ExecStart=/usr/sbin/chronyd $OPTIONS (code=exited, status=0/SUCCESS)
Main PID: 22001 (chronyd)
Tasks: 1
Memory: 484.0K
CGroup: /system.slice/chronyd.service
└─22001 /usr/sbin/chronyd
Nov 16 18:17:39 node1 systemd[1]: Starting NTP client/server...
Nov 16 18:17:39 node1 chronyd[22001]: chronyd version 3.4 starting (+CMDMON +NTP +REFCLOCK +RTC +PRIVDROP +SCFILTER +SIGND +ASYNCDNS +SECHASH +IPV6 +DEBUG)
Nov 16 18:17:39 node1 chronyd[22001]: Frequency -13.820 +/- 0.026 ppm read from /var/lib/chrony/drift
Nov 16 18:17:39 node1 systemd[1]: Started NTP client/server.
Nov 16 18:17:44 node1 chronyd[22001]: Selected source 203.107.6.88
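Assuming chrony is the NTP client on every node, a quick way to verify that node6 is now syncing from node1, and that the monitors agree, is:
# on node6: the chrony server on node1 should show up as the selected source
chronyc sources -v
chronyc tracking
# inside the toolbox: Ceph's own view of monitor clock skew
ceph time-sync-status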
After waiting a few minutes, the client was back in sync.
The dashboard then shows the Ceph cluster healthy again,
and the toolbox confirms it:
[root@node1 ~]# kubectl -n rook-ceph exec -it deploy/rook-ceph-tools -- bash
bash-4.4$ ceph health detail
HEALTH_OK
bash-4.4$
About crash events
bash-4.4$ ceph health detail
HEALTH_WARN 3 mgr modules have recently crashed
[WRN] RECENT_MGR_MODULE_CRASH: 3 mgr modules have recently crashed
mgr module nfs crashed in daemon mgr.a on host rook-ceph-mgr-a-744cc86b75-n8x6w at 2022-11-17T05:34:36.364991Z
mgr module nfs crashed in daemon mgr.a on host rook-ceph-mgr-a-744cc86b75-n8x6w at 2022-11-17T05:34:39.298479Z
mgr module nfs crashed in daemon mgr.a on host rook-ceph-mgr-a-744cc86b75-wwbk5 at 2022-11-17T06:17:45.972042Z
bash-4.4$ ceph crash ls
ID ENTITY NEW
2022-11-17T05:34:36.364991Z_1257681d-877b-41e1-96fa-6519032b7f2d mgr.a *
2022-11-17T05:34:39.298479Z_bd2244fe-352e-4a12-b7a0-19f31cb94f58 mgr.a *
2022-11-17T06:17:45.972042Z_ac3ac293-5c65-496d-b7b1-43be3f0c836e mgr.a *
bash-4.4$ ceph crash info 2022-11-17T06:17:45.972042Z_ac3ac293-5c65-496d-b7b1-43be3f0c836e
{
"backtrace": [
" File \"/usr/share/ceph/mgr/nfs/module.py\", line 154, in cluster_ls\n return available_clusters(self)",
" File \"/usr/share/ceph/mgr/nfs/utils.py\", line 38, in available_clusters\n completion = mgr.describe_service(service_type='nfs')",
" File \"/usr/share/ceph/mgr/orchestrator/_interface.py\", line 1479, in inner\n completion = self._oremote(method_name, args, kwargs)",
" File \"/usr/share/ceph/mgr/orchestrator/_interface.py\", line 1546, in _oremote\n raise NoOrchestrator()",
"orchestrator._interface.NoOrchestrator: No orchestrator configured (try `ceph orch set backend`)"
],
"ceph_version": "17.2.5",
"crash_id": "2022-11-17T06:17:45.972042Z_ac3ac293-5c65-496d-b7b1-43be3f0c836e",
"entity_name": "mgr.a",
"mgr_module": "nfs",
"mgr_module_caller": "ActivePyModule::dispatch_remote cluster_ls",
"mgr_python_exception": "NoOrchestrator",
"os_id": "centos",
"os_name": "CentOS Stream",
"os_version": "8",
"os_version_id": "8",
"process_name": "ceph-mgr",
"stack_sig": "dad5ab00e8109633a6f99a44e3e3c67aa44aad3613d396e491b3ebd3ae1e9dad",
"timestamp": "2022-11-17T06:17:45.972042Z",
"utsname_hostname": "rook-ceph-mgr-a-744cc86b75-wwbk5",
"utsname_machine": "x86_64",
"utsname_release": "5.17.6-1.el7.elrepo.x86_64",
"utsname_sysname": "Linux",
"utsname_version": "#1 SMP PREEMPT Fri May 6 09:08:57 EDT 2022"
}
bash-4.4$ ceph crash archive-all
## or
bash-4.4$ ceph crash archive <id>
## This clears the unhealthy-cluster warning shown by ceph health detail
bash-4.4$ ceph crash ls
ID ENTITY NEW
2022-11-17T05:34:36.364991Z_1257681d-877b-41e1-96fa-6519032b7f2d mgr.a
2022-11-17T05:34:39.298479Z_bd2244fe-352e-4a12-b7a0-19f31cb94f58 mgr.a
2022-11-17T06:17:45.972042Z_ac3ac293-5c65-496d-b7b1-43be3f0c836e mgr.a
bash-4.4$ ceph health detail
HEALTH_OK
Setting up the orchestrator backend
bash-4.4$ ceph orch status
Error ENOENT: No orchestrator configured (try `ceph orch set backend`)
bash-4.4$ ceph mgr module enable rook
module 'rook' is already enabled
bash-4.4$ ceph orch set backend rook
bash-4.4$ ceph orch status
Backend: rook
Available: Yes
bash-4.4$ ceph orch host ls
HOST ADDR LABELS STATUS
node1 172.70.10.181/node1
node2 172.70.10.182/node2
node3 172.70.10.183/node3
node4 172.70.10.184/node4
node5 172.70.10.185/node5
Using the disks on the master nodes
The master nodes carry the taint node-role.kubernetes.io/control-plane:NoSchedule:
[root@node3 examples]# kubectl describe node/node1 -n rook-ceph
Name: node1
Roles: control-plane
Labels: beta.kubernetes.io/arch=amd64
beta.kubernetes.io/os=linux
kubernetes.io/arch=amd64
kubernetes.io/hostname=node1
kubernetes.io/os=linux
node-role.kubernetes.io/control-plane=
node.kubernetes.io/exclude-from-external-load-balancers=
Annotations: kubeadm.alpha.kubernetes.io/cri-socket: unix:///var/run/containerd/containerd.sock
node.alpha.kubernetes.io/ttl: 0
projectcalico.org/IPv4Address: 172.70.10.181/24
projectcalico.org/IPv4VXLANTunnelAddr: 10.244.166.128
volumes.kubernetes.io/controller-managed-attach-detach: true
CreationTimestamp: Wed, 16 Nov 2022 13:59:19 +0800
Taints: node-role.kubernetes.io/control-plane:NoSchedule
So tolerations need to be added in cluster.yaml:
placement:
  all:
    tolerations:
      - key: node-role.kubernetes.io/control-plane
        operator: Exists
        effect: "NoSchedule"
Then add the devices for these nodes:
nodes:
  - name: "node1"
    devices: # specific devices to use for storage can be specified for each node
      - name: "/dev/vdb"
      - name: "/dev/vdc"
  - name: "node2"
    devices: # specific devices to use for storage can be specified for each node
      - name: "/dev/vdb"
      - name: "/dev/vdc"
  - name: "node3"
    devices: # specific devices to use for storage can be specified for each node
      - name: "/dev/vdb"
      - name: "/dev/vdc"
After that, the OSD and related pods start on the master nodes:
[root@node3 examples]# kubectl get pods -n rook-ceph -o wide |grep node1
rook-ceph-crashcollector-node1-8866f454f-m5kht 1/1 Running 0 3h38m 10.244.166.161 node1 <none> <none>
rook-ceph-osd-6-66fccf8f69-lq7zw 2/2 Running 0 3h38m 10.244.166.160 node1 <none> <none>
rook-ceph-osd-8-66c98ddc5c-gnb54 2/2 Running 0 3h38m 10.244.166.158 node1 <none> <none>
rook-ceph-osd-prepare-node1-5sjvd 0/1 Completed 0 6m27s 10.244.166.185 node1 <none> <none>
[root@node3 examples]# kubectl get pods -n rook-ceph -o wide |grep node2
rook-ceph-crashcollector-node2-78c64b56bb-m44x4 1/1 Running 0 3h36m 10.244.104.10 node2 <none> <none>
rook-ceph-mgr-a-55ff688b45-pljzc 3/3 Running 0 3h42m 10.244.104.5 node2 <none> <none>
rook-ceph-osd-10-5d96f775f9-7lwpg 2/2 Running 0 3h28m 10.244.104.15 node2 <none> <none>
rook-ceph-osd-9-777c866497-74rll 2/2 Running 0 3h36m 10.244.104.9 node2 <none> <none>
rook-ceph-osd-prepare-node2-c886s 0/1 Completed 0 6m35s 10.244.104.25 node2 <none> <none>
[root@node3 examples]# kubectl get pods -n rook-ceph -o wide |grep node3
rook-ceph-crashcollector-node3-65569859c9-9rdlx 1/1 Running 0 3h38m 10.244.135.42 node3 <none> <none>
rook-ceph-mgr-b-795cbf984d-rt2k6 3/3 Running 0 3h42m 10.244.135.37 node3 <none> <none>
rook-ceph-osd-11-747d4cb964-7jsxt 2/2 Running 0 162m 10.244.135.51 node3 <none> <none>
rook-ceph-osd-7-58d996b97-kgtk8 2/2 Running 0 88m 10.244.135.56 node3 <none> <none>
rook-ceph-osd-prepare-node3-7mf55 0/1 Completed 0 6m31s 10.244.135.57 node3 <none> <none>
When an OSD fails to join the rook-ceph cluster
An error like the following appears:
Running command: /usr/bin/ceph-osd --cluster ceph --osd-objectstore bluestore --mkfs -i 9 --monmap /var/lib/ceph/osd/ceph-9/activate.monmap --keyfile - --osd-data /var/lib/ceph/osd/ceph-9/ --osd-uuid 672afdb9-ffa6-4b42-85d1-7ae476bc6d6c --setuser ceph --setgroup ceph
stderr: 2022-11-18T10:56:45.103+0000 7fb67e97d3c0 -1 bluestore(/var/lib/ceph/osd/ceph-9/) _read_fsid unparsable uuid
stderr: 2022-11-18T10:56:45.214+0000 7fb67e97d3c0 -1 bluefs _replay 0x0: stop: uuid 00000000-0000-0000-0000-000000000000 != super.uuid 592cee20-ba67-4e4b-bb10-884b0678491f, block dump:
stderr: 00000000 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
stderr: *
stderr: 00000ff0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
stderr: 00001000
stderr: 2022-11-18T10:56:49.986+0000 7fb67e97d3c0 -1 rocksdb: verify_sharding unable to list column families: NotFound:
stderr: 2022-11-18T10:56:49.986+0000 7fb67e97d3c0 -1 bluestore(/var/lib/ceph/osd/ceph-9/) _open_db erroring opening db:
stderr: 2022-11-18T10:56:50.407+0000 7fb67e97d3c0 -1 OSD::mkfs: ObjectStore::mkfs failed with error (5) Input/output error
stderr: 2022-11-18T10:56:50.407+0000 7fb67e97d3c0 -1 ** ERROR: error creating empty object store in /var/lib/ceph/osd/ceph-9/: (5) Input/output error
--> Was unable to complete a new OSD, will rollback changes
Running command: /usr/bin/ceph --cluster ceph --name client.bootstrap-osd --keyring /var/lib/ceph/bootstrap-osd/ceph.keyring osd purge-new osd.9 --yes-i-really-mean-it
stderr: purged osd.9
Traceback (most recent call last):
File "/usr/lib/python3.6/site-packages/ceph_volume/devices/raw/prepare.py", line 91, in safe_prepare
self.prepare()
File "/usr/lib/python3.6/site-packages/ceph_volume/decorators.py", line 16, in is_root
return func(*a, **kw)
File "/usr/lib/python3.6/site-packages/ceph_volume/devices/raw/prepare.py", line 134, in prepare
tmpfs,
File "/usr/lib/python3.6/site-packages/ceph_volume/devices/raw/prepare.py", line 68, in prepare_bluestore
db=db
File "/usr/lib/python3.6/site-packages/ceph_volume/util/prepare.py", line 484, in osd_mkfs_bluestore
raise RuntimeError('Command failed with exit code %!s(MISSING): %!s(MISSING)' %!((MISSING)returncode, ' '.join(command)))
RuntimeError: Command failed with exit code 250: /usr/bin/ceph-osd --cluster ceph --osd-objectstore bluestore --mkfs -i 9 --monmap /var/lib/ceph/osd/ceph-9/activate.monmap --keyfile - --osd-data /var/lib/ceph/osd/ceph-9/ --osd-uuid 672afdb9-ffa6-4b42-85d1-7ae476bc6d6c --setuser ceph --setgroup ceph
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/usr/sbin/ceph-volume", line 11, in <module>
load_entry_point('ceph-volume==1.0.0', 'console_scripts', 'ceph-volume')()
File "/usr/lib/python3.6/site-packages/ceph_volume/main.py", line 41, in __init__
self.main(self.argv)
File "/usr/lib/python3.6/site-packages/ceph_volume/decorators.py", line 59, in newfunc
return f(*a, **kw)
File "/usr/lib/python3.6/site-packages/ceph_volume/main.py", line 153, in main
terminal.dispatch(self.mapper, subcommand_args)
File "/usr/lib/python3.6/site-packages/ceph_volume/terminal.py", line 194, in dispatch
instance.main()
File "/usr/lib/python3.6/site-packages/ceph_volume/devices/raw/main.py", line 32, in main
terminal.dispatch(self.mapper, self.argv)
File "/usr/lib/python3.6/site-packages/ceph_volume/terminal.py", line 194, in dispatch
instance.main()
File "/usr/lib/python3.6/site-packages/ceph_volume/devices/raw/prepare.py", line 169, in main
self.safe_prepare(self.args)
File "/usr/lib/python3.6/site-packages/ceph_volume/devices/raw/prepare.py", line 95, in safe_prepare
rollback_osd(self.args, self.osd_id)
File "/usr/lib/python3.6/site-packages/ceph_volume/devices/lvm/common.py", line 35, in rollback_osd
Zap(['--destroy', '--osd-id', osd_id]).main()
File "/usr/lib/python3.6/site-packages/ceph_volume/devices/lvm/zap.py", line 404, in main
self.zap_osd()
File "/usr/lib/python3.6/site-packages/ceph_volume/decorators.py", line 16, in is_root
return func(*a, **kw)
File "/usr/lib/python3.6/site-packages/ceph_volume/devices/lvm/zap.py", line 301, in zap_osd
devices = find_associated_devices(self.args.osd_id, self.args.osd_fsid)
File "/usr/lib/python3.6/site-packages/ceph_volume/devices/lvm/zap.py", line 88, in find_associated_devices
'%!s(MISSING)' %!o(MISSING)sd_id or osd_fsid)
RuntimeError: Unable to find any LV for zapping OSD: 9: exit status 1}
2022-11-18 10:59:12.259144 I | ceph-cluster-controller: reconciling ceph cluster in namespace "rook-ceph"
This error usually means the device backing that OSD needs to be wiped.
First remove the device from cluster.yaml,
then run the following cleanup procedure (a one-command alternative is noted after these steps):
Reference 1: stackoverflow.com/questions/5…
Reference 2: zhuanlan.zhihu.com/p/140486398
# Mark the OSD out
bash-4.4$ ceph osd out osd.7
marked out osd.7.
# Remove the OSD from the crush map
bash-4.4$ ceph osd crush remove osd.7
device 'osd.7' does not appear in the crush map
# Delete its auth caps
bash-4.4$ ceph auth del osd.7
updated
# Remove the OSD
bash-4.4$ ceph osd rm osd.7
removed osd.7
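As a one-command alternative to the last three steps, ceph osd purge combines the crush remove, auth del and osd rm operations:
ceph osd purge osd.7 --yes-i-really-mean-it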
Then:
## Format the block device
[root@node3 mapper]# mkfs.ext4 /dev/vdc
mke2fs 1.42.9 (28-Dec-2013)
Filesystem label=
OS type: Linux
Block size=4096 (log=2)
Fragment size=4096 (log=2)
Stride=0 blocks, Stripe width=0 blocks
32768000 inodes, 131072000 blocks
6553600 blocks (5.00%) reserved for the super user
First data block=0
Maximum filesystem blocks=2279604224
4000 block groups
32768 blocks per group, 32768 fragments per group
8192 inodes per group
Superblock backups stored on blocks:
32768, 98304, 163840, 229376, 294912, 819200, 884736, 1605632, 2654208,
4096000, 7962624, 11239424, 20480000, 23887872, 71663616, 78675968,
102400000
Allocating group tables: done
Writing inode tables: done
Creating journal (32768 blocks): done
Writing superblocks and filesystem accounting information: done
## Disk cleanup procedure
[root@node3 ~]# DISK="/dev/vdc"
[root@node3 ~]# if [ ! -z $1 ];then
> DISK=$1
> fi
[root@node3 ~]# # Wipe the disk formatting
[root@node3 ~]# # DISK="/dev/vdc"
[root@node3 ~]# # Zap the disk to a fresh, usable state (zap-all is important, b/c MBR has to be clean)
[root@node3 ~]# # You will have to run this step for all disks.
[root@node3 ~]# sgdisk --zap-all $DISK
Creating new GPT entries.
GPT data structures destroyed! You may now partition the disk using fdisk or
other utilities.
[root@node3 ~]# dd if=/dev/zero of="$DISK" bs=1M count=100 oflag=direct,dsync
100+0 records in
100+0 records out
104857600 bytes (105 MB) copied, 16.5448 s, 6.3 MB/s
[root@node3 ~]# partprobe $DISK
[root@node3 ~]# kubectl get pods -n rook-ceph -o wide|grep node3
rook-ceph-crashcollector-node3-65569859c9-9rdlx 1/1 Running 0 128m 10.244.135.42 node3 <none> <none>
rook-ceph-mgr-b-795cbf984d-rt2k6 3/3 Running 0 132m 10.244.135.37 node3 <none> <none>
rook-ceph-osd-11-747d4cb964-7jsxt 2/2 Running 0 72m 10.244.135.51 node3 <none> <none>
rook-ceph-osd-7-58d996b97-vshvz 1/2 CrashLoopBackOff 28 (99s ago) 128m 10.244.135.41 node3 <none> <none>
rook-ceph-osd-prepare-node3-zzhwp 0/1 Completed 0 49m 10.244.135.55 node3 <none> <none>
[root@node3 ~]# kubectl -n rook-ceph logs pod/rook-ceph-osd-7-58d996b97-vshvz
Defaulted container "osd" out of: osd, log-collector, activate (init), chown-container-data-dir (init)
debug 2022-11-21T04:27:48.818+0000 7fc114dcd700 -1 monclient(hunting): handle_auth_bad_method server allowed_methods [2] but i only support [2]
debug 2022-11-21T04:27:48.819+0000 7fc1145cc700 -1 monclient(hunting): handle_auth_bad_method server allowed_methods [2] but i only support [2]
debug 2022-11-21T04:27:48.819+0000 7fc1155ce700 -1 monclient(hunting): handle_auth_bad_method server allowed_methods [2] but i only support [2]
failed to fetch mon config (--no-mon-config to skip)
[root@node3 ~]# kubectl -n rook-ceph delete pod/rook-ceph-osd-7-58d996b97-vshvz
pod "rook-ceph-osd-7-58d996b97-vshvz" deleted
[root@node3 ~]# kubectl get pods -n rook-ceph -o wide|grep node3
rook-ceph-crashcollector-node3-65569859c9-9rdlx 1/1 Running 0 130m 10.244.135.42 node3 <none> <none>
rook-ceph-mgr-b-795cbf984d-rt2k6 3/3 Running 0 134m 10.244.135.37 node3 <none> <none>
rook-ceph-osd-11-747d4cb964-7jsxt 2/2 Running 0 74m 10.244.135.51 node3 <none> <none>
rook-ceph-osd-7-58d996b97-kgtk8 1/2 Running 0 11s 10.244.135.56 node3 <none> <none>
rook-ceph-osd-prepare-node3-zzhwp 0/1 Completed 0 52m 10.244.135.55 node3 <none> <none>
[root@node3 examples]# kubectl get pods -n rook-ceph -o wide|grep node3
rook-ceph-crashcollector-node3-65569859c9-9rdlx 1/1 Running 0 3h25m 10.244.135.42 node3 <none> <none>
rook-ceph-mgr-b-795cbf984d-rt2k6 3/3 Running 0 3h29m 10.244.135.37 node3 <none> <none>
rook-ceph-osd-11-747d4cb964-7jsxt 2/2 Running 0 149m 10.244.135.51 node3 <none> <none>
rook-ceph-osd-7-58d996b97-kgtk8 2/2 Running 0 75m 10.244.135.56 node3 <none> <none>
rook-ceph-osd-prepare-node3-zzhwp 0/1 Completed 0 127m 10.244.135.55 node3 <none> <none>
Detailed cleanup procedure (plus a note on the Rook data directory after the script):
Reference: github.com/rook/rook/i…
#!/usr/bin/env bash
DISK="/dev/sda"
# Zap the disk to a fresh, usable state (zap-all is important, b/c MBR has to be clean)
# You will have to run this step for all disks.
sgdisk --zap-all $DISK
# Clean hdds with dd
dd if=/dev/zero of="$DISK" bs=1M count=100 oflag=direct,dsync
# Clean disks such as ssd with blkdiscard instead of dd
blkdiscard $DISK
# These steps only have to be run once on each node
# If rook sets up osds using ceph-volume, teardown leaves some devices mapped that lock the disks.
ls /dev/mapper/ceph-* | xargs -I% -- dmsetup remove %
# ceph-volume setup can leave ceph-<UUID> directories in /dev and /dev/mapper (unnecessary clutter)
rm -rf /dev/ceph-*
rm -rf /dev/mapper/ceph--*
# Inform the OS of partition table changes
partprobe $DISK
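In addition to wiping the disks, the teardown documentation also notes that each node keeps cluster state under dataDirHostPath (default /var/lib/rook); when destroying the cluster for good, remove it as well:
# on every node, only when tearing the cluster down permanently
rm -rf /var/lib/rook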
The cluster reports a MGR_MODULE_ERROR
The dashboard shows: MGR_MODULE_ERROR Module 'rook' has failed
bash-4.4$ ceph health detail
HEALTH_ERR Module 'rook' has failed: ({'type': 'ERROR', 'object': {'api_version': 'v1',
'kind': 'Status',
'metadata': {'annotations': None,
'cluster_name': None,
'creation_timestamp': None,
'deletion_grace_period_seconds': None,
'deletion_timestamp': None,
'finalizers': None,
'generate_name': None,
'generation': None,
'initializers': None,
'labels': None,
'managed_fields': None,
'name': None,
'namespace': None,
'owner_references': None,
'resource_version': None,
'self_link': None,
'uid': None},
'spec': None,
'status': {'addresses': None,
'allocatable': None,
'capacity': None,
'conditions': None,
'config': None,
'daemon_endpoints': None,
'images': None,
'node_info': None,
'phase': None,
'volumes_attached': None,
'volumes_in_use': None}}, 'raw_object': {'kind': 'Status', 'apiVersion': 'v1', 'metadata': {}, 'status': 'Failure', 'message': 'too old resource version: 1909844 (1909877)', 'reason': 'Expired', 'code': 410}})
Reason: None
Restarting the mgr, or disabling mgr modules, will clear the error message as long as the mgr itself is healthy (a restart variant is sketched after these commands):
kubectl -n rook-ceph exec -it deploy/rook-ceph-tools -- ceph status # reports HEALTH_ERR
kubectl -n rook-ceph exec -it deploy/rook-ceph-tools -- ceph mgr module disable prometheus
kubectl -n rook-ceph exec -it deploy/rook-ceph-tools -- ceph mgr module disable dashboard
kubectl -n rook-ceph exec -it deploy/rook-ceph-tools -- ceph status # reports HEALTH_OK
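If the error persists, failing over or restarting the mgr (mgr.a is the active one in this cluster) usually clears the stuck module state; a sketch:
kubectl -n rook-ceph exec -it deploy/rook-ceph-tools -- ceph mgr fail a
# or restart the mgr deployment directly
kubectl -n rook-ceph rollout restart deploy/rook-ceph-mgr-a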
About the container images:
[root@node3 examples]# cat images.txt
quay.io/ceph/ceph:v17.2.5
quay.io/cephcsi/cephcsi:v3.7.2
quay.io/csiaddons/k8s-sidecar:v0.5.0
registry.k8s.io/sig-storage/csi-attacher:v4.0.0
registry.k8s.io/sig-storage/csi-node-driver-registrar:v2.5.1
registry.k8s.io/sig-storage/csi-provisioner:v3.3.0
registry.k8s.io/sig-storage/csi-resizer:v1.6.0
registry.k8s.io/sig-storage/csi-snapshotter:v6.1.0
rook/ceph:v1.10.5
[root@node3 examples]# pwd
/root/rook-1.10.5/deploy/examples
[root@node3 examples]#
These images can be pulled through a proxy, saved to an archive, and then imported on every cluster node.
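A rough sketch of that workflow, assuming a machine with proxy access running Docker, and containerd as the runtime on the cluster nodes (as configured here):
# on the machine with proxy access: pull and bundle all images listed in images.txt
while read -r img; do docker pull "$img"; done < images.txt
docker save $(cat images.txt) -o rook-images.tar
# copy rook-images.tar to every node, then import into containerd's k8s.io namespace
ctr -n k8s.io images import rook-images.tar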
Common commands
kubectl -n rook-ceph exec -it $(kubectl -n rook-ceph get pod -l \
"app=rook-ceph-tools" -o jsonpath='{.items[0].metadata.name}') -- bin/bash
ceph osd set noout
ceph osd set nobackfill
ceph osd set norebalance
ceph osd set norecover
exit
# after maintenance, clear the flags again
ceph osd unset noout
ceph osd unset nobackfill
ceph osd unset norebalance
ceph osd unset norecover
Cleaning up Rook
See the official teardown documentation (linked below).
[root@node1 rbd]# kubectl delete -n rook-ceph cephblockpool rbdpool
cephblockpool.ceph.rook.io "rbdpool" deleted
[root@node1 rbd]# kubectl get sc
NAME PROVISIONER RECLAIMPOLICY VOLUMEBINDINGMODE ALLOWVOLUMEEXPANSION AGE
rook-ceph-block (default) rook-ceph.rbd.csi.ceph.com Delete Immediate true 27h
[root@node1 rbd]# kubectl delete storageclass rook-ceph-block
storageclass.storage.k8s.io "rook-ceph-block" deleted
[root@node1 examples]# kubectl -n rook-ceph patch cephcluster rook-ceph --type merge -p '{"spec":{"cleanupPolicy":{"confirmation":"yes-really-destroy-data"}}}'
cephcluster.ceph.rook.io/rook-ceph patched
[root@node1 examples]#
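Per the teardown documentation, after setting the cleanup policy the CephCluster CR itself must be deleted, and you should wait until it is gone before removing the operator and CRDs:
kubectl -n rook-ceph delete cephcluster rook-ceph
# wait until this returns "No resources found"
kubectl -n rook-ceph get cephcluster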
https://rook.io/docs/rook/v1.10/Getting-Started/ceph-teardown/#delete-the-block-and-file-artifacts
[root@node1 examples]# kubectl delete -f operator.yaml
configmap "rook-ceph-operator-config" deleted
deployment.apps "rook-ceph-operator" deleted
[root@node1 examples]# kubectl delete -f common.yaml
namespace "rook-ceph" deleted
clusterrole.rbac.authorization.k8s.io "cephfs-csi-nodeplugin" deleted
clusterrole.rbac.authorization.k8s.io "cephfs-external-provisioner-runner" deleted
clusterrole.rbac.authorization.k8s.io "rbd-csi-nodeplugin" deleted
clusterrole.rbac.authorization.k8s.io "rbd-external-provisioner-runner" deleted
clusterrole.rbac.authorization.k8s.io "rook-ceph-cluster-mgmt" deleted
clusterrole.rbac.authorization.k8s.io "rook-ceph-global" deleted
clusterrole.rbac.authorization.k8s.io "rook-ceph-mgr-cluster" deleted
clusterrole.rbac.authorization.k8s.io "rook-ceph-mgr-system" deleted
clusterrole.rbac.authorization.k8s.io "rook-ceph-object-bucket" deleted
clusterrole.rbac.authorization.k8s.io "rook-ceph-osd" deleted
clusterrole.rbac.authorization.k8s.io "rook-ceph-system" deleted
clusterrolebinding.rbac.authorization.k8s.io "cephfs-csi-provisioner-role" deleted
clusterrolebinding.rbac.authorization.k8s.io "rbd-csi-nodeplugin" deleted
clusterrolebinding.rbac.authorization.k8s.io "rbd-csi-provisioner-role" deleted
clusterrolebinding.rbac.authorization.k8s.io "rook-ceph-global" deleted
clusterrolebinding.rbac.authorization.k8s.io "rook-ceph-mgr-cluster" deleted
clusterrolebinding.rbac.authorization.k8s.io "rook-ceph-object-bucket" deleted
clusterrolebinding.rbac.authorization.k8s.io "rook-ceph-osd" deleted
clusterrolebinding.rbac.authorization.k8s.io "rook-ceph-system" deleted
role.rbac.authorization.k8s.io "cephfs-external-provisioner-cfg" deleted
role.rbac.authorization.k8s.io "rbd-csi-nodeplugin" deleted
role.rbac.authorization.k8s.io "rbd-external-provisioner-cfg" deleted
role.rbac.authorization.k8s.io "rook-ceph-cmd-reporter" deleted
role.rbac.authorization.k8s.io "rook-ceph-mgr" deleted
role.rbac.authorization.k8s.io "rook-ceph-osd" deleted
role.rbac.authorization.k8s.io "rook-ceph-purge-osd" deleted
role.rbac.authorization.k8s.io "rook-ceph-rgw" deleted
role.rbac.authorization.k8s.io "rook-ceph-system" deleted
rolebinding.rbac.authorization.k8s.io "cephfs-csi-provisioner-role-cfg" deleted
rolebinding.rbac.authorization.k8s.io "rbd-csi-nodeplugin-role-cfg" deleted
rolebinding.rbac.authorization.k8s.io "rbd-csi-provisioner-role-cfg" deleted
rolebinding.rbac.authorization.k8s.io "rook-ceph-cluster-mgmt" deleted
rolebinding.rbac.authorization.k8s.io "rook-ceph-cmd-reporter" deleted
rolebinding.rbac.authorization.k8s.io "rook-ceph-mgr" deleted
rolebinding.rbac.authorization.k8s.io "rook-ceph-mgr-system" deleted
rolebinding.rbac.authorization.k8s.io "rook-ceph-osd" deleted
rolebinding.rbac.authorization.k8s.io "rook-ceph-purge-osd" deleted
rolebinding.rbac.authorization.k8s.io "rook-ceph-rgw" deleted
rolebinding.rbac.authorization.k8s.io "rook-ceph-system" deleted
serviceaccount "rook-ceph-cmd-reporter" deleted
serviceaccount "rook-ceph-mgr" deleted
serviceaccount "rook-ceph-osd" deleted
serviceaccount "rook-ceph-purge-osd" deleted
serviceaccount "rook-ceph-rgw" deleted
serviceaccount "rook-ceph-system" deleted
serviceaccount "rook-csi-cephfs-plugin-sa" deleted
serviceaccount "rook-csi-cephfs-provisioner-sa" deleted
serviceaccount "rook-csi-rbd-plugin-sa" deleted
serviceaccount "rook-csi-rbd-provisioner-sa" deleted
[root@node1 examples]# kubectl delete -f psp.yaml
Error from server (NotFound): error when deleting "psp.yaml": clusterroles.rbac.authorization.k8s.io "psp:rook" not found
Error from server (NotFound): error when deleting "psp.yaml": clusterrolebindings.rbac.authorization.k8s.io "rook-ceph-system-psp" not found
Error from server (NotFound): error when deleting "psp.yaml": clusterrolebindings.rbac.authorization.k8s.io "rook-csi-cephfs-plugin-sa-psp" not found
Error from server (NotFound): error when deleting "psp.yaml": clusterrolebindings.rbac.authorization.k8s.io "rook-csi-cephfs-provisioner-sa-psp" not found
Error from server (NotFound): error when deleting "psp.yaml": clusterrolebindings.rbac.authorization.k8s.io "rook-csi-rbd-plugin-sa-psp" not found
Error from server (NotFound): error when deleting "psp.yaml": clusterrolebindings.rbac.authorization.k8s.io "rook-csi-rbd-provisioner-sa-psp" not found
Error from server (NotFound): error when deleting "psp.yaml": rolebindings.rbac.authorization.k8s.io "rook-ceph-cmd-reporter-psp" not found
Error from server (NotFound): error when deleting "psp.yaml": rolebindings.rbac.authorization.k8s.io "rook-ceph-default-psp" not found
Error from server (NotFound): error when deleting "psp.yaml": rolebindings.rbac.authorization.k8s.io "rook-ceph-mgr-psp" not found
Error from server (NotFound): error when deleting "psp.yaml": rolebindings.rbac.authorization.k8s.io "rook-ceph-osd-psp" not found
Error from server (NotFound): error when deleting "psp.yaml": rolebindings.rbac.authorization.k8s.io "rook-ceph-purge-osd-psp" not found
Error from server (NotFound): error when deleting "psp.yaml": rolebindings.rbac.authorization.k8s.io "rook-ceph-rgw-psp" not found
resource mapping not found for name: "00-rook-privileged" namespace: "" from "psp.yaml": no matches for kind "PodSecurityPolicy" in version "policy/v1beta1"
ensure CRDs are installed first
[root@node1 examples]# kubectl delete -f crds.yaml
customresourcedefinition.apiextensions.k8s.io "cephblockpoolradosnamespaces.ceph.rook.io" deleted
customresourcedefinition.apiextensions.k8s.io "cephblockpools.ceph.rook.io" deleted
customresourcedefinition.apiextensions.k8s.io "cephbucketnotifications.ceph.rook.io" deleted
customresourcedefinition.apiextensions.k8s.io "cephbuckettopics.ceph.rook.io" deleted
customresourcedefinition.apiextensions.k8s.io "cephclients.ceph.rook.io" deleted
customresourcedefinition.apiextensions.k8s.io "cephclusters.ceph.rook.io" deleted
customresourcedefinition.apiextensions.k8s.io "cephfilesystemmirrors.ceph.rook.io" deleted
customresourcedefinition.apiextensions.k8s.io "cephfilesystems.ceph.rook.io" deleted
customresourcedefinition.apiextensions.k8s.io "cephfilesystemsubvolumegroups.ceph.rook.io" deleted
customresourcedefinition.apiextensions.k8s.io "cephnfses.ceph.rook.io" deleted
customresourcedefinition.apiextensions.k8s.io "cephobjectrealms.ceph.rook.io" deleted
customresourcedefinition.apiextensions.k8s.io "cephobjectstores.ceph.rook.io" deleted
customresourcedefinition.apiextensions.k8s.io "cephobjectstoreusers.ceph.rook.io" deleted
customresourcedefinition.apiextensions.k8s.io "cephobjectzonegroups.ceph.rook.io" deleted
customresourcedefinition.apiextensions.k8s.io "cephobjectzones.ceph.rook.io" deleted
customresourcedefinition.apiextensions.k8s.io "cephrbdmirrors.ceph.rook.io" deleted
customresourcedefinition.apiextensions.k8s.io "objectbucketclaims.objectbucket.io" deleted
customresourcedefinition.apiextensions.k8s.io "objectbuckets.objectbucket.io" deleted
[root@node1 examples]#
[root@node1 examples]# kubectl get ns
While adding OSDs there were repeated failures; it is safest to remove all taints from the nodes (or add matching tolerations) to avoid pod scheduling problems.
Setting other limits
docs.mirantis.com/container-c…
Mapping between Ceph disks and OSDs: ceph device ls | sort -k2