Deploying a Ceph Cluster on Rocky 9


Three servers

192.168.0.184 node1  /dev/sda is an extra disk added to the node
192.168.0.100 node3  /dev/sda is an extra disk added to the node
192.168.0.164 node4  /dev/sda is an extra disk added to the node

Role 1

All three servers act as the cluster's OSD storage servers

192.168.0.184 node1  /dev/sda is an extra disk added to the node
192.168.0.100 node3  /dev/sda is an extra disk added to the node
192.168.0.164 node4  /dev/sda is an extra disk added to the node

Role 2

All three servers act as the cluster's Mon monitor servers

192.168.0.184 node1
192.168.0.100 node3
192.168.0.164 node4

Role 3

All three servers run ceph-mgr manager daemons

192.168.0.184 node1
192.168.0.100 node3
192.168.0.164 node4

Time synchronization

# Set the timezone to Asia/Shanghai
timedatectl set-timezone Asia/Shanghai

# Install chrony for time synchronization
yum install chrony -y

# Enable and start chronyd (the service unit is named chronyd on Rocky/RHEL)
systemctl enable --now chronyd
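
Optionally confirm that chrony can reach its time sources (output varies by environment):

# list the configured time sources and their reachability
chronyc sources -v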

Configure hosts resolution

Edit /etc/hosts and add the following entries

192.168.0.184 node1  
192.168.0.100 node3  
192.168.0.164 node4 
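
ceph-deploy drives the other nodes over passwordless SSH (the deployment log further down notes "making sure passwordless SSH succeeds"), so set up key-based login from node1 first if it is not already in place. A minimal sketch, assuming everything runs as root:

# generate a key pair non-interactively (skip if one already exists)
ssh-keygen -t ed25519 -N '' -f ~/.ssh/id_ed25519

# copy the public key to every node, including node1 itself
for host in node1 node3 node4; do ssh-copy-id root@$host; done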

Configure a Ceph mirror in China

Add the Ceph repository configuration file

All three servers need to be configured with the repository below

tee /etc/yum.repos.d/ceph.repo  << EOF
[ceph]
name=Ceph packages
baseurl=https://mirrors.tuna.tsinghua.edu.cn/ceph/rpm-reef/el9/x86_64/
enabled=1
priority=2
gpgcheck=0

[ceph-noarch]
name=Ceph noarch packages
baseurl=https://mirrors.tuna.tsinghua.edu.cn/ceph/rpm-reef/el9/noarch/
enabled=1
priority=2
gpgcheck=0

[ceph-source]
name=Ceph source packages
baseurl=https://mirrors.tuna.tsinghua.edu.cn/ceph/rpm-reef/el9/SRPMS/
enabled=0
priority=2
gpgcheck=0
EOF

Update the repositories

[root@node1 ~]# dnf update

Install ceph-deploy

Create a working directory

ceph-deploy writes its output files into the current directory, so run the subsequent ceph-deploy commands from this directory.

[root@node1 ~]# mkdir cephcluster
[root@node1 ~]# cd cephcluster/
[root@node1 cephcluster]# 

Install ceph-deploy

# Install python3 and pip
dnf install python3-pip -y

# Point pip at a mirror in China
pip config set global.index-url https://pypi.tuna.tsinghua.edu.cn/simple

# Clone ceph-deploy
git clone https://github.com/ceph/ceph-deploy.git


cd ceph-deploy 

# Before running the commands below, edit ceph_deploy/hosts/__init__.py to add
# support for Rocky; see the fix for Problem 1 at the end of this article.
# If you want ceph-deploy to deploy Ceph directly, you also need to edit this file:
# vim /root/cephcluster/ceph-deploy/ceph_deploy/install.py
# and change args.release = 'nautilus' to args.release = 'reef'
pip3 install setuptools 
python3 setup.py install
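
If the installation succeeded, ceph-deploy should now be on the PATH:

# should print 2.1.0, the version shown in the deployment logs below
ceph-deploy --version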

Run the following in the cephcluster directory.

Purpose of the new command: Start deploying a new cluster, and write a CLUSTER.conf and keyring for it

[root@node1 cephcluster]# ceph-deploy new node1 node3 node4

Edit the Ceph configuration file

root@node1:~/cephcluster/ceph-deploy# cat ceph.conf 
[global]
fsid = 9e7b59a6-c3ee-43d4-9baf-60d5bb05484a
mon_initial_members = node1,node3,node4
mon_host = 192.168.0.184,192.168.0.100,192.168.0.164
auth_cluster_required = cephx
auth_service_required = cephx
auth_client_required = cephx
public_network=192.168.0.0/24
# number of replicas
osd_pool_default_size = 3
# minimum number of replicas
osd_pool_default_min_size = 2
# allow up to 0.5 s of monitor clock drift
mon_clock_drift_allowed = .50

Install the required packages on every node

# Install the EPEL repository
dnf install epel-release

# Install the Ceph packages
ceph-deploy install --no-adjust-repos --nogpgcheck node1 node3 node4
# Option notes
# install: install the Ceph packages on the given nodes
# --no-adjust-repos: do not replace the repo files on the nodes, since the Tsinghua mirror is already configured
# --nogpgcheck: skip GPG signature verification

Initialize the monitors and generate keys

ceph-deploy mon create-initial

This generates the following files

[root@node1 cephcluster]# ls
ceph.bootstrap-mds.keyring  ceph.bootstrap-osd.keyring  ceph.client.admin.keyring  ceph-deploy           ceph.mon.keyring
ceph.bootstrap-mgr.keyring  ceph.bootstrap-rgw.keyring  ceph.conf                  ceph-deploy-ceph.log

Use ceph-deploy to copy the configuration file and admin keyring to the admin node and the Ceph nodes

ceph-deploy admin node1  node3 node4
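
If the ceph CLI will also be run by non-root users, the upstream quick start additionally makes the admin keyring readable on each node (optional; skip if everything runs as root):

# run on every node that should be able to execute ceph commands
chmod +r /etc/ceph/ceph.client.admin.keyring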

Deploy the mgr service

ceph-deploy mgr create node1  node3 node4

Check the current cluster status

[root@node1 cephcluster]# ceph status
  cluster:
    id:     9e7b59a6-c3ee-43d4-9baf-60d5bb05484a
    health: HEALTH_WARN
            mons are allowing insecure global_id reclaim
 
  services:
    mon: 3 daemons, quorum node3,node4,node1 (age 81s)
    mgr: node1(active, since 3s), standbys: node3
    osd: 0 osds: 0 up, 0 in
 
  data:
    pools:   0 pools, 0 pgs
    objects: 0 objects, 0 B
    usage:   0 B used, 0 B / 0 B avail
    pgs:

Fix the "mons are allowing insecure global_id reclaim" warning

ceph config set mon auth_allow_insecure_global_id_reclaim false
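
Once the setting takes effect, the warning should clear:

# the cluster should now report HEALTH_OK
ceph health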

Add OSDs

List the disks on the nodes where the OSDs will be deployed

ceph-deploy disk list node1  node3 node4 

Zap (wipe) the disks

ceph-deploy disk zap node1 /dev/sda
ceph-deploy disk zap node3 /dev/sda
ceph-deploy disk zap node4 /dev/sda

Add the disks

# Add the OSD on node1
ceph-deploy osd create node1 --data /dev/sda

# Add the OSD on node3
ceph-deploy osd create node3 --data /dev/sda

# Add the OSD on node4
ceph-deploy osd create node4 --data /dev/sda
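
Optionally confirm that all three OSDs registered and came up (IDs and weights depend on your disks):

# show the CRUSH tree; expect one OSD per host, all with status "up"
ceph osd tree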

Check the Ceph cluster status

[root@node1 cephcluster]# ceph status
  cluster:
    id:     9e7b59a6-c3ee-43d4-9baf-60d5bb05484a
    health: HEALTH_OK
 
  services:
    mon: 3 daemons, quorum node3,node4,node1 (age 7m)
    mgr: node1(active, since 6m), standbys: node4, node3
    osd: 3 osds: 3 up (since 19s), 3 in (since 34s)
 
  data:
    pools:   1 pools, 1 pgs
    objects: 2 objects, 769 KiB
    usage:   82 MiB used, 477 GiB / 477 GiB avail
    pgs:     1 active+clean
 
  io:
    client:   4.5 KiB/s rd, 255 KiB/s wr, 4 op/s rd, 10 op/s wr

Problems encountered

Problem 1:

An unsupported platform error appeared

[root@node1 ceph-deploy]# ceph-deploy new node1 node3 node4
[ceph_deploy.conf][DEBUG ] found configuration file at: /root/.cephdeploy.conf
[ceph_deploy.cli][INFO  ] Invoked (2.1.0): /usr/local/bin/ceph-deploy new node1 node3 node4
[ceph_deploy.cli][INFO  ] ceph-deploy options:
[ceph_deploy.cli][INFO  ]  verbose                       : False
[ceph_deploy.cli][INFO  ]  quiet                         : False
[ceph_deploy.cli][INFO  ]  username                      : None
[ceph_deploy.cli][INFO  ]  overwrite_conf                : False
[ceph_deploy.cli][INFO  ]  ceph_conf                     : None
[ceph_deploy.cli][INFO  ]  cluster                       : ceph
[ceph_deploy.cli][INFO  ]  mon                           : ['node1', 'node3', 'node4']
[ceph_deploy.cli][INFO  ]  ssh_copykey                   : True
[ceph_deploy.cli][INFO  ]  fsid                          : None
[ceph_deploy.cli][INFO  ]  cluster_network               : None
[ceph_deploy.cli][INFO  ]  public_network                : None
[ceph_deploy.cli][INFO  ]  cd_conf                       : <ceph_deploy.conf.cephdeploy.Conf object at 0x7ff2057cb730>
[ceph_deploy.cli][INFO  ]  default_release               : False
[ceph_deploy.cli][INFO  ]  func                          : <function new at 0x7ff2055201f0>
[ceph_deploy.new][DEBUG ] Creating new cluster named ceph
[ceph_deploy.new][INFO  ] making sure passwordless SSH succeeds
[node1][DEBUG ] connected to host: node1 
[ceph_deploy][ERROR ] UnsupportedPlatform: Platform is not supported: rocky blue onyx 9.2

Solution

Edit the ceph_deploy/hosts/__init__.py file and add the following

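The original screenshot is lost; a commonly used patch (an assumption, not necessarily the author's exact change) maps Rocky onto the existing CentOS handler in ceph-deploy:

# In ceph_deploy/hosts/__init__.py, inside the distributions dict of _get_distro(), add:
'rocky': centos,    # treat Rocky Linux like CentOS/RHEL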

Problem 2

When adding a ceph-mon, ceph-deploy reports that the host is not among the initial members

Solution

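The original screenshot is lost. A common way around this (an assumption, not necessarily the author's fix) is to add the new host to mon_initial_members and mon_host in ceph.conf in the working directory, push the updated config, and add the monitor with mon add instead of mon create:

# after editing ceph.conf in the working directory
ceph-deploy --overwrite-conf config push node1 node3 node4
ceph-deploy mon add node2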

Problem 3

[ceph_deploy.cli][INFO  ]  mon                           : ['node2']
[ceph_deploy.mon][INFO  ] ensuring configuration of new mon host: node2
[ceph_deploy.admin][DEBUG ] Pushing admin keys and conf to node2
[node2][DEBUG ] connected to host: node2 
[ceph_deploy.mon][DEBUG ] Adding mon to cluster ceph, host node2
[ceph_deploy.mon][DEBUG ] using mon address by resolving host: 192.168.0.101
[ceph_deploy.mon][DEBUG ] detecting platform for host node2 ...
[node2][DEBUG ] connected to host: node2 
[ceph_deploy.mon][INFO  ] distro info: rocky 9.2 blue onyx
[node2][DEBUG ] determining if provided host has same hostname in remote
[node2][DEBUG ] adding mon to node2
[node2][DEBUG ] checking for done path: /var/lib/ceph/mon/ceph-node2/done
[node2][INFO  ] Running command: systemctl enable ceph.target
[node2][INFO  ] Running command: systemctl enable ceph-mon@node2
[node2][INFO  ] Running command: systemctl start ceph-mon@node2
[node2][INFO  ] Running command: ceph --cluster=ceph --admin-daemon /var/run/ceph/ceph-mon.node2.asok mon_status
[node2][ERROR ] admin_socket: exception getting command descriptions: [Errno 2] No such file or directory
[node2][WARNIN] monitor node2 does not exist in monmap
[node2][WARNIN] neither `public_addr` nor `public_network` keys are defined for monitors
[node2][WARNIN] monitors may not be able to form quorum
[node2][INFO  ] Running command: ceph --cluster=ceph --admin-daemon /var/run/ceph/ceph-mon.node2.asok mon_status
[node2][ERROR ] admin_socket: exception getting command descriptions: [Errno 2] No such file or directory
[node2][WARNIN] monitor: mon.node2, might not be running yet
[root@node1 cephcluster]# ls /var/run/ceph/ceph-


Solution
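
The screenshot for this one is also lost, but the log points at the cause: "neither public_addr nor public_network keys are defined for monitors". A plausible fix based on that warning: make sure public_network is set under [global] in ceph.conf (as in the configuration shown earlier), push the updated config to the new host, and retry adding the monitor:

# push ceph.conf (with public_network set) to the new mon host, then retry
ceph-deploy --overwrite-conf config push node2
ceph-deploy mon add node2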