1. MDSs report oversized cache / clients failing to respond to cache pressure
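# 1. Check the MDS perf counters: the MDS holds more than 4 million inodes
ceph daemon mds.node-mds01 perf dump mds
# 2. Live MDS performance monitoring shows more than 5 million inodes
ceph daemonperf mds.node-mds01
# 3. Check the cache settings: mds_cache_memory_limit is 1 GB and the warning threshold is 1.5 (a warning is raised once the cache reaches 1.5 times the limit)
ceph daemon mds.node-mds01 config show | grep mds_cache
# 4. Raise the limit in ceph.conf (10737418240 bytes = 10 GiB), then restart the MDS: systemctl restart ceph-mds@`hostname`
[mds]
mds cache memory limit = 10737418240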
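The limit is given in bytes, so it helps to check the arithmetic before editing ceph.conf. A minimal sketch, assuming the 1.5 warning threshold noted in step 3; `gib_to_bytes` and `cache_warn_bytes` are hypothetical helper names, not part of the ceph CLI:

```shell
#!/usr/bin/env bash
# Hypothetical helpers for illustration; they do not call ceph.

# Convert a whole number of GiB to bytes.
gib_to_bytes() {
    echo $(( $1 * 1024 * 1024 * 1024 ))
}

# Cache size in bytes at which the oversized-cache warning would fire,
# given a limit in bytes and a threshold multiplier (1.5 in these notes).
cache_warn_bytes() {
    awk -v limit="$1" -v factor="$2" 'BEGIN { printf "%.0f\n", limit * factor }'
}

gib_to_bytes 10                            # 10737418240
cache_warn_bytes "$(gib_to_bytes 1)" 1.5   # 1610612736
```

With the original 1 GB limit the warning fires at roughly 1.5 GiB of cache, which a cache of several million inodes can easily exceed.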
2. ceph osd full/nearfull
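Expand capacity
# Raise the OSD full ratio (the maximum writable fraction of capacity): 80% before the change, 85% after
# The value is a fraction between 0 and 1, so pass 0.85, not 85
$ ceph tell osd.* injectargs '--mon-osd-full-ratio 0.85'
# Pre-Luminous releases:
$ ceph pg set_full_ratio 0.85
# Luminous and later:
$ ceph osd set-full-ratio 0.85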
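Because the full ratio is a fraction between 0 and 1, a percentage such as 85 is easy to pass by mistake. A minimal sketch of a guard that could run before any of the commands above; `is_valid_ratio` is a hypothetical helper, not a ceph subcommand:

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed only if the argument is a decimal in (0, 1].
is_valid_ratio() {
    awk -v r="$1" 'BEGIN {
        ok = (r ~ /^[0-9]*\.?[0-9]+$/) && (r + 0 > 0) && (r + 0 <= 1)
        exit (ok ? 0 : 1)
    }'
}

for ratio in 0.85 85; do
    if is_valid_ratio "$ratio"; then
        echo "$ratio: ok"
    else
        echo "$ratio: rejected, expected a fraction such as 0.85"
    fi
done
```

The check is purely local; it only validates the number and leaves the cluster untouched.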
Adjust OSD weights
# Show the weight of each OSD in the cluster
$ ceph osd tree
ID WEIGHT   TYPE NAME             UP/DOWN REWEIGHT PRIMARY-AFFINITY
-1 18.43690 root default
-2  2.63699     host controller-3
 0  0.87900         osd.0              up  1.00000          1.00000
 2  0.87900         osd.2              up  1.00000          1.00000
 3  0.87900         osd.3              up  1.00000          1.00000
-3  2.63699     host controller-1
 1  0.87900         osd.1              up  1.00000          1.00000
 4  0.87900         osd.4              up  1.00000          1.00000
15  0.87900         osd.15             up  1.00000          1.00000
# Example: set the CRUSH weight of osd.15 to 0.7
$ ceph osd crush reweight osd.15 0.7
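Changing an OSD's CRUSH weight also changes the weight of the host bucket that contains it. A rough sketch of that arithmetic for host controller-1, using the weights from the tree output above; `sum_weights` is a hypothetical helper and does not query the cluster:

```shell
#!/usr/bin/env bash
# Hypothetical helper: add up the CRUSH weights passed as arguments.
sum_weights() {
    printf '%s\n' "$@" | awk '{ total += $1 } END { printf "%.3f\n", total }'
}

# controller-1 holds osd.1, osd.4 and osd.15.
sum_weights 0.879 0.879 0.879   # before the reweight: 2.637
sum_weights 0.879 0.879 0.7     # after osd.15 is set to 0.7: 2.458
```

The lower host weight means CRUSH directs proportionally less data to controller-1 as a whole, not only to osd.15.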
Difference between ceph osd crush reweight <osd> <weight> and ceph osd reweight <osd> <weight>:
"ceph osd crush reweight" sets the CRUSH weight of the OSD. This
weight is an arbitrary value (generally the size of the disk in TB or
something) and controls how much data the system tries to allocate to
the OSD.
"ceph osd reweight" sets an override weight on the OSD. This value is
in the range 0 to 1, and forces CRUSH to re-place (1-weight) of the
data that would otherwise live on this drive. It does *not* change the
weights assigned to the buckets above the OSD, and is a corrective
measure in case the normal CRUSH distribution isn't working out quite
right. (For instance, if one of your OSDs is at 90% and the others are
at 50%, you could reduce this weight to try and compensate for it.)
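To make the override weight concrete: since it re-places (1-weight) of the data, an OSD's utilization after the change is roughly its current utilization times the weight. A back-of-the-envelope sketch only, which ignores growth and any data that CRUSH maps back to the same OSD; `estimate_util` is a hypothetical helper:

```shell
#!/usr/bin/env bash
# Hypothetical helper: estimate utilization (percent) after an override reweight.
# $1 = current utilization in percent, $2 = override weight in the range 0..1
estimate_util() {
    awk -v util="$1" -v w="$2" 'BEGIN { printf "%.1f\n", util * w }'
}

estimate_util 90 0.8   # 72.0
estimate_util 90 0.6   # 54.0
```

For the 90% versus 50% case quoted above, an override of about 0.6 would bring the full OSD near the others by this estimate; the displaced data lands on the remaining OSDs and raises their utilization in turn.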