Synchronizing ES data with logstash


I. ES data synchronization tools

Common tools:

  1. logstash
  2. elasticdump
  3. Snapshots



II. Deploying logstash and its directory layout

  1. input file
  2. output file



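III. Scripts

Full project name: ElasticSearch/ES2ES.data.migration

This project uses logstash to migrate data between ES clusters.

Neither of the two ES clusters has xpack authentication enabled.

The script accepts up to three indices per run, but the job passes only one.

The script is on the ansible host at /data/logstash/syncindex.sh.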

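Sync strategy (illustrated with the following examples)
[KFC indices] -> Sync only from the standby cluster to the primary cluster; documents whose ID already exists are skipped.
order.kfc_pre.20240229.index
order.kfc.20240226.index

[PH indices] -> First sync from the standby cluster to the primary cluster, skipping documents whose ID already exists; then sync from the primary cluster back to the standby cluster, again skipping documents whose ID already exists.
order.phhs.202402.index
order.phdi_co.202402.index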
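
The two strategies differ only in the number of passes. Below is a minimal sketch of the invocation order, assuming the hypothetical cluster addresses `standby-es:9200` and `primary-es:9200` (the original does not give them). The helper `plan_sync` is made up for illustration and only prints the commands instead of running them:

```shell
#!/bin/bash
# Hypothetical cluster addresses; replace with the real ones.
STANDBY="http://standby-es:9200"
PRIMARY="http://primary-es:9200"

# Print the syncindex.sh invocations needed for one index.
# $1 = strategy ("oneway" for KFC-style, "twoway" for PH-style)
# $2 = index name
plan_sync() {
    local strategy=$1 index=$2
    # Pass 1: standby -> primary (always)
    echo "bash /data/logstash/syncindex.sh $STANDBY $PRIMARY $index"
    if [ "$strategy" = "twoway" ]; then
        # Pass 2: primary -> standby, to be run only after pass 1 has finished
        echo "bash /data/logstash/syncindex.sh $PRIMARY $STANDBY $index"
    fi
}

plan_sync oneway order.kfc.20240226.index
plan_sync twoway order.phhs.202402.index
```

Each printed command would still need a logstash run afterwards before the next pass starts.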

How to run: first run the syncindex.sh script, then start logstash as follows:
nohup /usr/share/logstash/bin/logstash "--path.settings" "/etc/logstash" &
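
Because logstash is started with only `--path.settings /etc/logstash`, it loads its pipeline definitions from `pipelines.yml` in that directory. The original does not show that file. The sketch below assumes one pipeline per `reindexN` directory (the pipeline ids are hypothetical) and writes to a scratch directory rather than to `/etc/logstash`:

```shell
#!/bin/bash
# Write a candidate pipelines.yml into a scratch directory for review.
# Assumption: one pipeline per reindexN directory used by syncindex.sh.
settings_dir=$(mktemp -d)

for n in 1 2 3; do
    cat >> "$settings_dir/pipelines.yml" <<EOF
- pipeline.id: reindex$n
  path.config: "/etc/logstash/conf.d/reindex$n/*.conf"
EOF
done

cat "$settings_dir/pipelines.yml"
```

After review, the generated file could be copied to `/etc/logstash/pipelines.yml`.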


The contents of syncindex.sh are as follows:

#!/bin/bash


# Read Inurl, Outurl and the index names from the command-line arguments

Inurl=$1
Outurl=$2
index1=$3
index2=$4
index3=$5


# Inurl, Outurl and index1 are required
if [ -z "$Inurl" ] || [ -z "$Outurl" ] || [ -z "$index1" ]; then
    echo "Inurl, Outurl and index1 must not be empty."
    exit 1
fi


# Initialize the counter
counter=1
echo "Source ES cluster: $Inurl  Destination ES cluster: $Outurl  Indices to sync: $index1 $index2 $index3"


# Create one config file per index
for index in "$index1" "$index2" "$index3"
do
    if [ -n "$index" ]; then
        # Strip the last '.' and everything after it
        filename=$(echo "$index" | rev | cut -d'.' -f2- | rev)
        confdir="/etc/logstash/conf.d/reindex$counter"
        filepath="$confdir/$filename.conf"
        # Make sure both directories exist, then move any earlier config files
        # into the history directory (an empty directory is not an error)
        mkdir -p "$confdir" /etc/logstash/conf.d/synchistory
        mv "$confdir"/* /etc/logstash/conf.d/synchistory 2>/dev/null
        # Create the file and write the pipeline definition
        echo "input {
    elasticsearch {
        hosts => \"$Inurl\"
        index => \"$index\"
        scroll => \"5m\"
        docinfo => true
    }
}

output {
    elasticsearch {
        hosts => \"$Outurl\"
        index => \"$index\"
        document_id => \"%{[@metadata][_id]}\"
        action => \"create\"
    }
}" > "$filepath"
        # Increment the counter
        ((counter++))
    fi
done
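
The config file name is the index name with its last dot-separated segment removed. Below is a small sketch of that step in isolation, using bash parameter expansion as an equivalent to the `rev | cut | rev` pipeline (the helper name `conf_name` is made up for illustration):

```shell
#!/bin/bash
# Derive the .conf base name from an index name by dropping the last
# '.'-separated segment. conf_name is a hypothetical helper.
conf_name() {
    local index=$1
    # ${index%.*} removes the shortest suffix that matches ".*"
    echo "${index%.*}"
}

conf_name order.kfc.20240226.index   # prints order.kfc.20240226
conf_name order.phhs.202402.index    # prints order.phhs.202402
```

The expansion avoids spawning three extra processes per index, and like the pipeline it leaves a name without any dot unchanged.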




Note: to be continued.




IV. Building a scaffolding tool

The shell script is as follows:


#!/bin/bash -ex

cd "$WORKSPACE"

echo "${src_host}"
echo "${dest_host}"
echo "${index_name}"


ssh -o "StrictHostKeyChecking=no" 172.20.214.8 "jps"
# Collect the PIDs of any logstash processes running on the remote host.
# \$2 is escaped so that awk on the remote side receives it; unescaped, the
# local shell would expand it first. xargs joins the PIDs onto one line.
isExist=$(ssh -o "StrictHostKeyChecking=no" 172.20.214.8 "ps -ef | grep logstash | grep -v grep | awk '{print \$2}' | xargs")
if [ -n "$isExist" ]; then
        # Stop the running logstash before the config files are regenerated
        ssh -o "StrictHostKeyChecking=no" 172.20.214.8 "kill -9 $isExist"
else
        echo "skip"
fi

# syncindex.sh uses bash-only syntax, so it is run with bash rather than sh
ssh -o "StrictHostKeyChecking=no" 172.20.214.8 "bash /data/logstash/syncindex.sh ${src_host} ${dest_host} ${index_name}"
ssh -o "StrictHostKeyChecking=no" 172.20.214.8 "nohup /usr/share/logstash/bin/logstash \"--path.settings\" \"/etc/logstash\" &"
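
Quoting matters when a command string is handed to a remote shell: inside double quotes the local shell expands `$2` and backticks before `ssh` ever runs. The sketch below reproduces the effect locally, with `bash -c` standing in for `ssh host` (an assumption made so that the example runs without a remote machine) and a made-up `ps`-style line as input:

```shell
#!/bin/bash
# bash -c stands in for "ssh host": both receive a single command string
# that the local shell has already processed.
set --                       # clear local positional parameters, so $2 is empty
line="root 4242 1 logstash"  # made-up sample of a ps -ef style line

# Unescaped: the local shell replaces $2 with its own (empty) second
# positional parameter, so awk runs '{print }' and prints the whole line.
unescaped=$(bash -c "echo '$line' | awk '{print $2}'")

# Escaped: \$2 reaches awk intact, so only the second field is printed.
escaped=$(bash -c "echo '$line' | awk '{print \$2}'")

echo "unescaped: $unescaped"
echo "escaped:   $escaped"
```

The same reasoning applies to backticks or `$(...)` inside a double-quoted `ssh` command: they run on the local machine unless escaped.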