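# Data write routing mechanism
Single-document write
(1) op_type (operation type)
Purpose: specifies the type of operation performed on the document. There are two main values:
`index` (default): creates the document if it does not exist and overwrites it if it does (similar to an upsert).
`create`: creates the document only if it does not exist; if it already exists, the request fails, which prevents an overwrite.
When to use: choose `create` when you need to be sure an existing document is never overwritten.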
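The difference between the two operation types can be sketched with a small in-memory model. This is only an illustration: the dict `store` stands in for an index, and `write` and `DocumentExistsError` are hypothetical names, not part of any Elasticsearch client.

```python
class DocumentExistsError(Exception):
    """Raised when `create` targets an ID that is already present."""


def write(store, doc_id, doc, op_type="index"):
    """Apply a write to `store`, a plain dict standing in for an index."""
    if op_type not in ("index", "create"):
        raise ValueError(f"unknown op_type: {op_type}")
    if op_type == "create" and doc_id in store:
        # `create` refuses to overwrite an existing document.
        raise DocumentExistsError(doc_id)
    created = doc_id not in store
    store[doc_id] = doc  # `index` creates or overwrites
    return "created" if created else "updated"
```

Calling `write` twice with the same ID and the default `op_type` overwrites the first document, while the second call with `op_type="create"` raises instead.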
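(2) refresh (refresh policy)
Purpose: controls when written data becomes visible to search. Options:
`true`: refreshes the affected shards immediately (hurts performance; suitable only for testing).
`false` (default): does not force a refresh; the change becomes searchable at the next periodic refresh (every 1 second by default).
`wait_for`: the request returns only after a refresh has made the change visible (the document is searchable on return, at the cost of higher latency).
When to use: choose `wait_for` when a document must be searchable as soon as the write returns.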
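(3) routing (routing policy)
Purpose: determines which shard the document is written to. By default the shard is chosen from a hash of the document ID; a custom routing value lets you control where documents are stored.
When to use: when you want related documents (for example, all data for one user) stored on the same shard.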
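The idea behind routing can be sketched as "hash the routing value, then take it modulo the number of primary shards". The snippet below is a simplified model under stated assumptions: `pick_shard` and `shard_for` are hypothetical helpers, and CRC32 is only a stand-in, since Elasticsearch itself uses a Murmur3 hash and a slightly more involved formula.

```python
import zlib


def pick_shard(routing, num_primary_shards):
    """Map a routing value to a shard number in [0, num_primary_shards)."""
    # Stand-in hash for illustration; Elasticsearch uses Murmur3, not CRC32.
    return zlib.crc32(routing.encode("utf-8")) % num_primary_shards


def shard_for(doc_id, num_primary_shards, routing=None):
    """Use the custom routing value if given, otherwise the document ID."""
    key = routing if routing is not None else doc_id
    return pick_shard(key, num_primary_shards)
```

With `routing="abc"`, every document lands on the same shard regardless of its ID, which is why a read or delete of that document must pass the same routing value.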
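(4) wait_for_active_shards (number of active shard copies required)
Purpose: sets how many copies of a shard (the primary plus replicas) must be active before the write proceeds. The default is 1 (the primary only). It accepts a positive integer (for example 2) up to the total number of copies, or `all`; percentages are not accepted.
When to use: to improve reliability by refusing writes while too few copies are available. This is a check made before the write starts, not a guarantee that the write reached that many copies.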
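(5) version (document version number)
Purpose: supplies a version number for optimistic concurrency control. The operation runs only if the supplied version passes the check defined by `version_type`.
When to use: to prevent concurrent updates from leaving data inconsistent.
(6) version_type (versioning scheme)
Purpose: specifies how the version number is interpreted. The two common values:
`internal` (default): the version is managed by Elasticsearch and incremented on every change. In Elasticsearch 7 and later, writes no longer accept an explicit `version` with internal versioning; use `if_seq_no` and `if_primary_term` instead.
`external`: uses a version number from an external system (such as a database); the supplied version must be strictly greater than the stored version. The related `external_gte` also accepts an equal version.
When to use: choose `external` when the version number comes from an external system such as a database.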
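The comparison rules for external versioning can be written down as a small predicate. This is a sketch of the rule only, not Elasticsearch code; `version_accepted` is a hypothetical name, and `stored=None` is this model's way of saying the document does not exist yet.

```python
def version_accepted(version_type, stored, supplied):
    """Return True if a write carrying `supplied` may replace version `stored`."""
    if stored is None:
        # No existing document: there is nothing to conflict with.
        return True
    if version_type == "external":
        return supplied > stored   # strictly greater
    if version_type == "external_gte":
        return supplied >= stored  # greater than or equal
    raise ValueError(f"unsupported version_type: {version_type}")
```

For example, re-sending a write with `version=1&version_type=external` against a document already at version 1 is rejected, while the same request with `external_gte` is accepted.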
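Document ID considerations
1. By default the document ID is generated automatically. Elasticsearch produces a time-based, URL-safe Base64 UUID (not a MongoDB-style ObjectId).
2. When the client supplies the ID, Elasticsearch first has to check whether a document with that ID already exists, so writes are slower than with auto-generated IDs (as a rough rule of thumb, about half the throughput).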
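Making written data visible immediately
1. By default, written data becomes searchable at the next periodic refresh, which runs every 1s.
2. A refresh can also be forced manually.
# Refresh a single index
POST your_index/_refresh
# Refresh multiple indices (comma-separated)
POST index1,index2/_refresh
# Refresh all indices
POST _refresh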
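Document count limits
Data deletion
Summary
(1) Prefer bulk writes; fall back to single-document writes.
(2) Keep bulk requests from growing too large by capping the document count or the payload size. A suggested starting point is under 10 MB and under 10,000 documents per request; tune the limits with your own tests.
(3) Prefer the default 1s refresh; force manual refreshes sparingly to avoid performance problems.
(4) Prefer auto-generated document IDs, which substantially improve write throughput.
(5) Always create or bind an alias when creating an index; every index should have at least one alias pointing to it.
(6) Prefer delete-by-query; fall back to single-document or bulk deletes.
(7) When deleting data, avoid forcing a refresh; rely on the default 1s refresh.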
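Point (2) can be sketched as a client-side batching helper that closes a batch as soon as either limit would be exceeded. The function name `chunk_docs` and its defaults are assumptions taken from the suggested limits above; the size estimate counts only the serialized document lines, not the bulk action lines.

```python
import json


def chunk_docs(docs, max_docs=10_000, max_bytes=10 * 1024 * 1024):
    """Yield lists of documents that respect both the count and size limits."""
    batch, batch_bytes = [], 0
    for doc in docs:
        # Size of the document as one NDJSON line (+1 for the newline).
        size = len(json.dumps(doc).encode("utf-8")) + 1
        if batch and (len(batch) >= max_docs or batch_bytes + size > max_bytes):
            yield batch
            batch, batch_bytes = [], 0
        batch.append(doc)
        batch_bytes += size
    if batch:
        yield batch
```

A single document larger than `max_bytes` still goes out in a batch of its own rather than being dropped, so the limits act as targets, not hard guarantees.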
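Practice
1. Every new index that matches this template automatically receives these three aliases.
priority: the template priority; when several templates match an index, the one with the highest priority wins.
_meta: metadata
index_patterns: the index name patterns the template applies to.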
DELETE _index_template/citibike-tripdata-2022
PUT _index_template/citibike-tripdata-2022
{
"index_patterns": [
"citibike-20*",
"citibike-tripdata-20*"
],
"_meta": {},
"priority": 0,
"template": {
"settings": {
"number_of_replicas": 1,
"number_of_shards": 1,
"refresh_interval": "1s"
},
"mappings": {
"_source": {
"enabled": true
},
"dynamic": true,
"properties": {
"@timestamp": {
"type": "date"
},
"end_lat": {
"type": "float"
},
"end_lng": {
"type": "float"
},
"end_station_id": {
"type": "keyword"
},
"end_station_name": {
"type": "text"
},
"ended_at": {
"type": "date",
"format": "yyyy-MM-dd HH:mm:ss"
},
"member_casual": {
"type": "keyword"
},
"ride_id": {
"type": "keyword"
},
"rideable_type": {
"type": "keyword"
},
"start_lat": {
"type": "float"
},
"start_lng": {
"type": "float"
},
"start_station_id": {
"type": "keyword"
},
"start_station_name": {
"type": "text"
},
"started_at": {
"type": "date",
"format": "yyyy-MM-dd HH:mm:ss"
},
"tags": {
"type": "keyword"
}
}
},
"aliases": {
"citibike": {},
"citibike-tripdata": {},
"citibike-tripdata-2022": {}
}
}
}
Difference between keyword and text
Disabling automatic index creation on the cluster
Single-document write: create or update
PUT /citibike-tripdata-2022/_doc/ride_id_D1FCEF55EB4A807F
{
"end_station_id": "6756.05",
"member_casual": "member",
"@timestamp": "2022-12-11T05:36:16.645Z",
"end_lat": "40.762009",
"start_station_name": "W 21 St & 6 Ave",
"start_lat": "40.74173969",
"ended_at": "2022-01-22 14:53:18",
"start_station_id": "6140.05",
"start_lng": "-73.99415556",
"rideable_type": "classic_bike",
"started_at": "2022-01-22 14:28:32",
"tags": [
"citibike",
"tripdata"
],
"ride_id": "D1FCEF55EB4A807F",
"end_station_name": "W 44 St & 11 Ave",
"end_lng": "-73.996975"
}
Single-document write with _create: explicit create
PUT /citibike-tripdata-2022/_create/ride_id_D1FCEF55EB4A807F
{
"end_station_id": "6756.05",
"member_casual": "member",
"@timestamp": "2022-12-11T05:36:16.645Z",
"end_lat": "40.762009",
"start_station_name": "W 21 St & 6 Ave",
"start_lat": "40.74173969",
"ended_at": "2022-01-22 14:53:18",
"start_station_id": "6140.05",
"start_lng": "-73.99415556",
"rideable_type": "classic_bike",
"started_at": "2022-01-22 14:28:32",
"tags": [
"citibike",
"tripdata"
],
"ride_id": "D1FCEF55EB4A807F",
"end_station_name": "W 44 St & 11 Ave",
"end_lng": "-73.996975"
}
POST _doc
Writing with POST; the ID is generated automatically
DELETE /citibike-tripdata-2022
POST /citibike-tripdata-2022/_doc
{
"end_station_id": "6756.051",
"member_casual": "member",
"@timestamp": "2022-12-11T05:36:16.645Z",
"end_lat": "40.762009",
"start_station_name": "W 21 St & 6 Ave",
"start_lat": "40.74173969",
"ended_at": "2022-01-22 14:53:18",
"start_station_id": "6140.05",
"start_lng": "-73.99415556",
"rideable_type": "classic_bike",
"started_at": "2022-01-22 14:28:32",
"tags": [
"citibike",
"tripdata"
],
"ride_id": "D1FCEF55EB4A807F",
"end_station_name": "W 44 St & 11 Ave",
"end_lng": "-73.996975"
}
Writing with POST and an explicit document ID
POST /citibike-tripdata-2022/_doc/ride_id_D1FCEF55EB4A807F
{
"end_station_id": "6756.051",
"member_casual": "member",
"@timestamp": "2022-12-11T05:36:16.645Z",
"end_lat": "40.762009",
"start_station_name": "W 21 St & 6 Ave",
"start_lat": "40.74173969",
"ended_at": "2022-01-22 14:53:18",
"start_station_id": "6140.05",
"start_lng": "-73.99415556",
"rideable_type": "classic_bike",
"started_at": "2022-01-22 14:28:32",
"tags": [
"citibike",
"tripdata"
],
"ride_id": "D1FCEF55EB4A807F",
"end_station_name": "W 44 St & 11 Ave",
"end_lng": "-73.996975"
}
POST _create
DELETE /citibike-tripdata-2022
POST /citibike-tripdata-2022/_create/ride_id_D1FCEF55EB4A807F
{
"end_station_id": "6756.051",
"member_casual": "member",
"@timestamp": "2022-12-11T05:36:16.645Z",
"end_lat": "40.762009",
"start_station_name": "W 21 St & 6 Ave",
"start_lat": "40.74173969",
"ended_at": "2022-01-22 14:53:18",
"start_station_id": "6140.05",
"start_lng": "-73.99415556",
"rideable_type": "classic_bike",
"started_at": "2022-01-22 14:28:32",
"tags": [
"citibike",
"tripdata"
],
"ride_id": "D1FCEF55EB4A807F",
"end_station_name": "W 44 St & 11 Ave",
"end_lng": "-73.996975"
}
# Delete the old index (use with caution)
DELETE citibike-202202
# Create a new index with explicit settings
PUT citibike-202202
{
"settings": {
"number_of_replicas": 1,
"number_of_shards": 6,
"refresh_interval": "15s"
}
}
# Write a document with the full set of request parameters
POST citibike-202202/_doc/ride_id_9F1E1AB4E5C8E11D?refresh=true&wait_for_active_shards=1&op_type=index&routing=abc&timeout=1m&version=1&version_type=external
{
"end_station_id": "6876.04",
"member_casual": "member",
"@timestamp": "2022-12-11T05:36:16.650Z",
"end_lat": "40.76590936",
"start_station_name": "E 66 St & Madison Ave",
"start_lat": "40.76800889305947",
"ended_at": "2022-01-30 15:39:08",
"start_station_id": "6969.08",
"start_lng": "-73.96845281124115",
"rideable_type": "electric_bike",
"started_at": "2022-01-30 15:33:25",
"tags": ["citibike", "tripdata"],
"ride_id": "9F1E1AB4E5C8E11D",
"end_station_name": "Central Park S & 6 Ave",
"end_lng": "-73.97634151"
}
# Search the index (to check whether the refresh took effect)
GET citibike-202202/_search
Automatic mapping updates
# Delete the index (use with caution)
DELETE citibike-202202
# Create the index (dynamic mapping enabled)
PUT citibike-202202
{
"mappings": {
"dynamic": true
}
}
# Query the initial mapping (first time)
GET citibike-202202/_mapping
# ========================
# Data load 1 (first record)
# ========================
PUT citibike-202202/_create/ride_id_571ECC43B92B62CF
{
"end_station_id": "6920.03",
"member_casual": "member",
"@timestamp": "2022-12-11T05:36:16.655Z"
}
# ========================
# Data load 2 (second record)
# ========================
PUT citibike-202202/_create/ride_id_1729DAF28B7D3CCC
{
"end_station_id": "5515.08",
"member_casual": "member",
"@timestamp": "2022-12-11T05:36:16.656Z",
"end_lat": 40.723346832180155,
"start_station_name": "Division St & Bowery",
"start_lat": 40.714193,
"ended_at": "2022-01-20 17:14:52",
"start_station_id": "5270.08",
"start_lng": -73.996732,
"rideable_type": "electric_bike",
"started_at": "2022-01-20 17:03:17",
"tags": [
"citibike",
"tripdata"
],
"ride_id": "1729DAF28B7D3CCC",
"end_station_name": "E 4 St & Ave B",
"end_lng": -73.98265913128853
}
# Query the mapping again to see what changed (second time)
GET citibike-202202/_mapping
Bulk insert with _bulk
# Delete the old index
DELETE citibike-202202
# Bulk import data
POST _bulk
{"create":{"_index":"citibike-202202","_id":"ride_id_571ECC43B92B62CF","require_alias":false}}
{"end_station_id":"6920.03","member_casual":"member","@timestamp":"2022-12-11T05:36:16.655Z","end_lat":"40.76584941","start_station_name":"W 49 St & 8 Ave","start_lat":"40.76227205","ended_at":"2022-01-02 12:58:05","start_station_id":"6747.06","start_lng":"-73.98788205","rideable_type":"classic_bike","started_at":"2022-01-02 12:55:34","tags":["citibike","tripdata"],"ride_id":"571ECC43B92B62CF","end_station_name":"W 54 St & 9 Ave","end_lng":"-73.98690506"}
{"create":{"_index":"citibike-202202","_id":"ride_id_D1FCEF55EB4A807F","require_alias":false}}
{"end_station_id":"6756.05","member_casual":"member","@timestamp":"2022-12-11T05:36:16.645Z","end_lat":"40.762009","start_station_name":"W 21 St & 6 Ave","start_lat":"40.74173969","ended_at":"2022-01-22 14:53:18","start_station_id":"6140.05","start_lng":"-73.99415556","rideable_type":"classic_bike","started_at":"2022-01-22 14:28:32","tags":["citibike","tripdata"],"ride_id":"D1FCEF55EB4A807F","end_station_name":"W 44 St & 11 Ave","end_lng":"-73.996975"}
# Search to verify the data
GET citibike-202202/_search
# Delete the index
DELETE citibike-202202
# Bulk import data with request parameters
POST _bulk?refresh=true&timeout=1m&wait_for_active_shards=1&routing=abc
{"create":{"_index":"citibike-202202","_id":"ride_id_571ECC43B92B62CF","require_alias":false}}
{"end_station_id":"6920.03","member_casual":"member","@timestamp":"2022-12-11T05:36:16.655Z","end_lat":"40.76584941","start_station_name":"W 49 St & 8 Ave","start_lat":"40.76227205","ended_at":"2022-01-02 12:58:05","start_station_id":"6747.06","start_lng":"-73.98788205","rideable_type":"classic_bike","started_at":"2022-01-02 12:55:34","tags":["citibike","tripdata"],"ride_id":"571ECC43B92B62CF","end_station_name":"W 54 St & 9 Ave","end_lng":"-73.98690506"}
{"create":{"_index":"citibike-202202","_id":"ride_id_D1FCEF55EB4A807F","require_alias":false}}
{"end_station_id":"6756.05","member_casual":"member","@timestamp":"2022-12-11T05:36:16.645Z","end_lat":"40.762009","start_station_name":"W 21 St & 6 Ave","start_lat":"40.74173969","ended_at":"2022-01-22 14:53:18","start_station_id":"6140.05","start_lng":"-73.99415556","rideable_type":"classic_bike","started_at":"2022-01-22 14:28:32","tags":["citibike","tripdata"],"ride_id":"D1FCEF55EB4A807F","end_station_name":"W 44 St & 11 Ave","end_lng":"-73.996975"}
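# Search to verify the data
GET citibike-202202/_search
`"require_alias": false` (the default) lets the action target a concrete index; with `true`, the action fails unless the target is an alias.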
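The `_bulk` body is newline-delimited JSON: one action line, followed by one source line for every action except `delete`, with a newline after the last line. A minimal sketch of building such a body follows; `bulk_body` is a hypothetical helper, and sending the string to the cluster is left out.

```python
import json


def bulk_body(index, docs_by_id, action="create"):
    """Build an NDJSON _bulk body for the given {doc_id: doc} mapping."""
    lines = []
    for doc_id, doc in docs_by_id.items():
        # Action line: which operation, which index, which document ID.
        lines.append(json.dumps({action: {"_index": index, "_id": doc_id}}))
        if action != "delete":
            # Source line: the document itself (delete has none).
            lines.append(json.dumps(doc))
    # The body must end with a newline.
    return "\n".join(lines) + "\n"
```

Because `json.dumps` emits each object on a single line by default, every action and every document stays on its own line, which is what the NDJSON format requires.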
Deleting data from an index
Deleting a single document
Delete by document ID
DELETE citibike-202202/_doc/ride_id_D1FCEF55EB4A807F
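Summary of request parameters (document delete)
Parameter | Description |
---|---|
refresh | Refresh policy (controls when the index is refreshed) |
routing | Routing value (identifies the shard that holds the document) |
timeout | Timeout (how long to wait for the primary shard to become available) |
wait_for_active_shards | Number of shard copies that must be active before the delete proceeds |
version | Version to delete (optimistic concurrency control) |
version_type | Versioning scheme (enables external versioning) |
if_seq_no | Sequence number the document must have (concurrency control) |
if_primary_term | Primary term the document must have (used together with if_seq_no) |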
# Delete the index
DELETE citibike-202202
# Bulk import data
POST _bulk?refresh=true&timeout=1m&wait_for_active_shards=1&routing=abc
{"create":{"_index":"citibike-202202","_id":"ride_id_571ECC43B92B62CF","require_alias":false}}
{"end_station_id":"6920.03","member_casual":"member","@timestamp":"2022-12-11T05:36:16.655Z","end_lat":"40.76584941","start_station_name":"W 49 St & 8 Ave","start_lat":"40.76227205","ended_at":"2022-01-02 12:58:05","start_station_id":"6747.06","start_lng":"-73.98788205","rideable_type":"classic_bike","started_at":"2022-01-02 12:55:34","tags":["citibike","tripdata"],"ride_id":"571ECC43B92B62CF","end_station_name":"W 54 St & 9 Ave","end_lng":"-73.98690506"}
{"create":{"_index":"citibike-202202","_id":"ride_id_D1FCEF55EB4A807F","require_alias":false}}
{"end_station_id":"6756.05","member_casual":"member","@timestamp":"2022-12-11T05:36:16.645Z","end_lat":"40.762009","start_station_name":"W 21 St & 6 Ave","start_lat":"40.74173969","ended_at":"2022-01-22 14:53:18","start_station_id":"6140.05","start_lng":"-73.99415556","rideable_type":"classic_bike","started_at":"2022-01-22 14:28:32","tags":["citibike","tripdata"],"ride_id":"D1FCEF55EB4A807F","end_station_name":"W 44 St & 11 Ave","end_lng":"-73.996975"}
# Delete option 1: delete with an explicit version number
DELETE citibike-202202/_doc/ride_id_D1FCEF55EB4A807F?refresh=true&routing=abc&timeout=1m&wait_for_active_shards=1&version=2&version_type=external
# Delete option 2: delete conditioned on the sequence number and primary term
DELETE citibike-202202/_doc/ride_id_D1FCEF55EB4A807F?refresh=true&routing=abc&timeout=1m&wait_for_active_shards=1&if_seq_no=1&if_primary_term=1
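The `if_seq_no` / `if_primary_term` check in option 2 can be modelled as a compare-then-delete on the pair stored with the document. This is an illustrative model only: `delete_if_unchanged` is a hypothetical function, `store` is a plain dict, and treating a missing document as a conflict is an assumption of the model.

```python
def delete_if_unchanged(store, doc_id, if_seq_no, if_primary_term):
    """Delete `doc_id` only if its stored sequence number and term match."""
    doc = store.get(doc_id)
    if doc is None:
        # Nothing to compare against: reported as a conflict in this model.
        return "version_conflict"
    if (doc["_seq_no"], doc["_primary_term"]) != (if_seq_no, if_primary_term):
        # Someone else changed the document since it was read.
        return "version_conflict"
    del store[doc_id]
    return "deleted"
```

The caller is expected to read the document first, remember its `_seq_no` and `_primary_term`, and pass both back; if another writer got in between, the delete is refused instead of silently removing newer data.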
_bulk conventions
Bulk deletion
# Delete documents
POST _bulk
{"delete":{"_index":"citibike-202202","_id":"ride_id_571ECC43B92B62CF","require_alias":false}}
{"delete":{"_index":"citibike-202202","_id":"ride_id_D1FCEF55EB4A807F","require_alias":false}}
# Count the remaining documents
GET citibike-202202/_count
POST _bulk?refresh=true&timeout=1m&wait_for_active_shards=1&routing=abc
{"delete" : {"_index" :"citibike-202202","_id" :"ride_id_571ECC43B92B62CF","require_alias" : false}}
{"delete" : {"_index" :"citibike-202202","_id" :"ride_id_D1FCEF55EB4A807F","require_alias" : false}}
Conditional delete with _delete_by_query
# ---------- Step 1: reset the index and load data ----------
# Delete the index
DELETE citibike-202202
# Bulk import data
POST _bulk?refresh=true&timeout=1m&wait_for_active_shards=1&routing=abc
{"create":{"_index":"citibike-202202","_id":"ride_id_571ECC43B92B62CF","require_alias":false}}
{"end_station_id":"6920.03","member_casual":"member1","@timestamp":"2022-12-11T05:36:16.655Z","end_lat":"40.76584941","start_station_name":"W 49 St & 8 Ave","start_lat":"40.76227205","ended_at":"2022-01-02 12:58:05","start_station_id":"6747.06","start_lng":"-73.98788205","rideable_type":"classic_bike","started_at":"2022-01-02 12:55:34","tags":["citibike","tripdata"],"ride_id":"571ECC43B92B62CF","end_station_name":"W 54 St & 9 Ave","end_lng":"-73.98690506"}
{"index":{"_index":"citibike-202202","_id":"ride_id_D1FCEF55EB4A807F","require_alias":false}}
{"end_station_id":"6756.05","member_casual":"member2","@timestamp":"2022-12-11T05:36:16.645Z","end_lat":"40.762009","start_station_name":"W 21 St & 6 Ave","start_lat":"40.74173969","ended_at":"2022-01-22 14:53:18","start_station_id":"6140.05","start_lng":"-73.99415556","rideable_type":"classic_bike","started_at":"2022-01-22 14:28:32","tags":["citibike","tripdata"],"ride_id":"D1FCEF55EB4A807F","end_station_name":"W 44 St & 11 Ave","end_lng":"-73.996975"}
# ---------- Step 2: query and delete by condition ----------
# Query by condition: member_casual equals "member1"
GET citibike-202202/_search
{
"query": {
"match": {
"member_casual": "member1"
}
}
}
# Delete by condition: documents whose member_casual equals "member1"
POST citibike-202202/_delete_by_query
{
"query": {
"match": {
"member_casual": "member1"
}
}
}
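_rethrottle: adjusting the speed of a running delete-by-query
`requests_per_second=-1`: removes the throttle (run at full speed).
Single-document writes (such as `PUT /index/_doc/1`) and bulk writes (`_bulk`) are synchronous and do not support the `wait_for_completion=false` parameter.
_rethrottle applies to operations that run asynchronously as tasks, such as the delete-by-query below: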
POST _delete_by_query/5WXzYG6XSnifGUHCFzBCBQ:3328493/_rethrottle?requests_per_second=-1
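Throttling works by pausing between batches: the target time for a batch is the batch size divided by `requests_per_second`, and the pause is whatever remains after the time spent writing. A small sketch of that arithmetic follows; `throttle_wait` is a hypothetical name, and times are in seconds.

```python
def throttle_wait(batch_size, requests_per_second, write_time):
    """Seconds to pause after a batch so the average rate matches the limit."""
    if requests_per_second == -1:
        return 0.0  # -1 disables throttling
    if requests_per_second <= 0:
        raise ValueError("requests_per_second must be positive or -1")
    target_time = batch_size / requests_per_second
    # Never wait a negative amount if the write already took longer.
    return max(0.0, target_time - write_time)
```

With a batch of 1000 documents, `requests_per_second=500` and 0.5 s spent writing, the pause is 1.5 s; after a rethrottle to `-1` the pause drops to zero.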
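_tasks
# ======================
# List all running delete tasks
# ======================
# Option 1: compact task list (with column headers)
GET _cat/tasks?v=true
# Option 2: detailed task information (filtered to delete-by-query actions)
GET _tasks?detailed=true&actions=*/delete/byquery
# ======================
# Inspect a single asynchronous delete task
# ======================
# Show the details of a specific task ID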
GET _tasks/5WXzYG6XSnifGUHCFzBCBQ:3328493
Conditional delete
# Conditional delete 1 (basic)
POST citibike-202202/_delete_by_query?conflicts=proceed&refresh=true&max_docs=1000&requests_per_second=10&scroll_size=1000&slices=1&wait_for_completion=false
{
"query": {
"match": {
"member_casual": "member1"
}
}
}
# Conditional delete 2 (manual slicing for parallelism)
POST citibike-202202/_delete_by_query?conflicts=proceed&refresh=true&max_docs=1000&requests_per_second=10&scroll_size=1000&slices=1&wait_for_completion=false
{
"slice": {
"id": 0,
"max": 2
},
"query": {
"match": {
"member_casual": "member1"
}
}
}
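Slicing can also be automatic: set `slices=auto` (or a number greater than 1) in the URL and omit the `slice` block from the body.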
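With manual slicing, each slice is a separate request whose body carries its own `slice.id` and the shared `slice.max`. The sketch below generates those bodies for a given query; `sliced_bodies` is a hypothetical helper, and issuing the requests (one per body, typically in parallel) is left to the caller.

```python
import copy


def sliced_bodies(query, max_slices):
    """Return one _delete_by_query body per slice, ids 0..max_slices-1."""
    if max_slices < 2:
        raise ValueError("manual slicing needs at least 2 slices")
    return [
        # Each body gets its own copy of the query so callers can edit one safely.
        {"slice": {"id": i, "max": max_slices}, "query": copy.deepcopy(query)}
        for i in range(max_slices)
    ]
```

For the example above, `sliced_bodies({"match": {"member_casual": "member1"}}, 2)` yields the body with `"id": 0` shown earlier plus its counterpart with `"id": 1`; both must be sent for the delete to cover every matching document.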