Elasticsearch Data Writes and Deletes

# Write routing mechanism


Single-document writes


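(1) op_type (operation type)

Purpose: sets the type of operation performed on the document. There are two values:

  • index (default when a document ID is supplied): creates the document if it does not exist and overwrites it if it does (similar to an upsert).
  • create: creates the document only if it does not exist; if it already exists, the request fails instead of overwriting it.
    When to use: choose create when an existing document must never be overwritten.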

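(2) refresh (refresh policy)

Purpose: controls when the written data becomes visible to search. Options:

  • true: refreshes the affected shards immediately after the operation (costly under load; best reserved for tests).
  • false (default): does not trigger a refresh; the change becomes searchable at the next scheduled refresh (refresh_interval, 1s by default).
  • wait_for: the request returns only after a refresh has made the change searchable (the write can be queried as soon as the call returns, at the cost of higher latency).
    When to use: choose wait_for when a write must be searchable right after the request returns.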

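(3) routing (routing value)

Purpose: determines which shard the document is written to. By default the document ID is hashed to pick the shard; a custom routing value lets you control where documents are stored.
When to use: when documents of the same kind (for example, all data for one user) should live on the same shard. A document written with a custom routing value must be read, updated, and deleted with the same value.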
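
The routing rule can be sketched in a few lines of Python. This is a simplified model, not Elasticsearch's implementation: Elasticsearch hashes the routing value with Murmur3 and also takes number_of_routing_shards into account, whereas this sketch uses zlib.crc32 as a stand-in hash. The helper names pick_shard and route_document are hypothetical.

```python
import zlib
from typing import Optional


def pick_shard(routing_value: str, number_of_shards: int) -> int:
    """Map a routing value to a shard number (simplified model)."""
    # Stand-in hash: Elasticsearch itself uses Murmur3, not CRC32.
    hashed = zlib.crc32(routing_value.encode("utf-8"))
    return hashed % number_of_shards


def route_document(doc_id: str, number_of_shards: int,
                   routing: Optional[str] = None) -> int:
    """Pick the shard for a document, honoring a custom routing value."""
    # Without a custom routing value, the document ID is the routing value.
    key = routing if routing is not None else doc_id
    return pick_shard(key, number_of_shards)


if __name__ == "__main__":
    for doc_id in ("ride_1", "ride_2", "ride_3"):
        default_shard = route_document(doc_id, 6)
        user_shard = route_document(doc_id, 6, routing="user_42")
        print(doc_id, "default shard:", default_shard, "routed shard:", user_shard)
```

With a shared routing value such as "user_42", every document lands on the same shard regardless of its ID, which is the co-location effect described above.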

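(4) wait_for_active_shards (number of active shard copies to wait for)

Purpose: sets how many shard copies must be active before the write proceeds. The default is 1 (the primary only). It accepts a positive integer up to the total number of copies (number_of_replicas + 1), or all.
When to use: to improve reliability by refusing writes when too few copies are available. The check runs before the write starts, so it does not by itself guarantee that the write reached that many copies.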

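(5) version (document version number)

Purpose: supplies an explicit version number for optimistic concurrency control. It is used together with version_type=external or external_gte; the write is applied only if the supplied version passes the check against the stored version. For checks against Elasticsearch's own bookkeeping, Elasticsearch 7.x and later use if_seq_no and if_primary_term instead.
When to use: to prevent concurrent updates from leaving data inconsistent.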

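(6) version_type (versioning mode)

Purpose: controls how the version number is compared. The common values are:

  • internal (default): the version is managed by Elasticsearch and increments on every change. Elasticsearch 7.x and later no longer accept an explicit version with internal versioning on writes; use if_seq_no and if_primary_term for that kind of check.
  • external: uses a version number from an outside system (such as a database); the supplied version must be strictly greater than the stored one.
  • external_gte: like external, but the supplied version may also equal the stored one.
    When to use: choose external when an outside system (such as a database) is the source of version numbers.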
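
As a rough illustration of the external versioning checks, the comparison can be modeled as below. This is a sketch under simplifying assumptions (it ignores deleted-document tombstones and other edge cases), and external_version_accepted is a hypothetical helper, not an Elasticsearch API.

```python
from typing import Optional


def external_version_accepted(version_type: str,
                              current: Optional[int],
                              provided: int) -> bool:
    """Return True if a write carrying `provided` would pass the version check."""
    if current is None:
        # No stored document yet: the provided version is accepted.
        return True
    if version_type == "external":
        # Strictly greater than the stored version.
        return provided > current
    if version_type == "external_gte":
        # Greater than or equal to the stored version.
        return provided >= current
    raise ValueError(f"unsupported version_type: {version_type}")


if __name__ == "__main__":
    print(external_version_accepted("external", 1, 1))
    print(external_version_accepted("external", 1, 2))
    print(external_version_accepted("external_gte", 1, 1))
```

A rejected check corresponds to a version conflict response (HTTP 409) from Elasticsearch.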


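Document ID considerations

1. By default the document ID is generated automatically. Elasticsearch produces time-based, URL-safe Base64 IDs (20 characters), comparable in spirit to MongoDB's ObjectId.
2. When the ID is supplied by the client, Elasticsearch first has to look up whether a document with that ID already exists in the shard, so indexing is slower than with auto-generated IDs (by about a factor of two, as a rough figure).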

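Making writes searchable immediately

1. By default, written data becomes searchable at the next refresh, which runs every 1s.
2. This can be overridden by forcing a refresh manually.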

# Refresh a single index
POST your_index/_refresh

# Refresh several indices (comma-separated)
POST index1,index2/_refresh

# Refresh all indices
POST _refresh

Limits on document count


Deleting data


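Summary

(1) Prefer bulk writes; fall back to single-document writes.
(2) Keep bulk requests from growing too large by capping the document count or the payload size. A reasonable starting point is under 10 MB and under 10,000 documents per request; tune it with your own tests.
(3) For refresh, prefer the default 1s interval. Use manual refreshes sparingly to avoid performance problems.
(4) Prefer auto-generated document IDs; they noticeably improve write throughput.
(5) Always create or attach an alias when creating an index; every index should have at least one alias pointing to it.
(6) Prefer conditional deletes (_delete_by_query); fall back to single or bulk deletes.
(7) When deleting, use refresh sparingly as well and prefer the default 1s refresh.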
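
The size guideline in point (2) can be enforced on the client side before anything is sent. The sketch below groups (action line, source line) pairs into batches that stay under a document-count cap and a byte cap. chunk_bulk_pairs is a hypothetical helper, and the default limits simply mirror the 10,000 documents and 10 MB suggested above.

```python
from typing import Iterable, Iterator, List, Tuple


def chunk_bulk_pairs(
    pairs: Iterable[Tuple[str, str]],
    max_docs: int = 10_000,
    max_bytes: int = 10 * 1024 * 1024,
) -> Iterator[List[Tuple[str, str]]]:
    """Yield batches of (action, source) pairs within both limits."""
    batch: List[Tuple[str, str]] = []
    batch_bytes = 0
    for action, source in pairs:
        # Each line is sent with a trailing newline, hence +1 per line.
        pair_bytes = len(action.encode("utf-8")) + len(source.encode("utf-8")) + 2
        too_many = len(batch) >= max_docs
        too_big = batch_bytes + pair_bytes > max_bytes
        if batch and (too_many or too_big):
            yield batch
            batch, batch_bytes = [], 0
        # A single oversized pair still goes out alone in its own batch.
        batch.append((action, source))
        batch_bytes += pair_bytes
    if batch:
        yield batch


if __name__ == "__main__":
    demo = [('{"index":{}}', '{"n":%d}' % i) for i in range(25)]
    print([len(b) for b in chunk_bulk_pairs(demo, max_docs=10)])
```

Each yielded batch would then become the body of one _bulk request.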

practice

1. Every new index that matches this template automatically gets these three aliases.

priority


_meta: metadata

index_patterns


DELETE _index_template/citibike-tripdata-2022
PUT _index_template/citibike-tripdata-2022
{
  "index_patterns": [
    "citibike-20*",
    "citibike-tripdata-20*"
  ],
  "_meta": {},
  "priority": 0,
  "template": {
    "settings": {
      "number_of_replicas": 1,
      "number_of_shards": 1,
      "refresh_interval": "1s"
    },
    "mappings": {
      "_source": {
        "enabled": true
      },
      "dynamic": true,
      "properties": {
        "@timestamp": {
          "type": "date"
        },
        "end_lat": {
          "type": "float"
        },
        "end_lng": {
          "type": "float"
        },
        "end_station_id": {
          "type": "keyword"
        },
        "end_station_name": {
          "type": "text"
        },
        "ended_at": {
          "type": "date",
          "format": "yyyy-MM-dd HH:mm:ss"
        },
        "member_casual": {
          "type": "keyword"
        },
        "ride_id": {
          "type": "keyword"
        },
        "rideable_type": {
          "type": "keyword"
        },
        "start_lat": {
          "type": "float"
        },
        "start_lng": {
          "type": "float"
        },
        "start_station_id": {
          "type": "keyword"
        },
        "start_station_name": {
          "type": "text"
        },
        "started_at": {
          "type": "date",
          "format": "yyyy-MM-dd HH:mm:ss"
        },
        "tags": {
          "type": "keyword"
        }
      }
    },
    "aliases": {
      "citibike": {},
      "citibike-tripdata": {},
      "citibike-tripdata-2022": {}
    }
  }
}

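Difference between keyword and text

A keyword field is indexed as a single exact value, which suits filtering, sorting, and aggregations. A text field is analyzed into terms for full-text search.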

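Disabling automatic index creation on the cluster

Automatic index creation is controlled by the cluster setting action.auto_create_index.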

Single write: create or update a document

PUT /citibike-tripdata-2022/_doc/ride_id_D1FCEF55EB4A807F
{
    "end_station_id": "6756.05",
    "member_casual": "member",
    "@timestamp": "2022-12-11T05:36:16.645Z",
    "end_lat": "40.762009",
    "start_station_name": "W 21 St & 6 Ave",
    "start_lat": "40.74173969",
    "ended_at": "2022-01-22 14:53:18",
    "start_station_id": "6140.05",
    "start_lng": "-73.99415556",
    "rideable_type": "classic_bike",
    "started_at": "2022-01-22 14:28:32",
    "tags": [
        "citibike",
        "tripdata"
    ],
    "ride_id": "D1FCEF55EB4A807F",
    "end_station_name": "W 44 St & 11 Ave",
    "end_lng": "-73.996975"
}

Single write with _create: explicitly create a new document

PUT /citibike-tripdata-2022/_create/ride_id_D1FCEF55EB4A807F
{
    "end_station_id": "6756.05",
    "member_casual": "member",
    "@timestamp": "2022-12-11T05:36:16.645Z",
    "end_lat": "40.762009",
    "start_station_name": "W 21 St & 6 Ave",
    "start_lat": "40.74173969",
    "ended_at": "2022-01-22 14:53:18",
    "start_station_id": "6140.05",
    "start_lng": "-73.99415556",
    "rideable_type": "classic_bike",
    "started_at": "2022-01-22 14:28:32",
    "tags": [
        "citibike",
        "tripdata"
    ],
    "ride_id": "D1FCEF55EB4A807F",
    "end_station_name": "W 44 St & 11 Ave",
    "end_lng": "-73.996975"
}

POST _doc

Writing with POST and no ID generates the document ID automatically.

DELETE /citibike-tripdata-2022

POST /citibike-tripdata-2022/_doc
{
    "end_station_id": "6756.051",
    "member_casual": "member",
    "@timestamp": "2022-12-11T05:36:16.645Z",
    "end_lat": "40.762009",
    "start_station_name": "W 21 St & 6 Ave",
    "start_lat": "40.74173969",
    "ended_at": "2022-01-22 14:53:18",
    "start_station_id": "6140.05",
    "start_lng": "-73.99415556",
    "rideable_type": "classic_bike",
    "started_at": "2022-01-22 14:28:32",
    "tags": [
        "citibike",
        "tripdata"
    ],
    "ride_id": "D1FCEF55EB4A807F",
    "end_station_name": "W 44 St & 11 Ave",
    "end_lng": "-73.996975"
}

Write with an explicit document ID

POST /citibike-tripdata-2022/_doc/ride_id_D1FCEF55EB4A807F
{
    "end_station_id": "6756.051",
    "member_casual": "member",
    "@timestamp": "2022-12-11T05:36:16.645Z",
    "end_lat": "40.762009",
    "start_station_name": "W 21 St & 6 Ave",
    "start_lat": "40.74173969",
    "ended_at": "2022-01-22 14:53:18",
    "start_station_id": "6140.05",
    "start_lng": "-73.99415556",
    "rideable_type": "classic_bike",
    "started_at": "2022-01-22 14:28:32",
    "tags": [
        "citibike",
        "tripdata"
    ],
    "ride_id": "D1FCEF55EB4A807F",
    "end_station_name": "W 44 St & 11 Ave",
    "end_lng": "-73.996975"
}


POST create

DELETE /citibike-tripdata-2022

POST /citibike-tripdata-2022/_create/ride_id_D1FCEF55EB4A807F
{
    "end_station_id": "6756.051",
    "member_casual": "member",
    "@timestamp": "2022-12-11T05:36:16.645Z",
    "end_lat": "40.762009",
    "start_station_name": "W 21 St & 6 Ave",
    "start_lat": "40.74173969",
    "ended_at": "2022-01-22 14:53:18",
    "start_station_id": "6140.05",
    "start_lng": "-73.99415556",
    "rideable_type": "classic_bike",
    "started_at": "2022-01-22 14:28:32",
    "tags": [
        "citibike",
        "tripdata"
    ],
    "ride_id": "D1FCEF55EB4A807F",
    "end_station_name": "W 44 St & 11 Ave",
    "end_lng": "-73.996975"
}


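# Delete the old index (use with care)
DELETE citibike-202202

# Create a new index with explicit settings
PUT citibike-202202
{
  "settings": {
    "number_of_replicas": 1,
    "number_of_shards": 6,
    "refresh_interval": "15s"
  }
}

# Index a document using the request parameters described earlier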
POST citibike-202202/_doc/ride_id_9F1E1AB4E5C8E11D?refresh=true&wait_for_active_shards=1&op_type=index&routing=abc&timeout=1m&version=1&version_type=external
{
  "end_station_id": "6876.04",
  "member_casual": "member",
  "@timestamp": "2022-12-11T05:36:16.650Z",
  "end_lat": "40.76590936",
  "start_station_name": "E 66 St & Madison Ave",
  "start_lat": "40.76800889305947",
  "ended_at": "2022-01-30 15:39:08",
  "start_station_id": "6969.08",
  "start_lng": "-73.96845281124115",
  "rideable_type": "electric_bike",
  "started_at": "2022-01-30 15:33:25",
  "tags": ["citibike", "tripdata"],
  "ride_id": "9F1E1AB4E5C8E11D",
  "end_station_name": "Central Park S & 6 Ave",
  "end_lng": "-73.97634151"
}

# Search the index (checks whether the refresh took effect)
GET citibike-202202/_search

Automatic mapping updates

# Delete the index (use with care)
DELETE citibike-202202

# Create the index with dynamic mapping enabled
PUT citibike-202202
{
  "mappings": {
    "dynamic": true
  }
}

# Check the initial mapping (first look)
GET citibike-202202/_mapping

# ========================
# Load data 1 (first document)
# ========================
PUT citibike-202202/_create/ride_id_571ECC43B92B62CF
{
  "end_station_id": "6920.03",
  "member_casual": "member",
  "@timestamp": "2022-12-11T05:36:16.655Z"
}

# ========================
# Load data 2 (second document)
# ========================
PUT citibike-202202/_create/ride_id_1729DAF28B7D3CCC
{
  "end_station_id": "5515.08",
  "member_casual": "member",
  "@timestamp": "2022-12-11T05:36:16.656Z",
  "end_lat": 40.723346832180155,
  "start_station_name": "Division St & Bowery",
  "start_lat": 40.714193,
  "ended_at": "2022-01-20 17:14:52",
  "start_station_id": "5270.08",
  "start_lng": -73.996732,
  "rideable_type": "electric_bike",
  "started_at": "2022-01-20 17:03:17",
  "tags": [
    "citibike",
    "tripdata"
  ],
  "ride_id": "1729DAF28B7D3CCC",
  "end_station_name": "E 4 St & Ave B",
  "end_lng": -73.98265913128853
}

# Check how the mapping changed (second look)
GET citibike-202202/_mapping
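
To make the mapping change easier to predict, the default dynamic mapping rules can be approximated as follows. This is a simplified sketch that assumes no index template pre-defines the fields and that date detection is left at its defaults; the regular expression only approximates the default ISO 8601 style date format, and infer_dynamic_type is a hypothetical helper.

```python
import re
from typing import Any

# Rough approximation of the default date detection format (an assumption).
_ISO_DATE = re.compile(
    r"^\d{4}-\d{2}-\d{2}(T\d{2}:\d{2}(:\d{2}(\.\d+)?)?(Z|[+-]\d{2}:\d{2})?)?$"
)


def infer_dynamic_type(value: Any) -> str:
    """Guess the field type dynamic mapping would assign to a JSON value."""
    if isinstance(value, bool):  # must precede int: bool is a subclass of int
        return "boolean"
    if isinstance(value, int):
        return "long"
    if isinstance(value, float):
        return "float"
    if isinstance(value, str):
        return "date" if _ISO_DATE.match(value) else "text (+ keyword sub-field)"
    if isinstance(value, list):
        # Arrays take the type of their first non-null element.
        first = next((v for v in value if v is not None), None)
        return infer_dynamic_type(first) if first is not None else "unmapped"
    if isinstance(value, dict):
        return "object"
    return "unmapped"  # null values do not add a field


if __name__ == "__main__":
    sample = {
        "@timestamp": "2022-12-11T05:36:16.656Z",
        "ended_at": "2022-01-20 17:14:52",
        "start_lat": 40.714193,
        "tags": ["citibike", "tripdata"],
    }
    for field, value in sample.items():
        print(field, "->", infer_dynamic_type(value))
```

Under these assumptions, a value such as "2022-01-20 17:14:52" does not match the default date detection format, so the field would be mapped as text rather than date unless a mapping or template declares it explicitly.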

Bulk insert with _bulk

# Delete the old index
DELETE citibike-202202

# Bulk-load data
POST _bulk
{"create":{"_index":"citibike-202202","_id":"ride_id_571ECC43B92B62CF","require_alias":false}}
{"end_station_id":"6920.03","member_casual":"member","@timestamp":"2022-12-11T05:36:16.655Z","end_lat":"40.76584941","start_station_name":"W 49 St & 8 Ave","start_lat":"40.76227205","ended_at":"2022-01-02 12:58:05","start_station_id":"6747.06","start_lng":"-73.98788205","rideable_type":"classic_bike","started_at":"2022-01-02 12:55:34","tags":["citibike","tripdata"],"ride_id":"571ECC43B92B62CF","end_station_name":"W 54 St & 9 Ave","end_lng":"-73.98690506"}
{"create":{"_index":"citibike-202202","_id":"ride_id_D1FCEF55EB4A807F","require_alias":false}}
{"end_station_id":"6756.05","member_casual":"member","@timestamp":"2022-12-11T05:36:16.645Z","end_lat":"40.762009","start_station_name":"W 21 St & 6 Ave","start_lat":"40.74173969","ended_at":"2022-01-22 14:53:18","start_station_id":"6140.05","start_lng":"-73.99415556","rideable_type":"classic_bike","started_at":"2022-01-22 14:28:32","tags":["citibike","tripdata"],"ride_id":"D1FCEF55EB4A807F","end_station_name":"W 44 St & 11 Ave","end_lng":"-73.996975"}

# Verify the data with a search
GET citibike-202202/_search
# Delete the index
DELETE citibike-202202

# Bulk-load data with request parameters
POST _bulk?refresh=true&timeout=1m&wait_for_active_shards=1&routing=abc
{"create":{"_index":"citibike-202202","_id":"ride_id_571ECC43B92B62CF","require_alias":false}}
{"end_station_id":"6920.03","member_casual":"member","@timestamp":"2022-12-11T05:36:16.655Z","end_lat":"40.76584941","start_station_name":"W 49 St & 8 Ave","start_lat":"40.76227205","ended_at":"2022-01-02 12:58:05","start_station_id":"6747.06","start_lng":"-73.98788205","rideable_type":"classic_bike","started_at":"2022-01-02 12:55:34","tags":["citibike","tripdata"],"ride_id":"571ECC43B92B62CF","end_station_name":"W 54 St & 9 Ave","end_lng":"-73.98690506"}
{"create":{"_index":"citibike-202202","_id":"ride_id_D1FCEF55EB4A807F","require_alias":false}}
{"end_station_id":"6756.05","member_casual":"member","@timestamp":"2022-12-11T05:36:16.645Z","end_lat":"40.762009","start_station_name":"W 21 St & 6 Ave","start_lat":"40.74173969","ended_at":"2022-01-22 14:53:18","start_station_id":"6140.05","start_lng":"-73.99415556","rideable_type":"classic_bike","started_at":"2022-01-22 14:28:32","tags":["citibike","tripdata"],"ride_id":"D1FCEF55EB4A807F","end_station_name":"W 44 St & 11 Ave","end_lng":"-73.996975"}

# Verify the data with a search
GET citibike-202202/_search
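
The request bodies above follow the newline-delimited JSON layout that _bulk expects: one action line, then one source line, with a final newline at the end. A minimal sketch of building such a body in Python is shown below; build_bulk_body is a hypothetical helper and only covers the create and index actions, which carry a source line.

```python
import json
from typing import Any, Dict, Iterable, Tuple


def build_bulk_body(index: str,
                    docs: Iterable[Tuple[str, Dict[str, Any]]],
                    action: str = "create") -> str:
    """Serialize (doc_id, source) pairs into a _bulk NDJSON body."""
    if action not in ("create", "index"):
        raise ValueError("this sketch only handles create and index")
    lines = []
    for doc_id, source in docs:
        # Action/metadata line, then the document source on its own line.
        meta = {action: {"_index": index, "_id": doc_id}}
        lines.append(json.dumps(meta, separators=(",", ":")))
        lines.append(json.dumps(source, separators=(",", ":")))
    if not lines:
        return ""
    # The body must end with a newline character.
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    body = build_bulk_body(
        "citibike-202202",
        [("ride_id_571ECC43B92B62CF", {"member_casual": "member"})],
    )
    print(body, end="")
```

Because json.dumps never emits raw newlines inside a document, each action and each source stays on exactly one line.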

"require_alias":false


Deleting data from an index

Deleting a single document

Delete by document ID
DELETE citibike-202202/_doc/ride_id_D1FCEF55EB4A807F

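Request parameters for the document delete API (summary)

  • refresh: refresh policy (controls when the deletion becomes visible to search)
  • routing: routing value (identifies the shard that holds the document)
  • timeout: how long to wait for the primary shard to become available
  • wait_for_active_shards: number of shard copies that must be active before the delete proceeds
  • version: expected version of the document (optimistic locking)
  • version_type: versioning mode, for example external versioning
  • if_seq_no: delete only if the document has this sequence number (concurrency control)
  • if_primary_term: delete only if the document has this primary term (used together with if_seq_no)
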
# Delete the index
DELETE citibike-202202

# Bulk-load data
POST _bulk?refresh=true&timeout=1m&wait_for_active_shards=1&routing=abc
{"create":{"_index":"citibike-202202","_id":"ride_id_571ECC43B92B62CF","require_alias":false}}
{"end_station_id":"6920.03","member_casual":"member","@timestamp":"2022-12-11T05:36:16.655Z","end_lat":"40.76584941","start_station_name":"W 49 St & 8 Ave","start_lat":"40.76227205","ended_at":"2022-01-02 12:58:05","start_station_id":"6747.06","start_lng":"-73.98788205","rideable_type":"classic_bike","started_at":"2022-01-02 12:55:34","tags":["citibike","tripdata"],"ride_id":"571ECC43B92B62CF","end_station_name":"W 54 St & 9 Ave","end_lng":"-73.98690506"}
{"create":{"_index":"citibike-202202","_id":"ride_id_D1FCEF55EB4A807F","require_alias":false}}
{"end_station_id":"6756.05","member_casual":"member","@timestamp":"2022-12-11T05:36:16.645Z","end_lat":"40.762009","start_station_name":"W 21 St & 6 Ave","start_lat":"40.74173969","ended_at":"2022-01-22 14:53:18","start_station_id":"6140.05","start_lng":"-73.99415556","rideable_type":"classic_bike","started_at":"2022-01-22 14:28:32","tags":["citibike","tripdata"],"ride_id":"D1FCEF55EB4A807F","end_station_name":"W 44 St & 11 Ave","end_lng":"-73.996975"}

# Delete option 1: delete with an explicit version number
DELETE citibike-202202/_doc/ride_id_D1FCEF55EB4A807F?refresh=true&routing=abc&timeout=1m&wait_for_active_shards=1&version=2&version_type=external

# Delete option 2 (an alternative, not a follow-up): delete conditionally on seq_no and primary term
DELETE citibike-202202/_doc/ride_id_D1FCEF55EB4A807F?refresh=true&routing=abc&timeout=1m&wait_for_active_shards=1&if_seq_no=1&if_primary_term=1
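
The if_seq_no / if_primary_term check behaves like a compare-and-set: the delete goes through only if the stored document still carries the values the client last saw. The in-memory model below illustrates that idea; StoredDoc, VersionConflict, and conditional_delete are hypothetical names for this sketch, not Elasticsearch APIs.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class StoredDoc:
    seq_no: int
    primary_term: int


class VersionConflict(Exception):
    """Stands in for the HTTP 409 conflict returned by Elasticsearch."""


def conditional_delete(store: Dict[str, StoredDoc], doc_id: str,
                       if_seq_no: int, if_primary_term: int) -> None:
    """Delete doc_id only if its seq_no and primary_term match."""
    doc = store.get(doc_id)
    expected = (if_seq_no, if_primary_term)
    if doc is None or (doc.seq_no, doc.primary_term) != expected:
        # Someone else changed (or removed) the document in the meantime.
        raise VersionConflict(f"{doc_id}: expected seq_no/term {expected}")
    del store[doc_id]


if __name__ == "__main__":
    store = {"ride_id_D1FCEF55EB4A807F": StoredDoc(seq_no=1, primary_term=1)}
    conditional_delete(store, "ride_id_D1FCEF55EB4A807F", 1, 1)
    print("remaining documents:", len(store))
```

In practice the current _seq_no and _primary_term of a document are returned by a GET on the document or by the response of the write that created it.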

_bulk format rules


Bulk deleting documents

# Delete documents
POST _bulk
{"delete":{"_index":"citibike-202202","_id":"ride_id_571ECC43B92B62CF","require_alias":false}}
{"delete":{"_index":"citibike-202202","_id":"ride_id_D1FCEF55EB4A807F","require_alias":false}}

# Count the remaining documents
GET citibike-202202/_count


POST _bulk?refresh=true&timeout=1m&wait_for_active_shards=1&routing=abc
{"delete" : {"_index" :"citibike-202202","_id" :"ride_id_571ECC43B92B62CF","require_alias" : false}}
{"delete" : {"_index" :"citibike-202202","_id" :"ride_id_D1FCEF55EB4A807F","require_alias" : false}}

Conditional delete with _delete_by_query

# ---------- Part 1: delete the index and load data ----------
# Delete the index
DELETE citibike-202202

# Bulk-load data (one document via create, one via index)
POST _bulk?refresh=true&timeout=1m&wait_for_active_shards=1&routing=abc
{"create":{"_index":"citibike-202202","_id":"ride_id_571ECC43B92B62CF","require_alias":false}}
{"end_station_id":"6920.03","member_casual":"member1","@timestamp":"2022-12-11T05:36:16.655Z","end_lat":"40.76584941","start_station_name":"W 49 St & 8 Ave","start_lat":"40.76227205","ended_at":"2022-01-02 12:58:05","start_station_id":"6747.06","start_lng":"-73.98788205","rideable_type":"classic_bike","started_at":"2022-01-02 12:55:34","tags":["citibike","tripdata"],"ride_id":"571ECC43B92B62CF","end_station_name":"W 54 St & 9 Ave","end_lng":"-73.98690506"}
{"index":{"_index":"citibike-202202","_id":"ride_id_D1FCEF55EB4A807F","require_alias":false}}
{"end_station_id":"6756.05","member_casual":"member2","@timestamp":"2022-12-11T05:36:16.645Z","end_lat":"40.762009","start_station_name":"W 21 St & 6 Ave","start_lat":"40.74173969","ended_at":"2022-01-22 14:53:18","start_station_id":"6140.05","start_lng":"-73.99415556","rideable_type":"classic_bike","started_at":"2022-01-22 14:28:32","tags":["citibike","tripdata"],"ride_id":"D1FCEF55EB4A807F","end_station_name":"W 44 St & 11 Ave","end_lng":"-73.996975"}

# ---------- Part 2: query and delete by condition ----------
# Query: documents whose member_casual field is "member1"
GET citibike-202202/_search
{
  "query": {
    "match": {
      "member_casual": "member1"
    }
  }
}

# Conditional delete: documents whose member_casual field is "member1"
POST citibike-202202/_delete_by_query
{
  "query": {
    "match": {
      "member_casual": "member1"
    }
  }
}


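_rethrottle: change the throttling of a running task

  • requests_per_second=-1: removes the limit (runs at full speed)

Single-document writes (such as PUT /index/_doc/1) and bulk writes (_bulk) are synchronous and do not accept wait_for_completion=false.
_delete_by_query does: with wait_for_completion=false it runs as an asynchronous task, and _rethrottle adjusts the speed of that task while it runs.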

POST _delete_by_query/5WXzYG6XSnifGUHCFzBCBQ:3328493/_rethrottle?requests_per_second=-1
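
The effect of requests_per_second can be estimated with simple arithmetic: each batch is padded with a wait so that the batch size divided by the rate gives the target time per batch. The sketch below follows that rule as described in the Elasticsearch documentation; throttle_wait_seconds is a hypothetical helper, and the write time is an input you would have to measure.

```python
def throttle_wait_seconds(batch_size: int,
                          requests_per_second: float,
                          write_seconds: float) -> float:
    """Estimate the pause inserted after one batch of a by-query task."""
    if requests_per_second <= 0:
        # -1 disables throttling, so no wait is added.
        return 0.0
    target_seconds = batch_size / requests_per_second
    # Time already spent writing counts toward the target.
    return max(0.0, target_seconds - write_seconds)


if __name__ == "__main__":
    print(throttle_wait_seconds(1000, 500, 0.5))
    print(throttle_wait_seconds(1000, -1, 0.5))
```

With a batch size of 1000 and requests_per_second=10, the target time is 100 seconds per batch, which is why rethrottling to -1 can speed up a slow-running delete considerably.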

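_tasks

# ======================
# List all running delete tasks
# ======================

# Option 1: compact task list (with column headers)
GET _cat/tasks?v=true

# Option 2: detailed task information (filtered to delete-by-query actions)
GET _tasks?detailed=true&actions=*/delete/byquery

# ======================
# Inspect a single asynchronous delete task
# ======================

# Show the details of the task with the given ID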
GET _tasks/5WXzYG6XSnifGUHCFzBCBQ:3328493

Conditional delete with request parameters

# Conditional delete 1 (basic)
POST citibike-202202/_delete_by_query?conflicts=proceed&refresh=true&max_docs=1000&requests_per_second=10&scroll_size=1000&slices=1&wait_for_completion=false
{
  "query": {
    "match": {
      "member_casual": "member1"
    }
  }
}

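# Conditional delete 2 (manual slicing): this request handles slice 0 of 2; repeat it with "id": 1 for the other slice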
POST citibike-202202/_delete_by_query?conflicts=proceed&refresh=true&max_docs=1000&requests_per_second=10&scroll_size=1000&slices=1&wait_for_completion=false
{
  "slice": {
    "id": 0,
    "max": 2
  },
  "query": {
    "match": {
      "member_casual": "member1"
    }
  }
}
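
With manual slicing, one request is needed per slice, each carrying the same query and a different slice id. The sketch below generates those request bodies; sliced_bodies is a hypothetical helper, and sending the requests (for example in parallel) is left out.

```python
import copy
import json
from typing import Any, Dict, List


def sliced_bodies(query_body: Dict[str, Any], max_slices: int) -> List[Dict[str, Any]]:
    """Return one _delete_by_query body per slice, ids 0..max_slices-1."""
    if max_slices < 2:
        raise ValueError("manual slicing needs at least 2 slices")
    bodies = []
    for slice_id in range(max_slices):
        # Copy so that each body can be sent and modified independently.
        body = copy.deepcopy(query_body)
        body["slice"] = {"id": slice_id, "max": max_slices}
        bodies.append(body)
    return bodies


if __name__ == "__main__":
    base = {"query": {"match": {"member_casual": "member1"}}}
    for body in sliced_bodies(base, 2):
        print(json.dumps(body))
```

Each generated body would be posted to citibike-202202/_delete_by_query as in the request above.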

Slicing can also be automatic: set the slices URL parameter (for example, slices=auto) instead of sending a slice object in the body.