II. Basic Elasticsearch Operations

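I. Basic Concepts

  Elasticsearch is a document-oriented database. Documents are serialized as JSON and stored in Elasticsearch; a document is the smallest unit of searchable data. To make the concepts easier to grasp, we compare the relevant Elasticsearch terms with their MySQL counterparts.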

[Figure: Elasticsearch concepts mapped to their MySQL counterparts]

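  • An Index in Elasticsearch can be thought of as a database, a Type as a table, and a Document as a row in that table
  • Before 6.0, one Index could hold multiple Types. From 6.0 on, an Index can have only one Type; in 7.x Types are deprecated (the single Type is _doc), and 8.0 removes them entirely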

II. Basic Index Operations

1. Create an index

put http://localhost:9200/shopping
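  • Response
{
    "acknowledged": true,
    "shards_acknowledged": true,   # the required shard copies were started before the timeout
    "index": "shopping"            # index name
}
# A new index gets 1 primary shard by default; before version 7.0 the default was 5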

2. View a specific index

get http://localhost:9200/shopping
  • Response
{
    "shopping": {
        "aliases": {},    # aliases
        "mappings": {},   # mappings
        "settings": {
            "index": {
                "creation_date": "1664875716347",   # index creation time (epoch milliseconds)
                "number_of_shards": "1",            # number of primary shards
                "number_of_replicas": "1",          # number of replicas
                "uuid": "U4Cn4kugQ46Kvq5LLpHsqQ",   # unique identifier of the index
                "version": {
                    "created": "7080099"            # Elasticsearch version that created the index
                },
                "provided_name": "shopping"         # index name
            }
        }
    }
}

3. List all indices

get http://localhost:9200/_cat/indices?v
  • Response
health status index    uuid                   pri rep docs.count docs.deleted store.size pri.store.size
yellow open   shopping U4Cn4kugQ46Kvq5LLpHsqQ   1   1          0            0       208b           208b

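Column           Meaning
health           Index health: green (all primary and replica shards are allocated), yellow (all primary shards are allocated but some replicas are not), red (at least one primary shard is not allocated)
status           Index status: open or close
index            Index name
uuid             Unique identifier of the index
pri              Number of primary shards
rep              Number of replicas
docs.count       Number of available documents
docs.deleted     Number of documents marked as deleted (logical deletion)
store.size       Total disk space used by primary and replica shards
pri.store.size   Disk space used by primary shards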

4. Delete an index

delete http://localhost:9200/shopping
  • Response
{
    "acknowledged": true
}

III. Document CRUD

1. Create a document

(1) Create with an auto-generated ID

post http://localhost:9200/shopping/_doc

  • Response
{
    "_index": "shopping",     # index
    "_type": "_doc",          # type (always _doc)
    "_id": "2p48o4MB5fq_0p-xahUa",   # auto-generated ID
    "_version": 1, 
    "result": "created",     # created means the document was created successfully
    "_shards": {
        "total": 2,
        "successful": 1,
        "failed": 0
    },
    "_seq_no": 0,
    "_primary_term": 1
}

(2) Create with a specified ID

post http://localhost:9200/shopping/_doc/1001

  • Response
{
    "_index": "shopping",
    "_type": "_doc",
    "_id": "1001",
    "_version": 1,
    "result": "created",
    "_shards": {
        "total": 2,
        "successful": 1,
        "failed": 0
    },
    "_seq_no": 1,
    "_primary_term": 1
}

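Note: when a document is written through /shopping/_doc/{id}, a new document is created if the ID does not exist yet. If the ID already exists, the existing document is deleted and replaced by the new one, and the version number is incremented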

(3) Create with the op_type parameter

post http://localhost:9200/shopping/_doc/1002?op_type=create

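The shorthand for this is post http://localhost:9200/shopping/_create/1002. Its distinguishing feature is that creating a document whose ID already exists fails (HTTP 409, version conflict) instead of overwriting it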
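
The difference between the two write styles can be sketched with a small in-memory model. This is only an illustration: `TinyIndex` and its methods are hypothetical names, not part of any Elasticsearch client, and the sample documents are made up for the example.

```python
class TinyIndex:
    """In-memory stand-in for one index; NOT an Elasticsearch client."""

    def __init__(self):
        self._docs = {}  # doc_id -> (version, source)

    def index(self, doc_id, source):
        """Mimic POST /<index>/_doc/<id>: create, or replace and bump the version."""
        old_version = self._docs.get(doc_id, (0, None))[0]
        new_version = old_version + 1
        self._docs[doc_id] = (new_version, dict(source))
        result = "created" if old_version == 0 else "updated"
        return {"_id": doc_id, "_version": new_version, "result": result}

    def create(self, doc_id, source):
        """Mimic POST /<index>/_create/<id>: refuse to overwrite an existing ID."""
        if doc_id in self._docs:
            raise ValueError(f"document {doc_id} already exists")
        return self.index(doc_id, source)


shop = TinyIndex()
first = shop.index("1001", {"title": "苹果12", "price": 4999})
second = shop.index("1001", {"title": "苹果13", "price": 6999})
print(first["result"], first["_version"])    # created 1
print(second["result"], second["_version"])  # updated 2

shop.create("1002", {"title": "苹果12", "price": 4999})
try:
    shop.create("1002", {"title": "苹果12", "price": 4999})
    rejected = False
except ValueError as err:
    rejected = True
    print("create rejected:", err)
```

The real server reports the second `_create` as a version conflict; the sketch only reproduces the accept/reject decision and the version counter.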

2. Get a document

get http://localhost:9200/shopping/_doc/1002
  • Response
{
    "_index": "shopping",
    "_type": "_doc",
    "_id": "1002",
    "_version": 1,
    "_seq_no": 2,
    "_primary_term": 1,
    "found": true,
    "_source": {
        "title": "苹果12",
        "price": 4999
    }
}

3. Update a document

(1) Replace the whole document

post http://localhost:9200/shopping/_doc/1001

  • Response
{
    "_index": "shopping",
    "_type": "_doc",
    "_id": "1001",
    "_version": 2,         # the version number is incremented after a successful update
    "result": "updated",
    "_shards": {
        "total": 2,
        "successful": 1,
        "failed": 0
    },
    "_seq_no": 3,
    "_primary_term": 1
}

(2) Update specific fields of a document

post http://localhost:9200/shopping/_update/1001
  • Request body for updating specific fields:
 {
    "doc":{
        "price": 6999
     }
}
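
As a rough mental model, a partial update overlays the fields under "doc" onto the stored _source and leaves the other fields alone. The sketch below shows that for flat documents only; `apply_partial_update` is a hypothetical helper and the "before" document is an assumed example, not data from this article.

```python
def apply_partial_update(source, doc):
    """Return a new _source in which the fields of `doc` override those of `source`."""
    merged = dict(source)  # copy, so the stored document is not mutated
    merged.update(doc)     # only the fields present in `doc` change
    return merged


before = {"title": "苹果13", "price": 4999}  # assumed stored document
after = apply_partial_update(before, {"price": 6999})
print(after)  # {'title': '苹果13', 'price': 6999}
```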

4. Delete a document

delete http://localhost:9200/shopping/_doc/1002

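IV. Querying Documents with Conditions

  • Official API reference: www.elastic.co/guide/en/el… ; this section focuses on Request Body Search, because it is easy to understand, use, and remember.

  • What is Request Body Search
    Put simply, it means using [Query DSL] (www.elastic.co/guide/en/el…) in the request body of an HTTP search request to carry out a search.

1. Query all documents

get http://localhost:9200/shopping/_search
  • Request body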
{
    "query":{
        "match_all":{}
    }
}
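# query: the query object; it can contain different kinds of query clauses
# match_all: matches every document in the index

  • Response
{
    "took": 1,                # time the query took, in milliseconds
    "timed_out": false,       # whether the query timed out
    "_shards": {              # shard information
        "total": 1,
        "successful": 1,
        "skipped": 0,
        "failed": 0
    },
    "hits": {                # information about the search hits
        "total": {           # total number of documents matching the query
            "value": 3,
            "relation": "eq"  # eq: the count is exact; gte: the count is a lower bound
        },
        "max_score": 1.0,     # highest relevance score among the hits
        "hits": [             # the matching documents
            {
                "_index": "shopping",
                "_type": "_doc",
                "_id": "2p48o4MB5fq_0p-xahUa",
                "_score": 1.0,                  # relevance score
                "_source": {                    # original document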
                    "title": "小米手机","price": 3999.00
                }
            },
            {
                "_index": "shopping", "_type": "_doc", "_id": "1001", "_score": 1.0,
                "_source": {
                    "title": "苹果13", "price": 6999
                }
            },
            {
                "_index": "shopping", "_type": "_doc", "_id": "1003",  "_score": 1.0,
                "_source": {
                    "title": "华为P50", "price": 4999.00
                }
            }
        ]
    }
}

2. Paginated queries

get http://localhost:9200/shopping/_search
  • Request body
{
    "query":{
        "match_all":{}
    },
    "from":0,
    "size":1
}
# from: offset of the first hit on the current page, starting at 0. Formula: from = (pageNum - 1) * size
# size: number of hits per page
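
The formula above can be wrapped in a small helper that turns a 1-based page number into a request body. `page_body` is a hypothetical name chosen for this sketch; it only builds the dictionary and does not send any request.

```python
def page_body(page_num, size):
    """Build a match_all search body for a 1-based page number."""
    if page_num < 1 or size < 1:
        raise ValueError("page_num and size must be positive")
    return {
        "query": {"match_all": {}},
        "from": (page_num - 1) * size,  # offset of the first hit on this page
        "size": size,
    }


print(page_body(1, 1)["from"])   # 0
print(page_body(3, 10)["from"])  # 20
```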

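3. Sorting query results

  • Given the following sample documents, sort and paginate them
{ "title": "苹果13","price": 6999,"type": 1 }
{ "title": "华为P50","price": 4999, "type": 3 }
{ "title": "小米手机","price": 3999,"type": 2 }
{ "title": "华为Mate50","price": 7999,"type": 3 }
  • Sorting is done with the sort keyword, which lets us sort by different fields; the order keyword sets the direction: desc for descending, asc for ascending. Sorting works best on numeric or date fields.

3.1 Sort by a single field, then paginate

get http://localhost:9200/shopping/_search
  • Request body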
{
    "query":{
        "match_all":{}
    },
    "sort":[{
        "price": { "order":"desc" }
    }],
    "from":0,
    "size":2    
}

  • Response
{
    "took": 1,
    "timed_out": false,
    "_shards": {
        "total": 1,"successful": 1, "skipped": 0,"failed": 0
    },
    "hits": {
        "total": {
            "value": 4, "relation": "eq"
        },
        "max_score": null,
        "hits": [
            {
                "_index": "shopping", "_type": "_doc", "_id": "1004", "_score": null,
                "_source": {
                    "title": "华为Mate50", "price": 7999.00, "type": 3
                },
                "sort": [
                    7999.0
                ]
            },
            {
                "_index": "shopping", "_type": "_doc", "_id": "1001", "_score": null,
                "_source": {
                    "title": "苹果13","price": 6999,"type": 1
                },
                "sort": [
                    6999.0
                ]
            }
        ]
    }
}

3.2 Sort by multiple fields, then paginate

get http://localhost:9200/shopping/_search
  • Request body
{
    "query":{
        "match_all":{}
    },
    "sort":[
        {
            "price": { "order":"asc" }
        },
        {
            "type": { "order":"desc" }
        }
    ],
    "from":0,
    "size":2    
}
  • Response
{"took":5,"timed_out":false,"_shards":{"total":1,"successful":1,"skipped":0,"failed":0},"hits":{"total":{"value":4,"relation":"eq"},"max_score":null,"hits":[{"_index":"shopping","_type":"_doc","_id":"2p48o4MB5fq_0p-xahUa","_score":null,"_source":{"title":"小米手机","price":3999.0,"type":2},"sort":[3999.0,2]},{"_index":"shopping","_type":"_doc","_id":"1003","_score":null,"_source":{"title":"华为P50","price":4999.0,"type":3},"sort":[4999.0,3]}]}}
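
To check the ordering by hand, the same sort and pagination can be reproduced locally on the four sample documents. This is a sketch of the sorting logic only; `sort_and_page` is a hypothetical helper, not an Elasticsearch API.

```python
docs = [
    {"title": "苹果13", "price": 6999, "type": 1},
    {"title": "华为P50", "price": 4999, "type": 3},
    {"title": "小米手机", "price": 3999, "type": 2},
    {"title": "华为Mate50", "price": 7999, "type": 3},
]


def sort_and_page(docs, sort, from_=0, size=10):
    """Sort by a list of (field, order) pairs, then slice out one page."""
    ordered = list(docs)
    # Python's sort is stable, so apply the least significant key first.
    for field, order in reversed(sort):
        ordered.sort(key=lambda d: d[field], reverse=(order == "desc"))
    return ordered[from_:from_ + size]


single = sort_and_page(docs, [("price", "desc")], from_=0, size=2)
multi = sort_and_page(docs, [("price", "asc"), ("type", "desc")], from_=0, size=2)
print([d["title"] for d in single])  # ['华为Mate50', '苹果13']
print([d["title"] for d in multi])   # ['小米手机', '华为P50']
```

Both pages agree with the hits in the two responses above.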

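4. Returning only selected fields

  • Use the _source keyword in the request body to specify which fields to return. _source also supports wildcards, for example: "_source":["name*","mobile*"]
  • We can also use includes (fields to return) and excludes (fields to omit)
get http://localhost:9200/shopping/_search
  • Request body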
{
    "query":{
        "match_all":{}
    },
    "from":0,
    "size":1,
    "_source":["title","price"]
}
# or
{
    "query":{
        "match_all":{}
    },
    "from":0,
    "size":1,
    "_source":{
        "includes":["title","price"]
    }
}
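
The effect of includes, excludes, and wildcards can be imitated locally on a single document. The sketch below uses Python's `fnmatch` as a stand-in for the pattern matching, which is an assumption: `fnmatch` also understands `?` and `[...]`, so only plain `*` patterns should be compared with Elasticsearch. `filter_source` is a hypothetical helper.

```python
from fnmatch import fnmatchcase


def filter_source(source, includes=None, excludes=None):
    """Keep fields matching any include pattern and no exclude pattern."""
    includes = includes or ["*"]  # no includes means "everything"
    excludes = excludes or []
    return {
        field: value
        for field, value in source.items()
        if any(fnmatchcase(field, p) for p in includes)
        and not any(fnmatchcase(field, p) for p in excludes)
    }


doc = {"title": "苹果13", "price": 6999, "type": 1}
print(filter_source(doc, includes=["title", "price"]))  # {'title': '苹果13', 'price': 6999}
print(filter_source(doc, excludes=["t*"]))              # {'price': 6999}
```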

5. Exact-match queries with term and terms

{ "title": "苹果13","price": 6999,"type": 1,"comment": "很"}
{ "title": "华为P50","price": 4999, "type": 3 }
{ "title": "小米手机","price": 3999,"type": 2,"comment": "很好" }
{ "title": "华为Mate50","price": 7999.0,"type": 3,"alias": "xiaomi", 
  "comment": "非常", "place": "made in chain" }
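  • A term query is an exact keyword match; the query value is not analyzed (tokenized)
  • term query official API: www.elastic.co/guide/en/el…
  • Note: avoid term queries on text fields. Text fields are analyzed at index time, so an unanalyzed query value usually fails to match the stored tokens and returns no results ("text" here means a passage of Chinese or English prose, for example {"comment": "非常", "place": "made in chain"}). To search text fields, use a match query

5.1 Exact match on a single value

get http://localhost:9200/shopping/_search
  • Request body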
{
    "query":{
        "term":{           
            "alias": {
                "value": "xiaomi"    
            }
        }
    }
}

# or

{
    "query":{
        "term":{           
            "price": {
                "value": "7999"    
            }
        }
    }
}
  • Response
{"took":3,"timed_out":false,"_shards":{"total":1,"successful":1,"skipped":0,"failed":0},"hits":{"total":{"value":1,"relation":"eq"},"max_score":1.0,
"hits":[{"_index":"shopping","_type":"_doc","_id":"1004","_score":1.0,
"_source":{"title":"华为Mate50","price":7999.0,"type":3,"alias":"xiaomi","comment":"非常","place":"made in chain"}}]}}

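5.2 Exact match on multiple values

A terms query works like a term query but accepts several values; a document is returned if the field contains any one of them. It is similar to an IN clause in MySQL.

get http://localhost:9200/shopping/_search
  • Request body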
{
    "query":{
        "terms":{           
            "price":  ["7999","4999"]  
        }
    }
}
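
The IN-like behaviour can be mirrored with a plain membership test over the sample documents. This is only a local sketch (`terms_filter` is a hypothetical helper), and it compares numbers directly, whereas the request above passes the prices as strings and relies on the server to convert them.

```python
docs = [
    {"title": "苹果13", "price": 6999, "type": 1},
    {"title": "华为P50", "price": 4999, "type": 3},
    {"title": "小米手机", "price": 3999, "type": 2},
    {"title": "华为Mate50", "price": 7999, "type": 3},
]


def terms_filter(docs, field, values):
    """Keep documents whose field equals any of the given values (like SQL IN)."""
    wanted = set(values)
    return [d for d in docs if d.get(field) in wanted]


hits = terms_filter(docs, "price", [7999, 4999])
print([d["title"] for d in hits])  # ['华为P50', '华为Mate50']
```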

6. match queries

{ "title": "苹果13","price": 6999,"type": 1,"comment": "很"}
{ "title": "华为P50","price": 4999, "type": 3 }
{ "title": "小米手机","price": 3999,"type": 2,"comment": "很好" }
{ "title": "华为Mate50","price": 7999,"type": 3,"comment": "特别好" }
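  • A match query analyzes (tokenizes) the query text before searching; by default the resulting terms are combined with OR
  • match official API: www.elastic.co/guide/en/el…

6.1 match with the default OR operator

get http://localhost:9200/shopping/_search
  • Request body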
{
    "query":{
        "match":{
            "comment":"很好"
        }
    }
}

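(1) This returns every document whose comment field contains at least one of the terms produced by analyzing 很好. Assuming the default standard analyzer, which splits Chinese text into single characters, those terms are 很 and 好, so documents containing 很, 好, or 很好 are all found
(2) The terms of a match query are combined with OR by default; the operator parameter can change this to AND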

{
    "query":{
        "match":{
            "comment": {
                "query":"很很好",
                "operator":"and"
            }  
        }
    }
}
# Equivalent to "query":"很 AND 很 AND 好"
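  • Response (these hits correspond to the default OR query for 很好 at the start of this section: each of the three comments contains 很 or 好)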
{
    "took": 9,
    "timed_out": false,
    "_shards": {
        "total": 1, "successful": 1, "skipped": 0, "failed": 0
    },
    "hits": {
        "total": {
            "value": 3, "relation": "eq"
        },
        "max_score": 0.9400072,
        "hits": [
            {
                "_index": "shopping", "_type": "_doc", "_id": "2p48o4MB5fq_0p-xahUa", "_score": 0.9400072,
                "_source": {
                    "title": "小米手机", "price": 3999.0,"type": 2,"comment": "很好"
                }
            },
            {
                "_index": "shopping", "_type": "_doc",  "_id": "1001",  "_score": 0.5908618,
                "_source": {
                    "title": "苹果13","price": 6999,"type": 1,"comment": "很"
                }
            },
            {
                "_index": "shopping", "_type": "_doc", "_id": "1003",  "_score": 0.39019167,
                "_source": {
                    "title": "华为P50","price": 4999.0,"type": 3,"comment": "特别好"
                }
            }
        ]
    }
}
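
The OR and AND behaviour can be imitated locally. The sketch below assumes a tokenizer that emits one token per character, which is how the default standard analyzer is expected to treat Chinese text; `analyze` and `match` are hypothetical helpers and do not compute relevance scores. The comments are taken from the hits above, plus one document without a comment field.

```python
def analyze(text):
    """Assumed tokenizer: one token per non-space character."""
    return [ch for ch in text if not ch.isspace()]


def match(field_value, query, operator="or"):
    """Return True if the analyzed query matches the analyzed field value."""
    if field_value is None:  # the document has no such field
        return False
    doc_terms = set(analyze(field_value))
    query_terms = set(analyze(query))
    if operator == "and":
        return query_terms <= doc_terms   # every query term must be present
    return bool(query_terms & doc_terms)  # any query term is enough


comments = {"小米手机": "很好", "苹果13": "很", "华为P50": "特别好", "华为Mate50": None}

or_hits = [t for t, c in comments.items() if match(c, "很好")]
and_hits = [t for t, c in comments.items() if match(c, "很很好", operator="and")]
print(or_hits)   # ['小米手机', '苹果13', '华为P50']
print(and_hits)  # ['小米手机']
```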

6.2 match_phrase phrase queries

get http://localhost:9200/shopping/_search
  • Request body
{
    "query":{
        "match_phrase":{
            "comment": {
                "query": "很好"
            }  
        }
    }
}
  • Response
    The results now contain only documents whose comment field includes the phrase 很好.
{
    "took": 3,
    "timed_out": false,
    "_shards": {
        "total": 1,"successful": 1,"skipped": 0, "failed": 0
    },
    "hits": {
        "total": {
            "value": 2, "relation": "eq"
        },
        "max_score": 1.2320213,
        "hits": [
            {
                "_index": "shopping",  "_type": "_doc",  "_id": "1001",  "_score": 1.2320213,
                "_source": {
                    "title": "苹果13","price": 6999,"type": 1,"comment": "很很好"
                }
            },
            {
                "_index": "shopping", "_type": "_doc", "_id": "2p48o4MB5fq_0p-xahUa", "_score": 1.1433705,
                "_source": {
                    "title": "小米手机","price": 3999.0,"type": 2, "comment": "很好"
                }
            }
        ]
    }
}

7. range queries

Operator   Meaning
gt         greater than
gte        greater than or equal to
lt         less than
lte        less than or equal to

get http://localhost:9200/shopping/_search
  • Request body
{
    "query":{
        "range":{           
            "price": {
                "gt": 5999,
                "lte": 8999 
            }
        }
    }
}
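
The four operators translate directly into comparisons. As a local sketch (the `in_range` helper is hypothetical), applying the bounds from the request above to the four sample documents looks like this:

```python
def in_range(value, gt=None, gte=None, lt=None, lte=None):
    """Check a value against the optional gt / gte / lt / lte bounds."""
    if gt is not None and not value > gt:
        return False
    if gte is not None and not value >= gte:
        return False
    if lt is not None and not value < lt:
        return False
    if lte is not None and not value <= lte:
        return False
    return True


prices = {"苹果13": 6999, "华为P50": 4999, "小米手机": 3999, "华为Mate50": 7999}
hits = [title for title, price in prices.items() if in_range(price, gt=5999, lte=8999)]
print(hits)  # ['苹果13', '华为Mate50']
```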

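8. bool compound queries

Clause     Meaning
must       The clause must match for a document to be returned, and it contributes to the score
filter     The clause must match for a document to be returned, but it does not affect the score
should     The clause is optional; documents that match it score higher
must_not   Documents that match the clause are excluded from the results

get http://localhost:9200/shopping/_search
  • Request body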
{
  "query": {
    "bool" : {
        
      "must" : {
        "term" : { "price" : 4999 }
      },
      
      "filter": {
        "term" : { "type" : 3 }
      },

      "must_not" : {
        "range" : {
          "price" : { "gte" : 10, "lte" : 20 }
        }
      },

      "should" : [
        { "term" : { "price" : 2000 } },
        { "term" : { "price" : 1000} }
      ]

    }
  }
}
  • Response
{
    "took": 7,
    "timed_out": false,
    "_shards": {
        "total": 1, "successful": 1,  "skipped": 0,  "failed": 0
    },
    "hits": {
        "total": {
            "value": 1, "relation": "eq"
        },
        "max_score": 1.0,
        "hits": [
            {
                "_index": "shopping",  "_type": "_doc",  "_id": "1003",  "_score": 1.0,
                "_source": {
                    "title": "华为P50",  "price": 4999.0, "type": 3, "comment": "特别好"
                }
            }
        ]
    }
}
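
How the four clauses combine can be sketched with plain predicates over the sample documents. This is a simplified model that ignores scoring; `bool_query` is a hypothetical helper. It assumes that should clauses become mandatory only when there is no must or filter clause, which is the default behaviour as far as I know.

```python
docs = [
    {"title": "苹果13", "price": 6999, "type": 1},
    {"title": "华为P50", "price": 4999, "type": 3},
    {"title": "小米手机", "price": 3999, "type": 2},
    {"title": "华为Mate50", "price": 7999, "type": 3},
]


def bool_query(docs, must=(), filter_=(), must_not=(), should=()):
    """Evaluate bool clauses given as lists of predicate functions."""
    hits = []
    for doc in docs:
        required = all(p(doc) for p in (*must, *filter_))
        excluded = any(p(doc) for p in must_not)
        # Without must/filter clauses, at least one should clause has to match.
        optional_ok = bool(must or filter_) or not should or any(p(doc) for p in should)
        if required and not excluded and optional_ok:
            hits.append(doc)
    return hits


hits = bool_query(
    docs,
    must=[lambda d: d["price"] == 4999],
    filter_=[lambda d: d["type"] == 3],
    must_not=[lambda d: 10 <= d["price"] <= 20],
    should=[lambda d: d["price"] == 2000, lambda d: d["price"] == 1000],
)
print([d["title"] for d in hits])  # ['华为P50']
```

Neither should clause matches 华为P50, yet it is still returned, because the must and filter clauses already select it.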

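9. Aggregations

Aggregations let us compute statistics over documents, such as the maximum, minimum, or average of a field.

9.1 Group by a field and count

get http://localhost:9200/shopping/_search
  • Request body
{
    "aggs":{
        "group_price":{          // aggregation name, chosen by the user
            "terms":{            // group into buckets
                "field": "type"  // field to group by
            }
        }
    },
    "size":0                    // do not return the matching documents themselves
}
  • Response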
{
    "took": 6,
    "timed_out": false,
    "_shards": { "total": 1, "successful": 1, "skipped": 0, "failed": 0 },
    "hits": {
        "total": { "value": 4, "relation": "eq" },  "max_score": null,
        "hits": []
    },
    "aggregations": {
        "group_price": {
            "doc_count_error_upper_bound": 0,
            "sum_other_doc_count": 0,
            "buckets": [
                { "key": 3,  "doc_count": 2 },
                { "key": 1,  "doc_count": 1 },
                { "key": 2,  "doc_count": 1 }
            ]
        }
    }
}

9.2 Aggregate within each group

get http://localhost:9200/shopping/_search
  • Request body
{
    "aggs":{
        "group_type":{          
            "terms":{           
                "field": "type"  
            },
            // nested aggregation, computed per bucket
            "aggs":{
                "sum_price":{
                    "sum" :{"field":"price"}
                }
            }
        }
    },
    "size":0                   
}
  • Response
{
    "took": 2,
    "timed_out": false,
    "_shards": { "total": 1, "successful": 1, "skipped": 0, "failed": 0     },
    "hits": {
        "total": { "value": 4, "relation": "eq" },  "max_score": null,
        "hits": []
    },
    "aggregations": {
        "group_type": {
            "doc_count_error_upper_bound": 0,
            "sum_other_doc_count": 0,
            "buckets": [
                {
                    "key": 3,   "doc_count": 2,  "sum_price": { "value": 12998.0 }
                },
                {
                    "key": 1,  "doc_count": 1,  "sum_price": { "value": 6999.0 }
                },
                {
                    "key": 2,  "doc_count": 1,  "sum_price": { "value": 3999.0 }
                }
            ]
        }
    }
}
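
The buckets above can be reproduced by hand, which is a useful way to check what the nested aggregation computes. The sketch below groups the four sample documents by type, counts them, and sums the price per group; `terms_with_sum` is a hypothetical helper, and the bucket order (largest doc_count first, ties broken by key) is chosen to match the response above.

```python
from collections import defaultdict

docs = [
    {"title": "苹果13", "price": 6999, "type": 1},
    {"title": "华为P50", "price": 4999, "type": 3},
    {"title": "小米手机", "price": 3999, "type": 2},
    {"title": "华为Mate50", "price": 7999, "type": 3},
]


def terms_with_sum(docs, group_field, sum_field):
    """Group by one field, then count documents and sum another field per group."""
    counts = defaultdict(int)
    totals = defaultdict(float)
    for doc in docs:
        key = doc[group_field]
        counts[key] += 1
        totals[key] += doc[sum_field]
    ordered = sorted(counts, key=lambda k: (-counts[k], k))
    return [
        {"key": k, "doc_count": counts[k], "sum_price": {"value": totals[k]}}
        for k in ordered
    ]


buckets = terms_with_sum(docs, "type", "price")
for bucket in buckets:
    print(bucket)
```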

9.3 Maximum value of a field

get http://localhost:9200/shopping/_search
  • Request body
{
    "aggs":{
        "max_price":{          // aggregation name, chosen by the user
            "max":{          
                "field": "price"
            }
        }
    },
    "size":0                   // do not return the matching documents themselves
}
  • Response
{
    "took": 2,
    "timed_out": false,
    "_shards": {  "total": 1, "successful": 1, "skipped": 0,  "failed": 0 },
    "hits": {
        "total": { "value": 4, "relation": "eq" },
        "max_score": null,
        "hits": []
    },
    "aggregations": {
        "max_price": { "value": 7999.0 }
    }
}

9.4 Count distinct values

get http://localhost:9200/shopping/_search
  • Request body
{
    "aggs":{
        "distinct_type":{          
            "cardinality":{ "field": "type"}
        }
    },
    "size":0                   
}
  • Response: the type field has 3 distinct values
{
    "took": 1,
    "timed_out": false,
    "_shards": {
        "total": 1, "successful": 1, "skipped": 0, "failed": 0
    },
    "hits": {
        "total": {  "value": 4, "relation": "eq" },
        "max_score": null,
        "hits": []
    },
    "aggregations": {
        "distinct_type": { "value": 3 }
    }
}

9.5 Other aggregations

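10. Highlighting

Elasticsearch can wrap the matched keywords in query results with tags and styles (highlighting). Add a highlight property alongside the match query; its fields are described below:

Field       Purpose
pre_tags    Tag inserted before each match
post_tags   Tag inserted after each match
fields      Fields to highlight

get http://localhost:9200/shopping/_search
  • Request body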
{
    "query": {
        "match": {
            "comment": {
                "query": "很好",
                "operator": "and"
            }
        }
    },
    "highlight": {
        "pre_tags": "<font color='red'>",
        "post_tags": "</font>",
        "fields": {
            "comment": {}
        }
    }
}
  • Response

[Figure: screenshot of the highlighted response]

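11. Query String and Simple Query String queries

11.1 Query String

  • It parses the query string with a strict-syntax parser and returns the matching results
  • If the query string contains a syntax error, no data is returned; the request fails with an error message
  • Query String is generally not recommended because of this strict syntax validation; when you need this style of query, prefer Simple Query String
get http://localhost:9200/shopping/_search
  • Request body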
{
    "query": {
        "query_string": {
            "query": "小 and 很好",
            "fields": ["comment","title"]
        }
  }
}

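# fields lists the fields to search (the query string is matched against both "comment" and "title", and every document that matches is returned)
# Note: query_string only treats uppercase AND, OR, NOT as operators; the lowercase "and" above is parsed as an ordinary term
  • Response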
{
    "took": 5,
    "timed_out": false,
    "_shards": {
        "total": 1, "successful": 1, "skipped": 0, "failed": 0
    },
    "hits": {
        "total": { "value": 3, "relation": "eq" },
        "max_score": 1.2320213,
        "hits": [
            {
                "_index": "shopping", "_type": "_doc",  "_id": "1001", "_score": 1.2320213,
                "_source": {  "title": "苹果13",  "price": 6999,  "type": 1, "comment": "很很好" }
            },
            {
                "_index": "shopping",  "_type": "_doc",  "_id": "2p48o4MB5fq_0p-xahUa", "_score": 1.1433705,
                "_source": { "title": "小米手机",  "price": 3999.0, "type": 2, "comment": "很好" }
            },
            {
                "_index": "shopping",  "_type": "_doc", "_id": "1003",
                "_score": 0.32969955,
                "_source": {  "title": "华为P50", "price": 4999.0, "type": 3, "comment": "特别好" }
            }
        ]
    }
}

11.2 Simple Query String

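  • Similar to Query String, but it ignores invalid syntax and supports only part of the query syntax
  • The keywords AND, OR, NOT are not supported; they are treated as plain text
  • Terms are combined with OR by default; set default_operator to change this
  • Use + in place of AND, | in place of OR, and - in place of NOT
get http://localhost:9200/shopping/_search
  • Request body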
{
    "query": {
        "simple_query_string": {
            "query": "特别好",
            "fields": ["comment","title"],
            "default_operator": "AND"
        }
  }
}
  • Response
{"took":2,"timed_out":false,"_shards":{"total":1,"successful":1,"skipped":0,"failed":0},"hits":{"total":{"value":1,"relation":"eq"},"max_score":2.5555317,
"hits":[
      {"_index":"shopping",
      "_type":"_doc","_id":"1003",
      "_score":2.5555317,
      "_source":{"title":"华为P50","price":4999.0,"type":3,"comment":"特别好"}}]}}