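I. Basic Concepts
Elasticsearch is a document-oriented database. Documents are serialized as JSON and stored in Elasticsearch, and the document is the smallest unit of searchable data. To make the terminology easier to grasp, we map the Elasticsearch concepts onto their MySQL counterparts.
- An Elasticsearch Index can be thought of as a database, a Type as a table, and a Document as a row in that table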
- Before 6.0, an Index could contain multiple Types; indices created in 6.x can hold only one Type; from 7.0 onward Types are deprecated, and an Index has the single type _doc
II. Basic Index Operations
1. Create an index
put http://localhost:9200/shopping
- Response
{
"acknowledged": true,
"shards_acknowledged": true, # the shards were started successfully
"index": "shopping" # index name
}
# A new index gets 1 primary shard by default; before 7.0 the default was 5.
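The index requests in this section can also be issued from a script. The following is a minimal sketch using only Python's standard library, assuming an unsecured node at http://localhost:9200 as in the examples; `es_request` is a hypothetical helper name introduced for illustration, and the actual send is left commented out so the snippet runs without a node.

```python
import json
import urllib.request

BASE_URL = "http://localhost:9200"  # assumption: local node without authentication


def es_request(method, path, body=None):
    """Build (but do not send) an HTTP request for the Elasticsearch REST API."""
    data = json.dumps(body).encode("utf-8") if body is not None else None
    headers = {"Content-Type": "application/json"} if data is not None else {}
    return urllib.request.Request(BASE_URL + path, data=data, headers=headers, method=method)


req = es_request("PUT", "/shopping")
print(req.get_method(), req.full_url)

# To send it against a running node:
# with urllib.request.urlopen(req) as resp:
#     print(json.load(resp))
```

The same helper covers the other calls in this article by changing the method, the path, and the optional JSON body.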
2. View a specific index
get http://localhost:9200/shopping
- Response
{
"shopping": {
"aliases": {}, # aliases
"mappings": {}, # mappings
"settings": {
"index": {
"creation_date": "1664875716347", # index creation time (epoch milliseconds)
"number_of_shards": "1", # number of primary shards
"number_of_replicas": "1", # number of replicas
"uuid": "U4Cn4kugQ46Kvq5LLpHsqQ", # unique identifier of the index
"version": {
"created": "7080099" # version the index was created with
},
"provided_name": "shopping" # index name
}
}
}
}
3. View all indices
get http://localhost:9200/_cat/indices?v
- Response
health status index uuid pri rep docs.count docs.deleted store.size pri.store.size
yellow open shopping U4Cn4kugQ46Kvq5LLpHsqQ 1 1 0 0 208b 208b
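| Column | Meaning |
|---|---|
| health | Health status: green (all primary and replica shards are allocated), yellow (all primary shards are allocated, but some replicas are not), red (at least one primary shard is not allocated) |
| status | Index status: open or close |
| index | Index name |
| uuid | Unique identifier of the index |
| pri | Number of primary shards |
| rep | Number of replicas |
| docs.count | Number of live documents |
| docs.deleted | Number of documents marked as deleted (logical deletes) |
| store.size | Disk space used by primary and replica shards together |
| pri.store.size | Disk space used by primary shards |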
4. Delete a specific index
delete http://localhost:9200/shopping
- Response
{
"acknowledged": true
}
III. Creating, Reading, Updating, and Deleting Documents
- Official API reference: www.elastic.co/guide/en/el…
1. Create a document
(1) Create with an auto-generated ID
post http://localhost:9200/shopping/_doc
- Response
{
"_index": "shopping", # index
"_type": "_doc", # the type is _doc
"_id": "2p48o4MB5fq_0p-xahUa", # randomly generated ID
"_version": 1,
"result": "created", # created means the document was created successfully
"_shards": {
"total": 2,
"successful": 1,
"failed": 0
},
"_seq_no": 0,
"_primary_term": 1
}
(2) Create with an explicit ID
post http://localhost:9200/shopping/_doc/1001
- Response
{
"_index": "shopping",
"_type": "_doc",
"_id": "1001",
"_version": 1,
"result": "created",
"_shards": {
"total": 2,
"successful": 1,
"failed": 0
},
"_seq_no": 1,
"_primary_term": 1
}
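Note: when creating through /shopping/_doc/{id}, a new document is created if the ID does not exist. If the ID already exists, the existing document is deleted and replaced by the new one, and the version number is incremented.
(3) Create with the op_type parameter
post http://localhost:9200/shopping/_doc/1002?op_type=create
The shorthand for this form is post http://localhost:9200/shopping/_create/1002.
Its distinguishing feature is that creating a document whose ID already exists fails.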
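The difference between the two creation styles can be sketched locally. The snippet below is a simplified in-memory model, not Elasticsearch code: `store`, `index_doc`, `create_doc`, and `VersionConflict` are hypothetical names, with `VersionConflict` standing in for the HTTP 409 conflict that Elasticsearch returns when a create request hits an existing ID.

```python
class VersionConflict(Exception):
    """Stands in for the 409 conflict returned for a duplicate create."""


store = {}  # doc_id -> (version, source); a toy replacement for the index


def index_doc(doc_id, source):
    """Like POST /shopping/_doc/{id}: create or replace, bumping the version."""
    version = store[doc_id][0] + 1 if doc_id in store else 1
    store[doc_id] = (version, source)
    result = "created" if version == 1 else "updated"
    return {"_id": doc_id, "_version": version, "result": result}


def create_doc(doc_id, source):
    """Like POST /shopping/_create/{id}: refuse to touch an existing ID."""
    if doc_id in store:
        raise VersionConflict(doc_id)
    return index_doc(doc_id, source)


print(index_doc("1001", {"title": "苹果12", "price": 4999}))
print(index_doc("1001", {"title": "苹果13", "price": 6999}))  # same ID: replaced
try:
    create_doc("1001", {"title": "苹果14"})
except VersionConflict as exc:
    print("create rejected for existing id", exc)
```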
2. Get a document
get http://localhost:9200/shopping/_doc/1002
- Response
{
"_index": "shopping",
"_type": "_doc",
"_id": "1002",
"_version": 1,
"_seq_no": 2,
"_primary_term": 1,
"found": true,
"_source": {
"title": "苹果12",
"price": 4999
}
}
3. Update a document
(1) Overwrite the whole document
post http://localhost:9200/shopping/_doc/1001
- Response
{
"_index": "shopping",
"_type": "_doc",
"_id": "1001",
"_version": 2, # the version number is incremented after a successful update
"result": "updated",
"_shards": {
"total": 2,
"successful": 1,
"failed": 0
},
"_seq_no": 3,
"_primary_term": 1
}
(2) Update specific fields of a document
post http://localhost:9200/shopping/_update/1001
- Request body for updating specific fields:
{
"doc":{
"price": 6999
}
}
4. Delete a document
delete http://localhost:9200/shopping/_doc/1002
IV. Conditional Document Queries
- Official API reference: www.elastic.co/guide/en/el… ; this section focuses on Request Body Search, because it is easy to understand, use, and remember.
- What is Request Body Search? Put simply, it is a search whose HTTP request body uses [Query DSL](www.elastic.co/guide/en/el…) to describe the query.
1. Query all documents
get http://localhost:9200/shopping/_search
- Request body
{
"query":{
"match_all":{}
}
}
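# query: the query object; it can hold different kinds of query clauses
# match_all: matches every document in the index
- Response
{
"took": 1, # time the query took, in milliseconds
"timed_out": false, # whether the query timed out
"_shards": { # shard information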
"total": 1,
"successful": 1,
"skipped": 0,
"failed": 0
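},
"hits": { # information about the search hits
"total": { # total number of documents matching the query
"value": 3,
"relation": "eq" # eq: the count is exact; gte: the count is a lower bound
},
"max_score": 1.0, # highest relevance score among the hits
"hits": [ # the matching documents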
{
"_index": "shopping",
"_type": "_doc",
"_id": "2p48o4MB5fq_0p-xahUa",
"_score": 1.0, # relevance score
"_source": { # the original document
"title": "小米手机","price": 3999.00
}
},
{
"_index": "shopping", "_type": "_doc", "_id": "1001", "_score": 1.0,
"_source": {
"title": "苹果13", "price": 6999
}
},
{
"_index": "shopping", "_type": "_doc", "_id": "1003", "_score": 1.0,
"_source": {
"title": "华为P50", "price": 4999.00
}
}
]
}
}
2. Paginated queries
get http://localhost:9200/shopping/_search
- Request body
{
"query":{
"match_all":{}
},
"from":0,
"size":1
}
# from: offset of the first hit on the current page, starting at 0. Formula: from = (pageNum - 1) * size
# size: number of hits per page
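The from formula is easy to get wrong by one page, so here is a small sketch that turns a 1-based page number into a request body. The function name `page_body` is an assumption made for this example; only the `from`, `size`, and `query` keys come from the request above.

```python
def page_body(page_num, size, query=None):
    """Build a paginated search body from a 1-based page number."""
    if page_num < 1 or size < 1:
        raise ValueError("page_num and size must both be at least 1")
    return {
        "query": query if query is not None else {"match_all": {}},
        "from": (page_num - 1) * size,  # offset of the first hit on this page
        "size": size,
    }


print(page_body(1, 1))   # first page, one hit per page: from is 0
print(page_body(3, 10))  # third page, ten hits per page: from is 20
```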
3. Sorting
- Given the following sample documents, we sort and paginate them
{ "title": "苹果13","price": 6999,"type": 1 }
{ "title": "华为P50","price": 4999, "type": 3 }
{ "title": "小米手机","price": 3999,"type": 2 }
{ "title": "华为Mate50","price": 7999,"type": 3 }
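- Sorting uses the sort keyword, which lets us sort by different fields; the order keyword sets the direction: desc for descending, asc for ascending. Sorting works best on numeric and date fields.
3.1 Sort by a single field, then paginate
get http://localhost:9200/shopping/_search
- Request body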
{
"query":{
"match_all":{}
},
"sort":[{
"price": { "order":"desc" }
}],
"from":0,
"size":2
}
- Response
{
"took": 1,
"timed_out": false,
"_shards": {
"total": 1,"successful": 1, "skipped": 0,"failed": 0
},
"hits": {
"total": {
"value": 4, "relation": "eq"
},
"max_score": null,
"hits": [
{
"_index": "shopping", "_type": "_doc", "_id": "1004", "_score": null,
"_source": {
"title": "华为Mate50", "price": 7999.00, "type": 3
},
"sort": [
7999.0
]
},
{
"_index": "shopping", "_type": "_doc", "_id": "1001", "_score": null,
"_source": {
"title": "苹果13","price": 6999,"type": 1
},
"sort": [
6999.0
]
}
]
}
}
3.2 Sort by multiple fields, then paginate
get http://localhost:9200/shopping/_search
- Request body
{
"query":{
"match_all":{}
},
"sort":[
{
"price": { "order":"asc" }
},
{
"type": { "order":"desc" }
}
],
"from":0,
"size":2
}
- Response
{"took":5,"timed_out":false,"_shards":{"total":1,"successful":1,"skipped":0,"failed":0},"hits":{"total":{"value":4,"relation":"eq"},"max_score":null,"hits":[{"_index":"shopping","_type":"_doc","_id":"2p48o4MB5fq_0p-xahUa","_score":null,"_source":{"title":"小米手机","price":3999.0,"type":2},"sort":[3999.0,2]},{"_index":"shopping","_type":"_doc","_id":"1003","_score":null,"_source":{"title":"华为P50","price":4999.0,"type":3},"sort":[4999.0,3]}]}}
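To see why the two sort requests return those documents, the ordering can be reproduced locally. This is only a sketch over the four sample documents: `sort_and_page` is a hypothetical helper, and it mimics the sort, from, and size parameters with plain Python sorting rather than calling Elasticsearch.

```python
docs = [
    {"title": "苹果13", "price": 6999, "type": 1},
    {"title": "华为P50", "price": 4999, "type": 3},
    {"title": "小米手机", "price": 3999, "type": 2},
    {"title": "华为Mate50", "price": 7999, "type": 3},
]


def sort_and_page(docs, sort, from_=0, size=10):
    """sort is a list of (field, order) pairs, most significant field first."""
    result = list(docs)
    # Python's sort is stable, so sorting by the least significant key first
    # and the most significant key last yields a multi-field ordering.
    for field, order in reversed(sort):
        result.sort(key=lambda d: d[field], reverse=(order == "desc"))
    return result[from_:from_ + size]


single = sort_and_page(docs, [("price", "desc")], from_=0, size=2)
print([d["title"] for d in single])  # ['华为Mate50', '苹果13']

multi = sort_and_page(docs, [("price", "asc"), ("type", "desc")], from_=0, size=2)
print([d["title"] for d in multi])   # ['小米手机', '华为P50']
```

Because every price in the sample data is distinct, the second sort field never has to break a tie here; it only matters when two documents share the same price.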
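4. Return only selected fields
- Use the _source keyword in the request body to choose which fields are returned. _source also supports wildcards, for example: "_source":["name*","mobile*"]
- You can also use includes (fields to return) and excludes (fields to leave out)
get http://localhost:9200/shopping/_search
- Request body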
{
"query":{
"match_all":{}
},
"from":0,
"size":1,
"_source":["title","price"]
}
# or
{
"query":{
"match_all":{}
},
"from":0,
"size":1,
"_source":{
"includes":["title","price"]
}
}
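The effect of includes, excludes, and wildcards can be approximated locally. The sketch below is a rough emulation for top-level fields only, using `fnmatchcase` from the standard library for the wildcard matching; `filter_source` is a hypothetical name, and nested field paths are not handled.

```python
from fnmatch import fnmatchcase


def filter_source(source, includes=None, excludes=None):
    """Keep fields that match an include pattern and no exclude pattern."""
    includes = includes or ["*"]  # no includes given: start from every field
    excludes = excludes or []
    return {
        key: value
        for key, value in source.items()
        if any(fnmatchcase(key, pattern) for pattern in includes)
        and not any(fnmatchcase(key, pattern) for pattern in excludes)
    }


doc = {"title": "苹果13", "price": 6999, "type": 1}
print(filter_source(doc, includes=["title", "price"]))  # title and price only
print(filter_source(doc, excludes=["t*"]))              # drops title and type
```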
5. Exact matching with term and terms
{ "title": "苹果13","price": 6999,"type": 1,"comment": "很"}
{ "title": "华为P50","price": 4999, "type": 3 }
{ "title": "小米手机","price": 3999,"type": 2,"comment": "很好" }
{ "title": "华为Mate50","price": 7999.0,"type": 3,"alias": "xiaomi",
"comment": "非常", "place": "made in chain" }
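- The term query is an exact keyword match; the query value is not analyzed.
- Official term query API: www.elastic.co/guide/en/el…
- Note: avoid term queries on text fields. Text fields are analyzed into tokens at index time, so an unanalyzed query value usually does not equal any stored token and the query returns nothing (text here means a piece of prose in Chinese or English, for example {"comment": "非常", "place": "made in chain"}). Use a match query for text fields.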
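The reason a term query misses on text fields can be shown with a toy tokenizer. This sketch assumes the default standard analyzer, which lowercases words and emits one token per Chinese character; `tokenize` is a rough stand-in written for this example, not the real analyzer, and `term_matches` is a hypothetical helper.

```python
import re


def tokenize(text):
    """Rough stand-in for the standard analyzer: words, or single CJK characters."""
    return re.findall(r"[a-z0-9]+|[\u4e00-\u9fff]", text.lower())


def term_matches(field_value, term):
    """A term query compares the unanalyzed term with the indexed tokens."""
    return term in tokenize(field_value)


print(tokenize("非常"))                                  # ['非', '常']
print(term_matches("非常", "非常"))                      # False: no such token
print(tokenize("made in chain"))                         # ['made', 'in', 'chain']
print(term_matches("made in chain", "made in chain"))    # False
print(term_matches("xiaomi", "xiaomi"))                  # True: a single token
```

A single-word value such as xiaomi still matches because the whole value is one token, which is why a term query on the alias field returns a hit.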
5.1 Exact match on a single value
get http://localhost:9200/shopping/_search
- Request body
{
"query":{
"term":{
"alias": {
"value": "xiaomi"
}
}
}
}
# or
{
"query":{
"term":{
"price": {
"value": "7999"
}
}
}
}
- Response
{"took":3,"timed_out":false,"_shards":{"total":1,"successful":1,"skipped":0,"failed":0},"hits":{"total":{"value":1,"relation":"eq"},"max_score":1.0,
"hits":[{"_index":"shopping","_type":"_doc","_id":"1004","_score":1.0,
"_source":{"title":"华为Mate50","price":7999.0,"type":3,"alias":"xiaomi","comment":"非常","place":"made in chain"}}]}}
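5.2 Exact match on several values
The terms query works like the term query, but it accepts several values: a document is returned if the field contains any one of them. It is similar to an IN clause in MySQL.
get http://localhost:9200/shopping/_search
- Request body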
{
"query":{
"terms":{
"price": ["7999","4999"]
}
}
}
6. match queries
{ "title": "苹果13","price": 6999,"type": 1,"comment": "很"}
{ "title": "华为P50","price": 4999, "type": 3 }
{ "title": "小米手机","price": 3999,"type": 2,"comment": "很好" }
{ "title": "华为Mate50","price": 7999,"type": 3,"comment": "特别好" }
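- The match query analyzes the query text into terms before searching; by default the terms are combined with or.
- Official match API: www.elastic.co/guide/en/el…
6.1 match with the default or operator
get http://localhost:9200/shopping/_search
- Request body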
{
"query":{
"match":{
"comment":"很好"
}
}
}
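(1) The query text is analyzed into terms (with the default analyzer, 很 and 好), and every document whose comment field contains at least one of them is returned.
(2) Note: after analysis, the terms of a match query are combined with or by default. This default can be changed to and: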
{
"query":{
"match":{
"comment": {
"query":"很很好",
"operator":"and"
}
}
}
}
# Roughly equivalent to "query": "很 AND 很 AND 好"
- Response (to the first request, with the default or operator)
{
"took": 9,
"timed_out": false,
"_shards": {
"total": 1, "successful": 1, "skipped": 0, "failed": 0
},
"hits": {
"total": {
"value": 3, "relation": "eq"
},
"max_score": 0.9400072,
"hits": [
{
"_index": "shopping", "_type": "_doc", "_id": "2p48o4MB5fq_0p-xahUa", "_score": 0.9400072,
"_source": {
"title": "小米手机", "price": 3999.0,"type": 2,"comment": "很好"
}
},
{
"_index": "shopping", "_type": "_doc", "_id": "1001", "_score": 0.5908618,
"_source": {
"title": "苹果13","price": 6999,"type": 1,"comment": "很"
}
},
{
"_index": "shopping", "_type": "_doc", "_id": "1003", "_score": 0.39019167,
"_source": {
"title": "华为P50","price": 4999.0,"type": 3,"comment": "特别好"
}
}
]
}
}
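The or and and behaviour can be reproduced with a toy model. This sketch assumes the default standard analyzer, approximated by a `tokenize` function that emits one token per Chinese character; both `tokenize` and `match` are hypothetical helpers for illustration, and scoring is ignored.

```python
import re


def tokenize(text):
    """Rough stand-in for the standard analyzer: words, or single CJK characters."""
    return re.findall(r"[a-z0-9]+|[\u4e00-\u9fff]", text.lower())


def match(field_value, query, operator="or"):
    """Analyze the query, then require any (or) or all (and) of its terms."""
    doc_terms = set(tokenize(field_value))
    found = [term in doc_terms for term in tokenize(query)]
    return all(found) if operator == "and" else any(found)


comments = ["很", "很好", "特别好"]
print([c for c in comments if match(c, "很好")])         # ['很', '很好', '特别好']
print([c for c in comments if match(c, "很好", "and")])  # ['很好']
```

With or, a document only needs one of the characters, which is why the comment 特别好 is returned for the query 很好.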
6.2 match_phrase queries
get http://localhost:9200/shopping/_search
- Request body
{
"query":{
"match_phrase":{
"comment": {
"query": "很好"
}
}
}
}
- Response
The results now contain only the documents whose comment field contains the phrase 很好.
{
"took": 3,
"timed_out": false,
"_shards": {
"total": 1,"successful": 1,"skipped": 0, "failed": 0
},
"hits": {
"total": {
"value": 2, "relation": "eq"
},
"max_score": 1.2320213,
"hits": [
{
"_index": "shopping", "_type": "_doc", "_id": "1001", "_score": 1.2320213,
"_source": {
"title": "苹果13","price": 6999,"type": 1,"comment": "很很好"
}
},
{
"_index": "shopping", "_type": "_doc", "_id": "2p48o4MB5fq_0p-xahUa", "_score": 1.1433705,
"_source": {
"title": "小米手机","price": 3999.0,"type": 2, "comment": "很好"
}
}
]
}
}
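What separates a phrase match from a plain match is that the terms must appear next to each other and in order. A minimal local sketch, again assuming one token per Chinese character as a stand-in for the standard analyzer; `tokenize` and `match_phrase` are hypothetical helpers, and slop and scoring are left out.

```python
import re


def tokenize(text):
    """Rough stand-in for the standard analyzer: words, or single CJK characters."""
    return re.findall(r"[a-z0-9]+|[\u4e00-\u9fff]", text.lower())


def match_phrase(field_value, phrase):
    """True when the phrase tokens occur consecutively, in order, in the field."""
    doc_tokens = tokenize(field_value)
    phrase_tokens = tokenize(phrase)
    n = len(phrase_tokens)
    return n > 0 and any(
        doc_tokens[i:i + n] == phrase_tokens
        for i in range(len(doc_tokens) - n + 1)
    )


comments = ["很很好", "很好", "特别好"]
print([c for c in comments if match_phrase(c, "很好")])  # ['很很好', '很好']
```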
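7. range queries
- Official range query API: www.elastic.co/guide/en/el…
- A range query finds the documents whose field value falls within a given interval. It supports the following operators:
| Operator | Meaning |
|---|---|
| gt | Greater than |
| gte | Greater than or equal to |
| lt | Less than |
| lte | Less than or equal to |
get http://localhost:9200/shopping/_search
- Request body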
{
"query":{
"range":{
"price": {
"gt": 5999,
"lte": 8999
}
}
}
}
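The four operators map directly onto ordinary comparisons. The sketch below checks the sample prices against the bounds from the request above; `RANGE_OPS` and `in_range` are names made up for this illustration.

```python
import operator

# Each range operator corresponds to one comparison function.
RANGE_OPS = {
    "gt": operator.gt,
    "gte": operator.ge,
    "lt": operator.lt,
    "lte": operator.le,
}


def in_range(value, bounds):
    """True when the value satisfies every bound, e.g. {"gt": 5999, "lte": 8999}."""
    return all(RANGE_OPS[name](value, limit) for name, limit in bounds.items())


prices = [6999, 4999, 3999, 7999]
print([p for p in prices if in_range(p, {"gt": 5999, "lte": 8999})])  # [6999, 7999]
```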
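8. bool compound queries
- Official bool query API:
(1) www.elastic.co/guide/en/el…
(2) www.elastic.co/guide/en/el…
- A bool query supports the following clauses:
| Clause | Meaning |
|---|---|
| must | Documents must match this clause; it contributes to the relevance score |
| filter | Documents must match this clause; it does not affect the score |
| should | Documents should match this clause; a match raises the score |
| must_not | Documents matching this clause are excluded from the results |
get http://localhost:9200/shopping/_search
- Request body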
{
"query": {
"bool" : {
"must" : {
"term" : { "price" : 4999 }
},
"filter": {
"term" : { "type" : 3 }
},
"must_not" : {
"range" : {
"price" : { "gte" : 10, "lte" : 20 }
}
},
"should" : [
{ "term" : { "price" : 2000 } },
{ "term" : { "price" : 1000} }
]
}
}
}
- 响应结果
{
"took": 7,
"timed_out": false,
"_shards": {
"total": 1, "successful": 1, "skipped": 0, "failed": 0
},
"hits": {
"total": {
"value": 1, "relation": "eq"
},
"max_score": 1.0,
"hits": [
{
"_index": "shopping", "_type": "_doc", "_id": "1003", "_score": 1.0,
"_source": {
"title": "华为P50", "price": 4999.0, "type": 3, "comment": "特别好"
}
}
]
}
}
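How the four clauses combine can be modelled with plain predicates. This is a simplified sketch that ignores scoring: `bool_matches` is a hypothetical helper, each clause is a Python function over a document, and the rule that should becomes mandatory only when there is no must or filter clause reflects the default minimum_should_match behaviour.

```python
docs = [
    {"title": "苹果13", "price": 6999, "type": 1},
    {"title": "华为P50", "price": 4999, "type": 3},
    {"title": "小米手机", "price": 3999, "type": 2},
    {"title": "华为Mate50", "price": 7999, "type": 3},
]


def bool_matches(doc, must=(), filter_=(), should=(), must_not=()):
    """Decide whether a document passes a bool query (scores are not modelled)."""
    required = list(must) + list(filter_)
    if not all(clause(doc) for clause in required):
        return False
    if any(clause(doc) for clause in must_not):
        return False
    # Without must or filter clauses, at least one should clause has to match.
    if not required and should:
        return any(clause(doc) for clause in should)
    return True


hits = [
    d["title"]
    for d in docs
    if bool_matches(
        d,
        must=[lambda d: d["price"] == 4999],
        filter_=[lambda d: d["type"] == 3],
        must_not=[lambda d: 10 <= d["price"] <= 20],
        should=[lambda d: d["price"] == 2000, lambda d: d["price"] == 1000],
    )
]
print(hits)  # ['华为P50']
```

The should clauses match nothing here, yet the document is still returned, because with a must or filter clause present they only influence the score.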
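9. Aggregations
Aggregations let us run statistical analysis over documents, such as computing the maximum, minimum, or average of a field.
9.1 Group by a field and count
get http://localhost:9200/shopping/_search
- Request body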
{
"aggs":{
"group_price":{ // aggregation name, chosen freely
"terms":{ // group into buckets
"field": "type" // field to group by
}
}
},
"size":0 // do not return the matching documents themselves
}
- Response
{
"took": 6,
"timed_out": false,
"_shards": { "total": 1, "successful": 1, "skipped": 0, "failed": 0 },
"hits": {
"total": { "value": 4, "relation": "eq" }, "max_score": null,
"hits": []
},
"aggregations": {
"group_price": {
"doc_count_error_upper_bound": 0,
"sum_other_doc_count": 0,
"buckets": [
{ "key": 3, "doc_count": 2 },
{ "key": 1, "doc_count": 1 },
{ "key": 2, "doc_count": 1 }
]
}
}
}
9.2 Aggregate within each group
get http://localhost:9200/shopping/_search
- Request body
{
"aggs":{
"group_type":{
"terms":{
"field": "type"
},
// sub-aggregation applied to each bucket
"aggs":{
"sum_price":{
"sum" :{"field":"price"}
}
}
}
},
"size":0
}
- Response
{
"took": 2,
"timed_out": false,
"_shards": { "total": 1, "successful": 1, "skipped": 0, "failed": 0 },
"hits": {
"total": { "value": 4, "relation": "eq" }, "max_score": null,
"hits": []
},
"aggregations": {
"group_type": {
"doc_count_error_upper_bound": 0,
"sum_other_doc_count": 0,
"buckets": [
{
"key": 3, "doc_count": 2, "sum_price": { "value": 12998.0 }
},
{
"key": 1, "doc_count": 1, "sum_price": { "value": 6999.0 }
},
{
"key": 2, "doc_count": 1, "sum_price": { "value": 3999.0 }
}
]
}
}
}
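The bucket values in this response can be checked by hand. The sketch below groups the four sample documents by type and sums price within each group; `terms_with_sum` is a hypothetical helper, and the ordering (document count descending, then key ascending) is chosen here to mirror the response shown above.

```python
from collections import defaultdict

docs = [
    {"title": "苹果13", "price": 6999, "type": 1},
    {"title": "华为P50", "price": 4999, "type": 3},
    {"title": "小米手机", "price": 3999, "type": 2},
    {"title": "华为Mate50", "price": 7999, "type": 3},
]


def terms_with_sum(docs, group_field, sum_field):
    """Group documents by one field and sum another field inside each bucket."""
    buckets = defaultdict(lambda: {"doc_count": 0, "total": 0.0})
    for doc in docs:
        bucket = buckets[doc[group_field]]
        bucket["doc_count"] += 1
        bucket["total"] += doc[sum_field]
    rows = [
        {"key": key, "doc_count": b["doc_count"], "sum_price": {"value": b["total"]}}
        for key, b in buckets.items()
    ]
    # Largest buckets first; equal counts are ordered by key.
    return sorted(rows, key=lambda row: (-row["doc_count"], row["key"]))


for row in terms_with_sum(docs, "type", "price"):
    print(row)
```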
9.3 Maximum value of a field
get http://localhost:9200/shopping/_search
- Request body
{
"aggs":{
"max_price":{ // aggregation name, chosen freely
"max":{
"field": "price"
}
}
},
"size":0 // do not return the matching documents themselves
}
- Response
{
"took": 2,
"timed_out": false,
"_shards": { "total": 1, "successful": 1, "skipped": 0, "failed": 0 },
"hits": {
"total": { "value": 4, "relation": "eq" },
"max_score": null,
"hits": []
},
"aggregations": {
"max_price": { "value": 7999.0 }
}
}
9.4 Count distinct values
get http://localhost:9200/shopping/_search
- Request body
{
"aggs":{
"distinct_type":{
"cardinality":{ "field": "type"}
}
},
"size":0
}
- Response: after deduplication, the type field has 3 distinct values
{
"took": 1,
"timed_out": false,
"_shards": {
"total": 1, "successful": 1, "skipped": 0, "failed": 0
},
"hits": {
"total": { "value": 4, "relation": "eq" },
"max_score": null,
"hits": []
},
"aggregations": {
"distinct_type": { "value": 3 }
}
}
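For the sample data, the max and cardinality results can be verified with two one-line computations. This is only a local check with hypothetical helper names (`max_agg`, `cardinality_agg`); note that Elasticsearch computes cardinality with an approximate algorithm, so on large data sets its result can differ slightly from an exact distinct count like the one below.

```python
docs = [
    {"title": "苹果13", "price": 6999, "type": 1},
    {"title": "华为P50", "price": 4999, "type": 3},
    {"title": "小米手机", "price": 3999, "type": 2},
    {"title": "华为Mate50", "price": 7999, "type": 3},
]


def max_agg(docs, field):
    """Largest value of a field, returned as a float like the max aggregation."""
    return float(max(doc[field] for doc in docs))


def cardinality_agg(docs, field):
    """Exact number of distinct values of a field."""
    return len({doc[field] for doc in docs})


print(max_agg(docs, "price"))         # 7999.0
print(cardinality_agg(docs, "type"))  # 3
```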
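9.5 Other aggregations
10. Highlighting
Elasticsearch can wrap the matched keywords in the results with tags and styles (highlighting). Add a highlight property alongside the match query; its fields are described below:
| Field | Purpose |
|---|---|
| pre_tags | Opening tag |
| post_tags | Closing tag |
| fields | Fields to highlight |
get http://localhost:9200/shopping/_search
- Request body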
{
"query": {
"match": {
"comment": {
"query": "很好",
"operator": "and"
}
}
},
"highlight": {
"pre_tags": "<font color='red'>",
"post_tags": "</font>",
"fields": {
"comment": {}
}
}
}
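11. Query String and Simple Query String queries
11.1 Query String
- It parses the query string with a strict-syntax parser and returns the matching results
- If the query string contains a syntax error, no data is returned and the request fails with an error message
- Query String is generally not recommended because of its strict syntax validation; if you need this style of query, prefer Simple Query String
get http://localhost:9200/shopping/_search
- Request body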
{
"query": {
"query_string": {
"query": "小 and 很好",
"fields": ["comment","title"]
}
}
}
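# fields lists the fields to search (the query "小 and 很好" is matched against both "comment" and "title", and every matching document is returned). Note that the lowercase and is treated as an ordinary term; query_string only recognizes the uppercase operators AND, OR, and NOT.
- Response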
{
"took": 5,
"timed_out": false,
"_shards": {
"total": 1, "successful": 1, "skipped": 0, "failed": 0
},
"hits": {
"total": { "value": 3, "relation": "eq" },
"max_score": 1.2320213,
"hits": [
{
"_index": "shopping", "_type": "_doc", "_id": "1001", "_score": 1.2320213,
"_source": { "title": "苹果13", "price": 6999, "type": 1, "comment": "很很好" }
},
{
"_index": "shopping", "_type": "_doc", "_id": "2p48o4MB5fq_0p-xahUa", "_score": 1.1433705,
"_source": { "title": "小米手机", "price": 3999.0, "type": 2, "comment": "很好" }
},
{
"_index": "shopping", "_type": "_doc", "_id": "1003",
"_score": 0.32969955,
"_source": { "title": "华为P50", "price": 4999.0, "type": 3, "comment": "特别好" }
}
]
}
}
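11.2 Simple Query String
- Similar to Query String, but it ignores invalid syntax and supports only part of the query syntax
- AND, OR, and NOT are not supported as operators; they are treated as plain text
- The default relation between terms is OR, which can be changed with default_operator
- Use + in place of AND, | in place of OR, and - in place of NOT
get http://localhost:9200/shopping/_search
- Request body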
{
"query": {
"simple_query_string": {
"query": "特别好",
"fields": ["comment","title"],
"default_operator": "AND"
}
}
}
- Response
{"took":2,"timed_out":false,"_shards":{"total":1,"successful":1,"skipped":0,"failed":0},"hits":{"total":{"value":1,"relation":"eq"},"max_score":2.5555317,
"hits":[
{"_index":"shopping",
"_type":"_doc","_id":"1003",
"_score":2.5555317,
"_source":{"title":"华为P50","price":4999.0,"type":3,"comment":"特别好"}}]}}