Author's CSDN blog: blog.csdn.net/Sun_ltyy
I. Create an index
Syntax: PUT your_index_name
Example: PUT goods_for_test_use
II. Define the mapping
For the available field mapping types, see:
Example:
PUT goods_for_test_use/goods_for_test/_mapping
{
"properties": {
"goodsId":{
"type": "keyword"
},
"goodsName":{
"type": "keyword"
},
"buIds":{
"type": "long"
},
"gmv":{
"type": "double"
},
"manager":{
"type": "object",
"properties": {
"firstName":{"type":"keyword"},
"secondName":{"type":"keyword"},
"age":{"type":"long"}
}
},
"managers":{
"type": "nested",
"properties": {
"firstName":{"type":"keyword"},
"secondName":{"type":"keyword"},
"age":{"type":"long"}
}
}
}
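}
III. Insert data
To insert data, build a JSON document whose fields follow the types defined in the mapping, as in the example below.
## Insert the first document. The 1 here is the document id within the index; set it according to your business semantics, e.g. use the goods id.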
PUT goods_for_test_use/goods_for_test/1
{
"goodsId":"222",
"goodsName":"测试商品名称2",
"buIds":[1,2,3],
"gmv":333,
"manager":[
{
"firstName":"lei",
"secondName":"teng"
},
{
"firstName":"zhang",
"secondName":"san"
}
],
"managers":[
{
"firstName":"lei",
"secondName":"teng",
"age":30
},
{
"firstName":"zhang",
"secondName":"san",
"age":18
},
{
"firstName":"lei",
"secondName":"san",
"age":18
}
]
}
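As a rough illustration of assembling such a document from application code, the sketch below builds a body like the one above with Python's standard json module. The helper name build_goods_doc is hypothetical and not part of any Elasticsearch client, and sending the body to the cluster is left out.

```python
import json

def build_goods_doc(goods_id, goods_name, bu_ids, gmv, manager, managers):
    """Assemble a document body whose fields follow the mapping above."""
    return {
        "goodsId": str(goods_id),            # keyword -> string
        "goodsName": str(goods_name),        # keyword -> string
        "buIds": [int(b) for b in bu_ids],   # long, stored as an array
        "gmv": float(gmv),                   # double
        "manager": list(manager),            # object (array of objects)
        "managers": list(managers),          # nested (array of objects)
    }

doc = build_goods_doc(
    "222", "测试商品名称2", [1, 2, 3], 333,
    manager=[{"firstName": "lei", "secondName": "teng"},
             {"firstName": "zhang", "secondName": "san"}],
    managers=[{"firstName": "lei", "secondName": "teng", "age": 30},
              {"firstName": "zhang", "secondName": "san", "age": 18},
              {"firstName": "lei", "secondName": "san", "age": 18}],
)

# JSON string that could be used as the request body of the PUT call
body = json.dumps(doc, ensure_ascii=False)
print(body)
```

Converting each value explicitly keeps the body aligned with the mapped types instead of relying on Elasticsearch to coerce them.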
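IV. Query data
The mapping above covers four commonly used data types. The query for each of them is analyzed below.
1. Simple data types, such as goodsId and goodsName in the example above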
## Find documents whose goodsId is 222 and whose goodsName is 测试商品名称2
GET goods_for_test_use/_search
{
"query": {
"bool": {
"must": [
{
"match": {
"goodsId": "222"
}
},
{
"match": {
"goodsName": "测试商品名称2"
}
}
]
}
}
}
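2. Queries on a field of type long that stores an array of single values, such as buIds in the example above. This mainly corresponds to a List<Long> buIds field in a Java object.
## Note: to match any of several values against such an array field, use a terms query, which works like SQL IN. A term query takes only a single value.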
GET goods_for_test_use/_search
{
"query": {
"bool": {
"must": [
{
"terms": {
"buIds": [
"1",
"5"
]
}
}
]
}
}
}
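The IN-like behavior of the terms query above can be imitated in a few lines of plain Python. This is only a sketch of the matching rule, not Elasticsearch code, and the function name terms_matches is hypothetical.

```python
def terms_matches(field_values, wanted_values):
    """True when at least one stored value equals one of the wanted values."""
    return any(value in field_values for value in wanted_values)

bu_ids = [1, 2, 3]  # buIds of the document inserted earlier

print(terms_matches(bu_ids, [1, 5]))  # True, because 1 is stored
print(terms_matches(bu_ids, [4, 5]))  # False, neither value is stored
```

So the query above returns the sample document even though 5 is not among its buIds.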
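3. Queries on object and nested types (manager and managers in the example above)
In the example:
The manager field has type object, and two objects were inserted:
Manager 1:
firstName:lei
secondName:teng
Manager 2:
firstName:zhang
secondName:san
The managers field has type nested, and three manager objects were inserted:
Manager 1:
firstName:lei
secondName:teng
Manager 2:
firstName:zhang
secondName:san
Manager 3:
firstName:lei
secondName:san
Suppose we want the goods records that have a manager with firstName="lei" and secondName="san". If manager is of type object, the record is matched even though no single manager in it is named lei san.
## Query on the object-type field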
GET goods_for_test_use/_search
{
"query": {
"bool": {
"must": [
{
"match": {
"manager.firstName": "lei"
}
},{
"match": {
"manager.secondName": "san"
}
}
]
}
}
}
## Result
{
"took" : 1,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : 1,
"max_score" : 1.3862944,
"hits" : [
{
"_index" : "goods_for_test_use",
"_type" : "goods_for_test",
"_id" : "2",
"_score" : 1.3862944,
"_source" : {
"goodsId" : "goodsId222",
"goodsName" : "测试商品名称2",
"buIds" : [
1,
2,
3
],
"manager" : [
{
"firstName" : "lei",
"secondName" : "teng"
},
{
"firstName" : "zhang",
"secondName" : "san"
}
],
"managers" : [
{
"firstName" : "lei",
"secondName" : "teng",
"age" : 30
},
{
"firstName" : "zhang",
"secondName" : "san",
"age" : 18
},
{
"firstName" : "lei",
"secondName" : "san",
"age" : 18
}
]
}
}
]
}
}
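This happens because Elasticsearch internally stores an object-type field roughly like this:
{
"goodsId" : "goodsId222",
"goodsName" : "测试商品名称2",
"manager.firstName" : [ "lei", "zhang" ],
"manager.secondName" : [ "teng", "san" ]
}
Because manager is of type object, its internal structure is flattened into multi-valued fields, much like buIds in the example. The link between each firstName and its own secondName is lost, so the query finds a manager "lei san" who does not exist, which violates the intended query semantics.
**Therefore, to keep each object independent, define the field as nested, as managers is. Nested objects are stored as independent objects and can be queried individually. The query looks like this:**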
GET goods_for_test_use/_search
{
  "query": {
    "nested": {
      "path": "managers",
      "query": {
        "bool": {
          "must": [
            { "match": { "managers.firstName": "lei" } },
            { "match": { "managers.secondName": "san" } }
          ]
        }
      }
    }
  }
}
Running the nested query above returns the record only because a manager really named lei san is stored in it.
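To make the difference between object and nested concrete, the following sketch imitates the two matching behaviors in plain Python. It does not talk to Elasticsearch. The function names are hypothetical, and the age field is omitted for brevity.

```python
def flatten_object(objects):
    """Mimic how an object field is flattened into multi-valued fields."""
    flat = {}
    for obj in objects:
        for key, value in obj.items():
            flat.setdefault(key, []).append(value)
    return flat

def object_match(objects, **conditions):
    """Each condition is checked against the flattened values independently."""
    flat = flatten_object(objects)
    return all(value in flat.get(key, []) for key, value in conditions.items())

def nested_match(objects, **conditions):
    """All conditions must hold inside one and the same object."""
    return any(all(obj.get(k) == v for k, v in conditions.items())
               for obj in objects)

manager = [{"firstName": "lei", "secondName": "teng"},
           {"firstName": "zhang", "secondName": "san"}]
managers = manager + [{"firstName": "lei", "secondName": "san"}]

print(flatten_object(manager))
# {'firstName': ['lei', 'zhang'], 'secondName': ['teng', 'san']}
print(object_match(manager, firstName="lei", secondName="san"))   # True (false positive)
print(nested_match(manager, firstName="lei", secondName="san"))   # False
print(nested_match(managers, firstName="lei", secondName="san"))  # True
```

The false positive comes entirely from flattening: once the values of all objects share one list per field, nothing records which firstName belonged to which secondName.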
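V. Update the index
1. Use PUT to index the document again under the same id, writing out every field in full. PUT replaces the whole document: fields with the same name are overwritten, and fields you leave out are no longer present in the stored document. A field that does not yet exist in the mapping is added automatically if dynamic mapping is enabled (the default).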
PUT goods_for_test_use/goods_for_test/1
{
"goodsId":"111",
"goodsName":"测试商品名称1",
"buIds":[1,2,3],
"manager":[
{
"firstName":"heh",
"secondName":"teng"
},
{
"firstName":"zhang",
"secondName":"san"
}
],
"managers":[
{
"firstName":"lei",
"secondName":"teng",
"age":30
},
{
"firstName":"zhang",
"secondName":"san",
"age":18
},
{
"firstName":"lei",
"secondName":"san",
"age":18
}
]
}
2. Update with an update by query statement, for example:
POST goods_for_test_use/_update_by_query
{
"script":{
"source":"ctx._source['goodsName']='雷腾测试商品';ctx._source['buIds']=[1,2,4]"
},
"query":{
"bool":{
"must":[
{
"match":{
"goodsId":"111"
}
}
]
}
}
}
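VI. Delete
Deletion is done with a delete by query statement. Always narrow the query with a condition: a query that matches every document deletes all data. Use with caution.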
POST goods_for_test_use/_delete_by_query
{
"query":{
"bool":{
"must":[
{
"match":{
"goodsId":"111"
}
}
]
}
}
}
VII. Pagination
GET goods_for_test_use/_search
{
"from": 0,
"size": 20
}
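In application code the from and size values are usually derived from a page number. The sketch below shows one way to do that; the helper name page_body is hypothetical, and the limit check assumes the index still uses the Elasticsearch default index.max_result_window of 10000.

```python
import json

MAX_RESULT_WINDOW = 10000  # assumed: default value of index.max_result_window

def page_body(page, page_size=20):
    """Translate a 1-based page number into a from/size search body."""
    if page < 1 or page_size < 1:
        raise ValueError("page and page_size must be positive")
    offset = (page - 1) * page_size
    if offset + page_size > MAX_RESULT_WINDOW:
        raise ValueError("from + size exceeds the result window")
    return {"from": offset, "size": page_size}

print(json.dumps(page_body(1)))  # {"from": 0, "size": 20}
print(json.dumps(page_body(3)))  # {"from": 40, "size": 20}
```

The first call reproduces the request body shown above; later pages only change the from offset.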
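VIII. Aggregations
1. A terms aggregation puts documents into buckets, like GROUP BY in SQL. Metric aggregations compute values such as sum, avg, max, and min, like the SQL functions of the same names. The two can be combined.
Example:
## Query: total gmv for each goods id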
GET goods_for_test_use/_search
{
"aggs": {
"商品id": {
"terms": {
"field": "goodsId",
"size": 10
},
"aggs": {
"总的gmv": {
"sum": {
"field": "gmv"
}
}
}
}
},
"size": 0
}
## Result
{
"took" : 7,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : 6,
"max_score" : 0.0,
"hits" : [ ]
},
"aggregations" : {
"商品id" : {
"doc_count_error_upper_bound" : 0,
"sum_other_doc_count" : 0,
"buckets" : [
{
"key" : "111",
"doc_count" : 3,
"总的gmv" : {
"value" : 765.0
}
},
{
"key" : "222",
"doc_count" : 3,
"总的gmv" : {
"value" : 999.0
}
}
]
}
}
}
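The GROUP BY analogy can be spelled out with a small Python sketch that buckets documents by one field and sums another. The function name terms_sum is hypothetical, and the sample documents are invented for illustration (they are not the author's actual data), chosen so that the totals equal those in the result above.

```python
from collections import defaultdict

def terms_sum(docs, bucket_field, sum_field):
    """Group docs by bucket_field and sum sum_field, like terms + sum."""
    buckets = defaultdict(lambda: {"doc_count": 0, "sum": 0.0})
    for doc in docs:
        bucket = buckets[doc[bucket_field]]
        bucket["doc_count"] += 1
        bucket["sum"] += doc.get(sum_field, 0)
    return dict(buckets)

# Hypothetical sample documents, made up for this illustration
docs = ([{"goodsId": "111", "gmv": 255}] * 3
        + [{"goodsId": "222", "gmv": 333}] * 3)

result = terms_sum(docs, "goodsId", "gmv")
for key in sorted(result):
    print(key, result[key]["doc_count"], result[key]["sum"])
# 111 3 765.0
# 222 3 999.0
```

Each bucket corresponds to one entry in the buckets array of the response, with doc_count as the group size and the nested sum as the metric value.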