一、Elasticsearch-URL
1.1 官方网址
Windows 版的 Elasticsearch 压缩包,解压即安装完毕,解压后的 Elasticsearch 的目录结构如下 :
解压后,进入 bin 文件目录,点击 elasticsearch.bat 文件启动 ES 服务 。
注意: 9300 端口为 Elasticsearch 集群间组件的通信端口, 9200 端口为浏览器访问的 http协议 RESTful 端口。
打开浏览器,输入地址: http://localhost:9200,测试返回结果,返回结果如下:
{
"name" : "192.local",
"cluster_name" : "elasticsearch",
"cluster_uuid" : "hEnwz9zORIifwL6HovF2fA",
"version" : {
"number" : "7.17.5",
"build_flavor" : "default",
"build_type" : "tar",
"build_hash" : "8d61b4f7ddf931f219e3745f295ed2bbc50c8e84",
"build_date" : "2022-06-23T21:57:28.736740635Z",
"build_snapshot" : false,
"lucene_version" : "8.11.1",
"minimum_wire_compatibility_version" : "6.8.0",
"minimum_index_compatibility_version" : "6.0.0-beta1"
},
"tagline" : "You Know, for Search"
}
1.2 RESTful & JSON
REST 指的是一组架构约束条件和原则。满足这些约束条件和原则的应用程序或设计就是 RESTful。 Web 应用程序最重要的 REST 原则是,客户端和服务器之间的交互在请求之间是无状态的。从客户端到服务器的每个请求都必须包含理解请求所必需的信息。如果服务器在请求之间的任何时间点重启,客户端不会得到通知。此外,无状态请求可以由任何可用服务器回答,这十分适合云计算之类的环境。客户端可以缓存数据以改进性能。
在服务器端,应用程序状态和功能可以分为各种资源。资源是一个有趣的概念实体,它向客户端公开。资源的例子有:应用程序对象、数据库记录、算法等等。每个资源都使用 URI(Universal Resource Identifier) 得到一个唯一的地址。所有资源都共享统一的接口,以便在客户端和服务器之间传输状态。使用的是标准的 HTTP 方法,比如 GET、 PUT、 POST 和DELETE。
在 REST 样式的 Web 服务中,每个资源都有一个地址。资源本身都是方法调用的目 标,方法列表对所有资源都是一样的。这些方法都是标准方法,包括 HTTP GET、 POST、PUT、 DELETE,还可能包括 HEAD 和 OPTIONS。简单的理解就是,如果想要访问互联网上的资源,就必须向资源所在的服务器发出请求,请求体中必须包含资源的网络路径, 以及对资源进行的操作(增删改查)。
REST 样式的 Web 服务若有返回结果,大多数以JSON字符串形式返回。
1.3 Postman客户端工具
如果直接通过浏览器向 Elasticsearch 服务器发请求,那么需要在发送的请求中包含HTTP 标准的方法,而 HTTP 的大部分特性且仅支持 GET 和 POST 方法。所以为了能方便地进行客户端的访问,可以使用 Postman 软件Postman 是一款强大的网页调试工具,提供功能强大的 Web API 和 HTTP 请求调试。
软件功能强大,界面简洁明晰、操作方便快捷,设计得很人性化。 Postman 中文版能够发送任何类型的 HTTP 请求 (GET, HEAD, POST, PUT…),不仅能够表单提交,且可以附带任意类型请求体。
1.4 倒排索引
Elasticsearch 是面向文档型数据库,一条数据在这里就是一个文档。 为了方便大家理解,我们将 Elasticsearch 里存储文档数据和关系型数据库 MySQL 存储数据的概念进行一个类比
ES 里的 Index 可以看做一个库,而 Types 相当于表, Documents 则相当于表的行。这里 Types 的概念已经被逐渐弱化, Elasticsearch 6.X 中,一个 index 下已经只能包含一个type, Elasticsearch 7.X 中, Type 的概念已经被删除了。
1.5 HTTP-索引-创建
对比关系型数据库,创建索引就等同于创建数据库。
在 Postman 中,向 ES 服务器发 PUT 请求 : http://127.0.0.1:9200/shopping
请求后,服务器返回响应:
{
"acknowledged": true,
"shards_acknowledged": true,
"index": "shopping"
}
后台日志:
[2022-08-10T15:47:07,160][INFO ][o.e.c.m.MetadataCreateIndexService] [192.local] [shopping] creating index, cause [api], templates [], shards [1]/[1]
如果重复发 PUT 请求 : http://127.0.0.1:9200/shopping 添加索引,会返回错误信息 :
{
"error": {
"root_cause": [
{
"type": "resource_already_exists_exception",
"reason": "index [shopping/J0WlEhh4R7aDrfIc3AkwWQ] already exists",
"index_uuid": "J0WlEhh4R7aDrfIc3AkwWQ",
"index": "shopping"
}
],
"type": "resource_already_exists_exception",
"reason": "index [shopping/J0WlEhh4R7aDrfIc3AkwWQ] already exists",
"index_uuid": "J0WlEhh4R7aDrfIc3AkwWQ",
"index": "shopping"
},
"status": 400
}
1.6 HTTP-索引-查询&删除
查看所有索引 在 Postman 中查询全部索引,向 ES 服务器发 GET 请求 : http://127.0.0.1:9200/_cat/indices?v
这里请求路径中的_cat 表示查看的意思, indices 表示索引,所以整体含义就是查看当前 ES服务器中的所有索引,就好像 MySQL 中的 show tables 的感觉,服务器响应结果如下 :
health status index uuid pri rep docs.count docs.deleted store.size pri.store.size
green open .geoip_databases MByvMPJcRC6q1xFiYisslQ 1 0 40 0 37.9mb 37.9mb
yellow open shopping dxKLUaNBSXeKPRROyeFlsg 1 1 0 0 226b 226b
1.7 查看单个索引
在postman中,向es服务器发送get请求:http://127.0.0.1:9200/shopping
返回结果如下:
{
"shopping": {
"aliases": {},
"mappings": {},
"settings": {
"index": {
"routing": {
"allocation": {
"include": {
"_tier_preference": "data_content"
}
}
},
"number_of_shards": "1",
"provided_name": "shopping",
"creation_date": "1660117627142",
"number_of_replicas": "1",
"uuid": "dxKLUaNBSXeKPRROyeFlsg",
"version": {
"created": "7170599"
}
}
}
}
}
1.8 删除索引
在postman中,向 ES 服务器发 DELETE 请求: http://127.0.0.1:9200/shopping
返回结果如下:
{
"acknowledged": true
}
1.8-1 查看所有索引
再次查看所有索引,GET http://127.0.0.1:9200/_cat/indices?v,返回结果如下:
GET http://127.0.0.1:9200/_cat/indices?v
返回结果如下:
health status index uuid pri rep docs.count docs.deleted store.size pri.store.size
成功删除。
1.9 HTTP-文档-创建(PUT&POST)
假设索引已经创建好了,接下来我们来创建文档,并添加数据。这里的文档可以类比为关系型数据库中的表数据,添加的数据格式为 JSON 格式
在 Postman 中,向 ES 服务器发 POST 请求 : http://127.0.0.1:9200/shopping/_doc,请求体JSON内容为:
{
"title":"小米手机",
"category":"小米",
"images":"http://www.gulixueyuan.com/xm.jpg",
"price":3999.00
}
注意,此处发送请求的方式必须为 POST,不能是 PUT,否则会发生错误 。
返回结果:
{
"_index": "shopping",
"_type": "_doc",
"_id": "_XPLhoIBg3C-pJdYUv6X",
"_version": 1,
"result": "created",
"_shards": {
"total": 2,
"successful": 1,
"failed": 0
},
"_seq_no": 0,
"_primary_term": 1
}
上面的数据创建后,由于没有指定数据唯一性标识(ID),默认情况下, ES 服务器会随机生成一个。
如果想要自定义唯一性标识,需要在创建时指定: http://127.0.0.1:9200/shopping/_doc/1,请求体JSON内容为:
{
"title":"小米手机",
"category":"小米",
"images":"http://www.gulixueyuan.com/xm.jpg",
"price":3999.00
}
返回结果如下:
{
"_index": "shopping",
"_type": "_doc",
"_id": "1",
"_version": 1,
"result": "created",
"_shards": {
"total": 2,
"successful": 1,
"failed": 0
},
"_seq_no": 1,
"_primary_term": 1
}
此处需要注意:如果增加数据时明确数据主键,那么请求方式也可以为 PUT。
1.10 HTTP-查询-主键 查询 & 全查询
查看文档时,需要指明文档的唯一性标识,类似于 MySQL 中数据的主键查询
在 Postman 中,向 ES 服务器发 GET 请求 : http://127.0.0.1:9200/shopping/_doc/1 。
返回结果如下:
{
"_index": "shopping",
"_type": "_doc",
"_id": "1",
"_version": 1,
"_seq_no": 1,
"_primary_term": 1,
"found": true,
"_source": {
"title": "小米手机",
"category": "小米",
"images": "http://www.gulixueyuan.com/xm.jpg",
"price": 3999
}
}
查找不存在的内容,向 ES 服务器发 GET 请求 : http://127.0.0.1:9200/shopping/_doc/1001。
返回结果如下:
{
"_index": "shopping",
"_type": "_doc",
"_id": "1001",
"found": false
}
查看索引下所有数据,向 ES 服务器发 GET 请求 : http://127.0.0.1:9200/shopping/_search。
返回结果如下:
{
"took": 726,
"timed_out": false,
"_shards": {
"total": 1,
"successful": 1,
"skipped": 0,
"failed": 0
},
"hits": {
"total": {
"value": 2,
"relation": "eq"
},
"max_score": 1.0,
"hits": [
{
"_index": "shopping",
"_type": "_doc",
"_id": "_XPLhoIBg3C-pJdYUv6X",
"_score": 1.0,
"_source": {
"title": "小米手机",
"category": "小米",
"images": "http://www.gulixueyuan.com/xm.jpg",
"price": 3999.00
}
},
{
"_index": "shopping",
"_type": "_doc",
"_id": "1",
"_score": 1.0,
"_source": {
"title": "小米手机",
"category": "小米",
"images": "http://www.gulixueyuan.com/xm.jpg",
"price": 3999.00
}
}
]
}
}
1.11 HTTP-全量修改 & 局部修改 & 删除
1.11.1 全量修改
和新增文档一样,输入相同的 URL 地址请求,如果请求体变化,会将原有的数据内容覆盖
在 Postman 中,向 ES 服务器发 POST 请求 : http://127.0.0.1:9200/shopping/_doc/1
请求体JSON内容为:
{
"title":"华为手机",
"category":"华为",
"images":"http://www.gulixueyuan.com/hw.jpg",
"price":1999.00
}
修改成功后,服务器相应结果:
{
"_index": "shopping",
"_type": "_doc",
"_id": "1",
"_version": 3,
"result": "updated",
"_shards": {
"total": 2,
"successful": 1,
"failed": 0
},
"_seq_no": 3,
"_primary_term": 1
}
1.11.2 局部修改
修改数据时,也可以只修改某一给条数据的局部信息
在 Postman 中,向 ES 服务器发 POST 请求 : http://127.0.0.1:9200/shopping/_update/1。
请求体JSON内容为:
{
"doc": {
"title":"小米手机",
"category":"小米"
}
}
返回结果如下:
{
"_index": "shopping",
"_type": "_doc",
"_id": "1",
"_version": 4,
"_seq_no": 4,
"_primary_term": 1,
"found": true,
"_source": {
"title": "xaiomi手机",
"category": "小米",
"images": "http://www.gulixueyuan.com/hw.jpg",
"price": 1999.00
}
}
1.11.3 删除
删除一个文档不会立即从磁盘上移除,它只是被标记成已删除(逻辑删除)。
在 Postman 中,向 ES 服务器发 DELETE 请求 : http://127.0.0.1:9200/shopping/_doc/1
返回结果:
{
"_index": "shopping",
"_type": "_doc",
"_id": "1",
"_version": 5,
"result": "deleted",
"_shards": {
"total": 2,
"successful": 1,
"failed": 0
},
"_seq_no": 5,
"_primary_term": 1
}
在 Postman 中,向 ES 服务器发 GET请求 : http://127.0.0.1:9200/shopping/_doc/1,查看是否删除成功:
{
"_index": "shopping",
"_type": "_doc",
"_id": "1",
"found": false
}
1.12 HTTP-条件查询 & 分页查询 & 查询排序
1.12.1 条件查询
假设有以下文档内容,(在 Postman 中,向 ES 服务器发 GET请求 : http://127.0.0.1:9200/shopping/_search):
{
"took": 5,
"timed_out": false,
"_shards": {
"total": 1,
"successful": 1,
"skipped": 0,
"failed": 0
},
"hits": {
"total": {
"value": 6,
"relation": "eq"
},
"max_score": 1,
"hits": [
{
"_index": "shopping",
"_type": "_doc",
"_id": "ANQqsHgBaKNfVnMbhZYU",
"_score": 1,
"_source": {
"title": "小米手机",
"category": "小米",
"images": "http://www.gulixueyuan.com/xm.jpg",
"price": 3999
}
},
{
"_index": "shopping",
"_type": "_doc",
"_id": "A9R5sHgBaKNfVnMb25Ya",
"_score": 1,
"_source": {
"title": "小米手机",
"category": "小米",
"images": "http://www.gulixueyuan.com/xm.jpg",
"price": 1999
}
},
{
"_index": "shopping",
"_type": "_doc",
"_id": "BNR5sHgBaKNfVnMb7pal",
"_score": 1,
"_source": {
"title": "小米手机",
"category": "小米",
"images": "http://www.gulixueyuan.com/xm.jpg",
"price": 1999
}
},
{
"_index": "shopping",
"_type": "_doc",
"_id": "BtR6sHgBaKNfVnMbX5Y5",
"_score": 1,
"_source": {
"title": "华为手机",
"category": "华为",
"images": "http://www.gulixueyuan.com/xm.jpg",
"price": 1999
}
},
{
"_index": "shopping",
"_type": "_doc",
"_id": "B9R6sHgBaKNfVnMbZpZ6",
"_score": 1,
"_source": {
"title": "华为手机",
"category": "华为",
"images": "http://www.gulixueyuan.com/xm.jpg",
"price": 1999
}
},
{
"_index": "shopping",
"_type": "_doc",
"_id": "CdR7sHgBaKNfVnMbsJb9",
"_score": 1,
"_source": {
"title": "华为手机",
"category": "华为",
"images": "http://www.gulixueyuan.com/xm.jpg",
"price": 1999
}
}
]
}
}
1.12.2 URL带参查询
查找category为小米的文档,在 Postman 中,向 ES 服务器发 GET请求 : http://127.0.0.1:9200/shopping/_search?q=category:小米,返回结果如下:
{
"took": 33,
"timed_out": false,
"_shards": {
"total": 1,
"successful": 1,
"skipped": 0,
"failed": 0
},
"hits": {
"total": {
"value": 1,
"relation": "eq"
},
"max_score": 0.5753642,
"hits": [
{
"_index": "shopping",
"_type": "_doc",
"_id": "_XPLhoIBg3C-pJdYUv6X",
"_score": 0.5753642,
"_source": {
"title": "小米手机",
"category": "小米",
"images": "http://www.gulixueyuan.com/xm.jpg",
"price": 3999.00
}
}
]
}
}
上述为URL带参数形式查询,这很容易让不善者心怀恶意,或者参数值出现中文会出现乱码情况。为了避免这些情况,我们可用使用带JSON请求体请求进行查询。
1.12.3 请求体带参查询
接下带JSON请求体,还是查找category为小米的文档,在 Postman 中,向 ES 服务器发 GET请求 : http://127.0.0.1:9200/shopping/_search,附带JSON体如下
{
"query":{
"match":{
"category":"小米"
}
}
}
返回结果如下:
{
"took": 3,
"timed_out": false,
"_shards": {
"total": 1,
"successful": 1,
"skipped": 0,
"failed": 0
},
"hits": {
"total": {
"value": 2,
"relation": "eq"
},
"max_score": 0.9400072,
"hits": [
{
"_index": "shopping",
"_type": "_doc",
"_id": "_XPLhoIBg3C-pJdYUv6X",
"_score": 0.9400072,
"_source": {
"title": "小米手机",
"category": "小米",
"images": "http://www.gulixueyuan.com/xm.jpg",
"price": 3999.00
}
},
{
"_index": "shopping",
"_type": "_doc",
"_id": "3",
"_score": 0.9400072,
"_source": {
"title": "xiaomi手机",
"category": "小米",
"images": "http://www.xiaomi.com/xm.jpg",
"price": 6999.00
}
}
]
}
}
1.12.4 带请求体方式的查找所有内容
查找所有文档内容,也可以这样,在 Postman 中,向 ES 服务器发 GET请求 : http://127.0.0.1:9200/shopping/_search,附带JSON体如下:
{
"query":{
"match_all":{}
}
}
则返回所有文档内容:
{
"took": 2,
"timed_out": false,
"_shards": {
"total": 1,
"successful": 1,
"skipped": 0,
"failed": 0
},
"hits": {
"total": {
"value": 3,
"relation": "eq"
},
"max_score": 1.0,
"hits": [
{
"_index": "shopping",
"_type": "_doc",
"_id": "_XPLhoIBg3C-pJdYUv6X",
"_score": 1.0,
"_source": {
"title": "小米手机",
"category": "小米",
"images": "http://www.gulixueyuan.com/xm.jpg",
"price": 3999.00
}
},
{
"_index": "shopping",
"_type": "_doc",
"_id": "2",
"_score": 1.0,
"_source": {
"title": "华为手机",
"category": "华为",
"images": "http://www.gulixueyuan.com/xm.jpg",
"price": 4999.00
}
},
{
"_index": "shopping",
"_type": "_doc",
"_id": "3",
"_score": 1.0,
"_source": {
"title": "xiaomi手机",
"category": "小米",
"images": "http://www.xiaomi.com/xm.jpg",
"price": 6999.00
}
}
]
}
}
1.12.5 查询指定字段
如果你想查询指定字段,在 Postman 中,向 ES 服务器发 GET请求 : http://127.0.0.1:9200/shopping/_search,附带JSON体如下:
{
"query":{
"match_all":{}
},
"_source":["title"]
}
返回结果如下:
{
"took": 3,
"timed_out": false,
"_shards": {
"total": 1,
"successful": 1,
"skipped": 0,
"failed": 0
},
"hits": {
"total": {
"value": 3,
"relation": "eq"
},
"max_score": 1.0,
"hits": [
{
"_index": "shopping",
"_type": "_doc",
"_id": "_XPLhoIBg3C-pJdYUv6X",
"_score": 1.0,
"_source": {
"title": "小米手机"
}
},
{
"_index": "shopping",
"_type": "_doc",
"_id": "2",
"_score": 1.0,
"_source": {
"title": "华为手机"
}
},
{
"_index": "shopping",
"_type": "_doc",
"_id": "3",
"_score": 1.0,
"_source": {
"title": "xiaomi手机"
}
}
]
}
}
1.12.6 分页查询&查询排序
1.12.6.1 分页查询
在 Postman 中,向 ES 服务器发 GET请求 : http://127.0.0.1:9200/shopping/_search,附带JSON体如下:
{
"query":{
"match_all":{}
},
"from":0,
"size":2
}
返回结果如下:
{
"took": 2,
"timed_out": false,
"_shards": {
"total": 1,
"successful": 1,
"skipped": 0,
"failed": 0
},
"hits": {
"total": {
"value": 3,
"relation": "eq"
},
"max_score": 1.0,
"hits": [
{
"_index": "shopping",
"_type": "_doc",
"_id": "_XPLhoIBg3C-pJdYUv6X",
"_score": 1.0,
"_source": {
"title": "小米手机",
"category": "小米",
"images": "http://www.gulixueyuan.com/xm.jpg",
"price": 3999.00
}
},
{
"_index": "shopping",
"_type": "_doc",
"_id": "2",
"_score": 1.0,
"_source": {
"title": "华为手机",
"category": "华为",
"images": "http://www.gulixueyuan.com/xm.jpg",
"price": 4999.00
}
}
]
}
}
1.12.6.2 查询排序
如果你想通过排序查出价格最高的手机,在 Postman 中,向 ES 服务器发 GET请求 : http://127.0.0.1:9200/shopping/_search,附带JSON体如下:
{
"query":{
"match_all":{}
},
"sort":{
"price":{
"order":"desc"
}
}
}
返回结果如下:
{
"took": 30,
"timed_out": false,
"_shards": {
"total": 1,
"successful": 1,
"skipped": 0,
"failed": 0
},
"hits": {
"total": {
"value": 3,
"relation": "eq"
},
"max_score": null,
"hits": [
{
"_index": "shopping",
"_type": "_doc",
"_id": "3",
"_score": null,
"_source": {
"title": "xiaomi手机",
"category": "小米",
"images": "http://www.xiaomi.com/xm.jpg",
"price": 6999.00
},
"sort": [
6999.0
]
},
{
"_index": "shopping",
"_type": "_doc",
"_id": "2",
"_score": null,
"_source": {
"title": "华为手机",
"category": "华为",
"images": "http://www.gulixueyuan.com/xm.jpg",
"price": 4999.00
},
"sort": [
4999.0
]
},
{
"_index": "shopping",
"_type": "_doc",
"_id": "_XPLhoIBg3C-pJdYUv6X",
"_score": null,
"_source": {
"title": "小米手机",
"category": "小米",
"images": "http://www.gulixueyuan.com/xm.jpg",
"price": 3999.00
},
"sort": [
3999.0
]
}
]
}
}
1.13 HTTP-多条件查询 & 范围查询
1.13.1 多条件查询
假设想找出小米牌子,价格为3999元的。(must相当于数据库的&&)
在 Postman 中,向 ES 服务器发 GET请求 : http://127.0.0.1:9200/shopping/_search,附带JSON体如下:
&&:must
|| :should
{
"query":{
"bool":{
"must":[{
"match":{
"category":"小米"
}
},{
"match":{
"price":3999.00
}
}]
}
}
}
返回结果如下:
{
"took": 19,
"timed_out": false,
"_shards": {
"total": 1,
"successful": 1,
"skipped": 0,
"failed": 0
},
"hits": {
"total": {
"value": 1,
"relation": "eq"
},
"max_score": 1.9400072,
"hits": [
{
"_index": "shopping",
"_type": "_doc",
"_id": "_XPLhoIBg3C-pJdYUv6X",
"_score": 1.9400072,
"_source": {
"title": "小米手机",
"category": "小米",
"images": "http://www.gulixueyuan.com/xm.jpg",
"price": 3999.00
}
}
]
}
}
假设想找出小米和华为的牌子。(should相当于数据库的||)
在 Postman 中,向 ES 服务器发 GET请求 : http://127.0.0.1:9200/shopping/_search,附带JSON体如下:
{
"query": {
"bool": {
"should": [{
"match": {
"category": "小米"
}
}, {
"match": {
"category": "华为"
}
}],
"filter": {
"range": {
"price": {
"gt": 4000
}
}
}
}
}
}
返回结果:
{
"took": 3,
"timed_out": false,
"_shards": {
"total": 1,
"successful": 1,
"skipped": 0,
"failed": 0
},
"hits": {
"total": {
"value": 2,
"relation": "eq"
},
"max_score": 1.9616582,
"hits": [
{
"_index": "shopping",
"_type": "_doc",
"_id": "2",
"_score": 1.9616582,
"_source": {
"title": "华为手机",
"category": "华为",
"images": "http://www.gulixueyuan.com/xm.jpg",
"price": 4999.00
}
},
{
"_index": "shopping",
"_type": "_doc",
"_id": "3",
"_score": 0.9400072,
"_source": {
"title": "xiaomi手机",
"category": "小米",
"images": "http://www.xiaomi.com/xm.jpg",
"price": 6999.00
}
}
]
}
}
1.13.2 范围查询
假设想找出小米和华为的牌子,价格大于2000元的手机。
在 Postman 中,向 ES 服务器发 GET请求 : http://127.0.0.1:9200/shopping/_search,附带JSON体如下:
{
"query":{
"bool":{
"should":[{
"match":{
"category":"小米"
}
},{
"match":{
"category":"华为"
}
}],
"filter":{
"range":{
"price":{
"gt":2000
}
}
}
}
}
}
返回结果如下:
{
"took": 8,
"timed_out": false,
"_shards": {
"total": 1,
"successful": 1,
"skipped": 0,
"failed": 0
},
"hits": {
"total": {
"value": 3,
"relation": "eq"
},
"max_score": 1.9616582,
"hits": [
{
"_index": "shopping",
"_type": "_doc",
"_id": "2",
"_score": 1.9616582,
"_source": {
"title": "华为手机",
"category": "华为",
"images": "http://www.gulixueyuan.com/xm.jpg",
"price": 4999.00
}
},
{
"_index": "shopping",
"_type": "_doc",
"_id": "_XPLhoIBg3C-pJdYUv6X",
"_score": 0.9400072,
"_source": {
"title": "小米手机",
"category": "小米",
"images": "http://www.gulixueyuan.com/xm.jpg",
"price": 3999.00
}
},
{
"_index": "shopping",
"_type": "_doc",
"_id": "3",
"_score": 0.9400072,
"_source": {
"title": "xiaomi手机",
"category": "小米",
"images": "http://www.xiaomi.com/xm.jpg",
"price": 6999.00
}
}
]
}
}
1.14 HTTP-全文检索 & 完全匹配 & 高亮查询
1.14.1 全文检索
这功能像搜索引擎那样,如品牌输入“小华”,返回结果带回品牌有“小米”和华为的。
在 Postman 中,向 ES 服务器发 GET请求 : http://127.0.0.1:9200/shopping/_search,附带JSON体如下:
{
"query":{
"match":{
"category" : "小华"
}
}
}
返回结果如下:
{
"took": 3,
"timed_out": false,
"_shards": {
"total": 1,
"successful": 1,
"skipped": 0,
"failed": 0
},
"hits": {
"total": {
"value": 3,
"relation": "eq"
},
"max_score": 0.9808291,
"hits": [
{
"_index": "shopping",
"_type": "_doc",
"_id": "2",
"_score": 0.9808291,
"_source": {
"title": "华为手机",
"category": "华为",
"images": "http://www.gulixueyuan.com/xm.jpg",
"price": 4999.00
}
},
{
"_index": "shopping",
"_type": "_doc",
"_id": "_XPLhoIBg3C-pJdYUv6X",
"_score": 0.4700036,
"_source": {
"title": "小米手机",
"category": "小米",
"images": "http://www.gulixueyuan.com/xm.jpg",
"price": 3999.00
}
},
{
"_index": "shopping",
"_type": "_doc",
"_id": "3",
"_score": 0.4700036,
"_source": {
"title": "xiaomi手机",
"category": "小米",
"images": "http://www.xiaomi.com/xm.jpg",
"price": 6999.00
}
}
]
}
}
1.14.2 完全匹配
在 Postman 中,向 ES 服务器发 GET请求 : http://127.0.0.1:9200/shopping/_search,附带JSON体如下:
{
"query":{
"match_phrase":{
"category" : "为"
}
}
}
返回结果如下:
{
"took": 2,
"timed_out": false,
"_shards": {
"total": 1,
"successful": 1,
"skipped": 0,
"failed": 0
},
"hits": {
"total": {
"value": 1,
"relation": "eq"
},
"max_score": 0.9808291,
"hits": [
{
"_index": "shopping",
"_type": "_doc",
"_id": "2",
"_score": 0.9808291,
"_source": {
"title": "华为手机",
"category": "华为",
"images": "http://www.gulixueyuan.com/xm.jpg",
"price": 4999.00
}
}
]
}
}
1.14.3 高亮查询
在 Postman 中,向 ES 服务器发 GET请求 : http://127.0.0.1:9200/shopping/_search,附带JSON体如下:
{
"query":{
"match_phrase":{
"category" : "为"
}
},
"highlight":{
"fields":{
"category":{}//<----高亮这字段
}
}
}
返回结果如下:
{
"took": 36,
"timed_out": false,
"_shards": {
"total": 1,
"successful": 1,
"skipped": 0,
"failed": 0
},
"hits": {
"total": {
"value": 1,
"relation": "eq"
},
"max_score": 0.9808291,
"hits": [
{
"_index": "shopping",
"_type": "_doc",
"_id": "2",
"_score": 0.9808291,
"_source": {
"title": "华为手机",
"category": "华为",
"images": "http://www.gulixueyuan.com/xm.jpg",
"price": 4999.00
},
"highlight": {
"category": [
"华<em>为</em>"
]
}
}
]
}
}
1.15 HTTP-聚合查询
聚合允许使用者对 es 文档进行统计分析,类似与关系型数据库中的 group by,当然还有很多其他的聚合,例如取最大值max、平均值avg等等。
接下来按price字段进行分组:
在 Postman 中,向 ES 服务器发 GET请求 : http://127.0.0.1:9200/shopping/_search,附带JSON体如下:
{
"aggs":{//聚合操作
"price_group":{//名称,随意起名
"terms":{//分组
"field":"price"//分组字段
}
}
}
}
返回结果如下:
{
"took": 31,
"timed_out": false,
"_shards": {
"total": 1,
"successful": 1,
"skipped": 0,
"failed": 0
},
"hits": {
"total": {
"value": 3,
"relation": "eq"
},
"max_score": 1.0,
"hits": [
{
"_index": "shopping",
"_type": "_doc",
"_id": "_XPLhoIBg3C-pJdYUv6X",
"_score": 1.0,
"_source": {
"title": "小米手机",
"category": "小米",
"images": "http://www.gulixueyuan.com/xm.jpg",
"price": 3999.00
}
},
{
"_index": "shopping",
"_type": "_doc",
"_id": "2",
"_score": 1.0,
"_source": {
"title": "华为手机",
"category": "华为",
"images": "http://www.gulixueyuan.com/xm.jpg",
"price": 4999.00
}
},
{
"_index": "shopping",
"_type": "_doc",
"_id": "3",
"_score": 1.0,
"_source": {
"title": "xiaomi手机",
"category": "小米",
"images": "http://www.xiaomi.com/xm.jpg",
"price": 6999.00
}
}
]
},
"aggregations": {
"price_group": {
"doc_count_error_upper_bound": 0,
"sum_other_doc_count": 0,
"buckets": [
{
"key": 3999.0,
"doc_count": 1
},
{
"key": 4999.0,
"doc_count": 1
},
{
"key": 6999.0,
"doc_count": 1
}
]
}
}
}
上面返回结果会附带原始数据的。若不想要不附带原始数据的结果,在 Postman 中,向 ES 服务器发 GET请求 : http://127.0.0.1:9200/shopping/_search,附带JSON体如下:
{
"aggs":{
"price_group":{
"terms":{
"field":"price"
}
}
},
"size":0
}
返回结果如下:
{
"took": 9,
"timed_out": false,
"_shards": {
"total": 1,
"successful": 1,
"skipped": 0,
"failed": 0
},
"hits": {
"total": {
"value": 3,
"relation": "eq"
},
"max_score": null,
"hits": []
},
"aggregations": {
"price_group": {
"doc_count_error_upper_bound": 0,
"sum_other_doc_count": 0,
"buckets": [
{
"key": 3999.0,
"doc_count": 1
},
{
"key": 4999.0,
"doc_count": 1
},
{
"key": 6999.0,
"doc_count": 1
}
]
}
}
}
若想对所有手机价格求平均值。
在 Postman 中,向 ES 服务器发 GET请求 : http://127.0.0.1:9200/shopping/_search,附带JSON体如下:
{
"aggs":{
"price_avg":{//名称,随意起名
"avg":{//求平均
"field":"price"
}
}
},
"size":0
}
返回值结果:
{
"took": 6,
"timed_out": false,
"_shards": {
"total": 1,
"successful": 1,
"skipped": 0,
"failed": 0
},
"hits": {
"total": {
"value": 3,
"relation": "eq"
},
"max_score": null,
"hits": []
},
"aggregations": {
"price_avg": {
"value": 5332.333333333333
}
}
}
1.16 HTTP-映射关系
有了索引库,等于有了数据库中的 database。
接下来就需要建索引库(index)中的映射了,类似于数据库(database)中的表结构(table)。
创建数据库表需要设置字段名称,类型,长度,约束等;索引库也一样,需要知道这个类型下有哪些字段,每个字段有哪些约束信息,这就叫做映射(mapping)。
先创建一个索引:
# PUT http://127.0.0.1:9200/user
返回结果:
{
"acknowledged": true,
"shards_acknowledged": true,
"index": "user"
}
创建映射
# PUT http://127.0.0.1:9200/user/_mapping
{
"properties": {
"name":{
"type": "text",
"index": true
},
"sex":{
"type": "keyword",
"index": true
},
"tel":{
"type": "keyword",
"index": false
}
}
}
返回结果如下:
{
"acknowledged": true
}
查询映射:
#GET http://127.0.0.1:9200/user/_mapping
返回结果如下:
{
"user": {
"mappings": {
"properties": {
"name": {
"type": "text"
},
"sex": {
"type": "keyword"
},
"tel": {
"type": "keyword",
"index": false
}
}
}
}
}
增加数据
#PUT http://127.0.0.1:9200/user/_create/1001
{
"name":"小米",
"sex":"男的",
"tel":"1111"
}
返回结果如下:
{
"_index": "user",
"_type": "_doc",
"_id": "1001",
"_version": 1,
"result": "created",
"_shards": {
"total": 2,
"successful": 1,
"failed": 0
},
"_seq_no": 0,
"_primary_term": 1
}
查找 name 含有 "小" 数据:
#GET http://127.0.0.1:9200/user/_search
{
"query":{
"match":{
"name":"小"
}
}
}
返回结果如下:
{
"took": 1,
"timed_out": false,
"_shards": {
"total": 1,
"successful": 1,
"skipped": 0,
"failed": 0
},
"hits": {
"total": {
"value": 1,
"relation": "eq"
},
"max_score": 0.2876821,
"hits": [
{
"_index": "user",
"_type": "_doc",
"_id": "1001",
"_score": 0.2876821,
"_source": {
"name": "小米",
"sex": "男的",
"tel": "1111"
}
}
]
}
}
查找sex含有”男“数据:
#GET http://127.0.0.1:9200/user/_search
{
"query":{
"match":{
"sex":"男"
}
}
}
返回结果如下:
{
"took": 1,
"timed_out": false,
"_shards": {
"total": 1,
"successful": 1,
"skipped": 0,
"failed": 0
},
"hits": {
"total": {
"value": 0,
"relation": "eq"
},
"max_score": null,
"hits": []
}
}
找不想要的结果,只因创建映射时"sex"的类型为"keyword"。
"sex"只能完全为”男的“,才能得出原数据。
#GET http://127.0.0.1:9200/user/_search
{
"query":{
"match":{
"sex":"男的"
}
}
}
返回结果如下:
{
"took": 1,
"timed_out": false,
"_shards": {
"total": 1,
"successful": 1,
"skipped": 0,
"failed": 0
},
"hits": {
"total": {
"value": 1,
"relation": "eq"
},
"max_score": 0.2876821,
"hits": [
{
"_index": "user",
"_type": "_doc",
"_id": "1001",
"_score": 0.2876821,
"_source": {
"name": "小米",
"sex": "男的",
"tel": "1111"
}
}
]
}
}
查询电话
# GET http://127.0.0.1:9200/user/_search
{
"query":{
"match":{
"tel":"11"
}
}
}
返回结果如下:
{
"error": {
"root_cause": [
{
"type": "query_shard_exception",
"reason": "failed to create query: Cannot search on field [tel] since it is not indexed.",
"index_uuid": "yuEPksOOTFmrTFV5ujv2DQ",
"index": "user"
}
],
"type": "search_phase_execution_exception",
"reason": "all shards failed",
"phase": "query",
"grouped": true,
"failed_shards": [
{
"shard": 0,
"index": "user",
"node": "c5CrEH_uSNy3E9JW7SMBEw",
"reason": {
"type": "query_shard_exception",
"reason": "failed to create query: Cannot search on field [tel] since it is not indexed.",
"index_uuid": "yuEPksOOTFmrTFV5ujv2DQ",
"index": "user",
"caused_by": {
"type": "illegal_argument_exception",
"reason": "Cannot search on field [tel] since it is not indexed."
}
}
}
]
},
"status": 400
}
报错只因创建映射时"tel"的"index"为false。
二、ElasticSearch-JavaAPI
环境准备
新建Maven工程
添加依赖 创建POM文件
<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
<parent>
<artifactId>yooomecloud</artifactId>
<groupId>com.yooome.springcloud</groupId>
<version>1.0-SNAPSHOT</version>
</parent>
<modelVersion>4.0.0</modelVersion>
<artifactId>cloud-config-elasticsearch-9200</artifactId>
<properties>
<maven.compiler.source>8</maven.compiler.source>
<maven.compiler.target>8</maven.compiler.target>
</properties>
<dependencies>
<!-- SpringBoot整合Web组件 -->
<dependency>
<groupId>org.springframework.boot</groupId>
<artifactId>spring-boot-starter-web</artifactId>
</dependency>
<dependency>
<groupId>org.springframework.boot</groupId>
<artifactId>spring-boot-starter-actuator</artifactId>
</dependency>
<dependency>
<groupId>org.springframework.boot</groupId>
<artifactId>spring-boot-devtools</artifactId>
<scope>runtime</scope>
<optional>true</optional>
</dependency>
<dependency>
<groupId>org.projectlombok</groupId>
<artifactId>lombok</artifactId>
<optional>true</optional>
</dependency>
<dependency>
<groupId>org.springframework.boot</groupId>
<artifactId>spring-boot-starter-test</artifactId>
<scope>test</scope>
</dependency>
<!--elasticsearch需要依赖-->
<dependency>
<groupId>org.elasticsearch.client</groupId>
<artifactId>elasticsearch-rest-high-level-client</artifactId>
<version>7.7.0</version>
<exclusions>
<exclusion>
<groupId>org.elasticsearch</groupId>
<artifactId>elasticsearch</artifactId>
</exclusion>
<exclusion>
<groupId>org.elasticsearch.client</groupId>
<artifactId>elasticsearch-rest-client</artifactId>
</exclusion>
</exclusions>
</dependency>
<dependency>
<groupId>org.elasticsearch.client</groupId>
<artifactId>elasticsearch-rest-client</artifactId>
<version>7.7.0</version>
</dependency>
<dependency>
<groupId>org.elasticsearch</groupId>
<artifactId>elasticsearch</artifactId>
<version>7.7.0</version>
</dependency>
</dependencies>
</project>
2.1 JavaAPI-索引-创建
@GetMapping("/elsa/search/{id}")
public String GetElasticsearch(@PathVariable("id") Integer id) throws IOException {
RestHighLevelClient client = new RestHighLevelClient(
RestClient.builder(new HttpHost("localhost", 9200, "http")));
System.out.println(client);
// 创建索引 - 请求对象
CreateIndexRequest createIndexRequest = new CreateIndexRequest("user3");
// 发送请求,获取响应
CreateIndexResponse response = client.indices().create(createIndexRequest, RequestOptions.DEFAULT);
// 响应状态
boolean acknowledged = response.isAcknowledged();
System.out.println("操作状态:" + acknowledged);
// 关闭客户端连接
client.close();
return "get elastic search dsgs " + "server port " + serverPort + " ack " + id + " acknowledged " + acknowledged;
}
2.2 JavaAPI-索引-查询 & 删除
2.2.1 查询
@GetMapping("elas/search/user")
public String GetIndexResponse() throws IOException {
RestHighLevelClient client = new RestHighLevelClient(RestClient.builder(new HttpHost("localhost", 9200, "http")));
GetIndexRequest request = new GetIndexRequest("user");
GetIndexResponse response = client.indices().get(request, RequestOptions.DEFAULT);
return "aliases: " + response.getAliases() + " mappings: " + response.getMappings() + " settings: " + response.getSettings();
}
2.2.2 删除
@GetMapping("/elas/delete")
public String DeleteIndexResponse() throws IOException {
RestHighLevelClient client = new RestHighLevelClient(RestClient.builder(new HttpHost("localhost", 9200, "http")));
DeleteIndexRequest request = new DeleteIndexRequest("user3");
AcknowledgedResponse response = client.indices().delete(request, RequestOptions.DEFAULT);
return "response: " + response.isAcknowledged();
}
2.3 JavaAPI-文档-新增 & 修改
2.3.1 重构
上下文由于频繁使用以下连接 Elasticsearch 和 关闭它的代码,于是个人对它进行重构。
2.3.2 新增
@GetMapping("/elas/create/doc")
public String CreateElasticsearchIndex() throws IOException {
RestHighLevelClient client = new RestHighLevelClient(RestClient.builder(new HttpHost("localhost", 9200, "http")));
IndexRequest request = new IndexRequest();
request.index("user4").id("1000");
//创建数据对象
User user = new User();
user.setAge(18);
user.setName("zhangsan");
user.setSex("男");
ObjectMapper objectMapper = new ObjectMapper();
String s = objectMapper.writeValueAsString(user);
// 添加文档数据,数据格式为 JSON 格式
request.source(s, XContentType.JSON);
IndexResponse response = client.index(request, RequestOptions.DEFAULT);
System.out.println("_index:" + response.getIndex());
System.out.println("_id:" + response.getId());
System.out.println("_result:" + response.getResult());
return "_index:" + response.getIndex() + "_id:" + response.getId() + "_result:" + response.getResult();
}
2.3.3 修改
@GetMapping("/elas/update/doc")
public String UpdateElasDoc() throws IOException {
RestHighLevelClient client = new RestHighLevelClient(RestClient.builder(new HttpHost("localhost", 9200, "http")));
// 修改文档 - 请求对象
UpdateRequest request = new UpdateRequest();
// 修改配置参数
request.index("user4").id("1000");
// 设置请求体,对数据修改
request.doc(XContentType.JSON, "sex", "女");
UpdateResponse response = client.update(request, RequestOptions.DEFAULT);
return "index " + response.getIndex() + " _id " + response.getId() + " _result " + response.getResult();
}
2.4 JavaAPI-文档-查询 & 删除
2.4.1 查询
@GetMapping("/elas/search/doc")
public String SearchElasDoc() throws IOException {
RestHighLevelClient client = new RestHighLevelClient(RestClient.builder(new HttpHost("localhost", 9200, "http")));
// 创建请求对象
GetRequest request = new GetRequest();
// 客户端发送请求,获取响应对象
request.index("user4").id("1000");
GetResponse response = client.get(request, RequestOptions.DEFAULT);
client.close();
return "_index = " + response.getIndex() + " _type = " + response.getType() + " _id = " + response.getId() + " source = " + response.getSource();
}
2.4.2 删除
@GetMapping("/elas/delete/doc")
public String DeleteElasDoc() throws IOException {
RestHighLevelClient client = new RestHighLevelClient(RestClient.builder(new HttpHost("localhost", 9200, "http")));
// 创建请求对象
DeleteRequest request = new DeleteRequest();
// 客户端发送请求,获取响应对象
request.index("user4").id("1000");
DeleteResponse delete = client.delete(request, RequestOptions.DEFAULT);
client.close();
return delete.toString();
}
2.5 JavaAPI-文档-批量新增 & 批量删除
2.5.1 批量新增
@GetMapping("/elas/batch/add")
public String BatchAddElasDoc() throws IOException {
RestHighLevelClient client = new RestHighLevelClient(RestClient.builder(new HttpHost("localhost", 9200, "http")));
// 创建添加对象
BulkRequest request = new BulkRequest();
request.add(new IndexRequest().index("user5").id("1001").source(XContentType.JSON,"name","zhangsan"));
request.add(new IndexRequest().index("user5").id("1002").source(XContentType.JSON,"name","lisi"));
request.add(new IndexRequest().index("user5").id("1003").source(XContentType.JSON,"name","wangwu"));
//客户端发送请求,获取响应对象
BulkResponse responses = client.bulk(request, RequestOptions.DEFAULT);
return "took = " + responses.getTook() +" item = " + responses.getItems();
}
2.5.2 批量删除
@GetMapping("/elas/batch/delete")
public String BatchDeleteElasDoc() throws IOException {
RestHighLevelClient client = new RestHighLevelClient(RestClient.builder(new HttpHost("localhost", 9200, "http")));
// 创建添加对象
BulkRequest request = new BulkRequest();
request.add(new DeleteRequest().index("user5").id("1001"));
request.add(new DeleteRequest().index("user5").id("1002"));
request.add(new DeleteRequest().index("user5").id("1003"));
//客户端发送请求,获取响应对象
BulkResponse responses = client.bulk(request, RequestOptions.DEFAULT);
return "took = " + responses.getTook() +" item = " + responses.getItems();
}
2.6 JavaAPI-文档-高级查询-全量查询
先批量增加数据
@GetMapping("/elas/batch/add")
public String BatchAddElasDoc() throws IOException {
RestHighLevelClient client = new RestHighLevelClient(RestClient.builder(new HttpHost("localhost", 9200, "http")));
// 创建添加对象
BulkRequest request = new BulkRequest();
request.add(new IndexRequest().index("user5").id("1001").source(XContentType.JSON,"name","zhangsan"));
request.add(new IndexRequest().index("user5").id("1002").source(XContentType.JSON,"name","lisi"));
request.add(new IndexRequest().index("user5").id("1003").source(XContentType.JSON,"name","wangwu"));
request.add(new IndexRequest().index("user5").id("1004").source(XContentType.JSON,"name","zhaoliu"));
request.add(new IndexRequest().index("user5").id("1005").source(XContentType.JSON,"name","hanqi"));
//客户端发送请求,获取响应对象
BulkResponse responses = client.bulk(request, RequestOptions.DEFAULT);
return "took = " + responses.getTook() +" item = " + responses.getItems();
}
查询所有索引数据
@GetMapping("/elas/batch/search/index")
public String BatchSearchElasIndex() throws IOException {
RestHighLevelClient client = new RestHighLevelClient(RestClient.builder(new HttpHost("localhost", 9200, "http")));
// 创建添加对象
// 创建搜索请求对象
SearchRequest request = new SearchRequest();
// 指定索引
request.indices("user5");
// 构建查询的请求体
SearchSourceBuilder sourceBuilder = new SearchSourceBuilder();
// 查询所有数据
sourceBuilder.query(QueryBuilders.matchAllQuery());
request.source(sourceBuilder);
SearchResponse response = client.search(request, RequestOptions.DEFAULT);
// 查询匹配结果
SearchHits hits = response.getHits();
System.out.println("took:" + response.getTook());
System.out.println("timeout:" + response.isTimedOut());
System.out.println("total:" + hits.getTotalHits());
System.out.println("MaxScore:" + hits.getMaxScore());
System.out.println("hits========>>");
for (SearchHit hit : hits) {
//输出每条查询的结果信息
System.out.println(hit.getSourceAsString());
}
System.out.println("<<========");
return "took = " + response.getTook() + " time out = "+ response.isTimedOut() + " total hits = " + hits.getTotalHits() + " max score = " + hits.getMaxScore();
}
2.7 JavaAPI-文档-高级查询-分页查询 & 条件查询 & 查询排序
2.7.1 条件查询
@GetMapping("/elas/best/search/index")
public String BestSearchElasIndex() throws IOException {
RestHighLevelClient client = new RestHighLevelClient(RestClient.builder(new HttpHost("localhost", 9200, "http")));
// 创建搜索请求对象
SearchRequest request = new SearchRequest();
// 指定索引
request.indices("user5");
// 构建查询的请求体
SearchSourceBuilder sourceBuilder = new SearchSourceBuilder();
sourceBuilder.query(QueryBuilders.termQuery("name", "zhangsan"));
request.source(sourceBuilder);
SearchResponse response = client.search(request, RequestOptions.DEFAULT);
// 查询匹配结果
SearchHits hits = response.getHits();
System.out.println("took:" + response.getTook());
System.out.println("timeout:" + response.isTimedOut());
System.out.println("total:" + hits.getTotalHits());
System.out.println("MaxScore:" + hits.getMaxScore());
System.out.println("hits========>>");
for (SearchHit hit : hits) {
//输出每条查询的结果信息
System.out.println(hit.getSourceAsString());
}
System.out.println("<<========");
return "took = " + response.getTook() + " time out = "+ response.isTimedOut() + " total hits = " + hits.getTotalHits() + " max score = " + hits.getMaxScore();
}
2.7.2 分页查询
@GetMapping("/elas/page/search/index")
public String PageSearchElasIndex() throws IOException {
RestHighLevelClient client = new RestHighLevelClient(RestClient.builder(new HttpHost("localhost", 9200, "http")));
// 创建搜索请求对象
SearchRequest request = new SearchRequest();
// 指定索引
request.indices("user5");
// 构建查询的请求体
SearchSourceBuilder sourceBuilder = new SearchSourceBuilder();
sourceBuilder.query(QueryBuilders.matchAllQuery());
// 分页查询
// 当前页其实索引(第一条数据的顺序号), from
sourceBuilder.from(0);
// 每页显示多少条 size
sourceBuilder.size(2);
request.source(sourceBuilder);
SearchResponse response = client.search(request, RequestOptions.DEFAULT);
// 查询匹配结果
SearchHits hits = response.getHits();
System.out.println("took:" + response.getTook());
System.out.println("timeout:" + response.isTimedOut());
System.out.println("total:" + hits.getTotalHits());
System.out.println("MaxScore:" + hits.getMaxScore());
System.out.println("hits========>>");
for (SearchHit hit : hits) {
//输出每条查询的结果信息
System.out.println(hit.getSourceAsString());
}
System.out.println("<<========");
return "took = " + response.getTook() + " time out = "+ response.isTimedOut() + " total hits = " + hits.getTotalHits() + " max score = " + hits.getMaxScore();
}
2.7.3 查询排序
@GetMapping("/elas/sort/order")
public String SortOrderElas() throws IOException {
RestHighLevelClient client = new RestHighLevelClient(RestClient.builder(new HttpHost("localhost", 9200, "http")));
// 创建搜索请求对象
SearchRequest request = new SearchRequest();
request.indices("user");
// 构建查询的请求体
SearchSourceBuilder sourceBuilder = new SearchSourceBuilder();
sourceBuilder.query(QueryBuilders.matchAllQuery());
// 排序
sourceBuilder.sort("age", SortOrder.ASC);
request.source(sourceBuilder);
SearchResponse response = client.search(request, RequestOptions.DEFAULT);
// 查询匹配结果
SearchHits hits = response.getHits();
System.out.println("took:" + response.getTook());
System.out.println("timeout:" + response.isTimedOut());
System.out.println("total:" + hits.getTotalHits());
System.out.println("MaxScore:" + hits.getMaxScore());
System.out.println("hits========>>");
for (SearchHit hit : hits) {
//输出每条查询的结果信息
System.out.println(hit.getSourceAsString());
}
System.out.println("<<========");
return "";
}
2.8 JavaAPI-文档-高级查询-组合查询 & 范围查询
2.8.1 组合查询
@GetMapping("/elas/match/query")
public void matchQuery() throws IOException {
RestHighLevelClient client = new RestHighLevelClient(RestClient.builder(new HttpHost("localhost", 9200, "http")));
// 创建搜索请求对象
SearchRequest request = new SearchRequest();
// 指定查询索引
request.indices("user");
// 构建查询的请求体
SearchSourceBuilder sourceBuilder = new SearchSourceBuilder();
BoolQueryBuilder boolQueryBuilder = QueryBuilders.boolQuery();
// 必须包含
boolQueryBuilder.must(QueryBuilders.matchQuery("age", "20"));
// 一定不含
boolQueryBuilder.mustNot(QueryBuilders.matchQuery("name", "zhangsan"));
// 可能包含
boolQueryBuilder.should(QueryBuilders.matchQuery("sex", "男"));
sourceBuilder.query(boolQueryBuilder);
request.source(sourceBuilder);
SearchResponse response = client.search(request, RequestOptions.DEFAULT);
// 查询匹配结果
SearchHits hits = response.getHits();
System.out.println("took:" + response.getTook());
System.out.println("timeout:" + response.isTimedOut());
System.out.println("total:" + hits.getTotalHits());
System.out.println("MaxScore:" + hits.getMaxScore());
System.out.println("hits========>>");
for (SearchHit hit : hits) {
//输出每条查询的结果信息
System.out.println(hit.getSourceAsString());
}
System.out.println("<<========");
}
2.8.2 范围查询
@GetMapping("/elas/range/query")
public void RangeQuery() throws IOException {
RestHighLevelClient client = new RestHighLevelClient(RestClient.builder(new HttpHost("localhost", 9200, "http")));
// 创建搜索请求对象
SearchRequest request = new SearchRequest();
// 指定查询索引
request.indices("user");
// 构建查询的请求体
SearchSourceBuilder sourceBuilder = new SearchSourceBuilder();
RangeQueryBuilder rangeQuery = QueryBuilders.rangeQuery("age");
// 大于等于
//rangeQuery.gte("30");
// 小于等于
rangeQuery.lte("40");
sourceBuilder.query(rangeQuery);
request.source(sourceBuilder);
SearchResponse response = client.search(request, RequestOptions.DEFAULT);
// 查询匹配结果
SearchHits hits = response.getHits();
System.out.println("took:" + response.getTook());
System.out.println("timeout:" + response.isTimedOut());
System.out.println("total:" + hits.getTotalHits());
System.out.println("MaxScore:" + hits.getMaxScore());
System.out.println("hits========>>");
for (SearchHit hit : hits) {
//输出每条查询的结果信息
System.out.println(hit.getSourceAsString());
}
System.out.println("<<========");
}
2.9 JavaAPI-文档-高级查询-模糊查询 & 高亮查询
2.9.1 模糊查询
@GetMapping("/elas/fuzzy/query")
public void fuzzyQuery() throws IOException {
RestHighLevelClient client = new RestHighLevelClient(RestClient.builder(new HttpHost("localhost", 9200, "http")));
// 创建搜索请求对象
SearchRequest request = new SearchRequest();
// 指定查询的索引
request.indices("user");
// 构建查询的请求体
SearchSourceBuilder sourceBuilder = new SearchSourceBuilder();
sourceBuilder.query(QueryBuilders.fuzzyQuery("name", "wangw").fuzziness(Fuzziness.ONE));
request.source(sourceBuilder);
SearchResponse response = client.search(request, RequestOptions.DEFAULT);
// 查询匹配结果
SearchHits hits = response.getHits();
System.out.println("took:" + response.getTook());
System.out.println("timeout:" + response.isTimedOut());
System.out.println("total:" + hits.getTotalHits());
System.out.println("MaxScore:" + hits.getMaxScore());
System.out.println("hits========>>");
for (SearchHit hit : hits) {
//输出每条查询的结果信息
System.out.println(hit.getSourceAsString());
}
System.out.println("<<========");
}
2.9.2 高亮查询
@GetMapping("/elas/high/query")
public void HighLightSearch() throws IOException {
RestHighLevelClient client = new RestHighLevelClient(RestClient.builder(new HttpHost("localhost", 9200, "http")));
// 高亮查询
SearchRequest request = new SearchRequest().indices("user");
//2.创建查询请求体构建器
SearchSourceBuilder sourceBuilder = new SearchSourceBuilder();
//构建查询方式:高亮查询
TermsQueryBuilder termsQueryBuilder =
QueryBuilders.termsQuery("name","lisi");
//设置查询方式
sourceBuilder.query(termsQueryBuilder);
//构建高亮字段
HighlightBuilder highlightBuilder = new HighlightBuilder();
highlightBuilder.preTags("<font color='red'>");//设置标签前缀
highlightBuilder.postTags("</font>");//设置标签后缀
highlightBuilder.field("name");//设置高亮字段
//设置高亮构建对象
sourceBuilder.highlighter(highlightBuilder);
//设置请求体
request.source(sourceBuilder);
//3.客户端发送请求,获取响应对象
SearchResponse response = client.search(request, RequestOptions.DEFAULT);
//4.打印响应结果
SearchHits hits = response.getHits();
System.out.println("took::"+response.getTook());
System.out.println("time_out::"+response.isTimedOut());
System.out.println("total::"+hits.getTotalHits());
System.out.println("max_score::"+hits.getMaxScore());
System.out.println("hits::::>>");
for (SearchHit hit : hits) {
String sourceAsString = hit.getSourceAsString();
System.out.println(sourceAsString);
//打印高亮结果
Map<String, HighlightField> highlightFields = hit.getHighlightFields();
System.out.println(highlightFields);
}
System.out.println("<<::::");
}
2.10 JavaAPI-文档-高级查询-最大值查询 & 分组查询(聚合查询)
2.10.1 最大值查询
@GetMapping("/elas/maxvalue/query")
public void MaxValueSearch() throws IOException {
RestHighLevelClient client = new RestHighLevelClient(RestClient.builder(new HttpHost("localhost", 9200, "http")));
// 高亮查询
SearchRequest request = new SearchRequest().indices("user");
SearchSourceBuilder sourceBuilder = new SearchSourceBuilder();
sourceBuilder.aggregation(AggregationBuilders.max("maxAge").field("age"));
//设置请求体
request.source(sourceBuilder);
//3.客户端发送请求,获取响应对象
SearchResponse response = client.search(request, RequestOptions.DEFAULT);
//4.打印响应结果
SearchHits hits = response.getHits();
System.out.println(response);
for (SearchHit hit : hits) {
System.out.println(hit);
}
}
2.10.2 分组查询
@GetMapping("/elas/group/query")
public void GroupSearch() throws IOException {
RestHighLevelClient client = new RestHighLevelClient(RestClient.builder(new HttpHost("localhost", 9200, "http")));
// 高亮查询
SearchRequest request = new SearchRequest().indices("user");
SearchSourceBuilder sourceBuilder = new SearchSourceBuilder();
sourceBuilder.aggregation(AggregationBuilders.terms("age_groupby").field("age"));
//设置请求体
request.source(sourceBuilder);
//3.客户端发送请求,获取响应对象
SearchResponse response = client.search(request, RequestOptions.DEFAULT);
//4.打印响应结果
SearchHits hits = response.getHits();
System.out.println(response);
for (SearchHit hit : hits) {
System.out.println(hit);
}
}
三、Elasticsearch框架集成
Spring Data是一个用于简化数据库、非关系型数据库、索引库访问,并支持云服务的开源框架。其主要目标是使得对数据的访问变得方便快捷,并支持 map-reduce框架和云计算数据服务。Spring Data可以极大的简化JPA(Elasticsearch…)的写法,可以在几乎不用写实现的情况下,实现对数据的访问和操作。除了CRUD 外,还包括如分页、排序等一些常用的功能。
Spring Data 常用的功能模块如下:
Spring Data JDBC Spring Data JPA Spring Data LDAP Spring Data MongoDB Spring Data Redis Spring Data R2DBC Spring Data REST Spring Data for Apache Cassandra Spring Data for Apache Geode Spring Data for Apache Solr Spring Data for Pivotal GemFire Spring Data Couchbase Spring Data Elasticsearch Spring Data Envers Spring Data Neo4j Spring Data JDBC Extensions Spring for Apache Hadoop
3.1 Spring Data Elasticsearch 介绍
Spring Data Elasticsearch基于Spring Data API简化 Elasticsearch 操作,将原始操作Elasticsearch 的客户端API进行封装。Spring Data为Elasticsearch 项目提供集成搜索引擎。Spring Data Elasticsearch POJO的关键功能区域为中心的模型与Elastichsearch交互文档和轻松地编写一个存储索引库数据访问层。
3.2 框架集成-SpringData-代码功能集成
一、创建Maven项目。
二、修改pom文件,增加依赖关系。
<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
<modelVersion>4.0.0</modelVersion>
<parent>
<groupId>org.springframework.boot</groupId>
<artifactId>spring-boot-starter-parent</artifactId>
<version>2.3.6.RELEASE</version>
<relativePath/>
</parent>
<groupId>com.lun</groupId>
<artifactId>SpringDataWithES</artifactId>
<version>1.0.0-SNAPSHOT</version>
<properties>
<maven.compiler.source>8</maven.compiler.source>
<maven.compiler.target>8</maven.compiler.target>
</properties>
<dependencies>
<dependency>
<groupId>org.projectlombok</groupId>
<artifactId>lombok</artifactId>
</dependency>
<dependency>
<groupId>org.springframework.boot</groupId>
<artifactId>spring-boot-starter-data-elasticsearch</artifactId>
</dependency>
<dependency>
<groupId>org.springframework.boot</groupId>
<artifactId>spring-boot-devtools</artifactId>
<scope>runtime</scope>
<optional>true</optional>
</dependency>
<dependency>
<groupId>org.springframework.boot</groupId>
<artifactId>spring-boot-starter-test</artifactId>
<scope>test</scope>
</dependency>
<dependency>
<groupId>org.springframework.boot</groupId>
<artifactId>spring-boot-test</artifactId>
</dependency>
<dependency>
<groupId>junit</groupId>
<artifactId>junit</artifactId>
</dependency>
<dependency>
<groupId>org.springframework</groupId>
<artifactId>spring-test</artifactId>
</dependency>
</dependencies>
</project>
三、增加配置文件。
在 resources 目录中增加application.properties文件
# es 服务地址
elasticsearch.host=127.0.0.1
# es 服务端口
elasticsearch.port=9200
# 配置日志级别,开启 debug 日志
logging.level.com.atguigu.es=debug
四、Spring Boot 主程序
import org.springframework.boot.SpringApplication;
import org.springframework.boot.autoconfigure.SpringBootApplication;
@SpringBootApplication
public class MainApplication {
public static void main(String[] args) {
SpringApplication.run(MainApplication.class, args);
}
}
五、数据实体类。
import lombok.AllArgsConstructor;
import lombok.Data;
import lombok.NoArgsConstructor;
import lombok.ToString;
import org.springframework.data.annotation.Id;
import org.springframework.data.elasticsearch.annotations.Document;
import org.springframework.data.elasticsearch.annotations.Field;
import org.springframework.data.elasticsearch.annotations.FieldType;
@Data
@NoArgsConstructor
@AllArgsConstructor
@ToString
@Document(indexName = "shopping", shards = 3, replicas = 1)
public class Product {
//必须有 id,这里的 id 是全局唯一的标识,等同于 es 中的"_id"
@Id
private Long id;//商品唯一标识
/**
* type : 字段数据类型
* analyzer : 分词器类型
* index : 是否索引(默认:true)
* Keyword : 短语,不进行分词
*/
@Field(type = FieldType.Text, analyzer = "ik_max_word")
private String title;//商品名称
@Field(type = FieldType.Keyword)
private String category;//分类名称
@Field(type = FieldType.Double)
private Double price;//商品价格
@Field(type = FieldType.Keyword, index = false)
private String images;//图片地址
}
六、配置类
- ElasticsearchRestTemplate是spring-data-elasticsearch项目中的一个类,和其他spring项目中的 template类似。
- 在新版的spring-data-elasticsearch 中,ElasticsearchRestTemplate 代替了原来的ElasticsearchTemplate。
- 原因是ElasticsearchTemplate基于TransportClient,TransportClient即将在8.x 以后的版本中移除。所以,我们推荐使用ElasticsearchRestTemplate。
- ElasticsearchRestTemplate基于RestHighLevelClient客户端的。需要自定义配置类,继承AbstractElasticsearchConfiguration,并实现elasticsearchClient()抽象方法,创建RestHighLevelClient对象。
AbstractElasticsearchConfiguration源码:
package org.springframework.data.elasticsearch.config;
import org.elasticsearch.client.RestHighLevelClient;
import org.springframework.context.annotation.Bean;
import org.springframework.data.elasticsearch.core.ElasticsearchOperations;
import org.springframework.data.elasticsearch.core.ElasticsearchRestTemplate;
import org.springframework.data.elasticsearch.core.convert.ElasticsearchConverter;
/**
* @author Christoph Strobl
* @author Peter-Josef Meisch
* @since 3.2
* @see ElasticsearchConfigurationSupport
*/
public abstract class AbstractElasticsearchConfiguration extends ElasticsearchConfigurationSupport {
//需重写本方法
public abstract RestHighLevelClient elasticsearchClient();
@Bean(name = { "elasticsearchOperations", "elasticsearchTemplate" })
public ElasticsearchOperations elasticsearchOperations(ElasticsearchConverter elasticsearchConverter) {
return new ElasticsearchRestTemplate(elasticsearchClient(), elasticsearchConverter);
}
}
需要自定义配置类,继承AbstractElasticsearchConfiguration,并实现elasticsearchClient()抽象方法,创建RestHighLevelClient对象。
import lombok.Data;
import org.apache.http.HttpHost;
import org.elasticsearch.client.RestClient;
import org.elasticsearch.client.RestClientBuilder;
import org.elasticsearch.client.RestHighLevelClient;
import org.springframework.boot.context.properties.ConfigurationProperties;
import org.springframework.context.annotation.Configuration;
import org.springframework.data.elasticsearch.config.AbstractElasticsearchConfiguration;
@ConfigurationProperties(prefix = "elasticsearch")
@Configuration
@Data
public class ElasticsearchConfig extends AbstractElasticsearchConfiguration{
private String host ;
private Integer port ;
//重写父类方法
@Override
public RestHighLevelClient elasticsearchClient() {
RestClientBuilder builder = RestClient.builder(new HttpHost(host, port));
RestHighLevelClient restHighLevelClient = new
RestHighLevelClient(builder);
return restHighLevelClient;
}
}
七、DAO 数据访问对象
import com.lun.model.Product;
import org.springframework.data.elasticsearch.repository.ElasticsearchRepository;
import org.springframework.stereotype.Repository;
@Repository
public interface ProductDao extends ElasticsearchRepository<Product, Long>{
}
3.3 框架集成-SpringData-集成测试-索引操作
import com.lun.model.Product;
import org.junit.Test;
import org.junit.runner.RunWith;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.boot.test.context.SpringBootTest;
import org.springframework.data.elasticsearch.core.ElasticsearchRestTemplate;
import org.springframework.test.context.junit4.SpringRunner;
@RunWith(SpringRunner.class)
@SpringBootTest
public class SpringDataESIndexTest {
//注入 ElasticsearchRestTemplate
@Autowired
private ElasticsearchRestTemplate elasticsearchRestTemplate;
//创建索引并增加映射配置
@Test
public void createIndex(){
//创建索引,系统初始化会自动创建索引
System.out.println("创建索引");
}
@Test
public void deleteIndex(){
//创建索引,系统初始化会自动创建索引
boolean flg = elasticsearchRestTemplate.deleteIndex(Product.class);
System.out.println("删除索引 = " + flg);
}
}
用Postman 检测有没有创建和删除。
#GET http://localhost:9200/_cat/indices?v
3.4 框架集成-SpringData-集成测试-文档操作
import com.lun.dao.ProductDao;
import com.lun.model.Product;
import org.junit.Test;
import org.junit.runner.RunWith;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.boot.test.context.SpringBootTest;
import org.springframework.data.domain.Page;
import org.springframework.data.domain.PageRequest;
import org.springframework.data.domain.Sort;
import org.springframework.test.context.junit4.SpringRunner;
import java.util.ArrayList;
import java.util.List;
@RunWith(SpringRunner.class)
@SpringBootTest
public class SpringDataESProductDaoTest {
@Autowired
private ProductDao productDao;
/**
* 新增
*/
@Test
public void save(){
Product product = new Product();
product.setId(2L);
product.setTitle("华为手机");
product.setCategory("手机");
product.setPrice(2999.0);
product.setImages("http://www.atguigu/hw.jpg");
productDao.save(product);
}
//POSTMAN, GET http://localhost:9200/product/_doc/2
//修改
@Test
public void update(){
Product product = new Product();
product.setId(2L);
product.setTitle("小米 2 手机");
product.setCategory("手机");
product.setPrice(9999.0);
product.setImages("http://www.atguigu/xm.jpg");
productDao.save(product);
}
//POSTMAN, GET http://localhost:9200/product/_doc/2
//根据 id 查询
@Test
public void findById(){
Product product = productDao.findById(2L).get();
System.out.println(product);
}
@Test
public void findAll(){
Iterable<Product> products = productDao.findAll();
for (Product product : products) {
System.out.println(product);
}
}
//删除
@Test
public void delete(){
Product product = new Product();
product.setId(2L);
productDao.delete(product);
}
//POSTMAN, GET http://localhost:9200/product/_doc/2
//批量新增
@Test
public void saveAll(){
List<Product> productList = new ArrayList<>();
for (int i = 0; i < 10; i++) {
Product product = new Product();
product.setId(Long.valueOf(i));
product.setTitle("["+i+"]小米手机");
product.setCategory("手机");
product.setPrice(1999.0 + i);
product.setImages("http://www.atguigu/xm.jpg");
productList.add(product);
}
productDao.saveAll(productList);
}
//分页查询
@Test
public void findByPageable(){
//设置排序(排序方式,正序还是倒序,排序的 id)
Sort sort = Sort.by(Sort.Direction.DESC,"id");
int currentPage=0;//当前页,第一页从 0 开始, 1 表示第二页
int pageSize = 5;//每页显示多少条
//设置查询分页
PageRequest pageRequest = PageRequest.of(currentPage, pageSize,sort);
//分页查询
Page<Product> productPage = productDao.findAll(pageRequest);
for (Product Product : productPage.getContent()) {
System.out.println(Product);
}
}
}
3.5 框架集成-SpringData-集成测试-文档搜索
import com.lun.dao.ProductDao;
import com.lun.model.Product;
import org.elasticsearch.index.query.QueryBuilders;
import org.elasticsearch.index.query.TermQueryBuilder;
import org.junit.Test;
import org.junit.runner.RunWith;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.boot.test.context.SpringBootTest;
import org.springframework.data.domain.PageRequest;
import org.springframework.test.context.junit4.SpringRunner;
@RunWith(SpringRunner.class)
@SpringBootTest
public class SpringDataESSearchTest {
@Autowired
private ProductDao productDao;
/**
* term 查询
* search(termQueryBuilder) 调用搜索方法,参数查询构建器对象
*/
@Test
public void termQuery(){
TermQueryBuilder termQueryBuilder = QueryBuilders.termQuery("title", "小米");
Iterable<Product> products = productDao.search(termQueryBuilder);
for (Product product : products) {
System.out.println(product);
}
}
/**
* term 查询加分页
*/
@Test
public void termQueryByPage(){
int currentPage= 0 ;
int pageSize = 5;
//设置查询分页
PageRequest pageRequest = PageRequest.of(currentPage, pageSize);
TermQueryBuilder termQueryBuilder = QueryBuilders.termQuery("title", "小米");
Iterable<Product> products =
productDao.search(termQueryBuilder,pageRequest);
for (Product product : products) {
System.out.println(product);
}
}
}
3.6 框架集成-SparkStreaming-集成
Spark Streaming 是Spark core API的扩展,支持实时数据流的处理,并且具有可扩展,高吞吐量,容错的特点。数据可以从许多来源获取,如Kafka, Flume,Kinesis或TCP sockets,并且可以使用复杂的算法进行处理,这些算法使用诸如 map,reduce,join和 window等高级函数表示。最后,处理后的数据可以推送到文件系统,数据库等。实际上,您可以将Spark的机器学习和图形处理算法应用于数据流。
一、创建Maven项目。
二、修改 pom 文件,增加依赖关系。
<?xml version="1.0" encoding="utf-8"?>
<project
xmlns="http://maven.apache.org/POM/4.0.0"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
<modelVersion>4.0.0</modelVersion>
<groupId>com.lun.es</groupId>
<artifactId>sparkstreaming-elasticsearch</artifactId>
<version>1.0</version>
<properties>
<maven.compiler.source>8</maven.compiler.source>
<maven.compiler.target>8</maven.compiler.target>
</properties>
<dependencies>
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-core_2.12</artifactId>
<version>3.0.0</version>
</dependency>
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-streaming_2.12</artifactId>
<version>3.0.0</version>
</dependency>
<dependency>
<groupId>org.elasticsearch</groupId>
<artifactId>elasticsearch</artifactId>
<version>7.8.0</version>
</dependency>
<!-- elasticsearch 的客户端 -->
<dependency>
<groupId>org.elasticsearch.client</groupId>
<artifactId>elasticsearch-rest-high-level-client</artifactId>
<version>7.8.0</version>
</dependency>
<!-- elasticsearch 依赖 2.x 的 log4j -->
<dependency>
<groupId>org.apache.logging.log4j</groupId>
<artifactId>log4j-api</artifactId>
<version>2.8.2</version>
</dependency>
<dependency>
<groupId>org.apache.logging.log4j</groupId>
<artifactId>log4j-core</artifactId>
<version>2.8.2</version>
</dependency>
<!-- <dependency>-->
<!-- <groupId>com.fasterxml.jackson.core</groupId>-->
<!-- <artifactId>jackson-databind</artifactId>-->
<!-- <version>2.11.1</version>-->
<!-- </dependency>-->
<!-- <!– junit 单元测试 –>-->
<!-- <dependency>-->
<!-- <groupId>junit</groupId>-->
<!-- <artifactId>junit</artifactId>-->
<!-- <version>4.12</version>-->
<!-- </dependency>-->
</dependencies>
</project>
3.7 框架集成-Flink-集成
Apache Spark是一-种基于内存的快速、通用、可扩展的大数据分析计算引擎。Apache Spark掀开了内存计算的先河,以内存作为赌注,贏得了内存计算的飞速发展。但是在其火热的同时,开发人员发现,在Spark中,计算框架普遍存在的缺点和不足依然没有完全解决,而这些问题随着5G时代的来临以及决策者对实时数据分析结果的迫切需要而凸显的更加明显:
乱序数据,迟到数据 低延迟,高吞吐,准确性 容错性 数据精准一次性处理(Exactly-Once) Apache Flink是一个框架和分布式处理引擎,用于对无界和有界数据流进行有状态计算。在Spark火热的同时,也默默地发展自己,并尝试着解决其他计算框架的问题。慢慢地,随着这些问题的解决,Flink 慢慢被绝大数程序员所熟知并进行大力推广,阿里公司在2015年改进Flink,并创建了内部分支Blink,目前服务于阿里集团内部搜索、推荐、广告和蚂蚁等大量核心实时业务。
一、创建Maven项目。
二、修改 pom 文件,增加相关依赖类库。
<?xml version="1.0" encoding="UTF-8"?>
<project
xmlns="http://maven.apache.org/POM/4.0.0"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://maven.apache.org/POM/4.0.0
http://maven.apache.org/xsd/maven-4.0.0.xsd">
<modelVersion>4.0.0</modelVersion>
<groupId>com.lun.es</groupId>
<artifactId>flink-elasticsearch</artifactId>
<version>1.0</version>
<properties>
<maven.compiler.source>8</maven.compiler.source>
<maven.compiler.target>8</maven.compiler.target>
</properties>
<dependencies>
<dependency>
<groupId>org.apache.flink</groupId>
<artifactId>flink-scala_2.12</artifactId>
<version>1.12.0</version>
</dependency>
<dependency>
<groupId>org.apache.flink</groupId>
<artifactId>flink-streaming-scala_2.12</artifactId>
<version>1.12.0</version>
</dependency>
<dependency>
<groupId>org.apache.flink</groupId>
<artifactId>flink-clients_2.12</artifactId>
<version>1.12.0</version>
</dependency>
<dependency>
<groupId>org.apache.flink</groupId>
<artifactId>flink-connector-elasticsearch7_2.11</artifactId>
<version>1.12.0</version>
</dependency>
<!-- jackson -->
<dependency>
<groupId>com.fasterxml.jackson.core</groupId>
<artifactId>jackson-core</artifactId>
<version>2.11.1</version>
</dependency>
</dependencies>
</project>
三、功能实现
import org.apache.flink.api.common.functions.RuntimeContext;
import org.apache.flink.streaming.api.datastream.DataStreamSource;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.connectors.elasticsearch.ElasticsearchSinkFunction;
import org.apache.flink.streaming.connectors.elasticsearch.RequestIndexer;
import org.apache.flink.streaming.connectors.elasticsearch7.ElasticsearchSink;
import org.apache.http.HttpHost;
import org.elasticsearch.action.index.IndexRequest;
import org.elasticsearch.client.Requests;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
public class FlinkElasticsearchSinkTest {
public static void main(String[] args) throws Exception {
StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
DataStreamSource<String> source = env.socketTextStream("localhost", 9999);
List<HttpHost> httpHosts = new ArrayList<>();
httpHosts.add(new HttpHost("127.0.0.1", 9200, "http"));
//httpHosts.add(new HttpHost("10.2.3.1", 9200, "http"));
// use a ElasticsearchSink.Builder to create an ElasticsearchSink
ElasticsearchSink.Builder<String> esSinkBuilder = new ElasticsearchSink.Builder<>(httpHosts,
new ElasticsearchSinkFunction<String>() {
public IndexRequest createIndexRequest(String element) {
Map<String, String> json = new HashMap<>();
json.put("data", element);
return Requests.indexRequest()
.index("my-index")
//.type("my-type")
.source(json);
}
@Override
public void process(String element, RuntimeContext ctx, RequestIndexer indexer) {
indexer.add(createIndexRequest(element));
}
}
);
// configuration for the bulk requests; this instructs the sink to emit after every element, otherwise they would be buffered
esSinkBuilder.setBulkFlushMaxActions(1);
// provide a RestClientFactory for custom configuration on the internally createdREST client
// esSinkBuilder.setRestClientFactory(
// restClientBuilder -> {
// restClientBuilder.setDefaultHeaders(...)
// restClientBuilder.setMaxRetryTimeoutMillis(...)
// restClientBuilder.setPathPrefix(...)
// restClientBuilder.setHttpClientConfigCallback(...)
// }
// );
source.addSink(esSinkBuilder.build());
env.execute("flink-es");
}
}