In this article I will describe in detail how to use the latest Elasticsearch Java client 8.0 to create an index and run searches. The latest Elasticsearch Java client API is different from the previous one. In some earlier tutorials we used the High Level REST Client API, which the official documentation now marks as deprecated.
Prerequisites
- Java 8 or later
- A JSON object mapping library that lets your application classes integrate seamlessly with the Elasticsearch API. The Java client supports Jackson, or a JSON-B library such as Eclipse Yasson.
Releases are hosted on Maven Central. If you are looking for a SNAPSHOT version, the Elastic Maven snapshot repository is available at snapshots.elastic.co/maven/.
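If you do want to try a SNAPSHOT build, the snapshot repository can be declared in your pom.xml along these lines (a sketch; the repository id and name are arbitrary choices of mine, not official values):

```xml
<repositories>
    <repository>
        <id>elastic-snapshots</id>
        <name>Elastic Maven Snapshots</name>
        <url>https://snapshots.elastic.co/maven/</url>
        <snapshots>
            <enabled>true</enabled>
        </snapshots>
    </repository>
</repositories>
```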
Why do we need a new Java client?
Many developers may wonder why a new client is needed at all. Wasn't the old High Level REST Client working just fine? The old High Level REST Client API had the following problems:
- It shared a lot of code with the Elasticsearch server
- It pulled in a large number of dependencies (30+ MB), much of which was never used
- It was easy to misuse: the API exposed many internals of the Elasticsearch server
- The API was written by hand
- The API was sometimes inconsistent across versions
- It required a lot of maintenance work (400+ endpoints)
- There was no integrated JSON/object mapping
- You had to do the mapping yourself using byte buffers
The new Java client API has the following advantages:
- The API is generated from code
- It is based on the official Elasticsearch API specification
- The Java client is the first of a new generation of Elasticsearch clients; clients for other languages will follow
- 99% of the code is auto-generated
- It is an opportunity to provide a more modern API
- Fluent functional builders
- A hierarchical DSL that stays close to the Elasticsearch JSON format
- Automatic mapping to and from your application classes
- Java 8 compatibility is maintained
Installation
If you do not yet have Elasticsearch and Kibana installed, please see my earlier articles:
- How to install Elasticsearch on Linux, MacOS, and Windows
- Kibana: How to install Kibana from the Elastic Stack on Linux, MacOS, and Windows
- Elasticsearch: Setting up Elastic account security
If you want to try this on Elastic Stack 8.0, you can refer to the article "Elastic Stack 8.0 installation - securing your Elastic Stack is now easier than ever". In this demonstration we do not enable HTTPS access; follow the section "How to configure Elasticsearch with basic security only" in that article to configure Elasticsearch with basic security.
Demonstration
In today's demonstration I will use a Maven project, although Gradle works as well. To make it easier to follow along, I have uploaded the project to GitHub: GitHub - liu-xiao-guo/ElasticsearchJava-search8
First, our pom.xml file is as follows:
pom.xml
<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
    <modelVersion>4.0.0</modelVersion>

    <groupId>org.example</groupId>
    <artifactId>ElasticsearchJava-search8</artifactId>
    <version>1.0-SNAPSHOT</version>

    <properties>
        <maven.compiler.source>8</maven.compiler.source>
        <maven.compiler.target>8</maven.compiler.target>
        <elastic.version>8.0.1</elastic.version>
    </properties>

    <dependencies>
        <dependency>
            <groupId>co.elastic.clients</groupId>
            <artifactId>elasticsearch-java</artifactId>
            <version>${elastic.version}</version>
        </dependency>

        <dependency>
            <groupId>com.fasterxml.jackson.core</groupId>
            <artifactId>jackson-databind</artifactId>
            <version>2.12.3</version>
        </dependency>

        <!-- Needed only if you use the spring-boot Maven plugin -->
        <dependency>
            <groupId>jakarta.json</groupId>
            <artifactId>jakarta.json-api</artifactId>
            <version>2.0.1</version>
        </dependency>
    </dependencies>
</project>
As shown above, we use version 8.0.1. You can also use the latest version, 8.1.1, found via Maven Central Repository Search.
Next, we create a file named Product.java:
Product.java
public class Product {
    private String id;
    private String name;
    private int price;

    public Product() {
    }

    public Product(String id, String name, int price) {
        this.id = id;
        this.name = name;
        this.price = price;
    }

    public String getId() {
        return id;
    }

    public String getName() {
        return name;
    }

    public int getPrice() {
        return price;
    }

    public void setId(String id) {
        this.id = id;
    }

    public void setName(String name) {
        this.name = name;
    }

    public void setPrice(int price) {
        this.price = price;
    }

    @Override
    public String toString() {
        return "Product{" +
                "id='" + id + '\'' +
                ", name='" + name + '\'' +
                ", price=" + price +
                '}';
    }
}
Next, we create the ElasticsearchJava.java file:
import co.elastic.clients.elasticsearch.ElasticsearchAsyncClient;
import co.elastic.clients.elasticsearch.ElasticsearchClient;
import co.elastic.clients.elasticsearch._types.query_dsl.QueryBuilders;
import co.elastic.clients.elasticsearch._types.query_dsl.TermQuery;
import co.elastic.clients.elasticsearch.core.*;
import co.elastic.clients.elasticsearch.core.search.Hit;
import co.elastic.clients.json.jackson.JacksonJsonpMapper;
import co.elastic.clients.transport.ElasticsearchTransport;
import co.elastic.clients.transport.rest_client.RestClientTransport;
import org.apache.http.HttpHost;
import org.apache.http.auth.AuthScope;
import org.apache.http.auth.UsernamePasswordCredentials;
import org.apache.http.client.CredentialsProvider;
import org.apache.http.impl.client.BasicCredentialsProvider;
import org.apache.http.impl.nio.client.HttpAsyncClientBuilder;
import org.elasticsearch.client.RestClient;
import org.elasticsearch.client.RestClientBuilder;

import java.io.IOException;

public class ElasticsearchJava {

    private static ElasticsearchClient client = null;
    private static ElasticsearchAsyncClient asyncClient = null;

    private static synchronized void makeConnection() {
        // Create the low-level client
        final CredentialsProvider credentialsProvider =
                new BasicCredentialsProvider();
        credentialsProvider.setCredentials(AuthScope.ANY,
                new UsernamePasswordCredentials("elastic", "password"));

        RestClientBuilder builder = RestClient.builder(
                new HttpHost("localhost", 9200))
                .setHttpClientConfigCallback(new RestClientBuilder.HttpClientConfigCallback() {
                    @Override
                    public HttpAsyncClientBuilder customizeHttpClient(
                            HttpAsyncClientBuilder httpClientBuilder) {
                        return httpClientBuilder
                                .setDefaultCredentialsProvider(credentialsProvider);
                    }
                });

        RestClient restClient = builder.build();

        // Create the transport with a Jackson mapper
        ElasticsearchTransport transport = new RestClientTransport(
                restClient, new JacksonJsonpMapper());

        // And create the API client
        client = new ElasticsearchClient(transport);
        asyncClient = new ElasticsearchAsyncClient(transport);
    }

    public static void main(String[] args) throws IOException {
        makeConnection();

        // Index data to an index products
        Product product = new Product("abc", "Bag", 42);

        IndexRequest<Product> indexRequest = new IndexRequest.Builder<Product>()
                .index("products")
                .id("abc")
                .document(product)
                .build();

        client.index(indexRequest);

        Product product1 = new Product("efg", "Bag", 42);

        client.index(builder -> builder
                .index("products")
                .id(product1.getId())
                .document(product1)
        );

        // Search for a data
        TermQuery query = QueryBuilders.term()
                .field("name")
                .value("bag")
                .build();

        SearchRequest request = new SearchRequest.Builder()
                .index("products")
                .query(query._toQuery())
                .build();

        SearchResponse<Product> search =
                client.search(request, Product.class);

        for (Hit<Product> hit : search.hits().hits()) {
            Product pd = hit.source();
            System.out.println(pd);
        }

        SearchResponse<Product> search1 = client.search(s -> s
                        .index("products")
                        .query(q -> q
                                .term(t -> t
                                        .field("name")
                                        .value(v -> v.stringValue("bag"))
                                )),
                Product.class);

        for (Hit<Product> hit : search1.hits().hits()) {
            Product pd = hit.source();
            System.out.println(pd);
        }

        // Splitting complex DSL
        TermQuery termQuery = TermQuery.of(t -> t.field("name").value("bag"));

        SearchResponse<Product> search2 = client.search(s -> s
                        .index("products")
                        .query(termQuery._toQuery()),
                Product.class);

        for (Hit<Product> hit : search2.hits().hits()) {
            Product pd = hit.source();
            System.out.println(pd);
        }

        // Creating aggregations
        SearchResponse<Void> search3 = client.search(b -> b
                        .index("products")
                        .size(0)
                        .aggregations("price-histo", a -> a
                                .histogram(h -> h
                                        .field("price")
                                        .interval(20.0)
                                )
                        ),
                Void.class);

        long firstBucketCount = search3.aggregations()
                .get("price-histo")
                .histogram()
                .buckets().array()
                .get(0)
                .docCount();

        System.out.println("doc count: " + firstBucketCount);
    }
}
The code above is quite straightforward. We use the following code to connect to Elasticsearch:
private static synchronized void makeConnection() {
    // Create the low-level client
    final CredentialsProvider credentialsProvider =
            new BasicCredentialsProvider();
    credentialsProvider.setCredentials(AuthScope.ANY,
            new UsernamePasswordCredentials("elastic", "password"));

    RestClientBuilder builder = RestClient.builder(
            new HttpHost("localhost", 9200))
            .setHttpClientConfigCallback(new RestClientBuilder.HttpClientConfigCallback() {
                @Override
                public HttpAsyncClientBuilder customizeHttpClient(
                        HttpAsyncClientBuilder httpClientBuilder) {
                    return httpClientBuilder
                            .setDefaultCredentialsProvider(credentialsProvider);
                }
            });

    RestClient restClient = builder.build();

    // Create the transport with a Jackson mapper
    ElasticsearchTransport transport = new RestClientTransport(
            restClient, new JacksonJsonpMapper());

    // And create the API client
    client = new ElasticsearchClient(transport);
    asyncClient = new ElasticsearchAsyncClient(transport);
}
Above, we access the cluster as the elastic superuser, whose password here is password. In a real deployment you should set these according to your own environment.
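Rather than hard-coding the superuser password as above, a common pattern is to read the credentials from the environment. A minimal sketch (the ES_USER and ES_PASSWORD variable names are my own choice, not a convention of the client):

```java
public class EsCredentials {
    public static void main(String[] args) {
        // Fall back to the demo values when the variables are not set.
        String user = System.getenv().getOrDefault("ES_USER", "elastic");
        String password = System.getenv().getOrDefault("ES_PASSWORD", "password");
        // These values would then be passed to
        // new UsernamePasswordCredentials(user, password) in makeConnection().
        System.out.println(user);
    }
}
```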
Below, we index data into the products index using the following two styles:
// Index data to an index products
Product product = new Product("abc", "Bag", 42);

IndexRequest<Product> indexRequest = new IndexRequest.Builder<Product>()
        .index("products")
        .id("abc")
        .document(product)
        .build();

client.index(indexRequest);

Product product1 = new Product("efg", "Bag", 42);

client.index(builder -> builder
        .index("products")
        .id(product1.getId())
        .document(product1)
);
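Note that makeConnection() also creates an ElasticsearchAsyncClient, which this demonstration never actually uses. Its methods return a CompletableFuture, so the same index call could be issued without blocking. The completion pattern looks like the following stdlib-only sketch, where a plain CompletableFuture stands in for the asyncClient.index(...) call:

```java
import java.util.concurrent.CompletableFuture;

public class AsyncPattern {
    public static void main(String[] args) {
        // Stand-in for asyncClient.index(...), which returns a
        // CompletableFuture<IndexResponse> in the real client.
        CompletableFuture<String> pending =
                CompletableFuture.supplyAsync(() -> "indexed abc");

        // whenComplete receives either the response or the exception.
        pending.whenComplete((response, exception) -> {
            if (exception != null) {
                System.out.println("Indexing failed: " + exception);
            } else {
                System.out.println(response);
            }
        }).join(); // block only for this demo; real code would keep the future
    }
}
```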
We can check the result in Kibana:
GET products/_search
The command above returns:
{
  "took" : 0,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 2,
      "relation" : "eq"
    },
    "max_score" : 1.0,
    "hits" : [
      {
        "_index" : "products",
        "_id" : "abc",
        "_score" : 1.0,
        "_source" : {
          "id" : "abc",
          "name" : "Bag",
          "price" : 42
        }
      },
      {
        "_index" : "products",
        "_id" : "efg",
        "_score" : 1.0,
        "_source" : {
          "id" : "efg",
          "name" : "Bag",
          "price" : 42
        }
      }
    ]
  }
}
Clearly, both documents were written successfully.
Next, I search using the following two styles:
// Search for a data
TermQuery query = QueryBuilders.term()
        .field("name")
        .value("bag")
        .build();

SearchRequest request = new SearchRequest.Builder()
        .index("products")
        .query(query._toQuery())
        .build();

SearchResponse<Product> search =
        client.search(request, Product.class);

for (Hit<Product> hit : search.hits().hits()) {
    Product pd = hit.source();
    System.out.println(pd);
}

SearchResponse<Product> search1 = client.search(s -> s
                .index("products")
                .query(q -> q
                        .term(t -> t
                                .field("name")
                                .value(v -> v.stringValue("bag"))
                        )),
        Product.class);

for (Hit<Product> hit : search1.hits().hits()) {
    Product pd = hit.source();
    System.out.println(pd);
}
This search is equivalent to the request below. Note that we query for the lowercase term bag even though we indexed Bag: the name field is dynamically mapped as text, so the standard analyzer lowercases its tokens at index time, and a term query matches those stored tokens exactly.
GET products/_search
{
  "query": {
    "term": {
      "name": {
        "value": "bag"
      }
    }
  }
}
The search above returns:
{
  "took" : 0,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 2,
      "relation" : "eq"
    },
    "max_score" : 0.18232156,
    "hits" : [
      {
        "_index" : "products",
        "_id" : "abc",
        "_score" : 0.18232156,
        "_source" : {
          "id" : "abc",
          "name" : "Bag",
          "price" : 42
        }
      },
      {
        "_index" : "products",
        "_id" : "efg",
        "_score" : 0.18232156,
        "_source" : {
          "id" : "efg",
          "name" : "Bag",
          "price" : 42
        }
      }
    ]
  }
}
The Java code prints the following (each document appears twice because we ran the same search in both styles):
Product{id='abc', name='Bag', price=42}
Product{id='efg', name='Bag', price=42}
Product{id='abc', name='Bag', price=42}
Product{id='efg', name='Bag', price=42}
We can use the following code to split up a complex DSL into named pieces:
// Splitting complex DSL
TermQuery termQuery = TermQuery.of(t -> t.field("name").value("bag"));

SearchResponse<Product> search2 = client.search(s -> s
                .index("products")
                .query(termQuery._toQuery()),
        Product.class);

for (Hit<Product> hit : search2.hits().hits()) {
    Product pd = hit.source();
    System.out.println(pd);
}
Again, the output is:
Product{id='abc', name='Bag', price=42}
Product{id='efg', name='Bag', price=42}
Finally, we run an aggregation:
// Creating aggregations
SearchResponse<Void> search3 = client.search(b -> b
                .index("products")
                .size(0)
                .aggregations("price-histo", a -> a
                        .histogram(h -> h
                                .field("price")
                                .interval(20.0)
                        )
                ),
        Void.class);

long firstBucketCount = search3.aggregations()
        .get("price-histo")
        .histogram()
        .buckets().array()
        .get(0)
        .docCount();

System.out.println("doc count: " + firstBucketCount);
The aggregation above is equivalent to the following request:
GET products/_search
{
  "size": 0,
  "aggs": {
    "price-histo": {
      "histogram": {
        "field": "price",
        "interval": 20
      }
    }
  }
}
Its response is:
{
  "took" : 0,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 2,
      "relation" : "eq"
    },
    "max_score" : null,
    "hits" : [ ]
  },
  "aggregations" : {
    "price-histo" : {
      "buckets" : [
        {
          "key" : 40.0,
          "doc_count" : 2
        }
      ]
    }
  }
}
Our Java code prints:
doc count: 2
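As a sanity check on that result: a histogram aggregation assigns each document to the bucket whose key is floor(value / interval) * interval, so with a price of 42 and an interval of 20 both documents fall into the single bucket with key 40.0, giving a doc_count of 2. A quick stdlib computation of the bucket key:

```java
public class BucketKey {
    public static void main(String[] args) {
        double price = 42.0;
        double interval = 20.0;
        // Elasticsearch histogram bucket key: floor(value / interval) * interval
        double key = Math.floor(price / interval) * interval;
        System.out.println(key);
    }
}
```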