ElasticSearch

Elasticsearch (ES) stores data as JSON documents, for example:

{
    "age": 21,
    "weight": 52.1,
    "isMarried": false,
    "info": "黑马程序员Java讲师",
    "email": "zy@itcast.cn",
    "score": [99.1, 99.5, 98.9],
    "name": {
        "firstName": "云",
        "lastName": "赵"
    }
}

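1. `mapping`

type: the field's data type. Common simple types:
- String: text (full text that is analyzed into terms), keyword (exact value, e.g. brand, country, IP address)
- Numeric: long, integer, short, byte, double, float
- Boolean: boolean
- Date: date
- Object: object
index: whether the field is indexed (searchable); defaults to true
analyzer: which analyzer is used to tokenize the field
properties: the sub-fields of the field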

2. Index operations (`CRUD`)

2.1 Creating an index

Basic syntax:

Request method: PUT
Request path: /index_name (you choose the name)
Request body: the mapping

PUT /index_name
{
    "mappings": {
        "properties": {
            "field1": {
                "type": "type",
                ...
            },
            "field2": {
                "type": "type",
                ...
            },
            "field3": {
                "type": "object",
                "properties": {
                    "subfield1": {
                        "type": "type",
                        ...
                    },
                    "subfield2": {
                        "type": "type",
                        ...
                    },
                    ...
                }
            }
        }
    }
}

Example:

PUT /db_test
{
  "mappings": {
    "properties": {
      "topic": {
        "type": "text",
        "analyzer": "ik_smart"
      },
      "email": {
        "type": "keyword",
        "index": false
      },
      "owner": {
        "type": "object",
        "properties": {
          "name": {
            "type": "keyword"
          },
          "age": {
            "type": "integer"
          }
        }
      }
    }
  }
}

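This PUT request creates an index named db_test, and its mappings section defines the fields of that index. When a field has its own properties (owner here), it is an object that can hold several sub-fields.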

2.2 Getting an index

Basic syntax
- Request method: GET
- Request path: /index_name
Example:
```
GET /db_test
```

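2.3 Modifying an index

**Note:** once an index has been created, the mapping of an existing field cannot be changed; you can only add new fields.

Basic syntax
- Request method: PUT
- Request path: /index_name/_mapping
- Request body: properties

Example: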

PUT /db_test/_mapping
{
  "properties": {
    "startTime": {
      "type": "date",
      "index": false
    }
  }
}

2.4 Deleting an index

Basic syntax
- Request method: DELETE
- Request path: /index_name
Example:
```
DELETE /db_test
```

2.5 Summary

Create an index: PUT /index_name
Get an index: GET /index_name
Delete an index: DELETE /index_name
Add a field: PUT /index_name/_mapping

# Create the index
PUT /db_test
{
  "mappings": {
    "properties": {
      "topic": {
        "type": "text",
        "analyzer": "ik_smart"
      },
      "email": {
        "type": "keyword",
        "index": false
      },
      "owner": {
        "type": "object",
        "properties": {
          "name": {
            "type": "keyword"
          },
          "age": {
            "type": "integer"
          }
        }
      }
    }
  }
}

# Get the index
GET /db_test

# Add a field to the index
PUT /db_test/_mapping
{
  "properties": {
    "startTime": {
      "type": "date",
      "index": false
    }
  }
}

# Delete the index
DELETE /db_test
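
The four console commands above are ordinary HTTP requests, so they can be issued from any HTTP client. The following is a minimal sketch using the JDK's built-in `java.net.http` classes; it only builds the requests and does not send them. The class name `IndexRequests` and the address `http://localhost:9200` are assumptions made for illustration, not something taken from this article's setup.

```java
import java.net.URI;
import java.net.http.HttpRequest;
import java.net.http.HttpRequest.BodyPublishers;

// Builds (but does not send) the four index-level requests from the summary.
class IndexRequests {
    // Assumed address of an Elasticsearch node; replace it with your own.
    static final String BASE = "http://localhost:9200";

    // PUT /index_name with the mapping as the JSON body
    static HttpRequest create(String index, String mappingJson) {
        return HttpRequest.newBuilder(URI.create(BASE + "/" + index))
                .header("Content-Type", "application/json")
                .PUT(BodyPublishers.ofString(mappingJson))
                .build();
    }

    // GET /index_name
    static HttpRequest get(String index) {
        return HttpRequest.newBuilder(URI.create(BASE + "/" + index)).GET().build();
    }

    // DELETE /index_name
    static HttpRequest delete(String index) {
        return HttpRequest.newBuilder(URI.create(BASE + "/" + index)).DELETE().build();
    }

    // PUT /index_name/_mapping with the new properties as the JSON body
    static HttpRequest addField(String index, String propertiesJson) {
        return HttpRequest.newBuilder(URI.create(BASE + "/" + index + "/_mapping"))
                .header("Content-Type", "application/json")
                .PUT(BodyPublishers.ofString(propertiesJson))
                .build();
    }

    public static void main(String[] args) {
        HttpRequest request = addField("db_test",
                "{\"properties\":{\"startTime\":{\"type\":\"date\"}}}");
        // Print the method and target URI of the request that would be sent.
        System.out.println(request.method() + " " + request.uri());
    }
}
```

To actually send one of these requests you would pass it to `HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString())`, which needs a reachable node.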

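3. Document operations

3.1 Creating a document

Basic syntax:

Request method: POST
Request path: /index_name/_doc/document_id (you may choose the document id)
Request body: the document as JSON

POST /index_name/_doc/document_id
{
    "field1":"value",
    "field2":"value",
    "field3":{
        "subfield1":"value",
        "subfield2":"value"
    }
}

Example:

# Create a document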
POST /db_test/_doc/1
{
  "email":"111111@qq.com",
  "owner":{
    "age":21,
    "name":"何鸭子"
  },
  "topic":"希望何鸭子能快速拿到实习，奥里给！"
}

3.2 Getting a document

Basic syntax
- Request method: GET
- Request path: /index_name/_doc/document_id
Example:
```
# Get a document
GET /db_test/_doc/1
```

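3.3 Updating a document

Method 1: full update

A full update is an overwrite: the document with the given id is replaced entirely. If no document with that id exists, a new document is created.

Basic syntax:

Request method: PUT
Request path: /index_name/_doc/document_id

# Full update
PUT /index_name/_doc/document_id
{
    "field1":"value1",
    "field2":"value2",
    ...
}

Example:

# Full update
PUT /db_test/_doc/2
{
  "email":"111111@qq.com",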
  "owner":{
    "age":21,
    "name":"何鸭子"
  },
  "topic":"希望何鸭子能快速拿到实习，奥里给！"
}

**Note:** if the document exists it is replaced; if it does not exist it is created.

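Method 2: partial update

Updates only the specified fields of the specified document. If the document does not have one of those fields, the field is added, which also changes the index's mapping.

Basic syntax:

Request method: POST
Request path: /index_name/_update/document_id

# Partial update
POST /index_name/_update/document_id
{
  "doc": {
    "field1":"value1",
    "field2":"value2",
    ...
  }
}

Example:

# Partial update
POST /db_test/_update/1
{
  "doc": {
    "owner": {
      "name":"局部修改"
    },
    "topic":"希望局部修改能快速拿到实习，奥里给！"
  }
}

**Note:** check that the fields you update already exist, so that you do not change the mapping by accident.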
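
The difference between the two update styles is easiest to see side by side. The sketch below models a stored document as a plain `Map` and is purely an in-memory illustration: it does not talk to Elasticsearch, and the class and method names (`UpdateSemantics`, `fullUpdate`, `partialUpdate`) are made up for this example. It assumes the documented behaviour that a partial `doc` is merged recursively into the existing document.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// In-memory illustration of the two update styles; not an Elasticsearch client.
class UpdateSemantics {
    // Full update: the stored document is discarded and replaced by the new body.
    static Map<String, Object> fullUpdate(Map<String, Object> stored, Map<String, Object> body) {
        return new LinkedHashMap<>(body);
    }

    // Partial update: fields from "doc" are merged into the stored document.
    // Nested objects are merged recursively; any other value is overwritten.
    @SuppressWarnings("unchecked")
    static Map<String, Object> partialUpdate(Map<String, Object> stored, Map<String, Object> doc) {
        Map<String, Object> merged = new LinkedHashMap<>(stored);
        for (Map.Entry<String, Object> e : doc.entrySet()) {
            Object current = merged.get(e.getKey());
            if (current instanceof Map && e.getValue() instanceof Map) {
                merged.put(e.getKey(), partialUpdate(
                        (Map<String, Object>) current, (Map<String, Object>) e.getValue()));
            } else {
                merged.put(e.getKey(), e.getValue());
            }
        }
        return merged;
    }

    public static void main(String[] args) {
        Map<String, Object> owner = new LinkedHashMap<>();
        owner.put("name", "old name");
        owner.put("age", 21);
        Map<String, Object> stored = new LinkedHashMap<>();
        stored.put("email", "111111@qq.com");
        stored.put("owner", owner);

        Map<String, Object> change = Map.of("owner", Map.of("name", "new name"));

        // prints {owner={name=new name}}
        System.out.println(fullUpdate(stored, change));
        // prints {email=111111@qq.com, owner={name=new name, age=21}}
        System.out.println(partialUpdate(stored, change));
    }
}
```

With the same request body, the full update loses `email` and `owner.age`, while the partial update keeps them and changes only `owner.name`.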

3.4 Deleting a document

Basic syntax
- Request method: DELETE
- Request path: /index_name/_doc/document_id
Example:
```
# Delete a document
DELETE /db_test/_doc/1
```

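Index operations with RestClient

1. Analyzing the database table

Import
- We first follow 【www.bilibili.com/video/BV1LQ…
Based on the business requirements, we analyze the table structure and create the corresponding ES mapping.
- Column . type: analysis
- id . bigint: this corresponds to a Java long. Because it serves as the document id in ES, its ES type is keyword and it is indexed.
- name . varchar: a string. A hotel name has to be analyzed and searchable, so its ES type is text with an analyzer, and it is indexed.
- address . varchar: a string. My assumption is that nobody looks for a hotel by its address, so its ES type is keyword and it is not indexed.
- price . int: an int holding the price. People sort and choose hotels by price, so its ES type is integer and it is indexed.
- score . int: an int holding the rating. People sort and choose hotels by rating, so its ES type is integer and it is indexed.
- brand . varchar: a string holding the hotel brand. People search for hotels by brand, so its ES type is keyword and it is indexed.
- city . varchar: a string holding the city the hotel is in. People search for hotels by city, so its ES type is keyword and it is indexed.
- star_name . varchar: a string holding the star rating. People search and sort by star rating, so its ES type is keyword and it is indexed.
- business . varchar: a string holding the business district the hotel is in. People may search by business district, so its ES type is keyword and it is indexed. It is not analyzed because a district name is a single unit, and tokenizing it could produce confusing matches.
- latitude, longitude . varchar: two strings holding the hotel's coordinates. Usually nobody searches for a hotel by its coordinates, so in ES they become a single geo_point field (location in the mapping below) that is not indexed.
- pic . varchar: a string holding the hotel's picture. Nobody searches for a hotel by its picture, so its ES type is keyword and it is not indexed.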
```
# Mapping for the hotel index
PUT /hotel
{
  "mappings": {
    "properties": {
      "id":{
        "type": "keyword"
      },
      "name":{
        "type": "text",
        "analyzer": "ik_max_word"
      },
      "address":{
        "type": "keyword",
        "index": false
      },
      "price":{
        "type": "integer"
      },
      "score":{
        "type": "integer"
      },
      "brand":{
        "type": "keyword"
      },
      "city":{
        "type": "keyword"
      },
      "starName":{
        "type": "keyword"
      },
      "business":{
        "type": "keyword"
      },
      "location":{
        "type": "geo_point",
        "index": false
      },
      "pic":{
        "type": "keyword",
        "index": false
      }
    }
  }
}
```
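That completes the initial mapping. A note on the location field: ES supports two geographic data types:
- geo_point: a single point defined by a latitude and a longitude, e.g. "32.8752345, 120.2981576"
- geo_shape: a complex shape made up of multiple geo_point values, e.g. a line, "LINESTRING (-77.03653 38.897676, -77.009051 38.889939)"

Several of these fields hold the same kind of content, and a search often has to cover more than one of them. For example, the hotel name, brand, city, and star rating are all strings a user might type. When a user enters a string, ES would have to search all four fields. If we instead create one new field and copy the content of the four fields into it, the query only has to search that single field.

To search the content of several fields through one field, use copy_to.

Let's improve the mapping above.

We add an all field at the bottom of the mapping and point the name, brand, city, and star rating fields at it with copy_to.

# Mapping for the hotel index, now with copy_to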
PUT /hotel
{
  "mappings": {
    "properties": {
      "id":{
        "type": "keyword"
      },
      "name":{
        "type": "text",
        "analyzer": "ik_max_word",
        "copy_to": "all"
      },
      "address":{
        "type": "keyword",
        "index": false
      },
      "price":{
        "type": "integer"
      },
      "score":{
        "type": "integer"
      },
      "brand":{
        "type": "keyword",
        "copy_to": "all"
      },
      "city":{
        "type": "keyword",
        "copy_to": "all"
      },
      "starName":{
        "type": "keyword",
        "copy_to": "all"
      },
      "business":{
        "type": "keyword"
      },
      "location":{
        "type": "geo_point",
        "index": false
      },
      "pic":{
        "type": "keyword",
        "index": false
      },
      "all":{
        "type": "text",
        "analyzer": "ik_max_word"
      }
    }
  }
}
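
As a rough mental model of what copy_to achieves, the sketch below gathers the values of the four copied fields into one searchable list. It is only a conceptual illustration: Elasticsearch does this inside its index at indexing time, the copied values are not added to the stored `_source`, and real matching goes through the analyzer rather than the plain substring check used here. The class name `CopyToSketch` and the sample hotel data are invented for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Conceptual model of copy_to; not how Elasticsearch is implemented.
class CopyToSketch {
    // Fields whose mapping declares "copy_to": "all" in the hotel mapping.
    static final List<String> COPIED_FIELDS = List.of("name", "brand", "city", "starName");

    // Collects the values that would be searchable through the "all" field.
    static List<String> valuesOfAll(Map<String, String> hotel) {
        List<String> all = new ArrayList<>();
        for (String field : COPIED_FIELDS) {
            String value = hotel.get(field);
            if (value != null) {
                all.add(value);
            }
        }
        return all;
    }

    // One lookup against "all" stands in for four separate per-field lookups.
    // Substring matching is a simplification of real text analysis.
    static boolean matchesAll(Map<String, String> hotel, String keyword) {
        return valuesOfAll(hotel).stream().anyMatch(v -> v.contains(keyword));
    }

    public static void main(String[] args) {
        // Hypothetical sample document.
        Map<String, String> hotel = Map.of(
                "name", "Hilton Pudong",
                "brand", "Hilton",
                "city", "Shanghai",
                "starName", "five star",
                "business", "Lujiazui");

        System.out.println(matchesAll(hotel, "Shanghai")); // true: city is copied to all
        System.out.println(matchesAll(hotel, "Lujiazui")); // false: business is not copied
    }
}
```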

2. Initializing `JavaRestClient`

Steps:

Add the dependency

<!--elasticSearch-->
<dependency>
    <groupId>org.elasticsearch.client</groupId>
    <artifactId>elasticsearch-rest-high-level-client</artifactId>
</dependency>

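Pin the version

Add the following to the same XML file that declares the dependency:

<properties>
    <java.version>1.8</java.version>
    <!-- pin the Elasticsearch client version -->
    <elasticsearch.version>7.12.1</elasticsearch.version>
</properties>

Initialize RestHighLevelClient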

package cn.itcast.hotel;

import org.apache.http.HttpHost;
import org.elasticsearch.client.RestClient;
import org.elasticsearch.client.RestHighLevelClient;
import org.junit.jupiter.api.*;

import java.io.IOException;

/**
 * @author HGD
 * @date 2022/6/25 14:39
 */
public class HotelIndexTest {
    private RestHighLevelClient client;

    // 3. Smoke test: print the client to confirm it was created
    @Test
    void name() {
        System.out.println(client);
    }

    // 1. Connect before each test
    @BeforeEach
    void setUp() {
        this.client = new RestHighLevelClient(RestClient.builder(
                HttpHost.create("http://128.0.0.128:9200")
        ));
    }

    // 2. Close the client after each test
    @AfterEach
    void tearDown() throws IOException {
        this.client.close();
    }
}

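3. Creating an index

Steps

Create the request with the index name, using the client initialized above
Provide the DSL body and its content type
Call client.indices().create(request, RequestOptions.DEFAULT) to create the index

All of these steps fit in one method, so a single example covers them:

@Test
void createIndexHotelTest() throws IOException {
    // 1. Create the Request object
    CreateIndexRequest request = new CreateIndexRequest("hotel");
    // 2. Prepare the request body: the DSL statement
    request.source(HotelConstants.hotelMapping, XContentType.JSON);
    // 3. Send the request
    client.indices().create(request, RequestOptions.DEFAULT);
}

The DSL statement is too long to inline, so it lives in a separate class: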

package cn.itcast.hotel.constants;

/**
 * @author HGD
 * @date 2022/6/25 15:00
 */
public class HotelConstants {
    public static final String hotelMapping = "{\n" +
            "  \"mappings\": {\n" +
            "    \"properties\": {\n" +
            "      \"id\":{\n" +
            "        \"type\": \"keyword\"\n" +
            "      },\n" +
            "      \"name\":{\n" +
            "        \"type\": \"text\",\n" +
            "        \"analyzer\": \"ik_max_word\",\n" +
            "        \"copy_to\": \"all\"\n" +
            "      },\n" +
            "      \"address\":{\n" +
            "        \"type\": \"keyword\",\n" +
            "      \"index\": false\n" +
            "      },\n" +
            "      \"price\":{\n" +
            "        \"type\": \"integer\"\n" +
            "      },\n" +
            "      \"score\":{\n" +
            "        \"type\": \"integer\"\n" +
            "      },\n" +
            "      \"brand\":{\n" +
            "        \"type\": \"keyword\",\n" +
            "        \"copy_to\": \"all\"\n" +
            "      },\n" +
            "      \"city\":{\n" +
            "        \"type\": \"keyword\",\n" +
            "        \"copy_to\": \"all\"\n" +
            "      },\n" +
            "      \"starName\":{\n" +
            "        \"type\": \"keyword\",\n" +
            "        \"copy_to\": \"all\"\n" +
            "      },\n" +
            "      \"business\":{\n" +
            "        \"type\": \"keyword\"\n" +
            "      },\n" +
            "      \"location\":{\n" +
            "        \"type\": \"geo_point\"\n" +
            "        , \"index\": false\n" +
            "      },\n" +
            "      \"pic\":{\n" +
            "        \"type\": \"keyword\",\n" +
            "      \"index\": false\n" +
            "      },\n" +
            "      \"all\":{\n" +
            "       \"type\": \"text\",\n" +
            "        \"analyzer\": \"ik_max_word\"\n" +
            "      }\n" +
            "    }\n" +
            "  }\n" +
            "}";
}

**Note:** the object returned by client.indices() exposes all of the index-level operations.

4. Deleting an index and checking whether it exists

Steps

Delete the index

/**
 * Deletes the index.
 *
 * @throws IOException if the request fails
 */
@Test
void deleteHotelIndexTest() throws IOException {
    // 1. Create the Request object
    DeleteIndexRequest request = new DeleteIndexRequest("hotel");
    // 2. Send the request
    client.indices().delete(request, RequestOptions.DEFAULT);
}

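Check whether the index exists

/**
 * Checks whether the index exists.
 *
 * @throws IOException if the request fails
 */
@Test
void existHotelIndexTest() throws IOException {
    // 1. Create the Request object
    GetIndexRequest request = new GetIndexRequest("hotel");
    // 2. Send the request
    boolean exists = client.indices().exists(request, RequestOptions.DEFAULT);
    // 3. Print the result
    System.out.println(exists);
}

**Note:** creating an index, deleting an index, and checking whether one exists all follow much the same steps, so index operations share a common pattern:

Create the Request object for the operation

Prepare the request parameters for the operation

Call indices().xxx to run the operation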

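Document operations with RestClient

1. Adding a document

Steps

Query the record from the database
Convert the database entity into the document type
Create the Request object
Provide the JSON document and its content type
Send the request with index()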

package cn.itcast.hotel;

import cn.itcast.hotel.pojo.Hotel;
import cn.itcast.hotel.pojo.HotelDoc;
import cn.itcast.hotel.service.IHotelService;
import com.alibaba.fastjson.JSON;
import org.apache.http.HttpHost;
import org.elasticsearch.action.index.IndexRequest;
import org.elasticsearch.client.RequestOptions;
import org.elasticsearch.client.RestClient;
import org.elasticsearch.client.RestHighLevelClient;
import org.elasticsearch.common.xcontent.XContentType;
import org.junit.jupiter.api.AfterEach;
import org.junit.jupiter.api.BeforeEach;
import org.junit.jupiter.api.Test;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.boot.test.context.SpringBootTest;

import java.io.IOException;

/**
 * @author HGD
 * @date 2022/6/25 14:39
 */
@SpringBootTest
public class HotelDocumentTest {
    private RestHighLevelClient client;

    @Autowired
    private IHotelService service;

    /**
     * Adds a document.
     *
     * @throws IOException if the request fails
     */
    @Test
    void addDocumentTest() throws IOException {
        // Load the hotel from the database by id
        Hotel hotel = service.getById(36934L);
        // Convert it to the document type
        HotelDoc hotelDoc = new HotelDoc(hotel);

        // 1. Create the Request object
        IndexRequest request = new IndexRequest("hotel").id(String.valueOf(hotelDoc.getId()));
        // 2. Prepare the JSON document
        request.source(JSON.toJSONString(hotelDoc), XContentType.JSON);
        // 3. Send the request
        client.index(request, RequestOptions.DEFAULT);
    }
    }

    @BeforeEach
    void setUp() {
        this.client = new RestHighLevelClient(RestClient.builder(
                HttpHost.create("http://128.0.0.128:9200")
        ));
    }

    @AfterEach
    void tearDown() throws IOException {
        this.client.close();
    }
}

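Notes:

A document is created with index().

Check whether the database entity and the document have the same structure. If they differ, create a dedicated document class and do the conversion in its constructor.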
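
One place where the two structures differ in this project is the coordinates: the table has separate latitude and longitude columns, while the mapping has a single geo_point field named location. The sketch below shows one way the conversion constructor might look. The `Hotel` and `HotelDoc` classes here are trimmed-down, hypothetical stand-ins with only the fields needed for the illustration, wrapped in an outer class so the example is self-contained; the real classes in the project may look different.

```java
// Hypothetical, trimmed-down versions of the entity and the document class.
class HotelDocSketch {
    // Mirrors the database row: coordinates are two separate string columns.
    static class Hotel {
        final Long id;
        final String name;
        final String latitude;
        final String longitude;

        Hotel(Long id, String name, String latitude, String longitude) {
            this.id = id;
            this.name = name;
            this.latitude = latitude;
            this.longitude = longitude;
        }
    }

    // Mirrors the ES mapping: one location field in "latitude, longitude" form.
    static class HotelDoc {
        final Long id;
        final String name;
        final String location;

        // The conversion happens in the constructor.
        HotelDoc(Hotel hotel) {
            this.id = hotel.id;
            this.name = hotel.name;
            this.location = hotel.latitude + ", " + hotel.longitude;
        }
    }

    public static void main(String[] args) {
        // Sample values are made up for the demonstration.
        Hotel hotel = new Hotel(36934L, "sample hotel", "31.2497", "120.3925");
        HotelDoc doc = new HotelDoc(hotel);
        System.out.println(doc.location); // prints 31.2497, 120.3925
    }
}
```

The string form with latitude first matches the geo_point example given earlier ("32.8752345, 120.2981576").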

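2. Getting a document

Steps

Create the Request object
Send the request and keep the response
Read the response source as a JSON string
Use fastJson to deserialize the JSON string into the document class
Print the object

/**
 * Looks up a document by id.
 * The GetRequest takes the index name and the document id.
 *
 * @throws IOException if the request fails
 */
@Test
void getDocumentByIdTest() throws IOException {
    // 1. Create the Request object
    GetRequest request = new GetRequest("hotel", "36934");
    // 2. Send the request and keep the response
    GetResponse response = client.get(request, RequestOptions.DEFAULT);
    // 3. Read the response source as a JSON string
    String json = response.getSourceAsString();
    // 4. Use fastJson to deserialize the JSON string into the document class
    HotelDoc hotelDoc = JSON.parseObject(json, HotelDoc.class);
    // 5. Print the object
    System.out.println(hotelDoc);
}

**Note:** a document is fetched with get().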

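3. Updating a document

Full update

A full update is written exactly like adding a document. Use the same code and set the document id to the id of the document you want to replace.

Partial update

Steps

Create the Request object
Prepare the parameters; every two arguments form one key/value pair
Update the document

/**
 * Updates a document.
 *
 * @throws IOException if the request fails
 */
@Test
void updateDocumentTest() throws IOException {
    // 1. Create the Request object
    UpdateRequest request = new UpdateRequest("hotel", "36934");
    // 2. Prepare the parameters; every two arguments form one key/value pair
    request.doc(
            "price", 1234,
            "score", 50
    );
    // 3. Update the document
    client.update(request, RequestOptions.DEFAULT);
}

Notes:

A document is updated with update().

The arguments to doc() come in pairs (key, then value), separated by commas.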
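
To make the pairing convention concrete, here is a small sketch of how alternating key and value arguments can be turned into a map of fields. It illustrates the calling convention only and is not the client's actual implementation; `KeyValuePairs` and `toMap` are names invented for this example, and rejecting an odd number of arguments is a choice made in this sketch.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Illustrates the "key, value, key, value, ..." calling convention.
class KeyValuePairs {
    // Turns alternating key and value arguments into an ordered map.
    static Map<String, Object> toMap(Object... pairs) {
        if (pairs.length % 2 != 0) {
            throw new IllegalArgumentException(
                    "expected an even number of arguments, got " + pairs.length);
        }
        Map<String, Object> fields = new LinkedHashMap<>();
        for (int i = 0; i < pairs.length; i += 2) {
            // Even positions are field names, odd positions are their values.
            fields.put(String.valueOf(pairs[i]), pairs[i + 1]);
        }
        return fields;
    }

    public static void main(String[] args) {
        // prints {price=1234, score=50}
        System.out.println(toMap("price", 1234, "score", 50));
    }
}
```

A forgotten value shifts every later argument by one position, so it is worth keeping each pair on its own line as the test method does.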

4. Deleting a document

Steps

Create the Request object
Send the request

/**
 * Deletes a document.
 *
 * @throws IOException if the request fails
 */
@Test
void deleteDocumentTest() throws IOException {
    // 1. Create the Request object
    DeleteRequest request = new DeleteRequest("hotel", "36934");
    // 2. Send the request
    client.delete(request, RequestOptions.DEFAULT);
}

**Note:** a document is deleted with delete().

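5. Bulk operations

Steps

Create the bulk request
Add the requests that should be submitted together
Send the bulk request

/**
 * Bulk import.
 *
 * @throws IOException if the request fails
 */
@Test
void bulkRequestTest() throws IOException {
    // Load all hotels from the database
    List<Hotel> list = service.list();

    // 1. Create the Request object
    BulkRequest request = new BulkRequest();

    // Convert each hotel and queue it
    for (Hotel hotel : list) {
        HotelDoc hotelDoc = new HotelDoc(hotel);
        // 2. Prepare the parameters: add one IndexRequest per document
        request.add(new IndexRequest("hotel")
                .id(hotel.getId().toString())
                .source(JSON.toJSONString(hotelDoc), XContentType.JSON));
    }

    // 3. Send the request
    client.bulk(request, RequestOptions.DEFAULT);
}

**Note:** the kind of request passed to request.add() in step 2 determines the bulk operation; add IndexRequest, UpdateRequest, or DeleteRequest objects to batch creates, updates, or deletes.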
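
For reference, the REST `_bulk` endpoint takes a newline-delimited JSON body: one action line followed by one source line per document, each ending with a newline. The sketch below builds such a body for index actions with plain string handling. It is an illustration of the wire format rather than a replacement for BulkRequest; `BulkBodySketch` and `indexActions` are made-up names, and the code assumes the index name and ids contain no characters that need JSON escaping.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Builds the newline-delimited body that POST /_bulk expects for index actions.
class BulkBodySketch {
    // docsById maps a document id to its JSON source (already on a single line).
    static String indexActions(String index, Map<String, String> docsById) {
        StringBuilder body = new StringBuilder();
        for (Map.Entry<String, String> e : docsById.entrySet()) {
            // Action line: the operation, the target index, and the document id.
            body.append("{\"index\":{\"_index\":\"").append(index)
                .append("\",\"_id\":\"").append(e.getKey()).append("\"}}\n");
            // Source line: the document itself, terminated by a newline.
            body.append(e.getValue()).append('\n');
        }
        return body.toString();
    }

    public static void main(String[] args) {
        // Sample documents are made up for the demonstration.
        Map<String, String> docs = new LinkedHashMap<>();
        docs.put("1", "{\"name\":\"hotel one\",\"price\":300}");
        docs.put("2", "{\"name\":\"hotel two\",\"price\":450}");
        System.out.print(indexActions("hotel", docs));
    }
}
```

The final newline after the last source line is part of the format, which is why every line is appended with its own terminator.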
