How to Integrate Elasticsearch with Spring Boot to Implement Search Keyword Autocompletion


Implementing Search Keyword Autocompletion with Spring Boot and Elasticsearch (Detailed Tutorial)

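Preface: Everyday search usually works by tokenizing the Chinese characters the user types (for example with the IK analyzer). To make search more convenient and improve the user experience, we also want users to be able to type pinyin and still get suggestions for the content they are looking for.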

1. Add the required dependencies: spring-data-elasticsearch, plus the pinyin conversion libraries
<!-- elasticsearch-->
<dependency>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-data-elasticsearch</artifactId>
</dependency>
<!-- Pinyin jars used through Hutool's PinyinUtil (Hutool itself must also be on the classpath) -->
<dependency>
    <groupId>com.belerweb</groupId>
    <artifactId>pinyin4j</artifactId>
    <version>2.5.0</version>
</dependency>
<dependency>
    <groupId>io.github.biezhi</groupId>
    <artifactId>TinyPinyin</artifactId>
    <version>2.0.3.RELEASE</version>
</dependency>
<dependency>
    <groupId>com.github.stuxuhai</groupId>
    <artifactId>jpinyin</artifactId>
    <version>1.1.8</version>
</dependency>

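2. Update the mapping JSON for the entity uploaded to ES (add the pinyin analyzer and a new search-suggestion field)
2.1 First download the pinyin analysis plugin from: Releases · medcl/elasticsearch-analysis-pinyin · GitHub
Make sure the plugin version matches your ES version. If there is no release for your exact ES version, download the closest one and edit the plugin-descriptor.properties file inside the plugin so that the Elasticsearch version it declares matches yours.
2.2 Unzip the pinyin plugin into the plugins directory of Elasticsearch, then restart ES so the plugin is loaded.
2.3 In the mapping file, add the pinyin analyzer and the search-suggestion field titleSuggestion (of type completion)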

  "settings": {
  "analysis": {
    "analyzer": {
      "text_analyzer": {
        "tokenizer": "ik_max_word",
        "filter": "py"
      },
      "completion_analyzer": {
        "tokenizer": "ik_max_word",
        "filter": "py"
      }
    },
    "filter": {
      "py": {
        "type": "pinyin",
        "keep_full_pinyin": false,
        "keep_joined_full_pinyin": true,
        "keep_original": true,
        "limit_first_letter_length": 16,
        "remove_duplicated_term": true,
        "none_chinese_pinyin_tokenize": false
      }
    }
  }
},
"mappings": {
  "properties": {
    "title": {
      "type": "text",
      "analyzer": "ik_max_word",
      "search_analyzer": "ik_smart",
      "fields": {
        "keyword": {
          "type": "keyword",
          "ignore_above": 256
        }
      }
    },
    "content": {
      "type": "text",
      "analyzer": "ik_max_word",
      "search_analyzer": "ik_smart",
      "fields": {
        "keyword": {
          "type": "keyword",
          "ignore_above": 256
        }
      }
    },
    "titleSuggestion": {   // search suggestion field
      "type": "completion",
      "analyzer": "completion_analyzer"
    },
    "tags": {
      "type": "keyword"
    },
    "userId": {
      "type": "keyword"
    },
    "createTime": {
      "type": "date"
    },
    "updateTime": {
      "type": "date"
    },
    "isDelete": {
      "type": "keyword"
    }
  }
}

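2.4 Use Kibana Dev Tools to create the index, sending the JSON above as the request body wrapped in an outer pair of braces (for example PUT /post { ... }, where post matches the indexName on the entity class below). Strict JSON does not allow the // annotation on titleSuggestion, so remove it if the request is rejected.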

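3. Create the ES entity class. It is a further wrapper around the Java entity that maps to MySQL: you choose which MySQL fields are uploaded to ES, and can leave out private fields and fields that are not used for search. (In this article, pay particular attention to the search-suggestion field titleSuggestion and to how it is populated in the objToDto method.)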
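import cn.hutool.extra.pinyin.PinyinUtil; // assumed: Hutool's PinyinUtil, per the dependency comment above
import com.google.gson.Gson;
import com.google.gson.reflect.TypeToken;
import com.yupi.springbootinit.model.entity.Post;
import java.io.Serializable;
import java.util.ArrayList;
import java.util.Date;
import java.util.List;
import lombok.Data;
import org.apache.commons.collections4.CollectionUtils;
import org.apache.commons.lang3.StringUtils;
import org.springframework.beans.BeanUtils;
import org.springframework.data.annotation.Id;
import org.springframework.data.elasticsearch.annotations.Document;
import org.springframework.data.elasticsearch.annotations.Field;
import org.springframework.data.elasticsearch.annotations.FieldType;

/**
 * Post wrapper class for ES
 **/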
@Document(indexName = "post")
@Data
public class PostEsDTO implements Serializable {

    private static final String DATE_TIME_PATTERN = "yyyy-MM-dd'T'HH:mm:ss.SSS'Z'";

    /**
     * id
     */
    @Id   // document id
    private Long id;

    /**
     * Title
     */
    private String title;

    /**
     * Content
     */
    private String content;

    /**
     * Tag list
     */
    private List<String> tags;

    /**
     * Search suggestions
     */
    private List<String> titleSuggestion;

    /**
     * Number of likes
     */
//    private Integer thumbNum;

    /**
     * Number of favourites
     */
//    private Integer favourNum;

    /**
     * Id of the user who created the post
     */
    private Long userId;

    /**
     * Creation time
     */
    @Field(index = false, store = true, type = FieldType.Date, format = {}, pattern = DATE_TIME_PATTERN)
    private Date createTime;

    /**
     * Update time
     */
    @Field(index = false, store = true, type = FieldType.Date, format = {}, pattern = DATE_TIME_PATTERN)
    private Date updateTime;

    /**
     * Whether the post is deleted
     */
    private Integer isDelete;

    private static final long serialVersionUID = 1L;

    private static final Gson GSON = new Gson();

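    /**
     * Converts an entity to its ES wrapper:
     * wraps the Java entity that maps to MySQL into the entity stored in ES.
     *
     * @param post the database entity
     * @return the ES wrapper, or null if post is null
     */
    public static PostEsDTO objToDto(Post post) {
        if (post == null) {
            return null;
        }
        PostEsDTO postEsDTO = new PostEsDTO();
        BeanUtils.copyProperties(post, postEsDTO);
        // Suggestion inputs: the original title plus its pinyin with the spaces removed
        String title = postEsDTO.getTitle();
        if (StringUtils.isNotBlank(title)) {
            List<String> titleSuggestion = new ArrayList<>();
            titleSuggestion.add(title);
            titleSuggestion.add(PinyinUtil.getPinyin(title).replace(" ", ""));
            postEsDTO.setTitleSuggestion(titleSuggestion);
        }
        // Tags are stored as a JSON string in MySQL and as a list in ES
        String tagsStr = post.getTags();
        if (StringUtils.isNotBlank(tagsStr)) {
            postEsDTO.setTags(GSON.fromJson(tagsStr, new TypeToken<List<String>>() {
            }.getType()));
        }
        return postEsDTO;
    }

    /**
     * Converts an ES wrapper back to the entity.
     *
     * @param postEsDTO the ES wrapper
     * @return the database entity, or null if postEsDTO is null
     */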
    public static Post dtoToObj(PostEsDTO postEsDTO) {
        if (postEsDTO == null) {
            return null;
        }
        Post post = new Post();
        BeanUtils.copyProperties(postEsDTO, post);
        List<String> tagList = postEsDTO.getTags();
        if (CollectionUtils.isNotEmpty(tagList)) {
            post.setTags(GSON.toJson(tagList));
        }
        return post;
    }
}
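To see why titleSuggestion stores both the title and its pinyin, here is a simplified model of prefix completion in plain Java. It is only a sketch under stated assumptions: the real completion suggester runs its inputs through the configured analyzer, and the PINYIN table below is a hypothetical stand-in for a pinyin library that covers just the characters used in the demo.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class SuggestionInputSketch {

    // Hypothetical stand-in for a pinyin library; covers only the demo characters.
    private static final Map<Character, String> PINYIN = new HashMap<>();
    static {
        PINYIN.put('搜', "sou");
        PINYIN.put('索', "suo");
        PINYIN.put('引', "yin");
        PINYIN.put('擎', "qing");
    }

    // Converts each known character to pinyin and joins the result without spaces;
    // characters missing from the table are kept unchanged.
    static String toJoinedPinyin(String text) {
        StringBuilder sb = new StringBuilder();
        for (char c : text.toCharArray()) {
            String py = PINYIN.get(c);
            sb.append(py != null ? py : String.valueOf(c));
        }
        return sb.toString();
    }

    // Mirrors objToDto: the inputs are the original title plus its joined pinyin.
    static List<String> buildInputs(String title) {
        List<String> inputs = new ArrayList<>();
        inputs.add(title);
        inputs.add(toJoinedPinyin(title));
        return inputs;
    }

    // Simplified prefix completion: a title is suggested if any of its inputs starts with the keyword.
    static List<String> suggest(List<String> titles, String keyword) {
        List<String> result = new ArrayList<>();
        for (String title : titles) {
            for (String input : buildInputs(title)) {
                if (input.startsWith(keyword)) {
                    result.add(title);
                    break;
                }
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> titles = Arrays.asList("搜索引擎", "索引");
        System.out.println(buildInputs("搜索引擎"));
        System.out.println(suggest(titles, "sou"));
        System.out.println(suggest(titles, "索"));
    }
}
```

With both inputs stored, a user who types either the Chinese characters or the beginning of the pinyin reaches the same title.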
4. Create the ES data-access interface (similar to a Mapper, i.e. a database access class, in Spring Boot)
import com.yupi.springbootinit.model.dto.post.PostEsDTO;
import java.util.List;
import org.springframework.data.elasticsearch.repository.ElasticsearchRepository;

/**
 * ES operations for posts
 *
 */
public interface PostEsDao extends ElasticsearchRepository<PostEsDTO, Long> {

    List<PostEsDTO> findByUserId(Long userId);
}
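5. Fully sync the data from MySQL to ES (the newly created index only has a mapping and no data yet). The author does this with a startup task that runs once when the application starts; you could instead write a test method that uploads the data to ES (not shown here).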
import com.yupi.springbootinit.esdao.PostEsDao;
import com.yupi.springbootinit.model.dto.post.PostEsDTO;
import com.yupi.springbootinit.model.entity.Post;
import com.yupi.springbootinit.service.PostService;
import java.util.List;
import java.util.stream.Collectors;
import javax.annotation.Resource;
import lombok.extern.slf4j.Slf4j;
import org.apache.commons.collections4.CollectionUtils;
import org.springframework.boot.CommandLineRunner;
import org.springframework.stereotype.Component;

/**
* Full sync of posts to ES.
* This is a startup task: Spring Boot calls it once each time the application starts.
* To disable it, simply comment out @Component.
*/
@Component
@Slf4j
public class FullSyncPostToEs implements CommandLineRunner {

   @Resource
   private PostService postService;

   @Resource
   private PostEsDao postEsDao;

   @Override
   public void run(String... args) {
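       // Fetch every row of the post table from the database
       List<Post> postList = postService.list();
       // Nothing to sync if the table is empty
       if (CollectionUtils.isEmpty(postList)) {
           return;
       }
       // Wrap every row queried from MySQL as a PostEsDTO (the shape the ES index expects)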
       List<PostEsDTO> postEsDTOList = postList.stream().map(PostEsDTO::objToDto).collect(Collectors.toList());
       final int pageSize = 500;
       int total = postEsDTOList.size();
       log.info("FullSyncPostToEs start, total {}", total);
       for (int i = 0; i < total; i += pageSize) {
           int end = Math.min(i + pageSize, total);
           log.info("sync from {} to {}", i, end);
           // Upload the current batch of wrapped PostEsDTOs to ES with the repository's saveAll() method
           postEsDao.saveAll(postEsDTOList.subList(i, end));
       }
       log.info("FullSyncPostToEs end, total {}", total);
   }
}
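The loop above uploads the list in pages rather than in one call, which keeps each bulk request to a bounded size. The sketch below isolates just that index arithmetic so it can be run without Spring or ES; the class and method names are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;

class BatchSplitSketch {

    // Splits [0, total) into consecutive [start, end) ranges of at most pageSize elements,
    // the same way the loop in FullSyncPostToEs walks the list with subList(i, end).
    static List<int[]> batchRanges(int total, int pageSize) {
        List<int[]> ranges = new ArrayList<>();
        for (int i = 0; i < total; i += pageSize) {
            int end = Math.min(i + pageSize, total);
            ranges.add(new int[] {i, end});
        }
        return ranges;
    }

    public static void main(String[] args) {
        // 1201 items with a page size of 500 give three batches; the last one is shorter.
        for (int[] r : batchRanges(1201, 500)) {
            System.out.println("sync from " + r[0] + " to " + r[1]);
        }
    }
}
```

Because the end index is clamped with Math.min, the final batch simply contains whatever is left, and an empty list produces no batches at all.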

You can then check the data that was uploaded to ES (for example with GET /post/_search in Kibana Dev Tools).

6. Write the business method for search autocompletion (this part is fairly involved)
@Override
public List<String> getTitleSuggestion(String keyword) {
  @Data
  class Temp {
      Float score;
      String text;

      public Temp(Float score, String text) {
          this.score = score;
          this.text = text;
      }
  }
  // Build a SuggestBuilder and use it as the query condition sent to ES
  SuggestBuilder suggestBuilder = new SuggestBuilder()
          .addSuggestion("suggestionTitle", new CompletionSuggestionBuilder("titleSuggestion").prefix(keyword));
  NativeSearchQuery searchQuery = new NativeSearchQueryBuilder()
          .withSuggestBuilder(suggestBuilder).build();
  // Run the query
  SearchHits<PostEsDTO> searchHits = elasticsearchRestTemplate.search(searchQuery, PostEsDTO.class);
  // Deduplicate the results (the same completion may be suggested more than once)
  HashSet<Temp> tempSet = new HashSet<>();
  // Unwrap the ES response to collect every autocompletion suggestion
  if (searchHits.hasSuggest()) {
      List<Suggest.Suggestion<? extends Suggest.Suggestion.Entry<? extends Suggest.Suggestion.Entry.Option>>> suggestions = searchHits.getSuggest().getSuggestions();
      for (Suggest.Suggestion<? extends Suggest.Suggestion.Entry<? extends Suggest.Suggestion.Entry.Option>> suggestion : suggestions) {
          CompletionSuggestion<CompletionSuggestion.Entry<CompletionSuggestion.Entry.Option>> suggestionComp = (CompletionSuggestion<CompletionSuggestion.Entry<CompletionSuggestion.Entry.Option>>) suggestion;
          List<CompletionSuggestion.Entry<CompletionSuggestion.Entry<CompletionSuggestion.Entry.Option>>> entryList = suggestionComp.getEntries();
          for (CompletionSuggestion.Entry<CompletionSuggestion.Entry<CompletionSuggestion.Entry.Option>> entry : entryList) {
              List<CompletionSuggestion.Entry.Option<CompletionSuggestion.Entry<CompletionSuggestion.Entry.Option>>> options = entry.getOptions();
              for (CompletionSuggestion.Entry.Option<CompletionSuggestion.Entry<CompletionSuggestion.Entry.Option>> option : options) {
                  SearchHit<CompletionSuggestion.Entry<CompletionSuggestion.Entry.Option>> searchHit = option.getSearchHit();
                  Float score = option.getScore();
                  Object content = searchHit.getContent();
                  PostEsDTO post = new JSONObject(content).toBean(PostEsDTO.class);
                  tempSet.add(new Temp(score, post.getTitle()));
              }
          }
      }
      List<String> suggestionList = tempSet.stream().map(Temp::getText).collect(Collectors.toList());
      return suggestionList;
  }
  return new ArrayList<>();
}
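One thing to be aware of in the method above: Lombok's @Data derives equals and hashCode from both score and text, so two options with the same title but different scores both stay in the HashSet, and a HashSet does not guarantee iteration order. If you would rather deduplicate by title alone and keep the order in which ES returned the options, one possible alternative (not the author's code) is a LinkedHashSet of titles, sketched here in plain Java with made-up sample data:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;

class SuggestionDedupSketch {

    // Removes duplicate titles while keeping the first occurrence of each,
    // so the order of the incoming list is preserved.
    static List<String> dedupKeepOrder(List<String> titles) {
        return new ArrayList<>(new LinkedHashSet<>(titles));
    }

    public static void main(String[] args) {
        // Sample titles standing in for the option titles collected from the ES response.
        List<String> fromEs = Arrays.asList("搜索引擎", "索引", "搜索引擎");
        System.out.println(dedupKeepOrder(fromEs));
    }
}
```

In the business method this would mean collecting post.getTitle() into the set directly instead of wrapping it in Temp.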
7. Expose the business method through the Controller layer
import com.yupi.springbootinit.common.BaseResponse;
import com.yupi.springbootinit.common.ResultUtils;
import com.yupi.springbootinit.manager.SearchFacade;
import com.yupi.springbootinit.model.dto.search.SearchRequest;
import com.yupi.springbootinit.model.vo.SearchVO;
import com.yupi.springbootinit.service.PostService;
import org.springframework.web.bind.annotation.*;

import javax.annotation.Resource;
import javax.servlet.http.HttpServletRequest;
import java.util.List;

@RestController
@RequestMapping(value = "/search")
public class SearchController {

    @Resource
    private PostService postService;
    @Resource
    private SearchFacade searchFacade;

    @PostMapping(value = "/all")
    public BaseResponse<SearchVO> searchAll(@RequestBody SearchRequest searchRequest, HttpServletRequest request) {
        SearchVO searchVO = searchFacade.searchAll(searchRequest, request);
        return ResultUtils.success(searchVO);
    }
    // Look up search suggestions for the given keyword
    @GetMapping(value="/get/tip/{keyword}")
    public BaseResponse<List<String>> getSearchSuggestion(@PathVariable String keyword){
        List<String> titleSuggestion = postService.getTitleSuggestion(keyword);
        return ResultUtils.success(titleSuggestion);
    }

}
8. Test (Postman works too); here the Swagger-generated API documentation is used for testing.
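If you call the endpoint from your own code instead of Swagger or Postman, remember that the keyword travels in the URL path, so Chinese characters need to be percent-encoded. The helper below is a small sketch of building such a URL; the class name, method name, and the base address http://localhost:8080 are assumptions for illustration, so adjust host, port, and any context path to your setup.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

class TipUrlSketch {

    // Builds the URL for GET /search/get/tip/{keyword}, percent-encoding the keyword as UTF-8.
    // URLEncoder targets form encoding and turns spaces into '+',
    // so those are rewritten to %20 for use inside a path segment.
    static String buildTipUrl(String baseUrl, String keyword) throws UnsupportedEncodingException {
        String encoded = URLEncoder.encode(keyword, "UTF-8").replace("+", "%20");
        return baseUrl + "/search/get/tip/" + encoded;
    }

    public static void main(String[] args) throws UnsupportedEncodingException {
        // Hypothetical local address; pinyin input passes through unchanged, Chinese input is encoded.
        System.out.println(buildTipUrl("http://localhost:8080", "sou"));
        System.out.println(buildTipUrl("http://localhost:8080", "你好"));
    }
}
```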

9. Front-end page display

All done! You may run into plenty of pitfalls along the way; if you have any questions, feel free to message me.