Search Keyword Autocomplete with Spring Boot and Elasticsearch (Detailed Tutorial)
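Preface: In everyday search we usually tokenize and match on the Chinese characters themselves (for example with the IK analyzer). To make search more convenient and improve the user experience, users should also be able to type pinyin to find what they are looking for.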
1. Add the required dependencies: spring-data-elasticsearch and the pinyin conversion libraries
<!-- elasticsearch-->
<dependency>
<groupId>org.springframework.boot</groupId>
<artifactId>spring-boot-starter-data-elasticsearch</artifactId>
</dependency>
<!-- Pinyin libraries used by Hutool's pinyin utility -->
<dependency>
<groupId>com.belerweb</groupId>
<artifactId>pinyin4j</artifactId>
<version>2.5.0</version>
</dependency>
<dependency>
<groupId>io.github.biezhi</groupId>
<artifactId>TinyPinyin</artifactId>
<version>2.0.3.RELEASE</version>
</dependency>
<dependency>
<groupId>com.github.stuxuhai</groupId>
<artifactId>jpinyin</artifactId>
<version>1.1.8</version>
</dependency>
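2. Modify the mapping JSON of the index that the ES entity class is uploaded to (add the pinyin analyzer and a new search-suggestion field)
2.1 First, download the pinyin analysis plugin
Download: the Releases page of medcl/elasticsearch-analysis-pinyin on GitHub
Make sure the plugin version matches your ES version. If there is no release for your version, download the closest one and edit the plugin-descriptor.properties file inside the plugin (set its elasticsearch.version property to your ES version)
2.2 Unzip the plugin into the plugins directory of Elasticsearch, then restart Elasticsearch so the plugin is loaded
2.3 In the mapping file, add the pinyin analyzer and the search-suggestion field titleSuggestion (type completion)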
"settings": {
"analysis": {
"analyzer": {
"text_analyzer": {
"tokenizer": "ik_max_word",
"filter": "py"
},
"completion_analyzer": {
"tokenizer": "ik_max_word",
"filter": "py"
}
},
"filter": {
"py": {
"type": "pinyin",
"keep_full_pinyin": false,
"keep_joined_full_pinyin": true,
"keep_original": true,
"limit_first_letter_length": 16,
"remove_duplicated_term": true,
"none_chinese_pinyin_tokenize": false
}
}
}
},
"mappings": {
"properties": {
"title": {
"type": "text",
"analyzer": "ik_max_word",
"search_analyzer": "ik_smart",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"content": {
"type": "text",
"analyzer": "ik_max_word",
"search_analyzer": "ik_smart",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"titleSuggestion": { // search-suggestion field
"type": "completion",
"analyzer": "completion_analyzer"
},
"tags": {
"type": "keyword"
},
"userId": {
"type": "keyword"
},
"createTime": {
"type": "date"
},
"updateTime": {
"type": "date"
},
"isDelete": {
"type": "keyword"
}
}
}
2.4 Create the index in Kibana Dev Tools, using the JSON above (wrapped in an outer { }) as the body of a PUT request to the index name
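If Kibana is not at hand, the same index can be created with a plain HTTP PUT. The following is only a minimal sketch using the JDK's built-in HTTP client (Java 11+). The address http://localhost:9200, the index name post, and the shortened body are assumptions for illustration; substitute the full settings and mappings from above, and add authentication if your cluster requires it.

```java
import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

final class CreateIndexSketch {

    // Builds "PUT {baseUrl}/{index}" with a JSON body holding settings and mappings.
    static HttpRequest buildCreateIndexRequest(String baseUrl, String index, String jsonBody) {
        return HttpRequest.newBuilder()
                .uri(URI.create(baseUrl + "/" + index))
                .header("Content-Type", "application/json")
                .PUT(HttpRequest.BodyPublishers.ofString(jsonBody))
                .build();
    }

    // Sends the request; needs a reachable Elasticsearch node.
    static String send(HttpRequest request) throws IOException, InterruptedException {
        HttpResponse<String> response = HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofString());
        return response.statusCode() + " " + response.body();
    }

    public static void main(String[] args) {
        // Assumption: shortened body; use the full settings/mappings JSON in practice.
        String body = "{\"mappings\":{\"properties\":{"
                + "\"titleSuggestion\":{\"type\":\"completion\"}}}}";
        HttpRequest request = buildCreateIndexRequest("http://localhost:9200", "post", body);
        System.out.println(request.method() + " " + request.uri());
        // Call send(request) once an Elasticsearch node is running at that address.
    }
}
```

Note that a body sent this way must be strict JSON, so any // comments in the mapping have to be removed first.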
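3. Create the ES entity class: it is a further wrapper around the Java entity that maps to MySQL. You choose which MySQL fields are uploaded to ES; private fields and fields that are not used for search can be left out. (In this article, pay particular attention to the search-suggestion field titleSuggestion and to how objToDto populates it.)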
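// Assumed to be Hutool's PinyinUtil, going by the dependency comment above
import cn.hutool.extra.pinyin.PinyinUtil;
import com.google.gson.Gson;
import com.google.gson.reflect.TypeToken;
import com.yupi.springbootinit.model.entity.Post;
import java.io.Serializable;
import java.util.ArrayList;
import java.util.Date;
import java.util.List;
import lombok.Data;
import org.apache.commons.collections4.CollectionUtils;
import org.apache.commons.lang3.StringUtils;
import org.springframework.beans.BeanUtils;
import org.springframework.data.annotation.Id;
import org.springframework.data.elasticsearch.annotations.Document;
import org.springframework.data.elasticsearch.annotations.Field;
import org.springframework.data.elasticsearch.annotations.FieldType;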
/**
* Post ES wrapper class
**/
@Document(indexName = "post")
@Data
public class PostEsDTO implements Serializable {
private static final String DATE_TIME_PATTERN = "yyyy-MM-dd'T'HH:mm:ss.SSS'Z'";
/**
* id
*/
@Id // document id
private Long id;
/**
* Title
*/
private String title;
/**
* Content
*/
private String content;
/**
* Tag list
*/
private List<String> tags;
/**
* Search suggestions
*/
private List<String> titleSuggestion;
/**
* Like count
*/
// private Integer thumbNum;
/**
* Favorite count
*/
// private Integer favourNum;
/**
* Creator user id
*/
private Long userId;
/**
* Creation time
*/
@Field(index = false, store = true, type = FieldType.Date, format = {}, pattern = DATE_TIME_PATTERN)
private Date createTime;
/**
* Update time
*/
@Field(index = false, store = true, type = FieldType.Date, format = {}, pattern = DATE_TIME_PATTERN)
private Date updateTime;
/**
* Deleted flag
*/
private Integer isDelete;
private static final long serialVersionUID = 1L;
private static final Gson GSON = new Gson();
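/**
* Entity to wrapper
* Wraps the Java entity that maps to MySQL into the entity that maps to ES
* @param post the MySQL-backed entity
* @return the ES document object, or null if post is null
*/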
public static PostEsDTO objToDto(Post post) {
if (post == null) {
return null;
}
PostEsDTO postEsDTO = new PostEsDTO();
BeanUtils.copyProperties(post, postEsDTO);
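String title = postEsDTO.getTitle();
// Suggestion inputs: the original title plus its pinyin with the spaces removed
if (StringUtils.isNotBlank(title)) {
List<String> titleSuggestion = new ArrayList<>();
titleSuggestion.add(title);
titleSuggestion.add(PinyinUtil.getPinyin(title).replace(" ", ""));
postEsDTO.setTitleSuggestion(titleSuggestion);
}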
String tagsStr = post.getTags();
if (StringUtils.isNotBlank(tagsStr)) {
postEsDTO.setTags(GSON.fromJson(tagsStr, new TypeToken<List<String>>() {
}.getType()));
}
return postEsDTO;
}
/**
* Wrapper to entity
*
* @param postEsDTO the ES document object
* @return the MySQL-backed entity, or null if postEsDTO is null
*/
public static Post dtoToObj(PostEsDTO postEsDTO) {
if (postEsDTO == null) {
return null;
}
Post post = new Post();
BeanUtils.copyProperties(postEsDTO, post);
List<String> tagList = postEsDTO.getTags();
if (CollectionUtils.isNotEmpty(tagList)) {
post.setTags(GSON.toJson(tagList));
}
return post;
}
}
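To make the content of titleSuggestion concrete, here is a small standalone sketch of how the two inputs are built. The tiny lookup table is a hypothetical stand-in for a real pinyin library and covers only the characters of the sample title; the class and method names are illustrative, not part of the project.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.StringJoiner;

final class SuggestionInputsSketch {

    // Hypothetical stand-in for a pinyin library: knows only four characters.
    private static final Map<Character, String> PINYIN = Map.of(
            '搜', "sou", '索', "suo", '建', "jian", '议', "yi");

    // Returns space-separated pinyin; characters missing from the table are kept as-is.
    static String toSpacedPinyin(String text) {
        StringJoiner joiner = new StringJoiner(" ");
        for (char c : text.toCharArray()) {
            joiner.add(PINYIN.getOrDefault(c, String.valueOf(c)));
        }
        return joiner.toString();
    }

    // Inputs for the completion field: the title itself and its joined pinyin.
    static List<String> buildInputs(String title) {
        List<String> inputs = new ArrayList<>();
        if (title == null || title.isBlank()) {
            return inputs;
        }
        inputs.add(title);
        inputs.add(toSpacedPinyin(title).replace(" ", ""));
        return inputs;
    }

    public static void main(String[] args) {
        System.out.println(buildInputs("搜索建议")); // [搜索建议, sousuojianyi]
    }
}
```

Storing both forms means a prefix typed in Chinese characters and a prefix typed in pinyin can each match the same document.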
4. Create the ES access interface (comparable to a Mapper, the database access class, in a Spring Boot project)
import com.yupi.springbootinit.model.dto.post.PostEsDTO;
import java.util.List;
import org.springframework.data.elasticsearch.repository.ElasticsearchRepository;
/**
* Post ES access
*
*/
public interface PostEsDao extends ElasticsearchRepository<PostEsDTO, Long> {
List<PostEsDTO> findByUserId(Long userId);
}
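5. Fully sync the MySQL data into ES, because the index we just created (post_vo here; the name must match the indexName in @Document) has only a mapping and no data yet. The author runs the sync as a task at application startup, but you could also write a test method that uploads the data to ES (not shown here)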
import com.yupi.springbootinit.esdao.PostEsDao;
import com.yupi.springbootinit.model.dto.post.PostEsDTO;
import com.yupi.springbootinit.model.entity.Post;
import com.yupi.springbootinit.service.PostService;
import java.util.List;
import java.util.stream.Collectors;
import javax.annotation.Resource;
import lombok.extern.slf4j.Slf4j;
import org.apache.commons.collections4.CollectionUtils;
import org.springframework.boot.CommandLineRunner;
import org.springframework.stereotype.Component;
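/**
* Full sync of posts to ES
* This is a startup task rather than a scheduled one: Spring Boot calls it once every time the application starts
* To turn it off, simply comment out @Component
*/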
@Component
@Slf4j
public class FullSyncPostToEs implements CommandLineRunner {
@Resource
private PostService postService;
@Resource
private PostEsDao postEsDao;
@Override
public void run(String... args) {
// Load every row of the post table from the database
List<Post> postList = postService.list();
// Nothing to sync if the table is empty
if (CollectionUtils.isEmpty(postList)) {
return;
}
// Wrap each MySQL row as a PostEsDTO (the shape the ES index expects)
List<PostEsDTO> postEsDTOList = postList.stream().map(PostEsDTO::objToDto).collect(Collectors.toList());
final int pageSize = 500;
int total = postEsDTOList.size();
log.info("FullSyncPostToEs start, total {}", total);
for (int i = 0; i < total; i += pageSize) {
int end = Math.min(i + pageSize, total);
log.info("sync from {} to {}", i, end);
// Upload the current batch of postEsDTOList to ES with the repository's saveAll() method
postEsDao.saveAll(postEsDTOList.subList(i, end));
}
log.info("FullSyncPostToEs end, total {}", total);
}
}
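The batching in the loop above can be tried in isolation. This is a minimal sketch with plain integers standing in for PostEsDTO objects; the class and method names are illustrative only.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

final class BatchSketch {

    // Splits items into consecutive pages of at most pageSize elements (views on the original list).
    static <T> List<List<T>> partition(List<T> items, int pageSize) {
        if (pageSize <= 0) {
            throw new IllegalArgumentException("pageSize must be positive");
        }
        List<List<T>> pages = new ArrayList<>();
        for (int i = 0; i < items.size(); i += pageSize) {
            int end = Math.min(i + pageSize, items.size());
            pages.add(items.subList(i, end));
        }
        return pages;
    }

    public static void main(String[] args) {
        List<Integer> ids = IntStream.range(0, 1200).boxed().collect(Collectors.toList());
        for (List<Integer> page : partition(ids, 500)) {
            System.out.println("batch of " + page.size()); // 500, 500, 200
        }
    }
}
```

Sending fixed-size batches keeps each bulk request small instead of pushing the whole table in a single call.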
Check the data uploaded to ES
6. Write the business method for search autocomplete (this part is fairly involved)
@Override
public List<String> getTitleSuggestion(String keyword) {
    // Build a completion suggester on the titleSuggestion field and use it as the query sent to ES
    SuggestBuilder suggestBuilder = new SuggestBuilder()
            .addSuggestion("suggestionTitle", new CompletionSuggestionBuilder("titleSuggestion").prefix(keyword));
    NativeSearchQuery searchQuery = new NativeSearchQueryBuilder()
            .withSuggestBuilder(suggestBuilder).build();
    // Run the query
    SearchHits<PostEsDTO> searchHits = elasticsearchRestTemplate.search(searchQuery, PostEsDTO.class);
    if (!searchHits.hasSuggest()) {
        return new ArrayList<>();
    }
    // Deduplicate by title (several posts may share one); LinkedHashSet keeps the order ES returned
    Set<String> titles = new LinkedHashSet<>();
    // Suggest and CompletionSuggestion are Spring Data Elasticsearch's response types here
    for (Suggest.Suggestion<? extends Suggest.Suggestion.Entry<? extends Suggest.Suggestion.Entry.Option>> suggestion
            : searchHits.getSuggest().getSuggestions()) {
        // Only a completion suggestion was requested, so the cast is expected to hold
        @SuppressWarnings("unchecked")
        CompletionSuggestion<PostEsDTO> completion = (CompletionSuggestion<PostEsDTO>) suggestion;
        for (CompletionSuggestion.Entry<PostEsDTO> entry : completion.getEntries()) {
            for (CompletionSuggestion.Entry.Option<PostEsDTO> option : entry.getOptions()) {
                // Each option carries the matched document; return its title as the suggestion text
                SearchHit<PostEsDTO> searchHit = option.getSearchHit();
                if (searchHit != null) {
                    titles.add(searchHit.getContent().getTitle());
                }
            }
        }
    }
    return new ArrayList<>(titles);
}
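As a mental model of what this method returns, the sketch below imitates prefix matching over suggestion inputs in memory. It is a deliberate simplification for illustration, not how Elasticsearch implements the completion suggester, and the Doc class and sample data are hypothetical.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

final class PrefixSuggestSketch {

    // One indexed document: the title shown to the user and the inputs it can be found by.
    static final class Doc {
        final String title;
        final List<String> inputs;

        Doc(String title, List<String> inputs) {
            this.title = title;
            this.inputs = inputs;
        }
    }

    // Returns the titles of documents with at least one input starting with the keyword, without duplicates.
    static List<String> suggest(List<Doc> docs, String keyword) {
        String prefix = keyword.toLowerCase(Locale.ROOT);
        Set<String> titles = new LinkedHashSet<>();
        for (Doc doc : docs) {
            for (String input : doc.inputs) {
                if (input.toLowerCase(Locale.ROOT).startsWith(prefix)) {
                    titles.add(doc.title);
                    break; // one matching input is enough for this document
                }
            }
        }
        return new ArrayList<>(titles);
    }

    public static void main(String[] args) {
        List<Doc> docs = List.of(
                new Doc("搜索建议", List.of("搜索建议", "sousuojianyi")),
                new Doc("搜索建议", List.of("搜索建议", "sousuojianyi")), // second post with the same title
                new Doc("分词器", List.of("分词器", "fenciqi")));
        System.out.println(suggest(docs, "sou")); // [搜索建议]
        System.out.println(suggest(docs, "分"));  // [分词器]
    }
}
```

The pinyin prefix finds the document through its second input, yet the user still sees the Chinese title, which is why the business method returns the title rather than the matched input.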
7. Expose the business method through the Controller layer
import com.yupi.springbootinit.common.BaseResponse;
import com.yupi.springbootinit.common.ResultUtils;
import com.yupi.springbootinit.manager.SearchFacade;
import com.yupi.springbootinit.model.dto.search.SearchRequest;
import com.yupi.springbootinit.model.vo.SearchVO;
import com.yupi.springbootinit.service.PostService;
import org.springframework.web.bind.annotation.*;
import javax.annotation.Resource;
import javax.servlet.http.HttpServletRequest;
import java.util.List;
@RestController
@RequestMapping(value = "/search")
public class SearchController {
@Resource
private PostService postService;
@Resource
private SearchFacade searchFacade;
@PostMapping(value = "/all")
public BaseResponse<SearchVO> searchAll(@RequestBody SearchRequest searchRequest, HttpServletRequest request) {
SearchVO searchVO = searchFacade.searchAll(searchRequest, request);
return ResultUtils.success(searchVO);
}
// Look up search suggestions for the keyword typed so far
@GetMapping(value="/get/tip/{keyword}")
public BaseResponse<List<String>> getSearchSuggestion(@PathVariable String keyword){
List<String> titleSuggestion = postService.getTitleSuggestion(keyword);
return ResultUtils.success(titleSuggestion);
}
}
8. Test (Postman works); here the test uses the API documentation generated by Swagger
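When calling the endpoint from code rather than from Postman or Swagger, remember that the keyword is part of the URL path, so Chinese characters and spaces must be percent-encoded. A minimal sketch follows; the base URL http://localhost:8080 is an assumption, and any context path your application configures would have to be added to it.

```java
import java.net.URI;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

final class SuggestionUrlSketch {

    // Builds the GET URL for /search/get/tip/{keyword} with the keyword percent-encoded.
    static URI suggestionUri(String baseUrl, String keyword) {
        // URLEncoder targets form encoding and writes a space as "+";
        // inside a path segment a space has to be "%20" instead.
        String encoded = URLEncoder.encode(keyword, StandardCharsets.UTF_8).replace("+", "%20");
        return URI.create(baseUrl + "/search/get/tip/" + encoded);
    }

    public static void main(String[] args) {
        System.out.println(suggestionUri("http://localhost:8080", "sou"));
        System.out.println(suggestionUri("http://localhost:8080", "帖子"));
    }
}
```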
9. Front-end page display
All done! You may run into quite a few pitfalls along the way; if you have questions, feel free to message me.