ElasticSearch+Kafka:文章检索

80 阅读3分钟

实现目标

image.png

思路与ES前期准备

image.png

使用postman添加映射put请求 :

搜索结果展示内容:标题、布局、枫叶图片、发布时间、作者名称、文章id、作者id、静态url 需要对:内容、标题进行分词

"content":{
    "type":"text",
    "ananlyze":"ik_smart"
}

http://${url}:${port}/app_info_article

image.png

校验尝试:

GET请求查询映射:http://${url}:${port}/app_info_article

DELETE请求,删除索引及映射:http://${url}:${port}/app_info_article

GET请求,查询所有文档:http://${url}:${port}/app_info_article/_search

image.png

查询所有文章,批量导入到es库中

BulkRequest bulkRequest = new BulkRequest("app_info_article");

for (SearchArticleVo searchArticleVo : searchArticleVos) {

    IndexRequest indexRequest = new IndexRequest().id(searchArticleVo.getId().toString())
            .source(JSON.toJSONString(searchArticleVo), XContentType.JSON);

    //批量添加数据
    bulkRequest.add(indexRequest);

}
restHighLevelClient.bulk(bulkRequest, RequestOptions.DEFAULT);

其中: bulkRequest用法

RestHighLevelClient restHighLevelClient = new RestHighLevelClient();//创建一个es客户端,用于执行请求

BulkRequest bulkRequest = new BulkRequest(); //创建一个批量请求

IndexRequest indexRequest1 = new IndexRequest("index_name"); //创建第一个“添加文档”的请求
indexRequest1.id("1") .source(XContentType.JSON,"field_name", "foo1");//设置添加的文档信息
IndexRequest indexRequest2 = new IndexRequest("index_name"); //创建第二个“添加文档”的请求
indexRequest2.id("1") .source(XContentType.JSON,"field_name", "foo2");//设置添加的文档信息

bulkRequest.add(indexRequest1); //将第一个“添加文档”请求加入到批量请求中
bulkRequest.add(indexRequest2); //将第二个“添加文档”请求加入到批量请求中

BulkResponse bulkResponse = restHighLevelClient.bulk(bulkRequest, RequestOptions.DEFAULT);//执行批量请求,返回执行结果

查询时各种条件的构造的正确姿势

由于es在java中查询没法像mybatis那样方便,而且es的构造器使用也比较繁琐,理解不是很方便,所以下面来记录es构造器BoolQueryBuilder查询时各种条件的构造的正确姿势。

1.构造准备

//1.构建SearchRequest请求对象,指定索引库
SearchRequest searchRequest = new SearchRequest("data_info");
//2.构建SearchSourceBuilder查询对象
SearchSourceBuilder sourceBuilder = new SearchSourceBuilder();
//2.1 这个条件用于返回所有命中条件的数据数量, 不设置则返回大概数值
sourceBuilder.trackTotalHits(true);
//3.检索条件构造
BoolQueryBuilder bqb = QueryBuilders.boolQuery();

2.条件构造:must可以使用filter代替,效率更高,因为must会对_score评估

//3.1 完全匹配
bqb.must(QueryBuilders.matchQuery("code", 666L);
         
//3.2 模糊匹配
bqb.must(QueryBuilders.matchPhraseQuery("name", "张");
         
//3.3 in的效果 传单个参数就是完全匹配
bqb.must(QueryBuilders.termsQuery("code", new Long[]{1L, 2L, 3L});
bqb.must(QueryBuilders.termsQuery("code", 1L, 2L, 3L);
         
//3.4 or条件
BoolQueryBuilder shouldQuery = QueryBuilders.boolQuery();
shouldQuery.should(QueryBuilders.matchQuery("code", 1L);
shouldQuery.should(QueryBuilders.matchQuery("code", 2L);
shouldQuery.minimumShouldMatch(1); //至少满足一个
bqb.must(shouldQuery);
                   
//3.5 非null
bqb.must(QueryBuilders.existsQuery("iden"));
//是null             
bqb.mustNot(QueryBuilders.existsQuery("iden"));

//3.6 大于等于gte (gt-大于 lt-小于 lte-小于等于)
bqb.must(QueryBuilders.rangeQuery("time").gte(new Date());
         
//3.7 中文完全匹配
bqb.must(queryBuilder.matchPhraseQuery("key", value));

//3.8 匹配多个字段
bqb.must(queryBuilder.multiMatchQuery(value, key1, key2, key3));

3.构造完成 准备查询

//4.将QuseryBuilder对象设置到SearchSourceBuilder对象中
sourceBuilder.query(bqb);
//5.排序
sourceBuilder.sort("updateTime", SortOrder.DESC);
//6.分页
sourceBuilder.from((dto.getPageNum() - 1) * dto.getPageSize());
sourceBuilder.size(dto.getPageSize());

//设置高亮  title
HighlightBuilder highlightBuilder = new HighlightBuilder();
highlightBuilder.field("title");
highlightBuilder.preTags("<font style='color: red; font-size: inherit;'>");
highlightBuilder.postTags("</font>");

searchBuilder.highlighter(highlightBuilder);

//7.将SearchSourceBuilder设置到SearchRequest中
searchRequest.source(sourceBuilder);

//8.调用方法查询数据
SearchResponse searchResponse = client.search(searchRequest, RequestOptions.DEFAULT);
         
//9.解析返回结果
SearchHit[] hits = searchResponse.getHits().getHits();
for (int i = 0; i < hits.length; i++) {
    System.out.println("返回的结果: " + hits[i].getSourceAsString());
}

log.info("返回总数为:" + searchResponse.getHits().getTotalHits());
int total = (int)searchResponse.getHits().getTotalHits().value;

检索业务实现

  1. 配置config
@Getter
@Setter
@Configuration
@ConfigurationProperties(prefix = "elasticsearch")
public class ElasticSearchConfig {
    private String host;
    private int port;

    @Bean
    public RestHighLevelClient client(){
        System.out.println(host);
        System.out.println(port);
        return new RestHighLevelClient(RestClient.builder(
                new HttpHost(
                        host,
                        port,
                        "http"
                )
        ));
    }
}
  1. 构造一个UserSearchDto其中包含字段 搜索关键字searchWords、当前页pageNum、分页条数pageSize、最小时间minBehotTime。其中提供一个分页算法
public int getFromIndex(){
    if(this.pageNum<1)return 0;
    if(this.pageSize<1) this.pageSize = 10;
    return this.pageSize * (pageNum-1);
}
  1. 进行编写es代码
// 传入参数为UserSearchDto
// 1 设置查询条件
SearchRequest searchRequest = new SearchRequest("key_name");
SearchSourceBuilder searchSourceBuilder = new SearchSourceBuilder();

// 2 设置查询条件
BoolQueryBuilder boolQueryBuilder = QueryBuilders.query();

// 关键字的分词后查询
QueryStringQueryBuilder queryStringQueryBuilder = QueryBuilders.queryStringQuery(dto.getSearchWords()).field("title").field("content").defaultOperatior(Operator.OR);
boolQueryBuilder.must(queryStringQueryBuilder);

// 查询小于minBehotTim的数据
RangeQueryBuilder rangeQueryBuilder = QueryBuilders.rangeQuery("publishTime").lt(dto.getMinHotTime().getTime());
boolQueryBuilder.filter(rangeQueryBuilder);

// 分页查询
searchSourceBuilder.from(0);
searchSourceBuilder.size(dto.getPageSize());

// 按照发布时间倒序查询
searchSourceBuilder.sort("publishTime", SortOrder.DESC);

// 设置高亮 title
HighlightBuilder highlightBuilder = new HighlightBuilder();
highlightBuilder.field("title");
highlightBuilder.preTags("<font style='color: red; font-size: inherit;'>");
highlightBuilder.postTags("</font>")
searchSourceSearch.highlighter(highlightBuilder);

// 执行查询
searchSourceBuilder.query(boolQueryBuilder);
searchRequest.source(searchSourceBuilder);
SearchResponse searchResponse = restHighLevelClient.search(searchRequet, RequestOptions.DEFAULT);

文章审核后构建索引

收集数据并发送消息

//发送消息,创建索引
createArticleESIndex(apArticle,content,path);

@Autowired
private KafkaTemplate<String,String> kafkaTemplate;

/**
 * 送消息,创建索引
 * @param apArticle
 * @param content
 * @param path
 */
private void createArticleESIndex(ApArticle apArticle, String content, String path) {
    SearchArticleVo vo = new SearchArticleVo();
    BeanUtils.copyProperties(apArticle,vo);
    vo.setContent(content);
    vo.setStaticUrl(path);

    kafkaTemplate.send(ArticleConstants.ARTICLE_ES_SYNC_TOPIC, JSON.toJSONString(vo));
}

public static final String ARTICLE_ES_SYNC_TOPIC = "article.es.sync.topic";

搜索的微服务 接收消息并创建场景

  1. 搜索微服务中添加kafka的配置,nacos配置
spring:
  kafka:
    bootstrap-servers: ${url}:${port}
    consumer:
      group-id: ${spring.application.name}
      key-deserializer: org.apache.kafka.common.serialization.StringDeserializer
      value-deserializer: org.apache.kafka.common.serialization.StringDeserializer
  1. 定义监听接收消息,保存索引数据,class SyncArticleListener
@AutoWired
private RestHighLevelClient restHighLevelClient;

@KafkaListener(topics = ArticleConstants.ARTICLE_ES_SYNC_TOPIC)
public void onMessage(String message){
    if(StringUtils.isNotBlank(message)){
        log.info("SyncArticleListener message = {}", message);
        SearchArticleVo searchArticleVo = JSON.parseObject
    }
}