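Elasticsearch 8.1 Java API Client Usage Guide (Search)

Search Operations API

Elasticsearch Search Concepts

Search Basics

**Search** is the core feature of Elasticsearch: it retrieves documents from one or more indices.

Search flow:

Query phase: the query runs on every relevant shard, and each shard returns the document IDs and scores (or sort values) of its top hits
Merge and sort: the coordinating node merges the per-shard results and sorts them globally, by score unless a sort is specified
Fetch phase: the coordinating node fetches the full documents for the selected IDs from the shards that hold them and returns them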
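
The flow above can be pictured with a small in-memory model. This is only a sketch under stated assumptions: the class QueryThenFetchSketch, its ScoredId type, and the maps standing in for shards and for the document store are all hypothetical, and real shards exchange more than IDs and scores.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

class QueryThenFetchSketch {

    /** Lightweight reference returned by the query phase: an ID and a score, no document body. */
    static final class ScoredId {
        final String id;
        final double score;

        ScoredId(String id, double score) {
            this.id = id;
            this.score = score;
        }
    }

    /** Query phase on one shard: return only the IDs and scores of the shard's top-k matches. */
    static List<ScoredId> queryShard(Map<String, Double> shardScores, int k) {
        List<ScoredId> hits = new ArrayList<>();
        shardScores.forEach((id, score) -> hits.add(new ScoredId(id, score)));
        hits.sort((a, b) -> Double.compare(b.score, a.score)); // highest score first
        return new ArrayList<>(hits.subList(0, Math.min(k, hits.size())));
    }

    /** Coordinating node: merge the per-shard lists and keep the global top-k. */
    static List<ScoredId> merge(List<List<ScoredId>> perShard, int k) {
        List<ScoredId> all = new ArrayList<>();
        perShard.forEach(all::addAll);
        all.sort((a, b) -> Double.compare(b.score, a.score));
        return new ArrayList<>(all.subList(0, Math.min(k, all.size())));
    }

    /** Fetch phase: load full documents only for the winning IDs. */
    static List<String> fetch(List<ScoredId> winners, Map<String, String> documentStore) {
        List<String> docs = new ArrayList<>();
        for (ScoredId hit : winners) {
            docs.add(documentStore.get(hit.id));
        }
        return docs;
    }
}
```

The point of the split is that full documents travel over the network only for the final top-k, not for every candidate on every shard.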

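Query Categories

An Elasticsearch query clause runs in one of two contexts:

1. Query Context

Decides whether a document matches and contributes to its score (_score)
Answers the question "how well does this document match?"
Common queries: match, term, range, and so on

2. Filter Context

Filters documents without affecting the score
Answers the question "does this document match?" (yes/no)
Usually faster, and frequently used filters can be cached
Common filters: term, range, exists, and so on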

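Relevance Scoring

Elasticsearch computes relevance scores with the BM25 algorithm by default. BM25 builds on the same ideas as TF-IDF (Term Frequency-Inverse Document Frequency):

TF (term frequency): how often the term occurs in the document
IDF (inverse document frequency): how rare the term is across all documents
Field-length normalization: a match in a short field weighs more than the same match in a long field
The higher the score, the more relevant the document is to the query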
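
To make the three factors concrete, here is a minimal sketch of the textbook BM25 formula for a single term. It is an illustration, not Lucene's code: the class name Bm25Sketch is hypothetical, and the real implementation differs in constant factors and in how it encodes field lengths. The values k1 = 1.2 and b = 0.75 are the Elasticsearch defaults.

```java
class Bm25Sketch {
    static final double K1 = 1.2;  // controls how quickly term frequency saturates
    static final double B = 0.75;  // controls how strongly field length is normalized

    /** IDF: a term that occurs in fewer documents gets a larger weight. */
    static double idf(long docCount, long docFreq) {
        return Math.log(1.0 + (docCount - docFreq + 0.5) / (docFreq + 0.5));
    }

    /** Score contribution of one query term for one field of one document. */
    static double score(double termFreq, long docCount, long docFreq,
                        double fieldLength, double avgFieldLength) {
        // Longer-than-average fields increase the denominator and lower the score
        double lengthNorm = K1 * (1.0 - B + B * fieldLength / avgFieldLength);
        return idf(docCount, docFreq) * termFreq * (K1 + 1.0) / (termFreq + lengthNorm);
    }
}
```

Two properties are worth noticing: repeating a term ten times does not make the score ten times larger (term frequency saturates), and the same match in a shorter field scores higher.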

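Query DSL (Domain Specific Language)

Elasticsearch expresses queries in a JSON-based query DSL. The main query types are:

1. Full Text Queries

match: full-text search; the query text is analyzed (tokenized) first
match_phrase: phrase match; the terms must appear in the given order
multi_match: runs a match across several fields
match_all: matches every document (strictly speaking not a full-text query, listed here for convenience)

2. Term-level Queries

term: exact match; the query value is not analyzed
terms: exact match against any of several values
range: range query
exists: matches documents in which the field has a value
wildcard: wildcard pattern query
fuzzy: fuzzy query that tolerates small spelling differences

3. Compound Queries

bool: Boolean query that combines other queries
- must: has to match; contributes to the score
- should: optional match; contributes to the score when it matches
- must_not: must not match; does not affect the score
- filter: has to match; does not affect the score (usually faster)

4. Aggregations

Aggregations are not queries, but they are sent in the same search request:

terms: terms aggregation; groups documents by value and counts them
stats: stats aggregation; computes count, minimum, maximum, average, and sum
date_histogram: date histogram aggregation
range: range aggregation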

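Highlighting

Highlighting marks the matched text fragments in the search results:

Matched text is wrapped in <em> tags by default
The highlight tags and styling can be customized
Several fields can be highlighted in one request
The number and length of the highlighted fragments are configurable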

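Pagination

Elasticsearch supports three pagination methods:

1. from/size pagination

Suitable for small result sets (fewer than 10,000 hits)
Deep pages are slow (the larger from is, the slower the request)
By default from + size may not exceed 10,000 (index.max_result_window)

2. Scroll API

Suitable for exporting large amounts of data
Keeps a search context that acts as a snapshot, so every batch sees the same consistent view of the data
The scroll_id has to be tracked between requests, and the open context costs resources

3. Search After

Suitable for real-time deep pagination
Uses the sort values of the last document on the previous page as the cursor
Fast, but it requires a sort, ideally one that ends in a unique tiebreaker field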
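
The cursor idea behind Search After can be sketched over an in-memory list. This models only the semantics ("return the next documents that sort strictly after the cursor"); it is not how Elasticsearch evaluates the request, and the class SearchAfterSketch, its Doc type, and the sort on views plus id are assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

class SearchAfterSketch {

    static final class Doc {
        final String id;
        final long views;

        Doc(String id, long views) {
            this.id = id;
            this.views = views;
        }
    }

    // Sort order: views descending, then id ascending as a unique tiebreaker
    static final Comparator<Doc> ORDER = (a, b) -> {
        int byViews = Long.compare(b.views, a.views);
        return byViews != 0 ? byViews : a.id.compareTo(b.id);
    };

    /** Returns up to size documents that sort strictly after the cursor (null means first page). */
    static List<Doc> nextPage(List<Doc> index, Doc cursor, int size) {
        List<Doc> sorted = new ArrayList<>(index);
        sorted.sort(ORDER);
        List<Doc> page = new ArrayList<>();
        for (Doc doc : sorted) {
            if (cursor != null && ORDER.compare(doc, cursor) <= 0) {
                continue; // at or before the cursor: already delivered on an earlier page
            }
            if (page.size() == size) {
                break;
            }
            page.add(doc);
        }
        return page;
    }
}
```

The caller passes the last document of each page as the cursor for the next call. Without the id tiebreaker, two documents with equal views could be skipped or repeated at a page boundary, which is why a unique final sort key matters.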

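Sorting

Results can be sorted by:

Field value: ascending or descending
Score (_score): by relevance
Geo location: by distance
Script: by a value computed in a script

Notes:

When you sort on a field, _score is not computed unless track_scores is enabled
Sort fields should be keyword or numeric types
Sorting on a text field requires fielddata (not recommended, it uses a lot of heap memory)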

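1. Basic Search

1.1 Match All Query

Key points: Match All query

Matches every document in the index
Every document gets a score of 1.0
Typically used for testing or in combination with other queries
The boost parameter changes the score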

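import co.elastic.clients.elasticsearch.ElasticsearchClient;
import co.elastic.clients.elasticsearch.core.SearchResponse;
import co.elastic.clients.elasticsearch.core.search.Hit;

import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

public class SearchOperations {

    /**
     * Match All query. Note that a search returns 10 hits by default;
     * set size to get more.
     */
    public static <T> List<T> searchAll(
            ElasticsearchClient client, String indexName, Class<T> clazz)
            throws IOException {

        SearchResponse<T> response = client.search(s -> s
            .index(indexName)
            .query(q -> q
                .matchAll(m -> m)
            ),
            clazz
        );

        // Collect the deserialized _source of every hit
        List<T> results = new ArrayList<>();
        for (Hit<T> hit : response.hits().hits()) {
            results.add(hit.source());
        }

        return results;
    }
}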

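1.2 Match Query

Key points: Match query

Full-text search: the query text is analyzed into terms, and the terms are then matched
Analyzer: the search analyzer from the field mapping is used (by default the same analyzer as at index time)
Scoring: relevance is computed from term frequency, inverse document frequency, and so on
Related type values (these belong to multi_match, not to match itself):
- best_fields (default): score from the single best-matching field
- most_fields: combine the scores of all matching fields
- cross_fields: treat several fields as one large field
- phrase: phrase match
- phrase_prefix: phrase prefix match
Use cases: keywords typed by a user, full-text search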

public class SearchOperations {

    /**
     * Match query (full-text search)
     */
    public static <T> List<T> matchQuery(
            ElasticsearchClient client, String indexName,
            String field, String value, Class<T> clazz) throws IOException {

        SearchResponse<T> response = client.search(s -> s
            .index(indexName)
            .query(q -> q
                .match(m -> m
                    .field(field)
                    .query(value) // analyzed before matching
                )
            ),
            clazz
        );

        List<T> results = new ArrayList<>();
        for (Hit<T> hit : response.hits().hits()) {
            results.add(hit.source());
        }

        return results;
    }
}

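1.3 Term Query (Exact Match)

Key points: Term query

Exact match: the query value is not analyzed; it is compared directly with the terms in the inverted index
No analysis: the query value has to equal an indexed term exactly
Scoring: a term query is scored when it runs in query context; placed in a filter clause it is not scored, which is faster and the usual choice
Use cases:
- Exact match on keyword fields
- Enumerated values such as statuses and tags
- Boolean values
Note: a term query on a text field may find nothing, because the text was analyzed at index time and the stored terms differ from the original string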

public class SearchOperations {

    /**
     * Term query (exact match, no analysis)
     */
    public static <T> List<T> termQuery(
            ElasticsearchClient client, String indexName,
            String field, String value, Class<T> clazz) throws IOException {

        SearchResponse<T> response = client.search(s -> s
            .index(indexName)
            .query(q -> q
                .term(t -> t
                    .field(field)
                    .value(value) // compared with indexed terms as-is
                )
            ),
            clazz
        );

        List<T> results = new ArrayList<>();
        for (Hit<T> hit : response.hits().hits()) {
            results.add(hit.source());
        }

        return results;
    }
}

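1.4 Terms Query (Exact Match on Several Values)

Key points: Terms query

Multi-value match: matches documents whose field equals any value in the given list
Similar to SQL IN: WHERE field IN (value1, value2, ...)
Exact match: like term, the values are not analyzed
Use cases:
- Multi-select filters (for example, status is "published" or "draft")
- Tag matching (for example, tags contain "Java" or "Python")
- Category queries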

import co.elastic.clients.elasticsearch._types.FieldValue;

import java.util.stream.Collectors;

public class SearchOperations {

    /**
     * Terms query (exact match on several values)
     */
    public static <T> List<T> termsQuery(
            ElasticsearchClient client, String indexName,
            String field, List<String> values, Class<T> clazz) throws IOException {

        SearchResponse<T> response = client.search(s -> s
            .index(indexName)
            .query(q -> q
                .terms(t -> t
                    .field(field)
                    .terms(te -> te
                        // The client expects FieldValue instances, so wrap each string
                        .value(values.stream()
                            .map(v -> FieldValue.of(v))
                            .collect(Collectors.toList()))
                    )
                )
            ),
            clazz
        );

        List<T> results = new ArrayList<>();
        for (Hit<T> hit : response.hits().hits()) {
            results.add(hit.source());
        }

        return results;
    }
}

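2. Range Queries

2.1 Range Query

Key points: Range query

Range match: matches documents whose field value lies within the given range
Range operators:
- gt: greater than (>)
- gte: greater than or equal to (>=)
- lt: less than (<)
- lte: less than or equal to (<=)
Supported types: numbers, dates, and strings
Performance: range queries on numeric and date fields are efficient and are usually placed in filter context
Use cases:
- Price range filters
- Date range queries
- Numeric ranges such as age or rating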

import co.elastic.clients.json.JsonData;

public class SearchOperations {

    /**
     * Range query on a numeric field.
     * The builder shape below is the one used by the 8.1 client;
     * later client releases changed the range builder.
     */
    public static <T> List<T> rangeQuery(
            ElasticsearchClient client, String indexName,
            String field, Long min, Long max, Class<T> clazz) throws IOException {

        SearchResponse<T> response = client.search(s -> s
            .index(indexName)
            .query(q -> q
                .range(r -> r
                    .field(field)
                    .gte(JsonData.of(min)) // lower bound, inclusive
                    .lte(JsonData.of(max)) // upper bound, inclusive
                )
            ),
            clazz
        );

        List<T> results = new ArrayList<>();
        for (Hit<T> hit : response.hits().hits()) {
            results.add(hit.source());
        }

        return results;
    }

    /**
     * Range query on a date field. The date strings have to match
     * the date format of the field mapping.
     */
    public static <T> List<T> dateRangeQuery(
            ElasticsearchClient client, String indexName,
            String field, String fromDate, String toDate, Class<T> clazz)
            throws IOException {

        SearchResponse<T> response = client.search(s -> s
            .index(indexName)
            .query(q -> q
                .range(r -> r
                    .field(field)
                    .gte(JsonData.of(fromDate))
                    .lte(JsonData.of(toDate))
                )
            ),
            clazz
        );

        List<T> results = new ArrayList<>();
        for (Hit<T> hit : response.hits().hits()) {
            results.add(hit.source());
        }

        return results;
    }
}

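3. Bool Query

Key points: Bool query

Combined query: combines several queries into one
Four clauses:
- must: has to match; contributes to the score (AND)
- should: optional match; contributes to the score (OR). When the bool query has no must or filter clause, at least one should clause has to match; otherwise should clauses only raise the score. minimum_should_match changes this
- must_not: must not match; does not affect the score (NOT)
- filter: has to match; does not affect the score (AND, usually faster)
Scoring rules:
- Queries in must and should contribute to the score
- Queries in filter and must_not do not contribute to the score (and are cheaper)
- The final score combines the scores of all matching must and should queries
Use case: complex multi-condition queries such as "title contains Java and view count is at least 100"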
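
The clause rules can be written down as a small evaluator. This is a simplified model for illustration only: the names BoolQuerySketch, Scored, and evaluate are hypothetical, every scoring clause contributes a fixed score instead of a BM25 value, and only the default minimum_should_match behavior is modelled.

```java
import java.util.List;
import java.util.OptionalDouble;
import java.util.function.Predicate;

class BoolQuerySketch {

    /** A scoring clause: a match test plus the (fixed) score it contributes when it matches. */
    static final class Scored<D> {
        final Predicate<D> matches;
        final double score;

        Scored(Predicate<D> matches, double score) {
            this.matches = matches;
            this.score = score;
        }
    }

    /** Returns the document's score, or empty if the document does not match the bool query. */
    static <D> OptionalDouble evaluate(D doc, List<Scored<D>> must, List<Scored<D>> should,
                                       List<Predicate<D>> mustNot, List<Predicate<D>> filter) {
        // filter and must_not decide yes/no only; they never touch the score
        for (Predicate<D> clause : filter) {
            if (!clause.test(doc)) return OptionalDouble.empty();
        }
        for (Predicate<D> clause : mustNot) {
            if (clause.test(doc)) return OptionalDouble.empty();
        }
        double score = 0.0;
        for (Scored<D> clause : must) {
            if (!clause.matches.test(doc)) return OptionalDouble.empty();
            score += clause.score;
        }
        int shouldMatched = 0;
        for (Scored<D> clause : should) {
            if (clause.matches.test(doc)) {
                shouldMatched++;
                score += clause.score;
            }
        }
        // Without must or filter clauses, at least one should clause has to match
        boolean shouldRequired = must.isEmpty() && filter.isEmpty() && !should.isEmpty();
        if (shouldRequired && shouldMatched == 0) return OptionalDouble.empty();
        return OptionalDouble.of(score);
    }
}
```

With a must clause for "java" (score 2.0) and a should clause for "guide" (score 0.5), the string "java guide" scores 2.5 and "java" alone still matches with 2.0, because the should clause is optional once a must clause is present.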

import co.elastic.clients.elasticsearch._types.query_dsl.Query;
import co.elastic.clients.json.JsonData;

public class SearchOperations {

    /**
     * Bool query (combined query). Null or empty clause lists are skipped.
     */
    public static <T> List<T> boolQuery(
            ElasticsearchClient client, String indexName,
            List<Query> mustQueries,
            List<Query> shouldQueries,
            List<Query> mustNotQueries,
            List<Query> filterQueries,
            Class<T> clazz) throws IOException {

        SearchResponse<T> response = client.search(s -> s
            .index(indexName)
            .query(q -> q
                .bool(b -> {
                    if (mustQueries != null && !mustQueries.isEmpty()) {
                        b.must(mustQueries);
                    }
                    if (shouldQueries != null && !shouldQueries.isEmpty()) {
                        b.should(shouldQueries);
                    }
                    if (mustNotQueries != null && !mustNotQueries.isEmpty()) {
                        b.mustNot(mustNotQueries);
                    }
                    if (filterQueries != null && !filterQueries.isEmpty()) {
                        b.filter(filterQueries);
                    }
                    return b;
                })
            ),
            clazz
        );

        List<T> results = new ArrayList<>();
        for (Hit<T> hit : response.hits().hits()) {
            results.add(hit.source());
        }

        return results;
    }

    /**
     * Example: title matches a keyword AND views >= minViews
     */
    public static <T> List<T> complexBoolQuery(
            ElasticsearchClient client, String indexName,
            String titleKeyword, Long minViews, Class<T> clazz) throws IOException {

        // must clause: full-text match on title, contributes to the score
        Query mustQuery = Query.of(q -> q
            .match(m -> m
                .field("title")
                .query(titleKeyword)
            )
        );

        // filter clause: range on views, yes/no only, no scoring
        Query filterQuery = Query.of(q -> q
            .range(r -> r
                .field("views")
                .gte(JsonData.of(minViews))
            )
        );

        SearchResponse<T> response = client.search(s -> s
            .index(indexName)
            .query(q -> q
                .bool(b -> b
                    .must(mustQuery)
                    .filter(filterQuery)
                )
            ),
            clazz
        );

        List<T> results = new ArrayList<>();
        for (Hit<T> hit : response.hits().hits()) {
            results.add(hit.source());
        }

        return results;
    }
}

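4. Pattern and Fuzzy Queries

4.1 Wildcard Query

Key points: Wildcard query

Wildcard match: pattern matching with * and ?
- *: matches zero or more characters
- ?: matches exactly one character
Performance: wildcard queries are slow, especially when the pattern starts with a wildcard
No analysis: the pattern is matched directly against the terms in the inverted index
Use cases:
- Prefix match (for example user*)
- Suffix match (for example *.log)
- Pattern match (for example user?)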
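
The meaning of the two wildcard characters can be checked locally by translating a pattern into a regular expression. This is a sketch of the matching semantics only (Elasticsearch does not evaluate wildcards this way internally), and the class WildcardSketch is hypothetical.

```java
import java.util.regex.Pattern;

class WildcardSketch {

    /** Translates a wildcard pattern into an equivalent regular expression. */
    static Pattern toRegex(String wildcard) {
        StringBuilder regex = new StringBuilder();
        for (char c : wildcard.toCharArray()) {
            if (c == '*') {
                regex.append(".*");      // zero or more characters
            } else if (c == '?') {
                regex.append('.');       // exactly one character
            } else {
                // Quote everything else so regex metacharacters such as '.' stay literal
                regex.append(Pattern.quote(String.valueOf(c)));
            }
        }
        return Pattern.compile(regex.toString(), Pattern.DOTALL);
    }

    /** True if the whole term matches the wildcard pattern. */
    static boolean matches(String wildcard, String term) {
        return toRegex(wildcard).matcher(term).matches();
    }
}
```

Note that the pattern has to cover the whole term: user? matches user1 but neither user nor user12.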

public class SearchOperations {

    /**
     * Wildcard query
     */
    public static <T> List<T> wildcardQuery(
            ElasticsearchClient client, String indexName,
            String field, String pattern, Class<T> clazz) throws IOException {

        SearchResponse<T> response = client.search(s -> s
            .index(indexName)
            .query(q -> q
                .wildcard(w -> w
                    .field(field)
                    .value(pattern) // for example "user*" or "*.log"
                )
            ),
            clazz
        );

        List<T> results = new ArrayList<>();
        for (Hit<T> hit : response.hits().hits()) {
            results.add(hit.source());
        }

        return results;
    }
}

Note: the wildcard characters are part of the pattern string passed to value().

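4.2 Fuzzy Query

Key points: Fuzzy query

Fuzzy match: tolerates small differences between the query term and the indexed term (typos, single-character differences, and so on)
Edit distance: the maximum number of character edits allowed
- AUTO: derived from the length of the term
- 0, 1, 2: fixed edit distance
Use cases:
- Typo-tolerant search (user input may contain spelling mistakes)
- Matching similar words
Note: fuzzy queries are expensive and are a poor fit for high-frequency queries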
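
As a rough illustration of how AUTO and the edit distance interact, the sketch below combines the default AUTO thresholds (terms of 1 or 2 characters must match exactly, 3 to 5 characters allow one edit, longer terms allow two) with a classic Levenshtein distance. The class FuzzySketch is hypothetical, and there is one deliberate simplification: Elasticsearch by default counts a swap of two adjacent characters as a single edit, while plain Levenshtein counts it as two.

```java
class FuzzySketch {

    /** Maximum edits allowed by fuzziness AUTO with its default thresholds (3 and 6). */
    static int autoFuzziness(String term) {
        int length = term.codePointCount(0, term.length());
        if (length <= 2) return 0;
        if (length <= 5) return 1;
        return 2;
    }

    /** Classic Levenshtein distance: insertions, deletions, and substitutions each cost 1. */
    static int editDistance(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j; // turning "" into b[0..j) takes j insertions
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i; // turning a[0..i) into "" takes i deletions
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(Math.min(curr[j - 1] + 1, prev[j] + 1), prev[j - 1] + cost);
            }
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    /** True if the indexed term is within the AUTO edit budget of the query term. */
    static boolean fuzzyMatches(String queryTerm, String indexedTerm) {
        return editDistance(queryTerm, indexedTerm) <= autoFuzziness(queryTerm);
    }
}
```

For example, the misspelling elastik is one substitution away from elastic and has seven characters, so it is well within the two edits that AUTO allows.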

public class SearchOperations {

    /**
     * Fuzzy query
     */
    public static <T> List<T> fuzzyQuery(
            ElasticsearchClient client, String indexName,
            String field, String value, Class<T> clazz) throws IOException {

        SearchResponse<T> response = client.search(s -> s
            .index(indexName)
            .query(q -> q
                .fuzzy(f -> f
                    .field(field)
                    .value(value)
                    .fuzziness("AUTO") // edit budget derived from the term length
                )
            ),
            clazz
        );

        List<T> results = new ArrayList<>();
        for (Hit<T> hit : response.hits().hits()) {
            results.add(hit.source());
        }

        return results;
    }
}

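5. Paginated Queries

Key points: pagination

from/size pagination:
- from: number of documents to skip (start offset)
- size: number of documents to return (page size)
- Defaults: from=0, size=10
The deep pagination problem:
- The larger from is, the slower the request
- Every shard has to produce from + size entries, and the coordinating node has to sort and merge all of them
- By default from + size may not exceed 10,000
Solutions:
- Small result sets: use from/size
- Large exports: use the Scroll API
- Real-time deep pagination: use Search After
Performance tips:
- Avoid deep pagination
- Use Search After instead of from/size
- Keep size reasonable (not too large)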
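
The arithmetic behind these points is simple enough to sketch. The helper below is hypothetical (PageWindowSketch is not part of any client library); it converts a 1-based page number into a from offset, checks the default 10,000 window, and estimates how many entries the coordinating node has to merge for one page.

```java
class PageWindowSketch {

    static final int MAX_RESULT_WINDOW = 10_000; // default value of index.max_result_window

    /** Converts a 1-based page number into the from offset. */
    static int fromOffset(int page, int size) {
        if (page < 1 || size < 1) {
            throw new IllegalArgumentException("page and size must be >= 1");
        }
        return Math.multiplyExact(page - 1, size);
    }

    /** True if the page can be served with from/size under the default window. */
    static boolean fitsResultWindow(int page, int size) {
        return (long) fromOffset(page, size) + size <= MAX_RESULT_WINDOW;
    }

    /** Each shard returns from + size entries, so the merge cost grows with depth and shard count. */
    static long entriesToMerge(int page, int size, int shardCount) {
        return (long) shardCount * ((long) fromOffset(page, size) + size);
    }
}
```

With 5 shards and a page size of 10, page 1 needs 50 entries merged while page 1000 needs 50,000, even though both return only 10 documents.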

public class SearchOperations {

    /**
     * Paginated query
     */
    public static <T> SearchResponse<T> searchWithPagination(
            ElasticsearchClient client, String indexName,
            Query query, int from, int size, Class<T> clazz) throws IOException {

        SearchResponse<T> response = client.search(s -> s
            .index(indexName)
            .query(query)
            .from(from) // number of hits to skip
            .size(size), // page size
            clazz
        );

        return response;
    }

    /**
     * Prints pagination information for a response
     */
    public static <T> void printPaginationInfo(SearchResponse<T> response) {
        // total() can be null when total-hit tracking is disabled for the request
        if (response.hits().total() != null) {
            System.out.println("Total hits: " + response.hits().total().value());
        }
        System.out.println("Hits on this page: " + response.hits().hits().size());
        System.out.println("Max score: " + response.hits().maxScore());
    }
}

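6. Sorting

Key points: sorting

Sort fields:
- Numeric fields: sorted by numeric value
- Date fields: sorted chronologically
- String fields: sorted lexicographically (keyword type)
- text fields: require fielddata (not recommended, it uses a lot of heap memory)
Sort types:
- _score: sort by relevance score (the default, descending)
- _doc: sort by index order (the fastest)
- Field sort: sort by field value
Multi-field sort: several sort fields can be given; they apply in order of priority
Performance impact:
- When you sort on a field, _score is not computed unless track_scores is enabled
- Sort fields should be keyword or numeric types
- Avoid sorting on text fields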

import co.elastic.clients.elasticsearch._types.SortOrder;

public class SearchOperations {

    /**
     * Usage example for searchWithSort: term query + ascending sort by creation time.
     * Product is the application's own document class (assumed to exist elsewhere),
     * and client is an already initialized ElasticsearchClient.
     */
    public static void sortExample(ElasticsearchClient client) throws IOException {
        System.out.println("\n=== Scenario: term query + createTime ascending ===");
        // Term query: exact match on products whose category is "数码" (digital products)
        Query termQuery = Query.of(q -> q
                .term(t -> t
                        .field("category")
                        .value("数码")
                )
        );
        // SortOrder has exactly two values: Asc and Desc
        List<Product> timeAscProducts = searchWithSort(
                client,          // the client
                "product_index", // index name
                termQuery,       // the query built above
                "createTime",    // field name to sort on
                SortOrder.Asc,   // sort direction: ascending
                Product.class
        );
        System.out.println("Hits: " + timeAscProducts.size());
    }

    /**
     * Query with a single sort field
     */
    public static <T> List<T> searchWithSort(
            ElasticsearchClient client, String indexName,
            Query query, String sortField, SortOrder order, Class<T> clazz)
            throws IOException {

        SearchResponse<T> response = client.search(s -> s
            .index(indexName)
            .query(query)
            .sort(so -> so
                .field(f -> f
                    .field(sortField)
                    .order(order)
                )
            ),
            clazz
        );

        List<T> results = new ArrayList<>();
        for (Hit<T> hit : response.hits().hits()) {
            results.add(hit.source());
        }

        return results;
    }

    /**
     * Multi-field sort
     */
    public static <T> List<T> searchWithMultiSort(
            ElasticsearchClient client, String indexName,
            Query query, Class<T> clazz) throws IOException {

        // Sort clauses apply strictly in the order of the sort(...) calls:
        // 1. all documents are ordered by views, highest first;
        // 2. only documents with identical views are then ordered by createTime,
        //    newest first.
        SearchResponse<T> response = client.search(s -> s
            .index(indexName)
            .query(query)
            .sort(so -> so
                .field(f -> f
                    .field("views")
                    .order(SortOrder.Desc)
                )
            )
            .sort(so -> so
                .field(f -> f
                    .field("createTime")
                    .order(SortOrder.Desc)
                )
            ),
            clazz
        );

        List<T> results = new ArrayList<>();
        for (Hit<T> hit : response.hits().hits()) {
            results.add(hit.source());
        }

        return results;
    }
}

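7. Highlighting

Key points: highlighting

Purpose: marks the matched text fragments in the search results
Highlight tags:
- Matched text is wrapped in <em> tags by default
- Tags and styling can be customized
Highlight parameters:
- fragment_size: fragment size in characters (default 100)
- number_of_fragments: maximum number of fragments returned (default 5)
- pre_tags/post_tags: custom highlight tags
Highlighter types:
- unified (default): the unified highlighter, suitable for most fields
- plain: the plain highlighter
- fvh: the fast vector highlighter (requires term_vector set to with_positions_offsets in the mapping)
Use case: search result pages that emphasize the matched keywords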

import co.elastic.clients.elasticsearch.core.search.Highlight;
import co.elastic.clients.elasticsearch.core.search.HighlightField;

public class SearchOperations {

    /**
     * Query with highlighting
     */
    public static <T> SearchResponse<T> searchWithHighlight(
            ElasticsearchClient client, String indexName,
            Query query, String[] highlightFields, Class<T> clazz) throws IOException {

        SearchResponse<T> response = client.search(s -> s
            .index(indexName)
            .query(query)
            .highlight(h -> {
                // Register every requested field with the default highlight settings
                for (String field : highlightFields) {
                    h.fields(field, HighlightField.of(hf -> hf));
                }
                return h;
            }),
            clazz
        );

        return response;
    }

    /**
     * Prints the highlight fragments of every hit
     */
    public static <T> void printHighlightResults(SearchResponse<T> response) {
        for (Hit<T> hit : response.hits().hits()) {
            System.out.println("Document ID: " + hit.id());
            if (hit.highlight() != null) {
                hit.highlight().forEach((field, fragments) -> {
                    System.out.println("Field " + field + " highlights:");
                    fragments.forEach(fragment ->
                        System.out.println("  - " + fragment)
                    );
                });
            }
        }
    }
}

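Notes on the highlight configuration. The core of the highlight setup in the method above is:

.highlight(h -> {
    for (String field : highlightFields) {
        h.fields(field, HighlightField.of(hf -> hf));
    }
    return h;
})

.highlight(h -> {...}): turns highlighting on and opens the highlight configuration.
h.fields(field, HighlightField.of(hf -> hf)): applies the default highlight settings to each field in highlightFields:
Default highlight tags: <em> (a matched keyword is wrapped in <em>, for example <em>tutorial</em>)
Default analyzer: the analyzer defined in the field mapping
Extension: to use custom highlight tags (such as <span>), configure them on HighlightField:

h.fields(field, HighlightField.of(hf -> hf
    .preTags("<span>")   // tag inserted before each match
    .postTags("</span>") // tag inserted after each match
));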

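8. Aggregations

Key points: aggregations

Purpose: group, summarize, and analyze data
Aggregation types:
- Metric aggregations: compute statistics (average, maximum, minimum, sum, and so on)
- Bucket aggregations: group documents into buckets
- Pipeline aggregations: aggregate the output of other aggregations
Characteristics:
- Aggregation results are returned alongside the search hits
- Set size: 0 to return only the aggregation results and no documents
- The results are in the aggregations field of the response
Performance considerations:
- Aggregations can use a lot of memory and CPU
- Aggregating over large data sets can be slow
- The size parameter of a bucket aggregation limits the number of buckets returned

8.1 Stats Aggregation

Key points: Stats aggregation

Function: computes summary statistics for a numeric field
Result:
- count: number of values
- min: minimum
- max: maximum
- avg: average
- sum: sum
Use case: statistical analysis of numeric fields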
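
The five values of a stats aggregation correspond closely to what the JDK's DoubleSummaryStatistics computes, which makes for a convenient local sanity check. This is a plain-Java analogy rather than client code, and the class StatsSketch is hypothetical.

```java
import java.util.DoubleSummaryStatistics;
import java.util.List;

class StatsSketch {

    /** Computes count, min, max, average, and sum over the given field values in one pass. */
    static DoubleSummaryStatistics stats(List<Double> values) {
        return values.stream()
            .mapToDouble(Double::doubleValue)
            .summaryStatistics();
    }
}
```

For the values 10, 20, and 30, getCount() is 3, getMin() is 10.0, getMax() is 30.0, getAverage() is 20.0, and getSum() is 60.0.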

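import co.elastic.clients.elasticsearch._types.aggregations.Aggregate;
import co.elastic.clients.elasticsearch._types.aggregations.StatsAggregate;

public class SearchOperations {

    /**
     * Stats aggregation (count, minimum, maximum, average, sum)
     */
    public static void statsAggregation(
            ElasticsearchClient client, String indexName, String field)
            throws IOException {

        // Aggregation name: "stats_" + field name (for example stats_price),
        // so that results for different fields can be told apart
        String aggName = "stats_" + field;

        SearchResponse<Object> response = client.search(s -> s
            .index(indexName)
            .size(0) // aggregations rarely need the hits; size(0) skips fetching and serializing documents
            .aggregations(aggName, a -> a
                .stats(st -> st.field(field))
            ),
            Object.class
        );

        // Look the result up by name and read it as a stats aggregate
        Aggregate aggregate = response.aggregations().get(aggName);
        if (aggregate != null) {
            StatsAggregate stats = aggregate.stats();
            System.out.println("count=" + stats.count() + ", min=" + stats.min()
                + ", max=" + stats.max() + ", avg=" + stats.avg() + ", sum=" + stats.sum());
        }
    }
}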

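8.2 Terms Aggregation

Key points: Terms aggregation

Function: groups documents by field value and counts them, similar to SQL GROUP BY
Result: one bucket per distinct value, with its document count and related information
Parameters:
- field: the field to aggregate on (a keyword or numeric field; a text field only works with fielddata enabled)
- size: number of buckets to return (default 10)
- order: sort order of the buckets (by document count or by key)
Use cases:
- Tag statistics (for example, number of articles per tag)
- Category statistics (for example, number of products per category)
- Top N queries (for example, the 10 most popular tags)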
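
Conceptually a terms aggregation is a group-and-count followed by a top-N cut. The in-memory sketch below shows that shape; the class TermsAggSketch is hypothetical, and the ordering (document count descending, ties broken by key ascending) is assumed to mirror the default bucket order.

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class TermsAggSketch {

    /** Counts each distinct value and returns the size most frequent ones. */
    static List<Map.Entry<String, Long>> topTerms(List<String> values, int size) {
        // Bucket step: one bucket per distinct value, holding its document count
        Map<String, Long> counts = values.stream()
            .collect(Collectors.groupingBy(v -> v, Collectors.counting()));

        // Order step: highest count first, then key ascending, then keep the top N
        return counts.entrySet().stream()
            .sorted((a, b) -> {
                int byCount = Long.compare(b.getValue(), a.getValue());
                return byCount != 0 ? byCount : a.getKey().compareTo(b.getKey());
            })
            .limit(size)
            .collect(Collectors.toList());
    }
}
```

One difference to keep in mind: this sketch counts exactly, whereas on a multi-shard index the counts of a terms aggregation can be approximate because each shard only reports its own top buckets.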

public class SearchOperations {

    /**
     * Terms aggregation (group and count)
     */
    public static void termsAggregation(
            ElasticsearchClient client, String indexName, String field, int size)
            throws IOException {

        SearchResponse<Object> response = client.search(s -> s
            .index(indexName)
            .size(0) // return only the aggregation, no hits
            .aggregations("terms_" + field, a -> a
                .terms(t -> t
                    .field(field)
                    .size(size) // number of buckets to return
                )
            ),
            Object.class
        );

        System.out.println("Aggregation result: " + response.aggregations());
    }
}

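8.3 Date Histogram Aggregation

Key points: Date Histogram aggregation

Function: groups documents into time intervals and counts them, producing time-series data
Intervals:
- calendar_interval: calendar-aware interval (minute, hour, day, week, month, quarter, year)
- fixed_interval: fixed-length interval (for example 1d, 2h, 30m)
Use cases:
- Time-series analysis (for example, articles per day)
- Trend analysis (for example, sales per month)
- Reporting (for example, visits per week)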
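
A calendar interval of one month can be imitated in plain Java by truncating each date to its YearMonth and counting. The class DateHistogramSketch is hypothetical and the sketch covers only the bucketing idea; unlike a typical date histogram response, it produces no empty buckets for months without documents.

```java
import java.time.LocalDate;
import java.time.YearMonth;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

class DateHistogramSketch {

    /** Buckets the dates by calendar month and counts the entries per bucket, in time order. */
    static Map<YearMonth, Long> countByMonth(List<LocalDate> dates) {
        return dates.stream().collect(Collectors.groupingBy(
            YearMonth::from,          // bucket key: the calendar month containing the date
            TreeMap::new,             // sorted map keeps buckets in chronological order
            Collectors.counting()));  // bucket value: number of documents
    }
}
```

Calendar months differ in length, which is exactly the difference between calendar_interval and fixed_interval: a fixed 30d interval would drift against month boundaries.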

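import co.elastic.clients.elasticsearch._types.aggregations.CalendarInterval;

public class SearchOperations {

    /**
     * Date histogram aggregation.
     * The interval is passed as the client's CalendarInterval enum
     * (for example CalendarInterval.Day or CalendarInterval.Month).
     */
    public static void dateHistogramAggregation(
            ElasticsearchClient client, String indexName, String dateField,
            CalendarInterval interval) throws IOException {

        SearchResponse<Object> response = client.search(s -> s
            .index(indexName)
            .size(0) // return only the aggregation, no hits
            .aggregations("date_histogram", a -> a
                .dateHistogram(dh -> dh
                    .field(dateField)
                    .calendarInterval(interval)
                )
            ),
            Object.class
        );

        System.out.println("Aggregation result: " + response.aggregations());
    }
}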

9. Searching Multiple Indices

import java.util.Arrays;

public class SearchOperations {

    /**
     * Runs one query against several indices
     */
    public static <T> List<T> searchMultipleIndices(
            ElasticsearchClient client, String[] indexNames,
            Query query, Class<T> clazz) throws IOException {

        SearchResponse<T> response = client.search(s -> s
            .index(Arrays.asList(indexNames)) // all indices are searched in one request
            .query(query),
            clazz
        );

        List<T> results = new ArrayList<>();
        for (Hit<T> hit : response.hits().hits()) {
            results.add(hit.source());
        }

        return results;
    }
}

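10. Scroll Search

Key points: Scroll API

Purpose: reads a large result set efficiently, similar to a database cursor
How it works:
1. The first request opens a search context that acts as a snapshot, so all batches see the same consistent view of the data
2. The response contains a scroll_id for the follow-up requests
3. Each request returns one batch of documents
4. The scroll context is cleared once the data has been read
Use cases:
- Data export (exporting all documents)
- Batch processing (processing large numbers of documents)
- Data migration
Notes:
- An open scroll holds a snapshot and therefore holds resources
- Clear the scroll context as soon as you are done
- Not suitable for real-time queries (the data may not be current)
- For deep pagination, recent Elasticsearch versions recommend Search After combined with a point in time (PIT) instead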

import co.elastic.clients.elasticsearch.core.ScrollResponse;

public class SearchOperations {

    /**
     * Scroll search (reads a large result set in batches)
     */
    public static <T> void scrollSearch(
            ElasticsearchClient client, String indexName,
            Query query, Class<T> clazz) throws IOException {

        // Initial search request; keeps the scroll context alive for 1 minute
        SearchResponse<T> response = client.search(s -> s
            .index(indexName)
            .query(query)
            .scroll(sc -> sc.time("1m"))
            .size(100),
            clazz
        );

        String scrollId = response.scrollId();
        List<Hit<T>> batch = response.hits().hits();
        List<T> allResults = new ArrayList<>();

        // Keep scrolling until a batch comes back empty
        while (!batch.isEmpty()) {
            for (Hit<T> hit : batch) {
                allResults.add(hit.source());
            }

            // A lambda can only capture an effectively final variable
            final String currentScrollId = scrollId;
            ScrollResponse<T> scrollResponse = client.scroll(s -> s
                .scrollId(currentScrollId)
                .scroll(sc -> sc.time("1m")),
                clazz
            );

            scrollId = scrollResponse.scrollId();
            batch = scrollResponse.hits().hits();
        }

        // Clear the scroll context to release its resources
        final String lastScrollId = scrollId;
        client.clearScroll(c -> c.scrollId(lastScrollId));

        System.out.println("Fetched " + allResults.size() + " records in total");
    }
}