Let's build a writing habit together! This is day 22 of my participation in the Juejin "Daily New Plan · April Writing Challenge".

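This chapter introduces some basic Elasticsearch concepts. These are must-know fundamentals.

ES basic concepts

Default shards: 5 primaries with 1 replica each -> 1 primary with 1 replica (newer versions, 7.0 and later)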

Routing a write to a shard: hash(_id) % number of primary shards; the write is then synced to the replica shards
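
The routing rule above can be sketched in a few lines of Python. This is only an illustration: `pick_shard` is a hypothetical helper, and CRC32 is used as a stand-in for the hash function Elasticsearch actually uses, so the shard numbers will not match a real cluster.

```python
import zlib


def pick_shard(doc_id: str, num_primary_shards: int) -> int:
    """Map a document id to a primary shard number (simplified sketch)."""
    # Assumption: CRC32 stands in for Elasticsearch's real hash function;
    # only the "hash the id, then take the modulo" idea is illustrated.
    return zlib.crc32(doc_id.encode("utf-8")) % num_primary_shards


for doc_id in ("user-1", "user-2", "user-3"):
    print(doc_id, "->", pick_shard(doc_id, 5))
```

Because the result depends on the number of primary shards, the same id would land on a different shard if that number changed, which is why the primary shard count is fixed when the index is created.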

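Near-real-time search: documents go into an in-memory buffer and are recorded in the translog (used for recovery) -> refreshed into a segment in the filesystem cache (every 1 second, searchable from then on) -> flushed to disk (every 30 minutes). Segments are merged into larger files in the background; a deleted document is only marked as deleted and is filtered out of search results until a merge physically removes it.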
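
To make the buffer -> refresh -> flush sequence concrete, here is a toy model in Python. `ToyShard` and its methods are hypothetical names for this sketch; it mimics only the visibility rules described above, not the real storage engine.

```python
class ToyShard:
    """Toy model of the write path: buffer -> refresh -> flush."""

    def __init__(self):
        self.buffer = []      # in-memory indexing buffer, not yet searchable
        self.translog = []    # write-ahead log used for crash recovery
        self.segments = []    # segments in the filesystem cache, searchable
        self.committed = 0    # number of segments persisted to disk

    def index(self, doc):
        # Every write goes to both the buffer and the translog.
        self.buffer.append(doc)
        self.translog.append(doc)

    def refresh(self):
        # Runs about once per second: the buffer becomes a new segment.
        if self.buffer:
            self.segments.append(list(self.buffer))
            self.buffer.clear()

    def flush(self):
        # Periodic commit: segments are persisted and the translog is cleared.
        self.refresh()
        self.committed = len(self.segments)
        self.translog.clear()

    def search(self, doc):
        # Only refreshed segments are visible to search.
        return any(doc in segment for segment in self.segments)


shard = ToyShard()
shard.index("doc-1")
print(shard.search("doc-1"))  # False: still only in the buffer
shard.refresh()
print(shard.search("doc-1"))  # True: visible after the refresh
```

The gap between `index` and `refresh` is the reason search is called near-real-time rather than real-time.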

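query: the query clause of a search request

bool: compound query that combines other clauses

must: the clause has to match and contributes to the score

filter: the clause has to match but is not scored, so it is more efficient

match: full-text query; the input is analyzed (tokenized) first

term: exact-value query; the input is not analyzed

highlight: highlights matches; the surrounding prefix and suffix tags can be specified

range: lte/gte correspond to the from/to bounds

The query and value keys under a field can be omitted (shorthand form)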
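
As an illustration of how these clauses fit together, the sketch below builds one search request body as a Python dict. The field names (`title`, `status`, `views`) and their values are made up for the example.

```python
import json

# Hypothetical fields: title (text), status (keyword), views (integer).
query_body = {
    "query": {
        "bool": {
            # must: scored full-text match, written in shorthand form
            # (equivalent to {"match": {"title": {"query": "..."}}}).
            "must": [{"match": {"title": "elasticsearch basics"}}],
            # filter: unscored exact-value and range conditions.
            "filter": [
                {"term": {"status": "published"}},
                {"range": {"views": {"gte": 10, "lte": 1000}}},
            ],
        }
    },
    # highlight: wrap matched terms in the given prefix/suffix tags.
    "highlight": {
        "pre_tags": ["<em>"],
        "post_tags": ["</em>"],
        "fields": {"title": {}},
    },
}

print(json.dumps(query_body, indent=2))
```

The printed JSON can be sent as the body of a `_search` request.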

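Coordinating node: does not take part in master election and stores no data; it only forwards requests and merges the results

Voting-only node: only takes part in master election and stores no data

Error analysis

Error message: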

{
    "root_cause": [
        {
            "type": "illegal_argument_exception",
            "reason": "Fielddata is disabled on text fields by default. Set fielddata=true on [path] in order to load fielddata in memory by uninverting the inverted index. Note that this can however use significant memory. Alternatively use a keyword field instead."
        }
    ]
}

Set fielddata=true on [xxxx] ......

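Root-cause analysis:

By default, Elasticsearch disables fielddata on text fields. A text field is analyzed (tokenized) at index time, so the inverted index holds tokens rather than the original value, while aggregations need per-document field values. To aggregate on a text field you can set fielddata=true in the mapping, which loads the values into memory by uninverting the inverted index (that is, building a forward index). This can use a lot of memory, and the aggregation then runs on the individual tokens rather than the original string. As the error message itself suggests, aggregating on a keyword field is usually the better choice.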
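
A common way to follow the "use a keyword field instead" advice is a multi-field mapping. The sketch below shows the two request bodies as Python dicts; the field name `path` comes from the error above, while the aggregation name `by_path`, the `ignore_above` value, and the typeless 7.x-style mapping layout are assumptions for the example.

```python
import json

# Mapping: `path` stays full-text searchable, while the `path.keyword`
# sub-field keeps the original, unanalyzed value.
mapping = {
    "mappings": {
        "properties": {
            "path": {
                "type": "text",
                "fields": {
                    "keyword": {"type": "keyword", "ignore_above": 256}
                },
            }
        }
    }
}

# Aggregate on the keyword sub-field instead of the analyzed text field.
agg_request = {
    "size": 0,  # return only aggregation buckets, no hits
    "aggs": {"by_path": {"terms": {"field": "path.keyword"}}},
}

print(json.dumps(mapping, indent=2))
print(json.dumps(agg_request, indent=2))
```

With this layout there is no need to enable fielddata, because keyword fields support aggregations without it.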

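Why ES uses an odd number of nodes

Saves resources (an odd-sized cluster tolerates as many node failures as the next larger even size, so choose the odd number and avoid wasting a machine)
Prevents split brain (set the number of votes needed to elect a master to a majority of the master-eligible nodes, nodes / 2 + 1, so that more than one master can never be elected). Before version 7.0 this is configured in elasticsearch.yml: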

# Prevent the "split brain" by configuring the majority of nodes (total number of nodes / 2 + 1):
#
# discovery.zen.minimum_master_nodes: 3
#
# For more information, see the documentation at:
# <http://www.elastic.co/guide/en/elasticsearch/reference/current/modules-discovery.html>
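
The arithmetic behind both points can be checked with a short Python sketch. The helper names `quorum` and `tolerated_failures` are hypothetical.

```python
def quorum(master_eligible_nodes: int) -> int:
    """Majority of master-eligible nodes: nodes / 2 + 1 (integer division)."""
    return master_eligible_nodes // 2 + 1


def tolerated_failures(master_eligible_nodes: int) -> int:
    """How many nodes can be lost while a majority is still reachable."""
    return master_eligible_nodes - quorum(master_eligible_nodes)


for n in range(3, 7):
    print(f"nodes={n} quorum={quorum(n)} tolerated_failures={tolerated_failures(n)}")
# nodes=3 quorum=2 tolerated_failures=1
# nodes=4 quorum=3 tolerated_failures=1
# nodes=5 quorum=3 tolerated_failures=2
# nodes=6 quorum=4 tolerated_failures=2
```

Going from 3 to 4 nodes (or from 5 to 6) raises the quorum without raising the number of failures the cluster survives, which is why the extra even node is wasted from a fault-tolerance point of view.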

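Summary

ES is a typical clustered system and a component used in most enterprise systems, so once you mention it you will almost certainly be asked about how ES clusters work: why ES queries are fast, how analyzers are used, what an inverted index is, what a shard is, how to design an index properly, and so on. We need to know the common ES questions inside out and be fluent with the REST APIs, so that troubleshooting ES problems on a server also becomes much easier.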
