Elasticsearch: A Practical Guide to Mappings

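Dynamic mapping is convenient in development environments, but disable it on production-grade clusters. Set dynamic to “strict” to enforce a strict schema on the fields of an index. For a detailed description of dynamic mapping, see the article “Elasticsearch：Dynamic mapping”.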



PUT /twitter
{
  "mappings": {
    "dynamic": "strict",
    "properties": {
      "subscriptionName": {
        "type": "text"
      }
    }
  }
}
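
To make the effect of the strict setting concrete, here is a minimal sketch of a request against the twitter index defined above. The field planTier is a hypothetical name chosen for illustration; it is not part of the mapping, so Elasticsearch should reject the document (typically with a strict_dynamic_mapping_exception) instead of silently adding a new field.

```console
# planTier is a hypothetical, unmapped field used only to trigger the strict check
PUT /twitter/_doc/1
{
  "subscriptionName": "Monthly",
  "planTier": "gold"
}
```

Removing planTier from the body, or adding it to the mapping first, lets the same request succeed.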

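When an index stores a large number of documents, optimizing the field mappings at definition time can save disk space.

Disable coercion (coerce). Coercion gives you freedom but no discipline, and that leads to chaos. For more on coerce, see the article “Elasticsearch：Elasticsearch 中的数据强制匹配”.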



PUT /product/_doc/1
{
  "price": 890.90
}

PUT /product/_doc/2
{
  "price": "890.90"
}

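The first request succeeds. The second one succeeds as well: although the value is given as a string, Elasticsearch sees a floating-point number inside the quotes and indexes it as a float.

Next, index another document with the following command: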



PUT /product/_doc/3
{
  "price": "890.90m"
}

The command above returns:



{
  "error": {
    "root_cause": [
      {
        "type": "document_parsing_exception",
        "reason": "[2:12] failed to parse field [price] of type [float] in document with id '3'. Preview of field's value: '890.90m'"
      }
    ],
    "type": "document_parsing_exception",
    "reason": "[2:12] failed to parse field [price] of type [float] in document with id '3'. Preview of field's value: '890.90m'",
    "caused_by": {
      "type": "number_format_exception",
      "reason": "For input string: \"890.90m\""
    }
  },
  "status": 400
}

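This fails because price is indexed internally as a float, and “890.90m” cannot be parsed as a number.

Retrieve document 2 with the following command: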

GET /product/_doc/2

The command above returns:



{
  "_index": "product",
  "_id": "2",
  "_version": 1,
  "_seq_no": 1,
  "_primary_term": 1,
  "found": true,
  "_source": {
    "price": "890.90"
  }
}

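The output shows that the price value in _source is still stored as a string: coercion changes what is indexed, not the original document.

We can run the following query: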



GET product/_search?filter_path=**.hits
{
  "query": {
    "range": {
      "price": {
        "gte": 890.90
      }
    }
  }
}

The command above returns:



{
  "hits": {
    "hits": [
      {
        "_index": "product",
        "_id": "1",
        "_score": 1,
        "_source": {
          "price": 890.9
        }
      },
      {
        "_index": "product",
        "_id": "2",
        "_score": 1,
        "_source": {
          "price": "890.90"
        }
      }
    ]
  }
}

The output shows that the query still matches document 2, even though its price is the string “890.90”. Now consider the following query:



GET product/_search
{
  "query": {
    "match": {
      "prce": "890.90"
    }
  }
}

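The query above returns no results. Note that the request targets prce rather than price, a field that does not exist in the index. Because price is mapped as float, its values are indexed as numbers and are not analyzed as text.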
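
The examples above rely on the default behavior, where coercion is enabled. To follow the earlier advice and turn it off, the coerce mapping parameter can be set to false on the numeric field. The following is a minimal sketch, assuming a separate, hypothetical index named product_strict so that it does not clash with the product index used above.

```console
# product_strict is a hypothetical index name used only for this sketch
PUT /product_strict
{
  "mappings": {
    "properties": {
      "price": {
        "type": "float",
        "coerce": false
      }
    }
  }
}

# With coerce disabled, a quoted number should be rejected with a parsing error
PUT /product_strict/_doc/1
{
  "price": "890.90"
}
```

With this mapping, only genuine JSON numbers are accepted for price, so malformed client data surfaces at index time rather than being silently converted.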

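Pay attention to numeric data types. Use float where its precision is sufficient, because double takes more space. Likewise, use integer where its range is sufficient, because long takes more space.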
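
Explicit numeric types matter because dynamic mapping picks long for whole numbers it detects in JSON, which is wider than many fields need. A minimal sketch of an explicit mapping, where the index name inventory and the field quantity are hypothetical names chosen for illustration:

```console
# inventory and quantity are hypothetical names used only for this sketch
PUT /inventory
{
  "mappings": {
    "properties": {
      "price": {
        "type": "float"
      },
      "quantity": {
        "type": "integer"
      }
    }
  }
}
```

Here float and integer are 32-bit types, while double and long are 64-bit, so the narrower types are the better choice whenever the expected values fit.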

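For string fields, you do not need to map both text and keyword by default, since multi-field mappings take extra space. Use keyword only when you need aggregations, sorting, or exact-match filtering, so that the value is not tokenized. Use text for full-text search. Map both only when you actually need both.

Multi-field mapping: adding a keyword sub-field to a text field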



PUT /subscriptions
{
  "mappings": {
    "properties": {
      "description": {
        "type": "text"
      },
      "subscriptionsName": {
        "type": "text",
        "fields": {
          "keyword": {
            "type": "keyword"
          }
        }
      }
    }
  }
}

Index a document:



POST /subscriptions/_doc
{
  "description": "Detailed subs...",
  "subscriptionsName": ["Monthly", "Weekly", "Quarterly"]
}

Query the documents:



GET /subscriptions/_search
{
  "query": {
    "match_all": {}
  }
}

Query the text field:



GET /subscriptions/_search
{
  "query": {
    "match": {
      "subscriptionsName": "Monthly"
    }
  }
}

Query the keyword sub-field:



GET /subscriptions/_search
{
  "query": {
    "term": {
      "subscriptionsName.keyword": "Monthly"
    }
  }
}
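
The keyword sub-field is also what makes aggregations and sorting possible on this data. As a minimal sketch against the subscriptions index defined above, a terms aggregation can count documents per subscription name; the aggregation name by_name is an arbitrary label chosen here.

```console
# by_name is an arbitrary aggregation label; size 0 suppresses the hit list
GET /subscriptions/_search
{
  "size": 0,
  "aggs": {
    "by_name": {
      "terms": {
        "field": "subscriptionsName.keyword"
      }
    }
  }
}
```

Running the same aggregation on the text field subscriptionsName itself would normally be rejected, which is why the keyword sub-field is worth its extra space only when this kind of request is needed.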

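For a comparison of keyword and text queries, see the article “Elasticsearch：Text vs. Keyword - 它们之间的差异以及它们的行为方式”.

When you do not need sorting, aggregations, or filtering on a field, set doc_values to false.

When relevance scoring is not needed, set norms to false.

When you do not need to filter on a field's values, set index to false (aggregations are still possible, for example on time-series data).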
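
The three recommendations above can be sketched in a single mapping. This is only an illustration: the index name metrics_example and the fields sessionId, message, and reading are hypothetical, and each field shows one parameter in isolation.

```console
# metrics_example and all field names are hypothetical, chosen for illustration
PUT /metrics_example
{
  "mappings": {
    "properties": {
      "sessionId": {
        "type": "keyword",
        "doc_values": false
      },
      "message": {
        "type": "text",
        "norms": false
      },
      "reading": {
        "type": "float",
        "index": false
      }
    }
  }
}
```

In this sketch sessionId stays searchable but gives up sorting and aggregations, message stays searchable without length-based score normalization, and reading can still be aggregated although it is not indexed for search.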