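- Controls whether normalization factors, mainly field-length information, are stored for scoring. Enabled (true) by default for text fields; keyword fields default to false.
- Norms cost roughly one byte per document per field, and even documents that do not contain the field take up some storage.
- If a field is used only for filtering and aggregations, norms can be disabled.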
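Taking field length into account makes relevance scores more accurate, but it consumes a lot of disk space (one byte per field per document, even for documents that do not contain the field). So if you do not need to score on a field, you should disable norms for it, especially when the field is used only for aggregations or filtering.
When results are ranked by relevance, the length of the text takes part in the score calculation if norms is set to true.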
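To get a feel for the disk cost described above, here is a rough back-of-the-envelope sketch. It only applies the one-byte-per-document-per-field rule of thumb. The helper name `norms_bytes` and the index size (100 million documents, 20 fields with norms) are hypothetical, and real on-disk usage depends on the Lucene version and its encoding.

```python
def norms_bytes(doc_count: int, fields_with_norms: int) -> int:
    """Rough norms footprint: one byte per document per field with norms."""
    return doc_count * fields_with_norms


# Hypothetical index: 100 million documents, 20 fields with norms enabled.
estimate = norms_bytes(100_000_000, 20)
print(f"about {estimate / 1024**3:.2f} GiB of norms")
```

Disabling norms on the fields that are only filtered or aggregated shrinks `fields_with_norms`, and the estimate shrinks in proportion.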
1. norms set to false
1.1 Create the index
PUT people
{
  "mappings": {
    "properties": {
      "name": {
        "type": "text",
        "norms": false
      }
    }
  }
}
1.2 Insert data
POST _bulk
{"index": {"_index": "people", "_id": "1"}}
{"name": "张三是一个人"}
{"index": {"_index": "people", "_id": "2"}}
{"name": "张三是一个大好人"}
{"index": {"_index": "people", "_id": "3"}}
{"name": "张三"}
1.3 Query the data
1.3.1 Query
GET people/_search
{
  "query": {
    "match": {
      "name": "张三"
    }
  }
}
1.3.2 Result
{
  "took" : 4,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 3,
      "relation" : "eq"
    },
    "max_score" : 0.40002596,
    "hits" : [
      {
        "_index" : "people",
        "_type" : "_doc",
        "_id" : "1",
        "_score" : 0.40002596,
        "_source" : {
          "name" : "张三是一个人"
        }
      },
      {
        "_index" : "people",
        "_type" : "_doc",
        "_id" : "2",
        "_score" : 0.40002596,
        "_source" : {
          "name" : "张三是一个大好人"
        }
      },
      {
        "_index" : "people",
        "_type" : "_doc",
        "_id" : "3",
        "_score" : 0.40002596,
        "_source" : {
          "name" : "张三"
        }
      }
    ]
  }
}
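As the result shows, with norms set to false every hit for the query "张三" gets the same score, because field length is not taken into account.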
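2. norms set to true
2.1 Create the index (if the people index from section 1 still exists, remove it first with DELETE people)
PUT people
{
  "mappings": {
    "properties": {
      "name": {
        "type": "text",
        "norms": true
      }
    }
  }
}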
2.2 Insert data
POST _bulk
{"index": {"_index": "people", "_id": "1"}}
{"name": "张三是一个人"}
{"index": {"_index": "people", "_id": "2"}}
{"name": "张三是一个大好人"}
{"index": {"_index": "people", "_id": "3"}}
{"name": "张三"}
2.3 Query the data
2.3.1 Query
GET people/_search
{
  "query": {
    "match": {
      "name": "张三"
    }
  }
}
2.3.2 Result
{
  "took" : 11,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 3,
      "relation" : "eq"
    },
    "max_score" : 0.3588019,
    "hits" : [
      {
        "_index" : "people",
        "_type" : "_doc",
        "_id" : "3",
        "_score" : 0.3588019,
        "_source" : {
          "name" : "张三"
        }
      },
      {
        "_index" : "people",
        "_type" : "_doc",
        "_id" : "1",
        "_score" : 0.25407052,
        "_source" : {
          "name" : "张三是一个人"
        }
      },
      {
        "_index" : "people",
        "_type" : "_doc",
        "_id" : "2",
        "_score" : 0.22171247,
        "_source" : {
          "name" : "张三是一个大好人"
        }
      }
    ]
  }
}
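With norms set to true, the shorter the text, the higher the score, and so the earlier the document ranks. This assumes, as in the texts above, that "张三" appears exactly once in each document.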
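The two sets of scores can be reproduced by hand, which makes the effect of norms concrete. The following is a minimal sketch, not Elasticsearch's actual code. It assumes the default BM25 parameters (k1 = 1.2, b = 0.75), a single shard, an analyzer that emits one token per Chinese character (so the query becomes the terms 张 and 三), and the (k1 + 1) factor that the Elasticsearch versions returning `_type` in responses still apply. It also assumes that a field without norms is scored as if its length were 1; the source does not state this, but the numbers fit it. The function names are illustrative only.

```python
import math

K1, B = 1.2, 0.75  # assumed BM25 defaults; the source does not state them


def idf(doc_count: int, doc_freq: int) -> float:
    """Inverse document frequency in the BM25 form."""
    return math.log(1 + (doc_count - doc_freq + 0.5) / (doc_freq + 0.5))


def score(docs: dict, terms: list, norms: bool) -> dict:
    """Score every document for the query terms, with or without norms."""
    # One token per character, so field length equals the string length.
    lengths = {doc_id: len(text) for doc_id, text in docs.items()}
    avg_len = sum(lengths.values()) / len(docs)
    scores = {}
    for doc_id, text in docs.items():
        # Assumption: without norms the field length is treated as 1.
        field_len = lengths[doc_id] if norms else 1
        length_norm = 1 - B + B * field_len / avg_len
        total = 0.0
        for term in terms:
            freq = text.count(term)
            if freq == 0:
                continue
            doc_freq = sum(1 for other in docs.values() if term in other)
            term_idf = idf(len(docs), doc_freq)
            total += (K1 + 1) * term_idf * freq / (freq + K1 * length_norm)
        scores[doc_id] = total
    return scores


docs = {"1": "张三是一个人", "2": "张三是一个大好人", "3": "张三"}
terms = ["张", "三"]  # assumed tokens: one per character

scores_off = score(docs, terms, norms=False)
scores_on = score(docs, terms, norms=True)
print(scores_off)
print(scores_on)
```

Under these assumptions the sketch gives about 0.4000 for all three documents without norms, and about 0.3588, 0.2541, and 0.2217 for documents 3, 1, and 2 with norms, in line with the two responses above. The only term that changes between the two runs is the field length in `length_norm`, which is exactly the information norms store.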
3. Notes ⚠️
- Disabling norms: norms can be disabled on an existing field through the Elasticsearch mapping API
PUT people/_mapping
{
  "properties": {
    "name": {
      "type": "text",
      "norms": false
    }
  }
}
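- However, once norms have been disabled they cannot be re-enabled through the API.
- Norms are not removed from existing documents immediately. Newly indexed documents no longer store norms, and the norms of older documents are dropped as their segments are merged. Until then, scores on the field can be inconsistent, because some documents still carry norms and others do not.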