Elastic homework (2021/10/20)

Task 1

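Question

Task1:
Given the following index:
POST /inject/_doc/1
{
    "title": "the number is error"
}

POST /inject/_doc/2
{
    "title": "it is number is error"
}

The following query returns one document:
GET inject/_search
{ "query": { "match": { "title": "the" } } }

Create a new index with these requirements:
    i. The new index is named inject_new
    ii. Use the Reindex API to copy the data and types of inject to inject_new
    iii. A term query for the value 'the' on title returns 0 documents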

Answer

1. Create the index

PUT /inject_new
{
  "mappings": {
    "properties": {
      "title":{
        "type": "keyword"
      }
    }
  }
}

2. reindex

POST _reindex
{
  "source": {
    "index": "inject"
  },
  "dest": {
    "index": "inject_new"
  }
}

3. Query

GET /inject_new/_search
{
  "query": {
    "term": {
      "title": {
        "value": "the"
      }
    }
  }
}
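
Why the term query comes back empty can be illustrated outside Elasticsearch. The following is a minimal Python sketch, not Elasticsearch code: it assumes a text field is indexed as lowercased, whitespace-separated tokens (a simplification of the standard analyzer) and a keyword field as the whole unmodified string. The helper names index_text, index_keyword, and term_query are hypothetical.

```python
from collections import defaultdict

docs = {
    "1": "the number is error",
    "2": "it is number is error",
}

def index_text(docs):
    """Simplified text field: one inverted-index entry per lowercased token."""
    index = defaultdict(set)
    for doc_id, value in docs.items():
        for token in value.lower().split():
            index[token].add(doc_id)
    return index

def index_keyword(docs):
    """Simplified keyword field: the whole value is stored as a single term."""
    index = defaultdict(set)
    for doc_id, value in docs.items():
        index[value].add(doc_id)
    return index

def term_query(index, term):
    """Exact lookup of one term, with no analysis applied to the query string."""
    return sorted(index.get(term, set()))

text_index = index_text(docs)        # behaves like inject (title is text)
keyword_index = index_keyword(docs)  # behaves like inject_new (title is keyword)

print(term_query(text_index, "the"))
print(term_query(keyword_index, "the"))
print(term_query(keyword_index, "the number is error"))
```

Under these assumptions the first lookup finds document 1, the second finds nothing, and only the full string matches on the keyword index.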

Task2

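Question

Task2
Find the province-level administrative units that satisfy at least one of the following conditions:
City information includes the phrase Beijing City
The District information contains Qibin District

If you need a should query, the usage is: where must corresponds to an AND query, should corresponds to OR.
The data needed for the exercise is as follows: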
PUT /area/_doc/1
{
    "province": {
        "name": "Beijing",
        "cities": [{
            "name": "Beijing City",
            "district": [{
                "name": "Fengtai District"
            }, {
                "name": "Haidian District"
            }, {
                "name": "Chaoyang District"
            }, {
                "name": "Dongcheng District"
            }, {
                "name": "Xicheng District"
            }, {
                "name": "Changping District"
            }]
        }]
    }
}
 
PUT /area/_doc/2
{
    "province": {
        "name": "Henan Province",
        "cities": [{
            "name": "Zhengzhou City",
            "district": [{
                "name": "Jinshui District"
            }, {
                "name": "Gaoxin District"
            }, {
                "name": "Zhengdong New District"
            }, {
                "name": "Erqi District"
            }, {
                "name": "Zhongyuan District"
            }, {
                "name": "Huiji District"
            }]
        }, {
            "name": "Hebi City",
            "district": [{
                "name": "Shancheng District"
            }, {
                "name": "Qibin District"
            }, {
                "name": "Heshan District"
            }, {
                "name": "ZhaoGe"
            }, {
                "name": "Xunxian"
            }]
        }]
    }
}

PUT /area/_doc/3
{
    "province": {
        "name": "Taiwan Province",
        "cities": [{
            "name": "Taibei City",
            "district": [{
                "name": "Zhongzheng District"
            }, {
                "name": "Datong District"
            }, {
                "name": "Zhongshan District"
            }, {
                "name": "Wanhua District"
            }, {
                "name": "Xinyi District"
            }, {
                "name": "Songshan District"
            }]
        }, {
            "name": "Gaoxiong City",
            "district": [{
                "name": "Xiaogang District"
            }, {
                "name": "Gushan District"
            }, {
                "name": "Sanmin District"
            }]
        }]
    }
}

Answer

GET /area/_search
{
  "query": {
    "bool": {
      "should": [
        {
          "match_phrase": {
            "province.cities.name": "Beijing City"
          }
        },
        {
          "match_phrase": {
            "province.cities.district.name": "Qibin District"
          }
        }
      ]
    }
  }
}
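
To make the OR semantics of should concrete, here is a minimal Python sketch, not Elasticsearch code. It assumes a heavily abbreviated copy of the three documents and models match_phrase as a case-insensitive check for consecutive tokens within a single value. The helpers values_at, match_phrase, and bool_should are hypothetical.

```python
def values_at(obj, path):
    """Collect every leaf value under a dotted path, flattening arrays of objects."""
    if isinstance(obj, list):
        return [v for item in obj for v in values_at(item, path)]
    if not path:
        return [obj]
    if isinstance(obj, dict) and path[0] in obj:
        return values_at(obj[path[0]], path[1:])
    return []

def match_phrase(doc, field, phrase):
    """True if the phrase tokens appear consecutively in any value of the field."""
    want = phrase.lower().split()
    for value in values_at(doc, field.split(".")):
        have = str(value).lower().split()
        if any(have[i:i + len(want)] == want for i in range(len(have) - len(want) + 1)):
            return True
    return False

def bool_should(doc, clauses):
    """A bool query with only should clauses matches if at least one clause matches."""
    return any(match_phrase(doc, field, phrase) for field, phrase in clauses)

# Abbreviated data: only a few cities and districts are kept per province.
docs = {
    "1": {"province": {"name": "Beijing", "cities": [
        {"name": "Beijing City", "district": [{"name": "Fengtai District"}, {"name": "Haidian District"}]},
    ]}},
    "2": {"province": {"name": "Henan Province", "cities": [
        {"name": "Zhengzhou City", "district": [{"name": "Jinshui District"}]},
        {"name": "Hebi City", "district": [{"name": "Qibin District"}, {"name": "Xunxian"}]},
    ]}},
    "3": {"province": {"name": "Taiwan Province", "cities": [
        {"name": "Gaoxiong City", "district": [{"name": "Gushan District"}]},
    ]}},
}

clauses = [
    ("province.cities.name", "Beijing City"),
    ("province.cities.district.name", "Qibin District"),
]
hits = [doc_id for doc_id, doc in docs.items() if bool_should(doc, clauses)]
print(hits)
```

In this model documents 1 and 2 are selected, each by a different clause, while document 3 satisfies neither.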

Task3

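Question

Task3
Index index_a contains two documents, "What's is" and "What is". After reindexing index_a to index_b, a query for either "What's is" or "What is" should return the same number of documents with the same scores.

Answer

Current state

Querying index_a for "What's is" and for "What is" returns both documents each time, but their scores are not equal.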

GET /index_a/_search
{
  "query": {
    "match": {
      "content": "What is"
    }
  }
}


######################### Query result below; note the _score field ############################
"max_score" : 0.8754687,
"hits" : [
      {
        "_index" : "index_a",
        "_type" : "_doc",
        "_id" : "2",
        "_score" : 0.8754687,
        "_source" : {
          "content" : "What is"
        }
      },
      {
        "_index" : "index_a",
        "_type" : "_doc",
        "_id" : "1",
        "_score" : 0.18232156,
        "_source" : {
          "content" : "What's is"
        }
      }
    ]

GET /index_a/_search
{
  "query": {
    "match": {
      "content": "What's is"
    }
  }
}

######################### Query result below; note the _score field ############################

"max_score" : 0.8754687,
"hits" : [
  {
    "_index" : "index_a",
    "_type" : "_doc",
    "_id" : "1",
    "_score" : 0.8754687,
    "_source" : {
      "content" : "What's is"
    }
  },
  {
    "_index" : "index_a",
    "_type" : "_doc",
    "_id" : "2",
    "_score" : 0.18232156,
    "_source" : {
      "content" : "What is"
    }
  }
]
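
The score difference comes from how the two strings are tokenized. As a rough illustration in Python (not Elasticsearch code), the sketch below assumes an analyzer that lowercases and splits on whitespace while keeping the apostrophe inside a token, which approximates how the default standard analyzer handles What's. The analyze and matched_terms helpers are hypothetical.

```python
def analyze(text):
    """Simplified analyzer: lowercase, split on whitespace, keep apostrophes."""
    return text.lower().split()

docs = {"1": "What's is", "2": "What is"}

def matched_terms(query, doc_text):
    """Query terms that also occur in the document."""
    doc_terms = set(analyze(doc_text))
    return [t for t in analyze(query) if t in doc_terms]

for query in ("What is", "What's is"):
    for doc_id, text in docs.items():
        print(query, "->", doc_id, matched_terms(query, text))
```

Each query matches two terms in one document and only the term is in the other, which is consistent with the 0.8754687 versus 0.18232156 scores in the results.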

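Approach

Change how the content field is scored, either by disabling norms on it or by setting its relevance boost to 0.

  • norms: whether length-normalization factors are stored for scoring; setting it to false removes length normalization but does not turn scoring off
  • boost: the relevance score weight of this field, default 1

Disabling norms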

PUT /index_b
{
  "mappings": {
    "properties": {
      "content":{
        "type": "text",
        "norms": false
      }
    }
  }
}

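Testing shows this does not meet the requirement: the two queries still give different scores. Norms only control length normalization, and What's and What are still indexed as different terms.

Changing the boost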

PUT /index_b
{
  "mappings": {
    "properties": {
      "content":{
        "type": "text",
        "boost": 0
      }
    }
  }
}

Testing shows this meets the requirement: both queries return a score of 0 for every document.

GET /index_b/_search
{
  "query": {
    "match": {
      "content": "What's is"
    }
  }
}

######################### Query result below; note the _score field ############################

"max_score" : 0.0,
"hits" : [
  {
    "_index" : "index_b",
    "_type" : "_doc",
    "_id" : "1",
    "_score" : 0.0,
    "_source" : {
      "content" : "What's is"
    }
  },
  {
    "_index" : "index_b",
    "_type" : "_doc",
    "_id" : "2",
    "_score" : 0.0,
    "_source" : {
      "content" : "What is"
    }
  }
]

GET /index_b/_search
{
  "query": {
    "match": {
      "content": "What is"
    }
  }
}

######################### Query result below; note the _score field ############################
"max_score" : 0.0,
"hits" : [
  {
    "_index" : "index_b",
    "_type" : "_doc",
    "_id" : "1",
    "_score" : 0.0,
    "_source" : {
      "content" : "What's is"
    }
  },
  {
    "_index" : "index_b",
    "_type" : "_doc",
    "_id" : "2",
    "_score" : 0.0,
    "_source" : {
      "content" : "What is"
    }
  }
]

Task 4

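Question

Task4
Set up dynamic mapping for the index task4 with the following requirements:
Every field that would be mapped as text is mapped as keyword instead
Every field whose name starts with int_ is mapped as integer
Bulk-import the data below to verify.
Data to import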
POST task4/_bulk
{ "index": { "_id": 1 } } 
{ "cont": "你好马士兵教育", "int_value": 35 } 
{ "index": { "_id": 2 } } 
{ "cont": "马士兵教育", "int_value": 35 } 
{ "index": { "_id": 3 }} 
{ "cont": "马士兵", "int_value": 35 }

Answer

Reference: dynamic templates: www.elastic.co/guide/en/el…

PUT /task4
{
  "mappings": {
    "dynamic_templates":[
      {
        "integers": {
          "match":"int_*",
          "mapping":{
            "type":"integer"
          }
        }
      },
      {
        "keyword":{
          "match_mapping_type":"string",
          "mapping":{
            "type": "keyword"
          }
        }
      }
    ]
  }
}
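
The match value in a dynamic template is a pattern tested against the field name, so a trailing wildcard is needed to cover every name that starts with int_. The Python sketch below is only an approximation: it uses fnmatch to stand in for Elasticsearch's simple wildcard matching, assumes strings are detected as string and whole numbers as long, and relies on a hypothetical pick_type helper that returns the type from the first template that applies.

```python
from fnmatch import fnmatchcase

# Templates are tried in order; the first one that matches wins.
templates = [
    {"name": "integers", "match": "int_*", "type": "integer"},
    {"name": "keyword", "match_mapping_type": "string", "type": "keyword"},
]

def detected_type(value):
    """Rough stand-in for the JSON type detected for a new field."""
    if isinstance(value, bool):
        return "boolean"
    if isinstance(value, int):
        return "long"
    if isinstance(value, float):
        return "double"
    if isinstance(value, str):
        return "string"
    return "object"

def pick_type(field, value, templates):
    """Return the type from the first matching template, else the detected type."""
    for t in templates:
        if "match" in t and not fnmatchcase(field, t["match"]):
            continue
        if "match_mapping_type" in t and t["match_mapping_type"] != detected_type(value):
            continue
        return t["type"]
    return detected_type(value)

doc = {"cont": "马士兵", "int_value": 35}
print({field: pick_type(field, value, doc and templates) for field, value in doc.items()})

# Without the wildcard, only a field literally named "int_" would match.
print(fnmatchcase("int_value", "int_"))
```

On a real cluster, GET /task4/_mapping after the bulk import should show cont as keyword and int_value as integer if the templates applied.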

Task5

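Question

Task5
Set up an index template that meets the following requirements:
For the index msb and indices whose names start with msb-, create 3 primary shards and 1 replica shard. Write test data to the index msb-tech.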

Answer

Reference: www.elastic.co/guide/en/el…

PUT _index_template/template_1
{
  "index_patterns": ["msb", "msb-*"],
  "template":{
    "settings":{
      "number_of_shards": 3,
      "number_of_replicas":1
    }
  }
}
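
Which index names a pattern list covers can be checked with a quick Python sketch. It uses fnmatch as a stand-in for Elasticsearch's wildcard matching of index_patterns, which is an assumption made for illustration. The covered helper and the sample names other than msb and msb-tech are hypothetical.

```python
from fnmatch import fnmatchcase

def covered(index_name, patterns):
    """True if any pattern in the list matches the index name."""
    return any(fnmatchcase(index_name, p) for p in patterns)

patterns = ["msb", "msb-*"]
for name in ("msb", "msb-tech", "msbx", "my-msb"):
    print(name, covered(name, patterns))
```

In this model msb and msb-tech are covered while msbx and my-msb are not; a broader pattern such as msb* would also cover msbx. To finish the task, a test document can then be indexed into msb-tech (for example with POST /msb-tech/_doc), and GET /msb-tech/_settings should report the shard and replica counts from the template.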