大数据利器Elasticsearch搜索之通配符与正则表达式查询

326 阅读2分钟

这是我参与8月更文挑战的第22天,活动详情查看:8月更文挑战
本Elasticsearch相关文章的版本为:7.4.2

wildcard通配符查询: 支持正则表达式查询。

  • ?表示匹配一个字符
  • * 表示匹配0个或多个字符
  • . 表示匹配任意一个字符

测试数据:

POST /_bulk
{"index": {"_index": "wildcard_test_index", "_id": 1}}
{"city": "广州市", "area_code": "020"}
{"index": {"_index": "wildcard_test_index", "_id": 2}}
{"city": "韶关市", "area_code": "0751"}
{"index": {"_index": "wildcard_test_index", "_id": 3}}
{"city": "惠州市", "area_code": "0752"}
{"index": {"_index": "wildcard_test_index", "_id": 4}}
{"city": "梅州市", "area_code": "0753"}
{"index": {"_index": "wildcard_test_index", "_id": 5}}
{"city": "汕头市", "area_code": "0754"}
{"index": {"_index": "wildcard_test_index", "_id": 6}}
{"city": "深圳市", "area_code": "0755"}
{"index": {"_index": "wildcard_test_index", "_id": 7}}
{"city": "珠海市", "area_code": "0756"}
{"index": {"_index": "wildcard_test_index", "_id": 8}}
{"city": "佛山市", "area_code": "0757"}
{"index": {"_index": "wildcard_test_index", "_id": 9}}
{"city": "肇庆市", "area_code": "0758"}
{"index": {"_index": "wildcard_test_index", "_id": 10}}
{"city": "湛江市", "area_code": "0759"}
{"index": {"_index": "wildcard_test_index", "_id": 11}}
{"city": "中山市", "area_code": "0760"}
{"index": {"_index": "wildcard_test_index", "_id": 12}}
{"city": "河源市", "area_code": "0762"}
{"index": {"_index": "wildcard_test_index", "_id": 13}}
{"city": "清远市", "area_code": "0763"}
{"index": {"_index": "wildcard_test_index", "_id": 14}}
{"city": "顺德市", "area_code": "0765"}
{"index": {"_index": "wildcard_test_index", "_id": 15}}
{"city": "云浮市", "area_code": "0766"}
{"index": {"_index": "wildcard_test_index", "_id": 16}}
{"city": "潮州市", "area_code": "0768"}
{"index": {"_index": "wildcard_test_index", "_id": 17}}
{"city": "东莞市", "area_code": "0769"}
{"index": {"_index": "wildcard_test_index", "_id": 18}}
{"city": "汕尾市", "area_code": "0660"}
{"index": {"_index": "wildcard_test_index", "_id": 19}}
{"city": "潮阳市", "area_code": "0661"}
{"index": {"_index": "wildcard_test_index", "_id": 20}}
{"city": "阳江市", "area_code": "0662"}
{"index": {"_index": "wildcard_test_index", "_id": 21}}
{"city": "揭西市", "area_code": "0663"}

例如,下面的这个查询会匹配以07开头,中间是任意数字,以7结尾的电话区号的城市:

GET /wildcard_test_index/_search
{
    "query": {
        "wildcard": {
            "area_code": "07?7" 
        }
    }
}

查询结果:

{
  "took" : 956,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 1,
      "relation" : "eq"
    },
    "max_score" : 1.0,
    "hits" : [
      {
        "_index" : "wildcard_test_index",
        "_type" : "_doc",
        "_id" : "8",
        "_score" : 1.0,
        "_source" : {
          "city" : "佛山市",
          "area_code" : "0757"
        }
      }
    ]
  }
}

再如, 下面的查询返回以0开头和以5结尾,中间是任意数量任意字符的电话区号的城市:

GET /wildcard_test_index/_search
{
    "query": {
        "wildcard": {
            "area_code": "0*5" 
        }
    }
}

返回的结果数据:

{
  "took" : 0,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 2,
      "relation" : "eq"
    },
    "max_score" : 1.0,
    "hits" : [
      {
        "_index" : "wildcard_test_index",
        "_type" : "_doc",
        "_id" : "6",
        "_score" : 1.0,
        "_source" : {
          "city" : "深圳市",
          "area_code" : "0755"
        }
      },
      {
        "_index" : "wildcard_test_index",
        "_type" : "_doc",
        "_id" : "14",
        "_score" : 1.0,
        "_source" : {
          "city" : "顺德市",
          "area_code" : "0765"
        }
      }
    ]
  }
}