Elasticsearch: Common Problems and Solutions When Updating Data with Painless Scripts

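Updating data with Painless scripts gives us a lot of freedom and flexibility, because documents can be processed with the familiar Java APIs.
Suppose we have the following index, which records each student's avatar and frequently visited links, together with one of its documents: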

PUT /student
{
  "mappings": {
    "properties": {
      "icon": {
        "type": "keyword"
      },
      "links": {
        "type": "text",
        "fielddata": false
      }
    }
  }
}
  
POST /student/_doc/1
{
  "icon": "http://www.ace.cn/img/xx.png",
  "links": ["http://www.ace.cn/personal/xx.html"]
}

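However, because the domain has been upgraded, the domain in the frequently visited links now has to change from 'ace.cn' to 'ace.com', so I tried to update the documents with a Painless script.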
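The request itself is not reproduced in this post. Judging from the error response, which fails in the query phase on the script `doc.links[0].indexOf('ace.cn')!=-1`, it was probably something like the sketch below. The use of `_update_by_query`, the `bool`/`filter` wrapper, and the update script are all assumptions made for illustration; only the script query source is taken from the error output.

```console
# Hypothetical reconstruction: the original request is not shown in the post.
# Only the script query source below comes from the error response.
POST /student/_update_by_query
{
  "query": {
    "bool": {
      "filter": {
        "script": {
          "script": {
            "lang": "painless",
            "source": "doc.links[0].indexOf('ace.cn')!=-1"
          }
        }
      }
    }
  },
  "script": {
    "lang": "painless",
    "source": "ctx._source.links[0] = ctx._source.links[0].replace('ace.cn', 'ace.com')"
  }
}
```

The important detail is that `doc.links` reads the field through doc values or fielddata, which a `text` field does not provide by default, whereas `ctx._source` reads the stored source document.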

However, the request failed with the following error:

{
  "error": {
    "root_cause": [
      {
        "type": "script_exception",
        "reason": "runtime error",
        "script_stack": [
          "org.elasticsearch.index.mapper.TextFieldMapper$TextFieldType.fielddataBuilder(TextFieldMapper.java:759)",
          "org.elasticsearch.index.fielddata.IndexFieldDataService.getForField(IndexFieldDataService.java:116)",
          "org.elasticsearch.index.query.QueryShardContext.lambda$lookup$0(QueryShardContext.java:290)",
          "org.elasticsearch.search.lookup.LeafDocLookup$1.run(LeafDocLookup.java:101)",
          "org.elasticsearch.search.lookup.LeafDocLookup$1.run(LeafDocLookup.java:98)",
          "java.security.AccessController.doPrivileged(Native Method)",
          "org.elasticsearch.search.lookup.LeafDocLookup.get(LeafDocLookup.java:98)",
          "org.elasticsearch.search.lookup.LeafDocLookup.get(LeafDocLookup.java:41)",
          "doc.links[0].indexOf('ace.cn')!=-1",
          "   ^---- HERE"
        ],
        "script": "doc.links[0].indexOf('ace.cn')!=-1",
        "lang": "painless"
      }
    ],
    "type": "search_phase_execution_exception",
    "reason": "all shards failed",
    "phase": "query",
    "grouped": true,
    "failed_shards": [
      {
        "shard": 0,
        "index": "student",
        "node": "If-KrvqWTFC8qbXqMmafaQ",
        "reason": {
          "type": "script_exception",
          "reason": "runtime error",
          "script_stack": [
            "org.elasticsearch.index.mapper.TextFieldMapper$TextFieldType.fielddataBuilder(TextFieldMapper.java:759)",
            "org.elasticsearch.index.fielddata.IndexFieldDataService.getForField(IndexFieldDataService.java:116)",
            "org.elasticsearch.index.query.QueryShardContext.lambda$lookup$0(QueryShardContext.java:290)",
            "org.elasticsearch.search.lookup.LeafDocLookup$1.run(LeafDocLookup.java:101)",
            "org.elasticsearch.search.lookup.LeafDocLookup$1.run(LeafDocLookup.java:98)",
            "java.security.AccessController.doPrivileged(Native Method)",
            "org.elasticsearch.search.lookup.LeafDocLookup.get(LeafDocLookup.java:98)",
            "org.elasticsearch.search.lookup.LeafDocLookup.get(LeafDocLookup.java:41)",
            "doc.links[0].indexOf('ace.cn')!=-1",
            "   ^---- HERE"
          ],
          "script": "doc.links[0].indexOf('ace.cn')!=-1",
          "lang": "painless",
          "caused_by": {
            "type": "illegal_argument_exception",
            "reason": "Fielddata is disabled on text fields by default. Set fielddata=true on [links] in order to load fielddata in memory by uninverting the inverted index. Note that this can however use significant memory. Alternatively use a keyword field instead."
          }
        }
      }
    ]
  },
  "status": 400
}

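The error occurs because the type of links is text, and fielddata is disabled on text fields by default. The error message suggests two fixes:

change the field to the keyword type, or enable the fielddata option.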
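For reference, the fielddata option could be enabled on the existing index with a mapping update roughly like the one below. This is a minimal sketch, assuming the `student` index defined earlier; it is not the approach taken in this post.

```console
# Sketch only: enables fielddata on the existing text field.
# Fielddata is loaded into heap memory, so use this with care.
PUT /student/_mapping
{
  "properties": {
    "links": {
      "type": "text",
      "fielddata": true
    }
  }
}
```

Note that even with fielddata enabled, `doc.links` exposes the analyzed terms of the text field rather than the original string, so a substring check such as `indexOf('ace.cn')` may not behave the way it would on the raw URL.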

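Enabling fielddata on a text field consumes a lot of memory, so I chose to change the field to keyword instead. The type of an existing field cannot be changed in place on an index that already holds data; the mapping has to be defined again, and there are two ways to do that:
Method 1: delete the old index, create a new index, and then import the data again;
Method 2: use reindex.
I chose method 2 because it is simple and because a Painless script can modify the data while it is being reindexed, which is exactly what I need: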

PUT /student_new
{
  "mappings": {
    "properties": {
      "icon": {
        "type": "keyword"
      },
      "links": {
        "type": "keyword"
      }
    }
  }
}
  
POST /_reindex
{
  "source": {
    "index": "student"
  },
  "dest": {
    "index": "student_new"
  },
  "script": {
    "source": "if(ctx._source.links[0].indexOf('ace.cn')!=-1){ctx._source.links[0]=ctx._source.links[0].replace('ace.cn', 'ace.com');}"
  }
}
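
The script above only rewrites the first element of links. If a document can hold several links, or no links at all, a slightly more defensive variant might look like the sketch below. It assumes that links, when present, is stored as an array in `_source`; the `instanceof List` guard simply skips documents where that is not the case.

```console
# Sketch of a variant that rewrites every entry in links, not just links[0].
# Assumption: links is an array in _source; other documents are copied unchanged.
POST /_reindex
{
  "source": {
    "index": "student"
  },
  "dest": {
    "index": "student_new"
  },
  "script": {
    "lang": "painless",
    "source": "def links = ctx._source.links; if (links instanceof List) { for (int i = 0; i < links.size(); i++) { links[i] = links[i].replace('ace.cn', 'ace.com'); } }"
  }
}

# Fetch the reindexed document to check the result.
GET /student_new/_doc/1
```

Since `replace` leaves a string untouched when the search text is absent, no separate `indexOf` check is needed. If the reindex succeeded, the links in the returned document should point to ace.com.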