
While following along with an expert's code, I came across this operation:

// IDs of the attributes that are allowed to be used for search
List<Long> searchAttrIds = attrService.selectSearchAttrs(attrIds);

// Copy the IDs into a HashSet for fast membership checks
Set<Long> idSet = new HashSet<>(searchAttrIds);

// Keep only the searchable attributes and copy each one into the ES model
List<SkuEsModel.Attrs> attrList = baseAttrs.stream()
        .filter(item -> idSet.contains(item.getAttrId()))
        .map(item -> {
            SkuEsModel.Attrs attrs1 = new SkuEsModel.Attrs();
            BeanUtils.copyProperties(item, attrs1);
            return attrs1;
        })
        .collect(Collectors.toList());

One step in it puzzled me.

Why build Set<Long> idSet = new HashSet<>(searchAttrIds); first and then call idSet.contains(item.getAttrId())? Why not just call contains on the list directly?

So I dug into the differences between List and Set:

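The List interface: a List is an ordered Collection, and the interface gives precise control over where each element is inserted. Elements can be accessed by index (their position in the List, similar to an array subscript), much like a Java array. Unlike Set, described below, a List allows duplicate elements. Besides the iterator() method that every Collection has, List also provides listIterator(), which returns a ListIterator. Compared with the standard Iterator, ListIterator adds methods such as add() and set(), so it can add, remove, and replace elements, and it can traverse both forward and backward. Commonly used classes that implement List are LinkedList, ArrayList, Vector, and Stack.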

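A Set is a Collection that contains no duplicate elements: no two elements e1 and e2 satisfy e1.equals(e2), and a Set holds at most one null element. Accordingly, every Set constructor must produce a set without duplicates; the Collection passed in may itself contain duplicates, and they are collapsed into a single element. Be careful with mutable objects: if an element's state changes in a way that affects equals comparisons while it is in the Set, the behavior of the Set is unspecified.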
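The two descriptions above can be checked with a few lines of Java. This is only a small sketch; the class name ListVsSetDemo and the helper toSet are made up for illustration and do not come from the original project.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class ListVsSetDemo {
    // Hypothetical helper: copies the values into a HashSet,
    // so duplicate values collapse into a single element.
    public static Set<Long> toSet(List<Long> values) {
        return new HashSet<>(values);
    }

    public static void main(String[] args) {
        List<Long> ids = new ArrayList<>(Arrays.asList(3L, 1L, 3L, 2L));
        System.out.println(ids.size());    // 4: the List keeps the duplicate 3
        System.out.println(ids.get(0));    // 3: index access in insertion order

        Set<Long> idSet = toSet(ids);
        System.out.println(idSet.size());  // 3: the duplicate is gone
        System.out.println(idSet.add(1L)); // false: 1 is already in the Set
    }
}
```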

As for their contains methods, I read through an expert's experiment. The write-up of the experiment is at www.cnblogs.com/jiadp/p/925…

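His conclusion:

Summary: when contains is used to check whether an element exists, HashSet is much faster than ArrayList. This fits how the two classes work: ArrayList.contains scans the elements one by one, so its cost grows linearly with the size of the list, while HashSet.contains is a hash lookup that takes roughly constant time on average.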
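To get a feel for the difference without going through the linked experiment, here is a rough sketch. It is not the code from that post and not a rigorous benchmark (no warm-up, no JMH); the class name ContainsDemo, the helper countHits, and the collection sizes are all assumptions chosen for illustration.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class ContainsDemo {
    // Hypothetical helper: counts how many probes are found in the target.
    public static int countHits(Collection<Long> target, List<Long> probes) {
        int hits = 0;
        for (Long probe : probes) {
            if (target.contains(probe)) {
                hits++;
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        List<Long> list = new ArrayList<>();
        for (long i = 0; i < 20_000; i++) {
            list.add(i);
        }
        Set<Long> set = new HashSet<>(list);

        // Half of the probes are present, half are absent.
        List<Long> probes = new ArrayList<>();
        for (long i = 10_000; i < 30_000; i++) {
            probes.add(i);
        }

        long t0 = System.nanoTime();
        int listHits = countHits(list, probes); // linear scan per probe
        long t1 = System.nanoTime();
        int setHits = countHits(set, probes);   // hash lookup per probe
        long t2 = System.nanoTime();

        System.out.println("hits: " + listHits + " / " + setHits);
        System.out.println("ArrayList ms: " + (t1 - t0) / 1_000_000);
        System.out.println("HashSet ms:   " + (t2 - t1) / 1_000_000);
    }
}
```

Both calls find the same number of hits; only the elapsed time differs, and the exact numbers depend on the machine and the JVM.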

Back to the code. The conversion has two benefits.

The operation: Set<Long> idSet = new HashSet<>(searchAttrIds);

    1. It removes any duplicate elements in searchAttrIds.
    2. HashSet.contains is more efficient than List.contains.
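The same pattern can be tried in isolation. The sketch below is an assumption-laden stand-in for the project code: Attr is a hypothetical replacement for the real entity, keepSearchable is a made-up method name, and the BeanUtils copy step is left out so that the example has no Spring dependency.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;

public class AttrFilterDemo {
    // Hypothetical stand-in for the project's attribute entity.
    public static class Attr {
        private final Long attrId;
        private final String attrName;

        public Attr(Long attrId, String attrName) {
            this.attrId = attrId;
            this.attrName = attrName;
        }

        public Long getAttrId() { return attrId; }
        public String getAttrName() { return attrName; }
    }

    // Keeps only the attributes whose ID appears in searchAttrIds.
    public static List<Attr> keepSearchable(List<Attr> baseAttrs, List<Long> searchAttrIds) {
        Set<Long> idSet = new HashSet<>(searchAttrIds); // deduplicates, fast contains
        return baseAttrs.stream()
                .filter(item -> idSet.contains(item.getAttrId()))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Attr> baseAttrs = Arrays.asList(
                new Attr(1L, "color"), new Attr(2L, "size"), new Attr(3L, "weight"));
        List<Long> searchAttrIds = Arrays.asList(1L, 3L, 3L); // note the duplicate 3

        List<String> names = keepSearchable(baseAttrs, searchAttrIds).stream()
                .map(Attr::getAttrName)
                .collect(Collectors.toList());
        System.out.println(names); // [color, weight]
    }
}
```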

Elasticsearch field settings: using index, doc_values, and nested

For a field that needs Chinese full-text search with IK:

"skuTitle": { 
    "type": "text", 
    "analyzer": "ik_smart"
}

Using the ik_smart analyzer requires the IK analysis plugin to be installed.

Settings that save resources:

"skuImg": { 
    "type": "keyword",
    "index": false, 
    "doc_values": false
}

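index: defaults to true. If set to false, the field is not indexed: it still appears in search results, but the field itself cannot be used as a search condition.

doc_values: defaults to true. Setting it to false means the field cannot be used for sorting, aggregations, or scripting, which saves disk space. It is also possible to set doc_values to true and index to false, so that the field cannot be searched but can still be used for sorting, aggregations, and scripting.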

Nested mapping for an array field:

"attrs": 
{
    "type": "nested", 
    "properties": { 
        "attrId": { 
            "type": "long"
        },
        "attrName": { 
            "type": "keyword", 
            "index": false, 
            "doc_values": false
        },
        "attrValue": {
            "type": "keyword"
        }
    }
}

If an array-of-objects field is not mapped as nested, it is flattened when stored:

Suppose we want to store two people. We create an index with a user array field that holds their names:
John Smith
Alice White

PUT my-index-000001/_doc/1
{
  "group" : "fans",
  "user" : [ 
    {
      "first" : "John",
      "last" :  "Smith"
    },
    {
      "first" : "Alice",
      "last" :  "White"
    }
  ]
}

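By default the array is flattened. Internally, the document is transformed into something more like the following, where the same field from every object is merged into one list: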
{
  "group" :        "fans",
  "user.first" : [ "alice", "john" ],
  "user.last" :  [ "smith", "white" ]
}

Let's test this behavior by searching for a person who does not exist.

We search for: Alice Smith
GET my-index-000001/_search
{
  "query": {
    "bool": {
      "must": [
        { "match": { "user.first": "Alice" }},
        { "match": { "user.last":  "Smith" }}
      ]
    }
  }
}

Yet the search returns a hit:
{
  "took" : 0,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 1,
      "relation" : "eq"
    },
    "max_score" : 0.105360515,
    "hits" : [
      {
        "_index" : "my-index-000001",
        "_type" : "_doc",
        "_id" : "1",
        "_score" : 0.105360515,
        "_source" : {
          "group" : "fans",
          "user" : [
            {
              "first" : "John",
              "last" : "Smith"
            },
            {
              "first" : "Alice",
              "last" : "White"
            }
          ]
        }
      }
    ]
  }
}
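The unexpected hit follows from the flattening described earlier: each match clause is checked against its own merged list of values, so the pairing between first and last is lost. The Java sketch below only imitates that idea in memory; it is not how Elasticsearch is implemented, and the names FlattenDemo, User, and flattenedMatch are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

public class FlattenDemo {
    // Hypothetical stand-in for one object in the "user" array.
    public static class User {
        final String first;
        final String last;

        public User(String first, String last) {
            this.first = first;
            this.last = last;
        }
    }

    // Imitates matching against flattened fields: all first names go into one
    // list and all last names into another, then each condition is checked
    // against its own list independently.
    public static boolean flattenedMatch(List<User> users, String first, String last) {
        List<String> firsts = new ArrayList<>();
        List<String> lasts = new ArrayList<>();
        for (User u : users) {
            firsts.add(u.first.toLowerCase(Locale.ROOT));
            lasts.add(u.last.toLowerCase(Locale.ROOT));
        }
        return firsts.contains(first.toLowerCase(Locale.ROOT))
                && lasts.contains(last.toLowerCase(Locale.ROOT));
    }

    public static void main(String[] args) {
        List<User> users = Arrays.asList(
                new User("John", "Smith"), new User("Alice", "White"));
        // "Alice" comes from one object and "Smith" from the other,
        // but the flattened check cannot tell.
        System.out.println(flattenedMatch(users, "Alice", "Smith")); // true
    }
}
```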

Check its mapping:

GET my-index-000001/_mapping
Result (note that user is a plain object field, with no nested type):
{
  "my-index-000001" : {
    "mappings" : {
      "properties" : {
        "group" : {
          "type" : "text",
          "fields" : {
            "keyword" : {
              "type" : "keyword",
              "ignore_above" : 256
            }
          }
        },
        "user" : {
          "properties" : {
            "first" : {
              "type" : "text",
              "fields" : {
                "keyword" : {
                  "type" : "keyword",
                  "ignore_above" : 256
                }
              }
            },
            "last" : {
              "type" : "text",
              "fields" : {
                "keyword" : {
                  "type" : "keyword",
                  "ignore_above" : 256
                }
              }
            }
          }
        }
      }
    }
  }
}

The search actually returned a hit, which is not what we expected. Next, delete this index:

DELETE my-index-000001

Define it again, this time giving the user field the nested type:

PUT my-index-000001
{
  "mappings": {
    "properties": {
      "user": {
        "type": "nested" 
      }
    }
  }
}

Store the document again:

PUT my-index-000001/_doc/1
{
  "group" : "fans",
  "user" : [
    {
      "first" : "John",
      "last" :  "Smith"
    },
    {
      "first" : "Alice",
      "last" :  "White"
    }
  ]
}

Search for the non-existent person again:

GET my-index-000001/_search
{
  "query": {
    "nested": {
      "path": "user",
      "query": {
        "bool": {
          "must": [
            { "match": { "user.first": "Alice" }},
            { "match": { "user.last":  "Smith" }} 
          ]
        }
      }
    }
  }
}
This time the person who should not match does not show up:
{
  "took" : 65,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 0,
      "relation" : "eq"
    },
    "max_score" : null,
    "hits" : [ ]
  }
}
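Conceptually, with the nested type every object in the array is matched on its own, so both conditions have to hold for the same person. The Java sketch below imitates only that rule; it does not reflect Elasticsearch internals, and NestedDemo, User, and nestedMatch are hypothetical names.

```java
import java.util.Arrays;
import java.util.List;

public class NestedDemo {
    // Hypothetical stand-in for one object in the "user" array.
    public static class User {
        final String first;
        final String last;

        public User(String first, String last) {
            this.first = first;
            this.last = last;
        }
    }

    // Imitates a nested match: both conditions must be satisfied
    // by the same object, not by values spread across objects.
    public static boolean nestedMatch(List<User> users, String first, String last) {
        for (User u : users) {
            if (u.first.equalsIgnoreCase(first) && u.last.equalsIgnoreCase(last)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        List<User> users = Arrays.asList(
                new User("John", "Smith"), new User("Alice", "White"));
        System.out.println(nestedMatch(users, "Alice", "Smith")); // false
        System.out.println(nestedMatch(users, "Alice", "White")); // true
    }
}
```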
