A Simple Product Search with Elasticsearch in PHP


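Whatever you do, big or small, lacking perseverance is a bad thing. Seeing everything as too hard will certainly leave a person with no achievements, but seeing things as too easy can likewise leave things without results.

Preface

This is a short introduction to using es. Some of what I say may not be entirely accurate; if you spot a mistake, please point it out~

Background

Most applications have a search feature. If every search hits the database, the load on it can get heavy, so we can use es to implement search instead. I assume most readers already have some idea of what es is; if not, a quick search will turn up articles by experts who explain it better than I can.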

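Preparation

With es we are usually searching Chinese text, and the built-in es analyzers do not handle Chinese well, so we need to install the IK analysis plugin. For PHP there is a ready-made composer package, elasticsearch/elasticsearch - Packagist. Then there is es itself, which is available from the Elastic product page (Elastic 产品 | Elastic); you can deploy it with docker or install it directly on a server, whichever you prefer.

Getting started

  1. First, connect to es. The code is simple: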
    $host = ['https://127.0.0.1:9200'];
    $this->esClient = \Elastic\Elasticsearch\ClientBuilder::create()
        ->setHosts($host)
        ->setBasicAuthentication('elastic', 'password copied during Elasticsearch start')
        ->setSSLVerification(false)
        ->build();

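Note: I am using Elasticsearch 8.2.3, which enables HTTPS by default. For a local connection I skip certificate verification with setSSLVerification(false). Alternatively, you can follow the official documentation and connect using http_ca.crt:

image.png

The two snippets are: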

    docker cp es01:/usr/share/elasticsearch/config/certs/http_ca.crt .

    $client = ClientBuilder::create()
        ->setHosts(['https://localhost:9200'])
        ->setBasicAuthentication('elastic', 'password copied during Elasticsearch start')
        ->setCABundle('path/to/http_ca.crt')
        ->build();

Then we can call the info() method to view information about es:

    public function esInfo(): void
    {
        foreach ($this->esClient->info()->asArray() as $key => $value) {
            if (is_string($value)) {
                echo $key . ' => ' . $value;
                echo '<br>';
            } else {
                echo $key . ' => ' . json_encode($value);
                echo '<br>';
            }
        }
    }

The result looks like this:

image.png

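  2. Next, create the index. The mapping can be created together with the index. I think of a mapping as roughly the definition of a table schema, though I am not sure that is exactly right. A mapping can also be generated dynamically, but here we need to specify the analyzer, so we define it ourselves. My search is based on the prod_name and brief fields, so I specify an analyzer for both.
    public function makeEsIndex(): void
    {
        $params = [
            'index' => $this->index,
            'body' => [
                'settings' => [
                    'number_of_shards' => 3, // number of primary shards; set at index creation and cannot be changed in place afterwards
                    'number_of_replicas' => 2 // number of replicas per primary shard; can be changed dynamically later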
                ],
                'mappings' => [
                    '_source' => [
                        'enabled' => true
                    ],
                    'properties' => [
                        'prod_id' => [
                            'type' => 'integer',
                        ],
                        'prod_name' => [
                            'type' => 'text',
                            "analyzer" => "ik_max_word",
                            "search_analyzer" => "ik_max_word"
                        ],
                        'ori_price' => [
                            'type' => 'double',
                        ],
                        'price' => [
                            'type' => 'double',
                        ],
                        'brief' => [
                            'type' => 'text',
                            "analyzer" => "ik_max_word",
                            "search_analyzer" => "ik_max_word"
                        ],
                        'pic' => [
                            'type' => 'text',
                        ]
                    ]
                ]
            ]
        ];

        try {
            $response = $this->esClient->indices()->create($params);
            var_dump($response);
        } catch (\Elastic\Elasticsearch\Exception\ClientResponseException $exception) {
            $msg = $exception->getMessage();
            var_dump($msg);
        }
    }
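
Before indexing real data it can help to check how the analyzer splits a sample string. The sketch below only builds the parameters for an `_analyze` request; `buildAnalyzeParams` and the index name `products` are hypothetical names chosen for illustration, and it assumes the IK plugin is installed so that `ik_max_word` is available.

```php
<?php
// Hypothetical helper: build the parameters for an _analyze request.
// The index name and the sample text are placeholders.
function buildAnalyzeParams(string $index, string $text, string $analyzer = 'ik_max_word'): array
{
    return [
        'index' => $index,
        'body' => [
            'analyzer' => $analyzer, // analyzer to test
            'text' => $text,         // sample text to tokenize
        ],
    ];
}

$params = buildAnalyzeParams('products', 'Apple iPhone XS Max 移动联通电信4G手机');
// With a connected client this would be sent as:
// $tokens = $this->esClient->indices()->analyze($params);
```

The response lists the tokens produced, which makes it easy to see whether a keyword you expect to match is actually emitted as a token.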
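  3. Add data to es. Once the index exists we need to put documents into it, using either create() or index(). The difference is that create() only creates and fails if the _id already exists, while index() replaces an existing document: overwrite if present, insert if not.
    try {
        $param = array(
            // body is the array you want to store in es, e.g. (var_dump output):
            // array(6) { ["prod_id"]=> string(2) "18" ["prod_name"]=> string(47) "Apple iPhone XS Max 移动联通电信4G手机 " ["ori_price"]=> string(4) "0.00" ["price"]=> string(4) "1.01" ["brief"]=> string(33) "6.5英寸大屏,支持双卡。" ["pic"]=> string(44) "2019/04/eaa8c9bd3e7b41eaa310adbde10b6401.jpg" }
            'body' => $row,
            // id is the document's unique identifier (_id in es); it can be assigned at index time, and if omitted Elasticsearch generates one automatically
            'id' => $row['prod_id'],
            // index is the target index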
            'index' => $this->index,
        );
        $response = $this->esClient->index($param);
    } catch (\Elastic\Elasticsearch\Exception\ClientResponseException|\Elastic\Elasticsearch\Exception\ServerResponseException|\Elastic\Elasticsearch\Exception\MissingParameterException $exception) {
        var_dump($exception->getMessage());
    }
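
Indexing one row per request gets slow for a whole product table, so a bulk request is a common alternative. The following is a minimal sketch, not the author's code: `buildBulkParams` is a hypothetical helper, and it assumes each row has the six fields shown in the sample above. It also casts the string values coming from the database to the numeric types declared in the mapping.

```php
<?php
// Hypothetical helper: turn database rows into a bulk request body.
// Each document takes two entries: an action line and the document itself.
function buildBulkParams(string $index, array $rows): array
{
    $body = [];
    foreach ($rows as $row) {
        // Action line: index (create or replace) the document under its prod_id
        $body[] = ['index' => ['_index' => $index, '_id' => (string) $row['prod_id']]];
        // Document line, with values cast to match the mapping types
        $body[] = [
            'prod_id'   => (int) $row['prod_id'],
            'prod_name' => trim((string) $row['prod_name']),
            'ori_price' => (float) $row['ori_price'],
            'price'     => (float) $row['price'],
            'brief'     => (string) $row['brief'],
            'pic'       => (string) $row['pic'],
        ];
    }
    return ['body' => $body];
}

// With a connected client:
// $response = $this->esClient->bulk(buildBulkParams($this->index, $rows));
// $response['errors'] is true if any single item failed.
```

For large tables it is usually worth sending the rows in chunks of a few hundred or thousand rather than in one request.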

After adding the data, you can check it in kibana:

image.png

  4. Search documents
    public function getDoc()
    {
        try {
            // Keyword ("木瓜" means papaya)
            $keywords = '木瓜';
            // Page number
            $page = 1;
            // Page size
            $pageSize = 10;
            // Offset
            $from = ($page - 1) * $pageSize;

            // Search conditions: match the keyword against both fields
            $condition = [
                [
                    'match' => [
                        'prod_name' => $keywords
                    ]
                ],
                [
                    'match' => [
                        'brief' => $keywords
                    ]
                ]
            ];
            // must: both conditions (product name and brief) have to match
            $param = [
                'bool' => [
                    'must' => $condition,
                ],
            ];
            $query = $this->mergeParams($param);
            $result = $this->esClient->search($query);
            var_dump('======must======');
            echo '<br>';
            print_r($result['hits']['hits']);
            echo '<br>';
            // should: at least one of the conditions has to match
            $param = [
                'bool' => [
                    'should' => $condition,
                    'minimum_should_match' => 1,
                ],
            ];
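            // When should sits next to must or filter in the same bool, minimum_should_match defaults to 0 and the should clauses become optional, so set "minimum_should_match": 1 explicitly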
            $param = [
                'bool' => [
                    'should' => $condition,
                    'filter' => [
                    // Filter conditions go here, e.g. a price range and an original-price range
                        [
                            'range' => [
                                'price' => [
                                    // gt(>) gte(>=) lt(<) lte(<=)
                                    'gte' => 10,
                                    'lte' => 1290,
                                ],
                            ],
                        ],
                        [
                            'range' => [
                                'ori_price' => [
                                    // gt(>) gte(>=) lt(<) lte(<=)
                                    'gte' => 500,
                                    'lte' => 1290,
                                ],
                            ],
                        ]
                    ],
                    'minimum_should_match' => 1,
                ],
            ];
            $query = $this->mergeParams($param);
            $result = $this->esClient->search($query);
            var_dump('======should======');
            echo '<br>';
            print_r($result['hits']['hits']);
            echo '<br>';

            // must_not: documents must match none of the conditions
            $param = [
                'bool' => [
                    'must_not' => $condition,
                ],
            ];
            $query = $this->mergeParams($param);
            $result = $this->esClient->search($query);
            var_dump('======must not======');
            echo '<br>';
            print_r($result['hits']['hits']);
            echo '<br>';
        } catch (\Throwable $throwable) {
            var_dump($throwable->getMessage());
        }
    }

    /**
     * Assemble the search parameters
     * @param array $query query clause
     * @param int $from offset
     * @param int $size number of hits per page
     * @param array $order sort order
     * @return array
     * @author QiuYiEr
     */
    public function mergeParams(array $query = [], int $from = 0, int $size = 20, array $order = ['prod_id' => ['order' => 'asc']]): array
    {
        if (!$query) {
            $body = [
                'sort' => [$order],
                'from' => $from,
                'size' => $size,
            ];
        } else {
            $body = [
                'query' => $query,
                'sort' => [$order],
                'from' => $from,
                'size' => $size,
            ];
        }

        return [
            'index' => $this->index,
            'body' => $body,
        ];
    }
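
The `from` and `size` values passed to mergeParams come from a page number and a page size. A minimal sketch of that conversion follows; `pageToOffset` is a hypothetical helper, the cap of 100 hits per page is an arbitrary assumption, and 10000 is the default value of the `index.max_result_window` setting, beyond which a from/size search is rejected.

```php
<?php
// Hypothetical helper: convert a 1-based page number into from/size values.
function pageToOffset(int $page, int $pageSize, int $maxWindow = 10000): array
{
    $page = max(1, $page);                   // treat page 0 or negative pages as page 1
    $pageSize = max(1, min($pageSize, 100)); // assumed cap of 100 hits per page
    $from = ($page - 1) * $pageSize;

    // from + size must stay within index.max_result_window (10000 by default)
    if ($from + $pageSize > $maxWindow) {
        throw new OutOfRangeException('Requested page is beyond the result window');
    }

    return ['from' => $from, 'size' => $pageSize];
}

// Usage with the mergeParams method above:
// ['from' => $from, 'size' => $size] = pageToOffset($page, $pageSize);
// $query = $this->mergeParams($param, $from, $size);
```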

I searched with 木瓜 (papaya) as the keyword. The papaya document stored in es is:

image.png

Under must, both brief and prod_name have to match before anything is returned. As the screenshot shows, brief does not contain the keyword "木瓜", so nothing matches:

image.png

Under should, matching either one is enough, so the document is found:

image.png

With the filter added, however, the query requires 500 <= ori_price <= 1290, while the document in es has ori_price = 6505, so nothing matches:

image.png

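must_not means the conditions must not match the document. Since pagination is also applied, the overall result is as follows (simplified for readability):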

image.png

image.png

The papaya document is exactly the one that is missing.
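
The behaviour above follows from the default of minimum_should_match: it is 1 when a bool query contains only should clauses, and 0 as soon as must or filter is present. The sketch below wraps that rule in a small builder so the value is always set explicitly; `buildBoolQuery` is a hypothetical helper, not part of the client library.

```php
<?php
// Hypothetical helper: assemble a bool query from optional clause lists.
function buildBoolQuery(array $should = [], array $filter = [], array $mustNot = []): array
{
    $bool = [];
    if ($should) {
        $bool['should'] = $should;
        // Explicit, so should stays mandatory even when filter is present
        $bool['minimum_should_match'] = 1;
    }
    if ($filter) {
        $bool['filter'] = $filter;
    }
    if ($mustNot) {
        $bool['must_not'] = $mustNot;
    }

    // With no clauses at all, fall back to matching every document.
    // stdClass makes json_encode emit {} instead of [].
    return $bool ? ['bool' => $bool] : ['match_all' => new stdClass()];
}

$query = buildBoolQuery(
    [['match' => ['prod_name' => '木瓜']], ['match' => ['brief' => '木瓜']]],
    [['range' => ['price' => ['gte' => 10, 'lte' => 1290]]]]
);
```

The result can be passed to mergeParams as the query clause in place of the hand-written arrays.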

You can also sort with order, for example ascending by id:

['prod_id' => ['order' => 'asc']]

image.png

Descending by price:

['price' => ['order' => 'desc']]

image.png
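
When sorting by a field such as price, several products can share the same value, and their relative order may then differ between pages. One common remedy is to append a unique field as a tie-breaker. The sketch below assumes prod_id is unique; `buildSort` is a hypothetical helper that returns the complete sort list, so it would be assigned to `body.sort` directly rather than wrapped in another array as mergeParams does.

```php
<?php
// Hypothetical helper: build a sort list and append a unique tie-breaker.
function buildSort(array $order, string $tieBreaker = 'prod_id'): array
{
    $sort = [];
    foreach ($order as $field => $direction) {
        // Anything other than "desc" falls back to ascending
        $direction = strtolower((string) $direction) === 'desc' ? 'desc' : 'asc';
        $sort[] = [$field => ['order' => $direction]];
    }
    // Add the tie-breaker unless the caller already sorts by it
    if (!array_key_exists($tieBreaker, $order)) {
        $sort[] = [$tieBreaker => ['order' => 'asc']];
    }
    return $sort;
}

$sort = buildSort(['price' => 'desc']);
```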

A small example

I wrote this example with the hyperf framework.

public function makeEsIndex(): \Psr\Http\Message\ResponseInterface|array
{
    $params = [
        'index' => $this->index,
        'body' => [
            'settings' => [
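                'number_of_shards' => 3, // number of primary shards; set at index creation and cannot be changed in place afterwards
                'number_of_replicas' => 2, // number of replicas per primary shard; can be changed dynamically later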
                'analysis' => [
                    'analyzer' => [
                        'default' => [
                            'tokenizer' => 'ik_max_word',
                        ],
                        'pinyin_analyzer' => [
                            'tokenizer' => 'my_pinyin',
                        ],
                    ],
                    'tokenizer' => [
                        'my_pinyin' => [
                            'type' => 'pinyin',
                            'keep_first_letter' => true,
                            'keep_separate_first_letter' => false,
                            'keep_full_pinyin' => true,
                            'keep_original' => true,
                            'limit_first_letter_length' => 16,
                            'lowercase' => true,
                            'remove_duplicated_term' => true,
                        ],
                    ],
                ],
            ],
            'mappings' => [
                '_source' => [
                    'enabled' => true,
                ],
                'properties' => [
                    'goods_id' => [
                        'type' => 'integer',
                    ],
                    'goods_name' => [
                        'type' => 'text',
                        'analyzer' => 'ik_max_word',
                        'search_analyzer' => 'ik_max_word',
                        'fields' => [
                            'pinyin' => [
                                'type' => 'text',
                                'term_vector' => 'with_positions_offsets',
                                'analyzer' => 'pinyin_analyzer',
                            ],
                        ],
                    ],
                    'goods_price' => [
                        'type' => 'double',
                    ],
                    'goods_standards' => [
                        'type' => 'text',
                    ],
                    'goods_indications' => [
                        'type' => 'text',
                        'analyzer' => 'ik_max_word',
                        'search_analyzer' => 'ik_max_word',
                    ],
                    'goods_image' => [
                        'type' => 'text',
                    ],
                    'store_name' => [
                        'type' => 'text',
                    ],
                    'store_self_pickup' => [
                        'type' => 'integer',
                    ],
                    'store_no_rest' => [
                        'type' => 'integer',
                    ],
                    'store_free_freight' => [
                        'type' => 'integer',
                    ],
                    'store_id' => [
                        'type' => 'integer',
                    ],
                    'store_delivery' => [
                        'type' => 'integer',
                    ],
                    'gc_id_3' => [
                        'type' => 'integer',
                    ],
                    'location' => [
                        'type' => 'geo_point',
                    ],
                    'city_id' => [
                        'type' => 'integer',
                    ],
                    'hospital_id' => [
                        'type' => 'integer',
                    ],
                ],
            ],
        ],
    ];

    try {
        return $this->esClient->indices()->create($params);
    } catch (BadRequest400Exception $e) {
        $msg = $e->getMessage();
        $msg = json_decode($msg, true);
        return $this->response->json($msg);
    }
}


    public function getList(): callable|array
    {
        // Keyword
        $keywords = $this->request->input('keywords', '');
        // Page number (defaults to 1)
        $page = max(1, (int) $this->request->input('page', 1));
        // Page size (defaults to 20, the same default as mergeParams)
        $pageSize = (int) $this->request->input('page_size', 20);
        // Offset
        $from = ($page - 1) * $pageSize;
        // Sort order
        $sort['goods_id'] = ['order' => 'desc'];

        $filter = [];

        // Query clause
        $param = [];

        // If a keyword is given, build the keyword conditions
        $should = $this->getValue($keywords);

//        return $param;

        if ($should) {
            // Keyword searches are sorted by relevance score
            unset($sort['goods_id']);
            $sort['_score'] = ['order' => 'desc'];
            $param['bool']['should'] = $should;
            $param['bool']['minimum_should_match'] = 1;
        }

        // Build the exact-match conditions
        $must = [];
        $regx = [0, 1];

        if (in_array($this->request->input('store_self_pickup', ''), $regx)) {
            $must[] = [
                'match' => [
                    'store_self_pickup' => $this->request->input('store_self_pickup'),
                ],
            ];
        }

        if (in_array($this->request->input('store_no_rest', ''), $regx)) {
            $must[] = [
                'match' => [
                    'store_no_rest' => $this->request->input('store_no_rest'),
                ],
            ];
        }

        if (in_array($this->request->input('store_free_freight', ''), $regx)) {
            $must[] = [
                'match' => [
                    'store_free_freight' => $this->request->input('store_free_freight'),
                ],
            ];
        }

        if (in_array($this->request->input('store_delivery', ''), $regx)) {
            $must[] = [
                'match' => [
                    'store_delivery' => $this->request->input('store_delivery'),
                ],
            ];
        }

        if ($this->request->input('gc_id_3')) {
            $must[] = [
                'match' => [
                    'gc_id_3' => $this->request->input('gc_id_3'),
                ],
            ];
        }

        if ($this->request->input('hospital_id')) {
            $must[] = [
                'match' => [
                    'hospital_id' => $this->request->input('hospital_id'),
                ],
            ];
        }

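        // If coordinates are given, search products in the city they belong to, then sort by distance
        $location = $this->request->input('location', '');
        if ($location) {
            $location = explode(',', $location);
            $cityCode = $this->request->input('city_code');
            // The mini program returns cityCode in the form 156400100 when fetching the location, so strip the leading 156; adapt this to your own data
            $cityCode = preg_replace('/^156/', '', (string) $cityCode);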
            $must[] = [
                'match' => [
                    'city_id' => $cityCode,
                ],
            ];

            $sort['_geo_distance'] = [
                'unit' => 'km',
                'location' => [
                    'lon' => $location[0],
                    'lat' => $location[1],
                ],
                'order' => 'asc',
            ];

            // Enable the code below if a distance limit is required
            if ($this->request->input('search_km')) {
                $filter[] = [
                    'geo_distance' => [
                        'distance' => $this->request->input('search_km', 1) . 'km',
                        'location' => [
                            'lon' => $location[0],
                            'lat' => $location[1],
                        ],
                    ],
                ];
            }
        }

        $priceRange = $this->getPriceRange();

        if ($priceRange) {
            $filter[] = $priceRange;
        }

        if ($must) {
            $param['bool']['must'] = $must;
        }

        if ($filter) {
            $param['bool']['filter'] = $filter;
        }

//        return $param;

        $query = $this->mergeParams($param, $from, $pageSize, $sort);

//        return $query;

        return $this->esClient->search($query);
    }
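
getList() calls getValue() and getPriceRange(), which are not shown in this article. The following is only a guess at what such helpers might look like, written as standalone functions under assumed names (`buildKeywordClauses`, `buildPriceRange`); the boost value and the choice of fields are assumptions based on the mapping above, not the author's actual implementation.

```php
<?php
// Hypothetical stand-in for getValue(): build should clauses for a keyword.
function buildKeywordClauses(string $keywords): array
{
    $keywords = trim($keywords);
    if ($keywords === '') {
        return []; // no keyword, no should clauses
    }
    return [
        // Name matches count more than the other fields (assumed boost)
        ['match' => ['goods_name' => ['query' => $keywords, 'boost' => 2]]],
        // The pinyin sub-field lets users type pinyin instead of characters
        ['match' => ['goods_name.pinyin' => $keywords]],
        ['match' => ['goods_indications' => $keywords]],
    ];
}

// Hypothetical stand-in for getPriceRange(): build a range filter on goods_price.
function buildPriceRange(?float $min, ?float $max): array
{
    $range = [];
    if ($min !== null) {
        $range['gte'] = $min;
    }
    if ($max !== null) {
        $range['lte'] = $max;
    }
    // An empty array means "no price filter", matching the if ($priceRange) check
    return $range ? ['range' => ['goods_price' => $range]] : [];
}
```

Both functions return an empty array when there is nothing to filter on, which fits the `if ($should)` and `if ($priceRange)` checks in getList().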

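In addition, qiuyier/esSearchBuilder contains a simple search demo (including a Go version). Feel free to copy it and adapt it to your needs.

That covers the basic es operations. I hope it is of some help~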