Elasticsearch in JavaScript the proper way, part II

By Jeffrey Rengifo, Elastic

A review of production best practices and how to run the Elasticsearch Node.js client in serverless environments.

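This is the second part of our Elasticsearch in JavaScript series. In the first part, we learned how to set up our environment properly, configure the Node.js client, index data, and search. In this second part, we will learn how to implement production best practices and run the Elasticsearch Node.js client in serverless environments.

We will review:

Production best practices
- Error handling
- Testing
Serverless environments
- Running the client on Elastic Serverless
- Running the client in a function-as-a-service environment

You can check the source code with the examples here.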

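Production best practices

Error handling

A useful feature of the Elasticsearch Node.js client is that it exposes objects for the possible Elasticsearch errors, so you can validate and handle them in different ways.

To see all the error objects, run this: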

```js
const { errors } = require('@elastic/elasticsearch')
console.log(errors)
```

Let's go back to the search example and handle some of the possible errors:

```js
app.get("/search/lexic", async (req, res) => {
  ....
  } catch (error) {
    if (error instanceof errors.ResponseError) {
      let errorMessage =
        "Response error!, query malformed or server down, contact the administrator!";

      // Optional chaining: not every error response carries a JSON error body.
      if (error.body?.error?.type === "parsing_exception") {
        errorMessage = "Query malformed, make sure mappings are set correctly";
      }

      // Return here so the generic 500 handler below does not send a second response.
      return res.status(error.meta.statusCode).json({
        erroStatus: error.meta.statusCode,
        success: false,
        results: null,
        error: errorMessage,
      });
    }

    // Any other error type falls through to a generic 500 response.
    res.status(500).json({
      success: false,
      results: null,
      error: error.message,
    });
  }
});
```

ResponseError in particular is thrown when the response is a 4xx or 5xx, meaning the request is incorrect or the server is not available.

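We can test this type of error by sending a malformed query. In the responses below, Elasticsearch could not parse the terms query on the visit_details field:

Default error: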

```json
{
    "success": false,
    "results": null,
    "error": "parsing_exception\n\tRoot causes:\n\t\tparsing_exception: [terms] query does not support [visit_details]"
}
```

Custom error:

```json
{
    "erroStatus": 400,
    "success": false,
    "results": null,
    "error": "Response error!, query malformed or server down, contact the administrator!"
}
```
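The branching in the route handler can also be pulled into a small helper so that every endpoint maps errors the same way. The following is a minimal sketch, not part of the original example: the name toErrorResponse is hypothetical, and the errors namespace is passed in as an argument (an assumption made here so the function can be exercised with stand-in classes, without a cluster).

```javascript
// Maps a thrown error to an HTTP status and a JSON payload.
// `errors` is expected to be the namespace exported by @elastic/elasticsearch;
// it is injected so the helper can be unit-tested with stand-in classes.
// toErrorResponse is a hypothetical name, not part of the client.
function toErrorResponse(error, errors) {
  if (error instanceof errors.ResponseError) {
    // Optional chaining guards against responses without a JSON error body.
    const type = error.body?.error?.type;
    const message =
      type === "parsing_exception"
        ? "Query malformed, make sure mappings are set correctly"
        : "Response error!, query malformed or server down, contact the administrator!";
    const status = error.meta?.statusCode ?? 500;
    return {
      status,
      payload: { erroStatus: status, success: false, results: null, error: message },
    };
  }

  // Anything else is treated as an unexpected server-side failure.
  return {
    status: 500,
    payload: { success: false, results: null, error: error.message },
  };
}

// Possible usage inside a catch block of an Express route:
//   const { status, payload } = toErrorResponse(error, errors);
//   return res.status(status).json(payload);
```

Returning a plain object instead of writing to res directly keeps the helper independent of Express, which makes it easy to test.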

We can also capture and handle each type of error in a specific way. For example, we can add retry logic when a TimeoutError occurs.

```js
app.get("/search/semantic", async (req, res) => {
  try {
    ...
  } catch (error) {
    if (error instanceof errors.TimeoutError) {
      // Retry logic...

      // A timeout means Elasticsearch never sent an HTTP response, so
      // error.meta.statusCode is not a usable status here; reply with 504.
      return res.status(504).json({
        erroStatus: 504,
        success: false,
        results: null,
        error:
          "The request took more than 10s after 3 retries. Try again later.",
      });
    }
  }
});
```
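The "Retry logic..." placeholder can be filled in many ways. Here is one possible sketch: a generic wrapper that retries an async operation with exponential backoff. The function name withRetry and the option names retries, delayMs, and isRetryable are all hypothetical and not part of the Elasticsearch client.

```javascript
// Runs `operation` and retries it while `isRetryable(error)` returns true.
// withRetry and its option names are illustrative, not a client API.
// With retries = 3 the operation runs at most 4 times (1 try + 3 retries).
async function withRetry(
  operation,
  { retries = 3, delayMs = 200, isRetryable = () => true } = {}
) {
  let lastError;
  for (let attempt = 0; attempt <= retries; attempt++) {
    try {
      return await operation(attempt);
    } catch (error) {
      lastError = error;
      // Stop on errors not worth retrying, or when retries are exhausted.
      if (!isRetryable(error) || attempt === retries) break;
      // Exponential backoff: delayMs, 2 * delayMs, 4 * delayMs, and so on.
      await new Promise((resolve) => setTimeout(resolve, delayMs * 2 ** attempt));
    }
  }
  throw lastError;
}

// Possible usage with the client (assumes `client`, `errors`, and `query` exist):
//   const result = await withRetry(
//     () => client.search({ index: "vet-visits", query }),
//     { isRetryable: (error) => error instanceof errors.TimeoutError }
//   );
```

Note that the client also accepts maxRetries and requestTimeout options, so an application-level wrapper like this is mostly useful when you want custom backoff or want to retry only specific error types.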

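Testing

Tests are key to guaranteeing the stability of the app. To test the code in isolation from Elasticsearch, we can use the elasticsearch-js-mock library when creating our client.

This library lets us instantiate a client that is very similar to the real one but responds according to our configuration. It replaces only the HTTP layer of the client with a mock and leaves the rest identical to the original.

We'll install the mock library and AVA for automated tests.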

```bash
npm install @elastic/elasticsearch-mock

npm install --save-dev ava
```

We'll configure the package.json file to run the tests. Make sure it looks like this:

```json
"type": "module",
"scripts": {
  "test": "ava"
},
"devDependencies": {
  "ava": "^5.0.0"
}
```

Now let's create a test.js file and set up our mock client:

```js
// package.json sets "type": "module", so this file uses import rather than require.
import { Client } from '@elastic/elasticsearch'
import Mock from '@elastic/elasticsearch-mock'

const mock = new Mock()
const client = new Client({
  node: 'http://localhost:9200',
  // Swap the HTTP layer for the mock; no real cluster is contacted.
  Connection: mock.getConnection()
})
```

Now, add a mock for semantic search:

```js
function createSemanticSearchMock(query, indexName) {
  mock.add(
    {
      method: "POST",
      path: `/${indexName}/_search`,
      body: {
        query: {
          semantic: {
            field: "semantic_field",
            query: query,
          },
        },
      },
    },
    () => {
      return {
        hits: {
          total: { value: 2, relation: "eq" },
          hits: [
            {
              _id: "1",
              _score: 0.9,
              _source: {
                owner_name: "Alice Johnson",
                pet_name: "Buddy",
                species: "Dog",
                breed: "Golden Retriever",
                vaccination_history: ["Rabies", "Parvovirus", "Distemper"],
                visit_details:
                  "Annual check-up and nail trimming. Healthy and active.",
              },
            },
            {
              _id: "2",
              _score: 0.7,
              _source: {
                owner_name: "Daniel Kim",
                pet_name: "Mochi",
                species: "Rabbit",
                breed: "Mixed",
                vaccination_history: [],
                visit_details:
                  "Nail trimming and general health check. No issues.",
              },
            },
          ],
        },
      };
    }
  );
}
```

We can now create a test for our code, making sure the Elasticsearch part always returns the same results:

```js
import test from 'ava';
import { errors } from '@elastic/elasticsearch';

test("performSemanticSearch must return formatted results correctly", async (t) => {
  const indexName = "vet-visits";
  const query = "Which pets had nail trimming?";

  // Register the mocked response for this query and index.
  createSemanticSearchMock(query, indexName);

  async function performSemanticSearch(esClient, q, indexName = "vet-visits") {
    try {
      // The query is passed at the top level of the request object;
      // the serialized request body still matches the mock above.
      const result = await esClient.search({
        index: indexName,
        query: {
          semantic: {
            field: "semantic_field",
            query: q,
          },
        },
      });

      return {
        success: true,
        results: result.hits.hits,
      };
    } catch (error) {
      if (error instanceof errors.TimeoutError) {
        // A timeout carries no response body, so report the error message.
        return {
          success: false,
          results: null,
          error: `Request timed out: ${error.message}`,
        };
      }

      return {
        success: false,
        results: null,
        error: error.message,
      };
    }
  }

  // `client` is the mock-backed client created earlier in test.js.
  const result = await performSemanticSearch(client, query, indexName);

  t.true(result.success, "The search must be successful");
  t.true(Array.isArray(result.results), "The results must be an array");

  if (result.results.length > 0) {
    t.true(
      "_source" in result.results[0],
      "Each result must have a _source property"
    );
    t.true(
      "pet_name" in result.results[0]._source,
      "Results must include the pet_name field"
    );
    t.true(
      "visit_details" in result.results[0]._source,
      "Results must include the visit_details field"
    );
  }
});
```

Let's run the tests.

```bash
npm run test
```

Done! From now on, we can test our app focusing 100% on the code and not on external factors.

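Serverless environments

Running the client on Elastic Serverless

We covered running Elasticsearch on Cloud or on-prem; however, the Node.js client also supports connections to Elastic Cloud Serverless.

Elastic Cloud Serverless allows you to create a project without worrying about infrastructure, since Elastic handles that internally. You only need to think about the data you want to index and how long you want to retain it.

From a usage perspective, Serverless decouples compute from storage and provides autoscaling for both search and indexing, so you only grow the resources you actually need.

The client makes the following adaptations to connect to Serverless:

- Turns off sniffing and ignores any sniffing-related options
- Ignores all nodes passed in the config except the first one, and ignores any node filtering and selecting options
- Enables compression and TLSv1_2_method (same as when configured for Elastic Cloud)
- Adds an elastic-api-version HTTP header to all requests
- Uses CloudConnectionPool by default instead of WeightedConnectionPool
- Turns off vendored content-type and accept headers in favor of standard MIME types

To connect to your serverless project, you need to use the parameter serverMode: serverless.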

```js
const { Client } = require('@elastic/elasticsearch')
const client = new Client({
  node: 'ELASTICSEARCH_ENDPOINT',
  auth: { apiKey: 'ELASTICSEARCH_API_KEY' },
  serverMode: "serverless",
});
```
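In practice, the endpoint and API key usually come from the environment rather than from string literals. As a rough sketch, the options object for the client could be built like this. The function name buildServerlessOptions is hypothetical, and the environment variable names ELASTICSEARCH_ENDPOINT and ELASTICSEARCH_API_KEY are assumptions borrowed from the placeholders in the snippet above.

```javascript
// Builds the options object passed to `new Client(...)`.
// buildServerlessOptions is a hypothetical helper, and the environment
// variable names are assumptions taken from the placeholders above.
function buildServerlessOptions(env = process.env) {
  const { ELASTICSEARCH_ENDPOINT, ELASTICSEARCH_API_KEY } = env;

  // Fail fast with a clear message instead of a confusing connection error.
  if (!ELASTICSEARCH_ENDPOINT || !ELASTICSEARCH_API_KEY) {
    throw new Error(
      "ELASTICSEARCH_ENDPOINT and ELASTICSEARCH_API_KEY must be set"
    );
  }

  return {
    node: ELASTICSEARCH_ENDPOINT,
    auth: { apiKey: ELASTICSEARCH_API_KEY },
    serverMode: "serverless",
  };
}

// Possible usage:
//   const { Client } = require('@elastic/elasticsearch')
//   const client = new Client(buildServerlessOptions())
```

Keeping the validation in a plain function means it can be tested by passing a fake env object, with no client or network involved.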

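Running the client in a function-as-a-service environment

In the example, we used a Node.js server, but you can also connect from a function-as-a-service environment such as AWS Lambda, GCP Run, and others.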

```js
'use strict'

const { Client } = require('@elastic/elasticsearch')

const client = new Client({
  // client initialisation
})

exports.handler = async function (event, context) {
  // use the client
}
```
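The client in the snippet above is created outside the handler, at module scope. On platforms that keep a container alive between invocations, such as AWS Lambda warm starts, module-scope state is reused, so the client and its connections are not rebuilt on every call. A minimal sketch of the same idea with lazy creation follows; the helper name once is hypothetical and not part of the client.

```javascript
// Wraps a factory so the value is created on first use and then reused
// by every later call within the same container (a "warm" invocation).
// `once` is an illustrative helper, not part of @elastic/elasticsearch.
function once(factory) {
  let instance;
  let created = false;
  return () => {
    if (!created) {
      instance = factory();
      created = true;
    }
    return instance;
  };
}

// Possible usage in a Lambda-style module (assumes Client is imported):
//   const getClient = once(() => new Client({ /* client initialisation */ }));
//   exports.handler = async function (event, context) {
//     const client = getClient(); // same instance on warm invocations
//   };
```

Lazy creation only matters if some invocations never need the client; otherwise creating it eagerly at module scope, as in the original snippet, is equivalent.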

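Another example is connecting to a serverless service like Vercel. You can check out this complete example of how to do it, but the most relevant part of the search endpoint looks like this: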

```js
const response = await client.search(
  {
    index: INDEX,
    // You could directly send from the browser
    // the Elasticsearch's query DSL, but it will
    // expose you to the risk that a malicious user
    // could overload your cluster by crafting
    // expensive queries.
    query: {
      match: { field: req.body.text },
    },
  },
  {
    headers: {
      Authorization: `ApiKey ${token}`,
    },
  }
);
```

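This endpoint lives in the /api folder and runs on the server side, so the client only controls the "text" parameter that corresponds to the search term.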
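Since "text" is the only caller-controlled input, it is worth validating before it reaches the query. The following is a sketch of one way to do that, not the approach taken in the linked example: buildSearchRequest and MAX_TEXT_LENGTH are hypothetical names, and the 200-character limit is an arbitrary illustrative value.

```javascript
// Upper bound on the search text; the value is an arbitrary assumption.
const MAX_TEXT_LENGTH = 200;

// Builds the search request from untrusted input. Only the search text
// comes from the caller; the index, the field, and the query shape stay
// fixed on the server. buildSearchRequest is a hypothetical helper.
function buildSearchRequest(index, text) {
  if (typeof text !== "string" || text.trim() === "") {
    throw new TypeError("text must be a non-empty string");
  }
  return {
    index,
    query: {
      // Trim whitespace and cap the length before matching.
      match: { field: text.trim().slice(0, MAX_TEXT_LENGTH) },
    },
  };
}

// Possible usage in the endpoint (assumes client, INDEX, and token exist):
//   const response = await client.search(
//     buildSearchRequest(INDEX, req.body.text),
//     { headers: { Authorization: `ApiKey ${token}` } }
//   );
```

Rejecting non-string input matters here because a JSON body could otherwise carry an object in place of the text and change the shape of the query.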

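The implication of using function-as-a-service is that, unlike a server running 24/7, a function only brings up its machine while it executes. Once it finishes, the machine goes into rest mode and consumes fewer resources.

This configuration can be convenient if the application does not get many requests; otherwise, the costs can be high. You also need to consider the lifecycle of functions and their run times, which can sometimes be only a few seconds.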

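Conclusion

In this article, we learned how to handle errors, which is critical in production environments. We also covered how to test our application while mocking the Elasticsearch service, which makes tests more reliable, independent of the cluster's state, and focused on our own code.

Finally, we demonstrated how to spin up a fully serverless stack by provisioning both Elastic Cloud Serverless and a Vercel application.

Original: Elasticsearch in JavaScript the proper way, part II - Elasticsearch Labs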