Elasticsearch: the LLM-as-judge pattern (quality control)

  • Task: validate the agent's output with a second model
  • Pattern: generate and evaluate

In this Elastic Workflow example, the agent generates a response with one model and then evaluates its quality with a second model, adding a validation layer on top of the result. Elastic Workflows, the automation engine built into Elasticsearch, lets developers combine reliable scripted automation with AI-driven steps for tasks that require reasoning.
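Before looking at the workflow definition, the generate-and-evaluate loop itself can be sketched in a few lines of Python. This is only an illustration of the control flow, not the Workflow runtime: `generate` and `judge` below are hypothetical stand-ins for the two connector calls.

```python
def generate(question: str, context: str) -> str:
    """Stand-in for the generator model call (Claude Sonnet 4.5 in the workflow)."""
    if not context.strip():
        # Mirrors the generator prompt's rule for empty/insufficient context.
        return "INSUFFICIENT_CONTEXT: Which policy document are you asking about?"
    return f"Answer based on context: {context}"


def judge(question: str, context: str, answer: str) -> str:
    """Stand-in for the judge model call; must return exactly PASS or FAIL."""
    grounded = answer.startswith("INSUFFICIENT_CONTEXT") or bool(context.strip())
    return "PASS" if grounded else "FAIL"


def run(question: str, context: str = "") -> str:
    """Generate, then route on the judge's single-token verdict."""
    answer = generate(question, context)
    verdict = judge(question, context, answer).strip()
    if verdict == "PASS":
        return f"APPROVED\n\nAnswer:\n{answer}"
    return f"REJECTED\nJudge: {verdict}\n\nAnswer:\n{answer}"


print(run("What is the SLA for priority-1 incidents?",
          "Initial Response SLA: 1 hour."))
```

The key design point, which the workflow below implements declaratively, is that the judge is constrained to a single token (PASS or FAIL) so the routing step can branch on an exact match rather than parsing free-form critique.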

```yaml
name: LLM-as-judge demo (compat mode)
enabled: true

inputs:
  - name: question
    type: string
    required: true
  - name: context
    type: string
    required: false
    default: ""

triggers:
  - type: manual

steps:
  - name: generate_answer
    type: ai.prompt
    connector-id: 921257e8-8037-48fc-beee-be1c7e6d23d3 # Anthropic Claude Sonnet 4.5
    with:
      temperature: 0.2
      prompt: |
        You are a helpful assistant.

        Task: Answer the user's question using ONLY the provided context.
        If the context is empty or insufficient, output exactly:
        INSUFFICIENT_CONTEXT: <one clarifying question>

        Question:
        {{inputs.question}}

        Context:
        {{inputs.context}}

  - name: judge_answer
    type: ai.prompt
    connector-id: 9b54f87b-2e95-4211-9507-80097f1da325 # OpenAI 4.1
    with:
      temperature: 0.2
      prompt: |
        You are the judge model.

        Evaluate the generator answer against rules:
        1) Must be grounded in the provided context (no unsupported claims).
        2) If context insufficient, must output:
           INSUFFICIENT_CONTEXT: <one clarifying question>
        3) Must be clear and directly address the question.

        Return EXACTLY one token: PASS or FAIL (no punctuation, no extra text).

        Question:
        {{inputs.question}}

        Context:
        {{inputs.context}}

        Generator answer:
        {{steps.generate_answer.output.content}}

  - name: route_on_verdict
    type: if
    condition: "steps.judge_answer.output.content: PASS"
    steps:
      - name: approved
        type: console
        with:
          message: |
            APPROVED

            Answer:
            {{steps.generate_answer.output.content}}
    else:
      - name: rejected
        type: console
        with:
          message: |
            REJECTED
            Judge:
            {{steps.judge_answer.output.content}}

            Answer:
            {{steps.generate_answer.output.content}}
```

As shown above, I use two LLM connectors: Anthropic Claude Sonnet 4.5 as the generator and OpenAI 4.1 as the judge.

We can retrieve the connector IDs with the following request:

```shell
curl --request GET "http://localhost:5601/api/actions/connectors" \
  --header "Authorization: ApiKey Ty1Jank1d0ItU3Vvem5YZkhXNy06ZzV6czZvQ1VyaS1fODA5NEpxMkV4QQ==" \
  --header "kbn-xsrf: true"
```

You need to replace the API key above with one from your own installation.
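The response is a JSON array of connector objects. A quick way to list just the IDs and names is to parse that array, for example in Python. The sample payload below is illustrative only (the `id` values match the ones used in the workflow; the field names follow the Kibana connectors API, and the `connector_type_id` values here are assumptions):

```python
import json

# Illustrative sample of what GET /api/actions/connectors might return;
# in practice you would feed the curl output into json.loads().
sample_response = json.dumps([
    {"id": "921257e8-8037-48fc-beee-be1c7e6d23d3",
     "name": "Claude Sonnet 4.5", "connector_type_id": ".inference"},
    {"id": "9b54f87b-2e95-4211-9507-80097f1da325",
     "name": "OpenAI 4.1", "connector_type_id": ".gen-ai"},
])

connectors = json.loads(sample_response)
for c in connectors:
    # Print the id/name pairs you need for the connector-id fields.
    print(f'{c["id"]}  {c["name"]}')
```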

To run the workflow, I provide the following input:

```json
{
  "question": "What is the SLA for priority-1 incidents?",
  "context": "Support Policy v3.2\n\nPriority 1 (P1): Critical outage.\nInitial Response SLA: 1 hour.\nResolution Target: 4 hours.\nCoverage: 24/7."
}
```

The full demo video is available at: www.bilibili.com/video/BV1Gj…

Happy learning!