ChatBDD: Using AI in the BDD Process

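Starting a new side project, I was once again faced with the problem of writing code. I enjoy writing code, but it definitely takes mental effort. I've noticed that writing a whole module myself requires me to think about several problems at once and draw on experience from many areas, from test design and use of the core language libraries through to keeping the overall goal and high-level architecture in mind, all while trying to keep things simple.

Usually in this situation I simply start coding (tests first, of course), but sometimes I can't stick to a clear, coherent path and instead get lost in a sea of ideas out of sheer excitement for the new system. So, can I use ChatGPT to help me stay focused while following a BDD or TDD process?

Let's find out.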

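The Approach: From Behaviour to Code

I decided to take this approach:

  1. Define the expected outcomes of my module so that scenarios can be written
  2. Turn each scenario into a test
  3. Use the tests to write the code

I believe there is a big benefit to using ChatGPT for this, particularly for point 3: as humans we are sometimes tempted to get creative, but TDD dictates that once you have written a test you write the code to make it pass, and only enough code to make it pass. Using ChatGPT makes it easier to stick to that discipline.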

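Defining the Outcomes

When I was consulting, sprint preparation had two key outcomes: defining the general expected behaviour and providing concrete examples. This usually fed the "Definition of Done": a description of the system's general behaviour plus some testable examples, so that developers could make sure the code they wrote met specific, testable criteria.

I started by asking ChatGPT this: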

You are an expert business analyst with additional expertise in Software User Stories and BDD. You help turn business requirements given as User Stories into multiple BDD Given-When-Then scenarios.

A User Story is of the format:
As a (user type)
I want to (perform an action on the system)
So that (I meet a desired outcome)

A Given-When-Then scenario is of the format:
Given (an initial context)
When (an event occurs)
Then (a testable outcome is satisfied)

---

I am writing a software module which accepts an email from a user and extracts the components so that they can be used to post to Instagram. The user story is:

As an Instagram user,
I want to send an email to 'The System' which contains an image and some text
So that the image and text are extracted in a way that it can be posted to Instagram

Write BDD scenarios which meet the user story.

To my surprise, ChatGPT output the following:

1. Scenario: Email with image and text is successfully processed
   Given the email contains an image and some text
   When the email is received by 'The System'
   Then the image and text are extracted from the email
   And the image and text are formatted for Instagram posting
   And the formatted image and text are saved to a file

2. Scenario: Email with no image is received
   Given the email contains no image
   When the email is received by 'The System'
   Then an error message is returned
   And the email is not processed further

3. Scenario: Email with no text is received
   Given the email contains no text
   When the email is received by 'The System'
   Then an error message is returned
   And the email is not processed further

4. Scenario: Email with an image that cannot be processed is received
   Given the email contains an image that cannot be processed
   When the email is received by 'The System'
   Then an error message is returned
   And the email is not processed further

5. Scenario: Email with text that exceeds the character limit is received
   Given the email contains text that exceeds the character limit
   When the email is received by 'The System'
   Then an error message is returned
   And the email is not processed further

6. Scenario: Email with an image that exceeds the size limit is received
   Given the email contains an image that exceeds the size limit
   When the email is received by 'The System'
   Then an error message is returned
   And the email is not processed further

7. Scenario: Email with unsupported image format is received
   Given the email contains an unsupported image format
   When the email is received by 'The System'
   Then an error message is returned
   And the email is not processed further

8. Scenario: Email is received from an unauthorized email address
   Given the email is received from an unauthorized email address
   When the email is received by 'The System'
   Then an error message is returned
   And the email is not processed further

9. Scenario: Generated Instagram post is successfully saved to file
   Given an image and text are formatted for Instagram posting
   When the formatted image and text are saved to a file
   Then the file is saved successfully

10. Scenario: Generated Instagram post fails to save to file
   Given an image and text are formatted for Instagram posting
   When the formatted image and text are saved to a file
   Then an error message is returned
   And the file is not saved successfully

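What a great start! I can usually come up with reasonable failure scenarios, and the first thing I noticed was that there were some here I hadn't even considered, such as the character limit and the attachment size limit.

It shouldn't be overlooked that this was generated in about a minute. I estimate it would have taken me half an hour of thinking and typing, so this is a dramatic speed-up.

Reviewing the output, some things aren't important for an MVP because I will be the only user (unauthorized email addresses and image formats). Some things were also missed, probably because of ambiguity in my starting story. So I asked for some refinements: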

Please make the following changes:

 1. The image is received as an attachment and only one attachment is allowed.
 2. The outcome is not a formatted image. The outcome is that the image is extracted and saved, and the text is extracted and saved in a suitable format for subsequent processing.
 3. The text in the email can specify multiple image captions and a carousel caption.
 4. The product is an MVP and so scenarios 6, 7 and 8 are not required

I won't repeat the unchanged parts; here are the updated scenarios it created:

1. Scenario: Email with one image attachment and text is successfully processed
   Given the email contains one image attachment and text
   When the email is received by 'The System'
   Then the image and text are extracted from the email
   And the image and text are saved in a suitable format for subsequent processing

2. [unchanged]

3. Scenario: Email with more than one image attachment is received
   Given the email contains more than one image attachment
   When the email is received by 'The System'
   Then an error message is returned
   And the email is not processed further

4. Scenario: Email with text specifying multiple image captions and a carousel caption is received
   Given the email contains text specifying multiple image captions and a carousel caption
   When the email is received by 'The System'
   Then the image captions and carousel caption are extracted from the text
   And the captions are saved in a suitable format for subsequent processing

5. Scenario: Extracted image and text are successfully saved
   Given the image and text are extracted from the email
   When the image and text are saved
   Then the image and text are saved successfully

6. Scenario: Extracted image and text fail to save
   Given the image and text are extracted from the email
   When the image and text are saved
   Then an error message is returned
   And the image and text are not saved successfully

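Hmm... OK, it removed 6, 7 and 8 as requested, but it also removed the original scenarios 3, 4 and 5, which is odd. It also brought to my attention that the new scenarios 5 and 6 don't really say anything. I decided to keep scenario 6, since an internal failure could occur and that is a fair test case, but scenario 5 is basically the same as scenario 1.

I found steering existing output from ChatGPT tricky and frustrating, but I could still hand-pick the best from all of the output. However, the scenarios are still quite high level. What about the missing detail? I decided to try another approach.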
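
To illustrate steps 2 and 3 of the approach, here is a minimal sketch of how the revised scenario 3 (more than one image attachment) might become a test, with just enough code to make it pass. The names `process_email`, `ProcessingError` and `make_email`, and the choice to signal the error with an exception, are my own assumptions. The scenario only says that "an error message is returned", and that gap has to be filled before a test can be written at all.

```python
from email.message import EmailMessage


class ProcessingError(Exception):
    """Hypothetical error type; the scenario only says 'an error message is returned'."""


def process_email(msg: EmailMessage) -> None:
    # Just enough code to satisfy the one scenario under test.
    images = [
        part for part in msg.iter_attachments()
        if part.get_content_maintype() == "image"
    ]
    if len(images) > 1:
        raise ProcessingError("only one image attachment is allowed")


def make_email(image_count: int) -> EmailMessage:
    # Test helper: builds an email with body text and N fake image attachments.
    msg = EmailMessage()
    msg["Subject"] = "test"
    msg.set_content("some text")
    for i in range(image_count):
        msg.add_attachment(b"fake-bytes", maintype="image", subtype="png",
                           filename=f"image{i}.png")
    return msg


def test_more_than_one_image_attachment_is_rejected() -> None:
    # Given the email contains more than one image attachment
    msg = make_email(image_count=2)
    # When the email is received by 'The System'
    try:
        process_email(msg)
    except ProcessingError as err:
        # Then an error message is returned
        assert "one image attachment" in str(err)
    else:
        raise AssertionError("expected the email to be rejected")
```

The Given, When and Then lines map directly onto comments in the test, but the assertion itself depends on a decision the scenario never made.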

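Refinement: Example Mapping

I loved Example Mapping when I was introduced to it. As a developer, when I'm given a detailed user story or requirement I always find myself trying to think of examples to test, and often find that I either can't come up with any or come up with the wrong ones because of a misunderstanding.

Missing examples can be resolved mid-sprint by talking to stakeholders, but incorrect scenarios are usually not discovered until the end of the sprint. Example Mapping aims to change this: stakeholders, developers and QA discuss the scenarios together to pin down examples of what they mean.

The scenarios from the first experiment may look fairly complete, but the language they use, such as "an error message is returned", is not specific and cannot be tested. To get more detail out of ChatGPT, I decided to put it in an Example Mapping session instead. Since I can wear at least two of the hats in an Example Mapping session, I can guide ChatGPT to write some examples.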

You are a software product specialist with knowledge of BDD, User Stories and Example Mapping. You are participating in an Example Mapping session. This session follows these rules for a given user story:

Start by writing the story under discussion on a yellow card.

Next write each of the acceptance criteria, or rules that we already know, on a blue card.

For each rule, we may need one or more examples to illustrate it. We write those on a green card and place them under the relevant rule.

Examples are written in the BDD Given-When-Then scenario format as follows:
Given (an initial context)
When (an event occurs)
Then (a testable outcome is satisfied)

As we discuss these examples, there may be questions that nobody in the room can answer. Capture those on a red card and move on with the conversation.

---

Take the following user story:

As an Instagram user,
I want to send an email to 'The System' which contains an image and some structured text
So that the image and text are extracted in a way that it can be posted as an Instagram post or carousel

Take the following rules:

1. The email must contain 1 image file attachment
2. The email must have a subject "INSTAGRAM POST <date>"
3. The email must contain some body text
4. The body text must be structured to create either a single image post or a carousel post of 2 or 3 images
5. The structured body text must contain a general post caption and a caption for each desired image in the post
6. Error messages are returned as a HTTP 400 code with an associated message

Write the examples as per an Example Mapping session, and also write any unknown questions.

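I won't bore you with the text output. Instead, let's put the results on cards, as in a real Example Mapping session. Unfortunately, image generators aren't good at generating text (yet), so this one was made by hand. Please forgive any poor formatting, as some of the output is quite long.

[Figure: Example Mapping result]

At first glance this looks pretty good, and it also only took a minute or so. However, there are some problems that I wouldn't let out of the room in a real session. Let's look at the positives first.

  • ChatGPT structured the cards correctly (in the text output, the examples were correctly aligned under the rules)
  • ChatGPT did produce examples based on the rules, and distinguished between input scenarios
  • The additional questions blew me away. They were all valid: attachment naming, additional body text and the receiving address

Now the negatives:

  • The outcome of the user story was again interpreted as posting to Instagram, which is not the case
  • The failure cases are testable, but the success cases are not really testable. I prefer examples that describe an outcome that translates almost directly into a test
  • The carousel scenarios are missing a test for 3 captions
  • The carousel scenarios are missing a failure case for more than 3 captions

Now, you could say that this is (again) because it wasn't given enough guidance. For the first two points that may be true, but for the last two I think there was enough information for a human to work it out (as with the rule for the image). Then again, this is often exactly what gets discussed in an Example Mapping session.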
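
The two missing carousel examples are boundary cases, and boundary cases are easy to write down as a small table of concrete examples. The sketch below assumes a hypothetical `classify_post` function that takes the list of image captions and follows rule 4 (a single image post, or a carousel of 2 or 3 images). Rejecting an empty list and using `ValueError` are my assumptions, not something the rules state.

```python
def classify_post(captions: list[str]) -> str:
    """Return 'single' or 'carousel' based on the number of image captions."""
    if len(captions) == 1:
        return "single"
    if 2 <= len(captions) <= 3:
        return "carousel"
    raise ValueError(f"expected 1 to 3 captions, got {len(captions)}")


# Each row plays the role of one green card: (captions, expected outcome).
EXAMPLES = [
    (["a"], "single"),
    (["a", "b"], "carousel"),
    (["a", "b", "c"], "carousel"),       # the missing 3-caption example
    (["a", "b", "c", "d"], ValueError),  # the missing more-than-3 failure case
    ([], ValueError),                    # assumption: no captions is an error
]


def check_examples() -> None:
    for captions, expected in EXAMPLES:
        if isinstance(expected, str):
            assert classify_post(captions) == expected
            continue
        try:
            classify_post(captions)
        except expected:
            continue
        raise AssertionError(f"{len(captions)} captions should be rejected")
```

Written this way, a gap at the boundary shows up as a missing row, which is easier to spot than a missing paragraph of Given-When-Then text.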

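Since this is an experiment with ChatGPT, let's try again with more direction and with some of the questions answered. Here are the updated story and rules: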

Take the following user story:

As an Instagram user,
I want to send an email to 'The System' which contains an image and some structured text
So that the image and text are extracted and stored so that it can be later used to generate an Instagram post.

Take the following rules:

1. The email is addressed to "me@postrobot.com"
2. The email must contain 1 image file attachment and can have any name
3. The email must have a subject "INSTAGRAM POST <date>"
4. The email must contain some body text
5. Body text containing "POST TEXT" and "CAPTION" labels create a single image post
6. Body text containing "POST TEXT" and "CAPTION-1"..."CAPTION-N" labels create a carousel post
7. A carousel post contains no more than 3 images
8. Text extracted from the email is stored as JSON, containing the date, post text, captions and image name
9. Error messages are returned as a HTTP 400 code with an associated message

Write the examples as per an Example Mapping session, and also write any unknown questions.

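For brevity I'll skip the detailed output. Here are the main findings from the response:

  • The output now contains only a positive example scenario, whereas previously it also contained negative ones
  • Because there are no negative cases, the only output touching on errors (the response to rule 9) is "Then an HTTP 400 code and an associated message are returned", which is not helpful
  • Some of the questions it now raises had been worked out in the earlier attempt. For example, it now asks "What happens if the email has no image attachment?". I think a human would infer the answer from rules (2) and (9).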
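
For comparison, here is one way a developer might read rules 2 to 9, including the negative cases that follow from rules (2) and (9). It is a sketch under several assumptions that the rules leave open: the date in the subject is ISO formatted (YYYY-MM-DD), each label sits on its own line followed by a colon, the JSON key names are my own, and errors are modelled as a hypothetical `BadRequest` exception carrying the 400 status. Storing the JSON and the image is left out.

```python
import json
import re
from datetime import date
from email.message import EmailMessage

MAX_CAROUSEL_IMAGES = 3  # rule 7


class BadRequest(Exception):
    """Hypothetical error carrying the HTTP 400 status from rule 9."""
    status = 400


def extract_post(msg: EmailMessage) -> str:
    # Rule 3: subject must be "INSTAGRAM POST <date>" (ISO date is an assumption).
    match = re.fullmatch(r"INSTAGRAM POST (\d{4}-\d{2}-\d{2})", msg.get("Subject", ""))
    if not match:
        raise BadRequest("subject must be 'INSTAGRAM POST <date>'")
    try:
        post_date = date.fromisoformat(match.group(1))
    except ValueError:
        raise BadRequest("subject date is not a valid date") from None

    # Rule 2: exactly one image attachment, with any file name.
    images = [p for p in msg.iter_attachments() if p.get_content_maintype() == "image"]
    if len(images) != 1:
        raise BadRequest("email must contain exactly 1 image attachment")

    # Rule 4: some body text is required.
    body = msg.get_body(preferencelist=("plain",))
    text = body.get_content() if body else ""
    if not text.strip():
        raise BadRequest("email must contain body text")

    # Rules 5 and 6: one "LABEL: value" pair per line is an assumed layout.
    pairs = re.findall(r"^(POST TEXT|CAPTION(?:-\d+)?):[ \t]*(.+)$", text, re.MULTILINE)
    post_text = [value.strip() for label, value in pairs if label == "POST TEXT"]
    captions = [value.strip() for label, value in pairs if label.startswith("CAPTION")]
    if len(post_text) != 1:
        raise BadRequest("body must contain exactly one POST TEXT label")
    if not captions:
        raise BadRequest("body must contain at least one caption")
    if len(captions) > MAX_CAROUSEL_IMAGES:
        raise BadRequest(f"a carousel post contains no more than {MAX_CAROUSEL_IMAGES} images")

    # Rule 8: JSON with the date, post text, captions and image name.
    return json.dumps({
        "date": post_date.isoformat(),
        "post_text": post_text[0],
        "captions": captions,
        "image_name": images[0].get_filename(),
    })
```

Every `raise BadRequest(...)` line corresponds to a negative example that was missing from the output. Each one also forced a choice of concrete message, which is the detail the generated scenarios never supplied.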

[Figure: Hitting The Target]

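Conclusion

Working with ChatGPT to expand a user story into testable scenarios showed once again that an LLM like this is very useful at a general level but not at a detailed level. On the positive side, all of the answers above arrived within seconds, which means I spent more time analysing the results and looking for errors than typing prompts and waiting for output.

All in all, I probably spent only a few minutes actually interacting with the system. It gave impressive results, letting me build up a list of scenarios to consider as well as questions I might have missed. In that sense it worked very well as a team-mate.

The problems appear when looking for detail. Maybe I'm being too picky, but some of the output was of no use for moving forward. Using ChatGPT for high-level behaviour is great, but every attempt to dig into any level of detail left me disappointed. If re-prompting for more detail is routinely needed, I might as well add the detail myself.

I also found a lack of consistency between prompts. It mostly behaved well, but it forgot previous results often enough to frustrate me. If I saw this behaviour from a software developer on my team, I'd be annoyed.

Perhaps this is to be expected. After all, LLMs don't "think"; they perform highly advanced statistical pattern matching. For many purposes that is good enough, and it provides plenty of useful responses and food for thought. For now, though, a human is needed to take it further. (Note that this was done with ChatGPT 3.5; perhaps GPT-4 would go further.)

Overall, ChatGPT gets top marks as a productivity assistant. I suspect that, beyond just churning out boilerplate text, putting a little thought into the input and getting output generated quickly will save a lot of time in the real world. So I can see a huge benefit.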