Prompt Engineering 教你如何与chatgpt沟通!

559 阅读17分钟

本文介绍如何与chatGPT进行有效沟通,即prompt engineering。主要分为三部分:

  • GPT Prompt基本组成
  • Prompt Engineering策略
  • Prompt Engineering原则

一、GPT Prompt基本组成

GPT模型的目的是在前面所给出的输入序列的基础上,生成最有可能承接上文的输出,是个纯粹的语言模型。当在GPT中询问问题时,并不是按照一个“Q&A”模式来得到输出的,而是按照基本语言模型的形式,生成了承接上文问题的输出,恰好是问题的答案。下面给出一种分解prompt的方式,按照使用频率来排序。

Prompt 主要由 examples, primary content, supporting content, instructions 和 cue 组成。

(下文括号内容用于注释理解,实际不会加在prompt)中

1. Instructions(指令)

Instructions用来告诉模型需要做什么,越详细的instruction越能更好地对GPT进行指导,从而生成更贴合需求的输出。下表是instructions实例:

TypePrompt
Simple instructionsWrite an introduction for a weekly newsletter. 
Complex Instructions(instructions)
Write an introduction for a weekly newsletter, including the following: 
- My company is named contoso 
- An exuberant greeting 
- Thanking the team for their hard work over a couple tough months 
- Positive outlook for the coming quarter 
- Signed by the SLT

2. Primary Content(主要内容)

Primary content指GPT需要处理或转换的文本信息。典型例子是英文转法语翻译任务中,英文文本是primary content,“Translate to French:”是instruction.

PromptGPT Output
(primary content)
Can you please tell me how to get to the museum?
(instructions)
Translate to French:
Pouvez-vous s'il vous plaît me dire comment aller au musée?

3.Examples(示例)

成功的prompt一般具有“one-shot”或“few-shot”特性,即给出若干示例来使模型更好地认识任务。一个好的示例一般是输入输出对(input output pairs)。这并不意味着模型参数被永久调整了,其只是作为语言模型的输入,影响了当前输出。没有例子的prompt一般认为是“zero-shot”。从下面的例子可以看出:具有示例时,GPT模仿示例生成了更为简洁的输出,更加贴合实际需求。

TypePromptGPT output
zero-shot Headline: Coach confident injury won't derail Warriors Topic: The coach is confident that the injury won't derail the Warriors' season. The team is still focused on their goals and that they will continue to work hard to achieve them.
few-shot(Examples)
Headline: Twins' Correa to use opt-out, test free agency
Topic: Baseball
Headline: Qatar World Cup to have zones for sobering up
Topic: Soccer
Headline: Yates: Fantasy football intel for Week 6
Topic: Football
(primary content)
Headline: Coach confident injury won't derail
(instructions)
Warriors Topic:
Basketball

4. Cue(提示)

Cue用于给GPT一些对于输出格式的提示。如给定回答的第一句话以什么开头等。可以看到,下面的例子中,加入了cue后,GPT按照cue给出的格式进行了输出。

CuesPromptGPT Output
Performance reviews are a core component of our Performance & Development approach at Contoso. […] Summarize the above email message:The purpose of this email is to inform employees of recent changes to the performance review experience, which is part of Contoso's Performance & Development approach. The changes include an improved structure, upgraded in-tool user experience, and focused prompts. Additionally, there is now an optional Conversation Starters section for employees to flag topics they'd like to discuss with their managers.
2(primary content)
Performance reviews are a core component of our Performance & Development approach at Contoso. […]
(instructions)
Summarize the above email message:
(Cue)
Key Points:•
Contoso is introducing changes to the performance review experience, based on feedback from employees.
• The changes include a new focus on core priorities, expectations, and impact; an upgraded user experience; and new guidance on company-wide core priorities.
• Performance reviews are a time to reflect on past performance and learning as well as to plan for future priorities and growth opportunities.
• Employees and managers should write comments in advance of the performance review meeting. 
• There should be at least two performance reviews per year.
• The time investment for each review should remain about the same.

5. Supporting Content(补充内容)

Supporting content可以认为是补充材料。常见的例子如:在prompt中加入当前时间、用户名、用户喜好等信息,当然这和primary content之间没有明显界限,可以混用。

Have SupPrompt GPT Output
NoWorkshop list: Title DescriptionPlanning Workshop Kickoff Please join us to kick off our workshop! Hear from our leaders about cross group priorities and learnings for the year ahead. The invitation to this event has already been sent to everyone in […]Summarize the workshops above, grouped by topic:Workshop list: Planning Workshop Kickoff
- Planning Workshop Wrap-up
- Core MeetUp
- Customer Driven Perspectives
- Impact and Key Focus Areas
- Best practices
- Accelerating app adoption […]
Yes (cue)
Workshop list: Title Description
(primary content)
Planning Workshop Kickoff Please join us to kick off our workshop! Hear from our leaders about cross group priorities and learnings for the year ahead. The invitation to this event has already been sent to everyone in […]
(supporting content)
My Important Topics: prompt engineering, search, GPT models
(instructions)
Summarize the workshops above, grouped by my important topics:
1. Prompt Engineering:
- New prompt engineering tips
- The art of prompt engineering
2. Search:
- Intro to vector search with embedding
3. GPT Models:
- Intro to GPT-4
- ChatGPT in-depth.

二、Prompt Engineering策略

首先需要了解system message. System message一般加在prompt开头,用于指导模型情景信息、指令信息等,常见应用是让GPT扮演特定角色,定义它的输入输出等。如下例,让GPT扮演一个AI助手,使用押韵的方式回答问题,可以看到GPT完成得很好。

System MessageInputGPT Output
You are an AI assistant that helps people find information and responds in rhyme. If the user asks you a question you don't know the answer to, say so.What can you tell about me, John Doe?Dear John, I'm sorry to say, But I don't have info on you today. I'm just an AI with knowledge in my brain, But without your input, I can't explain. So please tell me more about what you seek, And I'll do my best to give you an answer unique.

其他system message示例如下:

  • “Assistant is an intelligent chatbot designed to help users answer technical questions about Azure OpenAI Service. Only answer questions using the context below and if you're not sure of an answer, you can say "I don't know".

  • “Assistant is an intelligent chatbot designed to help users answer their tax related questions.”

  • “You are an assistant designed to extract entities from text. Users will paste in a string of text and you will respond with entities you've extracted from the text as a JSON object. Here's an example of your output format: {  "name": "", "company": "", "phone_number": "" }

即使你在system message中加入当你不确定时请输出“我不确定”,这并不能确保GPT的输出一定正确。设计良好的system message能增加GPT对于结果的置信度,但不能保证每次的输出一定符合system message设计。

1. Few-shot Learning

使用示例来使GPT能够获取更多的任务信息,从而更好地适应新的任务,完成模型迁移。其本质是利用语言模型强大的建模能力来提供更多信息,使模型能够根据所提供的信息输出最合理的回答。

System MessageFew-shot Examples
Assistant is an intelligent chatbot designed to help users answer their tax related questions.
Instructions
- Only answer questions related to taxes.
- If you're unsure of an answer, you can say "I don't know" or "I'm not sure" and recommend users go to the IRS website for more information.
User - "When do I need to file my taxes by?"
Assistant - "In 2023, you will need to file your taxes by April 18th. The date falls after the usual April 15th deadline because April 15th falls on a Saturday in 2023. For more details, see www.irs.gov/filing/indi…

User - "How can I check the status of my tax refund?"
Assistant - "You can check the status of your tax refund by visiting www.irs.gov/refunds "

2. Non Chat Scenarios

尽管chatgpt在训练时使为了处理多轮对话任务的,它也可以用于其他非对话类任务。这得益于其训练时的instruction-tuning策略。如下例所示为情感分析任务:

System MessageInputGPT Output
You are an assistant designed to analyze sentiment from speech data. Users will paste in a string of text and you will respond with an assessment of the response of the speaker. Rate on a scale of 1-10 (10 being highest). Explain why this rating was given.hi thank you for calling contoso pharmacy who am i speaking with today i understand your frustration mary um can you tell me what exactly you're trying to accomplish OK uh certainly happy to check that for you one moment please i see here that you were on a generic form of lipitor is that right OK uh so i see that your doctor stopped prescribing these drugs in twenty twenty one actually OK uh yeah i'd be happy to check that for you uh because sometimes there's a gap in the system and it just doesn't get reported so let me take a look here just a moment so i'm seeing here that your doctor had these drugs prescribed to you from twenty twelve through twenty twenty one OK um yeah according to the latest records uh provided by doctor's office you're now on a different drug metformin would you like us to go ahead and fill that prescription for you for pick up a bit of form in five hundred milligrams uh you're very welcome mary please let us know if there's anything else we can do for you today.Sentiment rating: 8 The speaker in this text seems to be polite, helpful, and patient while addressing the concerns of the customer. They take the time to check the information, provide clear explanations, and offer assistance. However, the rating is not a 10 because there is still a hint of uncertainty in some parts of the conversation, which may not convey complete confidence to the customer

3. Where to Place Instructions

GPT prompt中信息的顺序很重要。一方面,微软研究表明,将instruction放在prompt开头,即在提供额外的上下文信息或示例之前,能够使GPT生成更高质量的输出。但另一方面,模型更容易收到近期偏差(recency bias)影响,意味着在prompt后端的信息应该比开头的信息影响力更大,因为可以看到更多开头的信息。

因此建议在prompt开头和结尾都加上instructions,如:

System MessageInputGPT Output
You are an AI assistant that helps people find information.(instructions)
Your task is to verify if the statement "Several sources mention a chance of another large eruption" is supported by a specific quote from the following set of snippets.
(supporting content)
SNIPPETS
[1] 14 percent chance of megaquake hitting Seattle, experts say SEATTLE - There's a 14 percent chance of a magnitude 9 Cascadia earthquake hitting Seattle in the next 50 years, the U.S. Geological Survey estimates. "Unfortunately, we are unable to... 
[2] Earthquake experts lay out latest outlook for Seattle's 'Really Big One’ “We say that there's approximately a 14% chance of another approximately magnitude-9 earthquake occurring in the next 50 years,” said a geophysicist at the University of Washington...3
(instructions)
Is the statement "Several sources mention a chance of another large eruption" directly implied or stated by the snippets?
No, the statement is not directly implied or stated by the snippets. The snippets mention a chance of a "megaquake" and a "magnitude 9 Cascadia earthquake" hitting Seattle in the next 50 years, but do not mention a chance of another large eruption.

4. Prime the Output

主要思想为使用上文所说的cue来对GPT输出进行规范,使之按照期望的输出格式进行输出。如:

  • “Here’s a bulleted list of key points:\n- ”,可以使GPT输出使用‘-’作为分隔符的无序列表。

  • “John Smith is married to Lucy Smith. They have five kids, and he works as a software engineer at Microsoft. What search queries should I do to fact-check this? ## One possible search query is:” 最后一句话使GPT只输出一个结果而不是多个。

5. Add Clear Syntax

使用清晰的语法来使prompt结构清晰,让GPT更容易解析,如标点符号、标题和章节标记等。在下面的例子中,使用分隔符“---”来使prompt中各部分内容条理清晰,并且“PARAGRAPH”和“QUERIES”均大写来与其他内容相区分。

System MessageInputGPT Output
You are an AI assistant that helps people find information.You will read a paragraph, and then issue queries to a search engine in order to fact-check it. Also explain the queries.
---
PARAGRAPH
John Smith is married to Lucy Smith. They have five kids, and he works as a software engineer at Microsoft. What search queries should I do to fact-check this?
---
QUERIES
1. "John Smith Microsoft" - To check if John Smith is indeed employed at Microsoft.
2. "John Smith Lucy Smith" - To check if John Smith is married to Lucy Smith.
3. "John Smith children" - To check if John Smith has five children.
如果你不清楚应该使用哪种语法来使prompt条理清晰,推荐使用MarkdownXML,因为模型训练时使用了很多使用Markdown和XML编写的网页内容。

6. Break the Task Down

当prompt中任务量很大时,我们应将任务分解,而非直接将全部任务一股脑输入给GPT。任务分解可以看作是一步简单的人工对于任务的解译,可以减少对于GPT性能的依赖程度。下例中应用了任务分解,并使用分隔符来区分不同的任务,使得各个子任务条理清晰。

System MessageInput GPT Output
You are an AI assistant that helps people find information.You will read a paragraph, and then issue queries to a search engine in order to fact-check it.
---
PARAGRAPH
John Smith is married to Lucy Smith. They have five kids, and he works as a software engineer at Microsoft. What search queries should I do to fact-check this?
---
Now you will extract factual claims, and issue queries to fact-check them. When issuing a query, use the function
SEARCH("query")
FACTUAL CLAIMS
1. John Smith is married to Lucy Smith
2. They have five kids
3. He works as a software engineer at Microsoft 
QUERIESSEARCH("John Smith married Lucy Smith")
SEARCH("John Smith family size")
SEARCH("John Smith Microsoft software engineer")

7. Chain of Thought

思维链(CoT)可以认为是Break the Task Down的变体。我认为二者区别是:

  • Break the Task Down中对于任务的分解由人工完成,但人类分解任务的模式可能并不是GPT进行解译时的最佳方案;
  • CoT旨在教导GPT来一步一步生成结果,并将结果进行展示,因此可以认为使GPT有了思考的过程。在实际应用过程中建议与Few-shot Learning结合,即给若干例子。  
但这里有一个疑问,如果让GPT按CoT进行迭代输出,是否需要考虑error accumulation的影响?如果按步骤输出,后面的步骤一定会基于前面的结果进行输出,因此会受到前面输出的影响。此时若前面步骤方向有问题,是否可以认为后面这种偏移会累积,导致结果与预期相差甚远。而Break the Task Down在输出时,GPT不仅会考虑自己根据每个步骤的输出,还会考虑prompt中给定的步骤,可以认为是存在Teacher Force。那么是否可以认为这种方式的error accumulation会低于CoT。
System MessageInputGPT Output
You are an AI assistant that helps people find information.Who was the most decorated (maximum medals) individual athlete in the Olympic games that were held at Sydney? Take a step-by-step approach in your response, cite sources and give reasoning before sharing final answer in the below format: ANSWER is: <name>Step 1: Determine the year of the Sydney Olympic games. The Sydney Olympic games were held in the year 2000.

 Step 2: Find a list of individual athletes who won multiple medals in the 2000 Sydney Olympics.According to the official website of the Olympic Games, there were several individual athletes who won multiple medals in the 2000 Sydney Olympics. Some of the notable names include:
• Ian Thorpe (Australia) - 3 gold, 2 silver, 0 bronze
• Pieter van den Hoogenband (Netherlands) - 2 gold, 0 silver, 2 bronze
• Lenny Krayzelburg (USA) - 3 gold, 0 silver, 0 bronze
• Inge de Bruijn (Netherlands) - 3 gold, 1 silver, 0 bronze

 Step 3: Determine the athlete with the most medals. After reviewing the list of individual athletes who won multiple medals in the 2000 Sydney Olympics, we can see that the athlete with the most medals is Ian Thorpe from Australia, who won a total of 5 medals (3 gold and 2 silver).

 Therefore, the ANSWER is: Ian Thorpe.

8. Use of APIs

可以在prompt中 调用不同的API来实现LLM 所不能实现的功能*,实现了对prompt的自动化加工,如获取搜索引擎搜索结果、调用计算器、获取当前时间等。

这里建议参考ToolFormer,其主要思想为给出一个pure prompt,它可以给你输出带有API calls的prompt。然后通过外部API调用,将结果拼接在pure prompt中得到带有必要信息的prompt作为GPT的输入信息,实现了自动化流程处理

本节讲述的策略为手动输入API calls。如下例,当遇到SEARCH标记时,用户调用搜索引擎并将结果粘贴在SNIPPETS中得到新的prompt。之后将处理完成的结果输入到GPT中得到输出。

System MessageInputGPT Output
You are an AI assistant that helps people find information.You will read a paragraph, extract factual claims, and then use search engine results to fact-check them
---
PARAGRAPH
John Smith is married to Lucy Smith. They have five kids, and he works as a software engineer at Microsoft. What search queries should I do to fact-check this?
---
FACTUAL CLAIMS
- John Smith is married to Lucy Smith
- John and Lucy have five kids- John works as a software engineer at Microsoft
---
Here are various search queries issued to research the claims above:
SEARCH QUERIES
- John Smith married to Lucy Smith
- John Smith number of children
- John Smith software engineer Microsoft
---
Here are snippets of the search results:
SNIPPETS
(这里的信息是用户通过调用API二次加工生成的,原始prompt中只有placeholder)
[1] … John Smith’s wedding was on September 25, 2012 …
[2] … John Smith was accompanied by his wife Lucy to a party
[3]John was accompanied to the soccer game by his two daughters and three sons
[4] … After spending 10 years at Microsoft, Smith founded his own startup, Tailspin Toys
[5] John M is the town smith, and he married Fiona. They have a daughter named Lucy
---
Given the snippets, fact check each of the factual claims above:
- John Smith is married to Lucy Smith: Confirmed. Snippet [1] and [2] both mention John Smith's wife as Lucy. 
- John and Lucy have five kids: Confirmed. Snippet [3] mentions that John was accompanied by his two daughters and three sons. 
- John works as a software engineer at Microsoft: Partially confirmed. Snippet [4] mentions that John spent 10 years at Microsoft, but it does not explicitly state that he currently works there. Further research may be needed to confirm his current employment status.

9. Export Supporting Materials

使GPT回答问题时不止输出结果,还要输出判断理由或参考资料有利于减少GPT捏造结果的可能性,提高输出的准确性。因为如果需要给出理由或材料,那么当GPT需要捏造结果时,需要捏造两个事实,一个是假的结果,另一个是假的引用。当需要造成错误较多时,相当于训练过程中loss大,因此GPT可能就不会选择捏造事实,而是输出不知道。这样就能增加结果的置信度。

在给定理由时,应给出行内理由,而非在结尾一并给出理由。考虑是因为行内理由与之前的结果距离近,对GPT的输出影响更大。

System MessageInputGPT Output
You are an AI assistant that helps people find informationList down the different behaviors exhibited by the Customer Support agent in the following conversation.
---
hi thank you for calling contoso pharmacy who am i speaking with today i understand your frustration mary um can you tell me what exactly you're trying to accomplish OK uh certainly happy to check that for you one moment please i see here that you were on a generic form of lipitor is that right OK uh so i see that your doctor stopped prescribing these drugs in twenty twenty one actually OK uh yeah i'd be happy to check that for you uh because sometimes there's a gap in the system and it just doesn't get reported so let me take a look here just a moment so i'm seeing here that your doctor had these drugs prescribed to you from twenty twelve through twenty twenty one OK um yeah according to the latest records uh provided by doctor's office you're now on a different drug metformin would you like us to go ahead and fill that prescription for you for pick up a bit of form in five hundred milligrams uh you're very welcome mary please let us know if there's anything else we can do for you today.
---
How was the response of the Customer Support agent? Classify and list down their various behaviors and cite the reason behind the classification, using the format BEHAVIOR("reason") 
1. Polite Greeting (行内理由) BEHAVIOR("Agent starts the conversation with a polite greeting and asks for the caller's name")
2. Empathetic
BEHAVIOR("Agent acknowledges Mary's frustration and tries to understand her concern")
3. Attentive
BEHAVIOR("Agent listens to the customer's problem and asks for more details")
4. Proactive
BEHAVIOR("Agent offers to check the issue for the customer")
5. Informative
BEHAVIOR("Agent provides the relevant information about the customer's prescription history")
6. Thorough
BEHAVIOR("Agent checks for any possible gaps in the system and provides a complete response")
7. Confirmatory
BEHAVIOR("Agent confirms the current medication with the customer")
8. Solution-oriented
BEHAVIOR("Agent offers to fill the new prescription for the customer")
9. Courteous
BEHAVIOR("Agent thanks the customer and offers further assistance if needed")

三、Prompt Engineering原则

  • Be specific:prompt 应尽量具体,减少歧义,限制GPT输出的可选择空间;

  • Be descriptive:可以使用类比的方法设计prompt,多用示例;

  • Order matters:输入文本的顺序不同也会导致GPT输出不同,因为GPT长距离建模能力没有想象中那么强,距离远近对于建模能力的影响很大;

  • Give GPT a default:当模型不能解决所提出的问题时,应给模型一个缺省值。如:respond with ‘not found’ if the answer is not present. 这能让GPT产生更少的答非所问的或捏造的回答,但这种策略并不是总有效的。

  • Format data more efficiently:GPT会将输入语句划分为tokens(或理解是subwords)进行处理。受限于有限的长距离建模能力,一定存在一个GPT所能接受的最大token数,并且tokens越少明显建模效果越好。因此如何高效地表示数据,使用更少的token数表示相同的意思,是个值得深究的问题。

    p9f6tXt.png
    • 右边使用数字表示的日期明显浪费了很多无用token,建议使用文字而非数字进行时间展示;
    • GPT对于表格数据建模能力更强,应使用表格而非JSON进行示例展示;
    • GPT也会对多余的空格进行建模,因此应注意prompt中空格的使用。

参考资料

1.      learn.microsoft.com/en-us/azure…

2.      learn.microsoft.com/en-us/azure…