什么是ExampleSelectors
顾名思义 是示例选择器 我们在进行提示词的编写的时候我们通常会使用few-shot的方式 给大模型一些实例的提示 那么我们要如何从示例库中找到最符合上下文含义的N个实例呢? 这个就是示例选择器要做的事情
基本的使用
基类
class BaseExampleSelector(ABC):
"""Interface for selecting examples to include in prompts."""
# 添加示例
@abstractmethod
def add_example(self, example: Dict[str, str]) -> Any:
"""Add new example to store."""
# 异步添加示例
async def aadd_example(self, example: Dict[str, str]) -> Any:
"""Add new example to store."""
return await run_in_executor(None, self.add_example, example)
# 示例选择
@abstractmethod
def select_examples(self, input_variables: Dict[str, str]) -> List[dict]:
"""Select which examples to use based on the inputs."""
# 示例选择
async def aselect_examples(self, input_variables: Dict[str, str]) -> List[dict]:
"""Select which examples to use based on the inputs."""
return await run_in_executor(None, self.select_examples, input_variables)
基于相似度的选择器
from langchain_core.example_selectors import SemanticSimilarityExampleSelector
from langchain_community.vectorstores import Chroma
from langchain_openai import OpenAIEmbeddings
examples = [
{
"question": "Who lived longer, Muhammad Ali or Alan Turing?",
"answer": """
Are follow up questions needed here: Yes.
Follow up: How old was Muhammad Ali when he died?
Intermediate answer: Muhammad Ali was 74 years old when he died.
Follow up: How old was Alan Turing when he died?
Intermediate answer: Alan Turing was 41 years old when he died.
So the final answer is: Muhammad Ali
""",
},
{
"question": "When was the founder of craigslist born?",
"answer": """
Are follow up questions needed here: Yes.
Follow up: Who was the founder of craigslist?
Intermediate answer: Craigslist was founded by Craig Newmark.
Follow up: When was Craig Newmark born?
Intermediate answer: Craig Newmark was born on December 6, 1952.
So the final answer is: December 6, 1952
""",
},
{
"question": "Who was the maternal grandfather of George Washington?",
"answer": """
Are follow up questions needed here: Yes.
Follow up: Who was the mother of George Washington?
Intermediate answer: The mother of George Washington was Mary Ball Washington.
Follow up: Who was the father of Mary Ball Washington?
Intermediate answer: The father of Mary Ball Washington was Joseph Ball.
So the final answer is: Joseph Ball
""",
},
{
"question": "Are both the directors of Jaws and Casino Royale from the same country?",
"answer": """
Are follow up questions needed here: Yes.
Follow up: Who is the director of Jaws?
Intermediate Answer: The director of Jaws is Steven Spielberg.
Follow up: Where is Steven Spielberg from?
Intermediate Answer: The United States.
Follow up: Who is the director of Casino Royale?
Intermediate Answer: The director of Casino Royale is Martin Campbell.
Follow up: Where is Martin Campbell from?
Intermediate Answer: New Zealand.
So the final answer is: No
""",
},
]
embeddings = OpenAIEmbeddings(openai_api_key="YOUR-OPEN-API-KEY",
openai_api_base="BASE-URL")
example_selector = SemanticSimilarityExampleSelector.from_examples(
# 示例集合
examples,
# 词嵌入模型
embeddings,
# 向量数据库
Chroma,
# 要生成的示例数量
k=1,
)
question = "Who was the father of Mary Ball Washington?"
selected_examples = example_selector.select_examples({"question": question})
print(f"Examples most similar to the input: \n"
f"question:\n"
f"{selected_examples[0]['question']}\n"
f"answer:\n"
f"{selected_examples[0]['answer']}"
)
OUTPUT:
Examples most similar to the input:
question:
Who was the maternal grandfather of George Washington?
answer:
Are follow up questions needed here: Yes.
Follow up: Who was the mother of George Washington?
Intermediate answer: The mother of George Washington was Mary Ball Washington.
Follow up: Who was the father of Mary Ball Washington?
Intermediate answer: The father of Mary Ball Washington was Joseph Ball.
So the final answer is: Joseph Ball
基于最大边际相关项性(多样性)的选择器
其实这种选择器一般会用在推荐系统多一些 上面所说的相关性检索 只会通过用户的输出进行相似度的匹配 返回最符合条件的一些项 但是这就可能会使用户看到很多相似的内容 这也就是我们下面所要介绍的基于MMR的选择器
首先他也是通过相似度的搜索 所以也需要基于向量数据库 去做相似度的检索
基本使用
example_selector = MaxMarginalRelevanceExampleSelector.from_examples(
# 示例集合
examples,
# 词嵌入模型
embeddings,
# 向量数据库
Chroma,
# 要生成的示例数量
k=2,
)
question = "Who was the father of Mary Ball Washington?"
selected_examples = example_selector.select_examples({"question": question})
print(f"Examples most similar to the input:{selected_examples}")
基于最大子序列的选择器
基于n-gram重叠度算法 进行排序
基本概念:
- n-gram:一个 n-gram 是一个连续的词序列,长度为 n。例如,对于 n=2(二元组),文本 "The quick brown fox" 可以分解为 ["The quick", "quick brown", "brown fox"]。
- 重叠度:两个文本之间的 n-gram 重叠度可以衡量它们在语义上的相似性。
使用:
from langchain_community.example_selectors import NGramOverlapExampleSelector
from langchain_core.prompts import PromptTemplate
examples = [
{
"question": "Who lived longer, Muhammad Ali or Alan Turing?",
"answer": """
Are follow up questions needed here: Yes.
Follow up: How old was Muhammad Ali when he died?
Intermediate answer: Muhammad Ali was 74 years old when he died.
Follow up: How old was Alan Turing when he died?
Intermediate answer: Alan Turing was 41 years old when he died.
So the final answer is: Muhammad Ali
""",
},
{
"question": "When was the founder of craigslist born?",
"answer": """
Are follow up questions needed here: Yes.
Follow up: Who was the founder of craigslist?
Intermediate answer: Craigslist was founded by Craig Newmark.
Follow up: When was Craig Newmark born?
Intermediate answer: Craig Newmark was born on December 6, 1952.
So the final answer is: December 6, 1952
""",
},
{
"question": "Who was the maternal grandfather of George Washington?",
"answer": """
Are follow up questions needed here: Yes.
Follow up: Who was the mother of George Washington?
Intermediate answer: The mother of George Washington was Mary Ball Washington.
Follow up: Who was the father of Mary Ball Washington?
Intermediate answer: The father of Mary Ball Washington was Joseph Ball.
So the final answer is: Joseph Ball
""",
},
{
"question": "Are both the directors of Jaws and Casino Royale from the same country?",
"answer": """
Are follow up questions needed here: Yes.
Follow up: Who is the director of Jaws?
Intermediate Answer: The director of Jaws is Steven Spielberg.
Follow up: Where is Steven Spielberg from?
Intermediate Answer: The United States.
Follow up: Who is the director of Casino Royale?
Intermediate Answer: The director of Casino Royale is Martin Campbell.
Follow up: Where is Martin Campbell from?
Intermediate Answer: New Zealand.
So the final answer is: No
""",
},
]
example_prompt = PromptTemplate.from_template("Question: {question}\n{answer}")
selector = NGramOverlapExampleSelector(
examples=examples,
example_prompt=example_prompt
)
question = "Who was the father of Mary Ball Washington?"
selected_examples = selector.select_examples({"question": question})
print(f"selected:{selected_examples}")
OUTPUT:
selected:[{'question': 'Who was the maternal grandfather of George Washington?', 'answer': '\nAre follow up questions needed here: Yes.\nFollow up: Who was the mother of George Washington?\nIntermediate answer: The mother of George Washington was Mary Ball Washington.\nFollow up: Who was the father of Mary Ball Washington?\nIntermediate answer: The father of Mary Ball Washington was Joseph Ball.\nSo the final answer is: Joseph Ball\n'}, {'question': 'When was the founder of craigslist born?', 'answer': '\nAre follow up questions needed here: Yes.\nFollow up: Who was the founder of craigslist?\nIntermediate answer: Craigslist was founded by Craig Newmark.\nFollow up: When was Craig Newmark born?\nIntermediate answer: Craig Newmark was born on December 6, 1952.\nSo the final answer is: December 6, 1952\n'}, {'question': 'Who lived longer, Muhammad Ali or Alan Turing?', 'answer': '\nAre follow up questions needed here: Yes.\nFollow up: How old was Muhammad Ali when he died?\nIntermediate answer: Muhammad Ali was 74 years old when he died.\nFollow up: How old was Alan Turing when he died?\nIntermediate answer: Alan Turing was 41 years old when he died.\nSo the final answer is: Muhammad Ali\n'}, {'question': 'Are both the directors of Jaws and Casino Royale from the same country?', 'answer': '\nAre follow up questions needed here: Yes.\nFollow up: Who is the director of Jaws?\nIntermediate Answer: The director of Jaws is Steven Spielberg.\nFollow up: Where is Steven Spielberg from?\nIntermediate Answer: The United States.\nFollow up: Who is the director of Casino Royale?\nIntermediate Answer: The director of Casino Royale is Martin Campbell.\nFollow up: Where is Martin Campbell from?\nIntermediate Answer: New Zealand.\nSo the final answer is: No\n'}]
基于长度的选择器
基于长度的选择器 如果超过设置的最大长度示例将会被截断
结合prompt的使用
from langchain_core.prompts import FewShotPromptTemplate
from langchain_core.prompts import PromptTemplate
from langchain_core.example_selectors import SemanticSimilarityExampleSelector
from langchain_community.vectorstores import Chroma
from langchain_openai import OpenAIEmbeddings
examples = [
{
"question": "Who lived longer, Muhammad Ali or Alan Turing?",
"answer": """
Are follow up questions needed here: Yes.
Follow up: How old was Muhammad Ali when he died?
Intermediate answer: Muhammad Ali was 74 years old when he died.
Follow up: How old was Alan Turing when he died?
Intermediate answer: Alan Turing was 41 years old when he died.
So the final answer is: Muhammad Ali
""",
},
{
"question": "When was the founder of craigslist born?",
"answer": """
Are follow up questions needed here: Yes.
Follow up: Who was the founder of craigslist?
Intermediate answer: Craigslist was founded by Craig Newmark.
Follow up: When was Craig Newmark born?
Intermediate answer: Craig Newmark was born on December 6, 1952.
So the final answer is: December 6, 1952
""",
},
{
"question": "Who was the maternal grandfather of George Washington?",
"answer": """
Are follow up questions needed here: Yes.
Follow up: Who was the mother of George Washington?
Intermediate answer: The mother of George Washington was Mary Ball Washington.
Follow up: Who was the father of Mary Ball Washington?
Intermediate answer: The father of Mary Ball Washington was Joseph Ball.
So the final answer is: Joseph Ball
""",
},
{
"question": "Are both the directors of Jaws and Casino Royale from the same country?",
"answer": """
Are follow up questions needed here: Yes.
Follow up: Who is the director of Jaws?
Intermediate Answer: The director of Jaws is Steven Spielberg.
Follow up: Where is Steven Spielberg from?
Intermediate Answer: The United States.
Follow up: Who is the director of Casino Royale?
Intermediate Answer: The director of Casino Royale is Martin Campbell.
Follow up: Where is Martin Campbell from?
Intermediate Answer: New Zealand.
So the final answer is: No
""",
},
]
embeddings = OpenAIEmbeddings(openai_api_key="OPEN-API-KEY",
openai_api_base="BASE_URL")
example_prompt = PromptTemplate.from_template("Question: {question}\n{answer}")
example_selector = SemanticSimilarityExampleSelector.from_examples(
# This is the list of examples available to select from.
examples,
# This is the embedding class used to produce embeddings which are used to measure semantic similarity.
embeddings,
# This is the VectorStore class that is used to store the embeddings and do a similarity search over.
Chroma,
# This is the number of examples to produce.
k=1,
)
prompt = FewShotPromptTemplate(
example_prompt=example_prompt,
suffix="Question: {input}",
input_variables=["input"],
example_selector=example_selector
)
question = "Who was the father of Mary Ball Washington?"
# selected_examples = example_selector.select_examples({"question": question})
res = prompt.invoke({"input": question})
print(f"res:{res}")
res:text='Question: Who was the maternal grandfather of George Washington?\n\nAre follow up questions needed here: Yes.\nFollow up: Who was the mother of George Washington?\nIntermediate
answer: The mother of George Washington was Mary Ball Washington.\nFollow up: Who was the father of Mary Ball Washington?\nIntermediate answer: The father of Mary Ball Washington was Joseph Ball.\nSo the final answer is: Joseph Ball\n\n\nQuestion: Who was the father of Mary Ball Washington?'
总结
我们在这篇文章中为大家介绍了几种常见的示例选择器 根据不同的业务类型我们可以选择不同的示例选择器 并结合few-shot prompt进行使用 下一篇我们将会为大家带来chat-model 篇 function-call,结构化返回,流式调用等