Hello-Agents Study Notes (1)

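Thanks to Datawhale for organizing this group-study program. No other organization has ever put this much effort into getting me to learn.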

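Building applications on top of foundation models, which is how agents are constructed, has become a new paradigm in software development. The process is currently a focal point of industry discussion, and many people are trying to distill it into a new category or discipline of software development. Before joining this program, I went through the relevant books of recent years in my university library. Books published in China mostly call it, quite plainly, "AI Agent development" or "building LLM applications", while Chip Huyen of Stanford calls it "AI Engineering" in her book. More recently, the better-known term is "Harness Engineering", proposed by HashiCorp co-founder Mitchell Hashimoto on February 5, 2026, which now seems to be widely accepted.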

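Although many books tell the same story, Datawhale undoubtedly tells it most vividly, and that is the main reason I joined this program. I hope to follow along with everyone, learn systematically, and make progress together in a short time.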

task 0

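This is the preparation before the course proper: setting up the environment, installing the required Python libraries, registering the API keys used by the calls, and so on. After some fiddling, the Hello World, FirstAgentTest.py, ran successfully. FirstAgentTest.py implements a minimal "Thought-Action-Observation" loop agent: a system prompt instructs the model to answer in a fixed format (Thought/Action); the Action is parsed and a local tool (weather, attraction search) is called; the Observation is written back into the prompt history; and the loop repeats until the model emits finish(...).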
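The loop described above can be sketched in a few lines. This is not the tutorial's FirstAgentTest.py: the tool functions, the `scripted_llm` stand-in for a real model call, and the exact `Action: name("arg")` format are assumptions made so that the example runs offline.

```python
import re
from typing import Callable, Dict, List

# Hypothetical local tools; the real script wires in weather / attraction-search tools.
def get_weather(city: str) -> str:
    return f"Sunny, 25C in {city}"

def search_attraction(city: str) -> str:
    return f"Top attraction in {city}: the old town"

TOOLS: Dict[str, Callable[[str], str]] = {
    "get_weather": get_weather,
    "search_attraction": search_attraction,
}

# Assumed output format: Action: tool_name("argument")
ACTION_RE = re.compile(r'Action:\s*(\w+)\("([^"]*)"\)')

def scripted_llm(history: List[str]) -> str:
    """Stand-in for a model call: picks a canned reply by counting observations."""
    observations = sum(1 for h in history if h.startswith("Observation:"))
    script = [
        'Thought: I need the weather first.\nAction: get_weather("Beijing")',
        'Thought: Now find something to visit.\nAction: search_attraction("Beijing")',
        'Thought: I have enough information.\n'
        'Action: finish("Sunny in Beijing; visit the old town.")',
    ]
    return script[min(observations, len(script) - 1)]

def run_agent(question: str,
              llm: Callable[[List[str]], str] = scripted_llm,
              max_steps: int = 5) -> str:
    history = [f"Question: {question}"]
    for _ in range(max_steps):
        reply = llm(history)                      # Thought + Action
        history.append(reply)
        match = ACTION_RE.search(reply)
        if match is None:                         # malformed output: tell the model
            history.append("Observation: could not parse an Action.")
            continue
        name, arg = match.group(1), match.group(2)
        if name == "finish":                      # termination signal
            return arg
        tool = TOOLS.get(name)
        observation = tool(arg) if tool else f"unknown tool {name}"
        history.append(f"Observation: {observation}")  # feed the result back
    return "Stopped: step limit reached."

answer = run_agent("What is the weather in Beijing and what should I visit?")
print(answer)
```

The `max_steps` cap is a design choice worth keeping in a real agent as well, since a model that never emits finish(...) would otherwise loop forever.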

task 1

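This corresponds to Chapter 4, which introduces three classic paradigms for building agents. Below is a comparison of ReAct, Plan-and-Solve, and Reflection, summarized by DeepSeek and presented as a table covering their core characteristics, workflows, and differences.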

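| Dimension | ReAct | Plan-and-Solve | Reflection |
|---|---|---|---|
| Core idea | **Interleaved reasoning and acting.** Thinking (Reason) and acting (Act) alternate; after observing the environment's feedback at each step, the next reasoning step is adjusted dynamically. | **Plan first, then execute.** Draw up a complete or stepwise high-level plan, then call tools to solve according to the plan, reducing confusion and redundancy during execution. | **Self-review and correction.** After generating a result, add a separate evaluation stage so the model can "look in the mirror", find errors, and iteratively improve its output. |
| Workflow | 1. Thought: analyze the current state and decide the next step. 2. Action: run a specific tool or operation. 3. Observation: receive the environment's feedback. Repeat until the task is complete. | 1. Plan: output a list of steps (e.g., 1. search; 2. compute). 2. Solve: execute the subtasks in order and combine them into an answer. | 1. Generate: produce an initial response. 2. Reflect: evaluate the initial response specifically for flaws (e.g., logical gaps, missed constraints). 3. Refine: regenerate based on the reflection. |
| Analogy | An explorer reading the map while walking: no fully preset route, decisions made in real time from the ground underfoot. | An engineer who studies the blueprint before building: draw the plan first, then build step by step and avoid rework. | A writer who reads the draft aloud after finishing: listening to oneself to fix awkward or imprecise sentences. |
| Main strengths | ✅ Highly adaptive; suits open-domain, high-uncertainty tasks (e.g., web navigation, complex dialogue). ✅ Highly interpretable; the causal chain of reasoning and action is clearly visible. | ✅ Efficient and stable; avoids detours in complex multi-step tasks. ✅ More controllable resource use; fewer wasted tool calls. | ✅ Markedly improves answer quality and effectively suppresses model "hallucination". ✅ Low-cost iteration; can self-improve without feedback from a real external environment. |
| Main limitations | ❌ Prone to getting stuck in loops (repeating the same wrong Thought). ❌ Puts pressure on long-context memory; high token consumption. | ❌ Inflexible; if the initial plan is wrong it is hard to correct midway (unless combined with a replanning mechanism). ❌ Unsuitable for dynamic tasks that change at every step. | ❌ Adds latency; needs at least two LLM calls. ❌ Reflection is limited by the model's own metacognitive ability (a poor model cannot reflect its way to a good result). |
| Typical use cases | Multi-hop question answering, embodied navigation, interactive software development assistants. | Math word problems, task decomposition for code generation, outlines for complex reports. | Content polishing, code debugging, customer-service replies that must meet sensitive compliance requirements. |
| Representative paper / framework | ReAct: Synergizing Reasoning and Acting in Language Models (Yao et al., 2022) | Plan-and-Solve Prompting (Wang et al., 2023) | Reflexion: Language Agents with Verbal Reinforcement Learning (Shinn et al., 2023) |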

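Accurate as DeepSeek's summary is, it is too abstract for us. The Hello Agents tutorial provides concrete code, so we can see directly how the details of each paradigm are implemented. This is crucial, and it is an important contribution from Datawhale. In research and engineering discussions about large models I used to hear the phrase "calling the LLM" all the time, and back then it sounded sophisticated. Now I can see that it is simply the process of asking the model questions with prompts. Results of running the code: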

ReAct:

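Q: What is Huawei's latest phone, and what are its main selling points?

A: "Latest" demands fresh knowledge, while a base model's knowledge is inevitably somewhat stale. The agent therefore called the search tool on its own initiative and, after 5 rounds of the reasoning-action loop, arrived at its answer.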

Plan_and_solve

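Q: A fruit shop sold 15 apples on Monday. On Tuesday it sold twice as many apples as on Monday. On Wednesday it sold 5 fewer than on Tuesday. How many apples were sold in total over the three days?

A: An agent run calls the LLM repeatedly. On the first run I hit the model's quota limit, so the last step did not return a correct answer.

After switching models the answer was correct. Because I switched to a small 27B-parameter model, responses were considerably slower than before, but it is still adequate for simple tasks.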
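For reference, the expected answer can be checked by hand: Monday 15, Tuesday 2 × 15 = 30, Wednesday 30 − 5 = 25, total 15 + 30 + 25 = 70. The sketch below mirrors the plan-then-execute split on this problem. The plan is hard-coded here, whereas in the tutorial's agent it comes from the model; the `Step` type and the `execute_plan` helper are my own illustration.

```python
from typing import Callable, Dict, List, Tuple

# A plan is an ordered list of (result name, function of earlier results).
# In a real Plan-and-Solve agent the planner LLM would emit these steps as text.
Step = Tuple[str, Callable[[Dict[str, int]], int]]

plan: List[Step] = [
    ("monday", lambda r: 15),
    ("tuesday", lambda r: 2 * r["monday"]),
    ("wednesday", lambda r: r["tuesday"] - 5),
    ("total", lambda r: r["monday"] + r["tuesday"] + r["wednesday"]),
]

def execute_plan(steps: List[Step]) -> Dict[str, int]:
    """Run the steps strictly in order; each step may read earlier results."""
    results: Dict[str, int] = {}
    for name, compute in steps:
        results[name] = compute(results)
    return results

results = execute_plan(plan)
print(results)  # {'monday': 15, 'tuesday': 30, 'wednesday': 25, 'total': 70}
```

Because every step depends only on earlier ones, a failure at the last step (as in the quota-limited run) loses the final sum even though the intermediate values were already computed.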

Reflection

Q: Write a Python function that finds all prime numbers between 1 and n.

A:

--- Starting task ---

Task: Write a Python function that finds all prime numbers between 1 and n.

--- Making initial attempt ---

🧠 Calling model Qwen/Qwen3.5-27B...

✅ LLM responded successfully:

```
from typing import List

def find_primes(n: int) -> List[int]:
    """
    Find all prime numbers between 1 and n (inclusive).

    This function implements the Sieve of Eratosthenes algorithm to efficiently
    determine prime numbers up to the given limit.

    Args:
        n (int): The upper limit of the range to search for primes.

    Returns:
        List[int]: A list of integers representing all prime numbers
                   less than or equal to n. Returns an empty list if n < 2.
    """
    if n < 2:
        return []

    is_prime = [True] * (n + 1)
    is_prime[0] = is_prime[1] = False

    for i in range(2, int(n**0.5) + 1):
        if is_prime[i]:
            for j in range(i * i, n + 1, i):
                is_prime[j] = False

    return [num for num, prime in enumerate(is_prime) if prime]
```

📝 Memory updated: added one 'execution' record.

--- Iteration 1/2 ---

-> Reflecting...

🧠 Calling model Qwen/Qwen3.5-27B...

✅ LLM responded successfully:

Code review feedback

1. Time complexity analysis

The current code implements the standard Sieve of Eratosthenes.

- Time complexity: $O(n \log \log n)$.
- Space complexity: $O(n)$.

Although this algorithm is far better than trial division ($O(n\sqrt{n})$), at the level of algorithm theory it is not the optimal way to find all primes from $1$ to $n$.

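2. Main performance bottlenecks

- Algorithmic redundancy: in the Sieve of Eratosthenes, a composite may be marked several times by its different prime factors (for example, 30 is marked False by 2, 3, and 5 separately). These repeated writes accumulate as $n$ grows and set the algorithm-level ceiling on efficiency.
- Python interpreter overhead: the inner for loop (for j in range(...)) runs very inefficiently in Python. For large $n$, the interpreter's loop overhead often exceeds the algorithm's own computation and becomes the biggest bottleneck at run time.

3. Improvement suggestions

To reach the best possible performance, optimize along the following two dimensions:

Algorithm upgrade (theoretically optimal): switch to the linear sieve (Euler's sieve).
- Principle: maintain a list of the primes found so far; when visiting each number, use those primes to mark composites, and stop as soon as the current number is divisible by one of them.
- Advantage: every composite is marked exactly once, by its smallest prime factor, which brings the time complexity strictly down to $O(n)$.
- Note: in Python, because of object-creation and list-operation overhead, the linear sieve's actual speed may be limited by the interpreter, so it needs to be paired with efficient data structures.

Implementation optimization (best in engineering terms): if the Eratosthenes logic is kept, the interpreter overhead of the inner loop must be eliminated.
- Approach: use Python's **slice assignment** in place of the inner for loop.
- Code example:
```
# Original code
# for j in range(i * i, n + 1, i):
#     is_prime[j] = False

# Optimized
is_prime[i*i : n+1 : i] = [False] * ((n - i*i) // i + 1)
```
- Effect: the loop is pushed down to the C layer, which typically yields a 10x to 100x speedup and is the key optimization for this kind of problem in Python.

Conclusion: the current code is not algorithmically optimal. Apply the slice-assignment optimization first to remove the Python performance bottleneck; if the theoretical complexity matters above all, refactor to the linear sieve.

📝 Memory updated: added one 'reflection' record.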

-> Refining...

🧠 Calling model Qwen/Qwen3.5-27B...

✅ LLM responded successfully:

```
from typing import List

def find_primes(n: int) -> List[int]:
    """
    Find all prime numbers between 1 and n (inclusive).

    This function implements the Sieve of Eratosthenes algorithm optimized
    for Python performance using slice assignment to minimize interpreter overhead.

    Args:
        n (int): The upper limit of the range to search for primes.

    Returns:
        List[int]: A list of integers representing all prime numbers
                   less than or equal to n. Returns an empty list if n < 2.
    """
    if n < 2:
        return []

    is_prime = [True] * (n + 1)
    is_prime[0] = is_prime[1] = False

    # Iterate only up to the square root of n
    limit = int(n**0.5) + 1
    for i in range(2, limit):
        if is_prime[i]:
            # Optimization: Use slice assignment to mark multiples in C-speed
            # This replaces the inner Python loop, significantly reducing overhead
            start_index = i * i
            if start_index <= n:
                # Calculate the number of elements to set to False
                count = (n - start_index) // i + 1
                is_prime[start_index : n + 1 : i] = [False] * count

    return [num for num, prime in enumerate(is_prime) if prime]
```

📝 Memory updated: added one 'execution' record.

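--- Iteration 2/2 ---

-> Reflecting...

🧠 Calling model Qwen/Qwen3.5-27B...

✅ LLM responded successfully:

Code review feedback

1. Time complexity analysis

The current algorithm is based on the Sieve of Eratosthenes, with time complexity $O(n \log \log n)$. Although slice assignment pushes the inner loop down to the C layer and improves the constant factor, at the level of algorithm theory it still does not reach linear complexity.

2. Core performance bottlenecks

- Redundant computation: the algorithm visits all even numbers (other than 2). Since those evens are necessarily composite, they take up about 50% of the memory and iterations, which is significant algorithm-level waste.
- Data-structure overhead: state is stored in a list[bool]. On a 64-bit system each boolean actually occupies one pointer (8 bytes), whereas a bytearray needs only 1 byte, so the former uses more memory and has poorer cache locality.
- Floating-point precision: n**0.5 involves floating-point arithmetic, which carries a precision risk for very large integers and is slightly slower than an integer square root.

3. Improvement suggestions

Implement an odd-only sieve:
- Handle the prime 2 separately.
- Allocate an array of size (n - 1) // 2 to store only the odd numbers.
- Set up an index mapping: array index i corresponds to the value 2*i + 3.
- Benefit: memory use is halved, the number of marking operations is halved, and performance improves noticeably.

Use bytearray:
- Replace is_prime = [True] * (n + 1) with bytearray([1]) * size.
- Benefit: a much lower memory peak and a higher CPU cache hit rate.

Use math.isqrt:
- Replace int(n**0.5) with math.isqrt(n).
- Benefit: avoids floating-point conversion and guarantees accuracy and speed for large integers.

A note on the linear sieve:
- Although Euler's sieve has a theoretical complexity of $O(n)$, in Python its inner logic depends on an explicit loop and cannot use the C-level speedup of slice assignment. Measurements show that over the vast majority of ranges of $n$, the optimized odd-only Sieve of Eratosthenes is faster than a pure-Python linear sieve. The odd-only optimization above is therefore recommended over switching algorithms.

📝 Memory updated: added one 'reflection' record.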

-> Refining...

🧠 Calling model Qwen/Qwen3.5-27B...

✅ LLM responded successfully:

```
import math
from typing import List

def find_primes(n: int) -> List[int]:
    """
    Find all prime numbers between 1 and n (inclusive).

    This function implements an optimized Sieve of Eratosthenes algorithm
    using an odd-only sieve and bytearray for memory efficiency.

    Optimizations applied:
    1. Odd-Only Sieve: Skips even numbers entirely (except 2), halving memory usage
       and iteration count.
    2. Bytearray: Uses bytearray instead of list[bool] for compact storage and
       better cache locality.
    3. Integer Square Root: Uses math.isqrt for precise and fast limit calculation.
    4. Slice Assignment: Leverages C-level slice assignment for marking composites.

    Args:
        n (int): The upper limit of the range to search for primes.

    Returns:
        List[int]: A list of integers representing all prime numbers
                   less than or equal to n. Returns an empty list if n < 2.
    """
    if n < 2:
        return []

    # Initialize result with 2, the only even prime
    primes = [2]

    # Calculate size for odd numbers starting from 3 up to n
    # Mapping: index i corresponds to number 2*i + 3
    # e.g., i=0 -> 3, i=1 -> 5, i=2 -> 7
    size = (n - 1) // 2

    # Use bytearray for memory efficiency (1 byte per flag)
    # 1 indicates prime candidate, 0 indicates composite
    sieve = bytearray([1]) * size

    # Limit for sieving is sqrt(n)
    limit = math.isqrt(n)

    # Iterate through odd numbers up to sqrt(n)
    # We iterate by index i, converting to value p = 2*i + 3
    for i in range(size):
        p = 2 * i + 3

        # Stop if p exceeds sqrt(n)
        if p > limit:
            break

        if sieve[i]:
            # Mark multiples of p starting from p*p
            # First multiple to mark is p*p
            # Convert p*p to index: (p*p - 3) // 2
            start_idx = (p * p - 3) // 2

            # In the odd-only sieve, multiples of p appear every p-th index
            # because we skip even numbers (step in value is 2*p, step in index is p)
            if start_idx < size:
                # Calculate number of elements to mark
                count = (size - 1 - start_idx) // p + 1
                # Assign 0 (False) to these positions using slice assignment
                sieve[start_idx : size : p] = b'\x00' * count

    # Collect remaining primes from the sieve
    # Map index i back to number 2*i + 3
    primes.extend(2 * i + 3 for i, is_prime in enumerate(sieve) if is_prime)

    return primes
```

📝 Memory updated: added one 'execution' record.

--- Task complete ---

Final generated code: identical to the odd-only bytearray version of find_primes returned by the second refinement step above.
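Both reviews above mention the linear (Euler) sieve, but the run never produces one. For comparison, here is a minimal sketch of it, cross-checked against plain trial division. The function names `linear_sieve` and `trial_division` are mine and not from the tutorial, and nothing here measures speed, so the reviewer's performance claims remain untested.

```python
import math
from typing import List

def linear_sieve(n: int) -> List[int]:
    """Euler's sieve: each composite is marked once, by its smallest prime factor."""
    if n < 2:
        return []
    is_composite = bytearray(n + 1)        # 0 = not yet marked
    primes: List[int] = []
    for i in range(2, n + 1):
        if not is_composite[i]:
            primes.append(i)
        for p in primes:
            if i * p > n:
                break
            is_composite[i * p] = 1
            if i % p == 0:                 # p is the smallest prime factor of i: stop
                break
    return primes

def trial_division(n: int) -> List[int]:
    """Slow reference implementation used only as a correctness oracle."""
    return [k for k in range(2, n + 1)
            if all(k % d for d in range(2, math.isqrt(k) + 1))]

print(linear_sieve(30))  # [2, 3, 5, 7, 11, 13, 17, 19, 23, 29]

# The element count used for slice assignment in the first review,
# (n - i*i) // i + 1, equals the length of the stepped range it replaces.
n, i = 100, 7
assert len(range(i * i, n + 1, i)) == (n - i * i) // i + 1
```

Slice assignment with a step requires the right-hand side to have exactly as many elements as the slice selects, which is why the count formula has to be exact.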

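Under this paradigm, the wording of the initial prompt, the reflection prompt, and the refinement prompt matters a great deal. Being able to state requirements and tasks clearly is essential, which may open up more opportunities for people with a humanities background.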
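To make the role of the three prompts concrete, here is a minimal sketch of the generate, reflect, refine loop that would produce the same sequence of memory records as the log above. The prompt texts, the `run_reflection` function, and the `fake_llm` stand-in are all assumptions for illustration; the tutorial's actual prompts and class structure may differ.

```python
from typing import Callable, Dict, List

# Hypothetical prompt templates; in practice their wording largely decides output quality.
INITIAL_PROMPT = "You are a Python expert. Write code for this task:\n{task}"
REFLECT_PROMPT = "Review this code for the task '{task}' and list concrete flaws:\n{code}"
REFINE_PROMPT = ("Task: {task}\nPrevious code:\n{code}\n"
                 "Reviewer feedback:\n{feedback}\nRewrite the code.")

def run_reflection(task: str,
                   llm: Callable[[str], str],
                   max_iterations: int = 2) -> List[Dict[str, str]]:
    """Return the memory trace: one 'execution' record, then reflection/execution pairs."""
    memory: List[Dict[str, str]] = []
    code = llm(INITIAL_PROMPT.format(task=task))            # initial attempt
    memory.append({"type": "execution", "content": code})
    for _ in range(max_iterations):
        feedback = llm(REFLECT_PROMPT.format(task=task, code=code))
        memory.append({"type": "reflection", "content": feedback})
        code = llm(REFINE_PROMPT.format(task=task, code=code, feedback=feedback))
        memory.append({"type": "execution", "content": code})
    return memory

def fake_llm(prompt: str) -> str:
    """Offline stand-in for a model call, keyed on which template was used."""
    if prompt.startswith("Review"):
        return "Feedback: consider a faster approach."
    if prompt.startswith("Task:"):
        return "# refined code"
    return "# first draft"

memory = run_reflection("find primes up to n", fake_llm)
print([m["type"] for m in memory])
# ['execution', 'reflection', 'execution', 'reflection', 'execution']
```

With `max_iterations=2` the loop makes five model calls in total, which matches the latency cost listed for Reflection in the comparison table.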