claude-agent-sdk:Stream responses

Stream responses in real-time

Get real-time responses from the Agent SDK as text and tool calls stream in

By default, the Agent SDK yields complete AssistantMessage objects after Claude finishes generating each response. To receive incremental updates as text and tool calls are generated, enable partial message streaming by setting include_partial_messages (Python) or includePartialMessages (TypeScript) to true in your options.

This page covers output streaming (receiving tokens in real-time). For input modes (how you send messages), see Send messages to agents. You can also stream responses using the Agent SDK via the CLI.

Enable streaming output

To enable streaming, set include_partial_messages (Python) or includePartialMessages (TypeScript) to true in your options. This causes the SDK to yield StreamEvent messages containing raw API events as they arrive, in addition to the usual AssistantMessage and ResultMessage.

Your code then needs to:

  • Check each message's type to distinguish StreamEvent from other message types
  • For StreamEvent, extract the event field and check its type
  • Look for content_block_delta events where delta.type is text_delta, which contain the actual text chunks

The example below enables streaming and prints text chunks as they arrive. Notice the nested type checks: first for StreamEvent, then for content_block_delta, then for text_delta:

import { query } from "@anthropic-ai/claude-agent-sdk";

for await (const message of query({
  prompt: "List the files in my project",
  options: {
    includePartialMessages: true,
    allowedTools: ["Bash", "Read"]
  }
})) {
  if (message.type === "stream_event") {
    const event = message.event;
    if (event.type === "content_block_delta") {
      if (event.delta.type === "text_delta") {
        process.stdout.write(event.delta.text);
      }
    }
  }
}

StreamEvent reference

When partial messages are enabled, you receive raw Claude API streaming events wrapped in an object. The type has different names in each SDK:

  • Python: StreamEvent (import from claude_agent_sdk.types)
  • TypeScript: SDKPartialAssistantMessage with type: 'stream_event'

Both contain raw Claude API events, not accumulated text. You need to extract and accumulate text deltas yourself. Here's the structure of the TypeScript type:

type SDKPartialAssistantMessage = {
  type: "stream_event";
  event: RawMessageStreamEvent; // From Anthropic SDK
  parent_tool_use_id: string | null;
  uuid: UUID;
  session_id: string;
};
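
Because each event carries only a delta, assembling the full text is left to the caller. The following is a minimal sketch of that accumulation, not the SDK's own implementation. The types TextDeltaEvent, OtherEvent, and PartialMessage are hypothetical, simplified stand-ins for RawMessageStreamEvent and SDKPartialAssistantMessage, declared locally so the snippet runs without the SDK installed.

```typescript
// Simplified stand-ins for the SDK types (assumptions for illustration only).
type TextDeltaEvent = {
  type: "content_block_delta";
  index: number;
  delta: { type: "text_delta"; text: string };
};
type OtherEvent = { type: "message_start" | "content_block_stop" | "message_stop" };
type PartialMessage = { type: "stream_event"; event: TextDeltaEvent | OtherEvent };

// Append the text of every text_delta event and ignore all other events.
function accumulateText(messages: PartialMessage[]): string {
  let text = "";
  for (const message of messages) {
    const event = message.event;
    if (event.type === "content_block_delta" && event.delta.type === "text_delta") {
      text += event.delta.text;
    }
  }
  return text;
}

// Hand-written sample events standing in for a real stream.
const sample: PartialMessage[] = [
  { type: "stream_event", event: { type: "message_start" } },
  {
    type: "stream_event",
    event: { type: "content_block_delta", index: 0, delta: { type: "text_delta", text: "Hello, " } },
  },
  {
    type: "stream_event",
    event: { type: "content_block_delta", index: 0, delta: { type: "text_delta", text: "world" } },
  },
  { type: "stream_event", event: { type: "message_stop" } },
];
const fullText = accumulateText(sample);
```

With a live query, the same check would sit inside the for await loop, appending to a buffer instead of iterating over an array.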

The event field contains the raw streaming event from the Claude API. Common event types include:

Event Type            Description
message_start         Start of a new message
content_block_start   Start of a new content block (text or tool use)
content_block_delta   Incremental update to content
content_block_stop    End of a content block
message_delta         Message-level updates (stop reason, usage)
message_stop          End of the message
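
One way to handle these event types is an exhaustive switch, so the compiler flags any type left unhandled. This is only a sketch: the EventType union and the describeEvent helper are hypothetical names introduced here, and the union lists just the common types from the table rather than everything the API may send.

```typescript
// The common event type names listed above; the union itself is a local assumption.
type EventType =
  | "message_start"
  | "content_block_start"
  | "content_block_delta"
  | "content_block_stop"
  | "message_delta"
  | "message_stop";

// Map each event type to a short label, e.g. for debug logging.
function describeEvent(type: EventType): string {
  switch (type) {
    case "message_start":
      return "message started";
    case "content_block_start":
      return "content block started";
    case "content_block_delta":
      return "content block updated";
    case "content_block_stop":
      return "content block finished";
    case "message_delta":
      return "message metadata updated";
    case "message_stop":
      return "message finished";
  }
}

const labels = (["message_start", "message_stop"] as EventType[]).map(describeEvent);
```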

Message flow

With partial messages enabled, you receive messages in this order:

StreamEvent (message_start)
StreamEvent (content_block_start) - text block
StreamEvent (content_block_delta) - text chunks...
StreamEvent (content_block_stop)
StreamEvent (content_block_start) - tool_use block
StreamEvent (content_block_delta) - tool input chunks...
StreamEvent (content_block_stop)
StreamEvent (message_delta)
StreamEvent (message_stop)
AssistantMessage - complete message with all content
... tool executes ...
... more streaming events for next turn ...
ResultMessage - final result
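
The ordering above means each complete AssistantMessage closes the run of stream events that preceded it. As a rough illustration, the sketch below counts stream events per turn using that boundary. FlowItem is a hypothetical, flattened stand-in for the SDK message types (the real event type is nested under event.type), and eventsPerTurn is a helper invented for this example.

```typescript
// Flattened stand-in for SDK messages (assumption for illustration only).
type FlowItem =
  | { type: "stream_event"; eventType: string }
  | { type: "assistant" }
  | { type: "result" };

// Count the stream events seen before each complete assistant message.
function eventsPerTurn(flow: FlowItem[]): number[] {
  const counts: number[] = [];
  let current = 0;
  for (const item of flow) {
    if (item.type === "stream_event") {
      current += 1;
    } else if (item.type === "assistant") {
      counts.push(current);
      current = 0;
    }
  }
  return counts;
}

const toEvents = (names: string[]): FlowItem[] =>
  names.map((eventType): FlowItem => ({ type: "stream_event", eventType }));

// First turn: a text block followed by a tool_use block, as in the listing above.
const firstTurn = toEvents([
  "message_start",
  "content_block_start", "content_block_delta", "content_block_stop",
  "content_block_start", "content_block_delta", "content_block_stop",
  "message_delta",
  "message_stop",
]);
// Second turn: a single text block.
const secondTurn = toEvents([
  "message_start",
  "content_block_start", "content_block_delta", "content_block_stop",
  "message_delta",
  "message_stop",
]);

const flow: FlowItem[] = [
  ...firstTurn, { type: "assistant" },
  ...secondTurn, { type: "assistant" },
  { type: "result" },
];
const perTurn = eventsPerTurn(flow);
```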

Without partial messages enabled (include_partial_messages in Python, includePartialMessages in TypeScript), you receive all message types except StreamEvent. Common types include SystemMessage (session initialization), AssistantMessage (complete responses), ResultMessage (final result), and a compact boundary message indicating when conversation history was compacted (SDKCompactBoundaryMessage in TypeScript; SystemMessage with subtype "compact_boundary" in Python).

Stream text responses

To display text as it's generated, look for content_block_delta events where delta.type is text_delta. These contain the incremental text chunks. The example below prints each chunk as it arrives:

import { query } from "@anthropic-ai/claude-agent-sdk";

for await (const message of query({
  prompt: "Explain how databases work",
  options: { includePartialMessages: true }
})) {
  if (message.type === "stream_event") {
    const event = message.event;
    if (event.type === "content_block_delta" && event.delta.type === "text_delta") {
      process.stdout.write(event.delta.text);
    }
  }
}

console.log(); // Final newline

Stream tool calls

Tool calls also stream incrementally. You can track when tools start, receive their input as it's generated, and see when they complete. The example below tracks the current tool being called and accumulates the JSON input as it streams in. It uses three event types:

  • content_block_start: tool begins
  • content_block_delta with input_json_delta: input chunks arrive
  • content_block_stop: tool call complete

import { query } from "@anthropic-ai/claude-agent-sdk";

// Track the current tool and accumulate its input JSON
let currentTool: string | null = null;
let toolInput = "";

for await (const message of query({
  prompt: "Read the README.md file",
  options: {
    includePartialMessages: true,
    allowedTools: ["Read", "Bash"]
  }
})) {
  if (message.type === "stream_event") {
    const event = message.event;

    if (event.type === "content_block_start") {
      // New tool call is starting
      if (event.content_block.type === "tool_use") {
        currentTool = event.content_block.name;
        toolInput = "";
        console.log(`Starting tool: ${currentTool}`);
      }
    } else if (event.type === "content_block_delta") {
      if (event.delta.type === "input_json_delta") {
        // Accumulate JSON input as it streams in
        const chunk = event.delta.partial_json;
        toolInput += chunk;
        console.log(`  Input chunk: ${chunk}`);
      }
    } else if (event.type === "content_block_stop") {
      // Tool call complete - show final input
      if (currentTool) {
        console.log(`Tool ${currentTool} called with: ${toolInput}`);
        currentTool = null;
      }
    }
  }
}
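
The accumulated input is only valid JSON once the whole block has arrived, so parsing is best deferred until content_block_stop. The sketch below shows one way to do that. parseToolInput is a hypothetical helper, and treating an empty string as an empty object is an assumption about tools called with no arguments, not documented SDK behavior.

```typescript
// Join the partial_json chunks and parse them once the block is complete.
// Returns undefined if the joined text is not valid JSON (for example, mid-stream).
function parseToolInput(chunks: string[]): unknown {
  const raw = chunks.join("");
  if (raw === "") {
    // Assumption: a tool call with no arguments streams no JSON at all.
    return {};
  }
  try {
    return JSON.parse(raw);
  } catch {
    return undefined;
  }
}

// Hand-written chunks standing in for streamed input_json_delta events.
const complete = parseToolInput(['{"file_', 'path": "README', '.md"}']);
const incomplete = parseToolInput(['{"file_', 'path": "README']);
```

Calling this on every delta would return undefined until the final chunk arrives, which is why the example above only reports the input at content_block_stop.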

Build a streaming UI

This example combines text and tool streaming into a cohesive UI. It tracks whether the agent is currently in a tool call (using the inTool flag) to show status indicators like [Using Read...] while tool input streams. Text streams normally when not in a tool call, and the end of the tool's content block triggers a "done" message. This pattern is useful for chat interfaces that need to show progress during multi-step agent tasks.

import { query } from "@anthropic-ai/claude-agent-sdk";

// Track whether we're currently in a tool call
let inTool = false;

for await (const message of query({
  prompt: "Find all TODO comments in the codebase",
  options: {
    includePartialMessages: true,
    allowedTools: ["Read", "Bash", "Grep"]
  }
})) {
  if (message.type === "stream_event") {
    const event = message.event;

    if (event.type === "content_block_start") {
      if (event.content_block.type === "tool_use") {
        // Tool call is starting - show status indicator
        process.stdout.write(`\n[Using ${event.content_block.name}...]`);
        inTool = true;
      }
    } else if (event.type === "content_block_delta") {
      // Only stream text when not executing a tool
      if (event.delta.type === "text_delta" && !inTool) {
        process.stdout.write(event.delta.text);
      }
    } else if (event.type === "content_block_stop") {
      if (inTool) {
        // Tool call finished
        console.log(" done");
        inTool = false;
      }
    }
  } else if (message.type === "result") {
    // Agent finished all work
    console.log("\n\n--- Complete ---");
  }
}
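
To test this rendering logic without calling the API, the same state machine can be written as a pure function over a list of events. This is a sketch under stated assumptions: UiEvent is a hypothetical, trimmed-down version of the raw event types, and render returns a string instead of writing to stdout.

```typescript
// Trimmed-down event shapes (assumptions for illustration only).
type UiEvent =
  | { type: "content_block_start"; content_block: { type: "text" } | { type: "tool_use"; name: string } }
  | {
      type: "content_block_delta";
      delta: { type: "text_delta"; text: string } | { type: "input_json_delta"; partial_json: string };
    }
  | { type: "content_block_stop" };

// Same logic as the loop above, but collecting output in a string.
function render(events: UiEvent[]): string {
  let out = "";
  let inTool = false;
  for (const event of events) {
    if (event.type === "content_block_start") {
      if (event.content_block.type === "tool_use") {
        out += `\n[Using ${event.content_block.name}...]`;
        inTool = true;
      }
    } else if (event.type === "content_block_delta") {
      // Tool input chunks are skipped; only text outside a tool call is shown.
      if (event.delta.type === "text_delta" && !inTool) {
        out += event.delta.text;
      }
    } else if (inTool) {
      // content_block_stop while in a tool call
      out += " done\n";
      inTool = false;
    }
  }
  return out;
}

const rendered = render([
  { type: "content_block_start", content_block: { type: "text" } },
  { type: "content_block_delta", delta: { type: "text_delta", text: "Searching" } },
  { type: "content_block_stop" },
  { type: "content_block_start", content_block: { type: "tool_use", name: "Grep" } },
  { type: "content_block_delta", delta: { type: "input_json_delta", partial_json: "{}" } },
  { type: "content_block_stop" },
]);
```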

Known limitations

Some SDK features are incompatible with streaming:

  • Extended thinking: when you explicitly set max_thinking_tokens (Python) or maxThinkingTokens (TypeScript), StreamEvent messages are not emitted. You'll only receive complete messages after each turn. Note that thinking is disabled by default in the SDK, so streaming works unless you enable it.
  • Structured output: the JSON result appears only in the final ResultMessage.structured_output, not as streaming deltas. See structured outputs for details.
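
Since stream events may be absent (for example, when extended thinking is enabled), a consumer can fall back to the text in the complete assistant message. The sketch below shows one possible approach. The Msg and Block types are hypothetical, simplified stand-ins for the SDK message types, and the nested message.content shape is an assumption modeled on the TypeScript assistant message.

```typescript
// Simplified stand-ins for SDK messages (assumptions for illustration only).
type Block = { type: "text"; text: string } | { type: "tool_use"; name: string };
type Msg =
  | {
      type: "stream_event";
      event: { type: "content_block_delta"; delta: { type: "text_delta"; text: string } };
    }
  | { type: "assistant"; message: { content: Block[] } };

// Prefer streamed deltas; use the complete message only for turns with no deltas.
function collectText(messages: Msg[]): string {
  let out = "";
  let streamedThisTurn = false;
  for (const m of messages) {
    if (m.type === "stream_event") {
      streamedThisTurn = true;
      out += m.event.delta.text;
    } else {
      if (!streamedThisTurn) {
        for (const block of m.message.content) {
          if (block.type === "text") out += block.text;
        }
      }
      // The complete message ends the turn, so reset for the next one.
      streamedThisTurn = false;
    }
  }
  return out;
}

const assistant: Msg = { type: "assistant", message: { content: [{ type: "text", text: "Hi" }] } };
const withStreaming = collectText([
  { type: "stream_event", event: { type: "content_block_delta", delta: { type: "text_delta", text: "Hi" } } },
  assistant,
]);
const withoutStreaming = collectText([assistant]);
```

Both paths produce the same text, so the calling code does not need to know whether streaming was active for a given turn.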