Token Budget: I Gave My AI Agent a "Smart Fuel Gauge"

tags: [AI, Agent, TypeScript, Design Patterns, Architecture]

Background: Agents Are Flying Blind

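In 2026 the AI world has a strange blind spot: everyone puts agents to work, but nobody teaches them how to save money.

Your agent calls GPT-5.5 at ¥30 per million tokens and has no idea what that costs. It runs deep reasoning all night and sees nothing wrong with that. You only find out when the bill arrives, and by then all you have is hindsight.

Existing "cost control" options all report after the fact:

OpenAI usage dashboard: you find out only after the money is spent
LangSmith tracing: a debugging tool, not a control tool
API key limits: a hard cap that ignores task priority

The root problem: agents have no ability to manage a budget on their own.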

Token Budget: A Design Pattern

There is one core rule: before every LLM call, ask the budget first.

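// Interface sketch only; TaskSpec and TaskType are defined in the full implementation below.
declare class TokenBudget {
  // Decide whether a task may run; returns a suggestion and, optionally, a forced model.
  canSpend(task: TaskSpec): { allow: boolean; suggestion: string; model?: string }
  // Pick a model name for a task type based on current usage.
  selectModel(taskType: TaskType): string
  // Record the tokens a finished task consumed.
  recordSpend(task: string, tokens: number, model: string): void
}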

Model Tiers

Tier	Model	Cost (¥ per million tokens)	Use
PREMIUM	DeepSeek V4 Reasoning	8	Deep reasoning, strategic analysis
STANDARD	DeepSeek V4 Fast	2	Analysis, writing, summarization
CHEAP	DeepSeek V3 Lite (free)	0	Everyday chat, formatting
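To make the tier prices concrete, here is a minimal sketch of the arithmetic: cost equals tokens divided by one million, times the price per million. The prices come from the table above; the `estimateCostYuan` helper and the `PRICE_PER_MILLION_YUAN` constant are illustrative names, not part of the published package.

```typescript
type ModelTier = 'PREMIUM' | 'STANDARD' | 'CHEAP'

// Prices in yuan per million tokens, copied from the tier table.
const PRICE_PER_MILLION_YUAN: Record<ModelTier, number> = {
  PREMIUM: 8,
  STANDARD: 2,
  CHEAP: 0,
}

// Hypothetical helper: cost in yuan of `tokens` tokens at a given tier.
function estimateCostYuan(tokens: number, tier: ModelTier): number {
  return (tokens / 1_000_000) * PRICE_PER_MILLION_YUAN[tier]
}

// A 50,000-token task at each tier:
console.log(estimateCostYuan(50_000, 'PREMIUM').toFixed(2))  // "0.40"
console.log(estimateCostYuan(50_000, 'STANDARD').toFixed(2)) // "0.10"
console.log(estimateCostYuan(50_000, 'CHEAP').toFixed(2))    // "0.00"
```

The same task is four times cheaper on STANDARD than on PREMIUM, which is the gap the downgrade rules exploit.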

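Budget Policy

Daily budget exhausted → automatic downgrade
Budget tight → low-priority tasks are blocked
Single task over the limit → suggest splitting it
Not enough budget for deep reasoning → prompt "run it tomorrow"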
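The "budget tight → block low-priority tasks" rule can be sketched as below. The class listed later in this article accepts a `priority` field but its `canSpend` does not read it, so this is one possible reading rather than the author's implementation. It assumes "tight" means that less than 10% of the daily budget remains; `gateByPriority`, `GateInput`, and `TIGHT_THRESHOLD` are hypothetical names.

```typescript
type TaskPriority = 'high' | 'medium' | 'low'

interface GateInput {
  usedDaily: number        // tokens already spent today
  dailyLimit: number       // daily token budget
  estimatedTokens: number  // estimate for the task being checked
  priority: TaskPriority
}

// Assumption: the budget counts as "tight" once the remaining share of the
// daily limit drops below this fraction.
const TIGHT_THRESHOLD = 0.1

// Hypothetical rule: a task must fit in the remaining budget, and when the
// budget is tight, low-priority tasks are blocked as well.
function gateByPriority(t: GateInput): { allow: boolean; reason: string } {
  const remaining = t.dailyLimit - t.usedDaily
  if (t.estimatedTokens > remaining) {
    return { allow: false, reason: 'not enough budget left today' }
  }
  const tight = remaining / t.dailyLimit < TIGHT_THRESHOLD
  if (tight && t.priority === 'low') {
    return { allow: false, reason: 'budget is tight; low-priority task deferred' }
  }
  return { allow: true, reason: 'ok' }
}

// 95% of the budget is used, so only 5% remains and the budget is tight.
const base = { usedDaily: 95_000, dailyLimit: 100_000, estimatedTokens: 2_000 }
console.log(gateByPriority({ ...base, priority: 'low' }).allow)  // false
console.log(gateByPriority({ ...base, priority: 'high' }).allow) // true
```

Keeping the rule as a pure function of usage and task makes it easy to test each scenario without a live model.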

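Six Verified Scenarios

Scenario A: ample budget + deep reasoning     → ✅ allowed
Scenario B: simple formatting                 → ✅ allowed, uses the free model
Scenario C: heavy usage, up to 95%            → ✅ automatic downgrade
Scenario D: tight budget + medium-sized task  → 🚫 blocked
Scenario E: tight budget + free-tier task     → ✅ switches to the free model automatically
Scenario F: single task over the limit        → 🚫 blocked (split suggested)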

Full Code Implementation

The core class is 200 lines, MIT-licensed, and open source on GitHub:

export type TaskType = 'deep_reasoning' | 'analysis' | 'drafting' | 'simple_chat' | 'formatting'
export type TaskPriority = 'high' | 'medium' | 'low'
export type ModelTier = 'PREMIUM' | 'STANDARD' | 'CHEAP'

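export interface TaskSpec {
  type: TaskType
  estimatedTokens: number
  priority?: TaskPriority  // accepted, but not read by the methods shown below
}

export interface BudgetConfig {
  dailyLimit?: number      // tokens per day
  monthlyLimit?: number    // tokens per month
  perTaskLimit?: number    // maximum estimated tokens for a single task
  autoDowngrade?: boolean  // allow falling back to the free model when the daily budget is spent
  alertThreshold?: number  // stored, but not read by the methods shown below
  preferredTier?: ModelTier
}

// One entry in the spend log, as written by recordSpend.
export interface SpendRecord {
  time: number
  task: string
  tokens: number
  model: string
}

// Snapshot returned by getStats; `records` holds the last ten log entries with formatted timestamps.
export interface BudgetStats {
  usedDaily: number
  dailyLimit: number
  usedMonthly: number
  monthlyLimit: number
  dailyPercent: string
  todayCost: string
  monthlyCost: string
  autoDowngrade: boolean
  records: Array<Omit<SpendRecord, 'time'> & { time: string }>
}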

export class TokenBudget {
  private config: Required<BudgetConfig>
  private usedDaily = 0
  private usedMonthly = 0
  private log: SpendRecord[] = []

  private readonly MODEL_TIER_COST: Record<ModelTier, { model: string; costPerM: number }> = {
    PREMIUM: { model: 'DeepSeek V4 Reasoning', costPerM: 8 },
    STANDARD: { model: 'DeepSeek V4 Fast', costPerM: 2 },
    CHEAP: { model: 'DeepSeek V3 Lite (free)', costPerM: 0 },
  }

  constructor(config?: BudgetConfig) {
    this.config = {
      dailyLimit: config?.dailyLimit ?? 100000,
      monthlyLimit: config?.monthlyLimit ?? 3_000_000,
      perTaskLimit: config?.perTaskLimit ?? 50000,
      autoDowngrade: config?.autoDowngrade ?? true,
      alertThreshold: config?.alertThreshold ?? 0.1,
      preferredTier: config?.preferredTier ?? 'STANDARD',
    }
  }

  canSpend(task: TaskSpec): { allow: boolean; suggestion: string; model?: string } {
    // Reject oversized tasks outright; the caller should split them.
    if (task.estimatedTokens > this.config.perTaskLimit) {
      return { allow: false, suggestion: `Task exceeds the per-task limit (${task.estimatedTokens} > ${this.config.perTaskLimit}); consider splitting it` }
    }
    const remaining = this.config.dailyLimit - this.usedDaily

    // Daily budget exhausted: only cheap tasks may continue, on the free model.
    if (remaining <= 0) {
      if (this.config.autoDowngrade && this.isCheapTask(task.type)) {
        return { allow: true, suggestion: 'Daily budget used up; downgraded to the free model', model: 'DeepSeek V3 Lite (free)' }
      }
      return { allow: false, suggestion: "Today's AI quota is used up; top up or adjust the budget limit" }
    }

    // Not enough left for a paid task. Cheap tasks fall through and are allowed.
    if (task.estimatedTokens > remaining && !this.isCheapTask(task.type)) {
      return { allow: false, suggestion: 'Budget is tight (insufficient balance); top up or wait until tomorrow' }
    }

    return { allow: true, suggestion: 'Budget check passed' }
  }

  selectModel(taskType: TaskType): string {
    const pct = this.usedDaily / this.config.dailyLimit
    // Cheap tasks always run on the free model. Checking them first keeps the
    // 80% rule below from routing them to a paid model.
    if (this.isCheapTask(taskType)) return 'DeepSeek V3 Lite (free)'
    // Deep reasoning keeps the premium model at any usage level.
    if (taskType === 'deep_reasoning') return 'DeepSeek V4 Reasoning'
    // Past 80% of the daily budget, the remaining task types drop to the standard model.
    if (pct > 0.8) return 'DeepSeek V4 Fast'
    // Otherwise use the model of the configured preferred tier.
    return this.MODEL_TIER_COST[this.config.preferredTier].model
  }

  recordSpend(task: string, tokens: number, model: string): void {
    this.usedDaily += tokens
    this.usedMonthly += tokens
    this.log.push({ time: Date.now(), task, tokens, model })
  }

  getStats(): BudgetStats {
    return {
      usedDaily: this.usedDaily, dailyLimit: this.config.dailyLimit,
      usedMonthly: this.usedMonthly, monthlyLimit: this.config.monthlyLimit,
      dailyPercent: `${((this.usedDaily / this.config.dailyLimit) * 100).toFixed(1)}%`,
      // Upper-bound estimates in yuan: every token is priced at the PREMIUM rate (¥8 per million).
      todayCost: (this.usedDaily * 0.000008).toFixed(4),
      monthlyCost: (this.usedMonthly * 0.000008).toFixed(4),
      autoDowngrade: this.config.autoDowngrade,
      records: this.log.slice(-10).map(r => ({ ...r, time: new Date(r.time).toLocaleString() })),
    }
  }

  resetDaily(): void { this.usedDaily = 0 }
  private isCheapTask(t: TaskType): boolean { return ['simple_chat', 'formatting'].includes(t) }
}
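`getStats` prices every token at a flat 0.000008 yuan, which is the PREMIUM rate, so the reported cost overstates any day that mixes tiers. A minimal sketch of pricing each log entry by its own model follows. The `SpendRecord` shape is inferred from what `recordSpend` pushes into the log; `costFromLog`, `COST_PER_MILLION`, and `FALLBACK_RATE` are hypothetical names, and pricing unknown models at the PREMIUM rate is an assumption chosen so the estimate never undercounts.

```typescript
// Shape assumed from the object recordSpend pushes into the log.
interface SpendRecord {
  time: number
  task: string
  tokens: number
  model: string
}

// Yuan per million tokens, keyed by the model names used in the class above.
const COST_PER_MILLION = new Map<string, number>([
  ['DeepSeek V4 Reasoning', 8],
  ['DeepSeek V4 Fast', 2],
  ['DeepSeek V3 Lite (free)', 0],
])

// Assumption: models missing from the map are priced at the PREMIUM rate.
const FALLBACK_RATE = 8

// Hypothetical helper: total cost in yuan of a list of spend records.
function costFromLog(log: SpendRecord[]): number {
  return log.reduce((sum, r) => {
    const rate = COST_PER_MILLION.get(r.model) ?? FALLBACK_RATE
    return sum + (r.tokens / 1_000_000) * rate
  }, 0)
}

const sampleLog: SpendRecord[] = [
  { time: Date.now(), task: 'strategy review', tokens: 20_000, model: 'DeepSeek V4 Reasoning' },
  { time: Date.now(), task: 'summary', tokens: 50_000, model: 'DeepSeek V4 Fast' },
  { time: Date.now(), task: 'formatting', tokens: 30_000, model: 'DeepSeek V3 Lite (free)' },
]

console.log(costFromLog(sampleLog).toFixed(2))  // "0.26"
console.log((100_000 * 0.000008).toFixed(2))    // "0.80" with the flat rate
```

For the same 100,000 tokens, the flat rate reports ¥0.80 while per-model pricing reports ¥0.26.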

How to Use It

npm install @wenrl2006/token-budget --registry=https://npm.pkg.github.com

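const budget = new TokenBudget({ dailyLimit: 100000, autoDowngrade: true })

// Before every LLM call (callLLM and input stand in for your own client and prompt):
const task = { type: 'deep_reasoning', estimatedTokens: 8000, priority: 'high' } as const
const check = budget.canSpend(task)
if (check.allow) {
  // Use the model the budget check forced, if any; otherwise pick by task type.
  const model = check.model ?? budget.selectModel(task.type)  // enough budget → full-power model
  const result = await callLLM(input, model)
  budget.recordSpend('analyze user input', task.estimatedTokens, model)
}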
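The check, select, call, record sequence above can be folded into one helper so that no call site forgets the budget check. This is a sketch under assumptions, not part of the published package: `guardedCall`, `BudgetLike`, and `LlmCall` are hypothetical names, and the LLM client is assumed to report how many tokens it actually used, so real usage is recorded instead of the estimate.

```typescript
type TaskType = 'deep_reasoning' | 'analysis' | 'drafting' | 'simple_chat' | 'formatting'

interface TaskSpec { type: TaskType; estimatedTokens: number }

// Structural view of the three TokenBudget methods the wrapper relies on.
interface BudgetLike {
  canSpend(task: TaskSpec): { allow: boolean; suggestion: string; model?: string }
  selectModel(taskType: TaskType): string
  recordSpend(task: string, tokens: number, model: string): void
}

// Hypothetical client signature: returns the text plus the tokens actually used.
type LlmCall = (input: string, model: string) => Promise<{ text: string; tokensUsed: number }>

// Hypothetical wrapper: ask the budget, call the model, record real usage.
async function guardedCall(
  budget: BudgetLike, llm: LlmCall, label: string, task: TaskSpec, input: string,
): Promise<{ ok: true; text: string } | { ok: false; reason: string }> {
  const verdict = budget.canSpend(task)
  if (!verdict.allow) return { ok: false, reason: verdict.suggestion }
  // Prefer the model the budget check forced (for example, a downgrade).
  const model = verdict.model ?? budget.selectModel(task.type)
  const { text, tokensUsed } = await llm(input, model)
  budget.recordSpend(label, tokensUsed, model)
  return { ok: true, text }
}

// In-memory stubs so the sketch runs without a real budget or model.
const spent: string[] = []
const demoBudget: BudgetLike = {
  canSpend: (t) => t.estimatedTokens > 50_000
    ? { allow: false, suggestion: 'task too large' }
    : { allow: true, suggestion: 'ok' },
  selectModel: () => 'stub-model',
  recordSpend: (label, tokens, model) => { spent.push(`${label}:${tokens}:${model}`) },
}
const demoLlm: LlmCall = async (input, model) => ({ text: `[${model}] ${input}`, tokensUsed: 1200 })

guardedCall(demoBudget, demoLlm, 'summarize', { type: 'analysis', estimatedTokens: 2000 }, 'hello')
  .then((r) => console.log(r, spent))
```

Returning a tagged result instead of throwing lets the agent treat a denied call as an ordinary outcome and surface the budget's suggestion to the user.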

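This Is Not a Package, It Is a Design Pattern

Token Budget is not a product. It is a design pattern, and like MVC it should run through the veins of every agent.

What today's agent frameworks lack is not API key management, nor usage tracking. They lack a decision layer that stops you before the LLM call happens. Token Budget fills exactly that gap.

From now on, the first step in writing an agent framework is not configuring an API key; it is new TokenBudget(…).

GitHub: github.com/wenrl2006/t…
Package: @wenrl2006/token-budget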