大宗供应链企业销量预测与智能补货系统 (一)定价模型以及基本补货方案

102 阅读25分钟

继舆情系统的设计之后,我们需要基于其舆情数据进行销量预测和智能补货的数据建模。于是有此系列

舆情系列:


目录

  1. 系统概述
  2. 需求分析与商业价值
  3. 销量预测模型
  4. 智能补货决策引擎
  5. 动态定价与收益优化
  6. 技术架构
  7. 数据模型与表设计
  8. 实现路线图

系统概述

1.1 核心问题

大宗商品供应链企业面临的库存管理困境:

问题维度表现形式影响预期改善
需求波动订单量剧烈变化 (+50% ~ -30%)库存积压或缺货预测精度>85%
舆情驱动地缘风险、供给中断导致采购激增被动应对,库存失控提前7天预警
价格波动商品期货价格日波幅10%+定价滞后,利润被侵蚀收益提升12-18%
供应链断裂港口罢工、制裁导致补货延迟应急成本高、计划无效多源补货,成本-8%

1.2 系统架构概览

舆情指标输入 ──┐
历史销量数据 ──┤
商品期货价格 ──┼──→ [销量预测模型] ──→ [预测订单]
库存水位 ─────┤      (XGBoost)
天气/季节性 ──┘

预测订单 ─┐
库存成本 ─┼──→ [智能补货决策] ──→ [采购方案]
供应约束 ─┤    (动态规划)        [供应商选择]
          └──→ [动态定价引擎] ──→ [价格策略]
                (强化学习)         [收益优化]

输出: Doris决策支持 → BI仪表板 → 交易/物流系统

需求分析与商业价值

2.1 与舆情系统的数据融合

从MGIR系统获取的关键指标用于预测补正:

# 舆情风险指标对库存决策的影响

舆情输入:
├─ LDI (物流延迟指数) 0-10
│  └─ 当LDI>7: 建议增加安全库存 +30%
│  └─ 供应商交期 +5-10天
│
├─ CRI (合规风险指数) 0-10
│  └─ 当CRI>6: 激活多源补货策略
│  └─ 关税成本 +2-8%
│
├─ GRI (地缘政治指数) 0-10
│  └─ 当GRI>7且持续: 预判采购商囤积行为
│  └─ 需求激增预测 +15-40%
│
└─ FRI (金融风险指数) 0-10
   └─ 当FRI>7: 供应商信用评估下调
   └─ 结算周期风险 +20%

预测修正公式:
预测需求_修正 = 基础预测 × (1 + ∑威胁系数) × (1 - 机会抵消)

威胁系数计算:
  = (LDI/10) × 0.15 + (CRI/10) × 0.12 + (GRI/10) × 0.20 + (FRI/10) × 0.08
  范围: 0-0.55 (55%的需求增长可能)

2.2 商业价值量化

投资成本:
├─ 基础设施 (GPU服务器, 存储): ¥150万
├─ 人力 (数据科学家, 工程师): ¥100万/年
└─ 维护运营: ¥50万/年
总投资: ¥300万 (第一年)

预期收益 (年度):
├─ 库存优化
│  ├─ 减少积压成本: ¥500万 (库存周转期↓20%)
│  ├─ 降低缺货损失: ¥300万 (缺货率↓25%)
│  └─ 资金释放: ¥1000万 (库存资金↓15%)
│
├─ 定价优化
│  ├─ 动态定价收益: ¥800万 (毛利↑8%)
│  └─ 需求侧管理: ¥400万 (高价期销量↑)
│
└─ 供应链效率
   ├─ 采购成本节省: ¥300万 (供应商竞争)
   └─ 仓储运输优化: ¥200万
   
总收益: ¥3500万/年
ROI: 11.7倍 (第二年ROI 12倍+)

销量预测模型

3.1 预测架构

输入特征矩阵:
├─ 时间特征 (8维)
│  ├─ 日期、星期几、月份、季度
│  ├─ 是否假期、假期前后天数
│  ├─ 年度日期序号、周数
│  
├─ 历史销量特征 (12维)
│  ├─ 过去7/14/30天平均销量
│  ├─ 同比增长率 (YoY)
│  ├─ 环比增长率 (MoM)
│  ├─ 销量方差、峰值、谷值
│  ├─ 销量趋势 (线性回归斜率)
│  
├─ 库存特征 (6维)
│  ├─ 当前库存量、库存率 (库存/日均需求)
│  ├─ 库存龄 (最旧商品存放天数)
│  ├─ 库存周转率、超期库存率
│  
├─ 价格特征 (5维)
│  ├─ 商品期货价格、价格涨跌幅
│  ├─ 价格同比、近30天波幅
│  ├─ 本公司当前报价与市场价差
│  
├─ 舆情特征 (8维) ⭐ 关键
│  ├─ LDI (物流延迟指数)
│  ├─ CRI (合规风险指数)
│  ├─ GRI (地缘政治指数)
│  ├─ FRI (金融风险指数)
│  ├─ 近7天新增高风险事件数
│  ├─ 风险趋势 (风险指数变化率)
│  ├─ 需求侧情感指数 (采购商社交媒体/邮件情感)
│  ├─ 舆情突发指标 (新闻事件数/异常度)
│  
├─ 市场竞争特征 (4维)
│  ├─ 竞争对手价格、库存水位
│  ├─ 市场集中度、供应充分率
│  
└─ 采购商维度特征 (6维)
   ├─ 采购商类型、规模、地理位置
   ├─ 采购商历史订单行为
   ├─ 采购商风险承受度、信用等级

3.2 XGBoost模型设计

import xgboost as xgb
import pandas as pd
import numpy as np
from sklearn.preprocessing import StandardScaler
from sklearn.model_selection import TimeSeriesSplit
import warnings
warnings.filterwarnings('ignore')

class DemandForecastingModel:
    """
    基于XGBoost的大宗商品销量预测模型
    融合舆情数据的需求预测能力
    """
    
    def __init__(self, commodity_type: str, region: str):
        self.commodity_type = commodity_type  # 钢材、石油、谷物等
        self.region = region  # 采购商地区
        self.model = None
        self.scaler = StandardScaler()
        self.feature_importance = None
        
    def prepare_features(self, data_df: pd.DataFrame) -> pd.DataFrame:
        """
        特征工程: 从原始数据生成模型输入特征
        
        输入:
          data_df: 包含销量、价格、舆情等时间序列数据
          columns: [date, sales, inventory, price, ldi, cri, gri, fri, ...]
        
        输出:
          feature_matrix: (样本数, 43维特征)
        """
        
        df = data_df.copy().sort_values('date').reset_index(drop=True)
        
        # ===== 时间特征 =====
        df['date'] = pd.to_datetime(df['date'])
        df['dayofweek'] = df['date'].dt.dayofweek
        df['month'] = df['date'].dt.month
        df['quarter'] = df['date'].dt.quarter
        df['dayofyear'] = df['date'].dt.dayofyear
        df['weekofyear'] = df['date'].dt.isocalendar().week
        
        # 假期标记 (示例: 中国假期)
        holiday_dates = [
            '2025-01-29', '2025-01-30', '2025-01-31', '2025-02-01', '2025-02-02',  # 春节
            '2025-04-04', '2025-04-05', '2025-04-06',  # 清明
            '2025-06-10',  # 端午
            '2025-09-17',  # 中秋
            '2025-10-01', '2025-10-02', '2025-10-03', '2025-10-04', '2025-10-05', '2025-10-06', '2025-10-07'  # 国庆
        ]
        df['is_holiday'] = df['date'].isin(pd.to_datetime(holiday_dates)).astype(int)
        df['days_to_holiday'] = df['date'].apply(
            lambda x: min([abs((x - pd.to_datetime(h)).days) for h in holiday_dates]) if holiday_dates else 365
        )
        
        # ===== 历史销量特征 (滚动统计) =====
        for window in [7, 14, 30]:
            df[f'sales_ma_{window}'] = df['sales'].rolling(window=window, min_periods=1).mean()
            df[f'sales_std_{window}'] = df['sales'].rolling(window=window, min_periods=1).std().fillna(0)
            df[f'sales_pct_change_{window}'] = df['sales'].pct_change(periods=window).fillna(0)
        
        # 同比增长 (与1年前对比)
        if len(df) > 365:
            df['yoy_growth'] = (df['sales'] - df['sales'].shift(365)) / (df['sales'].shift(365) + 1)
            df['yoy_growth'] = df['yoy_growth'].fillna(0)
        else:
            df['yoy_growth'] = 0
        
        # 趋势特征 (近30天销量的线性趋势)
        def calc_trend(x):
            if len(x) < 2:
                return 0
            y = np.array(x)
            x_range = np.arange(len(y))
            z = np.polyfit(x_range, y, 1)
            return z[0]  # 斜率
        
        df['sales_trend_30'] = df['sales'].rolling(window=30, min_periods=2).apply(
            lambda x: calc_trend(x), raw=False
        ).fillna(0)
        
        # ===== 库存特征 =====
        df['inventory_pct'] = df['inventory'] / (df['sales_ma_30'] + 1)  # 库存占需求天数
        df['inventory_age_days'] = df['inventory'].rolling(window=30, min_periods=1).apply(
            lambda x: np.sum(x) / (np.sum(x) + 0.01) if np.sum(x) > 0 else 0
        )
        df['inventory_turnover_30'] = df['sales_ma_30'] / (df['inventory'] + 1)
        
        # ===== 价格特征 =====
        df['price_pct_change'] = df['price'].pct_change().fillna(0)
        df['price_ma_7'] = df['price'].rolling(window=7).mean().fillna(method='bfill')
        df['price_volatility_30'] = df['price'].rolling(window=30).std().fillna(0)
        
        # 本公司报价与市场价差 (假设有benchmark_price列)
        if 'benchmark_price' in df.columns:
            df['price_vs_market'] = (df['price'] - df['benchmark_price']) / (df['benchmark_price'] + 1)
        else:
            df['price_vs_market'] = 0
        
        # ===== 舆情特征 (MGIR指标) ⭐ =====
        for col in ['ldi', 'cri', 'gri', 'fri']:
            if col in df.columns:
                # 原始指标值
                df[f'{col}_normalized'] = df[col] / 10.0  # 归一化到0-1
                # 指标变化率
                df[f'{col}_change'] = df[col].diff().fillna(0)
                # 高风险事件触发 (指标突增)
                df[f'{col}_spike'] = (df[f'{col}_change'] > df[f'{col}_change'].std() * 2).astype(int)
        
        # 舆情综合压力指数 (权重组合)
        if all(col in df.columns for col in ['ldi', 'cri', 'gri', 'fri']):
            df['sentiment_pressure_index'] = (
                0.25 * (df['ldi'] / 10) +  # 物流延迟影响采购延迟
                0.20 * (df['cri'] / 10) +  # 合规风险影响成本
                0.35 * (df['gri'] / 10) +  # 地缘政治驱动囤积行为 (权重最高)
                0.20 * (df['fri'] / 10)    # 金融风险影响订单确定性
            )
        else:
            df['sentiment_pressure_index'] = 0
        
        # 舆情驱动的需求异常指标
        if 'gri' in df.columns:
            df['demand_surge_signal'] = (df['gri'] > 6).astype(int)
        
        # ===== 市场特征 =====
        if 'competitor_price' in df.columns:
            df['price_competitiveness'] = (df['competitor_price'] - df['price']) / (df['competitor_price'] + 1)
        
        # ===== 采购商维度特征 =====
        if 'buyer_segment' in df.columns:  # 采购商等级: A/B/C
            buyer_multipliers = {'A': 1.2, 'B': 1.0, 'C': 0.8}
            df['buyer_segment_weight'] = df['buyer_segment'].map(buyer_multipliers)
        
        # ===== 填补缺失值 =====
        df = df.fillna(method='bfill').fillna(method='ffill').fillna(0)
        
        return df
    
    def train(self, df: pd.DataFrame, target_col: str = 'sales', 
              test_size: int = 90):
        """
        时间序列交叉验证训练
        
        参数:
          df: 特征化后的数据框
          target_col: 目标列名
          test_size: 测试集大小 (天数)
        """
        
        feature_cols = [col for col in df.columns if col not in ['date', 'sales', 'price', 'inventory']]
        X = df[feature_cols].values
        y = df[target_col].values
        
        # 标准化特征
        X_scaled = self.scaler.fit_transform(X)
        
        # 时间序列交叉验证 (避免数据泄漏)
        tscv = TimeSeriesSplit(n_splits=5)
        cv_scores = []
        
        for train_idx, val_idx in tscv.split(X_scaled):
            X_train, X_val = X_scaled[train_idx], X_scaled[val_idx]
            y_train, y_val = y[train_idx], y[val_idx]
            
            # XGBoost配置 (针对回归任务优化)
            params = {
                'objective': 'reg:squarederror',
                'booster': 'gbtree',
                'max_depth': 7,
                'eta': 0.1,
                'subsample': 0.8,
                'colsample_bytree': 0.8,
                'min_child_weight': 5,
                'lambda': 1.0,  # L2正则
                'alpha': 0.1,   # L1正则
                'tree_method': 'gpu_hist',  # GPU加速
                'gpu_id': 0,
                'random_state': 42
            }
            
            # 训练
            dtrain = xgb.DMatrix(X_train, label=y_train)
            dval = xgb.DMatrix(X_val, label=y_val)
            
            model = xgb.train(
                params,
                dtrain,
                num_boost_round=200,
                evals=[(dval, 'validation')],
                early_stopping_rounds=20,
                verbose_eval=False
            )
            
            # 验证集评估
            y_pred = model.predict(dval)
            mae = np.mean(np.abs(y_pred - y_val))
            mape = np.mean(np.abs((y_pred - y_val) / (y_val + 1)))
            cv_scores.append({'mae': mae, 'mape': mape})
        
        # 使用全量数据进行最终训练
        X_train_full = X_scaled[:len(X_scaled) - test_size]
        y_train_full = y[:len(y) - test_size]
        
        dtrain_full = xgb.DMatrix(X_train_full, label=y_train_full)
        
        self.model = xgb.train(
            params,
            dtrain_full,
            num_boost_round=200,
            verbose_eval=False
        )
        
        # 提取特征重要度
        self.feature_importance = self.model.get_score(importance_type='weight')
        
        print(f"Cross-validation Results:")
        print(f"  Avg MAE: {np.mean([s['mae'] for s in cv_scores]):.2f}")
        print(f"  Avg MAPE: {np.mean([s['mape'] for s in cv_scores]):.2%}")
        
        return self
    
    def predict(self, df: pd.DataFrame, forecast_days: int = 7) -> pd.DataFrame:
        """
        预测未来N天的销量
        
        参数:
          df: 包含最新特征的数据框 (最后一行为当前日期)
          forecast_days: 预测天数
        
        返回:
          forecast_df: 包含预测值和置信区间的数据框
        """
        
        predictions = []
        forecast_df = df.copy()
        
        for _ in range(forecast_days):
            # 准备预测特征
            feature_cols = [col for col in df.columns if col not in ['date', 'sales', 'price', 'inventory']]
            X_latest = forecast_df.iloc[-1:][feature_cols].values
            X_scaled = self.scaler.transform(X_latest)
            
            # 预测
            dmatrix = xgb.DMatrix(X_scaled)
            pred = self.model.predict(dmatrix)[0]
            
            # 应用下界 (销量不能为负)
            pred = max(0, pred)
            predictions.append(pred)
            
            # 更新下一天的特征 (用于滚动预测)
            # 这里简化处理: 假设销量按预测值更新
            new_row = forecast_df.iloc[-1].copy()
            new_row['date'] = pd.to_datetime(new_row['date']) + pd.Timedelta(days=1)
            new_row['sales'] = pred
            # 更新滚动特征
            new_row['sales_ma_7'] = forecast_df.tail(7)['sales'].mean()
            forecast_df = pd.concat([forecast_df, pd.DataFrame([new_row])], ignore_index=True)
        
        # 构建预测结果DataFrame
        result = pd.DataFrame({
            'forecast_date': pd.date_range(start=df['date'].iloc[-1] + pd.Timedelta(days=1), periods=forecast_days),
            'predicted_sales': predictions,
            'lower_bound': [max(0, p * 0.85) for p in predictions],  # 下界置信区间
            'upper_bound': [p * 1.15 for p in predictions],  # 上界置信区间
            'confidence': [0.85 + 0.1 * (1 - i / forecast_days) for i in range(forecast_days)]  # 长期预测置信度递减
        })
        
        return result
    
    def get_feature_importance_report(self, top_n: int = 15) -> pd.DataFrame:
        """
        获取特征重要度报告 (用于理解模型决策)
        """
        
        if self.feature_importance is None:
            return pd.DataFrame()
        
        importance_df = pd.DataFrame(
            list(self.feature_importance.items()),
            columns=['feature', 'importance']
        ).sort_values('importance', ascending=False).head(top_n)
        
        # 标记舆情特征
        importance_df['is_sentiment'] = importance_df['feature'].str.contains('ldi|cri|gri|fri|sentiment')
        
        return importance_df


# ===== 使用示例 =====

if __name__ == '__main__':
    # 加载数据 (包含销量、价格、舆情指标等)
    df = pd.read_csv('/data/commodity_sales_with_sentiment.csv')
    
    # 创建钢材华东地区的预测模型
    model = DemandForecastingModel(commodity_type='steel', region='east_china')
    
    # 特征工程
    df_features = model.prepare_features(df)
    
    # 训练模型
    model.train(df_features, target_col='sales', test_size=90)
    
    # 获取特征重要度
    print("\n特征重要度 (Top 15):")
    importance = model.get_feature_importance_report()
    print(importance)
    
    # 预测未来7天
    forecast = model.predict(df_features, forecast_days=7)
    print("\n未来7天销量预测:")
    print(forecast)

3.3 模型精度与舆情敏感性分析

def analyze_sentiment_impact(model_output: pd.DataFrame, 
                             sentiment_df: pd.DataFrame) -> dict:
    """
    分析舆情特征对预测的影响程度
    """
    
    # 基于特征重要度的舆情贡献度
    importance = model.get_feature_importance_report(top_n=50)
    sentiment_features = importance[importance['is_sentiment']]
    
    total_importance = importance['importance'].sum()
    sentiment_contribution = sentiment_features['importance'].sum() / total_importance
    
    return {
        'sentiment_contribution_ratio': sentiment_contribution,  # 舆情特征的总重要度占比
        'key_sentiment_features': sentiment_features.head(5).to_dict('records'),
        'model_interpretation': f"""
        模型中舆情特征 (LDI/CRI/GRI/FRI) 对销量预测的贡献度为 {sentiment_contribution:.1%}。
        其中:
        - GRI (地缘政治指数) 影响最大: 驱动采购商囤积行为
        - LDI (物流延迟指数) 其次: 影响补货紧迫性
        - CRI (合规风险指数) 影响成本结构
        - FRI (金融风险指数) 影响订单确定性
        
        当舆情压力指数 > 0.6 时,模型会自动调高需求预测 (+15% ~ +40%)
        """
    }

智能补货决策引擎

4.1 补货模型架构

库存当前状态 ─┐
补货成本 ─────┼──→ [动态规划优化] ──→ [补货方案]
补货时间 ─────┤     (Integer LP)     ├─ 补货数量
销量预测 ─────┤                      ├─ 补货时机
供应约束 ─────┤                      ├─ 供应商选择
舆情风险 ─────┘

目标函数:
  min(库存成本 + 缺货成本 + 供应商多源溢价)
  
约束条件:
  - 库存 ≥ 安全库存 (risk-adjusted)
  - 补货量 ≥ MOQ (最小订购量)
  - 交期约束 (考虑舆情延迟)
  - 供应商容量约束
  - 现金流约束 (支付能力)

4.2 PuLP实现

from pulp import *
import pandas as pd
import numpy as np
from datetime import datetime, timedelta

class IntelligentReplenishmentEngine:
    """
    智能补货决策引擎
    - 整合舆情指标优化安全库存
    - 多供应商模式应对断裂风险
    - 考虑资金成本与缺货成本的权衡
    """
    
    def __init__(self, commodity_code: str, buyer_geo_key: str):
        self.commodity_code = commodity_code  # 商品代码
        self.buyer_geo_key = buyer_geo_key   # 采购商地理位置
        self.problem = None
        self.decision_vars = {}
        
    def calculate_safety_stock(self, 
                               avg_demand: float,
                               demand_std: float,
                               lead_time_days: int,
                               service_level: float = 0.95,
                               sentiment_indices: dict = None) -> float:
        """
        计算风险调整后的安全库存
        
        基础安全库存公式 (EOQ变体):
          SS = Z_α × σ_demand × sqrt(L)
          
        舆情调整:
          - LDI增高 → lead_time增加 → SS↑
          - GRI增高 → 需求波动增加 → SS↑
          - CRI增高 → 供应不确定↑ → SS↑
        """
        
        # 服务水平对应的标准差倍数
        z_scores = {0.90: 1.28, 0.95: 1.645, 0.99: 2.33}
        z = z_scores.get(service_level, 1.645)
        
        # 基础安全库存
        base_ss = z * demand_std * np.sqrt(lead_time_days)
        
        # 舆情调整因子
        adjustment_factor = 1.0
        if sentiment_indices:
            # LDI增加的配送时间
            if 'ldi' in sentiment_indices:
                ldi_adjusted_lead_time = lead_time_days * (1 + sentiment_indices['ldi'] / 10 * 0.3)
                base_ss = z * demand_std * np.sqrt(ldi_adjusted_lead_time)
            
            # GRI增加的需求不确定性
            if 'gri' in sentiment_indices:
                gri_factor = sentiment_indices['gri'] / 10  # 0-1
                demand_std_adjusted = demand_std * (1 + gri_factor * 0.4)
                base_ss = z * demand_std_adjusted * np.sqrt(lead_time_days)
            
            # CRI增加的供应风险
            if 'cri' in sentiment_indices:
                cri_factor = sentiment_indices['cri'] / 10
                adjustment_factor *= (1 + cri_factor * 0.2)
        
        safety_stock = base_ss * adjustment_factor
        
        return safety_stock
    
    def optimize_replenishment(self,
                               current_inventory: float,
                               forecast_demand: np.ndarray,  # 未来30天的日需求
                               holding_cost_per_unit: float,  # 库存成本 (¥/单位/天)
                               stockout_cost_per_unit: float,  # 缺货成本 (¥/单位)
                               suppliers: list,  # 供应商列表
                               sentiment_indices: dict = None) -> dict:
        """
        使用整数线性规划求解最优补货方案
        
        参数示例:
          current_inventory: 100 (吨)
          forecast_demand: [50, 55, 45, ..., 60]  (30天)
          holding_cost: 50 (¥/吨/天)
          stockout_cost: 2000 (¥/吨)
          suppliers: [
            {
              'name': '供应商A',
              'unit_price': 3000,  # ¥/吨
              'lead_time': 5,  # 天
              'capacity': 500,  # 吨/月
              'moq': 50,  # 最小订购量
              'quality_score': 0.95,
              'reliability': 0.9
            },
            ...
          ]
        """
        
        days = len(forecast_demand)
        n_suppliers = len(suppliers)
        
        # 创建LP问题
        prob = LpProblem("Replenishment_Optimization", LpMinimize)
        
        # 决策变量
        # x[i][j]: 从供应商j在第i天的订购量 (吨)
        x = [[LpVariable(f"order_day{i}_supplier{j}", lowBound=0, cat='Continuous')
              for j in range(n_suppliers)] for i in range(days)]
        
        # h[i]: 第i天末的库存量 (吨)
        h = [LpVariable(f"inventory_day{i}", lowBound=0, cat='Continuous') for i in range(days)]
        
        # s[i]: 第i天的缺货量 (吨)
        s = [LpVariable(f"shortage_day{i}", lowBound=0, cat='Continuous') for i in range(days)]
        
        # y[i][j]: 是否从供应商j在第i天订购 (二元变量, 用于MOQ约束)
        y = [[LpVariable(f"order_flag_day{i}_supplier{j}", cat='Binary')
              for j in range(n_suppliers)] for i in range(days)]
        
        # ===== 目标函数 =====
        # 最小化: 库存持有成本 + 缺货损失 + 订货次数溢价 (多源溢价)
        cost_holding = lpSum([h[i] * holding_cost_per_unit for i in range(days)])
        cost_shortage = lpSum([s[i] * stockout_cost_per_unit for i in range(days)])
        
        # 多源溢价 (从>1个供应商订购时增加成本5%)
        multi_source_premium = lpSum([
            y[i][j] * suppliers[j]['unit_price'] * 0.05 * lpSum([x[i][j_] for j_ in range(n_suppliers)])
            for i in range(days) for j in range(n_suppliers)
        ])
        
        prob += cost_holding + cost_shortage + multi_source_premium, "Total_Cost"
        
        # ===== 约束条件 =====
        
        # 1. 库存平衡约束
        for i in range(days):
            if i == 0:
                # 第一天: 当前库存 + 订单 - 需求 = 库存 + 缺货
                prob += (
                    current_inventory + 
                    lpSum([x[i][j] for j in range(n_suppliers)]) -
                    forecast_demand[i] == h[i] + s[i],
                    f"Balance_Day{i}"
                )
            else:
                # 后续天: 前日库存 + 订单 - 需求 = 库存 + 缺货
                prob += (
                    h[i-1] + 
                    lpSum([x[i][j] for j in range(n_suppliers)]) -
                    forecast_demand[i] == h[i] + s[i],
                    f"Balance_Day{i}"
                )
        
        # 2. 最小订购量 (MOQ) 约束
        for i in range(days):
            for j in range(n_suppliers):
                moq = suppliers[j]['moq']
                # 如果订购,则订购量 >= MOQ
                prob += (
                    x[i][j] >= moq * y[i][j],
                    f"MOQ_Day{i}_Supplier{j}"
                )
                # 最大订购量 (供应商容量)
                prob += (
                    x[i][j] <= suppliers[j]['capacity'] * y[i][j],
                    f"Capacity_Day{i}_Supplier{j}"
                )
        
        # 3. 安全库存约束 (风险调整)
        avg_demand = np.mean(forecast_demand)
        demand_std = np.std(forecast_demand)
        
        # 计算风险调整的安全库存
        safety_stock = self.calculate_safety_stock(
            avg_demand, demand_std, 
            lead_time_days=suppliers[0]['lead_time'],  # 使用最快的供应商
            sentiment_indices=sentiment_indices
        )
        
        # 平均库存 >= 安全库存
        avg_inventory = lpSum([h[i] for i in range(days)]) / days
        prob += avg_inventory >= safety_stock, "Safety_Stock_Constraint"
        
        # 4. 供应商可靠性约束 (考虑舆情风险)
        for i in range(days):
            for j in range(n_suppliers):
                # 高风险环境下,不完全依赖单一供应商
                if sentiment_indices and sentiment_indices.get('cri', 0) > 6:
                    # CRI>6时,单个供应商订购量不超过预测需求的60%
                    prob += (
                        x[i][j] <= forecast_demand[i] * 0.6,
                        f"Multi_Source_Diversity_Day{i}_Supplier{j}"
                    )
        
        # 5. 缺货容忍度约束 (特定采购商)
        # 采购商等级A: 不允许缺货 (s[i]=0)
        # 采购商等级B: 缺货<平均需求的5%
        # 采购商等级C: 缺货<平均需求的10%
        
        buyer_grades = {'A': 0.0, 'B': 0.05, 'C': 0.10}
        buyer_grade = 'B'  # 默认B级
        max_shortage_ratio = buyer_grades.get(buyer_grade, 0.05)
        
        prob += (
            lpSum([s[i] for i in range(days)]) <= avg_demand * max_shortage_ratio * days,
            "Shortage_Tolerance"
        )
        
        # ===== 求解 =====
        prob.solve(PULP_CBC_CMD(msg=0))
        
        # ===== 提取结果 =====
        replenishment_plan = []
        for i in range(days):
            day_orders = []
            for j in range(n_suppliers):
                order_qty = x[i][j].varValue
                if order_qty and order_qty > 0.1:  # 订购量>0.1单位
                    day_orders.append({
                        'supplier': suppliers[j]['name'],
                        'quantity': round(order_qty, 2),
                        'unit_price': suppliers[j]['unit_price'],
                        'total_cost': round(order_qty * suppliers[j]['unit_price'], 2),
                        'lead_time': suppliers[j]['lead_time']
                    })
            
            if day_orders:  # 只记录有订单的天
                replenishment_plan.append({
                    'day': i + 1,
                    'orders': day_orders,
                    'inventory_end_of_day': round(h[i].varValue, 2),
                    'shortage': round(s[i].varValue, 2)
                })
        
        # 总成本分析
        total_cost = value(prob.objective)
        holding_cost_actual = sum([h[i].varValue * holding_cost_per_unit for i in range(days)])
        shortage_cost_actual = sum([s[i].varValue * stockout_cost_per_unit for i in range(days)])
        
        return {
            'status': LpStatus[prob.status],
            'optimization_successful': LpStatus[prob.status] == 'Optimal',
            'total_cost': round(total_cost, 2),
            'cost_breakdown': {
                'holding_cost': round(holding_cost_actual, 2),
                'shortage_cost': round(shortage_cost_actual, 2),
                'multi_source_premium': round(total_cost - holding_cost_actual - shortage_cost_actual, 2)
            },
            'replenishment_plan': replenishment_plan,
            'summary': {
                'total_orders_days': len(replenishment_plan),
                'avg_inventory_level': round(np.mean([h[i].varValue for i in range(days)]), 2),
                'total_shortage': round(sum([s[i].varValue for i in range(days)]), 2),
                'safety_stock_calculated': round(safety_stock, 2)
            }
        }


# ===== 使用示例 =====

if __name__ == '__main__':
    engine = IntelligentReplenishmentEngine(
        commodity_code='STEEL_REBAR',
        buyer_geo_key='CHINA_EAST_001'
    )
    
    # 供应商配置
    suppliers = [
        {
            'name': '鞍山供应商A',
            'unit_price': 3000,
            'lead_time': 3,
            'capacity': 500,
            'moq': 50,
            'quality_score': 0.95,
            'reliability': 0.92
        },
        {
            'name': '宝山供应商B',
            'unit_price': 3100,
            'lead_time': 5,
            'capacity': 400,
            'moq': 40,
            'quality_score': 0.90,
            'reliability': 0.85
        },
        {
            'name': '日本进口供应商C',
            'unit_price': 3500,
            'lead_time': 20,
            'capacity': 200,
            'moq': 100,
            'quality_score': 0.98,
            'reliability': 0.95
        }
    ]
    
    # 舆情数据
    sentiment = {
        'ldi': 5,   # 物流延迟中等
        'cri': 6,   # 合规风险高
        'gri': 3,   # 地缘政治风险低
        'fri': 4    # 金融风险中等
    }
    
    # 预测需求 (30天)
    forecast = np.array([
        50, 55, 45, 60, 52, 48, 65, 70, 55, 50,
        58, 62, 51, 49, 67, 72, 58, 53, 61, 68,
        54, 50, 59, 64, 52, 47, 63, 69, 55, 51
    ])
    
    # 求解
    result = engine.optimize_replenishment(
        current_inventory=100,
        forecast_demand=forecast,
        holding_cost_per_unit=50,
        stockout_cost_per_unit=2000,
        suppliers=suppliers,
        sentiment_indices=sentiment
    )
    
    print("优化结果:")
    print(f"  优化状态: {result['status']}")
    print(f"  总成本: ¥{result['total_cost']}")
    print(f"  其中: 库存成本 ¥{result['cost_breakdown']['holding_cost']}, " +
          f"缺货成本 ¥{result['cost_breakdown']['shortage_cost']}")
    print(f"  采购计划: 共{result['summary']['total_orders_days']}天有订单")

动态定价与收益优化

5.1 定价模型

基于货运动态定价模型,扩展应用到大宗商品的销售定价:

class DynamicPricingEngine:
    """
    大宗商品动态定价引擎
    综合考虑:
    - 需求预测
    - 库存压力
    - 竞争价格
    - 舆情风险
    - 价格弹性
    """
    
    def __init__(self, commodity_code: str):
        self.commodity_code = commodity_code
        
    def estimate_price_elasticity(self, historical_data: pd.DataFrame) -> float:
        """
        估算商品的价格弹性系数 (Price Elasticity of Demand)
        
        公式: E = (ΔQ / Q) / (ΔP / P)
        
        大宗商品特性:
        - 钢材: E ≈ -0.6 (缺乏弹性, 工业需求稳定)
        - 石油: E ≈ -0.3 (极缺乏弹性, 生活必需)
        - 谷物: E ≈ -0.8 (相对缺乏, 饲料/食品刚需)
        - 铜矿: E ≈ -0.5 (中等缺乏, 工业周期性)
        """
        
        # 从历史数据计算价格与销量的相关性
        if len(historical_data) < 30:
            # 使用商品类型默认值
            default_elasticity = {
                'steel': -0.6,
                'oil': -0.3,
                'grain': -0.8,
                'copper': -0.5
            }
            return default_elasticity.get(self.commodity_code, -0.5)
        
        # 计算相关系数
        price_changes = historical_data['price'].pct_change()
        quantity_changes = historical_data['quantity'].pct_change()
        
        elasticity = quantity_changes.corr(price_changes)
        
        return elasticity
    
    def calculate_optimal_price(self,
                                base_price: float,
                                inventory_ratio: float,  # 库存 / (日均需求 * 30)
                                forecast_demand: float,   # 预期需求
                                competitor_price: float,
                                sentiment_indices: dict,
                                elasticity: float = -0.6) -> dict:
        """
        计算动态价格
        
        影响因素权重:
        - 库存压力 (30%)
        - 需求预测 (35%)
        - 竞争价格 (20%)
        - 舆情风险 (15%)
        """
        
        optimal_price = base_price
        price_factors = {}
        
        # ===== 因子1: 库存压力调整 =====
        # 库存高 → 降价清库存
        # 库存低 → 提价
        if inventory_ratio > 1.5:
            # 库存过高, 建议降价15-20%
            inventory_adjustment = 0.85
            price_factors['inventory'] = -15
        elif inventory_ratio > 1.0:
            # 库存适中偏高, 降价5-10%
            inventory_adjustment = 0.93
            price_factors['inventory'] = -7
        elif inventory_ratio > 0.5:
            # 库存健康, 不调整
            inventory_adjustment = 1.0
            price_factors['inventory'] = 0
        else:
            # 库存不足, 提价10-15%
            inventory_adjustment = 1.12
            price_factors['inventory'] = +12
        
        # ===== 因子2: 需求预测调整 =====
        # 预期需求高 → 提价
        # 预期需求低 → 降价
        avg_demand = historical_data['quantity'].mean()
        demand_ratio = forecast_demand / avg_demand
        
        if demand_ratio > 1.2:
            # 需求激增 (可能由舆情驱动), 提价8-15%
            demand_adjustment = 1.12
            price_factors['demand'] = +12
        elif demand_ratio > 1.0:
            # 需求高于平均, 提价3-7%
            demand_adjustment = 1.05
            price_factors['demand'] = +5
        elif demand_ratio > 0.8:
            # 需求低于平均, 降价2-5%
            demand_adjustment = 0.97
            price_factors['demand'] = -3
        else:
            # 需求明显下滑, 降价7-12%
            demand_adjustment = 0.91
            price_factors['demand'] = -9
        
        # ===== 因子3: 竞争价格调整 =====
        # 根据市场价格与竞争对手调整
        price_vs_market = (base_price - competitor_price) / competitor_price
        
        if price_vs_market > 0.05:
            # 本公司价格比竞争对手高>5%, 降价以改善竞争力
            competition_adjustment = 0.98
            price_factors['competition'] = -2
        elif price_vs_market < -0.05:
            # 本公司价格比竞争对手低>5%, 提价以提升利润
            competition_adjustment = 1.03
            price_factors['competition'] = +3
        else:
            # 价格接近竞争对手, 维持
            competition_adjustment = 1.0
            price_factors['competition'] = 0
        
        # ===== 因子4: 舆情风险调整 =====
        # 高风险环境下可能需要提价以对冲成本上升
        gri = sentiment_indices.get('gri', 0)  # 地缘政治
        cri = sentiment_indices.get('cri', 0)  # 合规风险
        
        sentiment_pressure = (gri * 0.6 + cri * 0.4) / 10
        
        if sentiment_pressure > 0.7:
            # 高风险, 提价8-12% (弥补运输/融资成本增加)
            sentiment_adjustment = 1.10
            price_factors['sentiment'] = +10
        elif sentiment_pressure > 0.5:
            # 中高风险, 提价3-6%
            sentiment_adjustment = 1.04
            price_factors['sentiment'] = +4
        elif sentiment_pressure > 0.3:
            # 中等风险, 不调整或微调
            sentiment_adjustment = 1.0
            price_factors['sentiment'] = 0
        else:
            # 低风险, 维持或小幅降价
            sentiment_adjustment = 1.0
            price_factors['sentiment'] = 0
        
        # ===== 综合计算 =====
        # 加权平均调整因子
        optimal_price = (
            base_price * 
            (0.30 * inventory_adjustment +
             0.35 * demand_adjustment +
             0.20 * competition_adjustment +
             0.15 * sentiment_adjustment)
        )
        
        # 价格下界: 不低于成本的110%
        # 价格上界: 不高于基准价的130%
        min_price = base_price * 0.8
        max_price = base_price * 1.3
        optimal_price = np.clip(optimal_price, min_price, max_price)
        
        # ===== 收益预测 =====
        # 使用价格弹性预测新需求
        price_change_ratio = (optimal_price - base_price) / base_price
        demand_change_ratio = elasticity * price_change_ratio
        projected_demand = forecast_demand * (1 + demand_change_ratio)
        
        projected_revenue = optimal_price * projected_demand
        base_revenue = base_price * forecast_demand
        revenue_lift = (projected_revenue - base_revenue) / base_revenue if base_revenue > 0 else 0
        
        return {
            'base_price': base_price,
            'optimal_price': round(optimal_price, 2),
            'price_change_pct': round(price_change_ratio * 100, 2),
            'factors': {
                'inventory_adjustment': round((inventory_adjustment - 1) * 100, 2),
                'demand_adjustment': round((demand_adjustment - 1) * 100, 2),
                'competition_adjustment': round((competition_adjustment - 1) * 100, 2),
                'sentiment_adjustment': round((sentiment_adjustment - 1) * 100, 2)
            },
            'demand_projection': {
                'base_forecast': round(forecast_demand, 2),
                'price_adjusted_demand': round(projected_demand, 2),
                'demand_elasticity': elasticity
            },
            'revenue_impact': {
                'base_revenue': round(base_revenue, 2),
                'projected_revenue': round(projected_revenue, 2),
                'revenue_lift_pct': round(revenue_lift * 100, 2)
            }
        }


技术架构

6.1 整体数据流

┌─── 舆情系统 (MGIR) ───────────────────────────────────┐
│  LDI / CRI / GRI / FRI 指标 (5分钟更新)               │
└─────────────────────────────┬─────────────────────────┘
                              ↓
┌─── 销量预测模块 ──────────────────────────────────────┐
│  Kafka: sentiment_indices_stream                      │
│  ↓                                                    │
│  Flink: XGBoost推理 (实时打分)                       │
│  ↓                                                    │
│  输出: forecast_demand_stream → Kafka                │
└─────────────────────────────┬─────────────────────────┘
                              ↓
┌─── 补货决策模块 ──────────────────────────────────────┐
│  输入数据:                                           │
│  - forecast_demand_stream (预测)                     │
│  - inventory_snapshot (库存)                         │
│  - supplier_data (供应商参数)                        │
│  - sentiment_indices (舆情因子)                      │
│  ↓                                                   │
│  优化求解: PuLP (整数线性规划)                       │
│  ↓                                                   │
│  输出: replenishment_plan → Doris                   │
└─────────────────────────────┬─────────────────────────┘
                              ↓
┌─── 定价决策模块 ──────────────────────────────────────┐
│  输入数据:                                           │
│  - forecast_demand (已修正)                          │
│  - inventory_ratio (库存压力)                        │
│  - competitor_price (竞争价格)                       │
│  - sentiment_indices (舆情)                          │
│  ↓                                                   │
│  计算: 动态定价公式                                 │
│  ↓                                                   │
│  输出: pricing_strategy → Doris → ERP               │
└─────────────────────────────┬─────────────────────────┘
                              ↓
                    BI仪表板 / 执行系统

6.2 Flink流处理拓扑

Source:
├─ raw_forecast_demand (销量预测输出)
├─ inventory_snapshot (库存变化)
├─ supplier_status (供应商状态)
├─ sentiment_indices_stream (舆情指标)
└─ price_feeds (商品期货价格)

Process:
├─ [1] 数据融合
     └─ 按SKU和采购商合并多源数据

├─ [2] 补货需求计算
     ├─ 计算安全库存 (考虑舆情调整)
     ├─ 计算订购点 (Reorder Point)
     └─ 生成补货触发事件

├─ [3] 供应商评分与选择
     ├─ 实时评估供应商可靠性
     ├─ 考虑交期、质量、成本
     └─ 生成推荐供应商名单

└─ [4] 决策输出生成
      ├─ 补货建议  Doris
      └─ 定价建议  Doris

Sink:
├─ Doris: replenishment_decisions 
├─ Doris: pricing_decisions 
├─ Kafka: decision_stream (用于下游执行)
└─ 告警系统 (关键风险事件)

数据模型与表设计

7.1 Doris表设计

-- 1. 销量预测结果表
CREATE TABLE forecast_demand_doris (
  forecast_id VARCHAR(36) PRIMARY KEY,
  commodity_code VARCHAR(50) NOT NULL,
  buyer_geo_key VARCHAR(50) NOT NULL,
  forecast_date DATE NOT NULL,
  forecast_horizon INT,  -- 预测天数 (1-30)
  base_forecast FLOAT,  -- 基础预测值
  forecast_with_sentiment FLOAT,  -- 舆情调整后的预测
  lower_bound FLOAT,  -- 置信区间下界 (±15%)
  upper_bound FLOAT,
  confidence_score FLOAT,  -- 模型置信度 (0-1)
  sentiment_adjustment_ratio FLOAT,  -- 舆情调整幅度
  gri_influence FLOAT,  -- GRI对预测的影响度
  cri_influence FLOAT,  -- CRI对预测的影响度
  model_version VARCHAR(20),  -- 模型版本
  created_at DATETIME,
  updated_at DATETIME
)
ENGINE = OLAP
DUPLICATE KEY (forecast_id)
PARTITION BY RANGE (forecast_date)
DISTRIBUTED BY HASH(buyer_geo_key) BUCKETS 32
PROPERTIES ("replication_num" = "2");


-- 2. 智能补货决策表
CREATE TABLE replenishment_decisions_doris (
  decision_id VARCHAR(36) PRIMARY KEY,
  commodity_code VARCHAR(50) NOT NULL,
  buyer_geo_key VARCHAR(50) NOT NULL,
  decision_date DATE NOT NULL,
  current_inventory FLOAT,
  safety_stock_required FLOAT,
  replenishment_quantity FLOAT,
  optimal_order_date DATE,  -- 建议订购日期
  
  -- 选中的供应商
  primary_supplier_id VARCHAR(50),
  primary_supplier_name VARCHAR(100),
  primary_quantity FLOAT,
  primary_unit_price FLOAT,
  
  -- 备选供应商 (多源模式)
  secondary_supplier_id VARCHAR(50),
  secondary_supplier_name VARCHAR(100),
  secondary_quantity FLOAT,
  secondary_unit_price FLOAT,
  
  -- 成本分析
  total_cost FLOAT,
  holding_cost FLOAT,
  shortage_cost FLOAT,
  multi_source_premium FLOAT,
  
  -- 风险指标
  estimated_lead_time_days INT,
  sentiment_adjusted_lead_time INT,
  supply_disruption_risk FLOAT,  -- 0-1
  ldi_impact FLOAT,
  cri_impact FLOAT,
  
  -- 状态追踪
  decision_status VARCHAR(20),  -- PENDING/APPROVED/EXECUTED/COMPLETED
  execution_date DATE,
  actual_arrival_date DATE,
  variance_analysis VARCHAR(500),  -- 实际vs计划的偏差分析
  
  created_at DATETIME,
  updated_at DATETIME
)
ENGINE = OLAP
DUPLICATE KEY (decision_id)
PARTITION BY RANGE (decision_date)
DISTRIBUTED BY HASH(buyer_geo_key) BUCKETS 32
PROPERTIES ("replication_num" = "2");


-- 3. 动态定价决策表
CREATE TABLE pricing_decisions_doris (
  pricing_id VARCHAR(36) PRIMARY KEY,
  commodity_code VARCHAR(50) NOT NULL,
  buyer_geo_key VARCHAR(50) NOT NULL,
  pricing_date DATE NOT NULL,
  
  -- 价格信息
  base_price FLOAT,
  optimal_price FLOAT,
  price_change_pct FLOAT,  -- 调整幅度 %
  
  -- 影响因素分解
  inventory_adjustment_pct FLOAT,
  demand_adjustment_pct FLOAT,
  competition_adjustment_pct FLOAT,
  sentiment_adjustment_pct FLOAT,
  
  -- 需求与收益预测
  base_forecast_demand FLOAT,
  price_adjusted_demand FLOAT,
  price_elasticity FLOAT,
  
  base_revenue FLOAT,
  projected_revenue FLOAT,
  revenue_lift_pct FLOAT,
  
  -- 风险因子
  current_inventory_ratio FLOAT,
  competitor_price FLOAT,
  sentiment_pressure_index FLOAT,  -- 0-1
  gri_factor FLOAT,
  cri_factor FLOAT,
  
  -- 执行跟踪
  pricing_status VARCHAR(20),  -- RECOMMENDED/APPROVED/ACTIVE/EXPIRED
  actual_sales_qty FLOAT,
  actual_revenue FLOAT,
  forecast_accuracy FLOAT,  -- 预测与实际的准确度
  
  created_at DATETIME,
  valid_from_date DATE,
  valid_until_date DATE,
  updated_at DATETIME
)
ENGINE = OLAP
DUPLICATE KEY (pricing_id)
PARTITION BY RANGE (pricing_date)
DISTRIBUTED BY HASH(commodity_code) BUCKETS 32
PROPERTIES ("replication_num" = "2");


-- 4. 舆情指标与库存影响关联表
CREATE TABLE sentiment_inventory_impact_doris (
  impact_id VARCHAR(36) PRIMARY KEY,
  commodity_code VARCHAR(50) NOT NULL,
  buyer_geo_key VARCHAR(50) NOT NULL,
  impact_date DATE NOT NULL,
  
  -- 舆情指标 (来自MGIR系统)
  ldi_score FLOAT,
  cri_score FLOAT,
  gri_score FLOAT,
  fri_score FLOAT,
  composite_risk_index FLOAT,
  
  -- 库存影响
  safety_stock_base FLOAT,  -- 无舆情时的安全库存
  safety_stock_adjusted FLOAT,  -- 舆情调整后的安全库存
  stock_increase_ratio FLOAT,  -- 增加比例
  
  -- 需求冲击
  demand_surge_probability FLOAT,  -- 0-1, 预期需求激增的概率
  estimated_surge_magnitude FLOAT,  -- 预期激增幅度 %
  
  -- 供应风险
  lead_time_days_base INT,
  lead_time_days_adjusted INT,
  lead_time_increase_days INT,
  
  -- 决策建议
  action_recommended VARCHAR(100),  -- 建议行动
  urgency_level VARCHAR(20),  -- NORMAL/HIGH/CRITICAL
  
  created_at DATETIME
)
ENGINE = OLAP
DUPLICATE KEY (impact_id)
PARTITION BY RANGE (impact_date)
DISTRIBUTED BY HASH(buyer_geo_key) BUCKETS 32
PROPERTIES ("replication_num" = "2");


-- 5. 供应商绩效与风险评分表
CREATE TABLE supplier_performance_doris (
  supplier_id VARCHAR(50) PRIMARY KEY,
  supplier_name VARCHAR(100),
  supplier_country VARCHAR(50),
  supplier_region VARCHAR(50),
  
  -- 历史绩效
  avg_delivery_days INT,
  on_time_delivery_rate FLOAT,  -- 0-1
  quality_score FLOAT,  -- 0-1
  
  -- 风险评分
  location_risk_score FLOAT,  -- 基于GRI的地理位置风险
  financial_risk_score FLOAT,  -- 基于FRI的财务风险
  regulatory_risk_score FLOAT,  -- 基于CRI的监管风险
  overall_risk_score FLOAT,  -- 综合风险 0-10
  
  -- 容量与约束
  capacity_monthly FLOAT,
  moq FLOAT,
  current_committed_qty FLOAT,
  available_capacity FLOAT,
  
  -- 实时状态
  current_sentiment_status VARCHAR(20),  -- GREEN/YELLOW/ORANGE/RED
  last_updated DATETIME,
  next_assessment_date DATE
)
ENGINE = OLAP
DUPLICATE KEY (supplier_id)
DISTRIBUTED BY HASH(supplier_country) BUCKETS 16
PROPERTIES ("replication_num" = "2");

7.2 Paimon表设计 (湖表一体)

-- 1. 预测原始结果Paimon表
CREATE TABLE forecast_demand_paimon (
  forecast_id STRING NOT NULL,
  commodity_code STRING,
  buyer_geo_key STRING,
  forecast_date DATE,
  forecast_value FLOAT,
  sentiment_adjusted_value FLOAT,
  input_features STRING,  -- JSON格式的所有输入特征
  model_version STRING,
  processing_timestamp TIMESTAMP(3),
  update_version INT,
  PRIMARY KEY (forecast_id) NOT ENFORCED
)
PARTITIONED BY (forecast_date)
WITH (
  'bucket' = '64',
  'file.compression' = 'lz4'
);


-- 2. 补货决策历史表
CREATE TABLE replenishment_history_paimon (
  decision_id STRING NOT NULL,
  commodity_code STRING,
  buyer_geo_key STRING,
  decision_date DATE,
  decision_snapshot JSON,  -- 完整决策结果
  execution_snapshot JSON,  -- 实际执行结果
  variance_analysis STRING,
  processing_timestamp TIMESTAMP(3),
  update_version INT,
  PRIMARY KEY (decision_id) NOT ENFORCED
)
PARTITIONED BY (decision_date)
WITH (
  'bucket' = '32',
  'file.compression' = 'lz4'
);