EasyQuant Factor Strategies and Quantitative Research Best Practices: From Factor Mining to a Closed Strategy Loop


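Bottom line up front: EasyQuant's Factor Factory (Strategy Factory) module turns quantitative research from a "personal notebook" into a reusable, traceable, and collaborative team asset. A single loop of factor definition → experiment → release → strategy binding carries a factor from research into production without manual hand-offs.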


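I. Problems with Traditional Quantitative Research

1) Pain points in the research workflow

Researcher A: "Where did I put that factor I wrote last week?"
Researcher B: "I sent it to you, but I think that version has a bug..."
Researcher C: "Is the factor you're talking about the same version that's running live?"

Common problems

  • Factor code is scattered across personal machines and cannot be managed centrally
  • Versions multiply: "final", "revised", and "revised again" copies circulate side by side
  • Research and production are disconnected: a factor may be effective yet never go live
  • Experiment records get lost, so historical results cannot be reproduced
  • Team collaboration is hard, and knowledge does not accumulate

2) The Factor Factory solution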

┌────────────────────────────────────────────┐
│  Factor Factory (Strategy Factory)         │
├────────────────────────────────────────────┤
│  Project management                        │
│    ├── Factor Definition                   │
│    ├── Experiment                          │
│    ├── Backtest                            │
│    └── Release                             │
├────────────────────────────────────────────┤
│  Outputs                                   │
│    ├── Factor Asset                        │
│    ├── Strategy Template                   │
│    └── Release Report                      │
└────────────────────────────────────────────┘

II. Factor Definition and Data Model

1) Core table structure

-- Factor definition table
CREATE TABLE strategy_factory_factor_definition (
    id BIGSERIAL PRIMARY KEY,
    tenant_id BIGINT NOT NULL,
    name VARCHAR(100) NOT NULL,
    code VARCHAR(50) NOT NULL,           -- factor code
    category VARCHAR(50),                 -- factor category (MOMENTUM, VOLATILITY, etc.)
    description TEXT,
    calc_expr_json JSONB,                -- calculation expression
    dsl_default_json JSONB,              -- default DSL template
    ui_hints_json JSONB,                 -- UI hints
    status VARCHAR(20) DEFAULT 'DRAFT',  -- DRAFT, ACTIVE, DEPRECATED
    created_by BIGINT,
    created_at TIMESTAMP DEFAULT NOW(),
    updated_at TIMESTAMP DEFAULT NOW()
);

-- Factor version table
CREATE TABLE strategy_factory_factor_version (
    id BIGSERIAL PRIMARY KEY,
    factor_definition_id BIGINT NOT NULL REFERENCES strategy_factory_factor_definition(id),
    version INT NOT NULL,
    calc_expr TEXT,                      -- calculation expression for this version
    params JSONB,                        -- parameter dictionary
    created_at TIMESTAMP DEFAULT NOW(),
    UNIQUE(factor_definition_id, version)
);

-- Factor experiment table
CREATE TABLE strategy_factory_experiment (
    id BIGSERIAL PRIMARY KEY,
    tenant_id BIGINT NOT NULL,
    project_id BIGINT,
    name VARCHAR(200) NOT NULL,
    description TEXT,
    status VARCHAR(20) DEFAULT 'RUNNING', -- RUNNING, COMPLETED, FAILED
    config_json JSONB,                    -- experiment configuration
    result_json JSONB,                    -- experiment results
    trace_json JSONB,                     -- traceability information
    created_by BIGINT,
    created_at TIMESTAMP DEFAULT NOW()
);

-- Factor release record table
CREATE TABLE strategy_factory_artifact (
    id BIGSERIAL PRIMARY KEY,
    tenant_id BIGINT NOT NULL,
    factor_definition_id BIGINT NOT NULL,
    version_id BIGINT NOT NULL,
    release_note TEXT,
    published_by BIGINT,
    published_at TIMESTAMP DEFAULT NOW(),
    status VARCHAR(20) DEFAULT 'ACTIVE'  -- ACTIVE, INACTIVE
);
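
The `UNIQUE(factor_definition_id, version)` constraint is what keeps an existing version from being silently overwritten. The sketch below illustrates that behaviour with an in-memory SQLite table standing in for the PostgreSQL schema. The reduced `factor_version` table and the `add_version` helper are hypothetical names introduced for illustration, not part of EasyQuant.

```python
import sqlite3

# Reduced SQLite stand-in for strategy_factory_factor_version (assumption:
# the real table lives in PostgreSQL; only the columns needed here are kept).
DDL = """
CREATE TABLE factor_version (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    factor_definition_id INTEGER NOT NULL,
    version INTEGER NOT NULL,
    calc_expr TEXT,
    UNIQUE (factor_definition_id, version)
)
"""


def add_version(conn: sqlite3.Connection, factor_definition_id: int, calc_expr: str) -> int:
    """Insert the next version of a factor and return its version number."""
    row = conn.execute(
        "SELECT COALESCE(MAX(version), 0) + 1 FROM factor_version "
        "WHERE factor_definition_id = ?",
        (factor_definition_id,),
    ).fetchone()
    next_version = row[0]
    conn.execute(
        "INSERT INTO factor_version (factor_definition_id, version, calc_expr) "
        "VALUES (?, ?, ?)",
        (factor_definition_id, next_version, calc_expr),
    )
    return next_version


conn = sqlite3.connect(":memory:")
conn.execute(DDL)
v1 = add_version(conn, 100, "close[-1] / close[-20] - 1")
v2 = add_version(conn, 100, "(close[-1] / close[-20] - 1) * 100")

# Re-using an existing version number is rejected by the UNIQUE constraint.
try:
    conn.execute(
        "INSERT INTO factor_version (factor_definition_id, version, calc_expr) "
        "VALUES (100, 1, 'overwritten')"
    )
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True
```

Under concurrent writers, the read-then-insert step in `add_version` can race. The constraint still protects the data, so a caller would typically retry on an integrity error.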

2) Factor expression design

{
  "code": "MOM_20D",
  "category": "MOMENTUM",
  "name": "20-day momentum factor",
  "description": "Close-price return over the past 20 trading days, in percent",
  "calc_expr": "(close[-1] / close[-20] - 1) * 100",
  "params": {
    "period": 20,
    "price_type": "close"
  },
  "ui_hints": {
    "dslKind": "RULE_TREE",
    "categoryColor": "#4CAF50"
  }
}
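
To make the indexing in `calc_expr` concrete, here is a small sketch that evaluates the same formula on a plain Python list. It assumes the expression uses Python-style negative indexing, as the Python evaluator later in this article suggests. The `mom_pct` helper is a hypothetical name. Under that assumption, `close[-20]` is the 20th close counting from the end, so the window spans 20 closes and 19 bar-to-bar intervals; a return over 20 full intervals would use `close[-21]` instead.

```python
def mom_pct(close: list[float], lookback: int = 20) -> float:
    """Evaluate (close[-1] / close[-lookback] - 1) * 100 on a list of closes."""
    if len(close) < lookback:
        raise ValueError(f"need at least {lookback} closes, got {len(close)}")
    return (close[-1] / close[-lookback] - 1) * 100


# 20 synthetic closes rising by 1 per bar: 100, 101, ..., 119
closes = [100.0 + i for i in range(20)]
value = mom_pct(closes)  # close[-1] is 119, close[-20] is 100
```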

3) Factor category taxonomy

public enum FactorCategory {
    MOMENTUM("Momentum", "#4CAF50"),
    VOLATILITY("Volatility", "#2196F3"),
    VOLUME("Volume", "#FF9800"),
    REVERSAL("Reversal", "#9C27B0"),
    QUALITY("Quality", "#00BCD4"),
    SENTIMENT("Sentiment", "#F44336"),
    MACRO("Macro", "#795548"),
    CUSTOM("Custom", "#607D8B");
    
    private final String label;
    private final String color;
    
    FactorCategory(String label, String color) {
        this.label = label;
        this.color = color;
    }
}

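III. Factor Experiment Workflow

1) Experiment configuration and execution

// StrategyFactoryExperimentService.java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import org.springframework.stereotype.Service;

@Service
public class StrategyFactoryExperimentService {

    // Collaborator used below. Its declaration is not shown in the original
    // excerpt, so the type name FactorService is an assumption.
    private final FactorService factorService;

    public StrategyFactoryExperimentService(FactorService factorService) {
        this.factorService = factorService;
    }

    public ExperimentResult runExperiment(ExperimentConfig config) {
        // 1. Validate the configuration
        validateConfig(config);
        
        // 2. Load market data
        List<BarPoint> bars = loadMarketData(config.getSymbols(), config.getDateRange());
        
        // 3. Compute factor values
        Map<String, List<Double>> factorValues = new HashMap<>();
        for (String factorCode : config.getFactors()) {
            FactorDefinition factor = factorService.getByCode(factorCode);
            List<Double> values = calculateFactor(factor, bars);
            factorValues.put(factorCode, values);
        }
        
        // 4. IC/IR analysis
        ICAnalysis icAnalysis = analyzeIC(factorValues, config.getReturnSeries());
        
        // 5. Build the experiment report
        return buildExperimentResult(config, factorValues, icAnalysis);
    }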
    
    private ICAnalysis analyzeIC(Map<String, List<Double>> factorValues, 
                                  List<Double> returns) {
        Map<String, ICResult> icResults = new HashMap<>();
        
        for (Map.Entry<String, List<Double>> entry : factorValues.entrySet()) {
            String factorCode = entry.getKey();
            List<Double> factorSeries = entry.getValue();
            
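            // Full-sample IC (Information Coefficient)
            double ic = calculateIC(factorSeries, returns);
            
            // IR (Information Ratio), derived from the IC time series
            List<Double> icSeries = calculateICSeries(factorSeries, returns);
            double ir = calculateIR(icSeries);
            
            // Factor turnover
            double turnover = calculateTurnover(factorSeries);
            
            // Simplified three-argument form. The ICResult record listed under
            // "Factor analysis dimensions" has different components
            // (icMean, icStd, ir, icCorr, pValue).
            icResults.put(factorCode, new ICResult(ic, ir, turnover));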
        }
        
        return new ICAnalysis(icResults);
    }
    
    private double calculateIC(List<Double> factor, List<Double> returns) {
        int n = Math.min(factor.size(), returns.size());
        // Too few observations for a meaningful correlation; 0.0 is reported as "no signal"
        if (n < 10) return 0.0;
        
        // Align both series on their most recent n observations
        List<Double> f = factor.subList(factor.size() - n, factor.size());
        List<Double> r = returns.subList(returns.size() - n, returns.size());
        
        return pearsonCorrelation(f, r);
    }
}
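
The IC and IR helpers called above (`pearsonCorrelation`, `calculateICSeries`, `calculateIR`) are not shown in the excerpt. As a rough sketch of what they might compute, the Python below uses a plain Pearson correlation for the IC, a sliding window for the IC series, and the IC mean divided by its sample standard deviation for the IR. The function names and the sliding-window choice are assumptions made for illustration, not EasyQuant's actual implementation.

```python
import math


def pearson(x: list[float], y: list[float]) -> float:
    """Pearson correlation of two equally long series; NaN if undefined."""
    n = len(x)
    if n != len(y) or n < 2:
        return float("nan")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return float("nan")
    return cov / math.sqrt(var_x * var_y)


def rolling_ic(factor: list[float], returns: list[float], window: int) -> list[float]:
    """IC computed over each sliding window of `window` observations."""
    n = min(len(factor), len(returns))
    return [
        pearson(factor[i:i + window], returns[i:i + window])
        for i in range(n - window + 1)
    ]


def information_ratio(ic_series: list[float]) -> float:
    """IR = mean(IC) / sample standard deviation of IC; NaN if undefined."""
    n = len(ic_series)
    if n < 2:
        return float("nan")
    mean = sum(ic_series) / n
    variance = sum((v - mean) ** 2 for v in ic_series) / (n - 1)
    if variance == 0:
        return float("nan")
    return mean / math.sqrt(variance)


# Toy data: returns are an exact linear function of the factor.
factor = [1.0, 2.0, 3.0, 4.0, 5.0]
returns = [2.0, 4.0, 6.0, 8.0, 10.0]
full_ic = pearson(factor, returns)
ic_windows = rolling_ic(factor, returns, window=3)
ir = information_ratio([0.02, 0.04, 0.06])
```

Returning NaN for undefined cases, rather than 0.0 as the Java excerpt does, keeps "not enough data" distinguishable from "no correlation"; which convention fits better depends on how downstream reports treat missing values.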

2) Factor analysis dimensions

public record FactorAnalysisReport(
    String factorCode,
    ICResult icResult,                    // IC/IR analysis
    DecayAnalysis decayAnalysis,          // IC decay analysis
    SectorAnalysis sectorAnalysis,        // sector analysis
    PeriodAnalysis periodAnalysis,        // period analysis
    RobustnessResult robustnessResult     // robustness tests
) {}

public record ICResult(
    double icMean,       // mean IC
    double icStd,        // standard deviation of IC
    double ir,           // IR (IC mean / IC std)
    double icCorr,       // autocorrelation of the IC series
    double pValue        // p-value for IC significance
) {}

// IC decay analysis
public record DecayAnalysis(
    int[] delays,        // lags in days, e.g. [0, 1, 2, 3, 5, 10, 20]
    double[] icValues    // IC at each lag
) {}

// Robustness tests
public record RobustnessResult(
    boolean noiseTest,       // noise-injection test
    boolean bootstrapTest,   // bootstrap test
    boolean outlierTest,     // outlier-handling test
    boolean lagStabilityTest // lag-stability test
) {}
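
The `DecayAnalysis` record pairs each lag with an IC value. One way to fill it, sketched below under the assumption that the IC at lag `d` is the Pearson correlation between the factor at time `t` and the return at time `t + d`, is to shift the return series before correlating. The `ic_decay` function is a hypothetical helper, not an EasyQuant API.

```python
import math


def pearson(x: list[float], y: list[float]) -> float:
    """Pearson correlation of two equally long series; NaN if undefined."""
    n = len(x)
    if n != len(y) or n < 2:
        return float("nan")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return float("nan")
    return cov / math.sqrt(var_x * var_y)


def ic_decay(factor: list[float], returns: list[float], delays: list[int]) -> dict[int, float]:
    """IC between factor[t] and returns[t + delay], for each delay."""
    result = {}
    for delay in delays:
        shifted_returns = returns[delay:]
        n = min(len(factor), len(shifted_returns))
        result[delay] = pearson(factor[:n], shifted_returns[:n])
    return result


# Toy data in which each return equals the previous bar's factor value,
# so the IC should peak at a delay of 1.
factor = [1.0, 3.0, 2.0, 5.0, 4.0, 7.0, 6.0, 9.0]
returns = [0.0] + factor[:-1]
decay = ic_decay(factor, returns, delays=[0, 1, 2])
```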

3) Experiment tracking and provenance

// StrategyFactoryExperimentTraceService.java
@Service
public class StrategyFactoryExperimentTraceService {
    
    public void recordTrace(Experiment experiment, TraceEvent event) {
        ExperimentTrace trace = new ExperimentTrace();
        trace.setExperimentId(experiment.getId());
        trace.setEventType(event.getType());
        trace.setEventData(toJson(event));
        trace.setTimestamp(LocalDateTime.now());
        
        // Record the full execution context
        trace.setInputHash(calculateInputHash(experiment.getConfig()));
        trace.setEnvironment(getSystemInfo());
        trace.setVersion(getCodeVersion());
        
        traceRepository.save(trace);
    }
    
    public ExperimentReplay loadForReplay(long experimentId) {
        Experiment experiment = experimentRepository.findById(experimentId);
        List<ExperimentTrace> traces = traceRepository.findByExperimentId(experimentId);
        
        return new ExperimentReplay(
            experiment.getConfig(),
            traces,
            loadMarketDataSnapshot(experiment.getConfig()),
            getReproducibleEnvironment()
        );
    }
}
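
`calculateInputHash` is not shown in the excerpt. A common approach, sketched here as an assumption rather than as EasyQuant's actual code, is to serialize the experiment configuration into a canonical JSON string (sorted keys, fixed separators) and hash it, so that two logically identical configurations always map to the same digest. The `input_hash` name is hypothetical.

```python
import hashlib
import json


def input_hash(config: dict) -> str:
    """SHA-256 digest of a canonical JSON rendering of the config."""
    canonical = json.dumps(
        config,
        sort_keys=True,          # key order must not affect the hash
        separators=(",", ":"),   # no incidental whitespace
        ensure_ascii=False,
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


config_a = {"symbols": ["000001"], "factors": ["MOM_20D"], "window": 20}
config_b = {"window": 20, "factors": ["MOM_20D"], "symbols": ["000001"]}
config_c = {"symbols": ["000001"], "factors": ["MOM_60D"], "window": 20}

hash_a = input_hash(config_a)
hash_b = input_hash(config_b)
hash_c = input_hash(config_c)
```

List order is still significant in this sketch, so `["MOM_20D", "MOM_60D"]` and `["MOM_60D", "MOM_20D"]` hash differently; sorting such lists first would be a separate design decision.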

IV. Closing the Loop: Binding Factors to Strategies

1) Factor binding DSL

{
  "name": "Momentum-factor-based strategy",
  "templateCode": "FACTOR_SIGNAL",
  "params": {
    "factorBindings": [
      {
        "factorCode": "MOM_20D",
        "factorVersionId": 123,
        "direction": "LONG",
        "weight": 0.6
      },
      {
        "factorCode": "VOL_20D",
        "factorVersionId": 124,
        "direction": "SHORT",
        "weight": 0.4
      }
    ],
    "buyThreshold": 0.02,
    "sellThreshold": -0.02,
    "_easyquantMeta": {
      "sfFactorBindings": [
        {
          "factorDefinitionId": 100,
          "factorVersionId": 123,
          "direction": "LONG"
        }
      ]
    }
  }
}

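2) Factor score calculation

// Factor-binding evaluation in StrategySignalEngine.java
public Optional<FactorSignalResult> evaluateFactorBindingsResult(
        Map<String, Object> params,
        List<BarAggregationService.BarPoint> bars,
        long tenantId,
        String marketId,
        String symbolCode
) {
    // 1. Extract the factor bindings
    List<Map<String, Object>> bindings = extractFactorBindings(params);
    
    // Thresholds come from the strategy params (see the binding DSL). The
    // fallback values 0.02 / -0.02 mirror that example and are an assumption.
    double buyThreshold = ((Number) params.getOrDefault("buyThreshold", 0.02)).doubleValue();
    double sellThreshold = ((Number) params.getOrDefault("sellThreshold", -0.02)).doubleValue();
    
    // 2. Compute the composite score.
    // This excerpt takes a plain average; the "weight" and "direction"
    // fields of each binding are not applied here.
    double scoreSum = 0.0;
    for (Map<String, Object> b : bindings) {
        String factorCode = String.valueOf(b.getOrDefault("factorCode", ""));
        Long factorVersionId = toLongOrNull(b.get("factorVersionId"));
        
        // Prefer the score precomputed by the Python service
        Double score = tryPrecomputedScore(tenantId, factorVersionId, marketId, symbolCode);
        
        // Fallback: evaluate the expression in real time with Groovy
        if (score == null) {
            score = evaluateFactorExpression(factorVersionId, bars);
        }
        
        scoreSum += score;
    }
    
    double avgScore = scoreSum / Math.max(1, bindings.size());
    
    // 3. Generate the signal
    String side = null;
    if (avgScore > buyThreshold) side = "BUY";
    else if (avgScore < sellThreshold) side = "SELL";
    
    return Optional.of(new FactorSignalResult(avgScore, side));
}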
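
The Java excerpt averages scores without using the `weight` and `direction` fields that the binding DSL carries. One plausible reading of those fields, offered here as an assumption rather than as EasyQuant's actual behaviour, is a weight-normalized sum in which `SHORT` bindings flip the sign of their score. The `combine_scores` function below is a hypothetical illustration of that reading.

```python
from typing import Optional


def combine_scores(
    bindings: list[dict],
    scores: dict[int, float],
    buy_threshold: float = 0.02,
    sell_threshold: float = -0.02,
) -> Optional[tuple[float, Optional[str]]]:
    """Weight- and direction-aware composite score plus a BUY/SELL signal.

    `scores` maps factorVersionId to that factor's score. Bindings without a
    score are skipped. Returns None when no binding could be scored.
    """
    weighted_sum = 0.0
    total_weight = 0.0
    for binding in bindings:
        score = scores.get(binding["factorVersionId"])
        if score is None:
            continue
        # Assumption: SHORT means a high factor value is bearish.
        sign = -1.0 if binding.get("direction") == "SHORT" else 1.0
        weight = float(binding.get("weight", 1.0))
        weighted_sum += sign * weight * score
        total_weight += weight
    if total_weight == 0:
        return None

    composite = weighted_sum / total_weight
    if composite > buy_threshold:
        side = "BUY"
    elif composite < sell_threshold:
        side = "SELL"
    else:
        side = None
    return composite, side


# Bindings taken from the DSL example; the score values are made up.
bindings = [
    {"factorCode": "MOM_20D", "factorVersionId": 123, "direction": "LONG", "weight": 0.6},
    {"factorCode": "VOL_20D", "factorVersionId": 124, "direction": "SHORT", "weight": 0.4},
]
result = combine_scores(bindings, {123: 0.05, 124: 0.01})
```

With these inputs the composite is 0.6 × 0.05 − 0.4 × 0.01 = 0.026, which clears the 0.02 buy threshold.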

3) The complete closed loop

┌───────────────────────────────────────────────────────────────┐
│           Factor Factory: the complete closed loop            │
├───────────────────────────────────────────────────────────────┤
│                                                               │
│  ┌───────────────┐    ┌───────────────┐    ┌───────────────┐  │
│  │ Define factor │───▶│  Experiment   │───▶│    Release    │  │
│  └───────────────┘    └───────────────┘    └───────────────┘  │
│          │                                         │          │
│          │                                         ▼          │
│          │                                 ┌───────────────┐  │
│          │                                 │ Factor asset  │  │
│          │                                 └───────────────┘  │
│          │                                         │          │
│          ▼                                         ▼          │
│  ┌───────────────┐                         ┌───────────────┐  │
│  │   Backtest    │◀────────────────────────│ Bind strategy │  │
│  └───────────────┘                         └───────────────┘  │
│          │                                         │          │
│          │                                         ▼          │
│          │                                 ┌───────────────┐  │
│          │                                 │   Activate    │  │
│          │                                 └───────────────┘  │
│          │                                         │          │
│          ▼                                         ▼          │
│  ┌───────────────┐                         ┌───────────────┐  │
│  │  Performance  │                         │ Live trading  │  │
│  └───────────────┘                         └───────────────┘  │
│          │                                         │          │
│          └────────────────────┬────────────────────┘          │
│                               │                               │
│                               ▼                               │
│                       ┌───────────────┐                       │
│                       │   Feedback    │                       │
│                       └───────────────┘                       │
└───────────────────────────────────────────────────────────────┘

V. Frontend Implementation

1) Factor definition page

<!-- FactorDefinitionView.vue -->
<template>
  <div class="factor-definition">
    <el-form :model="factor" label-width="120px">
      <el-form-item label="Factor code">
        <el-input v-model="factor.code" placeholder="e.g. MOM_20D" />
      </el-form-item>
      
      <el-form-item label="Factor name">
        <el-input v-model="factor.name" placeholder="e.g. 20-day momentum" />
      </el-form-item>
      
      <el-form-item label="Factor category">
        <el-select v-model="factor.category">
          <el-option 
            v-for="cat in categories" 
            :key="cat.value" 
            :label="cat.label" 
            :value="cat.value" 
          />
        </el-select>
      </el-form-item>
      
      <el-form-item label="Calculation expression">
        <el-input 
          v-model="factor.calcExpr" 
          type="textarea" 
          :rows="4"
          placeholder="close[-1] / close[-20] - 1"
        />
        <div class="expr-help">
          Supported functions: SMA, EMA, RSI, MACD, ATR, HHV, LLV, REF, shift
        </div>
      </el-form-item>
      
      <el-form-item label="DSL template">
        <dsl-editor v-model="factor.dslDefault" />
      </el-form-item>
    </el-form>
    
    <div class="form-actions">
      <el-button @click="saveDraft">Save draft</el-button>
      <el-button type="primary" @click="publish">Publish factor</el-button>
    </div>
  </div>
</template>

2) Factor experiment configuration

<!-- FactorExperimentView.vue -->
<template>
  <div class="experiment-config">
    <el-card header="Experiment configuration">
      <el-form :model="config" label-width="140px">
        <el-form-item label="Experiment name">
          <el-input v-model="config.name" />
        </el-form-item>
        
        <el-form-item label="Symbol universe">
          <symbol-selector 
            v-model="config.symbols" 
            :universe-id="config.universeId"
          />
        </el-form-item>
        
        <el-form-item label="Date range">
          <el-date-picker
            v-model="config.dateRange"
            type="daterange"
            range-separator="to"
            start-placeholder="Start date"
            end-placeholder="End date"
          />
        </el-form-item>
        
        <el-form-item label="Factors">
          <factor-selector 
            v-model="config.factors" 
            :category="config.category"
          />
        </el-form-item>
        
        <el-form-item label="Target return series">
          <return-selector v-model="config.returnType" />
        </el-form-item>
      </el-form>
    </el-card>
    
    <el-card header="Analysis dimensions" class="mt-16">
      <el-checkbox-group v-model="config.analysisDimensions">
        <el-checkbox label="ic">IC/IR analysis</el-checkbox>
        <el-checkbox label="decay">IC decay analysis</el-checkbox>
        <el-checkbox label="sector">Sector analysis</el-checkbox>
        <el-checkbox label="period">Period analysis</el-checkbox>
        <el-checkbox label="robustness">Robustness tests</el-checkbox>
      </el-checkbox-group>
    </el-card>
    
    <div class="actions">
      <el-button type="primary" :loading="running" @click="runExperiment">
        Run experiment
      </el-button>
    </div>
  </div>
</template>

3) Experiment result display

<!-- ExperimentResultView.vue -->
<template>
  <div class="experiment-result">
    <el-row :gutter="16">
      <el-col :span="6" v-for="metric in summaryMetrics" :key="metric.name">
        <metric-card :label="metric.label" :value="metric.value" :trend="metric.trend" />
      </el-col>
    </el-row>
    
    <el-row :gutter="16" class="mt-16">
      <el-col :span="12">
        <el-card header="IC time series">
          <ic-chart :data="icTimeSeries" />
        </el-card>
      </el-col>
      <el-col :span="12">
        <el-card header="IC decay">
          <decay-chart :data="icDecay" />
        </el-card>
      </el-col>
    </el-row>
    
    <el-card header="Factor ranking" class="mt-16">
      <el-table :data="factorRanking" stripe>
        <el-table-column prop="rank" label="Rank" width="80" />
        <el-table-column prop="code" label="Factor code" />
        <el-table-column prop="name" label="Factor name" />
        <el-table-column prop="icMean" label="IC mean" sortable />
        <el-table-column prop="ir" label="IR" sortable />
        <el-table-column prop="turnover" label="Turnover" />
        <el-table-column label="Actions">
          <template #default="{ row }">
            <el-button size="small" @click="bindToStrategy(row)">
              Bind to strategy
            </el-button>
          </template>
        </el-table-column>
      </el-table>
    </el-card>
  </div>
</template>

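VI. Python Factor Service Integration

1) Factor precomputation architecture

# factor_daily_score_scheduler.py
from datetime import date, timedelta

# FactorDailyScore and the injected collaborators (market_data, factor_service,
# score_repo) are assumed to be provided elsewhere in the project.


class FactorDailyScoreScheduler:
    """Batch-compute factor scores after each daily market close."""
    
    def run(self, trading_date: date):
        # 1. Symbols that traded on this date
        symbols = self.market_data.get_traded_symbols(trading_date)
        
        # 2. All active factor versions
        active_factors = self.factor_service.get_active_factors()
        
        # 3. Compute a score for every (symbol, factor) pair
        scores = []
        for symbol in symbols:
            for factor in active_factors:
                score = self.calculate_factor_score(
                    factor, symbol, trading_date
                )
                scores.append(score)
        
        # 4. Bulk-insert into the database
        self.score_repo.batch_insert(scores)
    
    def calculate_factor_score(self, factor, symbol, trading_date: date):
        # Load the bars the factor needs. The window is 60 calendar days, so
        # factors with a longer lookback (e.g. MOM_120D) need a wider range.
        bars = self.market_data.get_bars(
            symbol, 
            start_date=trading_date - timedelta(days=60),
            end_date=trading_date
        )
        
        # Evaluate the factor expression
        expr = factor.calc_expr
        factor_value = self.evaluate_expression(expr, bars)
        
        return FactorDailyScore(
            tenant_id=factor.tenant_id,
            factor_version_id=factor.version_id,
            market_id=symbol.market_id,
            symbol_code=symbol.code,
            score_date=trading_date,
            factor_value=factor_value
        )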

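2) Factor expression execution

# factor_expression_evaluator.py
import logging

import numpy as np
import pandas as pd

logger = logging.getLogger(__name__)

# rsi_calculator and macd_calculator are used below but not shown in the
# original excerpt; they are assumed to be defined elsewhere in the project.


class FactorExpressionEvaluator:
    """Evaluate factor expressions against a restricted namespace.

    Note: eval() with empty __builtins__ limits the visible names but is not
    a real sandbox, so expressions should come from trusted authors only.
    """
    
    SUPPORTED_FUNCTIONS = {
        'SMA': lambda s, p: np.mean(s[-p:]),
        # The original called an undefined ewma(); pandas' ewm is used instead
        'EMA': lambda s, p: pd.Series(s).ewm(span=p).mean().iloc[-1],
        'RSI': rsi_calculator,
        'MACD': macd_calculator,
        'HHV': lambda s, p: np.max(s[-p:]),
        'LLV': lambda s, p: np.min(s[-p:]),
        'STD': lambda s, p: np.std(s[-p:]),
        'CORR': lambda x, y, p: np.corrcoef(x[-p:], y[-p:])[0,1],
    }
    
    def evaluate(self, expression: str, bars: pd.DataFrame) -> float:
        # Build the evaluation context (price series plus supported functions)
        context = self.build_context(bars)
        
        # Normalize function names in the expression
        expr = self.normalize_expression(expression)
        
        # Evaluate the expression
        try:
            result = eval(expr, {"__builtins__": {}}, context)
            return float(result)
        except Exception as e:
            # 0.0 is returned on failure, which is indistinguishable from a
            # genuine zero score; check the log when scores look suspicious
            logger.warning(f"Expression evaluation failed: {expression}, error: {e}")
            return 0.0
    
    def build_context(self, bars: pd.DataFrame) -> dict:
        close = bars['close'].tolist()
        open_ = bars['open'].tolist()
        high = bars['high'].tolist()
        low = bars['low'].tolist()
        volume = bars['volume'].tolist()
        
        return {
            'close': close,
            'open': open_,
            'high': high,
            'low': low,
            'volume': volume,
            **self.SUPPORTED_FUNCTIONS
        }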
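
Because `eval` with an emptied `__builtins__` can still be escaped through attribute access on ordinary objects, a stricter alternative is to parse the expression and walk only an allow-list of syntax nodes. The sketch below is one such evaluator built on the standard `ast` module. It is an illustration, not EasyQuant's evaluator, it assumes Python 3.9+ (where `ast.Subscript.slice` is the index expression itself), and `safe_eval` is a hypothetical name.

```python
import ast
import operator

_BIN_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}
_UNARY_OPS = {ast.USub: operator.neg, ast.UAdd: operator.pos}


def safe_eval(expression: str, variables: dict, functions: dict) -> float:
    """Evaluate arithmetic over named series, rejecting any other syntax."""

    def ev(node):
        if isinstance(node, ast.Expression):
            return ev(node.body)
        if (isinstance(node, ast.Constant)
                and isinstance(node.value, (int, float))
                and not isinstance(node.value, bool)):
            return node.value
        if isinstance(node, ast.Name) and node.id in variables:
            return variables[node.id]
        if isinstance(node, ast.BinOp) and type(node.op) in _BIN_OPS:
            return _BIN_OPS[type(node.op)](ev(node.left), ev(node.right))
        if isinstance(node, ast.UnaryOp) and type(node.op) in _UNARY_OPS:
            return _UNARY_OPS[type(node.op)](ev(node.operand))
        if isinstance(node, ast.Subscript):
            # Plain indexing such as close[-1]; slices are not allow-listed
            return ev(node.value)[ev(node.slice)]
        if (isinstance(node, ast.Call)
                and isinstance(node.func, ast.Name)
                and node.func.id in functions
                and not node.keywords):
            return functions[node.func.id](*[ev(arg) for arg in node.args])
        # Attributes, lambdas, comprehensions, unknown names, etc. end up here
        raise ValueError(f"unsupported expression element: {type(node).__name__}")

    return float(ev(ast.parse(expression, mode="eval")))


closes = [100.0 + i for i in range(20)]  # 100, 101, ..., 119
functions = {"SMA": lambda s, p: sum(s[-p:]) / p}

momentum = safe_eval("(close[-1] / close[-20] - 1) * 100", {"close": closes}, {})
vs_sma = safe_eval("close[-1] / SMA(close, 5) - 1", {"close": closes}, functions)
```

Anything outside the allow-list, such as `().__class__` or `__import__('os')`, raises `ValueError` instead of being executed. The trade-off is that every new piece of syntax the DSL needs has to be added explicitly.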

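VII. Best Practices

1) Factor research conventions

## Factor naming convention
- Factor code: `{CATEGORY}_{PERIOD}[_{SUFFIX}]` (the suffix is optional)
  - MOM_20D (momentum, 20-day)
  - VOL_5D (volatility, 5-day)
  - RET_1D (return, 1-day)

## Factor documentation requirements
Every factor must include:
1. Definition (What): the factor's calculation logic
2. Economic rationale (Why): the hypothesis for why the factor works
3. Use cases (How): applicable markets / instruments / horizons
4. Caveats: known limitations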
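
A naming convention is easier to keep when it is checked automatically. The following sketch validates factor codes against the pattern above with a regular expression. The accepted period units (`D`, `W`, `M`) and the optional suffix are assumptions inferred from the examples, and `parse_factor_code` is a hypothetical helper.

```python
import re

# CATEGORY: uppercase letters; PERIOD: digits plus a unit; SUFFIX: optional.
# Only the "D" unit appears in the article; "W" and "M" are assumed here.
FACTOR_CODE_RE = re.compile(
    r"^(?P<category>[A-Z]+)_(?P<period>\d+[DWM])(?:_(?P<suffix>[A-Z0-9]+))?$"
)


def parse_factor_code(code: str) -> dict:
    """Split a factor code into its parts, or raise ValueError if malformed."""
    match = FACTOR_CODE_RE.match(code)
    if match is None:
        raise ValueError(f"factor code does not follow the convention: {code!r}")
    return match.groupdict()


plain = parse_factor_code("MOM_20D")
with_suffix = parse_factor_code("MOM_20D_ADJ")
```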

2) Experiment record template

{
  "experiment_name": "Momentum factor IC stability test",
  "hypothesis": "MOM-family factors have a significant IC in the A-share market",
  "config": {
    "symbols": "CSI 500 constituents",
    "date_range": ["2020-01-01", "2023-12-31"],
    "factors": ["MOM_20D", "MOM_60D", "MOM_120D"],
    "rebalance_freq": "20d"
  },
  "results": {
    "MOM_20D": { "ic_mean": 0.05, "ir": 0.8, "significant": true },
    "MOM_60D": { "ic_mean": 0.08, "ir": 1.2, "significant": true },
    "MOM_120D": { "ic_mean": 0.03, "ir": 0.5, "significant": false }
  },
  "conclusion": "MOM_60D performed best over the sample period; recommended for adoption",
  "next_steps": ["Sector-neutralization test", "Stripping out risk-factor exposure"]
}

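Conclusion

The Factor Factory is the module that moves the EasyQuant platform from a "personal tool" to a "team platform". By standardizing factor definitions, experiment tracking, and the strategy-binding loop, a team can accumulate research assets more efficiently and manage the whole path from factor mining to strategy launch in one place.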

