Modeling the Calendar Spread of Stock Index Futures


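A futures product has several contracts with different expiry dates trading at the same time. Their prices move largely in step, but the spread between them changes constantly. In this post we analyze the spread between two stock index futures contracts: the April front-month contract (IC2304) and the current-quarter contract (IC2306).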

First, load the raw tick data, resample it into 10-second bars, and compute the spread:

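# Trading days in the sample period
date_list = get_trade_date_list(start_date, end_date)
# Resample each contract's ticks into bars of length freq (10 seconds here)
bar1 = get_future_tick_resample(code1, start_date, end_date, freq)
bar2 = get_future_tick_resample(code2, start_date, end_date, freq)
# Put both price series side by side; this assumes bar1 and bar2 have identical timestamps
cp_df = pd.DataFrame(zip(bar1['price'], bar2['price']), columns=[code1, code2], index=bar1.index)
# Spread = code1 price minus code2 price
cp_df['spread'] = cp_df[code1] - cp_df[code2]
# 60-bar moving average of the spread (10 minutes of 10-second bars)
cp_df['spread_ma'] = cp_df['spread'].rolling(window=60).mean()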
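The helper `get_future_tick_resample` is not shown in this post. As a rough idea of what the resampling step involves, here is a minimal sketch using plain pandas. The function name `resample_last_price`, the column name `price`, and the tick values are all assumptions made up for illustration. It also aligns the two legs with an inner join on the timestamp, which is one way to guard against bars that exist for only one contract.

```python
import pandas as pd

def resample_last_price(ticks: pd.DataFrame, freq: str = "10s") -> pd.DataFrame:
    """Hypothetical helper: take the last traded price in each bar.

    Assumes `ticks` has a DatetimeIndex and a 'price' column.
    """
    bars = ticks["price"].resample(freq).last()
    # Bars with no ticks come back as NaN; drop them
    return bars.dropna().to_frame("price")

# Synthetic ticks, for illustration only
near_ticks = pd.DataFrame(
    {"price": [6300.0, 6301.0, 6302.0, 6303.0]},
    index=pd.to_datetime([
        "2023-04-03 09:30:01", "2023-04-03 09:30:07",
        "2023-04-03 09:30:12", "2023-04-03 09:30:25",
    ]),
)
far_ticks = pd.DataFrame(
    {"price": [6250.0, 6251.5, 6250.5]},
    index=pd.to_datetime([
        "2023-04-03 09:30:03", "2023-04-03 09:30:15",
        "2023-04-03 09:30:18",
    ]),
)

near = resample_last_price(near_ticks)
far = resample_last_price(far_ticks)

# Keep only timestamps present in both legs, then take the difference
cp = pd.concat(
    [near["price"].rename("near"), far["price"].rename("far")],
    axis=1, join="inner",
)
cp["spread"] = cp["near"] - cp["far"]
print(cp)
```

In this toy data the far leg has no tick in the 09:30:20 bar, so the inner join keeps only the first two bars.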

A sample of the two contracts' raw prices and the spread:

image.png

Price and spread over time

image.png

Histogram of the spread

image.png

Next, we train an LSTM neural network on these resampled 10-second bars.

Building features and labels

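# Rebuild the price/spread frame used for modeling
tick_df = pd.DataFrame(zip(bar1['price'], bar2['price']), columns=[code1, code2], index=bar1.index)
tick_df['spread'] = tick_df[code1] - tick_df[code2]
# Rolling sums of the spread over 5, 20 and 40 bars; incomplete windows are filled with 0
tick_df['spread_1'] = tick_df['spread'].rolling(window=5).sum().fillna(0)
tick_df['spread_2'] = tick_df['spread'].rolling(window=20).sum().fillna(0)
tick_df['spread_3'] = tick_df['spread'].rolling(window=40).sum().fillna(0)
# Drop any rows that still contain NaN
tick_df.dropna(inplace=True)
# Label: the spread fut_num bars ahead (shift avoids chained assignment)
tick_df['label'] = tick_df['spread'].shift(-fut_num)
# The last fut_num rows have no future value, so drop them
tick_df = tick_df.iloc[:-fut_num, :]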
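The labeling step pairs each bar with the spread observed `fut_num` bars later. A toy example makes the alignment easier to check by eye. The values and `fut_num = 2` below are assumptions chosen for illustration, and `shift` is used as one way to express the alignment.

```python
import pandas as pd

fut_num = 2  # assumed prediction horizon, in bars

toy = pd.DataFrame({"spread": [1.0, 2.0, 3.0, 4.0, 5.0]})
# Row t gets the spread from row t + fut_num
toy["label"] = toy["spread"].shift(-fut_num)
# The last fut_num rows have no future value and are dropped
toy = toy.iloc[:-fut_num]
print(toy)
```

Each remaining row now holds the current spread and the value the model is asked to predict, with no row using a label from its own bar or earlier.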

Data preprocessing

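# Separate the label column from the feature columns
label = tick_df.loc[:, 'label']
data = tick_df.loc[:, features]
# Scale features and label; mm_y is kept so predictions can be mapped back later
data, label, mm_y = normalization(data, label, normal_flag)
# Cut the series into windows of seq_length bars
x, y = split_windows(data, label, seq_length)
# Split into training and test sets with a 0.8 ratio
x_data, y_data, x_train, y_train, x_test, y_test = split_data(x, y, 0.8)
# Wrap the tensors in batched data loaders
train_loader, test_loader, num_epochs = data_generator(x_train, y_train, x_test, y_test, n_iters, batch_size)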
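The helpers `normalization`, `split_windows` and `split_data` are not listed in this post. The sketch below shows what such steps commonly look like, using only NumPy. All function names here are hypothetical, and two choices are assumptions rather than the author's confirmed method: min-max scaling, and pairing each window with the label of its last row.

```python
import numpy as np

def min_max_scale(a: np.ndarray):
    """Scale each column to [0, 1]; return the bounds so the scaling can be undone."""
    lo, hi = a.min(axis=0), a.max(axis=0)
    return (a - lo) / (hi - lo), lo, hi

def split_windows_sketch(data: np.ndarray, label: np.ndarray, seq_length: int):
    """Build overlapping windows of seq_length rows.

    Each window is paired with the label of its last row (an assumption).
    """
    xs, ys = [], []
    for end in range(seq_length, len(data) + 1):
        xs.append(data[end - seq_length:end])
        ys.append(label[end - 1])
    return np.array(xs), np.array(ys)

def split_data_sketch(x: np.ndarray, y: np.ndarray, ratio: float):
    """Chronological split: the first `ratio` share is the training set."""
    n_train = int(len(x) * ratio)
    return x[:n_train], y[:n_train], x[n_train:], y[n_train:]

# Toy input: 10 rows, 2 features
raw = np.arange(20, dtype=float).reshape(10, 2)
raw_label = np.arange(10, dtype=float)

data_s, lo_x, hi_x = min_max_scale(raw)
label_s, lo_y, hi_y = min_max_scale(raw_label)

x, y = split_windows_sketch(data_s, label_s, seq_length=3)
x_train, y_train, x_test, y_test = split_data_sketch(x, y, 0.8)

# Map a scaled label back to the original units
first_label = y[0] * (hi_y - lo_y) + lo_y
print(x.shape, y.shape, len(x_train), len(x_test), first_label)
```

The split is chronological rather than shuffled, so the test windows come strictly after the training windows. In a real run, fitting the scaler on the training portion only avoids leaking the test set's range into training.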

Model training and prediction

# Train the model
model, criterion, loss_list = train_script_LSTM(train_loader, num_epochs, features=features,
                                                seq_length=seq_length,
                                                batch_size=batch_size,
                                                hidden_size=hidden_size,
                                                num_layers=num_layers,
                                                LR=LR)

# Predict over the full data set (training and test portions)
data_predict = model(x_data)
loss = criterion(data_predict, y_data)
# Convert tensors to NumPy arrays for plotting
predict_np = data_predict.detach().numpy()
real_np = y_data.detach().numpy()
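`train_script_LSTM` is the author's own training routine and its internals are not shown here. For readers who want a feel for the moving parts, below is a minimal PyTorch sketch of an LSTM regressor and a training loop. The class name `SpreadLSTM`, the layer sizes, the optimizer, the learning rate and the random toy data are all assumptions for illustration, not the settings used in this post.

```python
import torch
from torch import nn

class SpreadLSTM(nn.Module):
    """Hypothetical model: LSTM over a window, linear head on the last step."""

    def __init__(self, n_features: int, hidden_size: int = 32, num_layers: int = 1):
        super().__init__()
        self.lstm = nn.LSTM(n_features, hidden_size, num_layers, batch_first=True)
        self.head = nn.Linear(hidden_size, 1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        out, _ = self.lstm(x)             # (batch, seq_length, hidden_size)
        return self.head(out[:, -1, :])   # (batch, 1)

torch.manual_seed(0)
# Toy data: 64 windows, 10 steps each, 4 features; the target is the
# first feature at the last step, so there is something learnable
x = torch.randn(64, 10, 4)
y = x[:, -1, :1].clone()

model = SpreadLSTM(n_features=4)
criterion = nn.MSELoss()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-2)

loss_list = []
for epoch in range(50):
    optimizer.zero_grad()
    loss = criterion(model(x), y)
    loss.backward()
    optimizer.step()
    loss_list.append(loss.item())

# Inference without tracking gradients
with torch.no_grad():
    pred = model(x)
print(pred.shape, loss_list[0], loss_list[-1])
```

Taking only the last time step of the LSTM output is a common choice for one-step regression; it reduces each window to a single predicted value.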

Training results

image.png

image.png

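Overall, the residuals fluctuate within the range [-3, 4], and on the test set the predicted values are generally lower than the actual values.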
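A quick way to check both observations numerically is to summarize the residuals directly. The sketch below assumes the residual is defined as actual minus predicted, which this post does not state explicitly. The function name `residual_summary` is hypothetical and the arrays are synthetic values for illustration, not the results above.

```python
import numpy as np

def residual_summary(real: np.ndarray, pred: np.ndarray) -> dict:
    """Summarize residuals, defined here (by assumption) as real - pred."""
    resid = real - pred
    return {
        "min": float(resid.min()),
        "max": float(resid.max()),
        "mean": float(resid.mean()),
        # Share of points where the prediction is below the actual value
        "share_under": float((pred < real).mean()),
    }

# Synthetic spreads, for illustration only
real = np.array([-50.0, -48.0, -47.0, -49.0])
pred = np.array([-51.0, -49.5, -46.0, -50.0])

summary = residual_summary(real, pred)
print(summary)
```

Under this definition, a positive mean residual together with a `share_under` above 0.5 indicates that the model tends to predict below the actual values.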

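All the source data, source code, and the Python environment used in this post have been packaged so that everything works out of the box and runs with a single click. Feel free to download it; working through it hands-on is the best way to learn.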

To get it, follow my WeChat official account of the same name and reply “源码” (source code) in the chat.