Kronos Fine-Tuning in Detail


Chapter 7: Fine-Tuning in Detail

📖 Chapter Overview

Welcome to Chapter 7! In this chapter we take a deep dive into fine-tuning the Kronos model. Fine-tuning adapts a pretrained model to a specific task or market, and it can significantly improve performance in the target scenario.

⏱️ Estimated study time: 10 hours

🎯 Learning goal: master the complete Kronos fine-tuning workflow

📋 Main topics: Qlib integration, CSV fine-tuning, distributed training, best practices


7.1 Fine-Tuning Fundamentals

7.1.1 What Is Fine-Tuning?

Fine-tuning continues training a pretrained model on task-specific data. For Kronos, fine-tuning can:

  • Improve performance in a specific market: adapt to the characteristics of different financial markets

  • Sharpen forecasting accuracy: target a particular time horizon and asset class

  • Personalize the model: customize it on an institution's proprietary data

  • Reduce data requirements: far less data is needed than training from scratch

7.1.2 Technical Advantages of Fine-Tuning

🎯 Data efficiency
  • The pretrained model has already learned general financial patterns
  • Only a relatively small amount of task-specific data is needed for adaptation
  • Training cost and time drop substantially
🎯 Performance gains
  • The model focuses on the characteristics of a specific market
  • It adapts to the current market regime and structure
  • Accuracy on the target task improves
🎯 Flexibility
  • Models can be fine-tuned per asset class
  • Different prediction targets and time scales are supported
  • Continuous updates and improvements are easy

7.1.3 Fine-Tuning vs. Training from Scratch

| Aspect | Fine-tuning | Training from scratch |
| --- | --- | --- |
| Data requirements | Modest (thousands to tens of thousands of samples) | Massive (millions of samples) |
| Training time | Short (hours to days) | Very long (weeks to months) |
| Compute | Moderate (one to a few GPUs) | Heavy (large clusters) |
| Generalization | Retains general knowledge | Must learn everything from zero |
| Skill barrier | Relatively low | Very high |


7.2 Qlib-Integrated Fine-Tuning Pipeline

7.2.1 Introducing Qlib

Qlib is Microsoft's open-source quantitative investment platform, providing a complete framework for quant research and trading:

🔧 Core features
  • Data management: unified storage and access for financial data

  • Factor computation: a rich library of technical indicators and factors

  • Model training: a built-in machine-learning training framework

  • Backtesting: professional strategy backtesting and evaluation

🌐 Supported data sources
  • China A-share market
  • US equity market
  • Cryptocurrency markets
  • FX markets

7.2.2 Preparing Qlib Data

📊 Data retrieval and storage
import qlib
from qlib.data import D
from qlib.constant import REG_CN

# Initialize Qlib; this also registers the local data provider,
# so no separate provider object needs to be constructed
qlib.init(provider_uri='~/.qlib/qlib_data/cn_data', region=REG_CN)

# Fetch the instrument list and fields
instruments = D.instruments(market='all')
fields = ['$close', '$volume', '$high', '$low', '$open', '$factor']
start_time = '2010-01-01'
end_time = '2023-12-31'

# Load the data
data = D.features(instruments, fields, start_time=start_time, end_time=end_time)
print(f"Data shape: {data.shape}")
print(f"Columns: {data.columns.tolist()}")
📈 Data preprocessing
import pandas as pd
import numpy as np

class QlibDataPreprocessor:
    """Preprocessor for Qlib-formatted data"""

    def __init__(self):
        self.scaler = None
        self.feature_columns = None

    def prepare_training_data(self, data, target_stock='000001.SZ',
                           lookback_window=252, prediction_horizon=20):
        """
        Build the training data.

        Args:
            data: data in Qlib format
            target_stock: target ticker
            lookback_window: length of the history window
            prediction_horizon: number of future steps to predict

        Returns:
            features, labels, and the matching timestamps
        """
        print(f"Preparing training data for {target_stock}...")

        # Extract the target stock's rows
        target_data = data.loc[target_stock].copy()

        # Compute technical indicators
        target_data = self.calculate_technical_indicators(target_data)

        # Compute returns
        target_data['returns'] = target_data['$close'].pct_change()

        # Build features and labels
        features = []
        labels = []

        for i in range(lookback_window, len(target_data) - prediction_horizon):
            # Features: the trailing history window
            feature_window = target_data.iloc[i-lookback_window:i]
            feature_vector = self.extract_features(feature_window)
            features.append(feature_vector)

            # Label: future returns over the prediction horizon
            future_returns = target_data['returns'].iloc[i:i+prediction_horizon]
            label = self.calculate_label(future_returns)
            labels.append(label)

        features = np.array(features)
        labels = np.array(labels)

        print(f"Feature matrix shape: {features.shape}")
        print(f"Label vector shape: {labels.shape}")

        return features, labels, target_data.index[lookback_window:len(target_data)-prediction_horizon]

    def calculate_technical_indicators(self, data):
        """Compute technical indicators"""
        # Moving averages
        data['MA5'] = data['$close'].rolling(window=5).mean()
        data['MA20'] = data['$close'].rolling(window=20).mean()
        data['MA60'] = data['$close'].rolling(window=60).mean()

        # RSI
        delta = data['$close'].diff()
        gain = (delta.where(delta > 0, 0)).rolling(window=14).mean()
        loss = (-delta.where(delta < 0, 0)).rolling(window=14).mean()
        rs = gain / loss
        data['RSI'] = 100 - (100 / (1 + rs))

        # MACD
        exp1 = data['$close'].ewm(span=12).mean()
        exp2 = data['$close'].ewm(span=26).mean()
        data['MACD'] = exp1 - exp2
        data['MACD_signal'] = data['MACD'].ewm(span=9).mean()

        # Bollinger Bands
        ma20 = data['$close'].rolling(window=20).mean()
        std20 = data['$close'].rolling(window=20).std()
        data['BOLL_upper'] = ma20 + (std20 * 2)
        data['BOLL_lower'] = ma20 - (std20 * 2)

        return data

    def extract_features(self, window_data):
        """Extract features from a history window"""
        features = []

        # Price features
        features.extend([
            window_data['$close'].iloc[-1],
            window_data['$close'].pct_change(5).iloc[-1],
            window_data['$close'].pct_change(20).iloc[-1],
        ])

        # Moving-average features
        features.extend([
            window_data['MA5'].iloc[-1],
            window_data['MA20'].iloc[-1],
            window_data['MA60'].iloc[-1],
            (window_data['$close'].iloc[-1] - window_data['MA20'].iloc[-1]) / window_data['MA20'].iloc[-1]
        ])

        # Technical-indicator features
        features.extend([
            window_data['RSI'].iloc[-1],
            window_data['MACD'].iloc[-1],
            window_data['MACD_signal'].iloc[-1],
        ])

        # Volatility features
        returns_20 = window_data['$close'].pct_change(20)
        features.extend([
            returns_20.std(),
            returns_20.skew(),
            returns_20.kurtosis()
        ])

        # Volume features
        features.extend([
            window_data['$volume'].iloc[-1],
            window_data['$volume'].rolling(5).mean().iloc[-1],
            window_data['$volume'].rolling(20).mean().iloc[-1],
        ])

        return np.array(features)

    def calculate_label(self, future_returns):
        """Compute the label (classification task)"""
        mean_return = future_returns.mean()

        # Class labels: 0 = down, 1 = sideways, 2 = up
        if mean_return < -0.02:
            return 0
        elif mean_return > 0.02:
            return 2
        else:
            return 1

# Usage example
if __name__ == "__main__":
    # Instantiate the preprocessor
    preprocessor = QlibDataPreprocessor()

    # Preparing data requires a real Qlib installation
    print("Make sure the Qlib environment is configured correctly")
    print("Running this example requires real Qlib data")
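The ±2% thresholds in `calculate_label` are easy to sanity-check without any Qlib data. A minimal standalone sketch (pure Python, restating the same rule so it runs independently of the class above):

```python
def calculate_label(future_returns):
    """Same ±2% rule as QlibDataPreprocessor.calculate_label:
    0 = down, 1 = sideways, 2 = up, based on the mean future return."""
    mean_return = sum(future_returns) / len(future_returns)
    if mean_return < -0.02:
        return 0
    elif mean_return > 0.02:
        return 2
    return 1

# A 5% average drop, a flat stretch, and a 5% average rally
print(calculate_label([-0.05, -0.06, -0.04]))  # → 0
print(calculate_label([0.001, -0.002, 0.0]))   # → 1
print(calculate_label([0.05, 0.04, 0.06]))     # → 2
```

Note that the comparisons are strict, so a mean return of exactly -2% or +2% still falls into the "sideways" class.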

7.2.3 Fine-Tuning Dataset Classes

import torch
from torch.utils.data import Dataset, DataLoader
import numpy as np

class FinetuningDataset(Dataset):
    """Dataset for fine-tuning"""

    def __init__(self, features, labels, timestamps=None):
        """
        Initialize the dataset.

        Args:
            features: feature matrix (n_samples, n_features)
            labels: label vector (n_samples,)
            timestamps: list of timestamps (n_samples,)
        """
        self.features = torch.FloatTensor(features)
        self.labels = torch.LongTensor(labels)
        # Compare against None explicitly: a pandas Index or ndarray
        # raises on an implicit truth-value check
        self.timestamps = timestamps if timestamps is not None else list(range(len(features)))

        # Standardize the features
        self.feature_mean = torch.mean(self.features, dim=0)
        self.feature_std = torch.std(self.features, dim=0)
        self.feature_std = torch.clamp(self.feature_std, min=1e-8)  # avoid division by zero

        self.features = (self.features - self.feature_mean) / self.feature_std

    def __len__(self):
        return len(self.features)

    def __getitem__(self, idx):
        return {
            'features': self.features[idx],
            'labels': self.labels[idx],
            'timestamp': self.timestamps[idx]
        }

    def get_normalization_params(self):
        """Return the normalization parameters"""
        return self.feature_mean, self.feature_std

class TimeSeriesDataLoader:
    """Data loader for time-series batches"""

    def __init__(self, dataset, batch_size=32, shuffle=True, drop_last=False):
        """
        Initialize the data loader.

        Args:
            dataset: the dataset
            batch_size: batch size
            shuffle: whether to shuffle the data
            drop_last: whether to drop the final incomplete batch
        """
        self.dataset = dataset
        self.batch_size = batch_size
        self.shuffle = shuffle
        self.drop_last = drop_last

        self.indices = list(range(len(dataset)))
        self.current_position = 0

    def __len__(self):
        # Number of batches per epoch; the trainer calls len() on the
        # loader to average the per-batch loss
        if self.drop_last:
            return len(self.dataset) // self.batch_size
        return (len(self.dataset) + self.batch_size - 1) // self.batch_size

    def __iter__(self):
        if self.shuffle:
            np.random.shuffle(self.indices)
        self.current_position = 0
        return self

    def __next__(self):
        if self.current_position >= len(self.dataset):
            raise StopIteration

        # Collect indices for this batch
        batch_indices = []
        while (len(batch_indices) < self.batch_size and
               self.current_position < len(self.dataset)):
            batch_indices.append(self.indices[self.current_position])
            self.current_position += 1

        # If the batch is incomplete and drop_last is set, stop iteration
        if len(batch_indices) < self.batch_size and self.drop_last:
            raise StopIteration

        # Assemble the batch
        batch_features = []
        batch_labels = []
        batch_timestamps = []

        for idx in batch_indices:
            item = self.dataset[idx]
            batch_features.append(item['features'])
            batch_labels.append(item['labels'])
            batch_timestamps.append(item['timestamp'])

        return {
            'features': torch.stack(batch_features),
            'labels': torch.stack(batch_labels),
            'timestamps': batch_timestamps
        }

# Dataset usage example
def create_training_pipeline(data, target_stock='000001.SZ'):
    """Create the training pipeline"""
    print("Creating training pipeline...")

    # Preprocess the data
    preprocessor = QlibDataPreprocessor()
    features, labels, timestamps = preprocessor.prepare_training_data(
        data, target_stock, lookback_window=252, prediction_horizon=20
    )

    # Chronological split: no shuffling across the split boundaries
    train_ratio, val_ratio, test_ratio = 0.7, 0.2, 0.1
    total_samples = len(features)

    train_end = int(total_samples * train_ratio)
    val_end = int(total_samples * (train_ratio + val_ratio))

    train_features = features[:train_end]
    train_labels = labels[:train_end]
    train_timestamps = timestamps[:train_end]

    val_features = features[train_end:val_end]
    val_labels = labels[train_end:val_end]
    val_timestamps = timestamps[train_end:val_end]

    test_features = features[val_end:]
    test_labels = labels[val_end:]
    test_timestamps = timestamps[val_end:]

    print(f"Training set size: {len(train_features)}")
    print(f"Validation set size: {len(val_features)}")
    print(f"Test set size: {len(test_features)}")

    # Build the datasets
    train_dataset = FinetuningDataset(train_features, train_labels, train_timestamps)
    val_dataset = FinetuningDataset(val_features, val_labels, val_timestamps)
    test_dataset = FinetuningDataset(test_features, test_labels, test_timestamps)

    # Build the data loaders
    train_loader = TimeSeriesDataLoader(train_dataset, batch_size=32, shuffle=True)
    val_loader = TimeSeriesDataLoader(val_dataset, batch_size=32, shuffle=False)
    test_loader = TimeSeriesDataLoader(test_dataset, batch_size=32, shuffle=False)

    return train_loader, val_loader, test_loader, preprocessor

if __name__ == "__main__":
    print("Fine-tuning dataset classes defined")
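The two ideas that matter most in the pipeline above are per-feature z-scoring with a floor on the standard deviation, and a strictly chronological 70/20/10 split. Both can be checked in isolation with a small NumPy sketch (standalone, independent of the classes above; `ddof=1` matches `torch.std`'s unbiased default):

```python
import numpy as np

def zscore(features, eps=1e-8):
    """Per-column standardization with a floored std, mirroring FinetuningDataset."""
    mean = features.mean(axis=0)
    std = np.maximum(features.std(axis=0, ddof=1), eps)  # avoid division by zero
    return (features - mean) / std

def chrono_split(n, train_ratio=0.7, val_ratio=0.2):
    """Index boundaries of a chronological train/val/test split."""
    train_end = int(n * train_ratio)
    val_end = train_end + int(n * val_ratio)
    return train_end, val_end

X = np.random.default_rng(0).normal(size=(100, 4))
Xn = zscore(X)
print(chrono_split(len(X)))  # → (70, 90)
```

The epsilon floor matters in practice: a constant feature column would otherwise divide by zero and fill the dataset with NaNs.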

7.3 Fine-Tuning Model Architecture

7.3.1 Model Modification Strategy

When fine-tuning Kronos, several aspects need attention:

🎯 Prediction-head design
  • Classification: predicting the direction of price moves

  • Regression: predicting prices or returns

  • Multi-task learning: predicting several targets at once

🎯 Input adaptation
  • Feature-dimension changes: accommodate different input features

  • Sequence-length changes: support different history windows

  • Multi-modal input: combine price data with text
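The multi-task bullet above can be sketched as a single shared trunk with two output heads. This is an illustrative design only; `MultiTaskHead` and its parameters are assumptions for the sketch, not part of the Kronos API, and `h` stands in for the backbone's final hidden states:

```python
import torch
import torch.nn as nn

class MultiTaskHead(nn.Module):
    """Illustrative multi-task head: one shared trunk, two outputs
    (direction classification + return regression)."""

    def __init__(self, d_model=256, hidden_dim=128, num_classes=3):
        super().__init__()
        self.trunk = nn.Sequential(
            nn.Linear(d_model, hidden_dim), nn.ReLU(), nn.Dropout(0.1)
        )
        self.cls_head = nn.Linear(hidden_dim, num_classes)  # down/sideways/up logits
        self.reg_head = nn.Linear(hidden_dim, 1)            # expected return

    def forward(self, h):
        z = self.trunk(h)
        return self.cls_head(z), self.reg_head(z)

head = MultiTaskHead()
h = torch.randn(8, 256)          # stand-in for backbone hidden states
logits, ret = head(h)
print(logits.shape, ret.shape)   # → torch.Size([8, 3]) torch.Size([8, 1])
```

During training the two objectives would typically be combined with a weighted sum, e.g. `loss = ce_loss + 0.5 * mse_loss`, where the weight is a tunable hyperparameter.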

7.3.2 Fine-Tuning Model Implementation

import torch
import torch.nn as nn
import torch.nn.functional as F
from model import Kronos, KronosTokenizer

class FinetuningKronos(nn.Module):
    """Fine-tuning wrapper around Kronos"""

    def __init__(self, base_model, num_classes=3, feature_dim=32, hidden_dim=256):
        """
        Initialize the fine-tuning model.

        Args:
            base_model: the pretrained Kronos model
            num_classes: number of classes
            feature_dim: feature dimension
            hidden_dim: hidden-layer dimension
        """
        super().__init__()

        self.base_model = base_model
        self.num_classes = num_classes

        # Whether the pretrained parameters are frozen (optional)
        self.freeze_base = False

        # Feature-processing layers
        self.feature_processor = nn.Sequential(
            nn.Linear(feature_dim, hidden_dim),
            nn.ReLU(),
            nn.Dropout(0.1),
            nn.Linear(hidden_dim, hidden_dim),
            nn.ReLU(),
            nn.Dropout(0.1)
        )

        # Classification head
        self.classifier = nn.Sequential(
            nn.Linear(hidden_dim + base_model.d_model, hidden_dim),
            nn.ReLU(),
            nn.Dropout(0.2),
            nn.Linear(hidden_dim, num_classes)
        )

        # Regression head (optional)
        self.regressor = nn.Sequential(
            nn.Linear(hidden_dim + base_model.d_model, hidden_dim),
            nn.ReLU(),
            nn.Dropout(0.2),
            nn.Linear(hidden_dim, 1)
        )

    def freeze_base_parameters(self):
        """Freeze the base model's parameters"""
        for param in self.base_model.parameters():
            param.requires_grad = False
        self.freeze_base = True
        print("Base model parameters frozen")

    def unfreeze_base_parameters(self):
        """Unfreeze the base model's parameters"""
        for param in self.base_model.parameters():
            param.requires_grad = True
        self.freeze_base = False
        print("Base model parameters unfrozen")

    def forward(self, input_ids, attention_mask=None, features=None, mode='classification'):
        """
        Forward pass.

        Args:
            input_ids: input token IDs
            attention_mask: attention mask
            features: extra features
            mode: 'classification' or 'regression'
        """
        # Run the base model
        base_output = self.base_model(input_ids, attention_mask=attention_mask)

        # Assume base_output exposes the final hidden states
        if isinstance(base_output, dict):
            hidden_states = base_output['last_hidden_state']
        else:
            hidden_states = base_output

        # Take the hidden state of the last time step
        last_hidden = hidden_states[:, -1, :]  # (batch_size, d_model)

        # Process the extra features. Note that both heads are sized for
        # hidden_dim + d_model inputs, so `features` must be provided for
        # the dimensions to match.
        if features is not None:
            processed_features = self.feature_processor(features)
            combined_features = torch.cat([last_hidden, processed_features], dim=-1)
        else:
            combined_features = last_hidden

        # Classification
        if mode == 'classification':
            logits = self.classifier(combined_features)
            return logits

        # Regression
        elif mode == 'regression':
            predictions = self.regressor(combined_features)
            return predictions

        else:
            raise ValueError(f"Unsupported mode: {mode}")

    def predict_proba(self, input_ids, attention_mask=None, features=None):
        """Predict the class probability distribution"""
        self.eval()
        with torch.no_grad():
            logits = self.forward(input_ids, attention_mask, features, 'classification')
            probabilities = F.softmax(logits, dim=-1)
            return probabilities

    def predict(self, input_ids, attention_mask=None, features=None):
        """Predict the class"""
        self.eval()
        with torch.no_grad():
            logits = self.forward(input_ids, attention_mask, features, 'classification')
            predictions = torch.argmax(logits, dim=-1)
            return predictions

class KronosFinetuningTrainer:
    """Fine-tuning trainer for Kronos"""

    def __init__(self, model, tokenizer, device='cuda'):
        """
        Initialize the trainer.

        Args:
            model: the fine-tuning model
            tokenizer: the tokenizer
            device: compute device
        """
        self.model = model.to(device)
        self.tokenizer = tokenizer
        self.device = device

        # Optimizer
        self.optimizer = None
        self.scheduler = None

        # Loss functions
        self.criterion = nn.CrossEntropyLoss()
        self.regression_criterion = nn.MSELoss()

        # Training history
        self.train_losses = []
        self.val_losses = []
        self.val_accuracies = []

    def setup_optimizer(self, learning_rate=1e-4, weight_decay=1e-5):
        """Set up the optimizer"""
        # Use separate learning rates for the base model and the new layers
        base_params = []
        new_params = []

        for name, param in self.model.named_parameters():
            if 'base_model' in name:
                base_params.append(param)
            else:
                new_params.append(param)

        self.optimizer = torch.optim.AdamW([
            {'params': base_params, 'lr': learning_rate * 0.1},  # smaller LR for the base model
            {'params': new_params, 'lr': learning_rate}
        ], weight_decay=weight_decay)

        # Learning-rate scheduler
        self.scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(
            self.optimizer, T_max=100, eta_min=1e-6
        )

        print("Optimizer configured")

    def train_epoch(self, train_loader, epoch):
        """Train for one epoch"""
        self.model.train()
        total_loss = 0
        correct = 0
        total = 0

        for batch_idx, batch in enumerate(train_loader):
            # Move the batch to the device
            features = batch['features'].to(self.device)
            labels = batch['labels'].to(self.device)

            # Convert numerical features into a token sequence (simplified);
            # a real application needs a proper conversion scheme here
            input_ids = self._features_to_input_ids(features)
            attention_mask = torch.ones_like(input_ids)

            # Forward pass
            self.optimizer.zero_grad()
            outputs = self.model(input_ids, attention_mask, features, 'classification')

            # Compute the loss
            loss = self.criterion(outputs, labels)

            # Backward pass
            loss.backward()

            # Gradient clipping
            torch.nn.utils.clip_grad_norm_(self.model.parameters(), max_norm=1.0)

            # Optimizer step
            self.optimizer.step()

            # Statistics
            total_loss += loss.item()
            _, predicted = torch.max(outputs.data, 1)
            total += labels.size(0)
            correct += (predicted == labels).sum().item()

            if batch_idx % 100 == 0:
                print(f'Epoch {epoch}, Batch {batch_idx}, Loss: {loss.item():.6f}')

        avg_loss = total_loss / len(train_loader)
        accuracy = 100. * correct / total

        return avg_loss, accuracy

    def validate_epoch(self, val_loader):
        """Validate for one epoch"""
        self.model.eval()
        total_loss = 0
        correct = 0
        total = 0

        with torch.no_grad():
            for batch in val_loader:
                features = batch['features'].to(self.device)
                labels = batch['labels'].to(self.device)

                input_ids = self._features_to_input_ids(features)
                attention_mask = torch.ones_like(input_ids)

                outputs = self.model(input_ids, attention_mask, features, 'classification')
                loss = self.criterion(outputs, labels)

                total_loss += loss.item()
                _, predicted = torch.max(outputs.data, 1)
                total += labels.size(0)
                correct += (predicted == labels).sum().item()

        avg_loss = total_loss / len(val_loader)
        accuracy = 100. * correct / total

        return avg_loss, accuracy

    def _features_to_input_ids(self, features):
        """Convert features into input IDs (simplified placeholder)"""
        # This is a simplified stand-in; a real application needs a proper
        # conversion, e.g. a pretrained encoder or a dedicated quantization scheme
        batch_size = features.size(0)
        seq_length = 50  # fixed sequence length

        # Random IDs for demonstration only
        input_ids = torch.randint(0, 1000, (batch_size, seq_length), device=self.device)

        return input_ids

    def train(self, train_loader, val_loader, epochs=50, save_dir='./checkpoints'):
        """Full training loop"""
        print(f"Starting fine-tuning, total epochs: {epochs}")

        # Create the save directory
        import os
        os.makedirs(save_dir, exist_ok=True)

        best_val_accuracy = 0.0

        for epoch in range(epochs):
            print(f'\nEpoch {epoch+1}/{epochs}')
            print('-' * 50)

            # Train
            train_loss, train_acc = self.train_epoch(train_loader, epoch)
            self.train_losses.append(train_loss)

            # Validate
            val_loss, val_acc = self.validate_epoch(val_loader)
            self.val_losses.append(val_loss)
            self.val_accuracies.append(val_acc)

            # Step the learning-rate scheduler
            self.scheduler.step()

            print(f'Train Loss: {train_loss:.6f}, Train Acc: {train_acc:.2f}%')
            print(f'Val Loss: {val_loss:.6f}, Val Acc: {val_acc:.2f}%')

            # Save the best model
            if val_acc > best_val_accuracy:
                best_val_accuracy = val_acc
                torch.save({
                    'epoch': epoch,
                    'model_state_dict': self.model.state_dict(),
                    'optimizer_state_dict': self.optimizer.state_dict(),
                    'val_accuracy': val_acc,
                }, os.path.join(save_dir, 'best_model.pth'))
                print(f'Saved best model, validation accuracy: {val_acc:.2f}%')

            # Save periodic checkpoints
            if (epoch + 1) % 10 == 0:
                torch.save({
                    'epoch': epoch,
                    'model_state_dict': self.model.state_dict(),
                    'optimizer_state_dict': self.optimizer.state_dict(),
                    'val_accuracy': val_acc,
                }, os.path.join(save_dir, f'checkpoint_epoch_{epoch+1}.pth'))

        print(f'\nTraining complete! Best validation accuracy: {best_val_accuracy:.2f}%')

        return self.train_losses, self.val_losses, self.val_accuracies

# Trainer usage example
if __name__ == "__main__":
    print("Fine-tuning model classes defined")
    print("Use these classes with Qlib data for actual fine-tuning")
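The trainer pairs AdamW with `CosineAnnealingLR(T_max=100, eta_min=1e-6)`. The schedule this produces follows the closed form lr(t) = eta_min + (lr0 - eta_min) · (1 + cos(π·t/T_max)) / 2, which is easy to reproduce and inspect without torch (a sketch of the formula only; the actual PyTorch scheduler is applied per parameter group, so the frozen-base group anneals from its own smaller base LR):

```python
import math

def cosine_annealing(t, lr0=1e-4, eta_min=1e-6, t_max=100):
    """Closed-form cosine annealing, matching CosineAnnealingLR's schedule."""
    return eta_min + (lr0 - eta_min) * (1 + math.cos(math.pi * t / t_max)) / 2

print(cosine_annealing(0))    # → 0.0001  (starts at the base LR)
print(cosine_annealing(100))  # → 1e-06   (decays to eta_min)
```

The curve is flat near the start and end and steepest in the middle, so most of the LR decay happens in the middle third of training.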

7.4 Complete Fine-Tuning Workflow

7.4.1 End-to-End Fine-Tuning Pipeline

import os
import json
import torch
import argparse
from datetime import datetime
import sys
sys.path.append('../../../')

try:
    import qlib
    from qlib.data import D
    from qlib.constant import REG_CN
    QLIB_AVAILABLE = True
except ImportError:
    QLIB_AVAILABLE = False
    print("Qlib is not available; install it with: pip install pyqlib")

from model import Kronos, KronosTokenizer
from ch07_finetuning.dataset import create_training_pipeline
from ch07_finetuning.model import FinetuningKronos, KronosFinetuningTrainer

class FinetuningPipeline:
    """Complete fine-tuning pipeline"""

    def __init__(self, config):
        """
        Initialize the pipeline.

        Args:
            config: configuration dictionary
        """
        self.config = config
        self.results = {}

    def setup_environment(self):
        """Set up the environment"""
        print("🔧 Setting up the fine-tuning environment...")

        # Select the device
        self.device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
        print(f"Using device: {self.device}")

        # Create output directories
        os.makedirs(self.config['save_dir'], exist_ok=True)
        os.makedirs(self.config['log_dir'], exist_ok=True)

        print("✅ Environment ready")

    def load_base_model(self):
        """Load the pretrained model"""
        print("📦 Loading the pretrained model...")

        # Load the tokenizer
        self.tokenizer = KronosTokenizer.from_pretrained(
            self.config['tokenizer_name']
        )

        # Load the base model
        self.base_model = Kronos.from_pretrained(
            self.config['base_model_name']
        )

        # Build the fine-tuning model
        self.model = FinetuningKronos(
            base_model=self.base_model,
            num_classes=self.config['num_classes'],
            feature_dim=self.config['feature_dim'],
            hidden_dim=self.config['hidden_dim']
        )

        # Optionally freeze the base model's parameters
        if self.config['freeze_base']:
            self.model.freeze_base_parameters()

        print("✅ Model loaded")
        print(f"Parameter count: {sum(p.numel() for p in self.model.parameters()):,}")

    def prepare_data(self):
        """Prepare the data"""
        print("📊 Preparing fine-tuning data...")

        if not QLIB_AVAILABLE:
            print("❌ Qlib is not available; real data cannot be prepared")
            return None, None, None

        try:
            # Initialize Qlib
            qlib.init(
                provider_uri=self.config['qlib_data_path'],
                region=self.config['qlib_region']
            )

            # Fetch the data
            instruments = D.instruments(market=self.config['qlib_market'])
            fields = self.config['qlib_fields']

            data = D.features(
                instruments,
                fields,
                start_time=self.config['start_time'],
                end_time=self.config['end_time']
            )

            # Build the training pipeline
            train_loader, val_loader, test_loader, preprocessor = create_training_pipeline(
                data,
                target_stock=self.config['target_stock']
            )

            self.preprocessor = preprocessor

            print("✅ Data ready")
            print(f"Training batches: {len(train_loader)}")
            print(f"Validation batches: {len(val_loader)}")
            print(f"Test batches: {len(test_loader)}")

            return train_loader, val_loader, test_loader

        except Exception as e:
            print(f"❌ Data preparation failed: {e}")
            return None, None, None

    def setup_training(self):
        """Set up training"""
        print("🚀 Setting up training...")

        # Create the trainer
        self.trainer = KronosFinetuningTrainer(
            model=self.model,
            tokenizer=self.tokenizer,
            device=self.device
        )

        # Configure the optimizer
        self.trainer.setup_optimizer(
            learning_rate=self.config['learning_rate'],
            weight_decay=self.config['weight_decay']
        )

        print("✅ Training setup complete")

    def train_model(self, train_loader, val_loader):
        """Train the model"""
        print("🎯 Starting model fine-tuning...")

        # Run training
        train_losses, val_losses, val_accuracies = self.trainer.train(
            train_loader=train_loader,
            val_loader=val_loader,
            epochs=self.config['epochs'],
            save_dir=self.config['save_dir']
        )

        # Record the training history
        self.results['train_losses'] = train_losses
        self.results['val_losses'] = val_losses
        self.results['val_accuracies'] = val_accuracies

        print("✅ Fine-tuning complete")

        return train_losses, val_losses, val_accuracies

    def evaluate_model(self, test_loader):
        """Evaluate the model on the held-out test set"""
        print("📊 Evaluating model performance...")

        test_loss, test_acc = self.trainer.validate_epoch(test_loader)

        print(f"Test loss: {test_loss:.6f}")
        print(f"Test accuracy: {test_acc:.2f}%")

        self.results['test_loss'] = test_loss
        self.results['test_accuracy'] = test_acc

        return test_loss, test_acc

    def save_results(self):
        """Save the results"""
        print("💾 Saving training results...")

        # Save the configuration
        config_path = os.path.join(self.config['save_dir'], 'config.json')
        with open(config_path, 'w') as f:
            json.dump(self.config, f, indent=2)

        # Save the metrics
        results_path = os.path.join(self.config['save_dir'], 'results.json')
        with open(results_path, 'w') as f:
            json.dump(self.results, f, indent=2)

        # Save the training history
        history_path = os.path.join(self.config['save_dir'], 'training_history.json')
        history = {
            'train_losses': self.results['train_losses'],
            'val_losses': self.results['val_losses'],
            'val_accuracies': self.results['val_accuracies']
        }
        with open(history_path, 'w') as f:
            json.dump(history, f, indent=2)

        print(f"✅ Results saved to: {self.config['save_dir']}")

    def plot_training_history(self):
        """Plot the training history"""
        import matplotlib.pyplot as plt

        print("📈 Generating training-history plots...")

        fig, axes = plt.subplots(1, 3, figsize=(15, 5))

        # Training loss
        axes[0].plot(self.results['train_losses'])
        axes[0].set_title('Training Loss')
        axes[0].set_xlabel('Epoch')
        axes[0].set_ylabel('Loss')
        axes[0].grid(True)

        # Validation loss
        axes[1].plot(self.results['val_losses'], color='orange')
        axes[1].set_title('Validation Loss')
        axes[1].set_xlabel('Epoch')
        axes[1].set_ylabel('Loss')
        axes[1].grid(True)

        # Validation accuracy
        axes[2].plot(self.results['val_accuracies'], color='green')
        axes[2].set_title('Validation Accuracy')
        axes[2].set_xlabel('Epoch')
        axes[2].set_ylabel('Accuracy (%)')
        axes[2].grid(True)

        plt.tight_layout()

        # Save the figure
        plot_path = os.path.join(self.config['save_dir'], 'training_history.png')
        plt.savefig(plot_path, dpi=300, bbox_inches='tight')
        plt.show()

        print(f"✅ Plot saved to: {plot_path}")

    def run_complete_pipeline(self):
        """Run the complete fine-tuning pipeline"""
        print("🚀 Starting the full fine-tuning pipeline")
        print("=" * 60)

        try:
            # Set up the environment
            self.setup_environment()

            # Load the model
            self.load_base_model()

            # Prepare the data
            train_loader, val_loader, test_loader = self.prepare_data()

            if train_loader is None:
                print("❌ Data preparation failed; aborting")
                return

            # Set up training
            self.setup_training()

            # Train the model
            train_losses, val_losses, val_accuracies = self.train_model(train_loader, val_loader)

            # Evaluate the model
            test_loss, test_acc = self.evaluate_model(test_loader)

            # Save the results
            self.save_results()

            # Plot the training history
            self.plot_training_history()

            print("\n🎉 Fine-tuning pipeline finished!")
            print(f"Final test accuracy: {test_acc:.2f}%")

        except Exception as e:
            print(f"❌ Fine-tuning pipeline failed: {e}")
            import traceback
            traceback.print_exc()

def create_default_config():
    """Create the default configuration"""
    return {
        # Model settings
        'base_model_name': 'NeoQuasar/Kronos-small',
        'tokenizer_name': 'NeoQuasar/Kronos-Tokenizer-base',
        'num_classes': 3,  # down, sideways, up
        'feature_dim': 32,
        'hidden_dim': 256,
        'freeze_base': True,

        # Training settings
        'epochs': 50,
        'batch_size': 32,
        'learning_rate': 1e-4,
        'weight_decay': 1e-5,

        # Data settings
        'qlib_data_path': '~/.qlib/qlib_data/cn_data',
        # Fall back to the region string when qlib could not be imported
        'qlib_region': REG_CN if QLIB_AVAILABLE else 'cn',
        'qlib_market': 'all',
        'qlib_fields': ['$close', '$volume', '$high', '$low', '$open'],
        'target_stock': '000001.SZ',
        'start_time': '2010-01-01',
        'end_time': '2023-12-31',

        # Output settings
        'save_dir': './finetuning_checkpoints',
        'log_dir': './finetuning_logs'
    }

def main():
    """Entry point"""
    print("Kronos fine-tuning pipeline")
    print("=" * 50)

    # Build the configuration
    config = create_default_config()

    # Create and run the pipeline
    pipeline = FinetuningPipeline(config)
    pipeline.run_complete_pipeline()

if __name__ == "__main__":
    main()
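Because `save_results` persists everything as plain JSON, any finished run can be reloaded later to compare experiments or pick the best epoch. A minimal round-trip sketch (the directory and the metric values are illustrative only):

```python
import json
import os
import tempfile

# Illustrative stand-ins for a saved run
config = {'learning_rate': 1e-4, 'epochs': 50, 'target_stock': '000001.SZ'}
results = {'val_accuracies': [41.2, 44.8, 47.5]}

save_dir = tempfile.mkdtemp()
with open(os.path.join(save_dir, 'config.json'), 'w') as f:
    json.dump(config, f, indent=2)
with open(os.path.join(save_dir, 'results.json'), 'w') as f:
    json.dump(results, f, indent=2)

# Reload and pick the epoch with the best validation accuracy
with open(os.path.join(save_dir, 'results.json')) as f:
    history = json.load(f)
best_epoch = max(range(len(history['val_accuracies'])),
                 key=lambda i: history['val_accuracies'][i])
print(best_epoch)  # → 2
```

Storing the config next to the metrics is what makes runs comparable: the epoch index alone is meaningless without the hyperparameters that produced it.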

7.5 Chapter Summary

🎯 Key Takeaways

After this chapter you should be able to:

  1. Fine-tuning fundamentals: understand the concept, advantages, and applicable scenarios

  2. Qlib integration: use the Qlib framework and prepare data with it

  3. Model architecture: understand the design and implementation of the fine-tuning model

  4. Training workflow: run the complete fine-tuning process

  5. Best practices: know the tips and pitfalls of fine-tuning

💡 Key Skills

  • Design and implement a fine-tuning data pipeline
  • Configure and manage a distributed training environment
  • Monitor and optimize the training process
  • Evaluate the effect of fine-tuning
  • Save and manage training results

🚀 Next Steps

Now that you have mastered Kronos fine-tuning:

  1. Keep learning: continue to Chapter 8: Using the Web Interface

  2. Practice: try fine-tuning on real data

  3. Experiment: tune hyperparameters and training strategies

  4. Deploy: apply the fine-tuned model in a real project

❓ Self-Check

Answer these questions to test your understanding:

  1. What is model fine-tuning, and how does it differ from training from scratch?
  2. What role does the Qlib framework play in fine-tuning?
  3. How would you design an effective fine-tuning dataset?
  4. Which model parameters should be frozen during fine-tuning?
  5. How do you evaluate a fine-tuned model?

Congratulations on finishing Chapter 7! 🎉

You have now mastered Kronos fine-tuning. Next, in Chapter 8, we will build a web interface.

➡️ Continue to Chapter 8: Using the Web Interface